Starting and stopping Ambari agents

In this article we will learn how to work with Ambari agents. We will learn how to start, stop and restart Ambari agents, and a few more operations, from the command line.
If you are not logged in as the root user, prefix sudo to every command listed in the steps below.
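
The sudo rule above can be wrapped in a small helper so the same line works for root and non-root users. This is only a sketch: the function names sudo_prefix and run_privileged are made up for this example and are not part of Ambari.

```shell
# Print the command line to run, adding sudo only when the given uid is not 0.
# Arguments: a numeric uid, then the command and its arguments.
sudo_prefix() {
  uid_arg="$1"
  shift
  if [ "$uid_arg" -eq 0 ]; then
    echo "$*"
  else
    echo "sudo $*"
  fi
}

# Run a command directly as root, or through sudo for any other user.
run_privileged() {
  if [ "$(id -u)" -eq 0 ]; then
    "$@"
  else
    sudo "$@"
  fi
}

# Example (not executed here): run_privileged ambari-agent status
```

sudo_prefix only prints the command, which is handy for a dry run; run_privileged actually executes it.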

 1) Check the status of Ambari agents.

This command tells us whether the Ambari agent is running. If the agent is running, the command also prints the pid of the Ambari agent process.

Command :

ambari-agent status
OR

sudo ambari-agent status

OR

service ambari-agent status



2) Stopping  Ambari agent

You can use any one of the commands to stop ambari agent.

sudo ambari-agent stop

OR

ambari-agent stop

OR

service ambari-agent stop



While the Ambari agent is stopped, the Ambari server shows a heartbeat lost message for the node where the agent was stopped.


3) Starting  Ambari agent

You can use any one of the commands to start ambari agent.

ambari-agent start

OR

sudo ambari-agent start

OR

service ambari-agent  start



Once the Ambari agent is started, the heartbeat lost message in the Ambari user interface goes away.

4) Restarting Ambari agent

You can use any one of the commands to restart ambari agent.

ambari-agent restart

OR

sudo ambari-agent restart

OR

service ambari-agent restart






5) Other options

The ambari-agent command comes with some more sub-commands. You can list them all using the --help option.

ambari-agent --help



Starting and stopping Ambari-server

In this article we will learn how to work with the Ambari server. We will learn how to start, stop and restart the Ambari server, and a few more operations, from the command line. If you are not logged in as the root user, prefix sudo to every command listed in the steps below.

1) Check the status of Ambari server.

You can use any one of the commands to check status of ambari server.

ambari-server status

OR

sudo ambari-server status

OR

service ambari-server status

This command tells us whether the Ambari server is running. If the server is running, the command also prints the pid of the Ambari server process.



2) Stop Ambari server

You can use any one of the commands to stop ambari server.

sudo ambari-server stop

OR

ambari-server stop

OR

service ambari-server stop



3) Start Ambari server

You can use any one of the commands to start ambari server.

ambari-server start

OR

sudo ambari-server start

OR

service ambari-server start



4) Restart Ambari server

You can use any one of the commands to restart ambari server.

ambari-server restart

OR

sudo ambari-server restart

OR

service ambari-server restart




5) Skip database check

While starting, the Ambari server checks the consistency of its database. If the database has any issues, the Ambari server fails to start.

We can skip database consistency check while starting ambari server.

Command:

 ambari-server start --skip-database-check




6) Other options 

The ambari-server command comes with several other sub-commands. You can use the --help option to list them all.

ambari-server --help



Every command also comes with its own options. We can use --help after a command to see all of its options.

Example :

ambari-server stop --help




How to create Hive table for Parquet data format file ?

In this article we will learn how to create a Hive table for Parquet file format data. We need to use stored as Parquet to create a Hive table for Parquet file format data.


1) Create  hive table without location.

We can create hive table for Parquet data without location.  And we can load data into that table later.

Command :

create table employee_parquet(name string,salary int,deptno int,DOJ date)  row format delimited fields terminated by ',' stored as Parquet ;




2) Load data into hive table .

We can use a regular insert query to load data into a Parquet file format table. The data is converted into Parquet file format implicitly while it is loaded.

 insert into table employee_parquet select * from employee;



3) Create hive table with location

We can  also create hive table for parquet file data  with location. Specified location should have parquet file format data.

Command :

create table employee_parquet(name string,salary int,deptno int,DOJ date)  row format delimited fields terminated by ',' 
stored as parquet location '/data/in/employee_parquet' ;



How to create hive table for RC file format ?


In this article we will learn How to create Hive table for RC file format data. We  need to use stored as RCFILE to create a hive table for RCFILE format data.


1) Create  hive table without location.

We can create hive table for RCFILE data without location.  And we can load data into that table later.

Command :

create table employee_rc(name string,salary int,deptno int,DOJ date)  row format delimited fields terminated by ',' stored as RCFILE ;



2) Load data into hive table .

We can use a regular insert query to load data into an RC file format table. The data is converted into RC file format implicitly while it is loaded.

 insert into table employee_rc select * from employee;



3) Create hive table with location

We can  also create hive table for RC file data  with location. Specified location should have RC file format data.

Command :

create table employee_rc(name string,salary int,deptno int,DOJ date)  row format delimited fields terminated by ',' 
stored as RCFILE location '/data/in/employee_rc' ;


How to create Hive table for sequence file format data ?

In this article we will learn How to create Hive table for sequence file format data. We  need to use stored as SequenceFile to create a hive table for sequence file format data.


1) Create  hive table without location.

We can create hive table for sequence file data without location.  And we can load data into that table later.

Command :

create table employee_seq(name string,salary int,deptno int,DOJ date)  row format delimited fields terminated by ',' stored as SequenceFile ;



2) Load data into hive table .

We can use normal insert query to load data into sequence file format table. Data will be converted into sequence file format while loading the data.

 insert into table employee_seq select * from employee;



3) Create hive table with location

We can  also create hive table for sequence file data  with location. Specified location should have sequence file format data.

Command :

create table employee_seq(name string,salary int,deptno int,DOJ date)  row format delimited fields terminated by ',' 
stored as SequenceFile location '/data/in/employee_seq' ;


Enabling debug logs in Ambari agents


In this article we will learn how to enable debug logs in the Ambari agent.
Logging properties for the Ambari agent are available in the ambari-agent.ini file under the /etc/ambari-agent/conf folder.
We will learn how to modify the ambari-agent.ini file to enable debug logs.

1) Check current log level.

By default ambari-agent runs with the INFO loglevel, which does not expose many internal calls.

Command :

grep loglevel /etc/ambari-agent/conf/ambari-agent.ini



2) Stop Ambari agent 

We need to prefix sudo command if we are not running command as a root user.

Command :

ambari-agent stop

OR

sudo ambari-agent stop





3) Modify ambari-agent.ini file 

Open the ambari-agent.ini file in the vi editor and change the loglevel value from INFO to DEBUG.
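
If you prefer not to edit the file by hand, the same change can be scripted. The sketch below assumes the setting is written on its own line as loglevel=INFO with no spaces around the equals sign. The function name set_agent_loglevel is made up for this example. Take a backup of the file before trying it on a real node.

```shell
# Rewrite the "loglevel=" line of an agent ini file to the requested level.
# Arguments: path to the ini file, new level (for example DEBUG or INFO).
# Assumption: the setting appears as a line starting with "loglevel=".
set_agent_loglevel() {
  ini_file="$1"
  new_level="$2"
  tmp_file="$(mktemp)" || return 1
  # Replace only lines that start with "loglevel="; copy all other lines as-is.
  sed "s/^loglevel=.*/loglevel=${new_level}/" "$ini_file" > "$tmp_file" || return 1
  # Write back into the original file so its owner and permissions are kept.
  cat "$tmp_file" > "$ini_file" && rm -f "$tmp_file"
}

# Example (not executed here):
#   set_agent_loglevel /etc/ambari-agent/conf/ambari-agent.ini DEBUG
```

The same function can be used later to switch the level back to INFO.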

4) Start Ambari agent

Command :

 ambari-agent start

OR

sudo  ambari-agent start



5) Confirm DEBUG logs in log file ambari-agent.log.

command :

tail -f /var/log/ambari-agent/ambari-agent.log


We can also run the grep command from the first step to confirm the new loglevel in the configuration file.
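
Instead of watching the tail -f output, we can count the DEBUG lines near the end of the log. This is a rough sketch and assumes that debug entries contain the word DEBUG. The function name count_debug_lines and the default of 200 lines are made up for this example.

```shell
# Count lines containing DEBUG among the last N lines of a log file.
# Arguments: path to the log file, optional number of lines (default 200).
count_debug_lines() {
  log_file="$1"
  last_n="${2:-200}"
  # grep -c prints the number of matching lines (0 when there are none).
  tail -n "$last_n" "$log_file" | grep -c 'DEBUG'
}

# Example (not executed here):
#   count_debug_lines /var/log/ambari-agent/ambari-agent.log
```

A count above zero shortly after the restart suggests the new loglevel is in effect.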

6) Repeat above steps for all nodes.

We have to repeat above steps for all nodes if we want to collect DEBUG logs from all nodes.

Working with databases in Apache Hive

In this article we will learn how to work with databases in Apache Hive. We will learn how to create, drop, switch and use databases in Apache Hive.

1) Check existing databases.

We can list the existing databases in Hive using the show databases command. Apache Hive comes with a database called default.

Command :

show databases;


2) Creating a new database

We can create a new database in Apache Hive using the create command.

Command syntax:

create database [if not exists] {database-name};




3) Switching databases

By default, queries run on the default database. If we want to run a query on a different database, we have to change the current database.

We use the use command to change the current database in Apache Hive.

Command :

use test;

The picture below shows changing the current database to the test database and creating a new table called dummy in it.


4) Drop database

We can drop a database using the drop command. We need to drop all tables in the database before dropping the database. Otherwise we get a database is not empty, one or more tables exist error.

Command :

drop database test;

The picture below shows dropping the database called test after deleting its table dummy.

5) Using database name in a hive query

If we do not specify any database name in hive query, Hive runs it on current database.

We have to prefix database name to table name if we want to run the hive query on a particular database.

The query below runs on test database.

Select count(*) from test.dummy;





Enabling debug logs in Ambari server

Debug logs help us troubleshoot Ambari issues better and faster. They record many more internal calls, which helps us understand the problem better.

In this article we will learn how to enable debug logs in the Ambari server. Logging properties for the Ambari server are available in the log4j.properties file under the /etc/ambari-server/conf folder. We will learn how to modify log4j.properties to enable debug logs.

1) Check current log level in log4j.properties file

Check log4j.rootLogger property value in log4j.properties file.

Command:

grep  rootLogger /etc/ambari-server/conf/log4j.properties


In the above picture the rootLogger value is shown as INFO,file. We need to change it to DEBUG,file.

INFO is the default log level in Ambari server.

We can also check ambari-server.log file for log level.

Command :

tail -f /var/log/ambari-server/ambari-server.log



2) Stop ambari-server

Command:

ambari-server stop

OR

service ambari-server stop


3) Modify log4j.properties file

Update the log4j.rootLogger property value to DEBUG,file using the vi editor.

Command :

vi /etc/ambari-server/conf/log4j.properties
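
The edit can also be scripted with sed. The sketch below assumes the property is written as log4j.rootLogger=INFO,file on a single line. It changes only the level and keeps the appender name after the comma. The function name set_root_log_level and the .bak backup suffix are made up for this example.

```shell
# Change the level part of log4j.rootLogger, keeping the appender list.
# Arguments: path to log4j.properties, new level (for example DEBUG or INFO).
# A copy of the original file is kept next to it with a .bak suffix.
set_root_log_level() {
  props_file="$1"
  new_level="$2"
  cp "$props_file" "${props_file}.bak" || return 1
  # Match "log4j.rootLogger=<LEVEL>,<appenders>" and swap only <LEVEL>.
  sed "s/^\(log4j\.rootLogger=\)[A-Za-z]*\(,.*\)\$/\1${new_level}\2/" \
    "${props_file}.bak" > "$props_file"
}

# Example (not executed here):
#   set_root_log_level /etc/ambari-server/conf/log4j.properties DEBUG
```

Running the function again with INFO switches the level back once the debug logs are collected.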

4) Start ambari-server 


Command:

ambari-server start

OR

service ambari-server start



5) Check DEBUG logs in ambari-server.log file.

Command :

tail -f /var/log/ambari-server/ambari-server.log




6)  Revert loglevel to INFO

Please revert the log level to INFO, using the same steps, once the debug logs are collected. Debug logs take a lot of disk space and can sometimes cause service failures.


Modifying Ambari server configuration properties

In this article we will learn how to modify Ambari server configuration properties. The Ambari server keeps most of its configuration properties in the ambari.properties file, which is present under the /etc/ambari-server/conf folder.

Now we will learn how to modify the pid directory property (pid.dir) in the ambari.properties file. The pid.dir property contains the directory path where the Ambari server pid file is stored. Its default value is /var/run/ambari-server.

1) Check current value of pid.dir property using grep command.

grep pid.dir /etc/ambari-server/conf/ambari.properties
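
In a grep pattern the dot matches any character, so the command above can also print other lines that merely look similar. When only the value is needed, for example inside a script, an exact key match is safer. This is a small sketch using awk and assumes key=value lines with no spaces around the equals sign. The function name get_property is made up for this example.

```shell
# Print the value of one key from a key=value properties file.
# Arguments: path to the properties file, exact property name.
# Prints nothing when the key is not present.
get_property() {
  awk -F= -v key="$2" '
    $1 == key {            # exact match on the text before the first "="
      sub(/^[^=]*=/, "")   # drop "key=" and keep the rest of the line
      print
      exit
    }' "$1"
}

# Example (not executed here):
#   get_property /etc/ambari-server/conf/ambari.properties pid.dir
```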


2) Stop ambari server .

If you are not running the command using root user, You may have to prepend sudo to the command.

Command:

ambari-server stop

OR

service ambari-server stop


3) Modify the property pid.dir using vi editor.

The picture below shows pid.dir property pointing to /var/log/ambari-server.

Command:

vi /etc/ambari-server/conf/ambari.properties


4) Start ambari server

Command:

ambari-server start

OR

service ambari-server start

The picture below also shows pid directory now pointing to /var/log/ambari-server path.


You can also run grep command to check pid.dir's latest value like in first step.

Starting HDFS,MAPREDUCE2 and YARN processes manually

We start HDFS, MAPREDUCE2 and YARN services using either Cloudera Manager or Apache Ambari if we use CDH or HDP. Often, Cloudera Manager and Apache Ambari do not display the complete error message when services fail to start. We can go to the log directories and search for startup errors, but that is not easy either, as the log files are huge.

One easy way to find startup errors is to start the processes manually from the command line, so that the errors are printed on the console.

In this article we will learn how to start HDFS, MAPREDUCE2 and YARN services manually.

HDFS (Hadoop Distributed File System)

HDFS has the following processes: data node, name node, Zookeeper failover controller and journal node.
We will learn how to start them manually.


Starting datanode manually

We can start the data node manually using the hdfs datanode command. This needs to be run as the hdfs user.

Command:

hdfs datanode &


Starting Namenode manually

We can start the name node manually using the hdfs namenode command. This also needs to be run as the hdfs user.

Command :

hdfs namenode &
















Starting Zookeeper failover controller manually

We can start zookeeper failover controller manually using hdfs zkfc command.

Command :

hdfs zkfc &



Starting Journal node manually

We can start journal node manually using hdfs journalnode command. This also needs to be run as hdfs user.

Command :

hdfs journalnode &



MAPREDUCE2

MapReduce2 has a history server process. We can start it using the mapred historyserver command. It is better to run it as the mapred user.

Command :

mapred historyserver &



YARN (Yet Another Resource Negotiator)

YARN has node manager, resource manager and app timeline server processes.

Starting resource manager manually

We can start the resource manager using the yarn resourcemanager command. It is recommended to run it as the yarn user.

Command :

yarn resourcemanager &



Starting nodemanager manually

We can start node manager manually using yarn nodemanager command. This command also needs to be run as yarn user.

Command :

yarn nodemanager &



Starting app timeline server manually

We can start app timeline server manually using yarn timelineserver command. This command also needs to be run as yarn user.

Command :

yarn timelineserver &





These commands are useful only for troubleshooting startup errors. Apache Ambari and Cloudera Manager may not recognize your services if you start them manually using the above commands.
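
Each process above has to run as a particular user, so it is easy to mix them up. The sketch below puts the pairs from this article in one place. It only prints the command for a given process and leaves running it to you. The function name daemon_command is made up for this example. It also assumes that sudo -u is allowed for the hdfs, mapred and yarn users on your node.

```shell
# Print the manual start command for a Hadoop process, including the
# service user it should run as (hdfs, mapred or yarn, as listed above).
# Argument: process name, for example namenode or nodemanager.
daemon_command() {
  case "$1" in
    datanode|namenode|zkfc|journalnode)
      echo "sudo -u hdfs hdfs $1" ;;
    historyserver)
      echo "sudo -u mapred mapred $1" ;;
    resourcemanager|nodemanager|timelineserver)
      echo "sudo -u yarn yarn $1" ;;
    *)
      echo "unknown process: $1" >&2
      return 1 ;;
  esac
}

# Example (not executed here): daemon_command namenode
```

Run the printed command in the foreground, without the trailing &, when you want to see the startup errors directly on the console.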

Happy Hadooping !!!!