
Enabling NameNode HA using Apache Ambari

In this article we will learn how to enable high availability (HA) for the NameNode. With NameNode HA, the cluster has more than one NameNode. One NameNode is active and is responsible for serving user requests. The other NameNodes stay in standby mode and continuously read the active NameNode's metadata to remain in sync with it. If the active NameNode goes down, one of the standby NameNodes becomes active and serves user requests without failing running jobs.


1) Confirm no HA is enabled for the NameNode

By default, a Hortonworks Data Platform (HDP) installation includes a NameNode and a Secondary NameNode in the HDFS service. In this setup, if the NameNode goes down, the entire cluster goes down and running jobs fail. To address this, we need to enable high availability (HA) for the NameNode.

With NameNode HA, HDFS fails over to another NameNode automatically, avoiding cluster-down scenarios.

The picture below confirms we have a NameNode and a Secondary NameNode in the cluster.

We need to enable NameNode HA to have an active NameNode and a standby NameNode.





2) Click Enable NameNode HA under Service Actions

Click Enable NameNode HA to enable HA for the NameNode. This will open the NameNode HA wizard.

Go to HDFS -----> click Service Actions -------> click Enable NameNode HA




3) Getting started

We need to enter a nameservice ID in the first step. The nameservice ID resolves to the active NameNode automatically.

All Hadoop clients should use the nameservice ID rather than hard-coding the address of the active NameNode.
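For example, if the nameservice ID is mycluster (a placeholder; use whatever ID you enter in the wizard), clients address HDFS through the nameservice instead of a specific NameNode host:

hdfs dfs -ls hdfs://mycluster/user/hive                    # resolves to whichever NameNode is currently active
hdfs dfs -ls hdfs://namenode1.example.com:8020/user/hive   # hard-coded host (hypothetical name): stops working once that NameNode is standby

Behind the scenes the wizard adds the matching dfs.nameservices and dfs.ha.namenodes.* properties to hdfs-site.xml.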




4)  Select Hosts

The NameNode HA wizard will install:

  •     an additional NameNode
  •     3 JournalNodes
  •     2 ZooKeeper Failover Controllers


In this step we need to select hosts for the additional NameNode and the JournalNodes.






5) Review

This step provides complete information about what the wizard is going to install and which configurations it is going to add or modify.

We can go back and change things at this step if we want. Click Next to go to the next step.





6) Create checkpoint

In this step, the wizard asks us to do two things:

  •     put the NameNode into safe mode
  •     create a checkpoint of the NameNode metadata


**** Please note we need to run the given commands only on the specified node.

Once these commands have run successfully, the Next button will be enabled.
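The wizard displays the exact commands and the host to run them on; they typically look like the following, run as the hdfs user on the current NameNode host (treat this as a sketch and copy the commands from your own wizard screen):

sudo su hdfs -l -c 'hdfs dfsadmin -safemode enter'   # put the NameNode into safe mode
sudo su hdfs -l -c 'hdfs dfsadmin -saveNamespace'    # create a checkpoint of the NameNode metadata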







7) Configure components

This step performs the following operations:
  •     Stopping all services
  •     Installing the additional NameNode on the specified host
  •     Installing JournalNodes on the specified hosts
  •     Modifying configurations with the properties required for NameNode HA
  •     Starting the JournalNodes
  •     Disabling the Secondary NameNode

Click Next once all these operations are completed.





8) Initialize journal nodes

This step asks us to run the initializeSharedEdits command on the first master node.

Once the command has been run on the specified node, click Next.
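The command shown by the wizard is normally of this form, run as the hdfs user on the existing NameNode host (copy the exact command from the wizard):

sudo su hdfs -l -c 'hdfs namenode -initializeSharedEdits'   # copy the existing edit log to the JournalNodes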



9) Start components

This step does two things:

  •     Starts the ZooKeeper servers
  •     Starts the NameNode


Click Next once both operations are completed.





10) Initialize metadata

This step asks us to run two commands on the two master nodes:

  •     Run the formatZK command on the first master.
  •     Run the bootstrapStandby command on the second master.

**** Please note we need to run the given commands only on the specified nodes.
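The commands displayed by the wizard typically look like this, each run as the hdfs user on the host the wizard names (again, copy the exact commands from the wizard):

sudo su hdfs -l -c 'hdfs zkfc -formatZK'              # on the first master: create the HA state znode in ZooKeeper
sudo su hdfs -l -c 'hdfs namenode -bootstrapStandby'  # on the second master: copy the metadata from the active NameNode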






11) Finalize HA setup

This step performs the following:

  •   Starts the additional NameNode on the specified node
  •   Installs ZooKeeper Failover Controllers on the two master nodes
  •   Starts the ZooKeeper Failover Controllers on the two master nodes
  •   Configures AMS
  •   Deletes the Secondary NameNode, as it is not required with NameNode HA
  •   Stops HDFS
  •   Starts all services


Click Done once the above operations are completed.



12) Confirm NameNode HA is enabled

Apache Ambari reloads automatically after enabling HA for the NameNode and displays the active NameNode, the standby NameNode, the JournalNodes and the ZooKeeper Failover Controllers.

The picture below shows all of them.



Let me know if you get stuck anywhere while enabling HA for the NameNode.

Enabling Resource Manager HA using Ambari

In this article, we will learn how to enable Resource Manager (RM) high availability (HA) using Apache Ambari. With Resource Manager HA, the Hadoop cluster has two or more Resource Managers. One Resource Manager is active and the others are standby.

The active Resource Manager is responsible for serving user requests. If the active Resource Manager goes down, one of the standby Resource Managers becomes active and serves user requests without failing running jobs.

By default HDP comes with a single Resource Manager without HA. We need to install one more Resource Manager and add/modify properties to enable Resource Manager HA.

1) Confirm No HA is enabled

Go to Ambari home page ---> click on YARN --->Summary

If no HA is enabled for the Resource Manager, you will see only one Resource Manager, as shown in the picture below.





2) Click enable RM HA

Click on Enable ResourceManager HA under Service Actions to initiate enabling RM HA.


Go to Ambari home page -------> click YARN -------> click Service Actions -------> click Enable ResourceManager HA

The picture below shows the Enable ResourceManager HA option.




3) Getting started

Ambari opens a new HA wizard that will walk us through enabling RM HA. We need downtime for the cluster to enable HA. One hour of downtime is recommended; depending on the cluster size, you may plan for more.


This is an informational step; read it and click Next.



4) Select Host

We need one more Resource Manager for HA. In this step we select the node on which to install it. Click Next once a node is selected.

In the picture below, I have selected the node master2 to install the additional Resource Manager.




5) Review

This is a review step. We can go back to the previous step if we want to modify anything. The picture below shows that the additional Resource Manager is going to be installed on the master2 node.

We can go back and change that node to something else if we want; otherwise just click Next.

Some new properties need to be added to or modified in the YARN configuration to enable Resource Manager HA. This step shows all the properties that will be added or modified.
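As a rough sketch, the core YARN properties involved usually include the ones below (the hostnames and cluster ID here are placeholders; rely on the values the wizard shows rather than typing these by hand):

yarn.resourcemanager.ha.enabled = true
yarn.resourcemanager.ha.rm-ids = rm1,rm2
yarn.resourcemanager.hostname.rm1 = master1
yarn.resourcemanager.hostname.rm2 = master2
yarn.resourcemanager.cluster-id = yarn-cluster
yarn.resourcemanager.zk-address = master1:2181,master2:2181,master3:2181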



6) Configure components

This step performs five operations:

  •     Stops all required services to enable HA
  •     Installs the new Resource Manager on the selected node
  •     Adds/modifies the YARN configuration for HA
  •     Adds/modifies the HDFS configuration for RM HA
  •     Starts all services

You can click on an operation to see its logs.


Once all operations are completed, click Next.

7) Confirm active and standby

Ambari reloads after completing the RM HA setup, and we can see two Resource Managers. One Resource Manager will be active and the other standby.

The picture below shows the two Resource Managers, confirming that RM HA is enabled.
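We can also confirm from the command line. Assuming the wizard used the default RM IDs rm1 and rm2, the following prints which Resource Manager is active and which is standby:

yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2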





Let me know if you have any questions about enabling RM HA using Ambari.

Log files in the Hadoop ecosystem

In this article, we will learn how to check the log files of Hadoop daemons and how to read the log files of applications and jobs.

1)  Locate log directory in Apache Ambari

First we need to know the log directory of the Hadoop daemons. We can use Apache Ambari to find the log directory for a service. The default log directory for any service is /var/log/. Many companies do not use the default log directory, so it is better to look it up using either Ambari or Cloudera Manager. We can also use Unix commands to find log directories.

The picture below shows log directory for HDFS service.

Click on HDFS  ----> Configs -------> type log in filter box.



The picture below shows how to locate the log directory for Apache Oozie using the Unix grep command.
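One generic way to do this (a sketch; the exact output depends on how the daemon was started) is to grep the running process for its log directory setting:

ps -ef | grep -i oozie | tr ' ' '\n' | grep -i log    # look for something like -Doozie.log.dir=/var/log/oozie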




2)  Types of log files

Log directories will contain three types of files.

.log

Logs of the running daemon are written to the .log file.

The picture below shows the logs of the running active NameNode. The tail -f command is used to follow the live logs of a daemon.
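On an HDP cluster with the default log directory, the command typically looks like this (the directory and file name vary with your log directory setting and hostname, so adjust accordingly):

tail -f /var/log/hadoop/hdfs/hadoop-hdfs-namenode-<hostname>.log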




.out

The .out file contains the startup messages of a daemon. These messages are useful for troubleshooting startup failures of a daemon.

The picture below shows the startup messages of the active Resource Manager.




.log.[date]

Old log files have a date in their name. By default log rotation is daily, so we will see one log file per day.

The picture below shows old Ranger log files with dates in their names.




3) Command for application logs

We have seen how to check the logs of Hadoop daemons; now we will learn how to check the logs of an application.

We need the application ID to check the logs of an application.

The picture below shows how to get logs of an application id application_1513741463894_0007.

Command used : yarn logs -applicationId application_1513741463894_0007



4) Command for job logs.

We can also check the logs of a Hadoop job if we have the job ID.

The picture below shows how to get the job logs for job ID job_1513741463894_0001.

Command used : mapred job -logs [job_id]



5) Logs from Resource manager UI.

We can also get application logs from the Resource Manager UI. Go to the Resource Manager UI and click on the application ID whose logs you want to check.

The picture below shows the logs link in the Resource Manager UI for application application_1513741463894_0007.



Let me know if you have any questions on how to check log files for any service.



Enabling rack awareness for a Hadoop cluster

In this article, we will learn how to enable rack awareness in Hadoop clusters. Assume the cluster has a large number of nodes placed in more than one rack. If we enable rack awareness, the replicas of a block will not all be stored in one rack, so at least one replica of the block remains available for data processing in case of a rack failure.

The goal of rack awareness is to improve data availability and reduce cross-rack network traffic.


1) Enabling rack awareness without Apache Ambari.


In old versions of HDP we had to enable rack awareness manually. The latest versions of Apache Ambari support rack awareness in the GUI.

Check the link on how to enable rack awareness manually; you will not need it, as most of the latest versions of Apache Ambari support this in the GUI.





2) Enabling rack awareness using Apache Ambari

Now we are going to see how to enable rack awareness using Apache Ambari. We have a five-node cluster, and by default all nodes are in the default-rack.




Now we will modify the rack for datanode3.

Go to Hosts in Ambari -----> click on the host whose rack you want to modify ------> go to Host Actions -----> click Set Rack




Modify the rack name to rack-1 and click OK.



Go back to the Hosts page in Ambari to see that the rack name for datanode3 has changed.


You can see that the nodes are now placed in two different racks: default-rack and rack-1.

3) Confirm rack awareness is enabled

We can also confirm this with the hdfs fsck command and the hdfs dfsadmin -report command.

The picture below shows the output of the command hdfs fsck /, where the number of racks is 2.
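For reference, both checks can be run as the hdfs user like this:

sudo -u hdfs hdfs fsck /            # the summary at the end includes "Number of racks: 2"
sudo -u hdfs hdfs dfsadmin -report  # shows the rack assigned to each datanode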





Let me know if you have any questions on above article.


Creating and configuring a home directory for a user in HDFS

In this article, we will learn how to create a home directory in HDFS for a new user.

Every user who wants to access HDFS should have a home directory there. Some Hadoop jobs use the user's home directory to store intermediate/temporary data, and jobs will fail if the user has no home directory.
On the local file system, a user's home directory is created under /home; on HDFS, a user's home directory is created under /user.



1) Create a user on the local file system

First we need to create a user on the local file system (i.e. the operating system) using the useradd command. The user should be created on all nodes in the cluster.

The picture below shows that a new user nirupam is created, and nirupam's home directory on the local file system is /home/nirupam.


By default the user does not have a password; you can set one with the passwd command if you want.
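A minimal sketch of this step, using nirupam as in the example above (run as root, and repeat on every node in the cluster):

useradd nirupam    # creates the user and /home/nirupam
passwd nirupam     # optional: set a password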

2) Create a directory in HDFS for the new user

We need to create a directory under /user in HDFS for the new user. This directory needs to be created as the hdfs user, because hdfs is the superuser of the Hadoop cluster.

The picture below shows a new directory created for the nirupam user under the /user directory in HDFS.
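The command looks like this (run as the hdfs user on any node with an HDFS client):

sudo -u hdfs hdfs dfs -mkdir /user/nirupam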



3) Check the owner 

As the new directory was created by the hdfs user, hdfs will be the owner of the directory. We need to change the owner of this directory to the new user.

The picture below shows the owner of the /user/nirupam directory in HDFS.





4) Change the owner

Change the owner of the new HDFS directory to the new user created on the local file system. The chown command can be used to change the owner.

The picture below shows changing the owner of the HDFS directory /user/nirupam from the hdfs user to the nirupam user.
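A sketch of the command is below. The group shown is an assumption: on most Linux systems useradd creates a group with the same name as the user; otherwise use whatever group is appropriate on your cluster.

sudo -u hdfs hdfs dfs -chown nirupam:nirupam /user/nirupam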




5) Change the permissions 

We need to change the permissions of this newly created directory so that no users other than the owner have read, write or execute permissions.

The picture below modifies the permissions of the /user/nirupam directory to 700 so that only the owner has read, write and execute permissions.
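The corresponding command:

sudo -u hdfs hdfs dfs -chmod 700 /user/nirupam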




6) Test the user's HDFS home directory.

We have successfully created a home directory in HDFS for the new user. Now we need to test it.

We will try to upload a file to HDFS without specifying a destination directory. The file is uploaded to the user's home directory if no destination is specified.

The picture below shows that the new file is uploaded to nirupam's home directory because no destination directory was specified.
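For example, logged in as nirupam (the file name is just an example):

su - nirupam
hdfs dfs -put test.txt    # no destination given, so the file goes to /user/nirupam
hdfs dfs -ls              # lists the home directory /user/nirupam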



Let me know if you have any questions.



Decommissioning of node managers in a Hadoop cluster

In this article, we will learn how to perform decommissioning of node managers in Hadoop clusters.

The decommissioning process ensures that running jobs are moved to other node managers without failing them.

1) Check Ambari UI

If you are using HDP (Hortonworks Data Platform), you can check the Ambari UI to see how many node managers are present in your cluster.

The picture below shows the cluster has 3 node managers. We would like to decommission one node manager from the cluster.




2) Check yarn.resourcemanager.nodes.exclude-path property 

The cluster should have the yarn.resourcemanager.nodes.exclude-path property in the yarn-site.xml file. If the property is not present, we should add it.
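You can check it quickly from the command line; the value should point to the exclude file used in the next step:

grep -A1 'yarn.resourcemanager.nodes.exclude-path' /etc/hadoop/conf/yarn-site.xml
# expected value: /etc/hadoop/conf/yarn.exclude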



3) Update exclude file

Update the /etc/hadoop/conf/yarn.exclude file with the hostname on which you want to decommission the node manager.

I have updated the file with the master2 hostname to decommission the node manager on the master2 node.
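A sketch of this step, run on the Resource Manager host (use the hostname exactly as the node manager is registered with YARN, which may be the fully qualified name):

echo "master2" >> /etc/hadoop/conf/yarn.exclude
cat /etc/hadoop/conf/yarn.exclude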




4) Run refreshNodes command

Run the yarn rmadmin -refreshNodes command to initiate decommissioning of the node managers.
This command needs to be run as the yarn user.

The picture below shows the refreshNodes command being run.
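For example:

sudo -u yarn yarn rmadmin -refreshNodes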




5) Check Ambari UI 

Log in to the Ambari GUI and click on the YARN service to check the decommissioned node managers.

The picture below shows that 1 node manager is decommissioned; I have highlighted it in yellow.




Troubleshooting:

If decommissioning of node managers is not working, check the logs of the node manager you are decommissioning, the logs of the active resource manager and also the logs of the active NameNode.


Decommissioning datanodes in a Hadoop cluster

In this article, we will learn how to perform decommissioning of datanodes in a Hadoop cluster.
The decommissioning process for a datanode ensures that its data is transferred to other nodes so that the existing replication factor is not disturbed.


1) Check NameNode UI for available data nodes and their status.

The picture below shows we have three datanodes in the cluster. All of them have an admin state of In Service, which means all of them are working.

We will try to decommission the datanode on the master2 host.





2) dfs.hosts.exclude property

Ensure the dfs.hosts.exclude property is present in hdfs-site.xml; if it is not there, please add it. This property is required to perform decommissioning of a datanode.

The picture below shows the property in Apache Ambari. You can check the same property in Cloudera Manager if you are using CDH.
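You can also verify the property on the command line; the path shown below is the usual HDP default and may differ on your cluster:

grep -A1 'dfs.hosts.exclude' /etc/hadoop/conf/hdfs-site.xml
# expected value: something like /etc/hadoop/conf/dfs.exclude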





3) Update dfs.exclude file

Update the dfs.exclude file with the hostname of the datanode you want to decommission. We would like to decommission the datanode on the host master2. This file name is the value of the dfs.hosts.exclude property.

Perform this step on the host where the active NameNode is running.
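A sketch of this step (the file path comes from the dfs.hosts.exclude value above; use the hostname exactly as the datanode registered with the NameNode):

echo "master2" >> /etc/hadoop/conf/dfs.exclude
cat /etc/hadoop/conf/dfs.exclude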




4) Run refreshNodes command

Run the refreshNodes command on the active NameNode, as the hdfs user, to decommission the datanode.

Command used : hdfs dfsadmin -refreshNodes



5) Check decommissioning status.

Check the NameNode UI to see the master2 host under the Decommissioning category.




6) Check Decommissioned status

Check the NameNode UI to see that one node is marked as Decommissioned.
The picture below shows that the datanode on master2 is decommissioned.





Troubleshooting:


1)  Check name node logs

If you are unable to decommission the datanode, check the active NameNode logs for errors. These logs will contain errors related to decommissioning.


2)  Unable to decommission the data node

In small clusters, it is difficult to decommission datanodes when doing so would leave fewer datanodes than the replication factor.

For example:

If you have 3 datanodes and the replication factor is 3, you cannot decommission one datanode, because the cluster cannot maintain a replication factor of 3 with only two datanodes left.

Reduce the replication factor to 2 and then perform the decommissioning.
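As a sketch, the replication factor of the existing data can be lowered like this before decommissioning (the -w flag waits for re-replication to finish and can take a while; also lower dfs.replication in Ambari so that new files use the smaller factor):

sudo -u hdfs hdfs dfs -setrep -w 2 /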


Let me know if you have any questions.

Installation and configuration of Ranger using Apache Ambari

In this article, we will learn how to install and configure Apache Ranger using Apache Ambari. Apache Ranger is a security technology that provides policy-based security for the Hadoop ecosystem tools.

1) Log in to the Ambari GUI.

2) Click on Add service

Click on Actions at the bottom of the services list and choose Add Service; Ambari will launch the Add Service wizard.




3) Choose Ranger

Choose Ranger in the list of services and click Next. Ranger KMS will also be present in the list of services; do not select it for now.


4) Database setup

Ranger uses a database to store its data, so we need at least one RDBMS set up.
Ambari is already using Postgres, so we can click 'I have met all the requirements above' and click Proceed.





5) Assign Masters

Ranger has two master components, Ranger Usersync and Ranger Admin. We need to select hostnames for Ranger Usersync and Ranger Admin in this step.

Click Next once the hostnames are selected.



6) Assign Slaves

Ranger has a slave component called Ranger Tagsync. It is selected by default; just click Next in this step.





7) Customize services

We need to configure the database and Ranger audit in this step. Ambari displays maroon-colored circles where user input is required.




Select the database you are using; I have chosen Postgres as that is what I am using. Enter the hostname where your database is installed.

Enter a password for the database user rangeradmin. I have entered ranger as the password.


Enter the username and password of the database administrator. Postgres comes with the postgres/postgres credentials by default; I have entered postgres as the username and postgres as the password.

After entering the login credentials, click Test Connection to ensure that all the database details provided are working.
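If you prefer to create the Ranger database objects yourself instead of letting Ambari do it with the DBA credentials, a minimal sketch for Postgres looks like this (the database name ranger is an assumption; the user and password match the values entered above, and you may also need to allow connections in pg_hba.conf):

sudo -u postgres psql -c "CREATE USER rangeradmin WITH PASSWORD 'ranger';"
sudo -u postgres psql -c "CREATE DATABASE ranger OWNER rangeradmin;"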



Ranger audits can be stored in the database, in HDFS and in Solr. We do not have Solr installed, so I am disabling audit to Solr. Click Next now.



8) Configure identities

If the cluster is Kerberized, you can configure keytabs and principals in this step. If the cluster is not Kerberized, this step is skipped automatically.



9) Review

Before installing and configuring Ranger, this step provides a last opportunity to review the information provided so far. If you wish, you can go back and modify it.

Just click Deploy to continue the installation.





10) Install, test and start

This step performs three operations:

  • Installing Ranger
  • Starting the Ranger components
  • Testing the Ranger components by running service checks



11) Summary

This is the final step, in which Ambari displays a summary of the recent installation. Just click Complete to see Ranger in the services section of Ambari.

This step also informs us that after this step we need to restart the services that show a restart symbol.




12) Restart required

The picture below shows the restart symbol next to the HDFS, YARN, Hive and Knox services. We need to restart them. We can also see that the new Ranger service is installed.



This is how we can install and configure Ranger using Apache Ambari. Let me know if you have any questions.