Ambari agent installation

In this article, we will learn how to install the Ambari agent for Ambari on different operating systems.

1) Installing ambari agent

We use the yum command if the operating system is CentOS or Red Hat, the zypper command if it is SLES, and the apt-get command if it is Ubuntu.

The commands below need to be run as the root user.

CentOS or Red Hat:

yum install ambari-agent

SLES (SUSE Linux)

zypper install ambari-agent

Ubuntu

apt-get install ambari-agent


The picture below shows how to install ambari-agent on CentOS.






2) Modifying ambari-agent.ini file

We need to tell the Ambari agent the Ambari server's hostname. The hostname property in the ambari-agent.ini file needs to be updated for this reason.

The picture below shows hostname property in ambari-agent.ini file.
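As a rough sketch, the relevant part of the agent configuration file (typically /etc/ambari-agent/conf/ambari-agent.ini) looks like the snippet below; ambari-server.example.com is a placeholder for your actual Ambari server hostname, and the port values shown are the usual defaults.

```ini
[server]
; Placeholder - replace with your Ambari server's hostname
hostname=ambari-server.example.com
url_port=8440
secured_url_port=8441
```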




3) Starting ambari agent

Now the Ambari agent needs to be started with the ambari-agent start command.





4) Check ambari agent status

Now confirm the Ambari agent is running using the ambari-agent status command.
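Putting steps 3 and 4 together, the commands look like this (they assume the ambari-agent package is installed and are run as root):

```shell
# Start the agent, then verify it is running
ambari-agent start
ambari-agent status
```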



HDFS setfacl and getfacl commands examples

In this article, we will learn the setfacl and getfacl commands in HDFS.


1) The chmod command cannot provide advanced permissions in HDFS.

The following are some use cases where chmod is not sufficient.



  • Providing fewer or more permissions to one user in a group.



  • Providing fewer or more permissions to a specific user.



2) The ACL (Access Control List) commands setfacl and getfacl provide advanced permissions in HDFS.


3) ACLs in HDFS are disabled by default. We need to enable them by setting the property below to true.

dfs.namenode.acls.enabled

Check how to enable ACLs in Ambari.
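If you are not managing configuration through Ambari, the property can be set directly in hdfs-site.xml; this is a sketch of the entry (the NameNode must be restarted for it to take effect):

```xml
<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>
```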

4) The setfacl command is used to set advanced permissions in HDFS. The getfacl command is used to check the ACLs set on a directory in HDFS.

Type the commands below to see their usage.

hdfs dfs -setfacl

hdfs dfs -getfacl

The pictures below show command usage.






5) The getfacl command displays the ACLs set on an HDFS directory. The -R option displays the ACLs of a directory, all its sub-directories, and all its files.

Example :

hdfs dfs -getfacl /data

The picture below shows the usage of the getfacl command.
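For reference, getfacl output follows the shape sketched below; the owner, group, and entries shown are illustrative, not taken from the original screenshot.

```shell
hdfs dfs -getfacl /data
# Typical output shape:
#   # file: /data
#   # owner: hdfs
#   # group: hdfs
#   user::rwx
#   group::r-x
#   other::r-x
```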





6) The -m option of the setfacl command modifies permissions on an HDFS directory. We can use it to add new ACL entries to an existing file or directory, or to modify existing ones.

For example :

The /data directory gives only read access to group members. The setfacl -m option can grant write permission to one group member (hive).

The picture below shows how to use the -m option.
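Since the screenshot is not available, here is a minimal sketch of the scenario, using the user hive and the /data path from the text:

```shell
# Grant user hive read and write access on /data
hdfs dfs -setfacl -m user:hive:rw- /data
# Verify that the new entry appears
hdfs dfs -getfacl /data
```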





7) The default keyword defines default ACLs on a directory. If any sub-directories are created under that directory in the future, they will get the default ACLs automatically.

Example :

hdfs dfs -setfacl -m default:user:sqoop:rwx /data

The picture below shows that a newly created sub-directory under the /data directory gets the default ACLs automatically.
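The inheritance can be sketched as follows; the sub-directory name is hypothetical:

```shell
# Set a default ACL entry on /data (from the example above)
hdfs dfs -setfacl -m default:user:sqoop:rwx /data
# Create a new sub-directory; it inherits the default ACLs
hdfs dfs -mkdir /data/new_subdir
hdfs dfs -getfacl /data/new_subdir
```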



8) A + symbol in ls command output indicates that a file has an ACL defined on it.

The picture below shows the plus symbol on the /data directory, as it has ACLs defined on it.
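A sketch of what to look for; the permission bits, owner, and timestamp below are illustrative only:

```shell
hdfs dfs -ls /
# Note the trailing + in the permissions column, e.g.:
#   drwxr-xr-x+  - hdfs hdfs  0 2017-09-02 13:34 /data
```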



9) The -k option of the setfacl command removes default ACLs.

Example :

hdfs dfs -setfacl -k /data

The picture below shows how to remove default ACLs on /data directory in HDFS.




10) The -b option of the setfacl command removes all ACL entries except the base (user, group, and others) entries.

Example :

hdfs dfs -setfacl -b /data

The picture below shows how to retain base ACLs using the -b option.





11) The -x option of the setfacl command removes the specified ACL entries.

Example :

hdfs dfs -setfacl -x user:hive /data

The picture below shows removing user hive's permissions on the /data directory.



12) The --set option of the setfacl command replaces all existing ACLs with the new ACLs specified.
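A sketch of --set; note that the new ACL specification must include the base user, group, and other entries, and the entries shown here are illustrative:

```shell
# Replace all ACLs on /data with the given set of entries
hdfs dfs -setfacl --set user::rw-,user:hive:rw-,group::r--,other::--- /data
hdfs dfs -getfacl /data
```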



Limitations

1) ACLs are not allowed on snapshot directories.

2) Only 32 ACL entries per file are allowed as of now.

3) ACL information is maintained in memory by the NameNode. A large number of ACLs will increase the load on the NameNode.


Exploring snapshots in HDFS

An HDFS snapshot is a saved copy of an existing directory. Snapshots are useful for restoring corrupt data. In this article we will learn how to manage HDFS snapshots.

Practice the commands below to get a practical understanding of HDFS snapshots.


1) Create a local file with sample numbers.
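Since the screenshot is not available, one simple way to create such a file is shown below; the file name numbers matches the later steps.

```shell
# Create a local file named "numbers" containing the numbers 1 to 10
seq 1 10 > numbers
cat numbers
```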



2) Create a folder on HDFS and upload the local file to it

The following commands create a directory called numbers under the HDFS directory /user/hdfs and upload the local file called numbers to it.

hdfs dfs -mkdir /user/hdfs/numbers
hdfs dfs -ls /user/hdfs/numbers
hdfs dfs -put numbers /user/hdfs/numbers







3) Try to create a snapshot on an HDFS directory

Snapshots cannot be created on a directory directly. We need to enable snapshots on the directory before creating them.

A "Directory is not a snapshottable directory" error is thrown if snapshots are not enabled.

hdfs dfs -createSnapshot /user/hdfs/numbers





4) Allow snapshots and create snapshots

The allowSnapshot command enables snapshots on an HDFS directory.

The following commands first enable snapshots on /user/hdfs/numbers and then create a snapshot of the same directory.


hdfs dfsadmin -allowSnapshot /user/hdfs/numbers
hdfs dfs -createSnapshot /user/hdfs/numbers


5) List snapshots using ls command

We can check the snapshots of a directory using the ls command. Snapshots of a directory are stored in its .snapshot sub-directory.

hdfs dfs -ls /user/hdfs/numbers/.snapshot
hdfs dfs -ls /user/hdfs/numbers/.snapshot/s20170902-133455.787

The picture below shows that the HDFS directory /user/hdfs/numbers has a file called numbers that is also saved in the snapshot directory /user/hdfs/numbers/.snapshot/s20170902-133455.787.

If the numbers file in /user/hdfs/numbers is corrupted, we can restore it from the /user/hdfs/numbers/.snapshot/s20170902-133455.787 directory.




6) List snapshottable directories in entire HDFS

The lsSnapshottableDir command lists all HDFS directories that have snapshots enabled.

hdfs lsSnapshottableDir




8) Create a snapshot with a specific name

By default, snapshots are created with a timestamp as the folder name. We can also give a snapshot a name at the time of creating it.

The command below creates a snapshot called secondSS on HDFS directory /user/hdfs/numbers.

hdfs dfs -createSnapshot /user/hdfs/numbers secondSS




9) Delete file from HDFS folder

The command below deletes the file numbers from the directory /user/hdfs/numbers so that we can see how to restore it.

hdfs dfs -rm /user/hdfs/numbers/numbers



10) Restore snapshot from HDFS directory

Snapshots are restored using the HDFS cp command.

hdfs dfs -cp /user/hdfs/numbers/.snapshot/secondSS/numbers /user/hdfs/numbers




11) Try to disable snapshots

We need to delete all snapshots before disabling snapshots on an HDFS directory.

hdfs dfsadmin -disallowSnapshot /user/hdfs/numbers





12) Delete snapshots and disallow snapshot

The commands below first delete all snapshots and then disable snapshots.

hdfs dfs -deleteSnapshot /user/hdfs/numbers secondSS
hdfs dfsadmin -disallowSnapshot /user/hdfs/numbers




13) Rename a snapshot

The renameSnapshot command is used to change the name of a snapshot.

hdfs dfs -renameSnapshot /user/hdfs/numbers secondSS thirdSS



We hope this article helped you learn about HDFS snapshots.

Happy Hadooping.