Environment for practising Hadoop

Most Hadoop aspirants face a common problem: how to set up all the Hadoop components so that they can practise ecosystem tools such as Hive, Pig and Oozie. Leading Hadoop vendors already have a solution for this problem.
They have created virtual machines (VMs) with all Hadoop components installed, and you can use them directly.
These VMs need to run on top of an existing operating system. In this article I am going to cover how to set them up.
Below are the prerequisites for this setup.


1). RAM 4GB and above


2). Hard disk 50 GB free space


3). Virtualization technology

Hardware virtualization is a CPU feature that is switched on or off in the BIOS, and on many machines it is enabled by default. You can go to the BIOS settings to check. You should see a virtualization technology option, usually under the advanced settings; the exact name and location vary by vendor. If it is disabled, please enable it.


4). 64 bit operating system
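
Some of the prerequisites above can be checked from a script. The following is only a rough sketch in Python, using nothing but the standard library. The function names are my own, and the RAM lookup relies on `os.sysconf`, which is not available on Windows, so there it simply returns `None`. It cannot tell you whether virtualization is enabled in the BIOS.

```python
import os
import shutil
import sys

MIN_RAM_GB = 4     # prerequisite 1 from the list above
MIN_DISK_GB = 50   # prerequisite 2 from the list above


def total_ram_gb():
    """Return physical RAM in GB, or None if the platform does not expose it."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1024 ** 3
    except (AttributeError, ValueError, OSError):
        return None


def free_disk_gb(path="."):
    """Return the free space in GB on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024 ** 3


def is_64bit_python():
    # A 64-bit interpreter implies a 64-bit OS. A 32-bit Python on a
    # 64-bit OS would report False here, so treat False as "check manually".
    return sys.maxsize > 2 ** 32


def check_prerequisites(ram_gb, disk_gb, is_64bit):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # An unknown RAM value (None) is skipped rather than reported.
    if ram_gb is not None and ram_gb < MIN_RAM_GB:
        problems.append("RAM is %.1f GB, need at least %d GB" % (ram_gb, MIN_RAM_GB))
    if disk_gb < MIN_DISK_GB:
        problems.append("Free disk is %.1f GB, need at least %d GB" % (disk_gb, MIN_DISK_GB))
    if not is_64bit:
        problems.append("A 64-bit operating system was not detected")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(total_ram_gb(), free_disk_gb(), is_64bit_python()):
        print(problem)
```

Note that a machine sold as 4 GB usually reports slightly less than that to the operating system, so a RAM warning on such a machine is not necessarily a real problem.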


In this article I cover the VMs from the following vendors:


Cloudera
Hortonworks


1. Download Oracle VirtualBox


Download Oracle VirtualBox.
This will enable you to create virtual machines on top of your existing operating system.


2. Download the virtual machine


Download a virtual machine that already has all the Hadoop components set up, so you need not worry about installing them.
Leading vendors provide VMs for this purpose.
Download either the Cloudera QuickStart VM or the Hortonworks Sandbox.
For both, download the version built for Oracle VirtualBox.
If you use different virtualization software, choose the matching version instead.


3. Extract the virtual machine to a folder
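
If you prefer to script the extraction, here is a small sketch in Python. It assumes the download is a zip or tar archive, which is what `shutil.unpack_archive` understands. If your vendor ships a .7z file or a bare .ova file, this will not apply as written. The function name and the list of image suffixes are my own choices for illustration.

```python
import shutil
from pathlib import Path

# File types VirtualBox can typically import or attach (assumed list).
IMAGE_SUFFIXES = {".ova", ".ovf", ".vmdk", ".vdi"}


def extract_vm(archive_path, target_dir):
    """Unpack the archive into target_dir and return the VM image files found."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Works for zip and tar-based archives only.
    shutil.unpack_archive(str(archive_path), str(target))
    return sorted(p for p in target.rglob("*") if p.suffix.lower() in IMAGE_SUFFIXES)
```

The returned paths are the files you would point VirtualBox at later when it asks for the virtual image.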

4. Install VirtualBox


If you are on Windows, you can just double-click the installer and it will take care of everything.
The installation is straightforward.




5. Set up the virtual machine

    5.1 Click on New
        Specify a name
        Select Type as Linux
        Select Version as Other Linux (64-bit)


    5.2 Allocate around 2 GB of RAM to the virtual machine.


    5.3 Point to the virtual image file extracted in the earlier step.


You should now see one more VM added.


6. Start the virtual machine


Click on the Start button. It will start your VM, and you can now use any Hadoop component.
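
The same New, memory, disk and Start steps can also be done from the command line with `VBoxManage`, the tool that ships with VirtualBox. The sketch below only builds and prints the commands by default. The VM name `hadoop-sandbox` and the disk path are placeholders of mine, and I am assuming the extracted image is a .vmdk or .vdi disk attached to a SATA controller, which may not match what your vendor's image expects.

```python
import subprocess


def vboxmanage_commands(vm_name, disk_image, memory_mb=2048):
    """Build the VBoxManage calls that mirror the GUI steps above."""
    return [
        # New: register a VM of type "Other Linux (64-bit)"
        ["VBoxManage", "createvm", "--name", vm_name, "--ostype", "Linux_64", "--register"],
        # Allocate around 2 GB of RAM
        ["VBoxManage", "modifyvm", vm_name, "--memory", str(memory_mb)],
        # Add a disk controller, then point it at the extracted image
        ["VBoxManage", "storagectl", vm_name, "--name", "SATA", "--add", "sata"],
        ["VBoxManage", "storageattach", vm_name, "--storagectl", "SATA",
         "--port", "0", "--device", "0", "--type", "hdd", "--medium", str(disk_image)],
        # Start the VM
        ["VBoxManage", "startvm", vm_name],
    ]


def run_all(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder name and path; replace them with your own.
    run_all(vboxmanage_commands("hadoop-sandbox", "disk.vmdk"))
```

Run it once in dry-run mode first and read the printed commands before letting it execute anything.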




All the best. Happy Hadooping.