
Writing Macros in Pig Latin

We can make Pig Latin scripts more reusable by using macros. A macro is a kind of function written in Pig Latin itself.

We will learn how to write and use macros in this post.

We will use sample emp data like the following.

eno,ename,sal,dno
10,Balu,10000,15
15,Bala,20000,25
30,Sai,30000,15
40,Nirupam,40000,35

Using the above data, I would like to get the employees who belong to department number 15.

First we will write the Pig Latin code without using a macro.


1. Example without Macro

Write the code below in a file called filterwitoutmacro.pig

emp = load '/data/employee' using PigStorage(',') as (eno,ename,sal,dno);
empdno15 = filter emp by $3 == 15;
dump empdno15;

Run the Pig Latin code from the file:

pig -f /path/to/filterwitoutmacro.pig

Now we will create a macro for the filter logic.

2. Same example with macro

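DEFINE is the keyword used to create a macro. The definition also has a returns clause that names the relation the macro gives back.

The relation named in the returns clause must be assigned inside the macro body, where it is written with a $ prefix (for example $x). It is normally the last relation the macro builds.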

2.1 Create a macro

DEFINE myfilter(relvar,colvar) returns x {
$x = filter $relvar by $colvar == 15;
};

The above macro takes two inputs: a relation (relvar) and a column of that relation (colvar).

The macro keeps only the rows in which colvar equals 15 and returns them as the relation x.
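
The value 15 is hard-coded in myfilter. As a minimal sketch of a more general variant, the value to compare against can be passed as a third parameter, since Pig macros accept integer literals as arguments. The macro name myfilterby and the parameter name val are hypothetical and are not used elsewhere in this post.

```pig
-- Hypothetical variant of myfilter: the value to match is a third parameter.
DEFINE myfilterby(relvar, colvar, val) returns x {
    $x = filter $relvar by $colvar == $val;
};

emp = load '/data/employee' using PigStorage(',') as (eno, ename, sal, dno);
-- Keep the employees of department 25 instead of the hard-coded 15.
empdno25 = myfilterby(emp, dno, 25);
dump empdno25;
```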

2.2 Usage of the macro

We can use the myfilter macro like below.

emp = load '/data/employee' using PigStorage(',') as (eno,ename,sal,dno);
empdno15 = myfilter(emp, dno);
dump empdno15;

We can put the macro definition and the code that uses it in the same file, and run that file with the -f option.

pig -f /path/to/myfilterwithembeddedmacro.pig
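
The contents of myfilterwithembeddedmacro.pig are not listed in this post. A minimal sketch, assuming the file simply combines the macro from 2.1 with the usage code from 2.2, with the definition placed before the first call:

```pig
-- myfilterwithembeddedmacro.pig (assumed layout: definition first, then usage)

-- Macro from section 2.1: keeps the rows whose given column equals 15.
DEFINE myfilter(relvar, colvar) returns x {
    $x = filter $relvar by $colvar == 15;
};

-- Usage from section 2.2.
emp = load '/data/employee' using PigStorage(',') as (eno, ename, sal, dno);
empdno15 = myfilter(emp, dno);
dump empdno15;
```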



3. Same example with an external macro

The macro code can also live in a separate file, so that we can use it in different Pig Latin scripts.

To use an external macro file in Pig Latin code, we use the IMPORT statement.

3.1 Write the above macro in a separate file called myfilter.macro

--myfilter.macro
DEFINE myfilter(relvar,colvar) returns x {
$x = filter $relvar by $colvar == 15;
};

3.2 Import the macro file in another Pig Latin script file


IMPORT '/path/to/myfilter.macro';
emp = load '/data/employee' using PigStorage(',') as (eno,ename,sal,dno);
empdno15 = myfilter(emp, dno);
dump empdno15;


Then we run the Pig Latin script file using the -f option.

pig -f /path/to/myfilterwithexternalmacro.pig


So what is the use of a macro?

We can call a macro as many times as we wish, and on different inputs.

For example, continuing with the data above, suppose I want the details of the employee whose employee number is 15.

Then we can write Pig Latin code like below.



IMPORT '/path/to/myfilter.macro';
emp = load '/data/employee' using PigStorage(',') as (eno,ename,sal,dno);
eno15 = myfilter(emp, eno);
dump eno15;


To conclude, we can write highly reusable scripts in Pig Latin using macros.

Also see the official Pig documentation on macros.

To some extent, we can also write reusable scripts in Pig Latin using parameter substitution.
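
As a rough sketch of the parameter substitution approach, the department number can be supplied from the command line with the -param option instead of being written into the script. The file name filterbyparam.pig and the parameter name deptno are hypothetical.

```pig
-- filterbyparam.pig (hypothetical file name)
-- The deptno parameter has no default here, so it must be passed with -param.
emp = load '/data/employee' using PigStorage(',') as (eno, ename, sal, dno);
-- Pig substitutes the parameter value into the script before running it.
empbydno = filter emp by dno == $deptno;
dump empbydno;
```

Such a script could be run as pig -param deptno=15 -f /path/to/filterbyparam.pig. Unlike a macro, the parameter is replaced once for the whole script, so this does not give a reusable block that can be called several times on different relations.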



