Apache Pig Tutorial 3: Batch Mode


In batch mode, you put your Pig Latin statements in a Pig script and pass that script to the "pig" command, which can run it either locally or on a Hadoop cluster.

Example

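The Pig Latin statements in the Pig script (id.pig) extract all user IDs from the /etc/passwd file. First, copy the /etc/passwd file to your local working directory. Next, run the Pig script from the command line in one of the execution modes shown below. The STORE operator writes the results to id.out, which Pig creates as an output directory containing one or more part files. In the cluster modes, the input and output paths resolve against HDFS rather than the local file system, so the passwd file must be uploaded there first.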
/* id.pig */

A = load 'passwd' using PigStorage(':');  -- load the passwd file 
B = foreach A generate $0 as id;  -- extract the user IDs 
store B into 'id.out';  -- write the results to id.out
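As a quick sanity check of what the script is expected to produce, the same first-field extraction can be sketched in plain shell. This is not part of Pig: cut is used here only as an independent cross-check, and sample_passwd and expected_ids.txt are hypothetical file names with made-up sample data.

```shell
# Build a small sample file in passwd format (hypothetical data, for illustration only).
cat > sample_passwd <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash
EOF

# Split each line on ':' and keep only the first field, mirroring
# PigStorage(':') followed by "generate $0" in id.pig.
cut -d: -f1 sample_passwd > expected_ids.txt
cat expected_ids.txt
```

Running id.pig against the same input should yield the same list of IDs, one per line.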
Local Mode
$ pig -x local id.pig
Tez Local Mode
$ pig -x tez_local id.pig
Spark Local Mode
$ pig -x spark_local id.pig
MapReduce Mode
$ pig id.pig
or
$ pig -x mapreduce id.pig
Tez Mode
$ pig -x tez id.pig
Spark Mode

$ pig -x spark id.pig
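The whole local-mode workflow from the example can be sketched as a short shell session. It assumes that the pig launcher is on your PATH and that id.pig sits in the current directory; the guard simply skips the Pig run when either is missing. Reading the results with a part-* glob is also an assumption about how the output files are named inside id.out.

```shell
# Copy the input file next to the script, as the example describes.
cp /etc/passwd ./passwd

if command -v pig >/dev/null 2>&1 && [ -f id.pig ]; then
    # STORE refuses to overwrite an existing output location,
    # so clear any results from a previous run first.
    rm -rf id.out
    pig -x local id.pig
    # The results are written as part files inside the id.out directory.
    cat id.out/part-*
else
    echo "pig or id.pig not found; skipping the Pig run"
fi
```

To try another mode, swap the -x argument for one of the values shown above.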
