Tuesday, February 9, 2016

Apache Pig - Stock Market Analytics

Here I have a large dummy data set of stock exchange records. We will use Apache Pig to find the maximum closing price for each stock symbol. Please follow the steps below.

Please download the dummy data set from Stock Market Dummy Data.

I am assuming that Apache Hadoop and Pig are up and running on your machine.

Create Directories and Load the stock_dummy_data File into HDFS

Navigate to the directory that contains the dummy data file

$ hadoop fs -mkdir /user/training/PIGDATA

$ hadoop fs -mkdir /user/training/PIGDATA/PIG_UDF_DATA

$ hadoop fs -copyFromLocal stock_dummy_data /user/training/PIGDATA/PIG_UDF_DATA

$ pig -x mapreduce

Processing Data using Apache Pig

Step: 1 - Load dataset with column names and datatypes

grunt> stock_records = LOAD '/user/training/PIGDATA/PIG_UDF_DATA/stock_dummy_data' USING PigStorage(',') AS (exchange:chararray, symbol:chararray, date:datetime, open:float, high:float, low:float, close:float, volume:int, adj_close:float);
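
To make the schema concrete, here is a minimal plain-Python sketch of what this LOAD statement does to one line of the file: split it on the comma and cast each field to its declared type. The sample line is made up for illustration and is not taken from the data set, and the yyyy-MM-dd date format is an assumption. As in Pig, a field that cannot be cast to its declared type becomes null (None here) instead of failing the load.

```python
from datetime import datetime

# Column names and types mirror the AS (...) clause of the LOAD statement.
SCHEMA = [
    ("exchange", str), ("symbol", str),
    ("date", lambda s: datetime.strptime(s, "%Y-%m-%d")),  # assumed date format
    ("open", float), ("high", float), ("low", float), ("close", float),
    ("volume", int), ("adj_close", float),
]

def parse_line(line, sep=","):
    """Split one text line and cast each field; a failed cast yields None."""
    fields = line.rstrip("\n").split(sep)
    record = {}
    for i, (name, cast) in enumerate(SCHEMA):
        try:
            record[name] = cast(fields[i])
        except (IndexError, ValueError):
            # Missing or malformed field: store null, like Pig does.
            record[name] = None
    return record

# Made-up sample line, for illustration only.
sample = "NYSE,SZIAS,2010-02-08,34.10,35.90,33.80,35.67,120000,35.67"
row = parse_line(sample)
print(row["symbol"], row["close"], row["volume"])
```

If the maximum in Step 3 looks wrong, a null produced by a failed cast in this step is a good first thing to check.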

Step: 2 - Group records by symbol

grunt> group_by_symbol = GROUP stock_records BY symbol;
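
GROUP produces one tuple per distinct symbol with two fields: group (the key) and a bag named stock_records that holds every input record with that key. The following plain-Python sketch shows the same shape, using made-up (symbol, close) pairs that are not from the data set.

```python
from collections import defaultdict

# Made-up (symbol, close) pairs, for illustration only.
stock_records = [("AAA", 10.5), ("BBB", 7.25), ("AAA", 12.0), ("BBB", 6.5)]

# GROUP ... BY symbol: one entry per distinct key, holding a bag
# (here a list) of every input record that carries that key.
group_by_symbol = defaultdict(list)
for record in stock_records:
    symbol = record[0]
    group_by_symbol[symbol].append(record)

for group, bag in group_by_symbol.items():
    print(group, bag)
```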

Step: 3 - Calculate the maximum closing price

grunt> max_closing_price = FOREACH group_by_symbol GENERATE group, MAX(stock_records.close) as maxclose;
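
In this statement, stock_records.close projects the close column out of each group's bag, and MAX reduces that bag to a single value. Pig's MAX ignores null values. A minimal plain-Python sketch of the same computation follows; the grouped data is made up, and pig_max is a hypothetical helper name used only for this illustration.

```python
# Made-up grouped data: symbol -> bag of closing prices (None stands for null).
group_by_symbol = {
    "AAA": [10.5, 12.0, None],
    "BBB": [7.25, 6.5],
}

def pig_max(bag):
    """Largest non-null value in the bag, or None if the bag has none."""
    values = [v for v in bag if v is not None]
    return max(values) if values else None

# FOREACH group_by_symbol GENERATE group, MAX(stock_records.close) AS maxclose
max_closing_price = [(group, pig_max(bag)) for group, bag in group_by_symbol.items()]
print(max_closing_price)
```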

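Step: 4 - Store the output with the STORE command

Because Pig was started in mapreduce mode, the output path below is created on HDFS, not on the local file system.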

grunt> STORE max_closing_price INTO '/home/training/Desktop/output/pig/stocks' USING PigStorage(',');

Output:
Input(s):
Successfully read 540 records (33176 bytes) from: "/user/training/PIGDATA/PIG_UDF_DATA/stock_dummy_data"
Output(s):
Successfully stored 1 records (12 bytes) in: "/home/training/Desktop/output/pig/stocks"
Counters:
Total records written : 1
Total bytes written : 12
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201601290600_0015
2016-02-04 02:30:22,583 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!

Step: 5 - Check the output

grunt> cat /home/training/Desktop/output/pig/stocks;

Output:
SZIAS,35.67
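
The job log above reports 1 record and 12 bytes written. As a quick sanity check, assuming the stored line is the symbol, the ',' delimiter passed to PigStorage with no space after it, the value, and a trailing newline, the byte count can be reproduced in Python:

```python
# One stored record: key, delimiter, value, newline.
line = ",".join(["SZIAS", "35.67"]) + "\n"
print(len(line.encode("utf-8")))
```

A single output record also means that all 540 input records share the same symbol.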
























I hope you enjoyed the article.

Author : Iqubal Mustafa Kaki, Technical Specialist

Want to connect with me?
Please reach me by email: iqubal.kaki@gmail.com
