Apache Pig - Stock Market Analytics
Here I have a large dummy data set from a stock exchange. We will use Apache Pig to find the maximum closing price for each stock symbol. Follow the steps below to get there.
Please download the dummy data set from Stock Market Dummy Data.
I am assuming Apache Hadoop and Pig are up and running on your machine.
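Create a Directory and Load the stock_dummy_data File into HDFS
Navigate to the directory that contains the dummy data file, then run the following commands. The last one starts the Pig grunt shell in MapReduce mode.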
$ hadoop fs -mkdir /user/training/PIGDATA
$ hadoop fs -mkdir /user/training/PIGDATA/PIG_UDF_DATA
$ hadoop fs -copyFromLocal stock_dummy_data /user/training/PIGDATA/PIG_UDF_DATA
$ pig -x mapreduce
Processing Data using Apache Pig
Step: 1 - Load the dataset with column names and datatypes
grunt> stock_records = LOAD '/user/training/PIGDATA/PIG_UDF_DATA/stock_dummy_data'
    USING PigStorage(',')
    AS (exchange:chararray, symbol:chararray, date:datetime,
        open:float, high:float, low:float, close:float, volume:int, adj_close:float);
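The schema above expects nine comma-separated fields per line. As a rough sanity check before loading, the local copy of the file can be scanned for lines with a different field count. This is only a sketch to run in a regular shell (not in grunt); the function name check_field_count is made up for illustration and is not part of Hadoop or Pig.

```shell
# Hypothetical helper: report lines whose comma-separated field count
# differs from the expected count.
# Usage: check_field_count <file> <expected_count>
# Prints "line <n>: <count> fields" for each mismatch and returns 1 if any exist.
check_field_count() {
  awk -F',' -v want="$2" '
    NF != want { printf "line %d: %d fields\n", NR, NF; bad = 1 }
    END { exit (bad ? 1 : 0) }
  ' "$1"
}
```

For the file used here, the call would be check_field_count stock_dummy_data 9. No output and a zero exit status mean every line has nine fields.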
Step: 2 - Group records by symbol
grunt> group_by_symbol = GROUP stock_records BY symbol;
Step: 3 - Calculate the maximum closing price
grunt> max_closing_price = FOREACH group_by_symbol GENERATE group, MAX(stock_records.close) AS maxclose;
Step: 4 - Store the output using the STORE command
grunt> STORE max_closing_price INTO '/home/training/Desktop/output/pig/stocks' USING PigStorage(',');
Output:
Input(s):
Successfully read 540 records (33176 bytes) from: "/user/training/PIGDATA/PIG_UDF_DATA/stock_dummy_data"
Output(s):
Successfully stored 1 records (12 bytes) in: "/home/training/Desktop/output/pig/stocks"
Counters:
Total records written : 1
Total bytes written : 12
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201601290600_0015
2016-02-04 02:30:22,583 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
Step: 5 - Check the output
grunt> cat /home/training/Desktop/output/pig/stocks;
Output:
SZIAS,35.67
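To cross-check this result outside Hadoop, the same per-symbol maximum can be computed on the local copy of the file with awk. This is only a sketch: it assumes the nine-column layout from Step 1 (symbol in field 2, close in field 7), and the function name max_close_by_symbol is made up for illustration. It prints each maximum exactly as it is written in the file, so the number formatting may differ slightly from Pig's float output.

```shell
# Hypothetical helper: print "symbol,max_close" for every symbol in a
# comma-separated stock file, sorted by symbol.
# Assumes field 2 is the symbol and field 7 is the closing price.
max_close_by_symbol() {
  awk -F',' '
    # Remember the close value with the largest numeric value per symbol.
    !($2 in max) || ($7 + 0) > (max[$2] + 0) { max[$2] = $7 }
    END { for (s in max) printf "%s,%s\n", s, max[s] }
  ' "$1" | sort
}
```

Run max_close_by_symbol stock_dummy_data from the directory that holds the file and compare the lines it prints with the output of the cat command above.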
Hope you have enjoyed the article.
Author : Iqubal Mustafa Kaki, Technical Specialist
Want to connect with me
If you want to connect with me, please connect through my email - iqubal.kaki@gmail.com