Tuesday, October 1, 2013

How to create an AWS job flow for HBase from the CLI?

1. Download the Amazon Elastic MapReduce CLI from the location below
wget http://elasticmapreduce.s3.amazonaws.com/elastic-mapreduce-ruby.zip
2. Unzip it
unzip elastic-mapreduce-ruby.zip
3. Create a shell script with the following code (create_jf.sh)
ruby elastic-mapreduce \
  -v \
  --create \
  --alive \
  --region "us-east-1" \
  --access-id <your_access_id> \
  --private-key <your_private_key> \
  --key-pair <your_key_pair> \
  --ami-version latest \
  --visible-to-all-users \
  --hbase \
  --name "HBASE from CLI" \
  --instance-group MASTER \
  --instance-count 1 \
  --instance-type m1.large \
  --instance-group CORE \
  --instance-count 1 \
  --instance-type m1.large \
  --pig-interactive \
  --pig-versions latest \
  --hive-interactive \
  --hive-versions latest \
  --bootstrap-action "s3://elasticmapreduce/bootstrap-actions/configure-hadoop" \
  --args "-m,mapred.tasktracker.map.tasks.maximum=6,-m,mapred.tasktracker.reduce.tasks.maximum=2"
4. Create job flow
bash create_jf.sh

5. Monitor the job flow from the AWS console
Copy the job flow ID printed after a successful run of create_jf.sh and search for it in the AWS EMR console.
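
If you would rather not copy the ID by hand, the small helper below is one way to pull it out of the script's output. It is only a sketch: extract_job_flow_id is a made-up name, and it assumes the CLI reports the new job flow on a line such as "Created job flow j-XXXXXXXX", so check it against the output you actually see (the -v flag prints extra text).

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the EMR CLI): read text on stdin and
# print the first token that looks like a job flow ID (j- followed by
# uppercase letters and digits). Prints nothing if no ID is found.
extract_job_flow_id() {
  grep -oE 'j-[A-Z0-9]+' | head -n 1
}

# Example usage, commented out because it starts a real, billable cluster:
# JOBFLOW_ID=$(bash create_jf.sh | extract_job_flow_id)
# echo "Job flow: $JOBFLOW_ID"
#
# The same CLI can then report on the job flow; pass the same
# --access-id, --private-key and --region options used in create_jf.sh:
# ruby elastic-mapreduce --describe --jobflow "$JOBFLOW_ID"
```

Because the job flow is created with --alive, it keeps running until you terminate it, so having the ID in a variable is also handy when you shut it down later.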
