Real time analytics using Hadoop and Elasticsearch (Big Mountain Data 2014)

I recently spoke at the Big Mountain Data 2014 conference. Here are the slides and the steps for the demo.

Link to the slides : http://www.slideshare.net/abhishek376/real-time-analytics-using-hadoop-and-elasticsearch

Hadoop 2.4.0
Hive 0.13.0
Elasticsearch 1.3.4 (http://www.elasticsearch.org/download/)
HDP 2.1 (http://hortonworks.com/products/hortonworks-sandbox/)

The HDP sandbox doesn’t come with Elasticsearch installed, so it has to be installed by hand.

INSTALLING ELASTICSEARCH:
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.4.noarch.rpm
rpm -Uvh elasticsearch-1.3.4.noarch.rpm
chkconfig --add elasticsearch
service elasticsearch start
cd /usr/share/elasticsearch/
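
The service can take a few seconds to start answering HTTP requests after "service elasticsearch start" returns. The following is a minimal sketch of a polling helper; the function name wait_for_es and its arguments are made up for illustration, and it assumes curl is available on the sandbox.

```shell
# Hypothetical helper: poll an Elasticsearch HTTP endpoint until it responds.
# Usage: wait_for_es URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_es() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: quiet, -f: treat HTTP errors as failure, --max-time: cap each try
    if curl -sf --max-time 2 "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # No successful response within the allowed attempts
  return 1
}
```

For example, "wait_for_es http://localhost:9200 30 1 && echo up" waits up to roughly 30 seconds for the default HTTP port before giving up.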

INSTALLING HEAD AND BIGDESK PLUGINS:
bin/plugin --install mobz/elasticsearch-head
bin/plugin --install lukas-vlcek/bigdesk

INSERTING DATA INTO MYSQL:
drop database if exists bigmountaindata;
create database bigmountaindata;

use bigmountaindata;
create table gender(id int, height int, gender varchar(50));
insert into gender values(1,6,'Male');
insert into gender values(2,5,'Male');
insert into gender values(3,6,'Female');

select * from gender;

USING SQOOP TO IMPORT DATA INTO HIVE:
Hive:
create database bigmountaindata;

SQOOP:
sqoop import --connect 'jdbc:mysql://127.0.0.1:3306/bigmountaindata' --username=root \
  --table gender -m 1 --class-name bigmountaindata.gender \
  --outdir /tmp/src/generated --target-dir /tmp/sqoop-import/bigmountaindata.gender \
  --hive-import --hive-drop-import-delims --hive-table bigmountaindata.gender

USING HIVE EXTERNAL TABLES TO PUSH DATA INTO ELASTICSEARCH:
add jar /usr/lib/hive/lib/aux/elasticsearch-hadoop-hive-2.1.0.jar;
use bigmountaindata;
CREATE EXTERNAL TABLE if not exists es_gender (
  id INT,
  height INT,
  gender STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'bigmountaindata/gender');
insert overwrite table es_gender select id, height, gender from gender;

ELASTICSEARCH:

Number of males and females
http://hortonworks:9200/bigmountaindata/gender/_search?search_type=count
{
  "aggs": {
    "genders": {
      "terms": {
        "field": "gender"
      }
    }
  }
}
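
The query body is posted to the _search URL shown with it. One way to do that from the sandbox shell is sketched below, assuming curl is installed. The function name gender_count_query and the ES_URL variable are made up for this example; nothing is sent unless ES_URL is set (for instance to http://hortonworks:9200).

```shell
# Terms aggregation body from the post, written to stdout.
gender_count_query() {
  cat <<'EOF'
{
  "aggs": {
    "genders": {
      "terms": { "field": "gender" }
    }
  }
}
EOF
}

# ES_URL is an assumed variable; the request is skipped when it is unset.
if [ -n "${ES_URL:-}" ]; then
  # -d @- makes curl read the request body from stdin
  gender_count_query | curl -s -XPOST \
    "$ES_URL/bigmountaindata/gender/_search?search_type=count&pretty" \
    -d @-
fi
```

The same pattern should work for the average-height query by swapping in that body.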

Average height of males and females
http://hortonworks:9200/bigmountaindata/gender/_search?search_type=count
{
  "aggs": {
    "genders": {
      "terms": {
        "field": "gender"
      },
      "aggs": {
        "avg_height": {
          "avg": {
            "field": "height"
          }
        }
      }
    }
  }
}
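
As a sanity check, the numbers both aggregations should produce can be worked out locally from the three sample rows inserted into MySQL earlier. This is only a sketch with awk, not part of the demo itself; the function name gender_stats is made up here. Note that if the gender field is left with the default analyzed string mapping, Elasticsearch will probably report the bucket keys lowercased ("male", "female").

```shell
# Sample rows from the MySQL inserts: id,height,gender
rows='1,6,Male
2,5,Male
3,6,Female'

# Per-gender document count and average height, one line per gender.
gender_stats() {
  awk -F, '
    { count[$3]++; sum[$3] += $2 }
    END {
      for (g in count)
        printf "%s count=%d avg_height=%.1f\n", g, count[g], sum[g] / count[g]
    }' | sort
}

printf '%s\n' "$rows" | gender_stats
# Female count=1 avg_height=6.0
# Male count=2 avg_height=5.5
```

The count column corresponds to doc_count in each terms bucket, and avg_height to the value of the nested avg aggregation.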

Feel free to reach me if you have any questions: @abhishek376 / abhishek376@gmail.com

2 Comments

  1. I am facing an error while creating an External Table to push the data from Hive to ElasticSearch.

    What I have done so far:

    1) Successfully set up Elasticsearch 1.4.4, and it is running.

    2) Successfully set up Hadoop 1.2.1; all the daemons are up and running.

    3) Successfully set up Hive-0.10.0.

    4) Configured elasticsearch-hadoop-1.2.0.jar in both Hadoop/lib and Hive/lib as well.

    5) Successfully created few internal tables in Hive.

    Error coming when executing following command:

    CREATE EXTERNAL TABLE drivers_external (id BIGINT, firstname STRING, lastname STRING, vehicle STRING, speed STRING) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.nodes' = 'localhost', 'es.resource' = 'drivers/driver');

    Error is:

    Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage handler.org.elasticsearch.hadoop.hive.EsStorageHandler FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

    Any Help!

    1. Try adding the jar manually in the Hive console with "add jar /path/to/es-hadoop.jar;", or add it to the aux path in hive-site.xml. You could also upload the jar to HDFS and use hdfs://path/to/jar in the aux path.
