How to add UDF permanently to Hive


To learn more about Hive UDFs, see the Hive Language Manual (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF).

I recently developed a bunch of Hive UDFs, and to call them I had to add the jar files (ADD JAR) and create the temporary functions (CREATE TEMPORARY FUNCTION) in every Hive session. I started digging into the code to find out whether I could modify the Java files and rebuild Hive so that the functions are always available. This post describes how I did it.

1. Download the source code from

 http://www.apache.org/dyn/closer.cgi/hive/

At the time of writing, 0.10.0 is the most stable version of Hive.

2. Extract the tarball and copy your UDF Java files to

 \hive-0.10.0\src\ql\src\java\org\apache\hadoop\hive\ql\udf

If the UDF is a generic UDF, copy it to

 hive-0.10.0\src\ql\src\java\org\apache\hadoop\hive\ql\udf\generic

3. Before you copy the Java files into the ql folder, change their package declaration to match the destination folder:

 package org.apache.hadoop.hive.ql.udf.generic;  or package org.apache.hadoop.hive.ql.udf;
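
If you have many UDF files, the package change can be scripted. Here is a minimal sketch, assuming each file has a single package declaration on its own line. PackageRewriter and the com.example.udf package are hypothetical names used only for illustration; they are not part of Hive.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: swaps the package declaration in a Java source string.
final class PackageRewriter {
    // Matches a package declaration that starts a line, e.g. "package com.example.udf;"
    private static final Pattern PACKAGE_DECL =
            Pattern.compile("(?m)^[ \\t]*package\\s+[\\w.]+\\s*;");

    static String rewritePackage(String source, String newPackage) {
        String declaration = "package " + newPackage + ";";
        Matcher m = PACKAGE_DECL.matcher(source);
        if (!m.find()) {
            // No declaration found (default package): add one at the top.
            return declaration + "\n\n" + source;
        }
        // Replace only the first declaration; quoteReplacement guards against '$' and '\'.
        return m.replaceFirst(Matcher.quoteReplacement(declaration));
    }

    public static void main(String[] args) {
        // com.example.udf is an assumed original package, not from Hive.
        String before = "package com.example.udf;\n\npublic class GenericUDFRank {}\n";
        System.out.print(rewritePackage(before, "org.apache.hadoop.hive.ql.udf.generic"));
    }
}
```

The rest of the file is left untouched, so imports that pointed at your old package still need a manual check.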

4. Now that the files are in the udf folder, you have to tell Hive how to find these functions. To do this, you have to change FunctionRegistry.java. You can find FunctionRegistry.java in

 \hive-0.10.0\src\ql\src\java\org\apache\hadoop\hive\ql\exec

5. Make the following changes to FunctionRegistry.java:

i)  Import the UDF class

 ex: import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFToMap;

ii)  Register the function. If it's a UDAF:

 registerGenericUDAF("to_map", new GenericUDAFToMap());

Otherwise (a generic UDF):

 registerGenericUDF("rank", GenericUDFRank.class);
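
To see why the registration call matters, here is a toy model of a name-to-class registry. It is not Hive's FunctionRegistry: MiniRegistry, register, lookup and RankStandIn are all hypothetical names, and the sketch assumes functions are registered once when the class loads and looked up case-insensitively by name.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Toy stand-in for a function registry; NOT Hive's actual implementation.
final class MiniRegistry {
    private static final Map<String, Class<?>> FUNCTIONS = new LinkedHashMap<>();

    static {
        // Built-in functions are registered here, once, when the class is loaded.
        // Adding a line like this is the analogue of the edit in step 5.
        register("rank", RankStandIn.class);
    }

    static void register(String name, Class<?> implementation) {
        // Names are normalized so that RANK and rank resolve to the same function.
        FUNCTIONS.put(name.toLowerCase(Locale.ROOT), implementation);
    }

    static Class<?> lookup(String name) {
        // Returns null when nothing was registered under this name.
        return FUNCTIONS.get(name.toLowerCase(Locale.ROOT));
    }

    // Placeholder for a real UDF class such as GenericUDFRank.
    static final class RankStandIn {}

    public static void main(String[] args) {
        System.out.println(lookup("RANK") != null);   // registered at class load
        System.out.println(lookup("to_map") != null); // never registered
    }
}
```

A class that sits in the udf folder but is never registered behaves like to_map here: it compiles into the jar, but no function name resolves to it.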

6. Once you have made all the changes, navigate to the src folder and build with ant (ant package). When the build finishes, you will have a build folder. The build may fail partway (building Hive completely requires the Thrift compiler and several other tools), but as long as it produces hive-exec-*.jar you are good. Deploying the entire build is recommended, but only hive-exec-*.jar needs to be replaced.
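
Before replacing the jar, it is worth confirming that your UDF class actually made it into the rebuilt hive-exec-*.jar, especially after a partial build. A minimal sketch using only the JDK's java.util.jar API follows; JarCheck is a hypothetical helper name, and the jar path and class name are whatever you pass on the command line.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper: checks whether a jar contains a given class.
final class JarCheck {
    static boolean containsClass(String jarPath, String className) throws IOException {
        // Classes are stored as path entries, e.g. org/apache/.../GenericUDFRank.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarCheck <path-to-jar> <fully.qualified.ClassName>");
            return;
        }
        System.out.println(containsClass(args[0], args[1]) ? "found" : "missing");
    }
}
```

For the example in step 5, the class name to pass would be org.apache.hadoop.hive.ql.udf.generic.GenericUDFRank.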

I also observed that when I built Hive on a Windows machine and copied the jar to a CentOS box, it failed.

Hope this helps someone. Let me know if you have any trouble. You can reach me at @abhishek376 on Twitter.
