# hive-third-functions
[![Author](https://img.shields.io/badge/Author-%E4%B8%AD%E9%BE%84%E7%A8%8B%E5%BA%8F%E5%91%98-blue.svg)](https://www.shanruifeng.win)
[![Build Status](https://travis-ci.org/aaronshan/hive-third-functions.svg?branch=master)](https://travis-ci.org/aaronshan/hive-third-functions)
[![Documentation Status](https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat)](https://github.com/aaronshan/hive-third-functions/tree/master/README.md)
[![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](https://github.com/aaronshan/hive-third-functions/tree/master/README-zh.md)
[![Release](https://img.shields.io/github/release/aaronshan/hive-third-functions.svg)](https://github.com/aaronshan/hive-third-functions/releases)
[![Stars](https://img.shields.io/github/stars/aaronshan/hive-third-functions.svg?label=Stars&style=social)](https://github.com/aaronshan/hive-third-functions)
## Introduction
A collection of useful custom Hive UDFs, especially array and JSON functions.
> Note:
> hive-third-functions supports Hive 0.11.0 or higher.
## Build
### 1. install dependency
`jdo2-api-2.3-ec.jar` is not available in the Maven Central repository, so it has to be installed manually into the local Maven repository.
```
wget http://www.datanucleus.org/downloads/maven2/javax/jdo/jdo2-api/2.3-ec/jdo2-api-2.3-ec.jar -O "$HOME/jdo2-api-2.3-ec.jar"
mvn install:install-file -DgroupId=javax.jdo -DartifactId=jdo2-api -Dversion=2.3-ec -Dpackaging=jar -Dfile="$HOME/jdo2-api-2.3-ec.jar"
```
### 2. mvn package
```
cd ${project_home}
mvn clean package
```
If you want to skip unit tests, please run:
```
cd ${project_home}
mvn clean package -DskipTests
```
The build generates `hive-third-functions-${version}-shaded.jar` in the `target` directory.
You can also download the jar directly from the [release page](https://github.com/aaronshan/hive-third-functions/releases).
> The latest version is currently `2.2.1`.
## Maven
`hive-third-functions` has been released to the Maven repositories. To add a dependency on `hive-third-functions` using Maven, use the following:
```
<dependency>
    <groupId>com.github.aaronshan</groupId>
    <artifactId>hive-third-functions</artifactId>
    <version>2.2.1</version>
</dependency>
```
## Functions
### 1. string functions
| function| description |
|:--|:--|
|pinyin(string) -> string | Converts Chinese text to pinyin.|
|md5(string) -> string | Returns the MD5 hash of string.|
|sha256(string) -> string | Returns the SHA-256 hash of string.|
|codepoint(string) -> integer | Returns the Unicode code point of the only character of string.|
|hamming_distance(string1, string2) -> bigint | Returns the Hamming distance of string1 and string2.|
|levenshtein_distance(string1, string2) -> bigint | Returns the Levenshtein edit distance of string1 and string2.|
|normalize(string, form) -> varchar | Transforms string with the specified normalization form. form must be one of the following keywords: <span id="jump">Normalize Form Description</span> |
|strpos(string, substring) -> bigint | Returns the starting position of the first instance of substring in string. Positions start with 1. If not found, 0 is returned.|
|split_to_map(string, entryDelimiter, keyValueDelimiter) -> map<varchar, varchar> | Splits string by entryDelimiter and keyValueDelimiter and returns a map. entryDelimiter splits string into key-value pairs. keyValueDelimiter splits each pair into key and value.|
|split_to_multimap(string, entryDelimiter, keyValueDelimiter) -> map<varchar, array<varchar>> | Splits string by entryDelimiter and keyValueDelimiter and returns a map containing an array of values for each unique key. entryDelimiter splits string into key-value pairs. keyValueDelimiter splits each pair into key and value. The values for each key will be in the same order as they appeared in string.|
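
To make the difference between `split_to_map` and `split_to_multimap` concrete, here is a minimal Python sketch of the semantics described in the table. It is an illustration only, not the library's Java implementation. The handling of a repeated key in `split_to_map` (the last value wins here) is an assumption that the table does not specify.

```python
def split_to_map(s, entry_delimiter, key_value_delimiter):
    """Split s into entries, then split each entry into one key and one value."""
    result = {}
    for entry in s.split(entry_delimiter):
        key, value = entry.split(key_value_delimiter, 1)
        result[key] = value  # assumption: a repeated key keeps the last value
    return result


def split_to_multimap(s, entry_delimiter, key_value_delimiter):
    """Like split_to_map, but collect every value seen for a key, in input order."""
    result = {}
    for entry in s.split(entry_delimiter):
        key, value = entry.split(key_value_delimiter, 1)
        result.setdefault(key, []).append(value)
    return result


print(split_to_map("a=1,b=2", ",", "="))           # {'a': '1', 'b': '2'}
print(split_to_multimap("a=1,b=2,a=3", ",", "="))  # {'a': ['1', '3'], 'b': ['2']}
```
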
[Normalize Form Description](#jump)
| Form | Description |
|:--|:--|
| NFD | Canonical Decomposition |
| NFC | Canonical Decomposition, followed by Canonical Composition |
| NFKD | Compatibility Decomposition |
| NFKC | Compatibility Decomposition, followed by Canonical Composition |
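
The four forms are the standard Unicode normalization forms, so their effect can be checked with any Unicode-aware runtime. The sketch below uses Python's standard `unicodedata` module rather than the Hive function itself, and the sample strings are chosen here purely for illustration.

```python
import unicodedata

composed = "\u00e9"     # 'é' as a single code point
decomposed = "e\u0301"  # 'e' followed by a combining acute accent
ligature = "\ufb01"     # the 'fi' ligature, a compatibility character

# Canonical forms convert between the composed and decomposed spellings.
assert unicodedata.normalize("NFD", composed) == decomposed
assert unicodedata.normalize("NFC", decomposed) == composed

# Canonical forms leave compatibility characters untouched ...
assert unicodedata.normalize("NFC", ligature) == ligature
# ... while the compatibility forms replace them with plain equivalents.
assert unicodedata.normalize("NFKC", ligature) == "fi"
assert unicodedata.normalize("NFKD", ligature) == "fi"
```
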
### 2. array functions
| function| description |
|:--|:--|
|array_contains(array<E>, E) -> boolean | whether array contains value or not.|
|array_equals(array<E>, array<E>) -> boolean | whether the two arrays are equal.|
|array_intersect(array, array) -> array | returns the two array's intersection, without duplicates.|
|array_max(array<E>) -> E | returns the maximum value of input array.|
|array_min(array<E>) -> E | returns the minimum value of input array.|
|array_join(array, delimiter, null_replacement) -> string | concatenates the elements of the given array using the delimiter and an optional `null_replacement` to replace nulls.|
|array_distinct(array) -> array | removes duplicate values from the array.|
|array_position(array<E>, E) -> long | returns the position of the first occurrence of the element in array (or 0 if not found).|
|array_remove(array<E>, E) -> array | removes all elements equal to the given element from the array.|
|array_reverse(array) -> array | reverses the order of the array elements.|
|array_sort(array) -> array | sorts and returns the array. The elements of array must be orderable.|
|array_concat(array, array) -> array | concatenates two arrays.|
|array_value_count(array<E>, E) -> long | counts the elements of the array that equal the given value.|
|array_slice(array, start, length) -> array | subsets array starting from index start (or starting from the end if start is negative) with a length of length.|
|array_element_at(array<E>, index) -> E | returns element of array at given index. If index < 0, element_at accesses elements from the last to the first.|
|array_shuffle(array) -> array | Generates a random permutation of the given array.|
|sequence(start, end) -> array<Long> | Generates a sequence of integers from start to end.|
|sequence(start, end, step) -> array<Long> | Generates a sequence of integers from start to end, incrementing by step.|
|sequence(start_date_string, end_date_string, step) -> array<String> | Generates a sequence of date strings from start to end, incrementing by step.|
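
The index conventions of `array_position`, `array_element_at` and `array_slice` are easy to get wrong, so here is a rough Python sketch of one plausible reading of the table. `array_position` returning 0 for "not found" comes from the table and implies 1-based positions. That `array_element_at` and `array_slice` are also 1-based, and that an out-of-range negative start yields an empty array, are assumptions. This is not the library's Java code.

```python
def array_position(arr, element):
    """1-based position of the first occurrence, or 0 if the element is absent."""
    for position, value in enumerate(arr, start=1):
        if value == element:
            return position
    return 0


def array_element_at(arr, index):
    """Assumed 1-based; a negative index counts back from the last element."""
    if index == 0:
        raise ValueError("index 0 is not valid under the 1-based assumption")
    return arr[index - 1] if index > 0 else arr[index]


def array_slice(arr, start, length):
    """Assumed 1-based start; a negative start is measured from the end."""
    if start == 0:
        raise ValueError("start 0 is not valid under the 1-based assumption")
    begin = start - 1 if start > 0 else len(arr) + start
    if begin < 0:
        return []  # assumption: a start before the first element gives an empty array
    return arr[begin:begin + length]


values = [10, 20, 30, 40]
print(array_position(values, 30))     # 3
print(array_position(values, 99))     # 0
print(array_element_at(values, -1))   # 40
print(array_slice(values, 2, 2))      # [20, 30]
print(array_slice(values, -2, 2))     # [30, 40]
```
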
### 3. map functions
| function| description |
|:--|:--|
|map_build(x<K>, y<V>) -> map<K, V>| returns a map created using the given key/value arrays.|
|map_concat(x<K, V>, y<K, V>) -> map<K,V> | returns the union of two maps. If a key is found in both `x` and `y`, that key’s value in the resulting map comes from `y`.|
|map_element_at(map<K, V>, key) -> V | returns value for given `key`, or `NULL` if the key is not contained in the map.|
|map_equals(x<K, V>, y<K, V>) -> boolean | whether map x equals map y.|
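
The map functions behave much like ordinary dictionary operations. The Python sketch below mirrors the descriptions in the table. It is illustrative only, and it assumes that the key and value arrays passed to `map_build` have the same length.

```python
def map_build(keys, values):
    """Pair the i-th key with the i-th value (equal lengths are assumed)."""
    return dict(zip(keys, values))


def map_concat(x, y):
    """Union of two maps; for a key present in both, the value from y wins."""
    merged = dict(x)
    merged.update(y)
    return merged


def map_element_at(m, key):
    """Value stored under key, or None (NULL) if the key is missing."""
    return m.get(key)


x = map_build(["a", "b"], [1, 2])
y = map_build(["b", "c"], [20, 30])
print(map_concat(x, y))          # {'a': 1, 'b': 20, 'c': 30}
print(map_element_at(x, "a"))    # 1
print(map_element_at(x, "zzz"))  # None
```
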
### 4. date functions
| function| description |
|:--|:--|
|day_of_week(date_string \| date) -> int | day of the week: returns 1 for Monday through 7 for Sunday, or null on error.|
|day_of_year(date_string \| date) -> int | day of year. The value ranges from 1 to 366.|
|zodiac_en(date_string \| date) -> string | converts a date to its zodiac sign, in English.|
|zodiac_cn(date_string \| date) -> string | converts a date to its zodiac sign, in Chinese.|
|type_of_day(date_string \| date) -> string | For China only. Returns the type of the day (1: public holiday, 2: regular weekend, 3: regular workday, 4: make-up workday that offsets a holiday); returns -1 on error.|
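
The numbering used by `day_of_week` (Monday = 1, Sunday = 7) matches the ISO weekday convention. The following Python sketch reproduces that numbering and the `day_of_year` range with the standard `datetime` module. The `yyyy-MM-dd` string format and the helper `_to_date` are assumptions made for this example, not part of the library.

```python
from datetime import date, datetime


def _to_date(value):
    """Accept a date object or a 'yyyy-MM-dd' string (format assumed here)."""
    if isinstance(value, date):
        return value
    return datetime.strptime(value, "%Y-%m-%d").date()


def day_of_week(value):
    """1 for Monday through 7 for Sunday; None (null) if the input is invalid."""
    try:
        return _to_date(value).isoweekday()
    except (TypeError, ValueError):
        return None


def day_of_year(value):
    """Ordinal day within the year, from 1 to 366."""
    return _to_date(value).timetuple().tm_yday


print(day_of_week("2017-01-02"))      # 1 (a Monday)
print(day_of_week("2017-01-01"))      # 7 (a Sunday)
print(day_of_week("not-a-date"))      # None
print(day_of_year("2016-12-31"))      # 366 (2016 is a leap year)
```
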
### 5. json functions
| function| description |
|:--|:--|
|json_array_get(json_array, index) -> varchar |returns the element at the specified index into the `json_array`. The index is zero-based.|
|json_array_length(json) -> bigint |returns the array length of `json` (a string containing a JSON array).|
|json_array_extract(json, jsonPath) -> array(varchar) |extracts a JSON array using the given jsonPath.|
|json_array_extract_scalar(json, jsonPath) -> array(varchar) |like `json_array_extract`, but returns the