Built-in Functions
Aggregate Functions
Function | Description |
---|---|
any(expr) | Returns true if at least one value of `expr` is true. |
any_value(expr[, isIgnoreNull]) | Returns some value of `expr` for a group of rows. If `isIgnoreNull` is true, returns only non-null values. |
approx_count_distinct(expr[, relativeSD]) | Returns the estimated cardinality by HyperLogLog++. `relativeSD` defines the maximum relative standard deviation allowed. |
approx_percentile(col, percentage [, accuracy]) | Returns the approximate `percentile` of the numeric or ANSI interval column `col`, which is the smallest value in the ordered `col` values (sorted from least to greatest) such that no more than `percentage` of `col` values is less than or equal to that value. The value of `percentage` must be between 0.0 and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal that controls approximation accuracy at the cost of memory. A higher value of `accuracy` yields better accuracy; `1.0/accuracy` is the relative error of the approximation. When `percentage` is an array, each value of the array must be between 0.0 and 1.0, and the function returns the approximate percentile array of column `col` at the given percentages. |
array_agg(expr) | Collects and returns a list of non-unique elements. |
avg(expr) | Returns the mean calculated from values of a group. |
bit_and(expr) | Returns the bitwise AND of all non-null input values, or null if none. |
bit_or(expr) | Returns the bitwise OR of all non-null input values, or null if none. |
bit_xor(expr) | Returns the bitwise XOR of all non-null input values, or null if none. |
bitmap_construct_agg(child) | Returns a bitmap with the positions of the bits set from all the values from the child expression. The child expression will most likely be bitmap_bit_position(). |
bitmap_or_agg(child) | Returns a bitmap that is the bitwise OR of all of the bitmaps from the child expression. The input should be bitmaps created from bitmap_construct_agg(). |
bool_and(expr) | Returns true if all values of `expr` are true. |
bool_or(expr) | Returns true if at least one value of `expr` is true. |
collect_list(expr) | Collects and returns a list of non-unique elements. |
collect_set(expr) | Collects and returns a set of unique elements. |
corr(expr1, expr2) | Returns Pearson coefficient of correlation between a set of number pairs. |
count(*) | Returns the total number of retrieved rows, including rows containing null. |
count(expr[, expr...]) | Returns the number of rows for which the supplied expression(s) are all non-null. |
count(DISTINCT expr[, expr...]) | Returns the number of rows for which the supplied expression(s) are unique and non-null. |
count_if(expr) | Returns the number of `TRUE` values for the expression. |
count_min_sketch(col, eps, confidence, seed) | Returns a count-min sketch of a column with the given `eps`, `confidence` and `seed`. The result is an array of bytes, which can be deserialized to a `CountMinSketch` before usage. Count-min sketch is a probabilistic data structure used for frequency estimation using sub-linear space. |
covar_pop(expr1, expr2) | Returns the population covariance of a set of number pairs. |
covar_samp(expr1, expr2) | Returns the sample covariance of a set of number pairs. |
every(expr) | Returns true if all values of `expr` are true. |
first(expr[, isIgnoreNull]) | Returns the first value of `expr` for a group of rows. If `isIgnoreNull` is true, returns only non-null values. |
first_value(expr[, isIgnoreNull]) | Returns the first value of `expr` for a group of rows. If `isIgnoreNull` is true, returns only non-null values. |
grouping(col) | Indicates whether a specified column in a GROUP BY is aggregated or not; returns 1 for aggregated or 0 for not aggregated in the result set. |
grouping_id([col1[, col2 ..]]) | Returns the level of grouping, equal to `(grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn)`. |
histogram_numeric(expr, nb) | Computes a histogram on numeric `expr` using `nb` bins. The return value is an array of (x,y) pairs representing the centers of the histogram's bins. As the value of `nb` is increased, the histogram approximation gets finer-grained, but may yield artifacts around outliers. In practice, 20-40 histogram bins appear to work well, with more bins being required for skewed or smaller datasets. Note that this function creates a histogram with non-uniform bin widths. It offers no guarantees in terms of the mean-squared-error of the histogram, but in practice is comparable to the histograms produced by the R/S-Plus statistical computing packages. Note: the output type of the `x` field in the return value is propagated from the input value consumed in the aggregate function. |
hll_sketch_agg(expr, lgConfigK) | Returns the HllSketch's updatable binary representation. `lgConfigK` (optional) is the log-base-2 of K, where K is the number of buckets or slots for the HllSketch. |
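hll_union_agg(expr, allowDifferentLgConfigK) | Merges the HllSketch binary representations in `expr` into a single sketch and returns it; pass the result to hll_sketch_estimate() to obtain the estimated number of unique values. `allowDifferentLgConfigK` (optional) allows sketches with different lgConfigK values to be unioned (defaults to false). |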
kurtosis(expr) | Returns the kurtosis value calculated from values of a group. |
last(expr[, isIgnoreNull]) | Returns the last value of `expr` for a group of rows. If `isIgnoreNull` is true, returns only non-null values. |
last_value(expr[, isIgnoreNull]) | Returns the last value of `expr` for a group of rows. If `isIgnoreNull` is true, returns only non-null values. |
max(expr) | Returns the maximum value of `expr`. |
max_by(x, y) | Returns the value of `x` associated with the maximum value of `y`. |
mean(expr) | Returns the mean calculated from values of a group. |
median(col) | Returns the median of numeric or ANSI interval column `col`. |
min(expr) | Returns the minimum value of `expr`. |
min_by(x, y) | Returns the value of `x` associated with the minimum value of `y`. |
mode(col[, deterministic]) | Returns the most frequent value for the values within `col`. NULL values are ignored. If all the values are NULL, or there are 0 rows, returns NULL. When multiple values share the greatest frequency, any one of them is returned if `deterministic` is false or not specified, and the lowest value is returned if `deterministic` is true. |
mode() WITHIN GROUP (ORDER BY col) | Returns the most frequent value for the values within `col` (specified in the ORDER BY clause). NULL values are ignored. If all the values are NULL, or there are 0 rows, returns NULL. When multiple values share the greatest frequency, only one value is returned, chosen by sort direction: the smallest value if the sort direction is ascending, or the largest value if it is descending. |
percentile(col, percentage [, frequency]) | Returns the exact percentile value of numeric or ANSI interval column `col` at the given percentage. The value of `percentage` must be between 0.0 and 1.0. The value of `frequency` must be a positive integer. |
percentile(col, array(percentage1 [, percentage2]...) [, frequency]) | Returns the exact percentile value array of numeric column `col` at the given percentage(s). Each value of the percentage array must be between 0.0 and 1.0. The value of `frequency` must be a positive integer. |
percentile_approx(col, percentage [, accuracy]) | Returns the approximate `percentile` of the numeric or ANSI interval column `col`, which is the smallest value in the ordered `col` values (sorted from least to greatest) such that no more than `percentage` of `col` values is less than or equal to that value. The value of `percentage` must be between 0.0 and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal that controls approximation accuracy at the cost of memory. A higher value of `accuracy` yields better accuracy; `1.0/accuracy` is the relative error of the approximation. When `percentage` is an array, each value of the array must be between 0.0 and 1.0, and the function returns the approximate percentile array of column `col` at the given percentages. |
percentile_cont(percentage) WITHIN GROUP (ORDER BY col) | Returns a percentile value based on a continuous distribution of the numeric or ANSI interval column `col` (specified in the ORDER BY clause) at the given `percentage`. |
percentile_disc(percentage) WITHIN GROUP (ORDER BY col) | Returns a percentile value based on a discrete distribution of the numeric or ANSI interval column `col` (specified in the ORDER BY clause) at the given `percentage`. |
regr_avgx(y, x) | Returns the average of the independent variable for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
regr_avgy(y, x) | Returns the average of the dependent variable for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
regr_count(y, x) | Returns the number of non-null number pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
regr_intercept(y, x) | Returns the intercept of the univariate linear regression line for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
regr_r2(y, x) | Returns the coefficient of determination for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
regr_slope(y, x) | Returns the slope of the linear regression line for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
regr_sxx(y, x) | Returns REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
regr_sxy(y, x) | Returns REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
regr_syy(y, x) | Returns REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable. |
skewness(expr) | Returns the skewness value calculated from values of a group. |
some(expr) | Returns true if at least one value of `expr` is true. |
std(expr) | Returns the sample standard deviation calculated from values of a group. |
stddev(expr) | Returns the sample standard deviation calculated from values of a group. |
stddev_pop(expr) | Returns the population standard deviation calculated from values of a group. |
stddev_samp(expr) | Returns the sample standard deviation calculated from values of a group. |
sum(expr) | Returns the sum calculated from values of a group. |
try_avg(expr) | Returns the mean calculated from values of a group and the result is null on overflow. |
try_sum(expr) | Returns the sum calculated from values of a group and the result is null on overflow. |
var_pop(expr) | Returns the population variance calculated from values of a group. |
var_samp(expr) | Returns the sample variance calculated from values of a group. |
variance(expr) | Returns the sample variance calculated from values of a group. |
Examples
-- any
SELECT any(col) FROM VALUES (true), (false), (false) AS tab(col);
+--------+
|any(col)|
+--------+
| true|
+--------+
SELECT any(col) FROM VALUES (NULL), (true), (false) AS tab(col);
+--------+
|any(col)|
+--------+
| true|
+--------+
SELECT any(col) FROM VALUES (false), (false), (NULL) AS tab(col);
+--------+
|any(col)|
+--------+
| false|
+--------+
-- any_value
SELECT any_value(col) FROM VALUES (10), (5), (20) AS tab(col);
+--------------+
|any_value(col)|
+--------------+
| 10|
+--------------+
SELECT any_value(col) FROM VALUES (NULL), (5), (20) AS tab(col);
+--------------+
|any_value(col)|
+--------------+
| NULL|
+--------------+
SELECT any_value(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
+--------------+
|any_value(col)|
+--------------+
| 5|
+--------------+
-- approx_count_distinct
SELECT approx_count_distinct(col1) FROM VALUES (1), (1), (2), (2), (3) tab(col1);
+---------------------------+
|approx_count_distinct(col1)|
+---------------------------+
| 3|
+---------------------------+
-- approx_percentile
SELECT approx_percentile(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
+-------------------------------------------------+
|approx_percentile(col, array(0.5, 0.4, 0.1), 100)|
+-------------------------------------------------+
| [1, 1, 0]|
+-------------------------------------------------+
SELECT approx_percentile(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
+--------------------------------+
|approx_percentile(col, 0.5, 100)|
+--------------------------------+
| 7|
+--------------------------------+
SELECT approx_percentile(col, 0.5, 100) FROM VALUES (INTERVAL '0' MONTH), (INTERVAL '1' MONTH), (INTERVAL '2' MONTH), (INTERVAL '10' MONTH) AS tab(col);
+--------------------------------+
|approx_percentile(col, 0.5, 100)|
+--------------------------------+
| INTERVAL '1' MONTH|
+--------------------------------+
SELECT approx_percentile(col, array(0.5, 0.7), 100) FROM VALUES (INTERVAL '0' SECOND), (INTERVAL '1' SECOND), (INTERVAL '2' SECOND), (INTERVAL '10' SECOND) AS tab(col);
+--------------------------------------------+
|approx_percentile(col, array(0.5, 0.7), 100)|
+--------------------------------------------+
| [INTERVAL '01' SE...|
+--------------------------------------------+
-- array_agg
SELECT array_agg(col) FROM VALUES (1), (2), (1) AS tab(col);
+-----------------+
|collect_list(col)|
+-----------------+
| [1, 2, 1]|
+-----------------+
-- avg
SELECT avg(col) FROM VALUES (1), (2), (3) AS tab(col);
+--------+
|avg(col)|
+--------+
| 2.0|
+--------+
SELECT avg(col) FROM VALUES (1), (2), (NULL) AS tab(col);
+--------+
|avg(col)|
+--------+
| 1.5|
+--------+
-- bit_and
SELECT bit_and(col) FROM VALUES (3), (5) AS tab(col);
+------------+
|bit_and(col)|
+------------+
| 1|
+------------+
-- bit_or
SELECT bit_or(col) FROM VALUES (3), (5) AS tab(col);
+-----------+
|bit_or(col)|
+-----------+
| 7|
+-----------+
-- bit_xor
SELECT bit_xor(col) FROM VALUES (3), (5) AS tab(col);
+------------+
|bit_xor(col)|
+------------+
| 6|
+------------+
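The three bitwise aggregates above fold one operator over the non-null inputs. The following plain-Python sketch (not Spark code; `bit_agg` is a hypothetical helper introduced only for illustration) reproduces the results for the inputs 3 and 5:

```python
from functools import reduce
from operator import and_, or_, xor


def bit_agg(op, values):
    """Fold `op` over the non-null inputs; return None if there are none."""
    non_null = [v for v in values if v is not None]
    return reduce(op, non_null) if non_null else None


print(bit_agg(and_, [3, 5]))  # 0b011 & 0b101 -> 1
print(bit_agg(or_, [3, 5]))   # 0b011 | 0b101 -> 7
print(bit_agg(xor, [3, 5]))   # 0b011 ^ 0b101 -> 6
```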
-- bitmap_construct_agg
SELECT substring(hex(bitmap_construct_agg(bitmap_bit_position(col))), 0, 6) FROM VALUES (1), (2), (3) AS tab(col);
+--------------------------------------------------------------------+
|substring(hex(bitmap_construct_agg(bitmap_bit_position(col))), 0, 6)|
+--------------------------------------------------------------------+
| 070000|
+--------------------------------------------------------------------+
SELECT substring(hex(bitmap_construct_agg(bitmap_bit_position(col))), 0, 6) FROM VALUES (1), (1), (1) AS tab(col);
+--------------------------------------------------------------------+
|substring(hex(bitmap_construct_agg(bitmap_bit_position(col))), 0, 6)|
+--------------------------------------------------------------------+
| 010000|
+--------------------------------------------------------------------+
-- bitmap_or_agg
SELECT substring(hex(bitmap_or_agg(col)), 0, 6) FROM VALUES (X '10'), (X '20'), (X '40') AS tab(col);
+----------------------------------------+
|substring(hex(bitmap_or_agg(col)), 0, 6)|
+----------------------------------------+
| 700000|
+----------------------------------------+
SELECT substring(hex(bitmap_or_agg(col)), 0, 6) FROM VALUES (X '10'), (X '10'), (X '10') AS tab(col);
+----------------------------------------+
|substring(hex(bitmap_or_agg(col)), 0, 6)|
+----------------------------------------+
| 100000|
+----------------------------------------+
-- bool_and
SELECT bool_and(col) FROM VALUES (true), (true), (true) AS tab(col);
+-------------+
|bool_and(col)|
+-------------+
| true|
+-------------+
SELECT bool_and(col) FROM VALUES (NULL), (true), (true) AS tab(col);
+-------------+
|bool_and(col)|
+-------------+
| true|
+-------------+
SELECT bool_and(col) FROM VALUES (true), (false), (true) AS tab(col);
+-------------+
|bool_and(col)|
+-------------+
| false|
+-------------+
-- bool_or
SELECT bool_or(col) FROM VALUES (true), (false), (false) AS tab(col);
+------------+
|bool_or(col)|
+------------+
| true|
+------------+
SELECT bool_or(col) FROM VALUES (NULL), (true), (false) AS tab(col);
+------------+
|bool_or(col)|
+------------+
| true|
+------------+
SELECT bool_or(col) FROM VALUES (false), (false), (NULL) AS tab(col);
+------------+
|bool_or(col)|
+------------+
| false|
+------------+
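The `bool_and` and `bool_or` examples above (and their synonyms `every`, `any` and `some`) show that NULL inputs do not affect the result. A minimal plain-Python sketch of that behavior follows; the helpers are illustrative, and the handling of a group with no non-null values is an assumption rather than something the examples demonstrate:

```python
def bool_and(values):
    """True if every non-null value is true."""
    non_null = [v for v in values if v is not None]
    # Assumption: a group with no non-null values yields NULL (None).
    return all(non_null) if non_null else None


def bool_or(values):
    """True if at least one non-null value is true."""
    non_null = [v for v in values if v is not None]
    # Assumption: a group with no non-null values yields NULL (None).
    return any(non_null) if non_null else None


print(bool_and([None, True, True]))   # True
print(bool_and([True, False, True]))  # False
print(bool_or([None, True, False]))   # True
print(bool_or([False, False, None]))  # False
```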
-- collect_list
SELECT collect_list(col) FROM VALUES (1), (2), (1) AS tab(col);
+-----------------+
|collect_list(col)|
+-----------------+
| [1, 2, 1]|
+-----------------+
-- collect_set
SELECT collect_set(col) FROM VALUES (1), (2), (1) AS tab(col);
+----------------+
|collect_set(col)|
+----------------+
| [1, 2]|
+----------------+
-- corr
SELECT corr(c1, c2) FROM VALUES (3, 2), (3, 3), (6, 4) as tab(c1, c2);
+------------------+
| corr(c1, c2)|
+------------------+
|0.8660254037844387|
+------------------+
-- count
SELECT count(*) FROM VALUES (NULL), (5), (5), (20) AS tab(col);
+--------+
|count(1)|
+--------+
| 4|
+--------+
SELECT count(col) FROM VALUES (NULL), (5), (5), (20) AS tab(col);
+----------+
|count(col)|
+----------+
| 3|
+----------+
SELECT count(DISTINCT col) FROM VALUES (NULL), (5), (5), (10) AS tab(col);
+-------------------+
|count(DISTINCT col)|
+-------------------+
| 2|
+-------------------+
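The three `count` forms differ only in how they treat NULLs and duplicates. As a rough illustration in plain Python (hypothetical helpers, with each row modeled as a tuple of expression values):

```python
def count_star(rows):
    """count(*): every row counts, including rows that contain NULL."""
    return len(rows)


def count_expr(rows):
    """count(expr, ...): rows where all supplied expressions are non-null."""
    return sum(1 for row in rows if all(v is not None for v in row))


def count_distinct(rows):
    """count(DISTINCT expr, ...): unique rows with no NULL expression."""
    return len({row for row in rows if all(v is not None for v in row)})


rows = [(None,), (5,), (5,), (20,)]
print(count_star(rows))      # 4
print(count_expr(rows))      # 3
print(count_distinct(rows))  # 2
```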
-- count_if
SELECT count_if(col % 2 = 0) FROM VALUES (NULL), (0), (1), (2), (3) AS tab(col);
+-------------------------+
|count_if(((col % 2) = 0))|
+-------------------------+
| 2|
+-------------------------+
SELECT count_if(col IS NULL) FROM VALUES (NULL), (0), (1), (2), (3) AS tab(col);
+-----------------------+
|count_if((col IS NULL))|
+-----------------------+
| 1|
+-----------------------+
-- count_min_sketch
SELECT hex(count_min_sketch(col, 0.5d, 0.5d, 1)) FROM VALUES (1), (2), (1) AS tab(col);
+---------------------------------------+
|hex(count_min_sketch(col, 0.5, 0.5, 1))|
+---------------------------------------+
| 00000001000000000...|
+---------------------------------------+
-- covar_pop
SELECT covar_pop(c1, c2) FROM VALUES (1,1), (2,2), (3,3) AS tab(c1, c2);
+------------------+
| covar_pop(c1, c2)|
+------------------+
|0.6666666666666666|
+------------------+
-- covar_samp
SELECT covar_samp(c1, c2) FROM VALUES (1,1), (2,2), (3,3) AS tab(c1, c2);
+------------------+
|covar_samp(c1, c2)|
+------------------+
| 1.0|
+------------------+
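`covar_pop` and `covar_samp` divide the same sum of cross-deviations by `n` and `n - 1` respectively, and `corr` normalizes that sum by the spread of each variable. The sketch below uses the textbook formulas in plain Python; it is not Spark's implementation, and the helper names are illustrative:

```python
from math import sqrt


def co_moment(xs, ys):
    """Sum of products of deviations from the respective means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def covar_pop(xs, ys):
    return co_moment(xs, ys) / len(xs)


def covar_samp(xs, ys):
    return co_moment(xs, ys) / (len(xs) - 1)


def corr(xs, ys):
    """Pearson correlation coefficient."""
    return co_moment(xs, ys) / sqrt(co_moment(xs, xs) * co_moment(ys, ys))


print(covar_pop([1, 2, 3], [1, 2, 3]))   # 2/3, as in the covar_pop example
print(covar_samp([1, 2, 3], [1, 2, 3]))  # 1.0, as in the covar_samp example
print(corr([3, 3, 6], [2, 3, 4]))        # 3 / sqrt(12), as in the corr example
```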
-- every
SELECT every(col) FROM VALUES (true), (true), (true) AS tab(col);
+----------+
|every(col)|
+----------+
| true|
+----------+
SELECT every(col) FROM VALUES (NULL), (true), (true) AS tab(col);
+----------+
|every(col)|
+----------+
| true|
+----------+
SELECT every(col) FROM VALUES (true), (false), (true) AS tab(col);
+----------+
|every(col)|
+----------+
| false|
+----------+
-- first
SELECT first(col) FROM VALUES (10), (5), (20) AS tab(col);
+----------+
|first(col)|
+----------+
| 10|
+----------+
SELECT first(col) FROM VALUES (NULL), (5), (20) AS tab(col);
+----------+
|first(col)|
+----------+
| NULL|
+----------+
SELECT first(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
+----------+
|first(col)|
+----------+
| 5|
+----------+
-- first_value
SELECT first_value(col) FROM VALUES (10), (5), (20) AS tab(col);
+----------------+
|first_value(col)|
+----------------+
| 10|
+----------------+
SELECT first_value(col) FROM VALUES (NULL), (5), (20) AS tab(col);
+----------------+
|first_value(col)|
+----------------+
| NULL|
+----------------+
SELECT first_value(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
+----------------+
|first_value(col)|
+----------------+
| 5|
+----------------+
-- grouping
SELECT name, grouping(name), sum(age) FROM VALUES (2, 'Alice'), (5, 'Bob') people(age, name) GROUP BY cube(name);
+-----+--------------+--------+
| name|grouping(name)|sum(age)|
+-----+--------------+--------+
| NULL|             1|       7|
|Alice|             0|       2|
|  Bob|             0|       5|
+-----+--------------+--------+
-- grouping_id
SELECT name, grouping_id(), sum(age), avg(height) FROM VALUES (2, 'Alice', 165), (5, 'Bob', 180) people(age, name, height) GROUP BY cube(name, height);
+-----+-------------+--------+-----------+
| name|grouping_id()|sum(age)|avg(height)|
+-----+-------------+--------+-----------+
| NULL|            2|       2|      165.0|
|Alice|            0|       2|      165.0|
|Alice|            1|       2|      165.0|
| NULL|            3|       7|      172.5|
|  Bob|            1|       5|      180.0|
|  Bob|            0|       5|      180.0|
| NULL|            2|       5|      180.0|
+-----+-------------+--------+-----------+
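The `grouping_id()` values in the result above follow from the formula `(grouping(c1) << (n-1)) + ... + grouping(cn)`: each grouping column contributes one bit, with the first column as the most significant. A small plain-Python sketch (the function name mirrors the SQL function but is only an illustration):

```python
def grouping_id(flags):
    """Pack grouping() flags into an integer, first column most significant."""
    gid = 0
    for flag in flags:
        gid = (gid << 1) | flag
    return gid


# For cube(name, height) the flags are (grouping(name), grouping(height)).
print(grouping_id([0, 0]))  # 0: grouped by both name and height
print(grouping_id([0, 1]))  # 1: height aggregated away
print(grouping_id([1, 0]))  # 2: name aggregated away
print(grouping_id([1, 1]))  # 3: grand total
```

This matches the rows above: the `NULL`/165.0 row has only `name` aggregated (id 2), and the `NULL`/172.5 row is the grand total (id 3).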
-- histogram_numeric
SELECT histogram_numeric(col, 5) FROM VALUES (0), (1), (2), (10) AS tab(col);
+-------------------------+
|histogram_numeric(col, 5)|
+-------------------------+
| [{0, 1.0}, {1, 1....|
+-------------------------+
-- hll_sketch_agg
SELECT hll_sketch_estimate(hll_sketch_agg(col, 12)) FROM VALUES (1), (1), (2), (2), (3) tab(col);
+--------------------------------------------+
|hll_sketch_estimate(hll_sketch_agg(col, 12))|
+--------------------------------------------+
| 3|
+--------------------------------------------+
-- hll_union_agg
SELECT hll_sketch_estimate(hll_union_agg(sketch, true)) FROM (SELECT hll_sketch_agg(col) as sketch FROM VALUES (1) tab(col) UNION ALL SELECT hll_sketch_agg(col, 20) as sketch FROM VALUES (1) tab(col));
+------------------------------------------------+
|hll_sketch_estimate(hll_union_agg(sketch, true))|
+------------------------------------------------+
| 1|
+------------------------------------------------+
-- kurtosis
SELECT kurtosis(col) FROM VALUES (-10), (-20), (100), (1000) AS tab(col);
+-------------------+
| kurtosis(col)|
+-------------------+
|-0.7014368047529618|
+-------------------+
SELECT kurtosis(col) FROM VALUES (1), (10), (100), (10), (1) as tab(col);
+-------------------+
| kurtosis(col)|
+-------------------+
|0.19432323191699075|
+-------------------+
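The two results above are consistent with excess kurtosis computed from population moments, that is, the fourth central moment divided by the squared variance, minus 3. The plain-Python sketch below rests on that assumption about the formula and is not taken from Spark's source:

```python
def kurtosis(values):
    """Excess kurtosis from population moments: m4 / m2**2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / (m2 * m2) - 3.0


print(kurtosis([-10, -20, 100, 1000]))  # negative: flatter than a normal curve
print(kurtosis([1, 10, 100, 10, 1]))    # positive: heavier tails
```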
-- last
SELECT last(col) FROM VALUES (10), (5), (20) AS tab(col);
+---------+
|last(col)|
+---------+
| 20|
+---------+
SELECT last(col) FROM VALUES (10), (5), (NULL) AS tab(col);
+---------+
|last(col)|
+---------+
| NULL|
+---------+
SELECT last(col, true) FROM VALUES (10), (5), (NULL) AS tab(col);
+---------+
|last(col)|
+---------+
| 5|
+---------+
-- last_value
SELECT last_value(col) FROM VALUES (10), (5), (20) AS tab(col);
+---------------+
|last_value(col)|
+---------------+
| 20|
+---------------+
SELECT last_value(col) FROM VALUES (10), (5), (NULL) AS tab(col);
+---------------+
|last_value(col)|
+---------------+
| NULL|
+---------------+
SELECT last_value(col, true) FROM VALUES (10), (5), (NULL) AS tab(col);
+---------------+
|last_value(col)|
+---------------+
| 5|
+---------------+
-- max
SELECT max(col) FROM VALUES (10), (50), (20) AS tab(col);
+--------+
|max(col)|
+--------+
| 50|
+--------+
-- max_by
SELECT max_by(x, y) FROM VALUES ('a', 10), ('b', 50), ('c', 20) AS tab(x, y);
+------------+
|max_by(x, y)|
+------------+
| b|
+------------+
-- mean
SELECT mean(col) FROM VALUES (1), (2), (3) AS tab(col);
+---------+
|mean(col)|
+---------+
| 2.0|
+---------+
SELECT mean(col) FROM VALUES (1), (2), (NULL) AS tab(col);
+---------+
|mean(col)|
+---------+
| 1.5|
+---------+
-- median
SELECT median(col) FROM VALUES (0), (10) AS tab(col);
+-----------+
|median(col)|
+-----------+
| 5.0|
+-----------+
SELECT median(col) FROM VALUES (INTERVAL '0' MONTH), (INTERVAL '10' MONTH) AS tab(col);
+--------------------+
| median(col)|
+--------------------+
|INTERVAL '0-5' YE...|
+--------------------+
-- min
SELECT min(col) FROM VALUES (10), (-1), (20) AS tab(col);
+--------+
|min(col)|
+--------+
| -1|
+--------+
-- min_by
SELECT min_by(x, y) FROM VALUES ('a', 10), ('b', 50), ('c', 20) AS tab(x, y);
+------------+
|min_by(x, y)|
+------------+
| a|
+------------+
-- mode
SELECT mode(col) FROM VALUES (0), (10), (10) AS tab(col);
+---------+
|mode(col)|
+---------+
| 10|
+---------+
SELECT mode(col) FROM VALUES (INTERVAL '0' MONTH), (INTERVAL '10' MONTH), (INTERVAL '10' MONTH) AS tab(col);
+-------------------+
| mode(col)|
+-------------------+
|INTERVAL '10' MONTH|
+-------------------+
SELECT mode(col) FROM VALUES (0), (10), (10), (null), (null), (null) AS tab(col);
+---------+
|mode(col)|
+---------+
| 10|
+---------+
SELECT mode(col, false) FROM VALUES (-10), (0), (10) AS tab(col);
+---------+
|mode(col)|
+---------+
| 0|
+---------+
SELECT mode(col, true) FROM VALUES (-10), (0), (10) AS tab(col);
+---------------------------------------+
|mode() WITHIN GROUP (ORDER BY col DESC)|
+---------------------------------------+
| -10|
+---------------------------------------+
SELECT mode() WITHIN GROUP (ORDER BY col) FROM VALUES (0), (10), (10) AS tab(col);
+---------------------------------------+
|mode() WITHIN GROUP (ORDER BY col DESC)|
+---------------------------------------+
| 10|
+---------------------------------------+
SELECT mode() WITHIN GROUP (ORDER BY col) FROM VALUES (0), (10), (10), (20), (20) AS tab(col);
+---------------------------------------+
|mode() WITHIN GROUP (ORDER BY col DESC)|
+---------------------------------------+
| 10|
+---------------------------------------+
SELECT mode() WITHIN GROUP (ORDER BY col DESC) FROM VALUES (0), (10), (10), (20), (20) AS tab(col);
+----------------------------------+
|mode() WITHIN GROUP (ORDER BY col)|
+----------------------------------+
| 20|
+----------------------------------+
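The tie-breaking rules in the `mode` examples above can be sketched in plain Python. The helper below is hypothetical: `pick=min` models `mode(col, true)` and an ascending `WITHIN GROUP` order, while `pick=max` models a descending order. The non-deterministic form `mode(col, false)` may return any tied value and is not modeled here.

```python
from collections import Counter


def mode(values, pick=min):
    """Most frequent non-null value; `pick` resolves ties."""
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None  # all values NULL, or no rows
    top = max(counts.values())
    ties = [v for v, c in counts.items() if c == top]
    return pick(ties)


print(mode([-10, 0, 10]))                   # -10: three-way tie, lowest wins
print(mode([0, 10, 10, 20, 20]))            # 10: ascending order picks smallest
print(mode([0, 10, 10, 20, 20], pick=max))  # 20: descending order picks largest
```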
-- percentile
SELECT percentile(col, 0.3) FROM VALUES (0), (10) AS tab(col);
+-----------------------+
|percentile(col, 0.3, 1)|
+-----------------------+
| 3.0|
+-----------------------+
SELECT percentile(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
+-------------------------------------+
|percentile(col, array(0.25, 0.75), 1)|
+-------------------------------------+
| [2.5, 7.5]|
+-------------------------------------+
SELECT percentile(col, 0.5) FROM VALUES (INTERVAL '0' MONTH), (INTERVAL '10' MONTH) AS tab(col);
+-----------------------+
|percentile(col, 0.5, 1)|
+-----------------------+
| INTERVAL '0-5' YE...|
+-----------------------+
SELECT percentile(col, array(0.2, 0.5)) FROM VALUES (INTERVAL '0' SECOND), (INTERVAL '10' SECOND) AS tab(col);
+-----------------------------------+
|percentile(col, array(0.2, 0.5), 1)|
+-----------------------------------+
| [INTERVAL '0 00:0...|
+-----------------------------------+
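The numeric `percentile` results above are consistent with linear interpolation between the two nearest ranks of the sorted input. The sketch below assumes that rule, models only numeric input, and ignores the optional `frequency` argument; it is an illustration in plain Python rather than Spark's implementation:

```python
import math


def percentile(values, p):
    """Exact percentile by linear interpolation between nearest ranks."""
    data = sorted(v for v in values if v is not None)
    if not data:
        return None
    pos = p * (len(data) - 1)  # fractional index into the sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return float(data[lo])
    # Weight each neighbor by its closeness to the fractional index.
    return data[lo] * (hi - pos) + data[hi] * (pos - lo)


print(percentile([0, 10], 0.3))
print([percentile([0, 10], p) for p in (0.25, 0.75)])
```

With the inputs 0 and 10, the fractional index equals the percentage itself, which is why 0.3 maps to 3.0 and the array (0.25, 0.75) maps to [2.5, 7.5].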
-- percentile_approx
SELECT percentile_approx(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
+-------------------------------------------------+
|percentile_approx(col, array(0.5, 0.4, 0.1), 100)|
+-------------------------------------------------+
| [1, 1, 0]|
+-------------------------------------------------+
SELECT percentile_approx(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
+--------------------------------+
|percentile_approx(col, 0.5, 100)|
+--------------------------------+
| 7|
+--------------------------------+
SELECT percentile_approx(col, 0.5, 100) FROM VALUES (INTERVAL '0' MONTH), (INTERVAL '1' MONTH), (INTERVAL '2' MONTH), (INTERVAL '10' MONTH) AS tab(col);
+--------------------------------+
|percentile_approx(col, 0.5, 100)|
+--------------------------------+
| INTERVAL '1' MONTH|
+--------------------------------+
SELECT percentile_approx(col, array(0.5, 0.7), 100) FROM VALUES (INTERVAL '0' SECOND), (INTERVAL '1' SECOND), (INTERVAL '2' SECOND), (INTERVAL '10' SECOND) AS tab(col);
+--------------------------------------------+
|percentile_approx(col, array(0.5, 0.7), 100)|
+--------------------------------------------+
| [INTERVAL '01' SE...|
+--------------------------------------------+
-- percentile_cont
SELECT percentile_cont(0.25) WITHIN GROUP (ORDER BY col) FROM VALUES (0), (10) AS tab(col);
+-------------------------------------------------+
|percentile_cont(0.25) WITHIN GROUP (ORDER BY col)|
+-------------------------------------------------+
| 2.5|
+-------------------------------------------------+
SELECT percentile_cont(0.25) WITHIN GROUP (ORDER BY col) FROM VALUES (INTERVAL '0' MONTH), (INTERVAL '10' MONTH) AS tab(col);
+-------------------------------------------------+
|percentile_cont(0.25) WITHIN GROUP (ORDER BY col)|
+-------------------------------------------------+
| INTERVAL '0-2' YE...|
+-------------------------------------------------+
-- percentile_disc
SELECT percentile_disc(0.25) WITHIN GROUP (ORDER BY col) FROM VALUES (0), (10) AS tab(col);
+-------------------------------------------------+
|percentile_disc(0.25) WITHIN GROUP (ORDER BY col)|
+-------------------------------------------------+
| 0.0|
+-------------------------------------------------+
SELECT percentile_disc(0.25) WITHIN GROUP (ORDER BY col) FROM VALUES (INTERVAL '0' MONTH), (INTERVAL '10' MONTH) AS tab(col);
+-------------------------------------------------+
|percentile_disc(0.25) WITHIN GROUP (ORDER BY col)|
+-------------------------------------------------+
| INTERVAL '0-0' YE...|
+-------------------------------------------------+
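Unlike `percentile_cont`, which interpolates (2.5 for the inputs 0 and 10), `percentile_disc` returns a value that actually occurs in the column. One common definition, assumed here and not confirmed by the text above, is the first value in sort order whose cumulative share of rows reaches the requested percentage:

```python
def percentile_disc(values, p):
    """First sorted value whose cumulative distribution is >= p."""
    data = sorted(v for v in values if v is not None)
    n = len(data)
    for rank, value in enumerate(data, start=1):
        if rank / n >= p:
            return value
    return None  # no non-null input


print(percentile_disc([0, 10], 0.25))  # 0: the first value already covers 50%
print(percentile_disc([0, 10], 0.75))  # 10
```

The SQL example above prints `0.0` rather than `0` because the result there is shown as a double.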
-- regr_avgx
SELECT regr_avgx(y, x) FROM VALUES (1, 2), (2, 2), (2, 3), (2, 4) AS tab(y, x);
+---------------+
|regr_avgx(y, x)|
+---------------+
| 2.75|
+---------------+
SELECT regr_avgx(y, x) FROM VALUES (1, null) AS tab(y, x);
+---------------+
|regr_avgx(y, x)|
+---------------+
| NULL|
+---------------+
SELECT regr_avgx(y, x) FROM VALUES (null, 1) AS tab(y, x);
+---------------+
|regr_avgx(y, x)|
+---------------+
| NULL|
+---------------+
SELECT regr_avgx(y, x) FROM VALUES (1, 2), (2, null), (2, 3), (2, 4) AS tab(y, x);
+---------------+
|regr_avgx(y, x)|
+---------------+
| 3.0|
+---------------+
SELECT regr_avgx(y, x) FROM VALUES (1, 2), (2, null), (null, 3), (2, 4) AS tab(y, x);
+---------------+
|regr_avgx(y, x)|
+---------------+
| 3.0|
+---------------+
-- regr_avgy
SELECT regr_avgy(y, x) FROM VALUES (1, 2), (2, 2), (2, 3), (2, 4) AS tab(y, x);
+---------------+
|regr_avgy(y, x)|
+---------------+
| 1.75|
+---------------+
SELECT regr_avgy(y, x) FROM VALUES (1, null) AS tab(y, x);
+---------------+
|regr_avgy(y, x)|
+---------------+
| NULL|
+---------------+
SELECT regr_avgy(y, x) FROM VALUES (null, 1) AS tab(y, x);
+---------------+
|regr_avgy(y, x)|
+---------------+
| NULL|
+---------------+
SELECT regr_avgy(y, x) FROM VALUES (1, 2), (2, null), (2, 3), (2, 4) AS tab(y, x);
+------------------+
| regr_avgy(y, x)|
+------------------+
|1.6666666666666667|
+------------------+
SELECT regr_avgy(y, x) FROM VALUES (1, 2), (2, null), (null, 3), (2, 4) AS tab(y, x);
+---------------+
|regr_avgy(y, x)|
+---------------+
| 1.5|
+---------------+
-- regr_count
SELECT regr_count(y, x) FROM VALUES (1, 2), (2, 2), (2, 3), (2, 4) AS tab(y, x);
+----------------+
|regr_count(y, x)|
+----------------+
| 4|
+----------------+
SELECT regr_count(y, x) FROM VALUES (1, null) AS tab(y, x);
+----------------+
|regr_count(y, x)|
+----------------+
| 0|
+----------------+
SELECT regr_count(y, x) FROM VALUES (null, 1) AS tab(y, x);
+----------------+
|regr_count(y, x)|
+----------------+
| 0|
+----------------+
SELECT regr_count(y, x) FROM VALUES (1, 2), (2, null), (2, 3), (2, 4) AS tab(y, x);
+----------------+
|regr_count(y, x)|
+----------------+
| 3|
+----------------+
SELECT regr_count(y, x) FROM VALUES (1, 2), (2, null), (null, 3), (2, 4) AS tab(y, x);
+----------------+
|regr_count(y, x)|
+----------------+
| 2|
+----------------+
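All `regr_*` functions first discard any pair in which either `y` or `x` is NULL; that is why the last example above counts 2 pairs out of 4. A plain-Python sketch of this filtering, with hypothetical helpers named after the SQL functions:

```python
def non_null_pairs(pairs):
    """Keep only (y, x) pairs where both values are present."""
    return [(y, x) for y, x in pairs if y is not None and x is not None]


def regr_count(pairs):
    return len(non_null_pairs(pairs))


def regr_avgx(pairs):
    kept = non_null_pairs(pairs)
    return sum(x for _, x in kept) / len(kept) if kept else None


def regr_avgy(pairs):
    kept = non_null_pairs(pairs)
    return sum(y for y, _ in kept) / len(kept) if kept else None


pairs = [(1, 2), (2, None), (None, 3), (2, 4)]
print(regr_count(pairs))  # 2
print(regr_avgx(pairs))   # 3.0
print(regr_avgy(pairs))   # 1.5
```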
-- regr_intercept
SELECT regr_intercept(y, x) FROM VALUES (1, 1), (2, 2), (3, 3), (4, 4) AS tab(y, x);
+--------------------+
|regr_intercept(y, x)|
+--------------------+
| 0.0|
+--------------------+
SELECT regr_intercept(y, x) FROM VALUES (1, null) AS tab(y, x);
+--------------------+
|regr_intercept(y, x)|
+--------------------+
| NULL|
+--------------------+
SELECT regr_intercept(y, x) FROM VALUES (null, 1) AS tab(y, x);
+--------------------+
|regr_intercept(y, x)|
+--------------------+
| NULL|
+--------------------+
SELECT regr_intercept(y, x) FROM VALUES (1, 1), (2, null), (3, 3), (4, 4) AS tab(y, x);
+--------------------+
|regr_intercept(y, x)|
+--------------------+
| 0.0|
+--------------------+
SELECT regr_intercept(y, x) FROM VALUES (1, 1), (2, null), (null, 3), (4, 4) AS tab(y, x);
+--------------------+
|regr_intercept(y, x)|
+--------------------+
| 0.0|
+--------------------+
-- regr_r2
SELECT regr_r2(y, x) FROM VALUES (1, 2), (2, 2), (2, 3), (2, 4) AS tab(y, x);
+------------------+
| regr_r2(y, x)|
+------------------+
|0.2727272727272726|
+------------------+
SELECT regr_r2(y, x) FROM VALUES (1, null) AS tab(y, x);
+-------------+
|regr_r2(y, x)|
+-------------+
| NULL|
+-------------+
SELECT regr_r2(y, x) FROM VALUES (null, 1) AS tab(y, x);
+-------------+
|regr_r2(y, x)|
+-------------+
| NULL|
+-------------+
SELECT regr_r2(y, x) FROM VALUES (1, 2), (2, null), (2, 3), (2, 4) AS tab(y, x);
+------------------+
| regr_r2(y, x)|
+------------------+
|0.7500000000000001|
+------------------+
SELECT regr_r2(y, x) FROM VALUES (1, 2), (2, null), (null, 3), (2, 4) AS tab(y, x);
+-------------+
|regr_r2(y, x)|
+-------------+
| 1.0|
+-------------+
-- regr_slope
SELECT regr_slope(y, x) FROM VALUES (1, 1), (2, 2), (3, 3), (4, 4) AS tab(y, x);
+----------------+
|regr_slope(y, x)|
+----------------+
| 1.0|
+----------------+
SELECT regr_slope(y, x) FROM VALUES (1, null) AS tab(y, x);
+----------------+
|regr_slope(y, x)|
+----------------+
| NULL|
+----------------+
SELECT regr_slope(y, x) FROM VALUES (null, 1) AS tab(y, x);
+----------------+
|regr_slope(y, x)|
+----------------+
| NULL|
+----------------+
SELECT regr_slope(y, x) FROM VALUES (1, 1), (2, null), (3, 3), (4, 4) AS tab(y, x);
+----------------+
|regr_slope(y, x)|
+----------------+
| 1.0|
+----------------+
SELECT regr_slope(y, x) FROM VALUES (1, 1), (2, null), (null, 3), (4, 4) AS tab(y, x);
+----------------+
|regr_slope(y, x)|
+----------------+
| 1.0|
+----------------+
-- regr_sxx
SELECT regr_sxx(y, x) FROM VALUES (1, 2), (2, 2), (2, 3), (2, 4) AS tab(y, x);
+------------------+
| regr_sxx(y, x)|
+------------------+
|2.7499999999999996|
+------------------+
SELECT regr_sxx(y, x) FROM VALUES (1, null) AS tab(y, x);
+--------------+
|regr_sxx(y, x)|
+--------------+
| NULL|
+--------------+
SELECT regr_sxx(y, x) FROM VALUES (null, 1) AS tab(y, x);
+--------------+
|regr_sxx(y, x)|
+--------------+
| NULL|
+--------------+
SELECT regr_sxx(y, x) FROM VALUES (1, 2), (2, null), (2, 3), (2, 4) AS tab(y, x);
+--------------+
|regr_sxx(y, x)|
+--------------+
| 2.0|
+--------------+
SELECT regr_sxx(y, x) FROM VALUES (1, 2), (2, null), (null, 3), (2, 4) AS tab(y, x);
+--------------+
|regr_sxx(y, x)|
+--------------+
| 2.0|
+--------------+
-- regr_sxy
SELECT regr_sxy(y, x) FROM VALUES (1, 2), (2, 2), (2, 3), (2, 4) AS tab(y, x);
+------------------+
| regr_sxy(y, x)|
+------------------+
|0.7499999999999998|
+------------------+
SELECT regr_sxy(y, x) FROM VALUES (1, null) AS tab(y, x);
+--------------+
|regr_sxy(y, x)|
+--------------+
| NULL|
+--------------+
SELECT regr_sxy(y, x) FROM VALUES (null, 1) AS tab(y, x);
+--------------+
|regr_sxy(y, x)|
+--------------+
| NULL|
+--------------+
SELECT regr_sxy(y, x) FROM VALUES (1, 2), (2, null), (2, 3), (2, 4) AS tab(y, x);
+--------------+
|regr_sxy(y, x)|
+--------------+
| 1.0|
+--------------+
SELECT regr_sxy(y, x) FROM VALUES (1, 2), (2, null), (null, 3), (2, 4) AS tab(y, x);
+--------------+
|regr_sxy(y, x)|
+--------------+
| 1.0|
+--------------+
-- regr_syy
SELECT regr_syy(y, x) FROM VALUES (1, 2), (2, 2), (2, 3), (2, 4) AS tab(y, x);
+------------------+
| regr_syy(y, x)|
+------------------+
|0.7499999999999999|
+------------------+
SELECT regr_syy(y, x) FROM VALUES (1, null) AS tab(y, x);
+--------------+
|regr_syy(y, x)|
+--------------+
| NULL|
+--------------+
SELECT regr_syy(y, x) FROM VALUES (null, 1) AS tab(y, x);
+--------------+
|regr_syy(y, x)|
+--------------+
| NULL|
+--------------+
SELECT regr_syy(y, x) FROM VALUES (1, 2), (2, null), (2, 3), (2, 4) AS tab(y, x);
+------------------+
| regr_syy(y, x)|
+------------------+
|0.6666666666666666|
+------------------+
SELECT regr_syy(y, x) FROM VALUES (1, 2), (2, null), (null, 3), (2, 4) AS tab(y, x);
+--------------+
|regr_syy(y, x)|
+--------------+
| 0.5|
+--------------+
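The `regr_*` examples above share one rule: a row contributes only when both `y` and `x` are non-null. The following is a minimal Python sketch of that rule, not Spark's implementation. The helper name `regr_stats` and its dictionary keys are hypothetical, and degenerate cases (for example zero variance in `x`) are simply reported as `None` here, which may differ from what Spark returns.

```python
def regr_stats(rows):
    """Compute regr_*-style statistics from (y, x) rows, skipping rows with a null."""
    # Keep only complete pairs; this is why (2, null) and (null, 3) are ignored.
    pairs = [(y, x) for y, x in rows if y is not None and x is not None]
    n = len(pairs)
    stats = dict.fromkeys(
        ["avgy", "avgx", "sxx", "sxy", "syy", "slope", "intercept", "r2"]
    )
    stats["count"] = n
    if n == 0:
        return stats  # every statistic except the count stays None

    avg_y = sum(y for y, _ in pairs) / n
    avg_x = sum(x for _, x in pairs) / n
    sxx = sum((x - avg_x) ** 2 for _, x in pairs)
    syy = sum((y - avg_y) ** 2 for y, _ in pairs)
    sxy = sum((x - avg_x) * (y - avg_y) for y, x in pairs)

    stats.update(avgy=avg_y, avgx=avg_x, sxx=sxx, sxy=sxy, syy=syy)
    if sxx != 0:
        stats["slope"] = sxy / sxx
        stats["intercept"] = avg_y - stats["slope"] * avg_x
        if syy != 0:
            stats["r2"] = sxy ** 2 / (sxx * syy)
    return stats
```

Calling `regr_stats([(1, 2), (2, None), (None, 3), (2, 4)])` keeps only the pairs `(1, 2)` and `(2, 4)`, which is why the corresponding `regr_count` example reports 2.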
-- skewness
SELECT skewness(col) FROM VALUES (-10), (-20), (100), (1000) AS tab(col);
+------------------+
| skewness(col)|
+------------------+
|1.1135657469022013|
+------------------+
SELECT skewness(col) FROM VALUES (-1000), (-100), (10), (20) AS tab(col);
+-------------------+
| skewness(col)|
+-------------------+
|-1.1135657469022011|
+-------------------+
-- some
SELECT some(col) FROM VALUES (true), (false), (false) AS tab(col);
+---------+
|some(col)|
+---------+
| true|
+---------+
SELECT some(col) FROM VALUES (NULL), (true), (false) AS tab(col);
+---------+
|some(col)|
+---------+
| true|
+---------+
SELECT some(col) FROM VALUES (false), (false), (NULL) AS tab(col);
+---------+
|some(col)|
+---------+
| false|
+---------+
-- std
SELECT std(col) FROM VALUES (1), (2), (3) AS tab(col);
+--------+
|std(col)|
+--------+
| 1.0|
+--------+
-- stddev
SELECT stddev(col) FROM VALUES (1), (2), (3) AS tab(col);
+-----------+
|stddev(col)|
+-----------+
| 1.0|
+-----------+
-- stddev_pop
SELECT stddev_pop(col) FROM VALUES (1), (2), (3) AS tab(col);
+-----------------+
| stddev_pop(col)|
+-----------------+
|0.816496580927726|
+-----------------+
-- stddev_samp
SELECT stddev_samp(col) FROM VALUES (1), (2), (3) AS tab(col);
+----------------+
|stddev_samp(col)|
+----------------+
| 1.0|
+----------------+
-- sum
SELECT sum(col) FROM VALUES (5), (10), (15) AS tab(col);
+--------+
|sum(col)|
+--------+
| 30|
+--------+
SELECT sum(col) FROM VALUES (NULL), (10), (15) AS tab(col);
+--------+
|sum(col)|
+--------+
| 25|
+--------+
SELECT sum(col) FROM VALUES (NULL), (NULL) AS tab(col);
+--------+
|sum(col)|
+--------+
| NULL|
+--------+
-- try_avg
SELECT try_avg(col) FROM VALUES (1), (2), (3) AS tab(col);
+------------+
|try_avg(col)|
+------------+
| 2.0|
+------------+
SELECT try_avg(col) FROM VALUES (1), (2), (NULL) AS tab(col);
+------------+
|try_avg(col)|
+------------+
| 1.5|
+------------+
SELECT try_avg(col) FROM VALUES (interval '2147483647 months'), (interval '1 months') AS tab(col);
+------------+
|try_avg(col)|
+------------+
| NULL|
+------------+
-- try_sum
SELECT try_sum(col) FROM VALUES (5), (10), (15) AS tab(col);
+------------+
|try_sum(col)|
+------------+
| 30|
+------------+
SELECT try_sum(col) FROM VALUES (NULL), (10), (15) AS tab(col);
+------------+
|try_sum(col)|
+------------+
| 25|
+------------+
SELECT try_sum(col) FROM VALUES (NULL), (NULL) AS tab(col);
+------------+
|try_sum(col)|
+------------+
| NULL|
+------------+
SELECT try_sum(col) FROM VALUES (9223372036854775807L), (1L) AS tab(col);
+------------+
|try_sum(col)|
+------------+
| NULL|
+------------+
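The last `try_sum` example returns NULL because the total no longer fits in a 64-bit signed long. A rough Python model of that behavior for long inputs is sketched below. The name `try_sum_long` is hypothetical, and the left-to-right accumulation order is an assumption made for illustration.

```python
LONG_MIN = -(2 ** 63)
LONG_MAX = 2 ** 63 - 1

def try_sum_long(values):
    """Sum non-null values; return None on 64-bit overflow or when all inputs are null."""
    total = None
    for v in values:
        if v is None:
            continue  # nulls are ignored, as in sum()
        total = v if total is None else total + v
        if not LONG_MIN <= total <= LONG_MAX:
            return None  # overflow yields NULL instead of an error
    return total
```

With this model, `try_sum_long([9223372036854775807, 1])` returns `None`, while an all-null input also returns `None` because nothing was accumulated.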
-- var_pop
SELECT var_pop(col) FROM VALUES (1), (2), (3) AS tab(col);
+------------------+
| var_pop(col)|
+------------------+
|0.6666666666666666|
+------------------+
-- var_samp
SELECT var_samp(col) FROM VALUES (1), (2), (3) AS tab(col);
+-------------+
|var_samp(col)|
+-------------+
| 1.0|
+-------------+
-- variance
SELECT variance(col) FROM VALUES (1), (2), (3) AS tab(col);
+-------------+
|variance(col)|
+-------------+
| 1.0|
+-------------+
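In the examples above, `std`, `stddev`, `stddev_samp`, `var_samp`, and `variance` return 1.0 for the values 1, 2, 3, while `stddev_pop` and `var_pop` return smaller numbers. The difference is the divisor: sample statistics divide by n - 1, population statistics divide by n. The sketch below reproduces the numbers with Python's standard `statistics` module rather than Spark.

```python
import statistics

col = [1, 2, 3]

# Sample forms divide by n - 1 (compare var_samp / variance, stddev_samp / stddev / std).
sample_var = statistics.variance(col)
sample_std = statistics.stdev(col)

# Population forms divide by n (compare var_pop, stddev_pop).
population_var = statistics.pvariance(col)
population_std = statistics.pstdev(col)

print(sample_var, sample_std, population_var, population_std)
```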
Window Functions
Function | Description |
---|---|
cume_dist() | Computes the position of a value relative to all values in the partition. |
dense_rank() | Computes the rank of a value in a group of values. The result is one plus the previously assigned rank value. Unlike the function rank, dense_rank will not produce gaps in the ranking sequence. |
lag(input[, offset[, default]]) | Returns the value of `input` at the `offset`th row before the current row in the window. The default value of `offset` is 1 and the default value of `default` is null. If the value of `input` at the `offset`th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), `default` is returned. |
lead(input[, offset[, default]]) | Returns the value of `input` at the `offset`th row after the current row in the window. The default value of `offset` is 1 and the default value of `default` is null. If the value of `input` at the `offset`th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), `default` is returned. |
nth_value(input[, offset]) | Returns the value of `input` at the row that is the `offset`th row from the beginning of the window frame. Offset starts at 1. If ignoreNulls=true, nulls are skipped when finding the `offset`th row. Otherwise, every row counts for the `offset`. If there is no such `offset`th row (e.g., when the offset is 10 and the size of the window frame is less than 10), null is returned. |
ntile(n) | Divides the rows for each window partition into `n` buckets ranging from 1 to at most `n`. |
percent_rank() | Computes the percentage ranking of a value in a group of values. |
rank() | Computes the rank of a value in a group of values. The result is one plus the number of rows that precede the current row in the ordering of the partition; rows with equal ordering values receive the same rank. Ties therefore produce gaps in the sequence. |
row_number() | Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition. |
Examples
-- cume_dist
SELECT a, b, cume_dist() OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+--------------------------------------------------------------------------------------------------------------+
| a| b|cume_dist() OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)|
+---+---+--------------------------------------------------------------------------------------------------------------+
| A1| 1| 0.6666666666666666|
| A1| 1| 0.6666666666666666|
| A1| 2| 1.0|
| A2| 3| 1.0|
+---+---+--------------------------------------------------------------------------------------------------------------+
-- dense_rank
SELECT a, b, dense_rank(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+--------------------------------------------------------------------------------------------------------------+
| a| b|DENSE_RANK() OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)|
+---+---+--------------------------------------------------------------------------------------------------------------+
| A1| 1| 1|
| A1| 1| 1|
| A1| 2| 2|
| A2| 3| 1|
+---+---+--------------------------------------------------------------------------------------------------------------+
-- lag
SELECT a, b, lag(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+-----------------------------------------------------------------------------------------------------------+
| a| b|lag(b, 1, NULL) OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST ROWS BETWEEN -1 FOLLOWING AND -1 FOLLOWING)|
+---+---+-----------------------------------------------------------------------------------------------------------+
| A1| 1| NULL|
| A1| 1| 1|
| A1| 2| 1|
| A2| 3| NULL|
+---+---+-----------------------------------------------------------------------------------------------------------+
-- lead
SELECT a, b, lead(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+----------------------------------------------------------------------------------------------------------+
| a| b|lead(b, 1, NULL) OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)|
+---+---+----------------------------------------------------------------------------------------------------------+
| A1| 1| 1|
| A1| 1| 2|
| A1| 2| NULL|
| A2| 3| NULL|
+---+---+----------------------------------------------------------------------------------------------------------+
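As a rough model of the `lag` and `lead` examples above, the sketch below works on a single partition that is already ordered and held in a Python list. The functions reuse the SQL names but are plain Python helpers written for illustration.

```python
def lag(values, offset=1, default=None):
    """For each row, the value `offset` rows earlier, or `default` if no such row exists."""
    n = len(values)
    return [
        values[i - offset] if 0 <= i - offset < n else default
        for i in range(n)
    ]

def lead(values, offset=1, default=None):
    """For each row, the value `offset` rows later, or `default` if no such row exists."""
    return lag(values, -offset, default)

partition_a1 = [1, 1, 2]  # column b of partition 'A1', ordered by b
print(lag(partition_a1))   # [None, 1, 1]
print(lead(partition_a1))  # [1, 2, None]
```

The first row has no predecessor and the last row has no successor, so both receive `default`, which is null unless a third argument is supplied.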
-- nth_value
SELECT a, b, nth_value(b, 2) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+------------------------------------------------------------------------------------------------------------------+
| a| b|nth_value(b, 2) OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)|
+---+---+------------------------------------------------------------------------------------------------------------------+
| A1| 1| 1|
| A1| 1| 1|
| A1| 2| 1|
| A2| 3| NULL|
+---+---+------------------------------------------------------------------------------------------------------------------+
-- ntile
SELECT a, b, ntile(2) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+----------------------------------------------------------------------------------------------------------+
| a| b|ntile(2) OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)|
+---+---+----------------------------------------------------------------------------------------------------------+
| A1| 1| 1|
| A1| 1| 1|
| A1| 2| 2|
| A2| 3| 1|
+---+---+----------------------------------------------------------------------------------------------------------+
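The `ntile(2)` result above (buckets 1, 1, 2 for the three rows of partition `A1`) can be reproduced with a short Python sketch. It assumes the usual SQL convention that bucket sizes differ by at most one and that the earlier buckets receive the extra rows; `ntile` here is an illustrative helper, not Spark code.

```python
def ntile(num_rows, n):
    """Return the 1-based bucket number for each of `num_rows` ordered rows."""
    base, extra = divmod(num_rows, n)
    buckets = []
    for bucket in range(1, n + 1):
        # The first `extra` buckets hold one more row than the rest.
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets

print(ntile(3, 2))  # [1, 1, 2]
print(ntile(1, 2))  # [1]
```

A partition with fewer rows than `n` never reaches the higher bucket numbers, which matches the "at most `n`" wording in the description.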
-- percent_rank
SELECT a, b, percent_rank(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+----------------------------------------------------------------------------------------------------------------+
| a| b|PERCENT_RANK() OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)|
+---+---+----------------------------------------------------------------------------------------------------------------+
| A1| 1| 0.0|
| A1| 1| 0.0|
| A1| 2| 1.0|
| A2| 3| 0.0|
+---+---+----------------------------------------------------------------------------------------------------------------+
-- rank
SELECT a, b, rank(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+--------------------------------------------------------------------------------------------------------+
| a| b|RANK() OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)|
+---+---+--------------------------------------------------------------------------------------------------------+
| A1| 1| 1|
| A1| 1| 1|
| A1| 2| 3|
| A2| 3| 1|
+---+---+--------------------------------------------------------------------------------------------------------+
-- row_number
SELECT a, b, row_number() OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
+---+---+--------------------------------------------------------------------------------------------------------------+
| a| b|row_number() OVER (PARTITION BY a ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)|
+---+---+--------------------------------------------------------------------------------------------------------------+
| A1| 1| 1|
| A1| 1| 2|
| A1| 2| 3|
| A2| 3| 1|
+---+---+--------------------------------------------------------------------------------------------------------------+
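The ranking functions above differ only in how they treat ties. The sketch below computes all of them for one partition in plain Python so the results can be compared side by side. The helper `ranking` and its output keys are hypothetical, and the formulas are the conventional SQL definitions rather than a transcription of Spark's code.

```python
def ranking(values):
    """Return one dict per row of a partition ordered by `values` ascending."""
    ordered = sorted(values)
    n = len(ordered)
    rows = []
    rank = 0
    dense = 0
    for i, v in enumerate(ordered):
        if i == 0 or v != ordered[i - 1]:
            rank = i + 1   # one plus the number of rows strictly before this value
            dense += 1     # next consecutive rank, no gaps
        rows.append({
            "value": v,
            "row_number": i + 1,
            "rank": rank,
            "dense_rank": dense,
            "percent_rank": (rank - 1) / (n - 1) if n > 1 else 0.0,
            "cume_dist": sum(1 for w in ordered if w <= v) / n,
        })
    return rows

for row in ranking([2, 1, 1]):  # column b of partition 'A1'
    print(row)
```

For the partition `A1` this yields ranks 1, 1, 3 and dense ranks 1, 1, 2, while `row_number` stays unique at 1, 2, 3.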
Array Functions
Function | Description |
---|---|
array(expr, ...) | Returns an array with the given elements. |
array_append(array, element) | Adds the element at the end of the array passed as the first argument. The type of the element should be similar to the type of the elements of the array. A null element is also appended to the array. If the array passed is NULL, the output is NULL. |
array_compact(array) | Removes null values from the array. |
array_contains(array, value) | Returns true if the array contains the value. |
array_distinct(array) | Removes duplicate values from the array. |
array_except(array1, array2) | Returns an array of the elements in array1 but not in array2, without duplicates. |
array_insert(x, pos, val) | Places val into index pos of array x. Array indices start at 1. The maximum negative index is -1, for which the function inserts the new element after the current last element. An index beyond the array size appends to the array, or prepends to it if the index is negative, padding with 'null' elements. |
array_intersect(array1, array2) | Returns an array of the elements in the intersection of array1 and array2, without duplicates. |
array_join(array, delimiter[, nullReplacement]) | Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. If no value is set for nullReplacement, any null value is filtered. |
array_max(array) | Returns the maximum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped. |
array_min(array) | Returns the minimum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped. |
array_position(array, element) | Returns the (1-based) index of the first matching element of the array as long, or 0 if no match is found. |
array_prepend(array, element) | Adds the element at the beginning of the array passed as the first argument. The type of the element should be the same as the type of the elements of the array. A null element is also prepended to the array. If the array passed is NULL, the output is NULL. |
array_remove(array, element) | Removes all elements equal to element from the array. |
array_repeat(element, count) | Returns the array containing element count times. |
array_size(expr) | Returns the size of an array. The function returns null for null input. |
array_union(array1, array2) | Returns an array of the elements in the union of array1 and array2, without duplicates. |
arrays_overlap(a1, a2) | Returns true if a1 contains at least one non-null element that is also present in a2. If the arrays have no common element, both are non-empty, and either of them contains a null element, null is returned; otherwise false is returned. |
arrays_zip(a1, a2, ...) | Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays. |
flatten(arrayOfArrays) | Transforms an array of arrays into a single array. |
get(array, index) | Returns element of array at given (0-based) index. If the index points outside of the array boundaries, then this function returns NULL. |
sequence(start, stop, step) | Generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of argument expressions. Supported types are: byte, short, integer, long, date, timestamp. The start and stop expressions must resolve to the same type. If start and stop expressions resolve to the 'date' or 'timestamp' type then the step expression must resolve to the 'interval' or 'year-month interval' or 'day-time interval' type, otherwise to the same type as the start and stop expressions. |
shuffle(array) | Returns a random permutation of the given array. |
slice(x, start, length) | Subsets array x starting from index start (array indices start at 1, or starting from the end if start is negative) with the specified length. |
sort_array(array[, ascendingOrder]) | Sorts the input array in ascending or descending order according to the natural ordering of the array elements. NaN is greater than any non-NaN elements for double/float type. Null elements will be placed at the beginning of the returned array in ascending order or at the end of the returned array in descending order. |
Examples
-- array
SELECT array(1, 2, 3);
+--------------+
|array(1, 2, 3)|
+--------------+
| [1, 2, 3]|
+--------------+
-- array_append
SELECT array_append(array('b', 'd', 'c', 'a'), 'd');
+----------------------------------+
|array_append(array(b, d, c, a), d)|
+----------------------------------+
| [b, d, c, a, d]|
+----------------------------------+
SELECT array_append(array(1, 2, 3, null), null);
+----------------------------------------+
|array_append(array(1, 2, 3, NULL), NULL)|
+----------------------------------------+
| [1, 2, 3, NULL, N...|
+----------------------------------------+
SELECT array_append(CAST(null as Array<Int>), 2);
+---------------------+
|array_append(NULL, 2)|
+---------------------+
| NULL|
+---------------------+
-- array_compact
SELECT array_compact(array(1, 2, 3, null));
+-----------------------------------+
|array_compact(array(1, 2, 3, NULL))|
+-----------------------------------+
| [1, 2, 3]|
+-----------------------------------+
SELECT array_compact(array("a", "b", "c"));
+-----------------------------+
|array_compact(array(a, b, c))|
+-----------------------------+
| [a, b, c]|
+-----------------------------+
-- array_contains
SELECT array_contains(array(1, 2, 3), 2);
+---------------------------------+
|array_contains(array(1, 2, 3), 2)|
+---------------------------------+
| true|
+---------------------------------+
-- array_distinct
SELECT array_distinct(array(1, 2, 3, null, 3));
+---------------------------------------+
|array_distinct(array(1, 2, 3, NULL, 3))|
+---------------------------------------+
| [1, 2, 3, NULL]|
+---------------------------------------+
-- array_except
SELECT array_except(array(1, 2, 3), array(1, 3, 5));
+--------------------------------------------+
|array_except(array(1, 2, 3), array(1, 3, 5))|
+--------------------------------------------+
| [2]|
+--------------------------------------------+
-- array_insert
SELECT array_insert(array(1, 2, 3, 4), 5, 5);
+-------------------------------------+
|array_insert(array(1, 2, 3, 4), 5, 5)|
+-------------------------------------+
| [1, 2, 3, 4, 5]|
+-------------------------------------+
SELECT array_insert(array(5, 4, 3, 2), -1, 1);
+--------------------------------------+
|array_insert(array(5, 4, 3, 2), -1, 1)|
+--------------------------------------+
| [5, 4, 3, 2, 1]|
+--------------------------------------+
SELECT array_insert(array(5, 3, 2, 1), -4, 4);
+--------------------------------------+
|array_insert(array(5, 3, 2, 1), -4, 4)|
+--------------------------------------+
| [5, 4, 3, 2, 1]|
+--------------------------------------+
-- array_intersect
SELECT array_intersect(array(1, 2, 3), array(1, 3, 5));
+-----------------------------------------------+
|array_intersect(array(1, 2, 3), array(1, 3, 5))|
+-----------------------------------------------+
| [1, 3]|
+-----------------------------------------------+
-- array_join
SELECT array_join(array('hello', 'world'), ' ');
+----------------------------------+
|array_join(array(hello, world), )|
+----------------------------------+
| hello world|
+----------------------------------+
SELECT array_join(array('hello', null ,'world'), ' ');
+----------------------------------------+
|array_join(array(hello, NULL, world), )|
+----------------------------------------+
| hello world|
+----------------------------------------+
SELECT array_join(array('hello', null ,'world'), ' ', ',');
+-------------------------------------------+
|array_join(array(hello, NULL, world), , ,)|
+-------------------------------------------+
| hello , world|
+-------------------------------------------+
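The three `array_join` examples above show the two ways nulls are handled: dropped when no `nullReplacement` is given, substituted when one is. A small Python analogue, written only for illustration, makes the rule explicit.

```python
def array_join(array, delimiter, null_replacement=None):
    """Join string elements; drop nulls unless a replacement string is supplied."""
    if null_replacement is None:
        parts = [v for v in array if v is not None]
    else:
        parts = [null_replacement if v is None else v for v in array]
    return delimiter.join(parts)

print(array_join(["hello", None, "world"], " "))       # hello world
print(array_join(["hello", None, "world"], " ", ","))  # hello , world
```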
-- array_max
SELECT array_max(array(1, 20, null, 3));
+--------------------------------+
|array_max(array(1, 20, NULL, 3))|
+--------------------------------+
| 20|
+--------------------------------+
-- array_min
SELECT array_min(array(1, 20, null, 3));
+--------------------------------+
|array_min(array(1, 20, NULL, 3))|
+--------------------------------+
| 1|
+--------------------------------+
-- array_position
SELECT array_position(array(312, 773, 708, 708), 708);
+----------------------------------------------+
|array_position(array(312, 773, 708, 708), 708)|
+----------------------------------------------+
| 3|
+----------------------------------------------+
SELECT array_position(array(312, 773, 708, 708), 414);
+----------------------------------------------+
|array_position(array(312, 773, 708, 708), 414)|
+----------------------------------------------+
| 0|
+----------------------------------------------+
-- array_prepend
SELECT array_prepend(array('b', 'd', 'c', 'a'), 'd');
+-----------------------------------+
|array_prepend(array(b, d, c, a), d)|
+-----------------------------------+
| [d, b, d, c, a]|
+-----------------------------------+
SELECT array_prepend(array(1, 2, 3, null), null);
+-----------------------------------------+
|array_prepend(array(1, 2, 3, NULL), NULL)|
+-----------------------------------------+
| [NULL, 1, 2, 3, N...|
+-----------------------------------------+
SELECT array_prepend(CAST(null as Array<Int>), 2);
+----------------------+
|array_prepend(NULL, 2)|
+----------------------+
| NULL|
+----------------------+
-- array_remove
SELECT array_remove(array(1, 2, 3, null, 3), 3);
+----------------------------------------+
|array_remove(array(1, 2, 3, NULL, 3), 3)|
+----------------------------------------+
| [1, 2, NULL]|
+----------------------------------------+
-- array_repeat
SELECT array_repeat('123', 2);
+--------------------+
|array_repeat(123, 2)|
+--------------------+
| [123, 123]|
+--------------------+
-- array_size
SELECT array_size(array('b', 'd', 'c', 'a'));
+-----------------------------+
|array_size(array(b, d, c, a))|
+-----------------------------+
| 4|
+-----------------------------+
-- array_union
SELECT array_union(array(1, 2, 3), array(1, 3, 5));
+-------------------------------------------+
|array_union(array(1, 2, 3), array(1, 3, 5))|
+-------------------------------------------+
| [1, 2, 3, 5]|
+-------------------------------------------+
-- arrays_overlap
SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5));
+----------------------------------------------+
|arrays_overlap(array(1, 2, 3), array(3, 4, 5))|
+----------------------------------------------+
| true|
+----------------------------------------------+
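The description of `arrays_overlap` has three possible outcomes (true, null, false), which the single example above does not exercise. The following Python sketch follows the wording of that description, using `None` for SQL NULL; it is an illustration, not Spark's implementation.

```python
def arrays_overlap(a1, a2):
    """Three-valued overlap test: True, None (unknown), or False."""
    common = {v for v in a1 if v is not None} & {v for v in a2 if v is not None}
    if common:
        return True
    # No common non-null element: a null on either side makes the answer unknown,
    # but only when both arrays are non-empty.
    if a1 and a2 and (None in a1 or None in a2):
        return None
    return False

print(arrays_overlap([1, 2, 3], [3, 4, 5]))  # True
print(arrays_overlap([1, None], [2]))        # None
print(arrays_overlap([1, 2], [3]))           # False
```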
-- arrays_zip
SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4));
+------------------------------------------+
|arrays_zip(array(1, 2, 3), array(2, 3, 4))|
+------------------------------------------+
| [{1, 2}, {2, 3}, ...|
+------------------------------------------+
SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4));
+-------------------------------------------------+
|arrays_zip(array(1, 2), array(2, 3), array(3, 4))|
+-------------------------------------------------+
| [{1, 2, 3}, {2, 3...|
+-------------------------------------------------+
-- flatten
SELECT flatten(array(array(1, 2), array(3, 4)));
+----------------------------------------+
|flatten(array(array(1, 2), array(3, 4)))|
+----------------------------------------+
| [1, 2, 3, 4]|
+----------------------------------------+
-- get
SELECT get(array(1, 2, 3), 0);
+----------------------+
|get(array(1, 2, 3), 0)|
+----------------------+
| 1|
+----------------------+
SELECT get(array(1, 2, 3), 3);
+----------------------+
|get(array(1, 2, 3), 3)|
+----------------------+
| NULL|
+----------------------+
SELECT get(array(1, 2, 3), -1);
+-----------------------+
|get(array(1, 2, 3), -1)|
+-----------------------+
| NULL|
+-----------------------+
-- sequence
SELECT sequence(1, 5);
+---------------+
| sequence(1, 5)|
+---------------+
|[1, 2, 3, 4, 5]|
+---------------+
SELECT sequence(5, 1);
+---------------+
| sequence(5, 1)|
+---------------+
|[5, 4, 3, 2, 1]|
+---------------+
SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month);
+----------------------------------------------------------------------+
|sequence(to_date(2018-01-01), to_date(2018-03-01), INTERVAL '1' MONTH)|
+----------------------------------------------------------------------+
| [2018-01-01, 2018...|
+----------------------------------------------------------------------+
SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval '0-1' year to month);
+--------------------------------------------------------------------------------+
|sequence(to_date(2018-01-01), to_date(2018-03-01), INTERVAL '0-1' YEAR TO MONTH)|
+--------------------------------------------------------------------------------+
| [2018-01-01, 2018...|
+--------------------------------------------------------------------------------+
-- shuffle
SELECT shuffle(array(1, 20, 3, 5));
+---------------------------+
|shuffle(array(1, 20, 3, 5))|
+---------------------------+
| [20, 5, 3, 1]|
+---------------------------+
SELECT shuffle(array(1, 20, null, 3));
+------------------------------+
|shuffle(array(1, 20, NULL, 3))|
+------------------------------+
| [1, 20, NULL, 3]|
+------------------------------+
-- slice
SELECT slice(array(1, 2, 3, 4), 2, 2);
+------------------------------+
|slice(array(1, 2, 3, 4), 2, 2)|
+------------------------------+
| [2, 3]|
+------------------------------+
SELECT slice(array(1, 2, 3, 4), -2, 2);
+-------------------------------+
|slice(array(1, 2, 3, 4), -2, 2)|
+-------------------------------+
| [3, 4]|
+-------------------------------+
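The index arithmetic behind the two `slice` examples above can be sketched in Python, whose lists are 0-based. The helper `slice_array` is hypothetical; rejecting a start of 0 and returning an empty list for a start that falls before the first element are assumptions made for this sketch.

```python
def slice_array(x, start, length):
    """Return `length` elements of x from a 1-based start; negative start counts from the end."""
    if start == 0:
        raise ValueError("array indices start at 1")  # assumed behavior
    if length < 0:
        raise ValueError("length must not be negative")
    # Convert the 1-based (or end-relative) start to a 0-based offset.
    begin = start - 1 if start > 0 else len(x) + start
    if begin < 0:
        return []  # assumed behavior for a start before the first element
    return x[begin:begin + length]

print(slice_array([1, 2, 3, 4], 2, 2))   # [2, 3]
print(slice_array([1, 2, 3, 4], -2, 2))  # [3, 4]
```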
-- sort_array
SELECT sort_array(array('b', 'd', null, 'c', 'a'), true);
+-----------------------------------------+
|sort_array(array(b, d, NULL, c, a), true)|
+-----------------------------------------+
| [NULL, a, b, c, d]|
+-----------------------------------------+
SELECT sort_array(array('b', 'd', null, 'c', 'a'), false);
+------------------------------------------+
|sort_array(array(b, d, NULL, c, a), false)|
+------------------------------------------+
| [d, c, b, a, NULL]|
+------------------------------------------+
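The null placement shown in the two `sort_array` examples above (nulls first when ascending, last when descending) can be modeled in a few lines of Python. This sketch ignores the NaN ordering rule mentioned in the description and is meant only to illustrate where nulls end up.

```python
def sort_array(array, ascending_order=True):
    """Sort non-null elements; nulls go first when ascending, last when descending."""
    nulls = [v for v in array if v is None]
    values = sorted((v for v in array if v is not None), reverse=not ascending_order)
    return nulls + values if ascending_order else values + nulls

print(sort_array(["b", "d", None, "c", "a"], True))   # [None, 'a', 'b', 'c', 'd']
print(sort_array(["b", "d", None, "c", "a"], False))  # ['d', 'c', 'b', 'a', None]
```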
Collection Functions
Function | Description |
---|---|
aggregate(expr, start, merge, finish) | Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function. |
array_sort(expr, func) | Sorts the input array. If func is omitted, sort in ascending order. The elements of the input array must be orderable. NaN is greater than any non-NaN elements for double/float type. Null elements will be placed at the end of the returned array. Since 3.0.0 this function also sorts and returns the array based on the given comparator function. The comparator will take two arguments representing two elements of the array. It returns a negative integer, 0, or a positive integer as the first element is less than, equal to, or greater than the second element. If the comparator function returns null, the function will fail and raise an error. |
cardinality(expr) | Returns the size of an array or a map. This function returns -1 for null input only if spark.sql.ansi.enabled is false and spark.sql.legacy.sizeOfNull is true. Otherwise, it returns null for null input. With the default settings, the function returns null for null input. |
concat(col1, col2, ..., colN) | Returns the concatenation of col1, col2, ..., colN. |
element_at(array, index) | Returns element of array at given (1-based) index. If index is 0, Spark will throw an error. If index < 0, accesses elements from the last to the first. The function returns NULL if the index exceeds the length of the array and `spark.sql.ansi.enabled` is set to false. If `spark.sql.ansi.enabled` is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices. |
element_at(map, key) | Returns value for given key. The function returns NULL if the key is not contained in the map. |
exists(expr, pred) | Tests whether a predicate holds for one or more elements in the array. |
filter(expr, func) | Filters the input array using the given predicate. |
forall(expr, pred) | Tests whether a predicate holds for all elements in the array. |
map_filter(expr, func) | Filters entries in a map using the function. |
map_zip_with(map1, map2, function) | Merges two given maps into a single map by applying function to the pair of values with the same key. For keys present in only one map, NULL will be passed as the value for the missing key. If an input map contains duplicated keys, only the first entry of the duplicated key is passed into the lambda function. |
reduce(expr, start, merge, finish) | Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function. |
reverse(array) | Returns a reversed string or an array with reverse order of elements. |
size(expr) | Returns the size of an array or a map. This function returns -1 for null input only if spark.sql.ansi.enabled is false and spark.sql.legacy.sizeOfNull is true. Otherwise, it returns null for null input. With the default settings, the function returns null for null input. |
transform(expr, func) | Transforms elements in an array using the function. |
transform_keys(expr, func) | Transforms keys in a map using the function. |
transform_values(expr, func) | Transforms values in the map using the function. |
try_element_at(array, index) | Returns the element of the array at the given (1-based) index. If `index` is 0, Spark throws an error. If `index` < 0, accesses elements from the last to the first. The function always returns NULL if the index exceeds the length of the array. |
try_element_at(map, key) | Returns value for given key. The function always returns NULL if the key is not contained in the map. |
zip_with(left, right, func) | Merges the two given arrays, element-wise, into a single array using function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying function. |
Examples
-- aggregate
SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x);
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|aggregate(array(1, 2, 3), 0, lambdafunction((namedlambdavariable() + namedlambdavariable()), namedlambdavariable(), namedlambdavariable()), lambdafunction(namedlambdavariable(), namedlambdavariable()))|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 6|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10);
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|aggregate(array(1, 2, 3), 0, lambdafunction((namedlambdavariable() + namedlambdavariable()), namedlambdavariable(), namedlambdavariable()), lambdafunction((namedlambdavariable() * 10), namedlambdavariable()))|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 60|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-- array_sort
SELECT array_sort(array(5, 6, 1), (left, right) -> case when left < right then -1 when left > right then 1 else 0 end);
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|array_sort(array(5, 6, 1), lambdafunction(CASE WHEN (namedlambdavariable() < namedlambdavariable()) THEN -1 WHEN (namedlambdavariable() > namedlambdavariable()) THEN 1 ELSE 0 END, namedlambdavariable(), namedlambdavariable()))|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [1, 5, 6]|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT array_sort(array('bc', 'ab', 'dc'), (left, right) -> case when left is null and right is null then 0 when left is null then -1 when right is null then 1 when left < right then 1 when left > right then -1 else 0 end);
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|array_sort(array(bc, ab, dc), lambdafunction(CASE WHEN ((namedlambdavariable() IS NULL) AND (namedlambdavariable() IS NULL)) THEN 0 WHEN (namedlambdavariable() IS NULL) THEN -1 WHEN (namedlambdavariable() IS NULL) THEN 1 WHEN (namedlambdavariable() < namedlambdavariable()) THEN 1 WHEN (namedlambdavariable() > namedlambdavariable()) THEN -1 ELSE 0 END, namedlambdavariable(), namedlambdavariable()))|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [dc, bc, ab]|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT array_sort(array('b', 'd', null, 'c', 'a'));
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|array_sort(array(b, d, NULL, c, a), lambdafunction((IF(((namedlambdavariable() IS NULL) AND (namedlambdavariable() IS NULL)), 0, (IF((namedlambdavariable() IS NULL), 1, (IF((namedlambdavariable() IS NULL), -1, (IF((namedlambdavariable() < namedlambdavariable()), -1, (IF((namedlambdavariable() > namedlambdavariable()), 1, 0)))))))))), namedlambdavariable(), namedlambdavariable()))|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [a, b, c, d, NULL]|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
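The comparator contract used by the three array_sort calls above can be modeled in plain Python. This is a minimal sketch and not Spark's implementation: `array_sort`, `default_cmp`, and `desc_nulls_first` are hypothetical helper names, `None` stands in for SQL NULL, and `default_cmp` mirrors the lambda visible in the header of the third example (ascending, NULLs last).

```python
from functools import cmp_to_key

def default_cmp(left, right):
    """Ascending order with None (SQL NULL) placed last."""
    if left is None and right is None:
        return 0
    if left is None:
        return 1
    if right is None:
        return -1
    return (left > right) - (left < right)

def desc_nulls_first(left, right):
    """Comparator from the second example: descending, NULLs first."""
    if left is None and right is None:
        return 0
    if left is None:
        return -1
    if right is None:
        return 1
    return (left < right) - (left > right)

def array_sort(arr, cmp=default_cmp):
    # The comparator returns a negative int, 0, or a positive int.
    return sorted(arr, key=cmp_to_key(cmp))

print(array_sort([5, 6, 1]))                             # [1, 5, 6]
print(array_sort(["bc", "ab", "dc"], desc_nulls_first))  # ['dc', 'bc', 'ab']
print(array_sort(["b", "d", None, "c", "a"]))            # ['a', 'b', 'c', 'd', None]
```

Swapping the signs returned for the NULL branches is all it takes to move NULLs from the end to the front.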
-- cardinality
SELECT cardinality(array('b', 'd', 'c', 'a'));
+------------------------------+
|cardinality(array(b, d, c, a))|
+------------------------------+
| 4|
+------------------------------+
SELECT cardinality(map('a', 1, 'b', 2));
+----------------------------+
|cardinality(map(a, 1, b, 2))|
+----------------------------+
| 2|
+----------------------------+
-- concat
SELECT concat('Spark', 'SQL');
+------------------+
|concat(Spark, SQL)|
+------------------+
| SparkSQL|
+------------------+
SELECT concat(array(1, 2, 3), array(4, 5), array(6));
+---------------------------------------------+
|concat(array(1, 2, 3), array(4, 5), array(6))|
+---------------------------------------------+
| [1, 2, 3, 4, 5, 6]|
+---------------------------------------------+
-- element_at
SELECT element_at(array(1, 2, 3), 2);
+-----------------------------+
|element_at(array(1, 2, 3), 2)|
+-----------------------------+
| 2|
+-----------------------------+
SELECT element_at(map(1, 'a', 2, 'b'), 2);
+------------------------------+
|element_at(map(1, a, 2, b), 2)|
+------------------------------+
| b|
+------------------------------+
-- exists
SELECT exists(array(1, 2, 3), x -> x % 2 == 0);
+------------------------------------------------------------------------------------------------+
|exists(array(1, 2, 3), lambdafunction(((namedlambdavariable() % 2) = 0), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------+
| true|
+------------------------------------------------------------------------------------------------+
SELECT exists(array(1, 2, 3), x -> x % 2 == 10);
+-------------------------------------------------------------------------------------------------+
|exists(array(1, 2, 3), lambdafunction(((namedlambdavariable() % 2) = 10), namedlambdavariable()))|
+-------------------------------------------------------------------------------------------------+
| false|
+-------------------------------------------------------------------------------------------------+
SELECT exists(array(1, null, 3), x -> x % 2 == 0);
+---------------------------------------------------------------------------------------------------+
|exists(array(1, NULL, 3), lambdafunction(((namedlambdavariable() % 2) = 0), namedlambdavariable()))|
+---------------------------------------------------------------------------------------------------+
| NULL|
+---------------------------------------------------------------------------------------------------+
SELECT exists(array(0, null, 2, 3, null), x -> x IS NULL);
+----------------------------------------------------------------------------------------------------------+
|exists(array(0, NULL, 2, 3, NULL), lambdafunction((namedlambdavariable() IS NULL), namedlambdavariable()))|
+----------------------------------------------------------------------------------------------------------+
| true|
+----------------------------------------------------------------------------------------------------------+
SELECT exists(array(1, 2, 3), x -> x IS NULL);
+----------------------------------------------------------------------------------------------+
|exists(array(1, 2, 3), lambdafunction((namedlambdavariable() IS NULL), namedlambdavariable()))|
+----------------------------------------------------------------------------------------------+
| false|
+----------------------------------------------------------------------------------------------+
-- filter
SELECT filter(array(1, 2, 3), x -> x % 2 == 1);
+------------------------------------------------------------------------------------------------+
|filter(array(1, 2, 3), lambdafunction(((namedlambdavariable() % 2) = 1), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------+
| [1, 3]|
+------------------------------------------------------------------------------------------------+
SELECT filter(array(0, 2, 3), (x, i) -> x > i);
+-------------------------------------------------------------------------------------------------------------------------------------+
|filter(array(0, 2, 3), lambdafunction((namedlambdavariable() > namedlambdavariable()), namedlambdavariable(), namedlambdavariable()))|
+-------------------------------------------------------------------------------------------------------------------------------------+
| [2, 3]|
+-------------------------------------------------------------------------------------------------------------------------------------+
SELECT filter(array(0, null, 2, 3, null), x -> x IS NOT NULL);
+--------------------------------------------------------------------------------------------------------------+
|filter(array(0, NULL, 2, 3, NULL), lambdafunction((namedlambdavariable() IS NOT NULL), namedlambdavariable()))|
+--------------------------------------------------------------------------------------------------------------+
| [0, 2, 3]|
+--------------------------------------------------------------------------------------------------------------+
-- forall
SELECT forall(array(1, 2, 3), x -> x % 2 == 0);
+------------------------------------------------------------------------------------------------+
|forall(array(1, 2, 3), lambdafunction(((namedlambdavariable() % 2) = 0), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------+
| false|
+------------------------------------------------------------------------------------------------+
SELECT forall(array(2, 4, 8), x -> x % 2 == 0);
+------------------------------------------------------------------------------------------------+
|forall(array(2, 4, 8), lambdafunction(((namedlambdavariable() % 2) = 0), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------+
| true|
+------------------------------------------------------------------------------------------------+
SELECT forall(array(1, null, 3), x -> x % 2 == 0);
+---------------------------------------------------------------------------------------------------+
|forall(array(1, NULL, 3), lambdafunction(((namedlambdavariable() % 2) = 0), namedlambdavariable()))|
+---------------------------------------------------------------------------------------------------+
| false|
+---------------------------------------------------------------------------------------------------+
SELECT forall(array(2, null, 8), x -> x % 2 == 0);
+---------------------------------------------------------------------------------------------------+
|forall(array(2, NULL, 8), lambdafunction(((namedlambdavariable() % 2) = 0), namedlambdavariable()))|
+---------------------------------------------------------------------------------------------------+
| NULL|
+---------------------------------------------------------------------------------------------------+
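The NULL results in the exists and forall examples above follow three-valued logic. A small Python model may make the rule easier to see; it is a sketch under the assumption that `None` stands for SQL NULL, and `exists`, `forall`, and `is_even` are illustrative names rather than Spark APIs.

```python
def exists(arr, pred):
    """True if any result is True; else None if any result is None; else False."""
    saw_null = False
    for x in arr:
        result = pred(x)
        if result is True:
            return True
        if result is None:
            saw_null = True
    return None if saw_null else False

def forall(arr, pred):
    """False if any result is False; else None if any result is None; else True."""
    saw_null = False
    for x in arr:
        result = pred(x)
        if result is False:
            return False
        if result is None:
            saw_null = True
    return None if saw_null else True

def is_even(x):
    # Arithmetic on NULL yields NULL, so the predicate result is unknown.
    return None if x is None else x % 2 == 0

print(exists([1, None, 3], is_even))  # None
print(forall([1, None, 3], is_even))  # False
print(forall([2, None, 8], is_even))  # None
```

A definite answer wins whenever one exists: a single true element settles `exists`, and a single false element settles `forall`, regardless of any NULLs.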
-- map_filter
SELECT map_filter(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+-------------------------------------------------------------------------------------------------------------------------------------------------+
|map_filter(map(1, 0, 2, 2, 3, -1), lambdafunction((namedlambdavariable() > namedlambdavariable()), namedlambdavariable(), namedlambdavariable()))|
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| {1 -> 0, 3 -> -1}|
+-------------------------------------------------------------------------------------------------------------------------------------------------+
-- map_zip_with
SELECT map_zip_with(map(1, 'a', 2, 'b'), map(1, 'x', 2, 'y'), (k, v1, v2) -> concat(v1, v2));
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|map_zip_with(map(1, a, 2, b), map(1, x, 2, y), lambdafunction(concat(namedlambdavariable(), namedlambdavariable()), namedlambdavariable(), namedlambdavariable(), namedlambdavariable()))|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {1 -> ax, 2 -> by}|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT map_zip_with(map('a', 1, 'b', 2), map('b', 3, 'c', 4), (k, v1, v2) -> coalesce(v1, 0) + coalesce(v2, 0));
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|map_zip_with(map(a, 1, b, 2), map(b, 3, c, 4), lambdafunction((coalesce(namedlambdavariable(), 0) + coalesce(namedlambdavariable(), 0)), namedlambdavariable(), namedlambdavariable(), namedlambdavariable()))|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {a -> 1, b -> 5, ...|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
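How map_zip_with treats keys that appear in only one map can be sketched in Python. This is an illustrative model, not Spark's code: `map_zip_with` here is a hypothetical helper operating on `dict` objects, with `None` standing in for NULL, and duplicate-key handling is not modeled.

```python
def map_zip_with(map1, map2, func):
    # Union of keys: all keys of map1, then keys found only in map2.
    keys = list(map1) + [k for k in map2 if k not in map1]
    # A key missing from one map contributes None (NULL) for that side.
    return {k: func(k, map1.get(k), map2.get(k)) for k in keys}

def coalesce(value, default):
    """Two-argument stand-in for SQL coalesce."""
    return default if value is None else value

merged = map_zip_with(
    {"a": 1, "b": 2},
    {"b": 3, "c": 4},
    lambda k, v1, v2: coalesce(v1, 0) + coalesce(v2, 0),
)
print(merged)  # {'a': 1, 'b': 5, 'c': 4}
```

Without the `coalesce` calls the lambda would receive `None` for the missing side, which is why the SQL example guards both values.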
-- reduce
SELECT reduce(array(1, 2, 3), 0, (acc, x) -> acc + x);
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|reduce(array(1, 2, 3), 0, lambdafunction((namedlambdavariable() + namedlambdavariable()), namedlambdavariable(), namedlambdavariable()), lambdafunction(namedlambdavariable(), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 6|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT reduce(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10);
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|reduce(array(1, 2, 3), 0, lambdafunction((namedlambdavariable() + namedlambdavariable()), namedlambdavariable(), namedlambdavariable()), lambdafunction((namedlambdavariable() * 10), namedlambdavariable()))|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 60|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
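The merge and finish steps of reduce (and of aggregate, which the examples show behaving the same way) correspond to a left fold followed by a final conversion. A minimal Python sketch, assuming the hypothetical helper name `sql_reduce`:

```python
from functools import reduce as fold

def sql_reduce(arr, start, merge, finish=lambda acc: acc):
    # Fold the array from the left into a single state, then convert it.
    return finish(fold(merge, arr, start))

print(sql_reduce([1, 2, 3], 0, lambda acc, x: acc + x))                         # 6
print(sql_reduce([1, 2, 3], 0, lambda acc, x: acc + x, lambda acc: acc * 10))  # 60
```

In this model, omitting `finish` is the same as passing the identity function, which matches the extra `lambdafunction(namedlambdavariable(), namedlambdavariable())` visible in the header of the first example.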
-- reverse
SELECT reverse('Spark SQL');
+------------------+
|reverse(Spark SQL)|
+------------------+
| LQS krapS|
+------------------+
SELECT reverse(array(2, 1, 4, 3));
+--------------------------+
|reverse(array(2, 1, 4, 3))|
+--------------------------+
| [3, 4, 1, 2]|
+--------------------------+
-- size
SELECT size(array('b', 'd', 'c', 'a'));
+-----------------------+
|size(array(b, d, c, a))|
+-----------------------+
| 4|
+-----------------------+
SELECT size(map('a', 1, 'b', 2));
+---------------------+
|size(map(a, 1, b, 2))|
+---------------------+
| 2|
+---------------------+
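The NULL-input rule that size and cardinality share can be written as a short Python model. This is only a sketch: the keyword arguments `ansi_enabled` and `legacy_size_of_null` are hypothetical stand-ins for `spark.sql.ansi.enabled` and `spark.sql.legacy.sizeOfNull`, and their defaults here are chosen so that the default result for NULL input is NULL, as the description states.

```python
def size(expr, ansi_enabled=False, legacy_size_of_null=False):
    if expr is None:
        # -1 only when ANSI mode is off AND the legacy flag is on.
        if not ansi_enabled and legacy_size_of_null:
            return -1
        return None
    return len(expr)

print(size(["b", "d", "c", "a"]))             # 4
print(size({"a": 1, "b": 2}))                 # 2
print(size(None))                             # None
print(size(None, legacy_size_of_null=True))   # -1
```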
-- transform
SELECT transform(array(1, 2, 3), x -> x + 1);
+---------------------------------------------------------------------------------------------+
|transform(array(1, 2, 3), lambdafunction((namedlambdavariable() + 1), namedlambdavariable()))|
+---------------------------------------------------------------------------------------------+
| [2, 3, 4]|
+---------------------------------------------------------------------------------------------+
SELECT transform(array(1, 2, 3), (x, i) -> x + i);
+----------------------------------------------------------------------------------------------------------------------------------------+
|transform(array(1, 2, 3), lambdafunction((namedlambdavariable() + namedlambdavariable()), namedlambdavariable(), namedlambdavariable()))|
+----------------------------------------------------------------------------------------------------------------------------------------+
| [1, 3, 5]|
+----------------------------------------------------------------------------------------------------------------------------------------+
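Both transform and filter accept either a one-argument lambda or a two-argument lambda whose second argument is the element's position; the outputs above imply that the position is 0-based. A plain-Python sketch of that behavior, using the hypothetical names `transform` and `sql_filter` and inspecting the lambda's arity to decide whether to pass the index:

```python
from inspect import signature

def _call(func, x, i):
    # Pass the 0-based index only when the lambda declares two parameters.
    if len(signature(func).parameters) == 2:
        return func(x, i)
    return func(x)

def transform(arr, func):
    return [_call(func, x, i) for i, x in enumerate(arr)]

def sql_filter(arr, pred):
    # Keep an element only when the predicate is definitely true.
    return [x for i, x in enumerate(arr) if _call(pred, x, i) is True]

print(transform([1, 2, 3], lambda x: x + 1))      # [2, 3, 4]
print(transform([1, 2, 3], lambda x, i: x + i))   # [1, 3, 5]
print(sql_filter([0, 2, 3], lambda x, i: x > i))  # [2, 3]
```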
-- transform_keys
SELECT transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + 1);
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
|transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), lambdafunction((namedlambdavariable() + 1), namedlambdavariable(), namedlambdavariable()))|
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| {2 -> 1, 3 -> 2, ...|
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), lambdafunction((namedlambdavariable() + namedlambdavariable()), namedlambdavariable(), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {2 -> 1, 4 -> 2, ...|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-- transform_values
SELECT transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> v + 1);
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
|transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), lambdafunction((namedlambdavariable() + 1), namedlambdavariable(), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {1 -> 2, 2 -> 3, ...|
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), lambdafunction((namedlambdavariable() + namedlambdavariable()), namedlambdavariable(), namedlambdavariable()))|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {1 -> 2, 2 -> 4, ...|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
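The difference between transform_keys, transform_values, and map_filter is easiest to see side by side. The following Python model is a sketch only: the function names mirror the SQL functions but operate on `dict` objects, and the case where a key lambda produces duplicate keys is not modeled (a Python dict silently keeps the last one).

```python
def transform_keys(m, func):
    # The lambda's result replaces the key; the value is unchanged.
    return {func(k, v): v for k, v in m.items()}

def transform_values(m, func):
    # The lambda's result replaces the value; the key is unchanged.
    return {k: func(k, v) for k, v in m.items()}

def map_filter(m, pred):
    # Keep entries for which the predicate is definitely true.
    return {k: v for k, v in m.items() if pred(k, v) is True}

m = dict(zip([1, 2, 3], [1, 2, 3]))  # analogue of map_from_arrays
print(transform_keys(m, lambda k, v: k + 1))    # {2: 1, 3: 2, 4: 3}
print(transform_values(m, lambda k, v: k + v))  # {1: 2, 2: 4, 3: 6}
print(map_filter({1: 0, 2: 2, 3: -1}, lambda k, v: k > v))  # {1: 0, 3: -1}
```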
-- try_element_at
SELECT try_element_at(array(1, 2, 3), 2);
+---------------------------------+
|try_element_at(array(1, 2, 3), 2)|
+---------------------------------+
| 2|
+---------------------------------+
SELECT try_element_at(map(1, 'a', 2, 'b'), 2);
+----------------------------------+
|try_element_at(map(1, a, 2, b), 2)|
+----------------------------------+
| b|
+----------------------------------+
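The indexing rules that element_at and try_element_at apply to arrays (1-based, negative indices counting from the end, an error for index 0, and differing out-of-range behavior) can be condensed into one Python model. This is a hedged sketch: `element_at` below is a hypothetical helper, `ansi_enabled` stands in for `spark.sql.ansi.enabled`, `try_mode=True` models try_element_at, and Python's `IndexError` and `ValueError` stand in for Spark's exceptions.

```python
def element_at(arr, index, ansi_enabled=False, try_mode=False):
    if index == 0:
        # Both functions reject index 0 according to the descriptions.
        raise ValueError("SQL array indices start at 1")
    if abs(index) > len(arr):
        if ansi_enabled and not try_mode:
            raise IndexError(f"index {index} out of bounds for length {len(arr)}")
        return None  # NULL for out-of-range access
    # Positive indices are 1-based; negative ones count from the end.
    return arr[index - 1] if index > 0 else arr[index]

print(element_at([1, 2, 3], 2))    # 2
print(element_at([1, 2, 3], -1))   # 3
print(element_at([1, 2, 3], 5))    # None
print(element_at([1, 2, 3], 5, ansi_enabled=True, try_mode=True))  # None
```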
-- zip_with
SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x));
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|zip_with(array(1, 2, 3), array(a, b, c), lambdafunction(named_struct(y, namedlambdavariable(), x, namedlambdavariable()), namedlambdavariable(), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{a, 1}, {b, 2}, ...|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT zip_with(array(1, 2), array(3, 4), (x, y) -> x + y);
+-------------------------------------------------------------------------------------------------------------------------------------------------+
|zip_with(array(1, 2), array(3, 4), lambdafunction((namedlambdavariable() + namedlambdavariable()), namedlambdavariable(), namedlambdavariable()))|
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| [4, 6]|
+-------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT zip_with(array('a', 'b', 'c'), array('d', 'e', 'f'), (x, y) -> concat(x, y));
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
|zip_with(array(a, b, c), array(d, e, f), lambdafunction(concat(namedlambdavariable(), namedlambdavariable()), namedlambdavariable(), namedlambdavariable()))|
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [ad, be, cf]|
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
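None of the zip_with examples above uses arrays of different lengths, so the padding rule from the description is sketched here in Python. The helper name `zip_with` is illustrative, and `None` stands in for the NULLs appended to the shorter array.

```python
from itertools import zip_longest

def zip_with(left, right, func):
    # The shorter array is padded with None before func is applied.
    return [func(x, y) for x, y in zip_longest(left, right, fillvalue=None)]

print(zip_with([1, 2], [3, 4], lambda x, y: x + y))            # [4, 6]
print(zip_with(["a", "b", "c"], ["d"], lambda x, y: (x, y)))
# [('a', 'd'), ('b', None), ('c', None)]
```

Because padding happens before the function runs, a lambda applied to arrays of unequal length should be prepared to receive NULL for either argument.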
STRUCT Functions
Function | Description |
---|---|
named_struct(name1, val1, name2, val2, ...) | Creates a struct with the given field names and values. |
struct(col1, col2, col3, ...) | Creates a struct with the given field values. |
Examples
-- named_struct
SELECT named_struct("a", 1, "b", 2, "c", 3);
+------------------------------+
|named_struct(a, 1, b, 2, c, 3)|
+------------------------------+
| {1, 2, 3}|
+------------------------------+
-- struct
SELECT struct(1, 2, 3);
+---------------+
|struct(1, 2, 3)|
+---------------+
| {1, 2, 3}|
+---------------+
Map Functions
Function | Description |
---|---|
map(key0, value0, key1, value1, ...) | Creates a map with the given key/value pairs. |
map_concat(map, ...) | Returns the union of all the given maps. |
map_contains_key(map, key) | Returns true if the map contains the key. |
map_entries(map) | Returns an unordered array of all entries in the given map. |
map_from_arrays(keys, values) | Creates a map from a pair of key and value arrays. No element of `keys` may be null. |
map_from_entries(arrayOfEntries) | Returns a map created from the given array of entries. |
map_keys(map) | Returns an unordered array containing the keys of the map. |
map_values(map) | Returns an unordered array containing the values of the map. |
str_to_map(text[, pairDelim[, keyValueDelim]]) | Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for `pairDelim` and ':' for `keyValueDelim`. Both `pairDelim` and `keyValueDelim` are treated as regular expressions. |
Examples
-- map
SELECT map(1.0, '2', 3.0, '4');
+--------------------+
| map(1.0, 2, 3.0, 4)|
+--------------------+
|{1.0 -> 2, 3.0 -> 4}|
+--------------------+
-- map_concat
SELECT map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
+--------------------------------------+
|map_concat(map(1, a, 2, b), map(3, c))|
+--------------------------------------+
| {1 -> a, 2 -> b, ...|
+--------------------------------------+
-- map_contains_key
SELECT map_contains_key(map(1, 'a', 2, 'b'), 1);
+------------------------------------+
|map_contains_key(map(1, a, 2, b), 1)|
+------------------------------------+
| true|
+------------------------------------+
SELECT map_contains_key(map(1, 'a', 2, 'b'), 3);
+------------------------------------+
|map_contains_key(map(1, a, 2, b), 3)|
+------------------------------------+
| false|
+------------------------------------+
-- map_entries
SELECT map_entries(map(1, 'a', 2, 'b'));
+----------------------------+
|map_entries(map(1, a, 2, b))|
+----------------------------+
| [{1, a}, {2, b}]|
+----------------------------+
-- map_from_arrays
SELECT map_from_arrays(array(1.0, 3.0), array('2', '4'));
+---------------------------------------------+
|map_from_arrays(array(1.0, 3.0), array(2, 4))|
+---------------------------------------------+
| {1.0 -> 2, 3.0 -> 4}|
+---------------------------------------------+
-- map_from_entries
SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
+---------------------------------------------------+
|map_from_entries(array(struct(1, a), struct(2, b)))|
+---------------------------------------------------+
| {1 -> a, 2 -> b}|
+---------------------------------------------------+
-- map_keys
SELECT map_keys(map(1, 'a', 2, 'b'));
+-------------------------+
|map_keys(map(1, a, 2, b))|
+-------------------------+
| [1, 2]|
+-------------------------+
-- map_values
SELECT map_values(map(1, 'a', 2, 'b'));
+---------------------------+
|map_values(map(1, a, 2, b))|
+---------------------------+
| [a, b]|
+---------------------------+
-- str_to_map
SELECT str_to_map('a:1,b:2,c:3', ',', ':');
+-----------------------------+
|str_to_map(a:1,b:2,c:3, ,, :)|
+-----------------------------+
| {a -> 1, b -> 2, ...|
+-----------------------------+
SELECT str_to_map('a');
+-------------------+
|str_to_map(a, ,, :)|
+-------------------+
| {a -> NULL}|
+-------------------+
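Because both delimiters of str_to_map are treated as regular expressions, a rough Python model built on `re.split` shows how the two splitting steps interact. This is a sketch under assumptions: Python's regex dialect and split behavior differ from Java's in edge cases, duplicate-key handling is not modeled, and the snake_case parameter names are stand-ins for `pairDelim` and `keyValueDelim`.

```python
import re

def str_to_map(text, pair_delim=",", key_value_delim=":"):
    result = {}
    for pair in re.split(pair_delim, text):
        # Split each pair once; a pair without a delimiter gets a None value.
        parts = re.split(key_value_delim, pair, maxsplit=1)
        result[parts[0]] = parts[1] if len(parts) == 2 else None
    return result

print(str_to_map("a:1,b:2,c:3"))                # {'a': '1', 'b': '2', 'c': '3'}
print(str_to_map("a"))                          # {'a': None}
print(str_to_map("a=1;b=2|c=3", "[;|]", "="))   # {'a': '1', 'b': '2', 'c': '3'}
```

The last call uses a character class as the pair delimiter, which only works because the delimiter is a pattern rather than a literal string; a literal `|` or `.` would need escaping.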
Date and Timestamp Functions
Function | Description |
---|---|
add_months(start_date, num_months) | Returns the date that is `num_months` after `start_date`. |
convert_timezone([sourceTz, ]targetTz, sourceTs) | Converts the timestamp without time zone `sourceTs` from the `sourceTz` time zone to `targetTz`. |
curdate() | Returns the current date at the start of query evaluation. All calls of curdate within the same query return the same value. |
current_date() | Returns the current date at the start of query evaluation. All calls of current_date within the same query return the same value. |
current_date | Returns the current date at the start of query evaluation. |
current_timestamp() | Returns the current timestamp at the start of query evaluation. All calls of current_timestamp within the same query return the same value. |
current_timestamp | Returns the current timestamp at the start of query evaluation. |
current_timezone() | Returns the current session local timezone. |
date_add(start_date, num_days) | Returns the date that is `num_days` after `start_date`. |
date_diff(endDate, startDate) | Returns the number of days from `startDate` to `endDate`. |
date_format(timestamp, fmt) | Converts `timestamp` to a value of string in the format specified by the date format `fmt`. |
date_from_unix_date(days) | Creates a date from the number of days since 1970-01-01. |
date_part(field, source) | Extracts a part of the date/timestamp or interval source. |
date_sub(start_date, num_days) | Returns the date that is `num_days` before `start_date`. |
date_trunc(fmt, ts) | Returns timestamp `ts` truncated to the unit specified by the format model `fmt`. |
dateadd(start_date, num_days) | Returns the date that is `num_days` after `start_date`. |
datediff(endDate, startDate) | Returns the number of days from `startDate` to `endDate`. |
datepart(field, source) | Extracts a part of the date/timestamp or interval source. |
day(date) | Returns the day of month of the date/timestamp. |
dayname(date) | Returns the three-letter abbreviated day name from the given date. |
dayofmonth(date) | Returns the day of month of the date/timestamp. |
dayofweek(date) | Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday). |
dayofyear(date) | Returns the day of year of the date/timestamp. |
extract(field FROM source) | Extracts a part of the date/timestamp or interval source. |
from_unixtime(unix_time[, fmt]) | Returns `unix_time` in the specified `fmt`. |
from_utc_timestamp(timestamp, timezone) | Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'. |
hour(timestamp) | Returns the hour component of the string/timestamp. |
last_day(date) | Returns the last day of the month which the date belongs to. |
localtimestamp() | Returns the current timestamp without time zone at the start of query evaluation. All calls of localtimestamp within the same query return the same value. |
localtimestamp | Returns the current local date-time at the session time zone at the start of query evaluation. |
make_date(year, month, day) | Creates a date from the year, month and day fields. If the configuration `spark.sql.ansi.enabled` is false, the function returns NULL on invalid inputs. Otherwise, it throws an error. |
make_dt_interval([days[, hours[, mins[, secs]]]]) | Makes a DayTimeIntervalType duration from days, hours, mins and secs. |
make_interval([years[, months[, weeks[, days[, hours[, mins[, secs]]]]]]]) | Makes an interval from years, months, weeks, days, hours, mins and secs. |
make_timestamp(year, month, day, hour, min, sec[, timezone]) | Creates a timestamp from the year, month, day, hour, min, sec and timezone fields. The result data type is consistent with the value of configuration `spark.sql.timestampType`. If the configuration `spark.sql.ansi.enabled` is false, the function returns NULL on invalid inputs. Otherwise, it throws an error. |
make_timestamp_ltz(year, month, day, hour, min, sec[, timezone]) | Creates a timestamp with local time zone from the year, month, day, hour, min, sec and timezone fields. If the configuration `spark.sql.ansi.enabled` is false, the function returns NULL on invalid inputs. Otherwise, it throws an error. |
make_timestamp_ntz(year, month, day, hour, min, sec) | Creates a local date-time from the year, month, day, hour, min and sec fields. If the configuration `spark.sql.ansi.enabled` is false, the function returns NULL on invalid inputs. Otherwise, it throws an error. |
make_ym_interval([years[, months]]) | Makes a year-month interval from years and months. |
minute(timestamp) | Returns the minute component of the string/timestamp. |
month(date) | Returns the month component of the date/timestamp. |
monthname(date) | Returns the three-letter abbreviated month name from the given date. |
months_between(timestamp1, timestamp2[, roundOff]) | Returns the number of months between `timestamp1` and `timestamp2`. If `timestamp1` is later than `timestamp2`, then the result is positive. If `timestamp1` and `timestamp2` are on the same day of month, or both are the last day of month, time of day will be ignored. Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits unless roundOff=false. |
next_day(start_date, day_of_week) | Returns the first date which is later than `start_date` and named as indicated. The function returns NULL if at least one of the input parameters is NULL. When both of the input parameters are not NULL and day_of_week is an invalid input, the function throws SparkIllegalArgumentException if `spark.sql.ansi.enabled` is set to true, otherwise NULL. |
now() | Returns the current timestamp at the start of query evaluation. |
quarter(date) | Returns the quarter of the year for date, in the range 1 to 4. |
second(timestamp) | Returns the second component of the string/timestamp. |
session_window(time_column, gap_duration) | Generates a session window given a timestamp column and a gap duration. See 'Types of time windows' in the Structured Streaming guide for a detailed explanation and examples. |
timestamp_micros(microseconds) | Creates timestamp from the number of microseconds since UTC epoch. |
timestamp_millis(milliseconds) | Creates timestamp from the number of milliseconds since UTC epoch. |
timestamp_seconds(seconds) | Creates timestamp from the number of seconds (can be fractional) since UTC epoch. |
to_date(date_str[, fmt]) | Parses the `date_str` expression with the `fmt` expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the `fmt` is omitted. |
to_timestamp(timestamp_str[, fmt]) | Parses the `timestamp_str` expression with the `fmt` expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the `fmt` is omitted. The result data type is consistent with the value of configuration `spark.sql.timestampType`. |
to_timestamp_ltz(timestamp_str[, fmt]) | Parses the `timestamp_str` expression with the `fmt` expression to a timestamp with local time zone. Returns null with invalid input. By default, it follows casting rules to a timestamp if the `fmt` is omitted. |
to_timestamp_ntz(timestamp_str[, fmt]) | Parses the `timestamp_str` expression with the `fmt` expression to a timestamp without time zone. Returns null with invalid input. By default, it follows casting rules to a timestamp if the `fmt` is omitted. |
to_unix_timestamp(timeExp[, fmt]) | Returns the UNIX timestamp of the given time. |
to_utc_timestamp(timestamp, timezone) | Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'. |
trunc(date, fmt) | Returns `date` with the time portion of the day truncated to the unit specified by the format model `fmt`. |
try_to_timestamp(timestamp_str[, fmt]) | Parses the `timestamp_str` expression with the `fmt` expression to a timestamp. The function always returns null on invalid input, regardless of whether ANSI SQL mode is enabled. By default, it follows casting rules to a timestamp if the `fmt` is omitted. The result data type is consistent with the value of configuration `spark.sql.timestampType`. |
unix_date(date) | Returns the number of days since 1970-01-01. |
unix_micros(timestamp) | Returns the number of microseconds since 1970-01-01 00:00:00 UTC. |
unix_millis(timestamp) | Returns the number of milliseconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision. |
unix_seconds(timestamp) | Returns the number of seconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision. |
unix_timestamp([timeExp[, fmt]]) | Returns the UNIX timestamp of current or specified time. |
weekday(date) | Returns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, ..., 6 = Sunday). |
weekofyear(date) | Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days. |
window(time_column, window_duration[, slide_duration[, start_time]]) | Bucketizes rows into one or more time windows given a timestamp column. Window starts are inclusive but window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows on the order of months are not supported. See 'Window Operations on Event Time' in the Structured Streaming guide for a detailed explanation and examples. |
window_time(window_column) | Extracts the time value from a time/session window column, which can be used as the event time value of the window. The extracted time is (window.end - 1), which reflects the fact that aggregating windows have an exclusive upper bound, [start, end). See 'Window Operations on Event Time' in the Structured Streaming guide for a detailed explanation and examples. |
year(date) | Returns the year component of the date/timestamp. |
Examples
-- add_months
SELECT add_months('2016-08-31', 1);
+-------------------------+
|add_months(2016-08-31, 1)|
+-------------------------+
| 2016-09-30|
+-------------------------+
-- convert_timezone
SELECT convert_timezone('Europe/Brussels', 'America/Los_Angeles', timestamp_ntz'2021-12-06 00:00:00');
+-------------------------------------------------------------------------------------------+
|convert_timezone(Europe/Brussels, America/Los_Angeles, TIMESTAMP_NTZ '2021-12-06 00:00:00')|
+-------------------------------------------------------------------------------------------+
| 2021-12-05 15:00:00|
+-------------------------------------------------------------------------------------------+
SELECT convert_timezone('Europe/Brussels', timestamp_ntz'2021-12-05 15:00:00');
+------------------------------------------------------------------------------------------+
|convert_timezone(current_timezone(), Europe/Brussels, TIMESTAMP_NTZ '2021-12-05 15:00:00')|
+------------------------------------------------------------------------------------------+
| 2021-12-05 16:00:00|
+------------------------------------------------------------------------------------------+
-- curdate
SELECT curdate();
+--------------+
|current_date()|
+--------------+
| 2024-09-23|
+--------------+
-- current_date
SELECT current_date();
+--------------+
|current_date()|
+--------------+
| 2024-09-23|
+--------------+
SELECT current_date;
+--------------+
|current_date()|
+--------------+
| 2024-09-23|
+--------------+
-- current_timestamp
SELECT current_timestamp();
+--------------------+
| current_timestamp()|
+--------------------+
|2024-09-23 04:10:...|
+--------------------+
SELECT current_timestamp;
+--------------------+
| current_timestamp()|
+--------------------+
|2024-09-23 04:10:...|
+--------------------+
-- current_timezone
SELECT current_timezone();
+------------------+
|current_timezone()|
+------------------+
| Etc/UTC|
+------------------+
-- date_add
SELECT date_add('2016-07-30', 1);
+-----------------------+
|date_add(2016-07-30, 1)|
+-----------------------+
| 2016-07-31|
+-----------------------+
-- date_diff
SELECT date_diff('2009-07-31', '2009-07-30');
+---------------------------------+
|date_diff(2009-07-31, 2009-07-30)|
+---------------------------------+
| 1|
+---------------------------------+
SELECT date_diff('2009-07-30', '2009-07-31');
+---------------------------------+
|date_diff(2009-07-30, 2009-07-31)|
+---------------------------------+
| -1|
+---------------------------------+
-- date_format
SELECT date_format('2016-04-08', 'y');
+--------------------------+
|date_format(2016-04-08, y)|
+--------------------------+
| 2016|
+--------------------------+
-- date_from_unix_date
SELECT date_from_unix_date(1);
+----------------------+
|date_from_unix_date(1)|
+----------------------+
| 1970-01-02|
+----------------------+
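The descriptions of `unix_date` and `date_from_unix_date` suggest the two functions are inverses of each other. A minimal sketch of that round trip follows; the column aliases are illustrative only, and the values in the comments follow from the function descriptions rather than from a recorded run.

```sql
-- unix_date counts days since 1970-01-01; date_from_unix_date reverses it.
SELECT unix_date(DATE'1970-01-02') AS days_since_epoch,                   -- 1
       date_from_unix_date(unix_date(DATE'1970-01-02')) AS round_trip;    -- 1970-01-02
```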
-- date_part
SELECT date_part('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+-------------------------------------------------------+
|date_part(YEAR, TIMESTAMP '2019-08-12 01:00:00.123456')|
+-------------------------------------------------------+
| 2019|
+-------------------------------------------------------+
SELECT date_part('week', timestamp'2019-08-12 01:00:00.123456');
+-------------------------------------------------------+
|date_part(week, TIMESTAMP '2019-08-12 01:00:00.123456')|
+-------------------------------------------------------+
| 33|
+-------------------------------------------------------+
SELECT date_part('doy', DATE'2019-08-12');
+---------------------------------+
|date_part(doy, DATE '2019-08-12')|
+---------------------------------+
| 224|
+---------------------------------+
SELECT date_part('SECONDS', timestamp'2019-10-01 00:00:01.000001');
+----------------------------------------------------------+
|date_part(SECONDS, TIMESTAMP '2019-10-01 00:00:01.000001')|
+----------------------------------------------------------+
| 1.000001|
+----------------------------------------------------------+
SELECT date_part('days', interval 5 days 3 hours 7 minutes);
+-------------------------------------------------+
|date_part(days, INTERVAL '5 03:07' DAY TO MINUTE)|
+-------------------------------------------------+
| 5|
+-------------------------------------------------+
SELECT date_part('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+-------------------------------------------------------------+
|date_part(seconds, INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+-------------------------------------------------------------+
| 30.001001|
+-------------------------------------------------------------+
SELECT date_part('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
+--------------------------------------------------+
|date_part(MONTH, INTERVAL '2021-11' YEAR TO MONTH)|
+--------------------------------------------------+
| 11|
+--------------------------------------------------+
SELECT date_part('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+---------------------------------------------------------------+
|date_part(MINUTE, INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+---------------------------------------------------------------+
| 55|
+---------------------------------------------------------------+
-- date_sub
SELECT date_sub('2016-07-30', 1);
+-----------------------+
|date_sub(2016-07-30, 1)|
+-----------------------+
| 2016-07-29|
+-----------------------+
-- date_trunc
SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359');
+-----------------------------------------+
|date_trunc(YEAR, 2015-03-05T09:32:05.359)|
+-----------------------------------------+
| 2015-01-01 00:00:00|
+-----------------------------------------+
SELECT date_trunc('MM', '2015-03-05T09:32:05.359');
+---------------------------------------+
|date_trunc(MM, 2015-03-05T09:32:05.359)|
+---------------------------------------+
| 2015-03-01 00:00:00|
+---------------------------------------+
SELECT date_trunc('DD', '2015-03-05T09:32:05.359');
+---------------------------------------+
|date_trunc(DD, 2015-03-05T09:32:05.359)|
+---------------------------------------+
| 2015-03-05 00:00:00|
+---------------------------------------+
SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');
+-----------------------------------------+
|date_trunc(HOUR, 2015-03-05T09:32:05.359)|
+-----------------------------------------+
| 2015-03-05 09:00:00|
+-----------------------------------------+
SELECT date_trunc('MILLISECOND', '2015-03-05T09:32:05.123456');
+---------------------------------------------------+
|date_trunc(MILLISECOND, 2015-03-05T09:32:05.123456)|
+---------------------------------------------------+
| 2015-03-05 09:32:...|
+---------------------------------------------------+
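`date_trunc(fmt, ts)` and `trunc(date, fmt)` take their arguments in opposite order and return different types. The sketch below puts them side by side; the aliases are hypothetical, and the commented values are what the descriptions above imply.

```sql
-- trunc takes (date, fmt) and yields a date;
-- date_trunc takes (fmt, ts) and yields a timestamp.
SELECT trunc('2015-03-05', 'MM') AS truncated_date,                      -- 2015-03-01
       date_trunc('MM', '2015-03-05T09:32:05.359') AS truncated_ts;      -- 2015-03-01 00:00:00
```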
-- dateadd
SELECT dateadd('2016-07-30', 1);
+-----------------------+
|date_add(2016-07-30, 1)|
+-----------------------+
| 2016-07-31|
+-----------------------+
-- datediff
SELECT datediff('2009-07-31', '2009-07-30');
+--------------------------------+
|datediff(2009-07-31, 2009-07-30)|
+--------------------------------+
| 1|
+--------------------------------+
SELECT datediff('2009-07-30', '2009-07-31');
+--------------------------------+
|datediff(2009-07-30, 2009-07-31)|
+--------------------------------+
| -1|
+--------------------------------+
-- datepart
SELECT datepart('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+----------------------------------------------------------+
|datepart(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+----------------------------------------------------------+
| 2019|
+----------------------------------------------------------+
SELECT datepart('week', timestamp'2019-08-12 01:00:00.123456');
+----------------------------------------------------------+
|datepart(week FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+----------------------------------------------------------+
| 33|
+----------------------------------------------------------+
SELECT datepart('doy', DATE'2019-08-12');
+------------------------------------+
|datepart(doy FROM DATE '2019-08-12')|
+------------------------------------+
| 224|
+------------------------------------+
SELECT datepart('SECONDS', timestamp'2019-10-01 00:00:01.000001');
+-------------------------------------------------------------+
|datepart(SECONDS FROM TIMESTAMP '2019-10-01 00:00:01.000001')|
+-------------------------------------------------------------+
| 1.000001|
+-------------------------------------------------------------+
SELECT datepart('days', interval 5 days 3 hours 7 minutes);
+----------------------------------------------------+
|datepart(days FROM INTERVAL '5 03:07' DAY TO MINUTE)|
+----------------------------------------------------+
| 5|
+----------------------------------------------------+
SELECT datepart('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+----------------------------------------------------------------+
|datepart(seconds FROM INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+----------------------------------------------------------------+
| 30.001001|
+----------------------------------------------------------------+
SELECT datepart('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
+-----------------------------------------------------+
|datepart(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH)|
+-----------------------------------------------------+
| 11|
+-----------------------------------------------------+
SELECT datepart('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+------------------------------------------------------------------+
|datepart(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+------------------------------------------------------------------+
| 55|
+------------------------------------------------------------------+
-- day
SELECT day('2009-07-30');
+---------------+
|day(2009-07-30)|
+---------------+
| 30|
+---------------+
-- dayname
SELECT dayname(DATE('2008-02-20'));
+-------------------+
|dayname(2008-02-20)|
+-------------------+
| Wed|
+-------------------+
-- dayofmonth
SELECT dayofmonth('2009-07-30');
+----------------------+
|dayofmonth(2009-07-30)|
+----------------------+
| 30|
+----------------------+
-- dayofweek
SELECT dayofweek('2009-07-30');
+---------------------+
|dayofweek(2009-07-30)|
+---------------------+
| 5|
+---------------------+
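`dayofweek` and `weekday` number the days differently (1 = Sunday versus 0 = Monday), which is easy to mix up. A small sketch for the same Thursday, with illustrative aliases:

```sql
-- 2009-07-30 is a Thursday.
SELECT dayofweek('2009-07-30') AS sunday_based,       -- 5 (1 = Sunday)
       weekday('2009-07-30') AS monday_based,         -- 3 (0 = Monday)
       dayname(DATE'2009-07-30') AS day_name;         -- Thu
```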
-- dayofyear
SELECT dayofyear('2016-04-09');
+---------------------+
|dayofyear(2016-04-09)|
+---------------------+
| 100|
+---------------------+
-- extract
SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456');
+---------------------------------------------------------+
|extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+---------------------------------------------------------+
| 2019|
+---------------------------------------------------------+
SELECT extract(week FROM timestamp'2019-08-12 01:00:00.123456');
+---------------------------------------------------------+
|extract(week FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+---------------------------------------------------------+
| 33|
+---------------------------------------------------------+
SELECT extract(doy FROM DATE'2019-08-12');
+-----------------------------------+
|extract(doy FROM DATE '2019-08-12')|
+-----------------------------------+
| 224|
+-----------------------------------+
SELECT extract(SECONDS FROM timestamp'2019-10-01 00:00:01.000001');
+------------------------------------------------------------+
|extract(SECONDS FROM TIMESTAMP '2019-10-01 00:00:01.000001')|
+------------------------------------------------------------+
| 1.000001|
+------------------------------------------------------------+
SELECT extract(days FROM interval 5 days 3 hours 7 minutes);
+---------------------------------------------------+
|extract(days FROM INTERVAL '5 03:07' DAY TO MINUTE)|
+---------------------------------------------------+
| 5|
+---------------------------------------------------+
SELECT extract(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+---------------------------------------------------------------+
|extract(seconds FROM INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+---------------------------------------------------------------+
| 30.001001|
+---------------------------------------------------------------+
SELECT extract(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH);
+----------------------------------------------------+
|extract(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH)|
+----------------------------------------------------+
| 11|
+----------------------------------------------------+
SELECT extract(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+-----------------------------------------------------------------+
|extract(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+-----------------------------------------------------------------+
| 55|
+-----------------------------------------------------------------+
-- from_unixtime
SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
+-------------------------------------+
|from_unixtime(0, yyyy-MM-dd HH:mm:ss)|
+-------------------------------------+
| 1970-01-01 00:00:00|
+-------------------------------------+
SELECT from_unixtime(0);
+-------------------------------------+
|from_unixtime(0, yyyy-MM-dd HH:mm:ss)|
+-------------------------------------+
| 1970-01-01 00:00:00|
+-------------------------------------+
-- from_utc_timestamp
SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
+------------------------------------------+
|from_utc_timestamp(2016-08-31, Asia/Seoul)|
+------------------------------------------+
| 2016-08-31 09:00:00|
+------------------------------------------+
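Since `to_utc_timestamp` interprets its input in the given time zone and renders it in UTC, applying it to the result above should undo the shift. This is a sketch under the assumption that the zone has no daylight saving transition in between (Asia/Seoul had none in 2016); the alias is illustrative.

```sql
-- Shift UTC midnight to Seoul wall-clock time (+9 hours), then back again.
-- Expected to return 2016-08-31 00:00:00, given the 09:00:00 result above.
SELECT to_utc_timestamp(from_utc_timestamp('2016-08-31', 'Asia/Seoul'), 'Asia/Seoul') AS round_trip;
```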
-- hour
SELECT hour('2009-07-30 12:58:59');
+-------------------------+
|hour(2009-07-30 12:58:59)|
+-------------------------+
| 12|
+-------------------------+
-- last_day
SELECT last_day('2009-01-12');
+--------------------+
|last_day(2009-01-12)|
+--------------------+
| 2009-01-31|
+--------------------+
-- localtimestamp
SELECT localtimestamp();
+--------------------+
| localtimestamp()|
+--------------------+
|2024-09-23 04:10:...|
+--------------------+
-- make_date
SELECT make_date(2013, 7, 15);
+----------------------+
|make_date(2013, 7, 15)|
+----------------------+
| 2013-07-15|
+----------------------+
SELECT make_date(2019, 7, NULL);
+------------------------+
|make_date(2019, 7, NULL)|
+------------------------+
| NULL|
+------------------------+
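The description of `make_date` says invalid inputs yield NULL only when `spark.sql.ansi.enabled` is false. A hedged sketch of that case follows; the specific invalid dates are chosen here for illustration, and with ANSI mode enabled the same query is expected to raise an error instead.

```sql
-- With ANSI mode off, out-of-range fields produce NULL rather than an error.
SET spark.sql.ansi.enabled = false;
SELECT make_date(2019, 2, 30) AS invalid_day,      -- NULL: February has no 30th day
       make_date(2019, 13, 1) AS invalid_month;    -- NULL: month must be 1 to 12
```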
-- make_dt_interval
SELECT make_dt_interval(1, 12, 30, 01.001001);
+-------------------------------------+
|make_dt_interval(1, 12, 30, 1.001001)|
+-------------------------------------+
| INTERVAL '1 12:30...|
+-------------------------------------+
SELECT make_dt_interval(2);
+-----------------------------------+
|make_dt_interval(2, 0, 0, 0.000000)|
+-----------------------------------+
| INTERVAL '2 00:00...|
+-----------------------------------+
SELECT make_dt_interval(100, null, 3);
+----------------------------------------+
|make_dt_interval(100, NULL, 3, 0.000000)|
+----------------------------------------+
| NULL|
+----------------------------------------+
-- make_interval
SELECT make_interval(100, 11, 1, 1, 12, 30, 01.001001);
+----------------------------------------------+
|make_interval(100, 11, 1, 1, 12, 30, 1.001001)|
+----------------------------------------------+
| 100 years 11 mont...|
+----------------------------------------------+
SELECT make_interval(100, null, 3);
+----------------------------------------------+
|make_interval(100, NULL, 3, 0, 0, 0, 0.000000)|
+----------------------------------------------+
| NULL|
+----------------------------------------------+
SELECT make_interval(0, 1, 0, 1, 0, 0, 100.000001);
+-------------------------------------------+
|make_interval(0, 1, 0, 1, 0, 0, 100.000001)|
+-------------------------------------------+
| 1 months 1 days 1...|
+-------------------------------------------+
-- make_timestamp
SELECT make_timestamp(2014, 12, 28, 6, 30, 45.887);
+-------------------------------------------+
|make_timestamp(2014, 12, 28, 6, 30, 45.887)|
+-------------------------------------------+
| 2014-12-28 06:30:...|
+-------------------------------------------+
SELECT make_timestamp(2014, 12, 28, 6, 30, 45.887, 'CET');
+------------------------------------------------+
|make_timestamp(2014, 12, 28, 6, 30, 45.887, CET)|
+------------------------------------------------+
| 2014-12-28 05:30:...|
+------------------------------------------------+
SELECT make_timestamp(2019, 6, 30, 23, 59, 60);
+---------------------------------------+
|make_timestamp(2019, 6, 30, 23, 59, 60)|
+---------------------------------------+
| 2019-07-01 00:00:00|
+---------------------------------------+
SELECT make_timestamp(2019, 6, 30, 23, 59, 1);
+--------------------------------------+
|make_timestamp(2019, 6, 30, 23, 59, 1)|
+--------------------------------------+
| 2019-06-30 23:59:01|
+--------------------------------------+
SELECT make_timestamp(null, 7, 22, 15, 30, 0);
+--------------------------------------+
|make_timestamp(NULL, 7, 22, 15, 30, 0)|
+--------------------------------------+
| NULL|
+--------------------------------------+
-- make_timestamp_ltz
SELECT make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887);
+-----------------------------------------------+
|make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887)|
+-----------------------------------------------+
| 2014-12-28 06:30:...|
+-----------------------------------------------+
SELECT make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887, 'CET');
+----------------------------------------------------+
|make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887, CET)|
+----------------------------------------------------+
| 2014-12-28 05:30:...|
+----------------------------------------------------+
SELECT make_timestamp_ltz(2019, 6, 30, 23, 59, 60);
+-------------------------------------------+
|make_timestamp_ltz(2019, 6, 30, 23, 59, 60)|
+-------------------------------------------+
| 2019-07-01 00:00:00|
+-------------------------------------------+
SELECT make_timestamp_ltz(null, 7, 22, 15, 30, 0);
+------------------------------------------+
|make_timestamp_ltz(NULL, 7, 22, 15, 30, 0)|
+------------------------------------------+
| NULL|
+------------------------------------------+
-- make_timestamp_ntz
SELECT make_timestamp_ntz(2014, 12, 28, 6, 30, 45.887);
+-----------------------------------------------+
|make_timestamp_ntz(2014, 12, 28, 6, 30, 45.887)|
+-----------------------------------------------+
| 2014-12-28 06:30:...|
+-----------------------------------------------+
SELECT make_timestamp_ntz(2019, 6, 30, 23, 59, 60);
+-------------------------------------------+
|make_timestamp_ntz(2019, 6, 30, 23, 59, 60)|
+-------------------------------------------+
| 2019-07-01 00:00:00|
+-------------------------------------------+
SELECT make_timestamp_ntz(null, 7, 22, 15, 30, 0);
+------------------------------------------+
|make_timestamp_ntz(NULL, 7, 22, 15, 30, 0)|
+------------------------------------------+
| NULL|
+------------------------------------------+
-- make_ym_interval
SELECT make_ym_interval(1, 2);
+----------------------+
|make_ym_interval(1, 2)|
+----------------------+
| INTERVAL '1-2' YE...|
+----------------------+
SELECT make_ym_interval(1, 0);
+----------------------+
|make_ym_interval(1, 0)|
+----------------------+
| INTERVAL '1-0' YE...|
+----------------------+
SELECT make_ym_interval(-1, 1);
+-----------------------+
|make_ym_interval(-1, 1)|
+-----------------------+
| INTERVAL '-0-11' ...|
+-----------------------+
SELECT make_ym_interval(2);
+----------------------+
|make_ym_interval(2, 0)|
+----------------------+
| INTERVAL '2-0' YE...|
+----------------------+
-- minute
SELECT minute('2009-07-30 12:58:59');
+---------------------------+
|minute(2009-07-30 12:58:59)|
+---------------------------+
| 58|
+---------------------------+
-- month
SELECT month('2016-07-30');
+-----------------+
|month(2016-07-30)|
+-----------------+
| 7|
+-----------------+
-- monthname
SELECT monthname('2008-02-20');
+---------------------+
|monthname(2008-02-20)|
+---------------------+
| Feb|
+---------------------+
-- months_between
SELECT months_between('1997-02-28 10:30:00', '1996-10-30');
+-----------------------------------------------------+
|months_between(1997-02-28 10:30:00, 1996-10-30, true)|
+-----------------------------------------------------+
| 3.94959677|
+-----------------------------------------------------+
SELECT months_between('1997-02-28 10:30:00', '1996-10-30', false);
+------------------------------------------------------+
|months_between(1997-02-28 10:30:00, 1996-10-30, false)|
+------------------------------------------------------+
| 3.9495967741935485|
+------------------------------------------------------+
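The result above can be reproduced by hand from the "31 days per month" rule in the description. This sketch assumes the fraction is the day-of-month and time-of-day difference in seconds divided by the seconds in 31 days; the alias is hypothetical.

```sql
-- 1996-10 to 1997-02 is 4 whole months.
-- Day-of-month difference: 28 - 30 = -2 days; time-of-day difference: +10:30:00.
-- The remainder is expressed as a fraction of a 31-day month.
SELECT 4 + ((28 - 30) * 86400 + 10 * 3600 + 30 * 60) / CAST(31 * 86400 AS DOUBLE) AS manual_months_between;
```

The value comes out at approximately 3.9496, in line with both `months_between` results shown above.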
-- next_day
SELECT next_day('2015-01-14', 'TU');
+------------------------+
|next_day(2015-01-14, TU)|
+------------------------+
| 2015-01-20|
+------------------------+
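Because `next_day` returns the first matching date that is strictly later than `start_date`, a start date that already falls on the requested day moves forward a full week. A minimal sketch, with an illustrative alias:

```sql
-- 2015-01-20 is itself a Tuesday, so the next Tuesday is one week later.
SELECT next_day('2015-01-20', 'TU') AS following_tuesday;   -- 2015-01-27
```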
-- now
SELECT now();
+--------------------+
| now()|
+--------------------+
|2024-09-23 04:10:...|
+--------------------+
-- quarter
SELECT quarter('2016-08-31');
+-------------------+
|quarter(2016-08-31)|
+-------------------+
| 3|
+-------------------+
-- second
SELECT second('2009-07-30 12:58:59');
+---------------------------+
|second(2009-07-30 12:58:59)|
+---------------------------+
| 59|
+---------------------------+
-- session_window
SELECT a, session_window.start, session_window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:10:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP BY a, session_window(b, '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
| a| start| end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:09:30| 2|
| A1|2021-01-01 00:10:00|2021-01-01 00:15:00| 1|
| A2|2021-01-01 00:01:00|2021-01-01 00:06:00| 1|
+---+-------------------+-------------------+---+
SELECT a, session_window.start, session_window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:10:00'), ('A2', '2021-01-01 00:01:00'), ('A2', '2021-01-01 00:04:30') AS tab(a, b) GROUP BY a, session_window(b, CASE WHEN a = 'A1' THEN '5 minutes' WHEN a = 'A2' THEN '1 minute' ELSE '10 minutes' END) ORDER BY a, start;
+---+-------------------+-------------------+---+
| a| start| end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:09:30| 2|
| A1|2021-01-01 00:10:00|2021-01-01 00:15:00| 1|
| A2|2021-01-01 00:01:00|2021-01-01 00:02:00| 1|
| A2|2021-01-01 00:04:30|2021-01-01 00:05:30| 1|
+---+-------------------+-------------------+---+
-- timestamp_micros
SELECT timestamp_micros(1230219000123123);
+----------------------------------+
|timestamp_micros(1230219000123123)|
+----------------------------------+
| 2008-12-25 15:30:...|
+----------------------------------+
-- timestamp_millis
SELECT timestamp_millis(1230219000123);
+-------------------------------+
|timestamp_millis(1230219000123)|
+-------------------------------+
| 2008-12-25 15:30:...|
+-------------------------------+
-- timestamp_seconds
SELECT timestamp_seconds(1230219000);
+-----------------------------+
|timestamp_seconds(1230219000)|
+-----------------------------+
| 2008-12-25 15:30:00|
+-----------------------------+
SELECT timestamp_seconds(1230219000.123);
+---------------------------------+
|timestamp_seconds(1230219000.123)|
+---------------------------------+
| 2008-12-25 15:30:...|
+---------------------------------+
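The three timestamp_* constructors above differ only in the unit counted from the Unix epoch. A minimal Python sketch of that relationship follows; the helper names are hypothetical, and it assumes a UTC session time zone, which is consistent with the outputs shown above.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch in UTC (assumption: the session time zone is UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_seconds(seconds):
    return EPOCH + timedelta(seconds=seconds)

def from_millis(millis):
    return EPOCH + timedelta(milliseconds=millis)

def from_micros(micros):
    return EPOCH + timedelta(microseconds=micros)

print(from_seconds(1230219000))
print(from_millis(1230219000123))
print(from_micros(1230219000123123))
```

All three calls land on the same second; the millisecond and microsecond inputs only add a fractional part.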
-- to_date
SELECT to_date('2009-07-30 04:17:52');
+----------------------------+
|to_date(2009-07-30 04:17:52)|
+----------------------------+
| 2009-07-30|
+----------------------------+
SELECT to_date('2016-12-31', 'yyyy-MM-dd');
+-------------------------------+
|to_date(2016-12-31, yyyy-MM-dd)|
+-------------------------------+
| 2016-12-31|
+-------------------------------+
-- to_timestamp
SELECT to_timestamp('2016-12-31 00:12:00');
+---------------------------------+
|to_timestamp(2016-12-31 00:12:00)|
+---------------------------------+
| 2016-12-31 00:12:00|
+---------------------------------+
SELECT to_timestamp('2016-12-31', 'yyyy-MM-dd');
+------------------------------------+
|to_timestamp(2016-12-31, yyyy-MM-dd)|
+------------------------------------+
| 2016-12-31 00:00:00|
+------------------------------------+
-- to_timestamp_ltz
SELECT to_timestamp_ltz('2016-12-31 00:12:00');
+-------------------------------------+
|to_timestamp_ltz(2016-12-31 00:12:00)|
+-------------------------------------+
| 2016-12-31 00:12:00|
+-------------------------------------+
SELECT to_timestamp_ltz('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|to_timestamp_ltz(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
| 2016-12-31 00:00:00|
+----------------------------------------+
-- to_timestamp_ntz
SELECT to_timestamp_ntz('2016-12-31 00:12:00');
+-------------------------------------+
|to_timestamp_ntz(2016-12-31 00:12:00)|
+-------------------------------------+
| 2016-12-31 00:12:00|
+-------------------------------------+
SELECT to_timestamp_ntz('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|to_timestamp_ntz(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
| 2016-12-31 00:00:00|
+----------------------------------------+
-- to_unix_timestamp
SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd');
+-----------------------------------------+
|to_unix_timestamp(2016-04-08, yyyy-MM-dd)|
+-----------------------------------------+
| 1460073600|
+-----------------------------------------+
-- to_utc_timestamp
SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul');
+----------------------------------------+
|to_utc_timestamp(2016-08-31, Asia/Seoul)|
+----------------------------------------+
| 2016-08-30 15:00:00|
+----------------------------------------+
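The to_utc_timestamp result above can be checked by hand: the input is read as wall-clock time in the given zone and then shifted to UTC. The Python sketch below assumes Asia/Seoul can be modelled as a fixed UTC+9 offset for this date; the helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Seoul is a fixed UTC+9 offset (no daylight saving in 2016).
SEOUL = timezone(timedelta(hours=9))

def to_utc(local_naive, tz):
    """Read a naive wall-clock time in `tz` and return the UTC wall-clock time."""
    aware = local_naive.replace(tzinfo=tz)
    return aware.astimezone(timezone.utc).replace(tzinfo=None)

print(to_utc(datetime(2016, 8, 31), SEOUL))  # 2016-08-30 15:00:00
```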
-- trunc
SELECT trunc('2019-08-04', 'week');
+-----------------------+
|trunc(2019-08-04, week)|
+-----------------------+
| 2019-07-29|
+-----------------------+
SELECT trunc('2019-08-04', 'quarter');
+--------------------------+
|trunc(2019-08-04, quarter)|
+--------------------------+
| 2019-07-01|
+--------------------------+
SELECT trunc('2009-02-12', 'MM');
+---------------------+
|trunc(2009-02-12, MM)|
+---------------------+
| 2009-02-01|
+---------------------+
SELECT trunc('2015-10-27', 'YEAR');
+-----------------------+
|trunc(2015-10-27, YEAR)|
+-----------------------+
| 2015-01-01|
+-----------------------+
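The four trunc examples above each reset a date to the first day of the enclosing unit. The following Python sketch reproduces them; it covers only the units used in these examples, and the function name and the error raised for other units are assumptions of the sketch, not Spark behavior.

```python
from datetime import date, timedelta

def trunc(d, unit):
    """Truncate date `d` to the start of the given unit."""
    unit = unit.lower()
    if unit == "year":
        return date(d.year, 1, 1)
    if unit == "quarter":
        first_month = 3 * ((d.month - 1) // 3) + 1
        return date(d.year, first_month, 1)
    if unit == "mm":
        return date(d.year, d.month, 1)
    if unit == "week":
        # Weeks start on Monday; date.weekday() is 0 for Monday.
        return d - timedelta(days=d.weekday())
    raise ValueError(f"unit not covered by this sketch: {unit}")

print(trunc(date(2019, 8, 4), "week"))     # 2019-07-29
print(trunc(date(2019, 8, 4), "quarter"))  # 2019-07-01
```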
-- try_to_timestamp
SELECT try_to_timestamp('2016-12-31 00:12:00');
+-------------------------------------+
|try_to_timestamp(2016-12-31 00:12:00)|
+-------------------------------------+
| 2016-12-31 00:12:00|
+-------------------------------------+
SELECT try_to_timestamp('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|try_to_timestamp(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
| 2016-12-31 00:00:00|
+----------------------------------------+
SELECT try_to_timestamp('foo', 'yyyy-MM-dd');
+---------------------------------+
|try_to_timestamp(foo, yyyy-MM-dd)|
+---------------------------------+
| NULL|
+---------------------------------+
-- unix_date
SELECT unix_date(DATE("1970-01-02"));
+---------------------+
|unix_date(1970-01-02)|
+---------------------+
| 1|
+---------------------+
-- unix_micros
SELECT unix_micros(TIMESTAMP('1970-01-01 00:00:01Z'));
+---------------------------------+
|unix_micros(1970-01-01 00:00:01Z)|
+---------------------------------+
| 1000000|
+---------------------------------+
-- unix_millis
SELECT unix_millis(TIMESTAMP('1970-01-01 00:00:01Z'));
+---------------------------------+
|unix_millis(1970-01-01 00:00:01Z)|
+---------------------------------+
| 1000|
+---------------------------------+
-- unix_seconds
SELECT unix_seconds(TIMESTAMP('1970-01-01 00:00:01Z'));
+----------------------------------+
|unix_seconds(1970-01-01 00:00:01Z)|
+----------------------------------+
| 1|
+----------------------------------+
-- unix_timestamp
SELECT unix_timestamp();
+--------------------------------------------------------+
|unix_timestamp(current_timestamp(), yyyy-MM-dd HH:mm:ss)|
+--------------------------------------------------------+
| 1727064630|
+--------------------------------------------------------+
SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');
+--------------------------------------+
|unix_timestamp(2016-04-08, yyyy-MM-dd)|
+--------------------------------------+
| 1460073600|
+--------------------------------------+
-- weekday
SELECT weekday('2009-07-30');
+-------------------+
|weekday(2009-07-30)|
+-------------------+
| 3|
+-------------------+
-- weekofyear
SELECT weekofyear('2008-02-20');
+----------------------+
|weekofyear(2008-02-20)|
+----------------------+
| 8|
+----------------------+
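The weekday and weekofyear results above are consistent with Monday-based day numbering (0 = Monday) and ISO 8601 week numbering. Python's standard library uses the same conventions, so a small cross-check is possible; the wrapper names simply mirror the SQL functions.

```python
from datetime import date

def weekday(d):
    # date.weekday() counts from Monday = 0 to Sunday = 6.
    return d.weekday()

def weekofyear(d):
    # isocalendar() returns (ISO year, ISO week number, ISO weekday).
    return d.isocalendar()[1]

print(weekday(date(2009, 7, 30)))     # 3, a Thursday
print(weekofyear(date(2008, 2, 20)))  # 8
```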
-- window
SELECT a, window.start, window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
| a| start| end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:05:00| 2|
| A1|2021-01-01 00:05:00|2021-01-01 00:10:00| 1|
| A2|2021-01-01 00:00:00|2021-01-01 00:05:00| 1|
+---+-------------------+-------------------+---+
SELECT a, window.start, window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '10 minutes', '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
| a| start| end|cnt|
+---+-------------------+-------------------+---+
| A1|2020-12-31 23:55:00|2021-01-01 00:05:00| 2|
| A1|2021-01-01 00:00:00|2021-01-01 00:10:00| 3|
| A1|2021-01-01 00:05:00|2021-01-01 00:15:00| 1|
| A2|2020-12-31 23:55:00|2021-01-01 00:05:00| 1|
| A2|2021-01-01 00:00:00|2021-01-01 00:10:00| 1|
+---+-------------------+-------------------+---+
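In the sliding-window example above a single row can be counted in more than one window. The Python sketch below lists the windows that contain a given timestamp, assuming windows are aligned to the Unix epoch with no start offset and are half-open (start inclusive, end exclusive); `windows_for` is a hypothetical helper, not a Spark API.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def windows_for(ts, window, slide=None):
    """Return the (start, end) windows that contain `ts`.

    With slide omitted the windows tumble (slide == window).
    """
    if slide is None:
        slide = window
    # Latest slide-aligned start that is not after ts.
    start = ts - ((ts - EPOCH) % slide)
    found = []
    while start + window > ts:
        found.append((start, start + window))
        start -= slide
    return list(reversed(found))

for start, end in windows_for(datetime(2021, 1, 1, 0, 6),
                              timedelta(minutes=10), timedelta(minutes=5)):
    print(start, end)
```

The row at 00:06:00 falls into the windows starting at 00:00:00 and 00:05:00, which matches the counts of 3 and 1 in the second result grid.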
-- window_time
SELECT a, window.start as start, window.end as end, window_time(window), cnt FROM (SELECT a, window, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '5 minutes') ORDER BY a, window.start);
+---+-------------------+-------------------+--------------------+---+
| a| start| end| window_time(window)|cnt|
+---+-------------------+-------------------+--------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:05:00|2021-01-01 00:04:...| 2|
| A1|2021-01-01 00:05:00|2021-01-01 00:10:00|2021-01-01 00:09:...| 1|
| A2|2021-01-01 00:00:00|2021-01-01 00:05:00|2021-01-01 00:04:...| 1|
+---+-------------------+-------------------+--------------------+---+
-- year
SELECT year('2016-07-30');
+----------------+
|year(2016-07-30)|
+----------------+
| 2016|
+----------------+
Mathematical Functions
Function | Description |
---|---|
expr1 % expr2, or mod(expr1, expr2) | Returns the remainder after `expr1`/`expr2`. |
expr1 * expr2 | Returns `expr1`*`expr2`. |
expr1 + expr2 | Returns `expr1`+`expr2`. |
expr1 - expr2 | Returns `expr1`-`expr2`. |
expr1 / expr2 | Returns `expr1`/`expr2`. It always performs floating point division. |
abs(expr) | Returns the absolute value of the numeric or interval value. |
acos(expr) | Returns the inverse cosine (a.k.a. arc cosine) of `expr`, as if computed by `java.lang.Math.acos`. |
acosh(expr) | Returns the inverse hyperbolic cosine of `expr`. |
asin(expr) | Returns the inverse sine (a.k.a. arc sine) of `expr`, as if computed by `java.lang.Math.asin`. |
asinh(expr) | Returns the inverse hyperbolic sine of `expr`. |
atan(expr) | Returns the inverse tangent (a.k.a. arc tangent) of `expr`, as if computed by `java.lang.Math.atan`. |
atan2(exprY, exprX) | Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates (`exprX`, `exprY`), as if computed by `java.lang.Math.atan2`. |
atanh(expr) | Returns the inverse hyperbolic tangent of `expr`. |
bin(expr) | Returns the string representation of the long value `expr` represented in binary. |
bround(expr, d) | Returns `expr` rounded to `d` decimal places using HALF_EVEN rounding mode. |
cbrt(expr) | Returns the cube root of `expr`. |
ceil(expr[, scale]) | Returns the smallest number after rounding up that is not smaller than `expr`. An optional `scale` parameter can be specified to control the rounding behavior. |
ceiling(expr[, scale]) | Returns the smallest number after rounding up that is not smaller than `expr`. An optional `scale` parameter can be specified to control the rounding behavior. |
conv(num, from_base, to_base) | Converts `num` from `from_base` to `to_base`. |
cos(expr) | Returns the cosine of `expr`, as if computed by `java.lang.Math.cos`. |
cosh(expr) | Returns the hyperbolic cosine of `expr`, as if computed by `java.lang.Math.cosh`. |
cot(expr) | Returns the cotangent of `expr`, as if computed by `1/java.lang.Math.tan`. |
csc(expr) | Returns the cosecant of `expr`, as if computed by `1/java.lang.Math.sin`. |
degrees(expr) | Converts radians to degrees. |
expr1 div expr2 | Divides `expr1` by `expr2`. It returns NULL if an operand is NULL or `expr2` is 0. The result is cast to long. |
e() | Returns Euler's number, e. |
exp(expr) | Returns e to the power of `expr`. |
expm1(expr) | Returns exp(`expr`) - 1. |
factorial(expr) | Returns the factorial of `expr`. `expr` must be in the range [0..20]; otherwise, the result is null. |
floor(expr[, scale]) | Returns the largest number after rounding down that is not greater than `expr`. An optional `scale` parameter can be specified to control the rounding behavior. |
greatest(expr, ...) | Returns the greatest value of all parameters, skipping null values. |
hex(expr) | Converts `expr` to hexadecimal. |
hypot(expr1, expr2) | Returns sqrt(`expr1`**2 + `expr2`**2). |
least(expr, ...) | Returns the least value of all parameters, skipping null values. |
ln(expr) | Returns the natural logarithm (base e) of `expr`. |
log(base, expr) | Returns the logarithm of `expr` with `base`. |
log10(expr) | Returns the logarithm of `expr` with base 10. |
log1p(expr) | Returns log(1 + `expr`). |
log2(expr) | Returns the logarithm of `expr` with base 2. |
expr1 % expr2, or mod(expr1, expr2) | Returns the remainder after `expr1`/`expr2`. |
negative(expr) | Returns the negated value of `expr`. |
pi() | Returns pi. |
pmod(expr1, expr2) | Returns the positive value of `expr1` mod `expr2`. |
positive(expr) | Returns the value of `expr`. |
pow(expr1, expr2) | Raises `expr1` to the power of `expr2`. |
power(expr1, expr2) | Raises `expr1` to the power of `expr2`. |
radians(expr) | Converts degrees to radians. |
rand([seed]) | Returns a random value drawn independently and identically (i.i.d.) from the uniform distribution on [0, 1). |
randn([seed]) | Returns a random value drawn independently and identically (i.i.d.) from the standard normal distribution. |
random([seed]) | Returns a random value drawn independently and identically (i.i.d.) from the uniform distribution on [0, 1). |
rint(expr) | Returns the double value that is closest in value to the argument and is equal to a mathematical integer. |
round(expr, d) | Returns `expr` rounded to `d` decimal places using HALF_UP rounding mode. |
sec(expr) | Returns the secant of `expr`, as if computed by `1/java.lang.Math.cos`. |
sign(expr) | Returns -1.0, 0.0 or 1.0 as `expr` is negative, 0 or positive. |
signum(expr) | Returns -1.0, 0.0 or 1.0 as `expr` is negative, 0 or positive. |
sin(expr) | Returns the sine of `expr`, as if computed by `java.lang.Math.sin`. |
sinh(expr) | Returns the hyperbolic sine of `expr`, as if computed by `java.lang.Math.sinh`. |
sqrt(expr) | Returns the square root of `expr`. |
tan(expr) | Returns the tangent of `expr`, as if computed by `java.lang.Math.tan`. |
tanh(expr) | Returns the hyperbolic tangent of `expr`, as if computed by `java.lang.Math.tanh`. |
try_add(expr1, expr2) | Returns the sum of `expr1` and `expr2`; the result is null on overflow. The acceptable input types are the same as for the `+` operator. |
try_divide(dividend, divisor) | Returns `dividend`/`divisor`. It always performs floating point division. Its result is always null if `divisor` is 0. `dividend` must be a numeric or an interval. `divisor` must be a numeric. |
try_mod(dividend, divisor) | Returns the remainder after `dividend`/`divisor`. `dividend` must be a numeric. `divisor` must be a numeric. |
try_multiply(expr1, expr2) | Returns `expr1`*`expr2`; the result is null on overflow. The acceptable input types are the same as for the `*` operator. |
try_subtract(expr1, expr2) | Returns `expr1`-`expr2`; the result is null on overflow. The acceptable input types are the same as for the `-` operator. |
unhex(expr) | Converts hexadecimal `expr` to binary. |
uniform(min, max[, seed]) | Returns a random value with independent and identically distributed (i.i.d.) values with the specified range of numbers. The random seed is optional. The provided numbers specifying the minimum and maximum values of the range must be constant. If both of these numbers are integers, then the result will also be an integer. Otherwise if one or both of these are floating-point numbers, then the result will also be a floating-point number. |
width_bucket(value, min_value, max_value, num_bucket) | Returns the bucket number to which `value` would be assigned in an equiwidth histogram with `num_bucket` buckets, in the range `min_value` to `max_value`. |
Examples
-- %
SELECT 2 % 1.8;
+---------+
|(2 % 1.8)|
+---------+
| 0.2|
+---------+
SELECT MOD(2, 1.8);
+-----------+
|mod(2, 1.8)|
+-----------+
| 0.2|
+-----------+
-- *
SELECT 2 * 3;
+-------+
|(2 * 3)|
+-------+
| 6|
+-------+
-- +
SELECT 1 + 2;
+-------+
|(1 + 2)|
+-------+
| 3|
+-------+
-- -
SELECT 2 - 1;
+-------+
|(2 - 1)|
+-------+
| 1|
+-------+
-- /
SELECT 3 / 2;
+-------+
|(3 / 2)|
+-------+
| 1.5|
+-------+
SELECT 2L / 2L;
+-------+
|(2 / 2)|
+-------+
| 1.0|
+-------+
-- abs
SELECT abs(-1);
+-------+
|abs(-1)|
+-------+
| 1|
+-------+
SELECT abs(INTERVAL -'1-1' YEAR TO MONTH);
+----------------------------------+
|abs(INTERVAL '-1-1' YEAR TO MONTH)|
+----------------------------------+
| INTERVAL '1-1' YE...|
+----------------------------------+
-- acos
SELECT acos(1);
+-------+
|ACOS(1)|
+-------+
| 0.0|
+-------+
SELECT acos(2);
+-------+
|ACOS(2)|
+-------+
| NaN|
+-------+
-- acosh
SELECT acosh(1);
+--------+
|ACOSH(1)|
+--------+
| 0.0|
+--------+
SELECT acosh(0);
+--------+
|ACOSH(0)|
+--------+
| NaN|
+--------+
-- asin
SELECT asin(0);
+-------+
|ASIN(0)|
+-------+
| 0.0|
+-------+
SELECT asin(2);
+-------+
|ASIN(2)|
+-------+
| NaN|
+-------+
-- asinh
SELECT asinh(0);
+--------+
|ASINH(0)|
+--------+
| 0.0|
+--------+
-- atan
SELECT atan(0);
+-------+
|ATAN(0)|
+-------+
| 0.0|
+-------+
-- atan2
SELECT atan2(0, 0);
+-----------+
|ATAN2(0, 0)|
+-----------+
| 0.0|
+-----------+
-- atanh
SELECT atanh(0);
+--------+
|ATANH(0)|
+--------+
| 0.0|
+--------+
SELECT atanh(2);
+--------+
|ATANH(2)|
+--------+
| NaN|
+--------+
-- bin
SELECT bin(13);
+-------+
|bin(13)|
+-------+
| 1101|
+-------+
SELECT bin(-13);
+--------------------+
| bin(-13)|
+--------------------+
|11111111111111111...|
+--------------------+
SELECT bin(13.3);
+---------+
|bin(13.3)|
+---------+
| 1101|
+---------+
-- bround
SELECT bround(2.5, 0);
+--------------+
|bround(2.5, 0)|
+--------------+
| 2|
+--------------+
SELECT bround(25, -1);
+--------------+
|bround(25, -1)|
+--------------+
| 20|
+--------------+
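The bround results above differ from round only on ties: HALF_EVEN sends a tie to the nearest even digit, while HALF_UP sends it away from zero. Python's `decimal` module offers both modes, so the difference can be sketched as follows; the function names are chosen for illustration and this is not how Spark computes the result.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def _round(value, d, mode):
    # Decimal(1).scaleb(-d) is the quantum 10**(-d):
    # 0.01 for d=2, 1 for d=0, 1E+1 for d=-1.
    quantum = Decimal(1).scaleb(-d)
    return Decimal(str(value)).quantize(quantum, rounding=mode)

def bround(value, d=0):
    return _round(value, d, ROUND_HALF_EVEN)

def round_half_up(value, d=0):
    return _round(value, d, ROUND_HALF_UP)

print(int(bround(2.5, 0)), int(round_half_up(2.5, 0)))    # 2 3
print(int(bround(25, -1)), int(round_half_up(25, -1)))    # 20 30
```

A negative `d` rounds to the left of the decimal point, which is why 25 at `d = -1` becomes 20 under HALF_EVEN.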
-- cbrt
SELECT cbrt(27.0);
+----------+
|CBRT(27.0)|
+----------+
| 3.0|
+----------+
-- ceil
SELECT ceil(-0.1);
+----------+
|CEIL(-0.1)|
+----------+
| 0|
+----------+
SELECT ceil(5);
+-------+
|CEIL(5)|
+-------+
| 5|
+-------+
SELECT ceil(3.1411, 3);
+---------------+
|ceil(3.1411, 3)|
+---------------+
| 3.142|
+---------------+
SELECT ceil(3.1411, -3);
+----------------+
|ceil(3.1411, -3)|
+----------------+
| 1000|
+----------------+
-- ceiling
SELECT ceiling(-0.1);
+-------------+
|ceiling(-0.1)|
+-------------+
| 0|
+-------------+
SELECT ceiling(5);
+----------+
|ceiling(5)|
+----------+
| 5|
+----------+
SELECT ceiling(3.1411, 3);
+------------------+
|ceiling(3.1411, 3)|
+------------------+
| 3.142|
+------------------+
SELECT ceiling(3.1411, -3);
+-------------------+
|ceiling(3.1411, -3)|
+-------------------+
| 1000|
+-------------------+
-- conv
SELECT conv('100', 2, 10);
+----------------+
|conv(100, 2, 10)|
+----------------+
| 4|
+----------------+
SELECT conv(-10, 16, -10);
+------------------+
|conv(-10, 16, -10)|
+------------------+
| -16|
+------------------+
-- cos
SELECT cos(0);
+------+
|COS(0)|
+------+
| 1.0|
+------+
-- cosh
SELECT cosh(0);
+-------+
|COSH(0)|
+-------+
| 1.0|
+-------+
-- cot
SELECT cot(1);
+------------------+
| COT(1)|
+------------------+
|0.6420926159343306|
+------------------+
-- csc
SELECT csc(1);
+------------------+
| CSC(1)|
+------------------+
|1.1883951057781212|
+------------------+
-- degrees
SELECT degrees(3.141592653589793);
+--------------------------+
|DEGREES(3.141592653589793)|
+--------------------------+
| 180.0|
+--------------------------+
-- div
SELECT 3 div 2;
+---------+
|(3 div 2)|
+---------+
| 1|
+---------+
SELECT INTERVAL '1-1' YEAR TO MONTH div INTERVAL '-1' MONTH;
+------------------------------------------------------+
|(INTERVAL '1-1' YEAR TO MONTH div INTERVAL '-1' MONTH)|
+------------------------------------------------------+
| -13|
+------------------------------------------------------+
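The div examples above return a whole number, in contrast to `/`, which always performs floating point division. The Python sketch below illustrates integral division that truncates toward zero and returns None for a zero divisor, following the description of `div` on this page; truncation toward zero for negative operands is an assumption of the sketch, and `sql_div` is a hypothetical name.

```python
def sql_div(a, b):
    """Integral division truncated toward zero; None for NULL or zero divisor."""
    if a is None or b is None or b == 0:
        return None
    quotient = abs(a) // abs(b)
    # The sign is negative only when the operands have different signs.
    return quotient if (a < 0) == (b < 0) else -quotient

print(sql_div(3, 2))    # 1
print(sql_div(13, -1))  # -13, i.e. 13 months div -1 month
print(sql_div(-7, 2))   # -3, whereas Python's -7 // 2 is -4
```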
-- e
SELECT e();
+-----------------+
| E()|
+-----------------+
|2.718281828459045|
+-----------------+
-- exp
SELECT exp(0);
+------+
|EXP(0)|
+------+
| 1.0|
+------+
-- expm1
SELECT expm1(0);
+--------+
|EXPM1(0)|
+--------+
| 0.0|
+--------+
-- factorial
SELECT factorial(5);
+------------+
|factorial(5)|
+------------+
| 120|
+------------+
-- floor
SELECT floor(-0.1);
+-----------+
|FLOOR(-0.1)|
+-----------+
| -1|
+-----------+
SELECT floor(5);
+--------+
|FLOOR(5)|
+--------+
| 5|
+--------+
SELECT floor(3.1411, 3);
+----------------+
|floor(3.1411, 3)|
+----------------+
| 3.141|
+----------------+
SELECT floor(3.1411, -3);
+-----------------+
|floor(3.1411, -3)|
+-----------------+
| 0|
+-----------------+
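The `scale` argument of ceil, ceiling and floor selects the decimal position to round at: positive values keep digits after the decimal point, negative values round to tens, hundreds and so on. A Python sketch using the `decimal` module reproduces the results above; the helper names are hypothetical.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR

def _quantum(scale):
    # 10**(-scale): 0.001 for scale=3, 1E+3 for scale=-3.
    return Decimal(1).scaleb(-scale)

def ceil_scale(value, scale=0):
    return Decimal(str(value)).quantize(_quantum(scale), rounding=ROUND_CEILING)

def floor_scale(value, scale=0):
    return Decimal(str(value)).quantize(_quantum(scale), rounding=ROUND_FLOOR)

print(ceil_scale(3.1411, 3), floor_scale(3.1411, 3))              # 3.142 3.141
print(int(ceil_scale(3.1411, -3)), int(floor_scale(3.1411, -3)))  # 1000 0
```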
-- greatest
SELECT greatest(10, 9, 2, 4, 3);
+------------------------+
|greatest(10, 9, 2, 4, 3)|
+------------------------+
| 10|
+------------------------+
-- hex
SELECT hex(17);
+-------+
|hex(17)|
+-------+
| 11|
+-------+
SELECT hex('Spark SQL');
+------------------+
| hex(Spark SQL)|
+------------------+
|537061726B2053514C|
+------------------+
-- hypot
SELECT hypot(3, 4);
+-----------+
|HYPOT(3, 4)|
+-----------+
| 5.0|
+-----------+
-- least
SELECT least(10, 9, 2, 4, 3);
+---------------------+
|least(10, 9, 2, 4, 3)|
+---------------------+
| 2|
+---------------------+
-- ln
SELECT ln(1);
+-----+
|ln(1)|
+-----+
| 0.0|
+-----+
-- log
SELECT log(10, 100);
+------------+
|LOG(10, 100)|
+------------+
| 2.0|
+------------+
-- log10
SELECT log10(10);
+---------+
|LOG10(10)|
+---------+
| 1.0|
+---------+
-- log1p
SELECT log1p(0);
+--------+
|LOG1P(0)|
+--------+
| 0.0|
+--------+
-- log2
SELECT log2(2);
+-------+
|LOG2(2)|
+-------+
| 1.0|
+-------+
-- mod
SELECT 2 % 1.8;
+---------+
|(2 % 1.8)|
+---------+
| 0.2|
+---------+
SELECT MOD(2, 1.8);
+-----------+
|mod(2, 1.8)|
+-----------+
| 0.2|
+-----------+
-- negative
SELECT negative(1);
+-----------+
|negative(1)|
+-----------+
| -1|
+-----------+
-- pi
SELECT pi();
+-----------------+
| PI()|
+-----------------+
|3.141592653589793|
+-----------------+
-- pmod
SELECT pmod(10, 3);
+-----------+
|pmod(10, 3)|
+-----------+
| 1|
+-----------+
SELECT pmod(-10, 3);
+------------+
|pmod(-10, 3)|
+------------+
| 2|
+------------+
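The second pmod example above shows the difference from `%`: a remainder that would be negative is shifted into the non-negative range. The Python sketch below models this with a JVM-style remainder that takes the sign of the dividend; both helper names are assumptions of the sketch.

```python
import math

def java_rem(a, n):
    """Remainder that takes the sign of the dividend, like % on the JVM."""
    return int(math.fmod(a, n))

def pmod(a, n):
    r = java_rem(a, n)
    # Shift a negative remainder by one divisor to make it non-negative.
    return java_rem(r + n, n) if r < 0 else r

print(java_rem(-10, 3))  # -1
print(pmod(-10, 3))      # 2
print(pmod(10, 3))       # 1
```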
-- positive
SELECT positive(1);
+-----+
|(+ 1)|
+-----+
| 1|
+-----+
-- pow
SELECT pow(2, 3);
+---------+
|pow(2, 3)|
+---------+
| 8.0|
+---------+
-- power
SELECT power(2, 3);
+-----------+
|POWER(2, 3)|
+-----------+
| 8.0|
+-----------+
-- radians
SELECT radians(180);
+-----------------+
| RADIANS(180)|
+-----------------+
|3.141592653589793|
+-----------------+
-- rand
SELECT rand();
+------------------+
| rand()|
+------------------+
|0.8482748955252059|
+------------------+
SELECT rand(0);
+------------------+
| rand(0)|
+------------------+
|0.7604953758285915|
+------------------+
SELECT rand(null);
+------------------+
| rand(NULL)|
+------------------+
|0.7604953758285915|
+------------------+
-- randn
SELECT randn();
+------------------+
| randn()|
+------------------+
|-2.159977130020423|
+------------------+
SELECT randn(0);
+------------------+
| randn(0)|
+------------------+
|1.6034991609278433|
+------------------+
SELECT randn(null);
+------------------+
| randn(NULL)|
+------------------+
|1.6034991609278433|
+------------------+
-- random
SELECT random();
+-------------------+
| rand()|
+-------------------+
|0.12195248351371113|
+-------------------+
SELECT random(0);
+------------------+
| rand(0)|
+------------------+
|0.7604953758285915|
+------------------+
SELECT random(null);
+------------------+
| rand(NULL)|
+------------------+
|0.7604953758285915|
+------------------+
-- rint
SELECT rint(12.3456);
+-------------+
|rint(12.3456)|
+-------------+
| 12.0|
+-------------+
-- round
SELECT round(2.5, 0);
+-------------+
|round(2.5, 0)|
+-------------+
| 3|
+-------------+
-- sec
SELECT sec(0);
+------+
|SEC(0)|
+------+
| 1.0|
+------+
-- sign
SELECT sign(40);
+--------+
|sign(40)|
+--------+
| 1.0|
+--------+
SELECT sign(INTERVAL -'100' YEAR);
+--------------------------+
|sign(INTERVAL '-100' YEAR)|
+--------------------------+
| -1.0|
+--------------------------+
-- signum
SELECT signum(40);
+----------+
|SIGNUM(40)|
+----------+
| 1.0|
+----------+
SELECT signum(INTERVAL -'100' YEAR);
+----------------------------+
|SIGNUM(INTERVAL '-100' YEAR)|
+----------------------------+
| -1.0|
+----------------------------+
-- sin
SELECT sin(0);
+------+
|SIN(0)|
+------+
| 0.0|
+------+
-- sinh
SELECT sinh(0);
+-------+
|SINH(0)|
+-------+
| 0.0|
+-------+
-- sqrt
SELECT sqrt(4);
+-------+
|SQRT(4)|
+-------+
| 2.0|
+-------+
-- tan
SELECT tan(0);
+------+
|TAN(0)|
+------+
| 0.0|
+------+
-- tanh
SELECT tanh(0);
+-------+
|TANH(0)|
+-------+
| 0.0|
+-------+
-- try_add
SELECT try_add(1, 2);
+-------------+
|try_add(1, 2)|
+-------------+
| 3|
+-------------+
SELECT try_add(2147483647, 1);
+----------------------+
|try_add(2147483647, 1)|
+----------------------+
| NULL|
+----------------------+
SELECT try_add(date'2021-01-01', 1);
+-----------------------------+
|try_add(DATE '2021-01-01', 1)|
+-----------------------------+
| 2021-01-02|
+-----------------------------+
SELECT try_add(date'2021-01-01', interval 1 year);
+---------------------------------------------+
|try_add(DATE '2021-01-01', INTERVAL '1' YEAR)|
+---------------------------------------------+
| 2022-01-01|
+---------------------------------------------+
SELECT try_add(timestamp'2021-01-01 00:00:00', interval 1 day);
+----------------------------------------------------------+
|try_add(TIMESTAMP '2021-01-01 00:00:00', INTERVAL '1' DAY)|
+----------------------------------------------------------+
| 2021-01-02 00:00:00|
+----------------------------------------------------------+
SELECT try_add(interval 1 year, interval 2 year);
+---------------------------------------------+
|try_add(INTERVAL '1' YEAR, INTERVAL '2' YEAR)|
+---------------------------------------------+
| INTERVAL '3' YEAR|
+---------------------------------------------+
-- try_divide
SELECT try_divide(3, 2);
+----------------+
|try_divide(3, 2)|
+----------------+
| 1.5|
+----------------+
SELECT try_divide(2L, 2L);
+----------------+
|try_divide(2, 2)|
+----------------+
| 1.0|
+----------------+
SELECT try_divide(1, 0);
+----------------+
|try_divide(1, 0)|
+----------------+
| NULL|
+----------------+
SELECT try_divide(interval 2 month, 2);
+---------------------------------+
|try_divide(INTERVAL '2' MONTH, 2)|
+---------------------------------+
| INTERVAL '0-1' YE...|
+---------------------------------+
SELECT try_divide(interval 2 month, 0);
+---------------------------------+
|try_divide(INTERVAL '2' MONTH, 0)|
+---------------------------------+
| NULL|
+---------------------------------+
-- try_mod
SELECT try_mod(3, 2);
+-------------+
|try_mod(3, 2)|
+-------------+
| 1|
+-------------+
SELECT try_mod(2L, 2L);
+-------------+
|try_mod(2, 2)|
+-------------+
| 0|
+-------------+
SELECT try_mod(3.0, 2.0);
+-----------------+
|try_mod(3.0, 2.0)|
+-----------------+
| 1.0|
+-----------------+
SELECT try_mod(1, 0);
+-------------+
|try_mod(1, 0)|
+-------------+
| NULL|
+-------------+
-- try_multiply
SELECT try_multiply(2, 3);
+------------------+
|try_multiply(2, 3)|
+------------------+
| 6|
+------------------+
SELECT try_multiply(-2147483648, 10);
+-----------------------------+
|try_multiply(-2147483648, 10)|
+-----------------------------+
| NULL|
+-----------------------------+
SELECT try_multiply(interval 2 year, 3);
+----------------------------------+
|try_multiply(INTERVAL '2' YEAR, 3)|
+----------------------------------+
| INTERVAL '6-0' YE...|
+----------------------------------+
-- try_subtract
SELECT try_subtract(2, 1);
+------------------+
|try_subtract(2, 1)|
+------------------+
| 1|
+------------------+
SELECT try_subtract(-2147483648, 1);
+----------------------------+
|try_subtract(-2147483648, 1)|
+----------------------------+
| NULL|
+----------------------------+
SELECT try_subtract(date'2021-01-02', 1);
+----------------------------------+
|try_subtract(DATE '2021-01-02', 1)|
+----------------------------------+
| 2021-01-01|
+----------------------------------+
SELECT try_subtract(date'2021-01-01', interval 1 year);
+--------------------------------------------------+
|try_subtract(DATE '2021-01-01', INTERVAL '1' YEAR)|
+--------------------------------------------------+
| 2020-01-01|
+--------------------------------------------------+
SELECT try_subtract(timestamp'2021-01-02 00:00:00', interval 1 day);
+---------------------------------------------------------------+
|try_subtract(TIMESTAMP '2021-01-02 00:00:00', INTERVAL '1' DAY)|
+---------------------------------------------------------------+
| 2021-01-01 00:00:00|
+---------------------------------------------------------------+
SELECT try_subtract(interval 2 year, interval 1 year);
+--------------------------------------------------+
|try_subtract(INTERVAL '2' YEAR, INTERVAL '1' YEAR)|
+--------------------------------------------------+
| INTERVAL '1' YEAR|
+--------------------------------------------------+
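The NULL results in the try_add, try_multiply and try_subtract examples above all come from integer overflow. A minimal Python sketch of that check follows, assuming both operands are 32-bit INT values as in those examples; `try_int_op` is a hypothetical helper and does not cover dates, timestamps or intervals.

```python
import operator

# Range of a 32-bit signed integer (assumption: both operands are INT).
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def try_int_op(op, a, b):
    """Apply `op` and return None when the result leaves the INT range."""
    result = op(a, b)
    return result if INT_MIN <= result <= INT_MAX else None

print(try_int_op(operator.add, 2147483647, 1))    # None
print(try_int_op(operator.mul, -2147483648, 10))  # None
print(try_int_op(operator.sub, -2147483648, 1))   # None
print(try_int_op(operator.add, 1, 2))             # 3
```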
-- unhex
SELECT decode(unhex('537061726B2053514C'), 'UTF-8');
+----------------------------------------+
|decode(unhex(537061726B2053514C), UTF-8)|
+----------------------------------------+
| Spark SQL|
+----------------------------------------+
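The hex and unhex examples above are inverses of each other for string data: hex writes each UTF-8 byte as two uppercase hexadecimal digits, and unhex turns the digits back into bytes. A small Python round trip shows the same values; the wrapper names are chosen for illustration.

```python
def hex_str(text):
    """Hexadecimal digits of the UTF-8 bytes of `text`, in uppercase."""
    return text.encode("utf-8").hex().upper()

def unhex(digits):
    """Bytes represented by a string of hexadecimal digits."""
    return bytes.fromhex(digits)

print(hex_str("Spark SQL"))                         # 537061726B2053514C
print(unhex("537061726B2053514C").decode("utf-8"))  # Spark SQL
print(format(17, "X"))                              # 11
```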
-- uniform
SELECT uniform(10, 20, 0) > 0 AS result;
+------+
|result|
+------+
| true|
+------+
-- width_bucket
SELECT width_bucket(5.3, 0.2, 10.6, 5);
+-------------------------------+
|width_bucket(5.3, 0.2, 10.6, 5)|
+-------------------------------+
| 3|
+-------------------------------+
SELECT width_bucket(-2.1, 1.3, 3.4, 3);
+-------------------------------+
|width_bucket(-2.1, 1.3, 3.4, 3)|
+-------------------------------+
| 0|
+-------------------------------+
SELECT width_bucket(8.1, 0.0, 5.7, 4);
+------------------------------+
|width_bucket(8.1, 0.0, 5.7, 4)|
+------------------------------+
| 5|
+------------------------------+
SELECT width_bucket(-0.9, 5.2, 0.5, 2);
+-------------------------------+
|width_bucket(-0.9, 5.2, 0.5, 2)|
+-------------------------------+
| 3|
+-------------------------------+
SELECT width_bucket(INTERVAL '0' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10);
+--------------------------------------------------------------------------+
|width_bucket(INTERVAL '0' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10)|
+--------------------------------------------------------------------------+
| 1|
+--------------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '1' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10);
+--------------------------------------------------------------------------+
|width_bucket(INTERVAL '1' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10)|
+--------------------------------------------------------------------------+
| 2|
+--------------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '0' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10);
+-----------------------------------------------------------------------+
|width_bucket(INTERVAL '0' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10)|
+-----------------------------------------------------------------------+
| 1|
+-----------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '1' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10);
+-----------------------------------------------------------------------+
|width_bucket(INTERVAL '1' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10)|
+-----------------------------------------------------------------------+
| 2|
+-----------------------------------------------------------------------+
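The numeric width_bucket results above follow from the usual equiwidth histogram rule: bucket 0 collects values before the range, bucket `num_bucket + 1` collects values at or beyond its end, and a descending range (`min_value` > `max_value`) mirrors the comparison. The Python sketch below reproduces those results; returning None for a non-positive bucket count or an empty range is an assumption of the sketch, not documented behavior.

```python
def width_bucket(value, min_value, max_value, num_bucket):
    """Bucket number of `value` in an equiwidth histogram."""
    if num_bucket <= 0 or min_value == max_value:
        return None  # assumption: invalid input gives no bucket
    if min_value < max_value:
        if value < min_value:
            return 0
        if value >= max_value:
            return num_bucket + 1
        width = max_value - min_value
        return int(num_bucket * (value - min_value) / width) + 1
    # Descending range: buckets are counted from min_value down to max_value.
    if value > min_value:
        return 0
    if value <= max_value:
        return num_bucket + 1
    width = min_value - max_value
    return int(num_bucket * (min_value - value) / width) + 1

print(width_bucket(5.3, 0.2, 10.6, 5))   # 3
print(width_bucket(-2.1, 1.3, 3.4, 3))   # 0
print(width_bucket(8.1, 0.0, 5.7, 4))    # 5
print(width_bucket(-0.9, 5.2, 0.5, 2))   # 3
```

The interval examples behave the same way once the intervals are expressed as plain counts of years or days: `width_bucket(1, 0, 10, 10)` gives 2.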
String Functions
Function | Description |
---|---|
ascii(str) | Returns the numeric value of the first character of `str`. |
base64(bin) | Converts the argument from a binary `bin` to a base 64 string. |
bit_length(expr) | Returns the bit length of string data or number of bits of binary data. |
btrim(str) | Removes the leading and trailing space characters from `str`. |
btrim(str, trimStr) | Removes the leading and trailing `trimStr` characters from `str`. |
char(expr) | Returns the ASCII character having the binary equivalent to `expr`. If `expr` is larger than 256, the result is equivalent to chr(`expr` % 256). |
char_length(expr) | Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. |
character_length(expr) | Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. |
chr(expr) | Returns the ASCII character having the binary equivalent to `expr`. If `expr` is larger than 256, the result is equivalent to chr(`expr` % 256). |
collate(expr, collationName) | Marks a given expression with the specified collation. |
collation(expr) | Returns the collation name of a given expression. |
concat_ws(sep[, str \| array(str)]+) | Returns the concatenation of the strings separated by `sep`, skipping null values. |
contains(left, right) | Returns a boolean. The value is True if right is found inside left. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left and right must be of STRING or BINARY type. |
decode(bin, charset) | Decodes the first argument using the second argument character set. If either argument is null, the result will also be null. |
decode(expr, search, result [, search, result ] ... [, default]) | Compares expr to each search value in order. If expr is equal to a search value, decode returns the corresponding result. If no match is found, then it returns default. If default is omitted, it returns null. |
elt(n, input1, input2, ...) | Returns the `n`-th input, e.g., returns `input2` when `n` is 2. The function returns NULL if the index exceeds the length of the array and `spark.sql.ansi.enabled` is set to false. If `spark.sql.ansi.enabled` is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices. |
encode(str, charset) | Encodes the first argument using the second argument character set. If either argument is null, the result will also be null. |
endswith(left, right) | Returns a boolean. The value is True if `left` ends with `right`. Returns NULL if either input expression is NULL. Otherwise, returns False. Both `left` and `right` must be of STRING or BINARY type. |
find_in_set(str, str_array) | Returns the index (1-based) of the given string (`str`) in the comma-delimited list (`str_array`). Returns 0, if the string was not found or if the given string (`str`) contains a comma. |
format_number(expr1, expr2) | Formats the number `expr1` like '#,###,###.##', rounded to `expr2` decimal places. If `expr2` is 0, the result has no decimal point or fractional part. `expr2` also accepts a user-specified format. This is intended to behave like MySQL's FORMAT. |
format_string(strfmt, obj, ...) | Returns a formatted string from printf-style format strings. |
initcap(str) | Returns `str` with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space. |
instr(str, substr) | Returns the (1-based) index of the first occurrence of `substr` in `str`. |
is_valid_utf8(str) | Returns true if `str` is a valid UTF-8 string, otherwise returns false. |
lcase(str) | Returns `str` with all characters changed to lowercase. |
left(str, len) | Returns the leftmost `len` characters from the string `str` (`len` can be of string type). If `len` is less than or equal to 0, the result is an empty string. |
len(expr) | Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. |
length(expr) | Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. |
levenshtein(str1, str2[, threshold]) | Returns the Levenshtein distance between the two given strings. If `threshold` is set and the distance is greater than it, returns -1. |
locate(substr, str[, pos]) | Returns the position of the first occurrence of `substr` in `str` after position `pos`. The given `pos` and return value are 1-based. |
lower(str) | Returns `str` with all characters changed to lowercase. |
lpad(str, len[, pad]) | Returns `str`, left-padded with `pad` to a length of `len`. If `str` is longer than `len`, the return value is shortened to `len` characters or bytes. If `pad` is not specified, `str` will be padded to the left with space characters if it is a character string, and with zeros if it is a byte sequence. |
ltrim(str) | Removes the leading space characters from `str`. |
luhn_check(str) | Checks that a string of digits is valid according to the Luhn algorithm. This checksum function is widely applied on credit card numbers and government identification numbers to distinguish valid numbers from mistyped, incorrect numbers. |
make_valid_utf8(str) | Returns the original string if `str` is a valid UTF-8 string, otherwise returns a new string whose invalid UTF8 byte sequences are replaced using the UNICODE replacement character U+FFFD. |
mask(input[, upperChar, lowerChar, digitChar, otherChar]) | Masks the given string value. By default, the function replaces uppercase letters with 'X', lowercase letters with 'x', and digits with 'n', and leaves other characters unchanged. This can be useful for creating copies of tables with sensitive information removed. |
octet_length(expr) | Returns the byte length of string data or number of bytes of binary data. |
overlay(input, replace, pos[, len]) | Replaces the portion of `input` that starts at `pos` and is of length `len` with `replace`. |
position(substr, str[, pos]) | Returns the position of the first occurrence of `substr` in `str` after position `pos`. The given `pos` and return value are 1-based. |
printf(strfmt, obj, ...) | Returns a formatted string from printf-style format strings. |
randstr(length[, seed]) | Returns a string of the specified length whose characters are chosen uniformly at random from the following pool of characters: 0-9, a-z, A-Z. The random seed is optional. The string length must be a constant two-byte or four-byte integer (SMALLINT or INT, respectively). |
regexp_count(str, regexp) | Returns a count of the number of times that the regular expression pattern `regexp` is matched in the string `str`. |
regexp_extract(str, regexp[, idx]) | Extracts the first string in `str` that matches the `regexp` expression and corresponds to the regex group index `idx`. |
regexp_extract_all(str, regexp[, idx]) | Extracts all strings in `str` that match the `regexp` expression and correspond to the regex group index `idx`. |
regexp_instr(str, regexp) | Searches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring. Positions are 1-based, not 0-based. If no match is found, returns 0. |
regexp_replace(str, regexp, rep[, position]) | Replaces all substrings of `str` that match `regexp` with `rep`. |
regexp_substr(str, regexp) | Returns the substring that matches the regular expression `regexp` within the string `str`. If the regular expression is not found, the result is null. |
repeat(str, n) | Returns the string which repeats the given string value n times. |
replace(str, search[, replace]) | Replaces all occurrences of `search` with `replace`. |
right(str, len) | Returns the rightmost `len` characters from the string `str` (`len` can be of string type). If `len` is less than or equal to 0, the result is an empty string. |
rpad(str, len[, pad]) | Returns `str`, right-padded with `pad` to a length of `len`. If `str` is longer than `len`, the return value is shortened to `len` characters. If `pad` is not specified, `str` will be padded to the right with space characters if it is a character string, and with zeros if it is a binary string. |
rtrim(str) | Removes the trailing space characters from `str`. |
sentences(str[, lang[, country]]) | Splits `str` into an array of arrays of words. |
soundex(str) | Returns Soundex code of the string. |
space(n) | Returns a string consisting of `n` spaces. |
split(str, regex, limit) | Splits `str` around occurrences that match `regex` and returns an array with a length of at most `limit`. |
split_part(str, delimiter, partNum) | Splits `str` by `delimiter` and returns the requested part of the split (1-based). If any input is null, returns null. If `partNum` is out of range of the split parts, returns an empty string. If `partNum` is 0, throws an error. If `partNum` is negative, the parts are counted backward from the end of the string. If the `delimiter` is an empty string, `str` is not split. |
startswith(left, right) | Returns a boolean. The value is True if `left` starts with `right`. Returns NULL if either input expression is NULL. Otherwise, returns False. Both `left` and `right` must be of STRING or BINARY type. |
substr(str, pos[, len]) | Returns the substring of `str` that starts at `pos` and is of length `len`, or the slice of byte array that starts at `pos` and is of length `len`. |
substr(str FROM pos[ FOR len]) | Returns the substring of `str` that starts at `pos` and is of length `len`, or the slice of byte array that starts at `pos` and is of length `len`. |
substring(str, pos[, len]) | Returns the substring of `str` that starts at `pos` and is of length `len`, or the slice of byte array that starts at `pos` and is of length `len`. |
substring(str FROM pos[ FOR len]) | Returns the substring of `str` that starts at `pos` and is of length `len`, or the slice of byte array that starts at `pos` and is of length `len`. |
substring_index(str, delim, count) | Returns the substring from `str` before `count` occurrences of the delimiter `delim`. If `count` is positive, everything to the left of the final delimiter (counting from the left) is returned. If `count` is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when searching for `delim`. |
to_binary(str[, fmt]) | Converts the input `str` to a binary value based on the supplied `fmt`. `fmt` can be a case-insensitive string literal of "hex", "utf-8", "utf8", or "base64". By default, the binary format for conversion is "hex" if `fmt` is omitted. The function returns NULL if at least one of the input parameters is NULL. |
to_char(expr, format) | Convert `expr` to a string based on the `format`. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive: '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input value, generating a result string of the same length as the corresponding sequence in the format string. The result string is left-padded with zeros if the 0/9 sequence comprises more digits than the matching part of the decimal value, starts with 0, and is before the decimal point. Otherwise, it is padded with spaces. '.' or 'D': Specifies the position of the decimal point (optional, only allowed once). ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator. '$': Specifies the location of the $ currency sign. This character may only be specified once. 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' prints '+' for positive values but 'MI' prints a space. 'PR': Only allowed at the end of the format string; specifies that the result string will be wrapped by angle brackets if the input value is negative. ('<1>'). If `expr` is a datetime, `format` shall be a valid datetime pattern, see Datetime Patterns. If `expr` is a binary, it is converted to a string in one of the formats: 'base64': a base 64 string. 'hex': a string in the hexadecimal format. 'utf-8': the input binary is decoded to UTF-8 string. |
to_number(expr, fmt) | Convert string 'expr' to a number based on the string format 'fmt'. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive: '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input string. If the 0/9 sequence starts with 0 and is before the decimal point, it can only match a digit sequence of the same size. Otherwise, if the sequence starts with 9 or is after the decimal point, it can match a digit sequence that has the same or smaller size. '.' or 'D': Specifies the position of the decimal point (optional, only allowed once). ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator. 'expr' must match the grouping separator relevant for the size of the number. '$': Specifies the location of the $ currency sign. This character may only be specified once. 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' allows '-' but 'MI' does not. 'PR': Only allowed at the end of the format string; specifies that 'expr' indicates a negative number with wrapping angled brackets. ('<1>'). |
to_varchar(expr, format) | Convert `expr` to a string based on the `format`. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive: '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input value, generating a result string of the same length as the corresponding sequence in the format string. The result string is left-padded with zeros if the 0/9 sequence comprises more digits than the matching part of the decimal value, starts with 0, and is before the decimal point. Otherwise, it is padded with spaces. '.' or 'D': Specifies the position of the decimal point (optional, only allowed once). ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator. '$': Specifies the location of the $ currency sign. This character may only be specified once. 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' prints '+' for positive values but 'MI' prints a space. 'PR': Only allowed at the end of the format string; specifies that the result string will be wrapped by angle brackets if the input value is negative. ('<1>'). If `expr` is a datetime, `format` shall be a valid datetime pattern, see Datetime Patterns. If `expr` is a binary, it is converted to a string in one of the formats: 'base64': a base 64 string. 'hex': a string in the hexadecimal format. 'utf-8': the input binary is decoded to UTF-8 string. |
translate(input, from, to) | Translates the `input` string by replacing the characters present in the `from` string with the corresponding characters in the `to` string. |
trim(str) | Removes the leading and trailing space characters from `str`. |
trim(BOTH FROM str) | Removes the leading and trailing space characters from `str`. |
trim(LEADING FROM str) | Removes the leading space characters from `str`. |
trim(TRAILING FROM str) | Removes the trailing space characters from `str`. |
trim(trimStr FROM str) | Removes the leading and trailing `trimStr` characters from `str`. |
trim(BOTH trimStr FROM str) | Removes the leading and trailing `trimStr` characters from `str`. |
trim(LEADING trimStr FROM str) | Removes the leading `trimStr` characters from `str`. |
trim(TRAILING trimStr FROM str) | Removes the trailing `trimStr` characters from `str`. |
try_to_binary(str[, fmt]) | This is a special version of `to_binary` that performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed. |
try_to_number(expr, fmt) | Converts string `expr` to a number based on the string format `fmt`. Returns NULL if the string `expr` does not match the expected format. The format follows the same semantics as the to_number function. |
try_validate_utf8(str) | Returns the original string if `str` is a valid UTF-8 string, otherwise returns NULL. |
ucase(str) | Returns `str` with all characters changed to uppercase. |
unbase64(str) | Converts the argument from a base 64 string `str` to a binary. |
upper(str) | Returns `str` with all characters changed to uppercase. |
validate_utf8(str) | Returns the original string if `str` is a valid UTF-8 string, otherwise throws an exception. |
Examples
-- ascii
SELECT ascii('222');
+----------+
|ascii(222)|
+----------+
| 50|
+----------+
SELECT ascii(2);
+--------+
|ascii(2)|
+--------+
| 50|
+--------+
-- base64
SELECT base64('Spark SQL');
+-----------------+
|base64(Spark SQL)|
+-----------------+
| U3BhcmsgU1FM|
+-----------------+
SELECT base64(x'537061726b2053514c');
+-----------------------------+
|base64(X'537061726B2053514C')|
+-----------------------------+
| U3BhcmsgU1FM|
+-----------------------------+
-- bit_length
SELECT bit_length('Spark SQL');
+---------------------+
|bit_length(Spark SQL)|
+---------------------+
| 72|
+---------------------+
SELECT bit_length(x'537061726b2053514c');
+---------------------------------+
|bit_length(X'537061726B2053514C')|
+---------------------------------+
| 72|
+---------------------------------+
-- btrim
SELECT btrim(' SparkSQL ');
+----------------------+
|btrim( SparkSQL )|
+----------------------+
| SparkSQL|
+----------------------+
SELECT btrim(encode(' SparkSQL ', 'utf-8'));
+-------------------------------------+
|btrim(encode( SparkSQL , utf-8))|
+-------------------------------------+
| SparkSQL|
+-------------------------------------+
SELECT btrim('SSparkSQLS', 'SL');
+---------------------+
|btrim(SSparkSQLS, SL)|
+---------------------+
| parkSQ|
+---------------------+
SELECT btrim(encode('SSparkSQLS', 'utf-8'), encode('SL', 'utf-8'));
+---------------------------------------------------+
|btrim(encode(SSparkSQLS, utf-8), encode(SL, utf-8))|
+---------------------------------------------------+
| parkSQ|
+---------------------------------------------------+
-- char
SELECT char(65);
+--------+
|char(65)|
+--------+
| A|
+--------+
-- char_length
SELECT char_length('Spark SQL ');
+-----------------------+
|char_length(Spark SQL )|
+-----------------------+
| 10|
+-----------------------+
SELECT char_length(x'537061726b2053514c');
+----------------------------------+
|char_length(X'537061726B2053514C')|
+----------------------------------+
| 9|
+----------------------------------+
SELECT CHAR_LENGTH('Spark SQL ');
+-----------------------+
|char_length(Spark SQL )|
+-----------------------+
| 10|
+-----------------------+
SELECT CHARACTER_LENGTH('Spark SQL ');
+----------------------------+
|character_length(Spark SQL )|
+----------------------------+
| 10|
+----------------------------+
-- character_length
SELECT character_length('Spark SQL ');
+----------------------------+
|character_length(Spark SQL )|
+----------------------------+
| 10|
+----------------------------+
SELECT character_length(x'537061726b2053514c');
+---------------------------------------+
|character_length(X'537061726B2053514C')|
+---------------------------------------+
| 9|
+---------------------------------------+
-- chr
SELECT chr(65);
+-------+
|chr(65)|
+-------+
| A|
+-------+
-- collate
SELECT COLLATION('Spark SQL' collate UTF8_LCASE);
+-----------------------------------------+
|collation(collate(Spark SQL, UTF8_LCASE))|
+-----------------------------------------+
| UTF8_LCASE|
+-----------------------------------------+
-- collation
SELECT collation('Spark SQL');
+--------------------+
|collation(Spark SQL)|
+--------------------+
| UTF8_BINARY|
+--------------------+
-- concat_ws
SELECT concat_ws(' ', 'Spark', 'SQL');
+------------------------+
|concat_ws( , Spark, SQL)|
+------------------------+
| Spark SQL|
+------------------------+
SELECT concat_ws('s');
+------------+
|concat_ws(s)|
+------------+
| |
+------------+
SELECT concat_ws('/', 'foo', null, 'bar');
+----------------------------+
|concat_ws(/, foo, NULL, bar)|
+----------------------------+
| foo/bar|
+----------------------------+
SELECT concat_ws(null, 'Spark', 'SQL');
+---------------------------+
|concat_ws(NULL, Spark, SQL)|
+---------------------------+
| NULL|
+---------------------------+
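The null handling shown in the concat_ws examples above can be modelled in a few lines of plain Python. This is only a sketch of the documented behavior, not Spark's implementation: `None` stands in for SQL NULL, Python lists stand in for `array(str)` arguments, and skipping nulls inside arrays is an assumption of this sketch.

```python
def concat_ws(sep, *parts):
    """Model of concat_ws(sep[, str | array(str)]+) with None as SQL NULL."""
    # A NULL separator makes the whole result NULL.
    if sep is None:
        return None
    flat = []
    for part in parts:
        if part is None:
            continue  # NULL arguments are skipped
        if isinstance(part, (list, tuple)):
            # Assumption: NULL elements inside an array are skipped as well.
            flat.extend(p for p in part if p is not None)
        else:
            flat.append(part)
    return sep.join(flat)


print(concat_ws("/", "foo", None, "bar"))
print(concat_ws(" ", ["Spark", None], "SQL"))
```

With no string arguments at all, the sketch returns an empty string, matching the `concat_ws('s')` example.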
-- contains
SELECT contains('Spark SQL', 'Spark');
+--------------------------+
|contains(Spark SQL, Spark)|
+--------------------------+
| true|
+--------------------------+
SELECT contains('Spark SQL', 'SPARK');
+--------------------------+
|contains(Spark SQL, SPARK)|
+--------------------------+
| false|
+--------------------------+
SELECT contains('Spark SQL', null);
+-------------------------+
|contains(Spark SQL, NULL)|
+-------------------------+
| NULL|
+-------------------------+
SELECT contains(x'537061726b2053514c', x'537061726b');
+----------------------------------------------+
|contains(X'537061726B2053514C', X'537061726B')|
+----------------------------------------------+
| true|
+----------------------------------------------+
-- decode
SELECT decode(encode('abc', 'utf-8'), 'utf-8');
+---------------------------------+
|decode(encode(abc, utf-8), utf-8)|
+---------------------------------+
| abc|
+---------------------------------+
SELECT decode(2, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
+----------------------------------------------------------------------------------+
|decode(2, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle, Non domestic)|
+----------------------------------------------------------------------------------+
| San Francisco|
+----------------------------------------------------------------------------------+
SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
+----------------------------------------------------------------------------------+
|decode(6, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle, Non domestic)|
+----------------------------------------------------------------------------------+
| Non domestic|
+----------------------------------------------------------------------------------+
SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle');
+--------------------------------------------------------------------+
|decode(6, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle)|
+--------------------------------------------------------------------+
| NULL|
+--------------------------------------------------------------------+
SELECT decode(null, 6, 'Spark', NULL, 'SQL', 4, 'rocks');
+-------------------------------------------+
|decode(NULL, 6, Spark, NULL, SQL, 4, rocks)|
+-------------------------------------------+
| SQL|
+-------------------------------------------+
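The search/result form of decode reads like a chain of equality tests with an optional default. The following plain-Python sketch models that lookup under the assumption that `None` stands in for SQL NULL; the helper name `decode_lookup` is hypothetical and this is not how Spark implements the function.

```python
def decode_lookup(expr, *args):
    """Model of decode(expr, search, result [, search, result]... [, default])."""
    # An odd number of trailing arguments means the last one is the default.
    if len(args) % 2:
        pairs, default = args[:-1], args[-1]
    else:
        pairs, default = args, None
    for search, result in zip(pairs[0::2], pairs[1::2]):
        # None == None is True in Python, mirroring the example above in
        # which a NULL expr matches a NULL search value.
        if expr == search:
            return result
    return default


print(decode_lookup(2, 1, "Southlake", 2, "San Francisco", "Non domestic"))
print(decode_lookup(6, 1, "Southlake", 2, "San Francisco"))
```

The first call finds a match on the second pair; the second call has no match and no default, so it yields `None`.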
-- elt
SELECT elt(1, 'scala', 'java');
+-------------------+
|elt(1, scala, java)|
+-------------------+
| scala|
+-------------------+
SELECT elt(2, 'a', 1);
+------------+
|elt(2, a, 1)|
+------------+
| 1|
+------------+
-- encode
SELECT encode('abc', 'utf-8');
+------------------+
|encode(abc, utf-8)|
+------------------+
| [61 62 63]|
+------------------+
-- endswith
SELECT endswith('Spark SQL', 'SQL');
+------------------------+
|endswith(Spark SQL, SQL)|
+------------------------+
| true|
+------------------------+
SELECT endswith('Spark SQL', 'Spark');
+--------------------------+
|endswith(Spark SQL, Spark)|
+--------------------------+
| false|
+--------------------------+
SELECT endswith('Spark SQL', null);
+-------------------------+
|endswith(Spark SQL, NULL)|
+-------------------------+
| NULL|
+-------------------------+
SELECT endswith(x'537061726b2053514c', x'537061726b');
+----------------------------------------------+
|endswith(X'537061726B2053514C', X'537061726B')|
+----------------------------------------------+
| false|
+----------------------------------------------+
SELECT endswith(x'537061726b2053514c', x'53514c');
+------------------------------------------+
|endswith(X'537061726B2053514C', X'53514C')|
+------------------------------------------+
| true|
+------------------------------------------+
-- find_in_set
SELECT find_in_set('ab','abc,b,ab,c,def');
+-------------------------------+
|find_in_set(ab, abc,b,ab,c,def)|
+-------------------------------+
| 3|
+-------------------------------+
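The two cases in which find_in_set returns 0 (the string is absent, or the string itself contains a comma) are easy to overlook. A minimal plain-Python model of the documented rule, offered as a sketch rather than Spark's actual code:

```python
def find_in_set(s, str_array):
    """Model of find_in_set(str, str_array): 1-based index, or 0."""
    # A needle containing a comma can never equal a single list element.
    if "," in s:
        return 0
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0


print(find_in_set("ab", "abc,b,ab,c,def"))
print(find_in_set("b,ab", "abc,b,ab,c,def"))
```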
-- format_number
SELECT format_number(12332.123456, 4);
+------------------------------+
|format_number(12332.123456, 4)|
+------------------------------+
| 12,332.1235|
+------------------------------+
SELECT format_number(12332.123456, '##################.###');
+---------------------------------------------------+
|format_number(12332.123456, ##################.###)|
+---------------------------------------------------+
| 12332.123|
+---------------------------------------------------+
-- format_string
SELECT format_string("Hello World %d %s", 100, "days");
+-------------------------------------------+
|format_string(Hello World %d %s, 100, days)|
+-------------------------------------------+
| Hello World 100 days|
+-------------------------------------------+
-- initcap
SELECT initcap('sPark sql');
+------------------+
|initcap(sPark sql)|
+------------------+
| Spark Sql|
+------------------+
-- instr
SELECT instr('SparkSQL', 'SQL');
+--------------------+
|instr(SparkSQL, SQL)|
+--------------------+
| 6|
+--------------------+
-- is_valid_utf8
SELECT is_valid_utf8('Spark');
+--------------------+
|is_valid_utf8(Spark)|
+--------------------+
| true|
+--------------------+
SELECT is_valid_utf8(x'61');
+--------------------+
|is_valid_utf8(X'61')|
+--------------------+
| true|
+--------------------+
SELECT is_valid_utf8(x'80');
+--------------------+
|is_valid_utf8(X'80')|
+--------------------+
| false|
+--------------------+
SELECT is_valid_utf8(x'61C262');
+------------------------+
|is_valid_utf8(X'61C262')|
+------------------------+
| false|
+------------------------+
-- lcase
SELECT lcase('SparkSql');
+---------------+
|lcase(SparkSql)|
+---------------+
| sparksql|
+---------------+
-- left
SELECT left('Spark SQL', 3);
+------------------+
|left(Spark SQL, 3)|
+------------------+
| Spa|
+------------------+
SELECT left(encode('Spark SQL', 'utf-8'), 3);
+---------------------------------+
|left(encode(Spark SQL, utf-8), 3)|
+---------------------------------+
| [53 70 61]|
+---------------------------------+
-- len
SELECT len('Spark SQL ');
+---------------+
|len(Spark SQL )|
+---------------+
| 10|
+---------------+
SELECT len(x'537061726b2053514c');
+--------------------------+
|len(X'537061726B2053514C')|
+--------------------------+
| 9|
+--------------------------+
-- length
SELECT length('Spark SQL ');
+------------------+
|length(Spark SQL )|
+------------------+
| 10|
+------------------+
SELECT length(x'537061726b2053514c');
+-----------------------------+
|length(X'537061726B2053514C')|
+-----------------------------+
| 9|
+-----------------------------+
-- levenshtein
SELECT levenshtein('kitten', 'sitting');
+----------------------------+
|levenshtein(kitten, sitting)|
+----------------------------+
| 3|
+----------------------------+
SELECT levenshtein('kitten', 'sitting', 2);
+-------------------------------+
|levenshtein(kitten, sitting, 2)|
+-------------------------------+
| -1|
+-------------------------------+
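To make the role of the optional threshold concrete, here is a small plain-Python sketch of the classic dynamic-programming edit distance with the cut-off applied at the end. It illustrates the documented result only; Spark's own implementation may stop early once the threshold is exceeded.

```python
def levenshtein(a, b, threshold=None):
    """Edit distance between a and b; -1 if it is greater than threshold."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    distance = prev[-1]
    if threshold is not None and distance > threshold:
        return -1
    return distance


print(levenshtein("kitten", "sitting"))
print(levenshtein("kitten", "sitting", 2))
```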
-- locate
SELECT locate('bar', 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
| 4|
+-------------------------+
SELECT locate('bar', 'foobarbar', 5);
+-------------------------+
|locate(bar, foobarbar, 5)|
+-------------------------+
| 7|
+-------------------------+
SELECT POSITION('bar' IN 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
| 4|
+-------------------------+
-- lower
SELECT lower('SparkSql');
+---------------+
|lower(SparkSql)|
+---------------+
| sparksql|
+---------------+
-- lpad
SELECT lpad('hi', 5, '??');
+---------------+
|lpad(hi, 5, ??)|
+---------------+
| ???hi|
+---------------+
SELECT lpad('hi', 1, '??');
+---------------+
|lpad(hi, 1, ??)|
+---------------+
| h|
+---------------+
SELECT lpad('hi', 5);
+--------------+
|lpad(hi, 5, )|
+--------------+
| hi|
+--------------+
SELECT hex(lpad(unhex('aabb'), 5));
+--------------------------------+
|hex(lpad(unhex(aabb), 5, X'00'))|
+--------------------------------+
| 000000AABB|
+--------------------------------+
SELECT hex(lpad(unhex('aabb'), 5, unhex('1122')));
+--------------------------------------+
|hex(lpad(unhex(aabb), 5, unhex(1122)))|
+--------------------------------------+
| 112211AABB|
+--------------------------------------+
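The lpad examples above combine three rules: the pad string is repeated and cut to fit, an over-long input is truncated to `len`, and the default pad depends on whether the input is a string or a byte sequence. The sketch below models these rules in plain Python using `str` and `bytes`; returning the (truncated) input for an empty pad is an assumption of this sketch.

```python
def lpad(s, length, pad=None):
    """Model of lpad(str, len[, pad]) for str or bytes input."""
    if pad is None:
        # Default pad: a space for character strings, a zero byte for binary.
        pad = b"\x00" if isinstance(s, bytes) else " "
    length = max(length, 0)
    missing = length - len(s)
    if missing <= 0 or len(pad) == 0:
        # Input already long enough (or nothing to pad with): truncate only.
        return s[:length]
    fill = (pad * (missing // len(pad) + 1))[:missing]
    return fill + s


print(lpad("hi", 5, "??"))
print(lpad(bytes.fromhex("aabb"), 5, bytes.fromhex("1122")).hex().upper())
```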
-- ltrim
SELECT ltrim(' SparkSQL ');
+----------------------+
|ltrim( SparkSQL )|
+----------------------+
| SparkSQL |
+----------------------+
-- luhn_check
SELECT luhn_check('8112189876');
+----------------------+
|luhn_check(8112189876)|
+----------------------+
| true|
+----------------------+
SELECT luhn_check('79927398713');
+-----------------------+
|luhn_check(79927398713)|
+-----------------------+
| true|
+-----------------------+
SELECT luhn_check('79927398714');
+-----------------------+
|luhn_check(79927398714)|
+-----------------------+
| false|
+-----------------------+
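For readers unfamiliar with the Luhn algorithm, the following plain-Python sketch shows the checksum that luhn_check validates. It is a textbook version of the algorithm, not Spark's code, and treating empty or non-digit input as invalid is an assumption of this sketch.

```python
def luhn_check(digits):
    """Return True if the digit string passes the Luhn checksum."""
    if not digits or not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_check("79927398713"))
print(luhn_check("79927398714"))
```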
-- make_valid_utf8
SELECT make_valid_utf8('Spark');
+----------------------+
|make_valid_utf8(Spark)|
+----------------------+
| Spark|
+----------------------+
SELECT make_valid_utf8(x'61');
+----------------------+
|make_valid_utf8(X'61')|
+----------------------+
| a|
+----------------------+
SELECT make_valid_utf8(x'80');
+----------------------+
|make_valid_utf8(X'80')|
+----------------------+
| �|
+----------------------+
SELECT make_valid_utf8(x'61C262');
+--------------------------+
|make_valid_utf8(X'61C262')|
+--------------------------+
| a�b|
+--------------------------+
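The byte sequences used in the is_valid_utf8 and make_valid_utf8 examples can be reproduced with Python's built-in UTF-8 decoder, which also substitutes U+FFFD for invalid bytes when asked to. This is only an analogy under the assumption that both decoders agree on these inputs; for other malformed sequences the number of replacement characters may differ between implementations.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def make_valid_utf8(data: bytes) -> str:
    """Decode data, replacing invalid sequences with U+FFFD."""
    return data.decode("utf-8", errors="replace")


for raw in (b"\x61", b"\x80", b"\x61\xc2\x62"):
    print(raw.hex(), is_valid_utf8(raw), repr(make_valid_utf8(raw)))
```

The lone continuation byte `0x80` and the truncated two-byte lead `0xC2` are each reported as invalid and replaced by a single U+FFFD.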
-- mask
SELECT mask('abcd-EFGH-8765-4321');
+----------------------------------------+
|mask(abcd-EFGH-8765-4321, X, x, n, NULL)|
+----------------------------------------+
| xxxx-XXXX-nnnn-nnnn|
+----------------------------------------+
SELECT mask('abcd-EFGH-8765-4321', 'Q');
+----------------------------------------+
|mask(abcd-EFGH-8765-4321, Q, x, n, NULL)|
+----------------------------------------+
| xxxx-QQQQ-nnnn-nnnn|
+----------------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q');
+--------------------------------+
|mask(AbCD123-@$#, Q, q, n, NULL)|
+--------------------------------+
| QqQQnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#');
+--------------------------------+
|mask(AbCD123-@$#, X, x, n, NULL)|
+--------------------------------+
| XxXXnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q');
+--------------------------------+
|mask(AbCD123-@$#, Q, x, n, NULL)|
+--------------------------------+
| QxQQnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q', 'd');
+--------------------------------+
|mask(AbCD123-@$#, Q, q, d, NULL)|
+--------------------------------+
| QqQQddd-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q', 'd', 'o');
+-----------------------------+
|mask(AbCD123-@$#, Q, q, d, o)|
+-----------------------------+
| QqQQdddoooo|
+-----------------------------+
SELECT mask('AbCD123-@$#', NULL, 'q', 'd', 'o');
+--------------------------------+
|mask(AbCD123-@$#, NULL, q, d, o)|
+--------------------------------+
| AqCDdddoooo|
+--------------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, 'd', 'o');
+-----------------------------------+
|mask(AbCD123-@$#, NULL, NULL, d, o)|
+-----------------------------------+
| AbCDdddoooo|
+-----------------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, NULL, 'o');
+--------------------------------------+
|mask(AbCD123-@$#, NULL, NULL, NULL, o)|
+--------------------------------------+
| AbCD123oooo|
+--------------------------------------+
SELECT mask(NULL, NULL, NULL, NULL, 'o');
+-------------------------------+
|mask(NULL, NULL, NULL, NULL, o)|
+-------------------------------+
| NULL|
+-------------------------------+
SELECT mask(NULL);
+-------------------------+
|mask(NULL, X, x, n, NULL)|
+-------------------------+
| NULL|
+-------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, NULL, NULL);
+-----------------------------------------+
|mask(AbCD123-@$#, NULL, NULL, NULL, NULL)|
+-----------------------------------------+
| AbCD123-@$#|
+-----------------------------------------+
-- octet_length
SELECT octet_length('Spark SQL');
+-----------------------+
|octet_length(Spark SQL)|
+-----------------------+
| 9|
+-----------------------+
SELECT octet_length(x'537061726b2053514c');
+-----------------------------------+
|octet_length(X'537061726B2053514C')|
+-----------------------------------+
| 9|
+-----------------------------------+
-- overlay
SELECT overlay('Spark SQL' PLACING '_' FROM 6);
+----------------------------+
|overlay(Spark SQL, _, 6, -1)|
+----------------------------+
| Spark_SQL|
+----------------------------+
SELECT overlay('Spark SQL' PLACING 'CORE' FROM 7);
+-------------------------------+
|overlay(Spark SQL, CORE, 7, -1)|
+-------------------------------+
| Spark CORE|
+-------------------------------+
SELECT overlay('Spark SQL' PLACING 'ANSI ' FROM 7 FOR 0);
+-------------------------------+
|overlay(Spark SQL, ANSI , 7, 0)|
+-------------------------------+
| Spark ANSI SQL|
+-------------------------------+
SELECT overlay('Spark SQL' PLACING 'tructured' FROM 2 FOR 4);
+-----------------------------------+
|overlay(Spark SQL, tructured, 2, 4)|
+-----------------------------------+
| Structured SQL|
+-----------------------------------+
SELECT overlay(encode('Spark SQL', 'utf-8') PLACING encode('_', 'utf-8') FROM 6);
+----------------------------------------------------------+
|overlay(encode(Spark SQL, utf-8), encode(_, utf-8), 6, -1)|
+----------------------------------------------------------+
| [53 70 61 72 6B 5...|
+----------------------------------------------------------+
SELECT overlay(encode('Spark SQL', 'utf-8') PLACING encode('CORE', 'utf-8') FROM 7);
+-------------------------------------------------------------+
|overlay(encode(Spark SQL, utf-8), encode(CORE, utf-8), 7, -1)|
+-------------------------------------------------------------+
| [53 70 61 72 6B 2...|
+-------------------------------------------------------------+
SELECT overlay(encode('Spark SQL', 'utf-8') PLACING encode('ANSI ', 'utf-8') FROM 7 FOR 0);
+-------------------------------------------------------------+
|overlay(encode(Spark SQL, utf-8), encode(ANSI , utf-8), 7, 0)|
+-------------------------------------------------------------+
| [53 70 61 72 6B 2...|
+-------------------------------------------------------------+
SELECT overlay(encode('Spark SQL', 'utf-8') PLACING encode('tructured', 'utf-8') FROM 2 FOR 4);
+-----------------------------------------------------------------+
|overlay(encode(Spark SQL, utf-8), encode(tructured, utf-8), 2, 4)|
+-----------------------------------------------------------------+
| [53 74 72 75 63 7...|
+-----------------------------------------------------------------+
-- position
SELECT position('bar', 'foobarbar');
+---------------------------+
|position(bar, foobarbar, 1)|
+---------------------------+
| 4|
+---------------------------+
SELECT position('bar', 'foobarbar', 5);
+---------------------------+
|position(bar, foobarbar, 5)|
+---------------------------+
| 7|
+---------------------------+
SELECT POSITION('bar' IN 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
| 4|
+-------------------------+
-- printf
SELECT printf("Hello World %d %s", 100, "days");
+------------------------------------+
|printf(Hello World %d %s, 100, days)|
+------------------------------------+
| Hello World 100 days|
+------------------------------------+
-- randstr
SELECT randstr(3, 0) AS result;
+------+
|result|
+------+
| ceV|
+------+
-- regexp_count
SELECT regexp_count('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
+------------------------------------------------------------------------------+
|regexp_count(Steven Jones and Stephen Smith are the best players, Ste(v|ph)en)|
+------------------------------------------------------------------------------+
| 2|
+------------------------------------------------------------------------------+
SELECT regexp_count('abcdefghijklmnopqrstuvwxyz', '[a-z]{3}');
+--------------------------------------------------+
|regexp_count(abcdefghijklmnopqrstuvwxyz, [a-z]{3})|
+--------------------------------------------------+
| 8|
+--------------------------------------------------+
-- regexp_extract
SELECT regexp_extract('100-200', '(\\d+)-(\\d+)', 1);
+---------------------------------------+
|regexp_extract(100-200, (\d+)-(\d+), 1)|
+---------------------------------------+
| 100|
+---------------------------------------+
SELECT regexp_extract('100-200', r'(\d+)-(\d+)', 1);
+---------------------------------------+
|regexp_extract(100-200, (\d+)-(\d+), 1)|
+---------------------------------------+
| 100|
+---------------------------------------+
-- regexp_extract_all
SELECT regexp_extract_all('100-200, 300-400', '(\\d+)-(\\d+)', 1);
+----------------------------------------------------+
|regexp_extract_all(100-200, 300-400, (\d+)-(\d+), 1)|
+----------------------------------------------------+
| [100, 300]|
+----------------------------------------------------+
SELECT regexp_extract_all('100-200, 300-400', r'(\d+)-(\d+)', 1);
+----------------------------------------------------+
|regexp_extract_all(100-200, 300-400, (\d+)-(\d+), 1)|
+----------------------------------------------------+
| [100, 300]|
+----------------------------------------------------+
-- regexp_instr
SELECT regexp_instr(r"\abc", r"^\\abc$");
+------------------------------+
|regexp_instr(\abc, ^\\abc$, 0)|
+------------------------------+
| 1|
+------------------------------+
SELECT regexp_instr('user@spark.apache.org', '@[^.]*');
+----------------------------------------------+
|regexp_instr(user@spark.apache.org, @[^.]*, 0)|
+----------------------------------------------+
| 5|
+----------------------------------------------+
-- regexp_replace
SELECT regexp_replace('100-200', '(\\d+)', 'num');
+--------------------------------------+
|regexp_replace(100-200, (\d+), num, 1)|
+--------------------------------------+
| num-num|
+--------------------------------------+
SELECT regexp_replace('100-200', r'(\d+)', 'num');
+--------------------------------------+
|regexp_replace(100-200, (\d+), num, 1)|
+--------------------------------------+
| num-num|
+--------------------------------------+
-- regexp_substr
SELECT regexp_substr('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
+-------------------------------------------------------------------------------+
|regexp_substr(Steven Jones and Stephen Smith are the best players, Ste(v|ph)en)|
+-------------------------------------------------------------------------------+
| Steven|
+-------------------------------------------------------------------------------+
SELECT regexp_substr('Steven Jones and Stephen Smith are the best players', 'Jeck');
+------------------------------------------------------------------------+
|regexp_substr(Steven Jones and Stephen Smith are the best players, Jeck)|
+------------------------------------------------------------------------+
| NULL|
+------------------------------------------------------------------------+
-- repeat
SELECT repeat('123', 2);
+--------------+
|repeat(123, 2)|
+--------------+
| 123123|
+--------------+
-- replace
SELECT replace('ABCabc', 'abc', 'DEF');
+-------------------------+
|replace(ABCabc, abc, DEF)|
+-------------------------+
| ABCDEF|
+-------------------------+
-- right
SELECT right('Spark SQL', 3);
+-------------------+
|right(Spark SQL, 3)|
+-------------------+
| SQL|
+-------------------+
-- rpad
SELECT rpad('hi', 5, '??');
+---------------+
|rpad(hi, 5, ??)|
+---------------+
| hi???|
+---------------+
SELECT rpad('hi', 1, '??');
+---------------+
|rpad(hi, 1, ??)|
+---------------+
| h|
+---------------+
SELECT rpad('hi', 5);
+--------------+
|rpad(hi, 5, )|
+--------------+
| hi |
+--------------+
SELECT hex(rpad(unhex('aabb'), 5));
+--------------------------------+
|hex(rpad(unhex(aabb), 5, X'00'))|
+--------------------------------+
| AABB000000|
+--------------------------------+
SELECT hex(rpad(unhex('aabb'), 5, unhex('1122')));
+--------------------------------------+
|hex(rpad(unhex(aabb), 5, unhex(1122)))|
+--------------------------------------+
| AABB112211|
+--------------------------------------+
-- rtrim
SELECT rtrim(' SparkSQL ');
+----------------------+
|rtrim( SparkSQL )|
+----------------------+
| SparkSQL|
+----------------------+
-- sentences
SELECT sentences('Hi there! Good morning.');
+--------------------------------------+
|sentences(Hi there! Good morning., , )|
+--------------------------------------+
| [[Hi, there], [Go...|
+--------------------------------------+
SELECT sentences('Hi there! Good morning.', 'en');
+----------------------------------------+
|sentences(Hi there! Good morning., en, )|
+----------------------------------------+
| [[Hi, there], [Go...|
+----------------------------------------+
SELECT sentences('Hi there! Good morning.', 'en', 'US');
+------------------------------------------+
|sentences(Hi there! Good morning., en, US)|
+------------------------------------------+
| [[Hi, there], [Go...|
+------------------------------------------+
-- soundex
SELECT soundex('Miller');
+---------------+
|soundex(Miller)|
+---------------+
| M460|
+---------------+
-- space
SELECT concat(space(2), '1');
+-------------------+
|concat(space(2), 1)|
+-------------------+
| 1|
+-------------------+
-- split
SELECT split('oneAtwoBthreeC', '[ABC]');
+--------------------------------+
|split(oneAtwoBthreeC, [ABC], -1)|
+--------------------------------+
| [one, two, three, ]|
+--------------------------------+
SELECT split('oneAtwoBthreeC', '[ABC]', -1);
+--------------------------------+
|split(oneAtwoBthreeC, [ABC], -1)|
+--------------------------------+
| [one, two, three, ]|
+--------------------------------+
SELECT split('oneAtwoBthreeC', '[ABC]', 2);
+-------------------------------+
|split(oneAtwoBthreeC, [ABC], 2)|
+-------------------------------+
| [one, twoBthreeC]|
+-------------------------------+
-- split_part
SELECT split_part('11.12.13', '.', 3);
+--------------------------+
|split_part(11.12.13, ., 3)|
+--------------------------+
| 13|
+--------------------------+
-- startswith
SELECT startswith('Spark SQL', 'Spark');
+----------------------------+
|startswith(Spark SQL, Spark)|
+----------------------------+
| true|
+----------------------------+
SELECT startswith('Spark SQL', 'SQL');
+--------------------------+
|startswith(Spark SQL, SQL)|
+--------------------------+
| false|
+--------------------------+
SELECT startswith('Spark SQL', null);
+---------------------------+
|startswith(Spark SQL, NULL)|
+---------------------------+
| NULL|
+---------------------------+
SELECT startswith(x'537061726b2053514c', x'537061726b');
+------------------------------------------------+
|startswith(X'537061726B2053514C', X'537061726B')|
+------------------------------------------------+
| true|
+------------------------------------------------+
SELECT startswith(x'537061726b2053514c', x'53514c');
+--------------------------------------------+
|startswith(X'537061726B2053514C', X'53514C')|
+--------------------------------------------+
| false|
+--------------------------------------------+
-- substr
SELECT substr('Spark SQL', 5);
+--------------------------------+
|substr(Spark SQL, 5, 2147483647)|
+--------------------------------+
| k SQL|
+--------------------------------+
SELECT substr('Spark SQL', -3);
+---------------------------------+
|substr(Spark SQL, -3, 2147483647)|
+---------------------------------+
| SQL|
+---------------------------------+
SELECT substr('Spark SQL', 5, 1);
+-----------------------+
|substr(Spark SQL, 5, 1)|
+-----------------------+
| k|
+-----------------------+
SELECT substr('Spark SQL' FROM 5);
+-----------------------------------+
|substring(Spark SQL, 5, 2147483647)|
+-----------------------------------+
| k SQL|
+-----------------------------------+
SELECT substr('Spark SQL' FROM -3);
+------------------------------------+
|substring(Spark SQL, -3, 2147483647)|
+------------------------------------+
| SQL|
+------------------------------------+
SELECT substr('Spark SQL' FROM 5 FOR 1);
+--------------------------+
|substring(Spark SQL, 5, 1)|
+--------------------------+
| k|
+--------------------------+
SELECT substr(encode('Spark SQL', 'utf-8'), 5);
+-----------------------------------------------+
|substr(encode(Spark SQL, utf-8), 5, 2147483647)|
+-----------------------------------------------+
| [6B 20 53 51 4C]|
+-----------------------------------------------+
-- substring
SELECT substring('Spark SQL', 5);
+-----------------------------------+
|substring(Spark SQL, 5, 2147483647)|
+-----------------------------------+
| k SQL|
+-----------------------------------+
SELECT substring('Spark SQL', -3);
+------------------------------------+
|substring(Spark SQL, -3, 2147483647)|
+------------------------------------+
| SQL|
+------------------------------------+
SELECT substring('Spark SQL', 5, 1);
+--------------------------+
|substring(Spark SQL, 5, 1)|
+--------------------------+
| k|
+--------------------------+
SELECT substring('Spark SQL' FROM 5);
+-----------------------------------+
|substring(Spark SQL, 5, 2147483647)|
+-----------------------------------+
| k SQL|
+-----------------------------------+
SELECT substring('Spark SQL' FROM -3);
+------------------------------------+
|substring(Spark SQL, -3, 2147483647)|
+------------------------------------+
| SQL|
+------------------------------------+
SELECT substring('Spark SQL' FROM 5 FOR 1);
+--------------------------+
|substring(Spark SQL, 5, 1)|
+--------------------------+
| k|
+--------------------------+
SELECT substring(encode('Spark SQL', 'utf-8'), 5);
+--------------------------------------------------+
|substring(encode(Spark SQL, utf-8), 5, 2147483647)|
+--------------------------------------------------+
| [6B 20 53 51 4C]|
+--------------------------------------------------+
-- substring_index
SELECT substring_index('www.apache.org', '.', 2);
+-------------------------------------+
|substring_index(www.apache.org, ., 2)|
+-------------------------------------+
| www.apache|
+-------------------------------------+
-- to_binary
SELECT to_binary('abc', 'utf-8');
+---------------------+
|to_binary(abc, utf-8)|
+---------------------+
| [61 62 63]|
+---------------------+
-- to_char
SELECT to_char(454, '999');
+-----------------+
|to_char(454, 999)|
+-----------------+
| 454|
+-----------------+
SELECT to_char(454.00, '000D00');
+-----------------------+
|to_char(454.00, 000D00)|
+-----------------------+
| 454.00|
+-----------------------+
SELECT to_char(12454, '99G999');
+----------------------+
|to_char(12454, 99G999)|
+----------------------+
| 12,454|
+----------------------+
SELECT to_char(78.12, '$99.99');
+----------------------+
|to_char(78.12, $99.99)|
+----------------------+
| $78.12|
+----------------------+
SELECT to_char(-12454.8, '99G999D9S');
+----------------------------+
|to_char(-12454.8, 99G999D9S)|
+----------------------------+
| 12,454.8-|
+----------------------------+
SELECT to_char(date'2016-04-08', 'y');
+---------------------------------+
|date_format(DATE '2016-04-08', y)|
+---------------------------------+
| 2016|
+---------------------------------+
SELECT to_char(x'537061726b2053514c', 'base64');
+-----------------------------+
|base64(X'537061726B2053514C')|
+-----------------------------+
| U3BhcmsgU1FM|
+-----------------------------+
SELECT to_char(x'537061726b2053514c', 'hex');
+--------------------------+
|hex(X'537061726B2053514C')|
+--------------------------+
| 537061726B2053514C|
+--------------------------+
SELECT to_char(encode('abc', 'utf-8'), 'utf-8');
+---------------------------------+
|decode(encode(abc, utf-8), utf-8)|
+---------------------------------+
| abc|
+---------------------------------+
-- to_number
SELECT to_number('454', '999');
+-------------------+
|to_number(454, 999)|
+-------------------+
| 454|
+-------------------+
SELECT to_number('454.00', '000.00');
+-------------------------+
|to_number(454.00, 000.00)|
+-------------------------+
| 454.00|
+-------------------------+
SELECT to_number('12,454', '99,999');
+-------------------------+
|to_number(12,454, 99,999)|
+-------------------------+
| 12454|
+-------------------------+
SELECT to_number('$78.12', '$99.99');
+-------------------------+
|to_number($78.12, $99.99)|
+-------------------------+
| 78.12|
+-------------------------+
SELECT to_number('12,454.8-', '99,999.9S');
+-------------------------------+
|to_number(12,454.8-, 99,999.9S)|
+-------------------------------+
| -12454.8|
+-------------------------------+
-- to_varchar
SELECT to_varchar(454, '999');
+-----------------+
|to_char(454, 999)|
+-----------------+
| 454|
+-----------------+
SELECT to_varchar(454.00, '000D00');
+-----------------------+
|to_char(454.00, 000D00)|
+-----------------------+
| 454.00|
+-----------------------+
SELECT to_varchar(12454, '99G999');
+----------------------+
|to_char(12454, 99G999)|
+----------------------+
| 12,454|
+----------------------+
SELECT to_varchar(78.12, '$99.99');
+----------------------+
|to_char(78.12, $99.99)|
+----------------------+
| $78.12|
+----------------------+
SELECT to_varchar(-12454.8, '99G999D9S');
+----------------------------+
|to_char(-12454.8, 99G999D9S)|
+----------------------------+
| 12,454.8-|
+----------------------------+
SELECT to_varchar(date'2016-04-08', 'y');
+---------------------------------+
|date_format(DATE '2016-04-08', y)|
+---------------------------------+
| 2016|
+---------------------------------+
SELECT to_varchar(x'537061726b2053514c', 'base64');
+---------------------------------+
|to_varchar(X'537061726B2053514C')|
+---------------------------------+
| U3BhcmsgU1FM|
+---------------------------------+
SELECT to_varchar(x'537061726b2053514c', 'hex');
+---------------------------------+
|to_varchar(X'537061726B2053514C')|
+---------------------------------+
| 537061726B2053514C|
+---------------------------------+
SELECT to_varchar(encode('abc', 'utf-8'), 'utf-8');
+-------------------------------------+
|to_varchar(encode(abc, utf-8), utf-8)|
+-------------------------------------+
| abc|
+-------------------------------------+
-- translate
SELECT translate('AaBbCc', 'abc', '123');
+---------------------------+
|translate(AaBbCc, abc, 123)|
+---------------------------+
| A1B2C3|
+---------------------------+
-- trim
SELECT trim(' SparkSQL ');
+---------------------+
|trim( SparkSQL )|
+---------------------+
| SparkSQL|
+---------------------+
SELECT trim(BOTH FROM ' SparkSQL ');
+---------------------+
|trim( SparkSQL )|
+---------------------+
| SparkSQL|
+---------------------+
SELECT trim(LEADING FROM ' SparkSQL ');
+----------------------+
|ltrim( SparkSQL )|
+----------------------+
| SparkSQL |
+----------------------+
SELECT trim(TRAILING FROM ' SparkSQL ');
+----------------------+
|rtrim( SparkSQL )|
+----------------------+
| SparkSQL|
+----------------------+
SELECT trim('SL' FROM 'SSparkSQLS');
+-----------------------------+
|TRIM(BOTH SL FROM SSparkSQLS)|
+-----------------------------+
| parkSQ|
+-----------------------------+
SELECT trim(BOTH 'SL' FROM 'SSparkSQLS');
+-----------------------------+
|TRIM(BOTH SL FROM SSparkSQLS)|
+-----------------------------+
| parkSQ|
+-----------------------------+
SELECT trim(LEADING 'SL' FROM 'SSparkSQLS');
+--------------------------------+
|TRIM(LEADING SL FROM SSparkSQLS)|
+--------------------------------+
| parkSQLS|
+--------------------------------+
SELECT trim(TRAILING 'SL' FROM 'SSparkSQLS');
+---------------------------------+
|TRIM(TRAILING SL FROM SSparkSQLS)|
+---------------------------------+
| SSparkSQ|
+---------------------------------+
-- try_to_binary
SELECT try_to_binary('abc', 'utf-8');
+-------------------------+
|try_to_binary(abc, utf-8)|
+-------------------------+
| [61 62 63]|
+-------------------------+
SELECT try_to_binary('a!', 'base64');
+-------------------------+
|try_to_binary(a!, base64)|
+-------------------------+
| NULL|
+-------------------------+
SELECT try_to_binary('abc', 'invalidFormat');
+---------------------------------+
|try_to_binary(abc, invalidFormat)|
+---------------------------------+
| NULL|
+---------------------------------+
-- try_to_number
SELECT try_to_number('454', '999');
+-----------------------+
|try_to_number(454, 999)|
+-----------------------+
| 454|
+-----------------------+
SELECT try_to_number('454.00', '000.00');
+-----------------------------+
|try_to_number(454.00, 000.00)|
+-----------------------------+
| 454.00|
+-----------------------------+
SELECT try_to_number('12,454', '99,999');
+-----------------------------+
|try_to_number(12,454, 99,999)|
+-----------------------------+
| 12454|
+-----------------------------+
SELECT try_to_number('$78.12', '$99.99');
+-----------------------------+
|try_to_number($78.12, $99.99)|
+-----------------------------+
| 78.12|
+-----------------------------+
SELECT try_to_number('12,454.8-', '99,999.9S');
+-----------------------------------+
|try_to_number(12,454.8-, 99,999.9S)|
+-----------------------------------+
| -12454.8|
+-----------------------------------+
-- try_validate_utf8
SELECT try_validate_utf8('Spark');
+------------------------+
|try_validate_utf8(Spark)|
+------------------------+
| Spark|
+------------------------+
SELECT try_validate_utf8(x'61');
+------------------------+
|try_validate_utf8(X'61')|
+------------------------+
| a|
+------------------------+
SELECT try_validate_utf8(x'80');
+------------------------+
|try_validate_utf8(X'80')|
+------------------------+
| NULL|
+------------------------+
SELECT try_validate_utf8(x'61C262');
+----------------------------+
|try_validate_utf8(X'61C262')|
+----------------------------+
| NULL|
+----------------------------+
-- ucase
SELECT ucase('SparkSql');
+---------------+
|ucase(SparkSql)|
+---------------+
| SPARKSQL|
+---------------+
-- unbase64
SELECT unbase64('U3BhcmsgU1FM');
+----------------------+
|unbase64(U3BhcmsgU1FM)|
+----------------------+
| [53 70 61 72 6B 2...|
+----------------------+
-- upper
SELECT upper('SparkSql');
+---------------+
|upper(SparkSql)|
+---------------+
| SPARKSQL|
+---------------+
-- validate_utf8
SELECT validate_utf8('Spark');
+--------------------+
|validate_utf8(Spark)|
+--------------------+
| Spark|
+--------------------+
SELECT validate_utf8(x'61');
+--------------------+
|validate_utf8(X'61')|
+--------------------+
| a|
+--------------------+
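The regexp examples above write the same pattern two ways: as a regular string literal with doubled backslashes, and as a raw literal with the `r` prefix. The following is a minimal sketch of that equivalence, followed by two of the splitting functions shown earlier applied to the same input. It assumes the default parser settings (in particular that `spark.sql.parser.escapedStringLiterals` is left disabled); the column aliases are illustrative only.

```sql
-- Both literals denote the regex \d+ under the default parser settings,
-- so each call counts the same two digit runs and the comparison is true.
SELECT regexp_count('100-200', '\\d+') = regexp_count('100-200', r'\d+') AS same_pattern;

-- split_part and substring_index can both return the text before the
-- first delimiter; here both columns are 'www'.
SELECT split_part('www.apache.org', '.', 1) AS first_part,
       substring_index('www.apache.org', '.', 1) AS first_index;
```

The raw form is usually easier to read once a pattern contains several backslashes.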
Conditional Functions
Function | Description |
---|---|
input [NOT] between lower AND upper | Evaluates whether `input` is [not] between `lower` and `upper`. |
coalesce(expr1, expr2, ...) | Returns the first non-null argument if one exists; otherwise, null. |
if(expr1, expr2, expr3) | If `expr1` evaluates to true, then returns `expr2`; otherwise returns `expr3`. |
ifnull(expr1, expr2) | Returns `expr2` if `expr1` is null, or `expr1` otherwise. |
nanvl(expr1, expr2) | Returns `expr1` if it's not NaN, or `expr2` otherwise. |
nullif(expr1, expr2) | Returns null if `expr1` equals `expr2`, or `expr1` otherwise. |
nullifzero(expr) | Returns null if `expr` is equal to zero, or `expr` otherwise. |
nvl(expr1, expr2) | Returns `expr2` if `expr1` is null, or `expr1` otherwise. |
nvl2(expr1, expr2, expr3) | Returns `expr2` if `expr1` is not null, or `expr3` otherwise. |
CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END | When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`, or null if the ELSE clause is omitted. |
zeroifnull(expr) | Returns zero if `expr` is null, or `expr` otherwise. |
Examples
-- between
SELECT 0.5 between 0.1 AND 1.0;
+----------------------+
|between(0.5, 0.1, 1.0)|
+----------------------+
| true|
+----------------------+
-- coalesce
SELECT coalesce(NULL, 1, NULL);
+-----------------------+
|coalesce(NULL, 1, NULL)|
+-----------------------+
| 1|
+-----------------------+
-- if
SELECT if(1 < 2, 'a', 'b');
+-------------------+
|(IF((1 < 2), a, b))|
+-------------------+
| a|
+-------------------+
-- ifnull
SELECT ifnull(NULL, array('2'));
+----------------------+
|ifnull(NULL, array(2))|
+----------------------+
| [2]|
+----------------------+
-- nanvl
SELECT nanvl(cast('NaN' as double), 123);
+-------------------------------+
|nanvl(CAST(NaN AS DOUBLE), 123)|
+-------------------------------+
| 123.0|
+-------------------------------+
-- nullif
SELECT nullif(2, 2);
+------------+
|nullif(2, 2)|
+------------+
| NULL|
+------------+
-- nullifzero
SELECT nullifzero(0);
+-------------+
|nullifzero(0)|
+-------------+
| NULL|
+-------------+
SELECT nullifzero(2);
+-------------+
|nullifzero(2)|
+-------------+
| 2|
+-------------+
-- nvl
SELECT nvl(NULL, array('2'));
+-------------------+
|nvl(NULL, array(2))|
+-------------------+
| [2]|
+-------------------+
-- nvl2
SELECT nvl2(NULL, 2, 1);
+----------------+
|nvl2(NULL, 2, 1)|
+----------------+
| 1|
+----------------+
-- when
SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
+-----------------------------------------------------------+
|CASE WHEN (1 > 0) THEN 1 WHEN (2 > 0) THEN 2.0 ELSE 1.2 END|
+-----------------------------------------------------------+
| 1.0|
+-----------------------------------------------------------+
SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
+-----------------------------------------------------------+
|CASE WHEN (1 < 0) THEN 1 WHEN (2 > 0) THEN 2.0 ELSE 1.2 END|
+-----------------------------------------------------------+
| 2.0|
+-----------------------------------------------------------+
SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 END;
+--------------------------------------------------+
|CASE WHEN (1 < 0) THEN 1 WHEN (2 < 0) THEN 2.0 END|
+--------------------------------------------------+
| NULL|
+--------------------------------------------------+
-- zeroifnull
SELECT zeroifnull(NULL);
+----------------+
|zeroifnull(NULL)|
+----------------+
| 0|
+----------------+
SELECT zeroifnull(2);
+-------------+
|zeroifnull(2)|
+-------------+
| 2|
+-------------+
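The examples above exercise only one branch of several conditional functions. As a hedged sketch, the statements below cover the remaining branches described in the table, plus one common use of `nullifzero` as a divide-by-zero guard. The expected values in the comments follow from the descriptions above and from standard SQL null handling; result column headers are omitted because they depend on how Spark renders each expression.

```sql
-- NOT BETWEEN negates the range check; BETWEEN itself includes both bounds.
SELECT 5 NOT BETWEEN 1 AND 3;   -- true
SELECT 3 BETWEEN 1 AND 3;       -- true

-- nvl2 returns its second argument when the first is not null.
SELECT nvl2(3, 2, 1);           -- 2

-- nullif returns expr1 when the two arguments differ.
SELECT nullif(2, 3);            -- 2

-- coalesce returns null when every argument is null.
SELECT coalesce(NULL, NULL);    -- NULL

-- Guarding a division: a zero divisor becomes null, so the result is
-- NULL rather than a division by zero.
SELECT 10 / nullifzero(0);      -- NULL
```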
Hash Functions
Function | Description |
---|---|
crc32(expr) | Returns a cyclic redundancy check value of the `expr` as a bigint. |
hash(expr1, expr2, ...) | Returns a hash value of the arguments. |
md5(expr) | Returns the 128-bit MD5 checksum of `expr` as a hex string. |
sha(expr) | Returns the SHA-1 hash value of `expr` as a hex string. |
sha1(expr) | Returns the SHA-1 hash value of `expr` as a hex string. |
sha2(expr, bitLength) | Returns a checksum of the SHA-2 family for `expr` as a hex string. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. A bit length of 0 is equivalent to 256. |
xxhash64(expr1, expr2, ...) | Returns a 64-bit hash value of the arguments. Hash seed is 42. |
Examples
-- crc32
SELECT crc32('Spark');
+------------+
|crc32(Spark)|
+------------+
| 1557323817|
+------------+
-- hash
SELECT hash('Spark', array(123), 2);
+--------------------------+
|hash(Spark, array(123), 2)|
+--------------------------+
| -1321691492|
+--------------------------+
-- md5
SELECT md5('Spark');
+--------------------+
| md5(Spark)|
+--------------------+
|8cde774d6f7333752...|
+--------------------+
-- sha
SELECT sha('Spark');
+--------------------+
| sha(Spark)|
+--------------------+
|85f5955f4b27a9a4c...|
+--------------------+
-- sha1
SELECT sha1('Spark');
+--------------------+
| sha1(Spark)|
+--------------------+
|85f5955f4b27a9a4c...|
+--------------------+
-- sha2
SELECT sha2('Spark', 256);
+--------------------+
| sha2(Spark, 256)|
+--------------------+
|529bc3b07127ecb7e...|
+--------------------+
-- xxhash64
SELECT xxhash64('Spark', array(123), 2);
+------------------------------+
|xxhash64(Spark, array(123), 2)|
+------------------------------+
| 5602566077635097486|
+------------------------------+
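The result grids above truncate the longer digests, so a few properties stated in the table are not visible in the output. The sketch below checks them by comparison instead of printing full values; the column aliases are illustrative only.

```sql
-- Per the table above, a bit length of 0 is equivalent to 256.
SELECT sha2('Spark', 0) = sha2('Spark', 256) AS same_digest;   -- true

-- sha and sha1 are both described as returning the SHA-1 hash.
SELECT sha('Spark') = sha1('Spark') AS same_sha1;              -- true

-- Hex digest lengths: 128 bits is 32 hex characters, 256 bits is 64.
SELECT length(md5('Spark'))       AS md5_len,                  -- 32
       length(sha2('Spark', 256)) AS sha256_len;               -- 64
```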
CSV Functions
Function | Description |
---|---|
from_csv(csvStr, schema[, options]) | Returns a struct value with the given `csvStr` and `schema`. |
schema_of_csv(csv[, options]) | Returns the schema of a CSV string in DDL format. |
to_csv(expr[, options]) | Returns a CSV string for a given struct value. |
Examples
-- from_csv
SELECT from_csv('1, 0.8', 'a INT, b DOUBLE');
+----------------+
|from_csv(1, 0.8)|
+----------------+
| {1, 0.8}|
+----------------+
SELECT from_csv('26/08/2015', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
+--------------------+
|from_csv(26/08/2015)|
+--------------------+
|{2015-08-26 00:00...|
+--------------------+
-- schema_of_csv
SELECT schema_of_csv('1,abc');
+--------------------+
|schema_of_csv(1,abc)|
+--------------------+
|STRUCT<_c0: INT, ...|
+--------------------+
-- to_csv
SELECT to_csv(named_struct('a', 1, 'b', 2));
+--------------------------------+
|to_csv(named_struct(a, 1, b, 2))|
+--------------------------------+
| 1,2|
+--------------------------------+
SELECT to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+----------------------------------------------------------------+
|to_csv(named_struct(time, to_timestamp(2015-08-26, yyyy-MM-dd)))|
+----------------------------------------------------------------+
| 26/08/2015|
+----------------------------------------------------------------+
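Since `to_csv` and `from_csv` are inverses for a matching schema, a round trip is a quick way to see how they fit together. The following is a minimal sketch; the aliases `round_trip`, `s`, and `name`, and the `id`/`name` schema, are made up for illustration.

```sql
-- Serialize a struct to a CSV string, then parse it back with a matching
-- schema. The result is the struct {1, 2}.
SELECT from_csv(to_csv(named_struct('a', 1, 'b', 2)), 'a INT, b INT') AS round_trip;

-- from_csv returns a struct, so individual fields can be selected by name.
-- Here the selected field is 'abc'.
SELECT s.name
FROM (SELECT from_csv('1,abc', 'id INT, name STRING') AS s);
```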
JSON Functions
Function | Description |
---|---|
from_json(jsonStr, schema[, options]) | Returns a struct value with the given `jsonStr` and `schema`. |
get_json_object(json_txt, path) | Extracts a JSON object from `json_txt` at `path`. |
json_array_length(jsonArray) | Returns the number of elements in the outermost JSON array. |
json_object_keys(json_object) | Returns all the keys of the outermost JSON object as an array. |
json_tuple(jsonStr, p1, p2, ..., pn) | Returns a tuple like the function get_json_object, but takes multiple names. All input parameters and output columns are of string type. |
schema_of_json(json[, options]) | Returns the schema of a JSON string in DDL format. |
to_json(expr[, options]) | Returns a JSON string for a given struct value. |
Examples
-- from_json
SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
+---------------------------+
|from_json({"a":1, "b":0.8})|
+---------------------------+
| {1, 0.8}|
+---------------------------+
SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
+--------------------------------+
|from_json({"time":"26/08/2015"})|
+--------------------------------+
| {2015-08-26 00:00...|
+--------------------------------+
SELECT from_json('{"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]}', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
+--------------------------------------------------------------------------------------------------------+
|from_json({"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]})|
+--------------------------------------------------------------------------------------------------------+
| {Alice, [{Bob, 1}...|
+--------------------------------------------------------------------------------------------------------+
-- get_json_object
SELECT get_json_object('{"a":"b"}', '$.a');
+-------------------------------+
|get_json_object({"a":"b"}, $.a)|
+-------------------------------+
| b|
+-------------------------------+
-- json_array_length
SELECT json_array_length('[1,2,3,4]');
+----------------------------+
|json_array_length([1,2,3,4])|
+----------------------------+
| 4|
+----------------------------+
SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+------------------------------------------------+
|json_array_length([1,2,3,{"f1":1,"f2":[5,6]},4])|
+------------------------------------------------+
| 5|
+------------------------------------------------+
SELECT json_array_length('[1,2');
+-----------------------+
|json_array_length([1,2)|
+-----------------------+
| NULL|
+-----------------------+
-- json_object_keys
SELECT json_object_keys('{}');
+--------------------+
|json_object_keys({})|
+--------------------+
| []|
+--------------------+
SELECT json_object_keys('{"key": "value"}');
+----------------------------------+
|json_object_keys({"key": "value"})|
+----------------------------------+
| [key]|
+----------------------------------+
SELECT json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}');
+--------------------------------------------------------+
|json_object_keys({"f1":"abc","f2":{"f3":"a", "f4":"b"}})|
+--------------------------------------------------------+
| [f1, f2]|
+--------------------------------------------------------+
-- json_tuple
SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
+---+---+
| c0| c1|
+---+---+
| 1| 2|
+---+---+
-- schema_of_json
SELECT schema_of_json('[{"col":0}]');
+---------------------------+
|schema_of_json([{"col":0}])|
+---------------------------+
| ARRAY<STRUCT<col:...|
+---------------------------+
SELECT schema_of_json('[{"col":01}]', map('allowNumericLeadingZeros', 'true'));
+----------------------------+
|schema_of_json([{"col":01}])|
+----------------------------+
| ARRAY<STRUCT<col:...|
+----------------------------+
-- to_json
SELECT to_json(named_struct('a', 1, 'b', 2));
+---------------------------------+
|to_json(named_struct(a, 1, b, 2))|
+---------------------------------+
| {"a":1,"b":2}|
+---------------------------------+
SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+-----------------------------------------------------------------+
|to_json(named_struct(time, to_timestamp(2015-08-26, yyyy-MM-dd)))|
+-----------------------------------------------------------------+
| {"time":"26/08/20...|
+-----------------------------------------------------------------+
SELECT to_json(array(named_struct('a', 1, 'b', 2)));
+----------------------------------------+
|to_json(array(named_struct(a, 1, b, 2)))|
+----------------------------------------+
| [{"a":1,"b":2}]|
+----------------------------------------+
SELECT to_json(map('a', named_struct('b', 1)));
+-----------------------------------+
|to_json(map(a, named_struct(b, 1)))|
+-----------------------------------+
| {"a":{"b":1}}|
+-----------------------------------+
SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)));
+----------------------------------------------------+
|to_json(map(named_struct(a, 1), named_struct(b, 2)))|
+----------------------------------------------------+
| {"[1]":{"b":2}}|
+----------------------------------------------------+
SELECT to_json(map('a', 1));
+------------------+
|to_json(map(a, 1))|
+------------------+
| {"a":1}|
+------------------+
SELECT to_json(array(map('a', 1)));
+-------------------------+
|to_json(array(map(a, 1)))|
+-------------------------+
| [{"a":1}]|
+-------------------------+
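The `json_array_length` and `json_object_keys` results above, including the NULL returned for the malformed input `'[1,2'`, can be approximated in plain Python. This is only a sketch built on the standard `json` module, not Spark's implementation; the helper names mirror the SQL functions for readability, and returning `None` for valid JSON of the wrong kind (for example an object passed to `json_array_length`) is an assumption of the sketch.

```python
import json


def json_array_length(json_str):
    """Length of the outermost JSON array, or None for invalid or non-array input."""
    try:
        value = json.loads(json_str)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a ValueError; TypeError covers a None argument.
        return None
    return len(value) if isinstance(value, list) else None


def json_object_keys(json_str):
    """Top-level keys of a JSON object, or None for invalid or non-object input."""
    try:
        value = json.loads(json_str)
    except (TypeError, ValueError):
        return None
    # Only the outermost keys are listed; nested objects are not descended into.
    return list(value) if isinstance(value, dict) else None
```

Nested structures count as a single element or key, which is why the five-element array and the `[f1, f2]` result above ignore the inner `f2` contents.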
XML Functions
Function | Description |
---|---|
from_xml(xmlStr, schema[, options]) | Returns a struct value with the given `xmlStr` and `schema`. |
schema_of_xml(xml[, options]) | Returns the schema of an XML string in DDL format. |
to_xml(expr[, options]) | Returns an XML string with a given struct value. |
xpath(xml, xpath) | Returns a string array of values within the nodes of xml that match the XPath expression. |
xpath_boolean(xml, xpath) | Returns true if the XPath expression evaluates to true, or if a matching node is found. |
xpath_double(xml, xpath) | Returns a double value, or zero if no match is found, or NaN if a match is found but the value is non-numeric. |
xpath_float(xml, xpath) | Returns a float value, or zero if no match is found, or NaN if a match is found but the value is non-numeric. |
xpath_int(xml, xpath) | Returns an integer value, or zero if no match is found or if a match is found but the value is non-numeric. |
xpath_long(xml, xpath) | Returns a long integer value, or zero if no match is found or if a match is found but the value is non-numeric. |
xpath_number(xml, xpath) | Returns a double value, or zero if no match is found, or NaN if a match is found but the value is non-numeric. |
xpath_short(xml, xpath) | Returns a short integer value, or zero if no match is found or if a match is found but the value is non-numeric. |
xpath_string(xml, xpath) | Returns the text contents of the first xml node that matches the XPath expression. |
Examples
-- from_xml
SELECT from_xml('<p><a>1</a><b>0.8</b></p>', 'a INT, b DOUBLE');
+-----------------------------------+
|from_xml(<p><a>1</a><b>0.8</b></p>)|
+-----------------------------------+
| {1, 0.8}|
+-----------------------------------+
SELECT from_xml('<p><time>26/08/2015</time></p>', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
+----------------------------------------+
|from_xml(<p><time>26/08/2015</time></p>)|
+----------------------------------------+
| {2015-08-26 00:00...|
+----------------------------------------+
SELECT from_xml('<p><teacher>Alice</teacher><student><name>Bob</name><rank>1</rank></student><student><name>Charlie</name><rank>2</rank></student></p>', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
+-----------------------------------------------------------------------------------------------------------------------------------------------+
|from_xml(<p><teacher>Alice</teacher><student><name>Bob</name><rank>1</rank></student><student><name>Charlie</name><rank>2</rank></student></p>)|
+-----------------------------------------------------------------------------------------------------------------------------------------------+
| {Alice, [{Bob, 1}...|
+-----------------------------------------------------------------------------------------------------------------------------------------------+
-- schema_of_xml
SELECT schema_of_xml('<p><a>1</a></p>');
+------------------------------+
|schema_of_xml(<p><a>1</a></p>)|
+------------------------------+
| STRUCT<a: BIGINT>|
+------------------------------+
SELECT schema_of_xml('<p><a attr="2">1</a><a>3</a></p>', map('excludeAttribute', 'true'));
+-----------------------------------------------+
|schema_of_xml(<p><a attr="2">1</a><a>3</a></p>)|
+-----------------------------------------------+
| STRUCT<a: ARRAY<B...|
+-----------------------------------------------+
-- to_xml
SELECT to_xml(named_struct('a', 1, 'b', 2));
+--------------------------------+
|to_xml(named_struct(a, 1, b, 2))|
+--------------------------------+
| <ROW>\n <a>1</...|
+--------------------------------+
SELECT to_xml(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+----------------------------------------------------------------+
|to_xml(named_struct(time, to_timestamp(2015-08-26, yyyy-MM-dd)))|
+----------------------------------------------------------------+
| <ROW>\n <time>...|
+----------------------------------------------------------------+
-- xpath
SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()');
+-----------------------------------------------------------------------+
|xpath(<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>, a/b/text())|
+-----------------------------------------------------------------------+
| [b1, b2, b3]|
+-----------------------------------------------------------------------+
SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b');
+----------------------------------------------------------------+
|xpath(<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>, a/b)|
+----------------------------------------------------------------+
| [NULL, NULL, NULL]|
+----------------------------------------------------------------+
-- xpath_boolean
SELECT xpath_boolean('<a><b>1</b></a>','a/b');
+-----------------------------------+
|xpath_boolean(<a><b>1</b></a>, a/b)|
+-----------------------------------+
| true|
+-----------------------------------+
-- xpath_double
SELECT xpath_double('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
+-----------------------------------------------+
|xpath_double(<a><b>1</b><b>2</b></a>, sum(a/b))|
+-----------------------------------------------+
| 3.0|
+-----------------------------------------------+
-- xpath_float
SELECT xpath_float('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
+----------------------------------------------+
|xpath_float(<a><b>1</b><b>2</b></a>, sum(a/b))|
+----------------------------------------------+
| 3.0|
+----------------------------------------------+
-- xpath_int
SELECT xpath_int('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
+--------------------------------------------+
|xpath_int(<a><b>1</b><b>2</b></a>, sum(a/b))|
+--------------------------------------------+
| 3|
+--------------------------------------------+
-- xpath_long
SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
+---------------------------------------------+
|xpath_long(<a><b>1</b><b>2</b></a>, sum(a/b))|
+---------------------------------------------+
| 3|
+---------------------------------------------+
-- xpath_number
SELECT xpath_number('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
+-----------------------------------------------+
|xpath_number(<a><b>1</b><b>2</b></a>, sum(a/b))|
+-----------------------------------------------+
| 3.0|
+-----------------------------------------------+
-- xpath_short
SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
+----------------------------------------------+
|xpath_short(<a><b>1</b><b>2</b></a>, sum(a/b))|
+----------------------------------------------+
| 3|
+----------------------------------------------+
-- xpath_string
SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c');
+-------------------------------------------+
|xpath_string(<a><b>b</b><c>cc</c></a>, a/c)|
+-------------------------------------------+
| cc|
+-------------------------------------------+
URL Functions
Function | Description |
---|---|
parse_url(url, partToExtract[, key]) | Extracts a part from a URL. |
try_url_decode(str) | This is a special version of `url_decode` that performs the same operation, but returns a NULL value instead of raising an error if the decoding cannot be performed. |
url_decode(str) | Decodes a `str` in 'application/x-www-form-urlencoded' format using a specific encoding scheme. |
url_encode(str) | Translates a string into 'application/x-www-form-urlencoded' format using a specific encoding scheme. |
Examples
-- parse_url
SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST');
+-----------------------------------------------------+
|parse_url(http://spark.apache.org/path?query=1, HOST)|
+-----------------------------------------------------+
| spark.apache.org|
+-----------------------------------------------------+
SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY');
+------------------------------------------------------+
|parse_url(http://spark.apache.org/path?query=1, QUERY)|
+------------------------------------------------------+
| query=1|
+------------------------------------------------------+
SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY', 'query');
+-------------------------------------------------------------+
|parse_url(http://spark.apache.org/path?query=1, QUERY, query)|
+-------------------------------------------------------------+
| 1|
+-------------------------------------------------------------+
-- try_url_decode
SELECT try_url_decode('https%3A%2F%2Fspark.apache.org');
+----------------------------------------------+
|try_url_decode(https%3A%2F%2Fspark.apache.org)|
+----------------------------------------------+
| https://spark.apa...|
+----------------------------------------------+
-- url_decode
SELECT url_decode('https%3A%2F%2Fspark.apache.org');
+------------------------------------------+
|url_decode(https%3A%2F%2Fspark.apache.org)|
+------------------------------------------+
| https://spark.apa...|
+------------------------------------------+
-- url_encode
SELECT url_encode('https://spark.apache.org');
+------------------------------------+
|url_encode(https://spark.apache.org)|
+------------------------------------+
| https%3A%2F%2Fspa...|
+------------------------------------+
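For readers who want to see the full, untruncated values behind the URL examples above, the same operations can be sketched with Python's standard `urllib.parse`. This is an illustration of the 'application/x-www-form-urlencoded' format, not Spark's code path; it assumes UTF-8, and the strict `try_url_decode` helper (including its regular expression for spotting malformed `%` escapes) is a hypothetical stand-in for the NULL-on-failure behaviour described in the table.

```python
import re
from urllib.parse import parse_qs, quote_plus, unquote_plus, urlsplit

# A '%' that is not followed by two hex digits cannot be decoded.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def try_url_decode(s):
    """Decode form-urlencoded text; return None if a '%' escape is malformed."""
    if _BAD_ESCAPE.search(s):
        return None
    return unquote_plus(s)


# Counterparts of parse_url(url, 'HOST'), (url, 'QUERY') and (url, 'QUERY', 'query').
parts = urlsplit("http://spark.apache.org/path?query=1")
host = parts.hostname
query = parts.query
query_value = parse_qs(parts.query)["query"][0]

# Counterparts of url_encode and try_url_decode.
encoded = quote_plus("https://spark.apache.org")
decoded = try_url_decode(encoded)
```

Here `encoded` is `https%3A%2F%2Fspark.apache.org` and `decoded` round-trips to the original URL, while `try_url_decode('%zz')` yields `None` instead of raising.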
Bitwise Functions
Function | Description |
---|---|
expr1 & expr2 | Returns the result of bitwise AND of `expr1` and `expr2`. |
base << exp | Bitwise left shift. |
base >> expr | Bitwise (signed) right shift. |
base >>> expr | Bitwise unsigned right shift. |
expr1 ^ expr2 | Returns the result of bitwise exclusive OR of `expr1` and `expr2`. |
bit_count(expr) | Returns the number of bits that are set in the argument expr as an unsigned 64-bit integer, or NULL if the argument is NULL. |
bit_get(expr, pos) | Returns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative. |
getbit(expr, pos) | Returns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative. |
base shiftleft exp | Bitwise left shift. |
base shiftright expr | Bitwise (signed) right shift. |
base shiftrightunsigned expr | Bitwise unsigned right shift. |
expr1 | expr2 | Returns the result of bitwise OR of `expr1` and `expr2`. |
~ expr | Returns the result of bitwise NOT of `expr`. |
Examples
-- &
SELECT 3 & 5;
+-------+
|(3 & 5)|
+-------+
| 1|
+-------+
-- <<
SELECT shiftleft(2, 1);
+---------------+
|shiftleft(2, 1)|
+---------------+
| 4|
+---------------+
SELECT 2 << 1;
+--------+
|(2 << 1)|
+--------+
| 4|
+--------+
-- >>
SELECT shiftright(4, 1);
+----------------+
|shiftright(4, 1)|
+----------------+
| 2|
+----------------+
SELECT 4 >> 1;
+--------+
|(4 >> 1)|
+--------+
| 2|
+--------+
-- >>>
SELECT shiftrightunsigned(4, 1);
+------------------------+
|shiftrightunsigned(4, 1)|
+------------------------+
| 2|
+------------------------+
SELECT 4 >>> 1;
+---------+
|(4 >>> 1)|
+---------+
| 2|
+---------+
-- ^
SELECT 3 ^ 5;
+-------+
|(3 ^ 5)|
+-------+
| 6|
+-------+
-- bit_count
SELECT bit_count(0);
+------------+
|bit_count(0)|
+------------+
| 0|
+------------+
-- bit_get
SELECT bit_get(11, 0);
+--------------+
|bit_get(11, 0)|
+--------------+
| 1|
+--------------+
SELECT bit_get(11, 2);
+--------------+
|bit_get(11, 2)|
+--------------+
| 0|
+--------------+
-- getbit
SELECT getbit(11, 0);
+-------------+
|getbit(11, 0)|
+-------------+
| 1|
+-------------+
SELECT getbit(11, 2);
+-------------+
|getbit(11, 2)|
+-------------+
| 0|
+-------------+
-- shiftleft
SELECT shiftleft(2, 1);
+---------------+
|shiftleft(2, 1)|
+---------------+
| 4|
+---------------+
SELECT 2 << 1;
+--------+
|(2 << 1)|
+--------+
| 4|
+--------+
-- shiftright
SELECT shiftright(4, 1);
+----------------+
|shiftright(4, 1)|
+----------------+
| 2|
+----------------+
SELECT 4 >> 1;
+--------+
|(4 >> 1)|
+--------+
| 2|
+--------+
-- shiftrightunsigned
SELECT shiftrightunsigned(4, 1);
+------------------------+
|shiftrightunsigned(4, 1)|
+------------------------+
| 2|
+------------------------+
SELECT 4 >>> 1;
+---------+
|(4 >>> 1)|
+---------+
| 2|
+---------+
-- |
SELECT 3 | 5;
+-------+
|(3 | 5)|
+-------+
| 7|
+-------+
-- ~
SELECT ~ 0;
+---+
| ~0|
+---+
| -1|
+---+
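The shift examples above use only the positive value 4, for which the signed and unsigned right shifts agree. The difference shows up with negative operands, which the following plain-Python sketch illustrates. It assumes a 32-bit signed integer operand and a shift amount between 0 and 31; the function names mirror the SQL ones for readability, and the code is an emulation, not Spark's implementation.

```python
MASK32 = 0xFFFFFFFF


def to_int32(x):
    """Wrap a Python int into the signed 32-bit two's-complement range."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def shiftleft(base, n):
    # Bits shifted past position 31 are discarded.
    return to_int32(base << n)


def shiftright(base, n):
    # Python's >> on a negative int is arithmetic, so the sign is preserved.
    return to_int32(base) >> n


def shiftrightunsigned(base, n):
    # Reinterpret the operand as unsigned first, so zeros enter from the left.
    return to_int32((base & MASK32) >> n)


def bit_get(value, pos):
    """Bit at position pos, counted from the right starting at zero."""
    if pos < 0:
        raise ValueError("pos must not be negative")
    return (value >> pos) & 1
```

With these definitions `shiftright(-4, 1)` gives -2, whereas `shiftrightunsigned(-4, 1)` gives 2147483646 because the sign bit is replaced by zero. `bit_get(11, 2)` is 0 because 11 is `1011` in binary.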
Conversion Functions
Function | Description |
---|---|
bigint(expr) | Casts the value `expr` to the target data type `bigint`. |
binary(expr) | Casts the value `expr` to the target data type `binary`. |
boolean(expr) | Casts the value `expr` to the target data type `boolean`. |
cast(expr AS type) | Casts the value `expr` to the target data type `type`. The alternative casting syntax `expr :: type` is also supported. |
date(expr) | Casts the value `expr` to the target data type `date`. |
decimal(expr) | Casts the value `expr` to the target data type `decimal`. |
double(expr) | Casts the value `expr` to the target data type `double`. |
float(expr) | Casts the value `expr` to the target data type `float`. |
int(expr) | Casts the value `expr` to the target data type `int`. |
smallint(expr) | Casts the value `expr` to the target data type `smallint`. |
string(expr) | Casts the value `expr` to the target data type `string`. |
timestamp(expr) | Casts the value `expr` to the target data type `timestamp`. |
tinyint(expr) | Casts the value `expr` to the target data type `tinyint`. |
Examples
-- cast
SELECT cast('10' as int);
+---------------+
|CAST(10 AS INT)|
+---------------+
| 10|
+---------------+
SELECT '10' :: int;
+---------------+
|CAST(10 AS INT)|
+---------------+
| 10|
+---------------+
Predicate Functions
Function | Description |
---|---|
! expr | Logical not. |
expr1 < expr2 | Returns true if `expr1` is less than `expr2`. |
expr1 <= expr2 | Returns true if `expr1` is less than or equal to `expr2`. |
expr1 <=> expr2 | Returns same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of them is null. |
expr1 = expr2 | Returns true if `expr1` equals `expr2`, or false otherwise. |
expr1 == expr2 | Returns true if `expr1` equals `expr2`, or false otherwise. |
expr1 > expr2 | Returns true if `expr1` is greater than `expr2`. |
expr1 >= expr2 | Returns true if `expr1` is greater than or equal to `expr2`. |
expr1 and expr2 | Logical AND. |
equal_null(expr1, expr2) | Returns same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of them is null. |
str ilike pattern[ ESCAPE escape] | Returns true if str matches `pattern` with `escape` case-insensitively, null if any arguments are null, false otherwise. |
expr1 in(expr2, expr3, ...) | Returns true if `expr1` equals any of `expr2`, `expr3`, and so on. |
isnan(expr) | Returns true if `expr` is NaN, or false otherwise. |
isnotnull(expr) | Returns true if `expr` is not null, or false otherwise. |
isnull(expr) | Returns true if `expr` is null, or false otherwise. |
str like pattern[ ESCAPE escape] | Returns true if str matches `pattern` with `escape`, null if any arguments are null, false otherwise. |
not expr | Logical not. |
expr1 or expr2 | Logical OR. |
regexp(str, regexp) | Returns true if `str` matches `regexp`, or false otherwise. |
regexp_like(str, regexp) | Returns true if `str` matches `regexp`, or false otherwise. |
rlike(str, regexp) | Returns true if `str` matches `regexp`, or false otherwise. |
Examples
-- !
SELECT ! true;
+----------+
|(NOT true)|
+----------+
| false|
+----------+
SELECT ! false;
+-----------+
|(NOT false)|
+-----------+
| true|
+-----------+
SELECT ! NULL;
+----------+
|(NOT NULL)|
+----------+
| NULL|
+----------+
-- <
SELECT 1 < 2;
+-------+
|(1 < 2)|
+-------+
| true|
+-------+
SELECT 1.1 < '1';
+---------+
|(1.1 < 1)|
+---------+
| false|
+---------+
SELECT to_date('2009-07-30 04:17:52') < to_date('2009-07-30 04:17:52');
+-------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) < to_date(2009-07-30 04:17:52))|
+-------------------------------------------------------------+
| false|
+-------------------------------------------------------------+
SELECT to_date('2009-07-30 04:17:52') < to_date('2009-08-01 04:17:52');
+-------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) < to_date(2009-08-01 04:17:52))|
+-------------------------------------------------------------+
| true|
+-------------------------------------------------------------+
SELECT 1 < NULL;
+----------+
|(1 < NULL)|
+----------+
| NULL|
+----------+
-- <=
SELECT 2 <= 2;
+--------+
|(2 <= 2)|
+--------+
| true|
+--------+
SELECT 1.0 <= '1';
+----------+
|(1.0 <= 1)|
+----------+
| true|
+----------+
SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-07-30 04:17:52');
+--------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) <= to_date(2009-07-30 04:17:52))|
+--------------------------------------------------------------+
| true|
+--------------------------------------------------------------+
SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-08-01 04:17:52');
+--------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) <= to_date(2009-08-01 04:17:52))|
+--------------------------------------------------------------+
| true|
+--------------------------------------------------------------+
SELECT 1 <= NULL;
+-----------+
|(1 <= NULL)|
+-----------+
| NULL|
+-----------+
-- <=>
SELECT 2 <=> 2;
+---------+
|(2 <=> 2)|
+---------+
| true|
+---------+
SELECT 1 <=> '1';
+---------+
|(1 <=> 1)|
+---------+
| true|
+---------+
SELECT true <=> NULL;
+---------------+
|(true <=> NULL)|
+---------------+
| false|
+---------------+
SELECT NULL <=> NULL;
+---------------+
|(NULL <=> NULL)|
+---------------+
| true|
+---------------+
-- =
SELECT 2 = 2;
+-------+
|(2 = 2)|
+-------+
| true|
+-------+
SELECT 1 = '1';
+-------+
|(1 = 1)|
+-------+
| true|
+-------+
SELECT true = NULL;
+-------------+
|(true = NULL)|
+-------------+
| NULL|
+-------------+
SELECT NULL = NULL;
+-------------+
|(NULL = NULL)|
+-------------+
| NULL|
+-------------+
-- ==
SELECT 2 == 2;
+-------+
|(2 = 2)|
+-------+
| true|
+-------+
SELECT 1 == '1';
+-------+
|(1 = 1)|
+-------+
| true|
+-------+
SELECT true == NULL;
+-------------+
|(true = NULL)|
+-------------+
| NULL|
+-------------+
SELECT NULL == NULL;
+-------------+
|(NULL = NULL)|
+-------------+
| NULL|
+-------------+
-- >
SELECT 2 > 1;
+-------+
|(2 > 1)|
+-------+
| true|
+-------+
SELECT 2 > 1.1;
+-------+
|(2 > 1)|
+-------+
| true|
+-------+
SELECT to_date('2009-07-30 04:17:52') > to_date('2009-07-30 04:17:52');
+-------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) > to_date(2009-07-30 04:17:52))|
+-------------------------------------------------------------+
| false|
+-------------------------------------------------------------+
SELECT to_date('2009-07-30 04:17:52') > to_date('2009-08-01 04:17:52');
+-------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) > to_date(2009-08-01 04:17:52))|
+-------------------------------------------------------------+
| false|
+-------------------------------------------------------------+
SELECT 1 > NULL;
+----------+
|(1 > NULL)|
+----------+
| NULL|
+----------+
-- >=
SELECT 2 >= 1;
+--------+
|(2 >= 1)|
+--------+
| true|
+--------+
SELECT 2.0 >= '2.1';
+------------+
|(2.0 >= 2.1)|
+------------+
| false|
+------------+
SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-07-30 04:17:52');
+--------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) >= to_date(2009-07-30 04:17:52))|
+--------------------------------------------------------------+
| true|
+--------------------------------------------------------------+
SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-08-01 04:17:52');
+--------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) >= to_date(2009-08-01 04:17:52))|
+--------------------------------------------------------------+
| false|
+--------------------------------------------------------------+
SELECT 1 >= NULL;
+-----------+
|(1 >= NULL)|
+-----------+
| NULL|
+-----------+
-- and
SELECT true and true;
+---------------+
|(true AND true)|
+---------------+
| true|
+---------------+
SELECT true and false;
+----------------+
|(true AND false)|
+----------------+
| false|
+----------------+
SELECT true and NULL;
+---------------+
|(true AND NULL)|
+---------------+
| NULL|
+---------------+
SELECT false and NULL;
+----------------+
|(false AND NULL)|
+----------------+
| false|
+----------------+
-- equal_null
SELECT equal_null(3, 3);
+----------------+
|equal_null(3, 3)|
+----------------+
| true|
+----------------+
SELECT equal_null(1, '11');
+-----------------+
|equal_null(1, 11)|
+-----------------+
| false|
+-----------------+
SELECT equal_null(true, NULL);
+----------------------+
|equal_null(true, NULL)|
+----------------------+
| false|
+----------------------+
SELECT equal_null(NULL, 'abc');
+---------------------+
|equal_null(NULL, abc)|
+---------------------+
| false|
+---------------------+
SELECT equal_null(NULL, NULL);
+----------------------+
|equal_null(NULL, NULL)|
+----------------------+
| true|
+----------------------+
-- ilike
SELECT ilike('Spark', '_Park');
+-------------------+
|ilike(Spark, _Park)|
+-------------------+
| true|
+-------------------+
SELECT '\\abc' AS S, S ilike r'\\abc', S ilike '\\\\abc';
+----+--------------------------------------+--------------------------------------+
| S|ilike(lateralAliasReference(S), \\abc)|ilike(lateralAliasReference(S), \\abc)|
+----+--------------------------------------+--------------------------------------+
|\abc| true| true|
+----+--------------------------------------+--------------------------------------+
SET spark.sql.parser.escapedStringLiterals=true;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....| true|
+--------------------+-----+
SELECT '%SystemDrive%\Users\John' ilike '\%SystemDrive\%\\users%';
+--------------------------------------------------------+
|ilike(%SystemDrive%\Users\John, \%SystemDrive\%\\users%)|
+--------------------------------------------------------+
| true|
+--------------------------------------------------------+
SET spark.sql.parser.escapedStringLiterals=false;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....|false|
+--------------------+-----+
SELECT '%SystemDrive%\\USERS\\John' ilike r'%SystemDrive%\\Users%';
+------------------------------------------------------+
|ilike(%SystemDrive%\USERS\John, %SystemDrive%\\Users%)|
+------------------------------------------------------+
| true|
+------------------------------------------------------+
SELECT '%SystemDrive%/Users/John' ilike '/%SYSTEMDrive/%//Users%' ESCAPE '/';
+--------------------------------------------------------+
|ilike(%SystemDrive%/Users/John, /%SYSTEMDrive/%//Users%)|
+--------------------------------------------------------+
| true|
+--------------------------------------------------------+
-- in
SELECT 1 in(1, 2, 3);
+----------------+
|(1 IN (1, 2, 3))|
+----------------+
| true|
+----------------+
SELECT 1 in(2, 3, 4);
+----------------+
|(1 IN (2, 3, 4))|
+----------------+
| false|
+----------------+
SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 1), named_struct('a', 1, 'b', 3));
+----------------------------------------------------------------------------------+
|(named_struct(a, 1, b, 2) IN (named_struct(a, 1, b, 1), named_struct(a, 1, b, 3)))|
+----------------------------------------------------------------------------------+
| false|
+----------------------------------------------------------------------------------+
SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 2), named_struct('a', 1, 'b', 3));
+----------------------------------------------------------------------------------+
|(named_struct(a, 1, b, 2) IN (named_struct(a, 1, b, 2), named_struct(a, 1, b, 3)))|
+----------------------------------------------------------------------------------+
| true|
+----------------------------------------------------------------------------------+
-- isnan
SELECT isnan(cast('NaN' as double));
+--------------------------+
|isnan(CAST(NaN AS DOUBLE))|
+--------------------------+
| true|
+--------------------------+
-- isnotnull
SELECT isnotnull(1);
+---------------+
|(1 IS NOT NULL)|
+---------------+
| true|
+---------------+
-- isnull
SELECT isnull(1);
+-----------+
|(1 IS NULL)|
+-----------+
| false|
+-----------+
-- like
SELECT like('Spark', '_park');
+----------------+
|Spark LIKE _park|
+----------------+
| true|
+----------------+
SELECT '\\abc' AS S, S like r'\\abc', S like '\\\\abc';
+----+-----------------------------------+-----------------------------------+
| S|lateralAliasReference(S) LIKE \\abc|lateralAliasReference(S) LIKE \\abc|
+----+-----------------------------------+-----------------------------------+
|\abc| true| true|
+----+-----------------------------------+-----------------------------------+
SET spark.sql.parser.escapedStringLiterals=true;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....| true|
+--------------------+-----+
SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\\Users%';
+-----------------------------------------------------+
|%SystemDrive%\Users\John LIKE \%SystemDrive\%\\Users%|
+-----------------------------------------------------+
| true|
+-----------------------------------------------------+
SET spark.sql.parser.escapedStringLiterals=false;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....|false|
+--------------------+-----+
SELECT '%SystemDrive%\\Users\\John' like r'%SystemDrive%\\Users%';
+---------------------------------------------------+
|%SystemDrive%\Users\John LIKE %SystemDrive%\\Users%|
+---------------------------------------------------+
| true|
+---------------------------------------------------+
SELECT '%SystemDrive%/Users/John' like '/%SystemDrive/%//Users%' ESCAPE '/';
+----------------------------------------------------------------+
|%SystemDrive%/Users/John LIKE /%SystemDrive/%//Users% ESCAPE '/'|
+----------------------------------------------------------------+
| true|
+----------------------------------------------------------------+
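The `like` and `ilike` examples above rely on three rules: `_` matches exactly one character, `%` matches zero or more characters, and the escape character (backslash by default, or the one given with ESCAPE) makes the following character literal. A minimal Python sketch of that translation into a regular expression follows. The helper names are hypothetical, string-literal parsing is not modelled (patterns are given as they look after the SQL parser has processed them), and treating any escaped character as a literal is a simplification; the engine may reject some escape sequences.

```python
import re


def like_to_regex(pattern, escape="\\"):
    """Translate a LIKE pattern into an equivalent regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            if i + 1 >= len(pattern):
                raise ValueError("pattern must not end with the escape character")
            # The escaped character is matched literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "_":
            out.append(".")      # exactly one character
        elif ch == "%":
            out.append(".*")     # zero or more characters
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def like(s, pattern, escape="\\", ignore_case=False):
    """True if the whole string matches the LIKE pattern."""
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.fullmatch(like_to_regex(pattern, escape), s, flags) is not None
```

For example, `like("Spark", "_Park")` is false but becomes true with `ignore_case=True`, which is the `ilike` case. With `escape="/"`, the pattern `/%SystemDrive/%//Users%` matches `%SystemDrive%/Users/John` because `/%` and `//` stand for a literal `%` and `/`.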
-- not
SELECT not true;
+----------+
|(NOT true)|
+----------+
| false|
+----------+
SELECT not false;
+-----------+
|(NOT false)|
+-----------+
| true|
+-----------+
SELECT not NULL;
+----------+
|(NOT NULL)|
+----------+
| NULL|
+----------+
-- or
SELECT true or false;
+---------------+
|(true OR false)|
+---------------+
| true|
+---------------+
SELECT false or false;
+----------------+
|(false OR false)|
+----------------+
| false|
+----------------+
SELECT true or NULL;
+--------------+
|(true OR NULL)|
+--------------+
| true|
+--------------+
SELECT false or NULL;
+---------------+
|(false OR NULL)|
+---------------+
| NULL|
+---------------+
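The `!`, `and`, `or`, `=`, `<=>` and `equal_null` results above all follow SQL's three-valued logic, where NULL stands for "unknown". A compact way to check one's reading of those tables is the plain-Python sketch below, with `None` playing the role of NULL. The function names are hypothetical and the code only restates the truth tables shown in the examples; it does not model type coercion such as `1 = '1'`.

```python
def sql_not(a):
    # NOT NULL stays NULL.
    return None if a is None else not a


def sql_and(a, b):
    # A single false operand decides the result, even if the other is NULL.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def sql_or(a, b):
    # A single true operand decides the result, even if the other is NULL.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def sql_eq(a, b):
    # The = and == operators: NULL if either side is NULL.
    return None if a is None or b is None else a == b


def equal_null(a, b):
    # The <=> operator and equal_null: never NULL.
    if a is None or b is None:
        return a is None and b is None
    return a == b
```

This reproduces, for instance, `false and NULL` being false while `true and NULL` is NULL, and `NULL <=> NULL` being true while `NULL = NULL` is NULL.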
-- regexp
SET spark.sql.parser.escapedStringLiterals=true;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....| true|
+--------------------+-----+
SELECT regexp('%SystemDrive%\Users\John', '%SystemDrive%\\Users.*');
+--------------------------------------------------------+
|REGEXP(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+--------------------------------------------------------+
| true|
+--------------------------------------------------------+
SET spark.sql.parser.escapedStringLiterals=false;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....|false|
+--------------------+-----+
SELECT regexp('%SystemDrive%\\Users\\John', '%SystemDrive%\\\\Users.*');
+--------------------------------------------------------+
|REGEXP(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+--------------------------------------------------------+
| true|
+--------------------------------------------------------+
SELECT regexp('%SystemDrive%\\Users\\John', r'%SystemDrive%\\Users.*');
+--------------------------------------------------------+
|REGEXP(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+--------------------------------------------------------+
| true|
+--------------------------------------------------------+
-- regexp_like
SET spark.sql.parser.escapedStringLiterals=true;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....| true|
+--------------------+-----+
SELECT regexp_like('%SystemDrive%\Users\John', '%SystemDrive%\\Users.*');
+-------------------------------------------------------------+
|REGEXP_LIKE(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+-------------------------------------------------------------+
| true|
+-------------------------------------------------------------+
SET spark.sql.parser.escapedStringLiterals=false;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....|false|
+--------------------+-----+
SELECT regexp_like('%SystemDrive%\\Users\\John', '%SystemDrive%\\\\Users.*');
+-------------------------------------------------------------+
|REGEXP_LIKE(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+-------------------------------------------------------------+
| true|
+-------------------------------------------------------------+
SELECT regexp_like('%SystemDrive%\\Users\\John', r'%SystemDrive%\\Users.*');
+-------------------------------------------------------------+
|REGEXP_LIKE(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+-------------------------------------------------------------+
| true|
+-------------------------------------------------------------+
-- rlike
SET spark.sql.parser.escapedStringLiterals=true;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....| true|
+--------------------+-----+
SELECT rlike('%SystemDrive%\Users\John', '%SystemDrive%\\Users.*');
+-------------------------------------------------------+
|RLIKE(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+-------------------------------------------------------+
| true|
+-------------------------------------------------------+
SET spark.sql.parser.escapedStringLiterals=false;
+--------------------+-----+
| key|value|
+--------------------+-----+
|spark.sql.parser....|false|
+--------------------+-----+
SELECT rlike('%SystemDrive%\\Users\\John', '%SystemDrive%\\\\Users.*');
+-------------------------------------------------------+
|RLIKE(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+-------------------------------------------------------+
| true|
+-------------------------------------------------------+
SELECT rlike('%SystemDrive%\\Users\\John', r'%SystemDrive%\\Users.*');
+-------------------------------------------------------+
|RLIKE(%SystemDrive%\Users\John, %SystemDrive%\\Users.*)|
+-------------------------------------------------------+
| true|
+-------------------------------------------------------+
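The `regexp`, `regexp_like` and `rlike` examples above differ only in how many backslashes the string literal needs before the regular expression engine sees an escaped backslash. The following sketch reproduces the same idea with Python's `re` module, used here purely as a stand-in for the Java regular expressions that Spark evaluates. Python raw strings play the role of `r'...'` literals, and ordinary Python strings play the role of literals parsed with `spark.sql.parser.escapedStringLiterals=false`.

```python
import re

# The value contains single backslashes, as in the examples above.
value = r"%SystemDrive%\Users\John"

# Raw literal: the regex engine receives \\ , which matches one backslash.
raw_pattern = r"%SystemDrive%\\Users.*"

# Escaped literal: the string parser first collapses \\\\ into \\ ,
# so the regex engine receives exactly the same pattern text.
escaped_pattern = "%SystemDrive%\\\\Users.*"

# search() succeeds if the pattern matches anywhere in the value.
raw_matches = re.search(raw_pattern, value) is not None
escaped_matches = re.search(escaped_pattern, value) is not None

if __name__ == "__main__":
    print(raw_pattern == escaped_pattern, raw_matches, escaped_matches)
```

Both spellings produce the same pattern text, which is why the result grids above are identical.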
Misc Functions
Function | Description |
---|---|
aes_decrypt(expr, key[, mode[, padding[, aad]]]) | Returns a decrypted value of `expr` using AES in `mode` with `padding`. Key lengths of 16, 24 and 32 bytes are supported. Supported combinations of (`mode`, `padding`) are ('ECB', 'PKCS'), ('GCM', 'NONE') and ('CBC', 'PKCS'). Optional additional authenticated data (AAD) is only supported for GCM. If provided for encryption, the identical AAD value must be provided for decryption. The default mode is GCM. |
aes_encrypt(expr, key[, mode[, padding[, iv[, aad]]]]) | Returns an encrypted value of `expr` using AES in given `mode` with the specified `padding`. Key lengths of 16, 24 and 32 bytes are supported. Supported combinations of (`mode`, `padding`) are ('ECB', 'PKCS'), ('GCM', 'NONE') and ('CBC', 'PKCS'). Optional initialization vectors (IVs) are only supported for CBC and GCM modes. These must be 16 bytes for CBC and 12 bytes for GCM. If not provided, a random vector will be generated and prepended to the output. Optional additional authenticated data (AAD) is only supported for GCM. If provided for encryption, the identical AAD value must be provided for decryption. The default mode is GCM. |
assert_true(expr [, message]) | Throws an exception if `expr` is not true. |
bitmap_bit_position(child) | Returns the bit position for the given input child expression. |
bitmap_bucket_number(child) | Returns the bucket number for the given input child expression. |
bitmap_count(child) | Returns the number of set bits in the child bitmap. |
current_catalog() | Returns the current catalog. |
current_database() | Returns the current database. |
current_schema() | Returns the current database. |
current_user() | Returns the user name of the current execution context. |
from_avro(child, jsonFormatSchema, options) | Converts a binary Avro value into a Catalyst value. |
from_protobuf(data, messageName, descFilePath, options) | Converts a binary Protobuf value into a Catalyst value. |
hll_sketch_estimate(expr) | Returns the estimated number of unique values given the binary representation of a Datasketches HllSketch. |
hll_union(first, second, allowDifferentLgConfigK) | Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Set allowDifferentLgConfigK to true to allow unions of sketches with different lgConfigK values (defaults to false). |
input_file_block_length() | Returns the length of the block being read, or -1 if not available. |
input_file_block_start() | Returns the start offset of the block being read, or -1 if not available. |
input_file_name() | Returns the name of the file being read, or empty string if not available. |
java_method(class, method[, arg1[, arg2 ..]]) | Calls a method with reflection. |
monotonically_increasing_id() | Returns monotonically increasing 64-bit integers. The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number within each partition. The assumption is that the data frame has fewer than 1 billion partitions, and each partition has fewer than 8 billion records. The function is non-deterministic because its result depends on partition IDs. |
reflect(class, method[, arg1[, arg2 ..]]) | Calls a method with reflection. |
session_user() | Returns the user name of the current execution context. |
spark_partition_id() | Returns the current partition id. |
to_avro(child[, jsonFormatSchema]) | Converts a Catalyst binary input value into its corresponding Avro format result. |
to_protobuf(child, messageName, descFilePath, options) | Converts a Catalyst binary input value into its corresponding Protobuf format result. |
try_aes_decrypt(expr, key[, mode[, padding[, aad]]]) | This is a special version of `aes_decrypt` that performs the same operation, but returns a NULL value instead of raising an error if the decryption cannot be performed. |
try_reflect(class, method[, arg1[, arg2 ..]]) | This is a special version of `reflect` that performs the same operation, but returns a NULL value instead of raising an error if the invoked method throws an exception. |
typeof(expr) | Returns the DDL-formatted type string for the data type of the input. |
user() | Returns the user name of the current execution context. |
uuid() | Returns a universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string. |
version() | Returns the Spark version. The string contains 2 fields, the first being a release version and the second being a git revision. |
Examples
-- aes_decrypt
SELECT aes_decrypt(unhex('83F16B2AA704794132802D248E6BFD4E380078182D1544813898AC97E709B28A94'), '0000111122223333');
+------------------------------------------------------------------------------------------------------------------------+
|aes_decrypt(unhex(83F16B2AA704794132802D248E6BFD4E380078182D1544813898AC97E709B28A94), 0000111122223333, GCM, DEFAULT, )|
+------------------------------------------------------------------------------------------------------------------------+
| [53 70 61 72 6B]|
+------------------------------------------------------------------------------------------------------------------------+
SELECT aes_decrypt(unhex('6E7CA17BBB468D3084B5744BCA729FB7B2B7BCB8E4472847D02670489D95FA97DBBA7D3210'), '0000111122223333', 'GCM');
+--------------------------------------------------------------------------------------------------------------------------------+
|aes_decrypt(unhex(6E7CA17BBB468D3084B5744BCA729FB7B2B7BCB8E4472847D02670489D95FA97DBBA7D3210), 0000111122223333, GCM, DEFAULT, )|
+--------------------------------------------------------------------------------------------------------------------------------+
| [53 70 61 72 6B 2...|
+--------------------------------------------------------------------------------------------------------------------------------+
SELECT aes_decrypt(unbase64('3lmwu+Mw0H3fi5NDvcu9lg=='), '1234567890abcdef', 'ECB', 'PKCS');
+------------------------------------------------------------------------------+
|aes_decrypt(unbase64(3lmwu+Mw0H3fi5NDvcu9lg==), 1234567890abcdef, ECB, PKCS, )|
+------------------------------------------------------------------------------+
| [53 70 61 72 6B 2...|
+------------------------------------------------------------------------------+
SELECT aes_decrypt(unbase64('2NYmDCjgXTbbxGA3/SnJEfFC/JQ7olk2VQWReIAAFKo='), '1234567890abcdef', 'CBC');
+-----------------------------------------------------------------------------------------------------+
|aes_decrypt(unbase64(2NYmDCjgXTbbxGA3/SnJEfFC/JQ7olk2VQWReIAAFKo=), 1234567890abcdef, CBC, DEFAULT, )|
+-----------------------------------------------------------------------------------------------------+
| [41 70 61 63 68 6...|
+-----------------------------------------------------------------------------------------------------+
SELECT aes_decrypt(unbase64('AAAAAAAAAAAAAAAAAAAAAPSd4mWyMZ5mhvjiAPQJnfg='), 'abcdefghijklmnop12345678ABCDEFGH', 'CBC', 'DEFAULT');
+---------------------------------------------------------------------------------------------------------------------+
|aes_decrypt(unbase64(AAAAAAAAAAAAAAAAAAAAAPSd4mWyMZ5mhvjiAPQJnfg=), abcdefghijklmnop12345678ABCDEFGH, CBC, DEFAULT, )|
+---------------------------------------------------------------------------------------------------------------------+
| [53 70 61 72 6B]|
+---------------------------------------------------------------------------------------------------------------------+
SELECT aes_decrypt(unbase64('AAAAAAAAAAAAAAAAQiYi+sTLm7KD9UcZ2nlRdYDe/PX4'), 'abcdefghijklmnop12345678ABCDEFGH', 'GCM', 'DEFAULT', 'This is an AAD mixed into the input');
+--------------------------------------------------------------------------------------------------------------------------------------------------------+
|aes_decrypt(unbase64(AAAAAAAAAAAAAAAAQiYi+sTLm7KD9UcZ2nlRdYDe/PX4), abcdefghijklmnop12345678ABCDEFGH, GCM, DEFAULT, This is an AAD mixed into the input)|
+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| [53 70 61 72 6B]|
+--------------------------------------------------------------------------------------------------------------------------------------------------------+
-- aes_encrypt
SELECT hex(aes_encrypt('Spark', '0000111122223333'));
+-----------------------------------------------------------+
|hex(aes_encrypt(Spark, 0000111122223333, GCM, DEFAULT, , ))|
+-----------------------------------------------------------+
| 5B91EC3D47D630687...|
+-----------------------------------------------------------+
SELECT hex(aes_encrypt('Spark SQL', '0000111122223333', 'GCM'));
+---------------------------------------------------------------+
|hex(aes_encrypt(Spark SQL, 0000111122223333, GCM, DEFAULT, , ))|
+---------------------------------------------------------------+
| 0F410195886386B35...|
+---------------------------------------------------------------+
SELECT base64(aes_encrypt('Spark SQL', '1234567890abcdef', 'ECB', 'PKCS'));
+---------------------------------------------------------------+
|base64(aes_encrypt(Spark SQL, 1234567890abcdef, ECB, PKCS, , ))|
+---------------------------------------------------------------+
| 3lmwu+Mw0H3fi5NDv...|
+---------------------------------------------------------------+
SELECT base64(aes_encrypt('Apache Spark', '1234567890abcdef', 'CBC', 'DEFAULT'));
+---------------------------------------------------------------------+
|base64(aes_encrypt(Apache Spark, 1234567890abcdef, CBC, DEFAULT, , ))|
+---------------------------------------------------------------------+
| ZeiFPKKDj32IBry1/...|
+---------------------------------------------------------------------+
SELECT base64(aes_encrypt('Spark', 'abcdefghijklmnop12345678ABCDEFGH', 'CBC', 'DEFAULT', unhex('00000000000000000000000000000000')));
+---------------------------------------------------------------------------------------------------------------------+
|base64(aes_encrypt(Spark, abcdefghijklmnop12345678ABCDEFGH, CBC, DEFAULT, unhex(00000000000000000000000000000000), ))|
+---------------------------------------------------------------------------------------------------------------------+
| AAAAAAAAAAAAAAAAA...|
+---------------------------------------------------------------------------------------------------------------------+
SELECT base64(aes_encrypt('Spark', 'abcdefghijklmnop12345678ABCDEFGH', 'GCM', 'DEFAULT', unhex('000000000000000000000000'), 'This is an AAD mixed into the input'));
+------------------------------------------------------------------------------------------------------------------------------------------------+
|base64(aes_encrypt(Spark, abcdefghijklmnop12345678ABCDEFGH, GCM, DEFAULT, unhex(000000000000000000000000), This is an AAD mixed into the input))|
+------------------------------------------------------------------------------------------------------------------------------------------------+
| AAAAAAAAAAAAAAAAQ...|
+------------------------------------------------------------------------------------------------------------------------------------------------+
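The last two `aes_encrypt` examples pass an all-zero IV, which makes it easy to see how the IV is prepended to the output. The sketch below only decodes the Base64 strings that appear as inputs in the `aes_decrypt` examples and splits them at the IV lengths stated in the table (16 bytes for CBC, 12 bytes for GCM). It performs no encryption, and `split_iv` is a hypothetical helper name. Reading the GCM remainder as ciphertext followed by a 16-byte authentication tag is an assumption rather than something this page states.

```python
import base64

# Outputs of the CBC and GCM examples, copied from the aes_decrypt inputs above.
cbc_out = base64.b64decode("AAAAAAAAAAAAAAAAAAAAAPSd4mWyMZ5mhvjiAPQJnfg=")
gcm_out = base64.b64decode("AAAAAAAAAAAAAAAAQiYi+sTLm7KD9UcZ2nlRdYDe/PX4")

# The 32-character ASCII key used in both examples is 32 bytes long.
key = "abcdefghijklmnop12345678ABCDEFGH".encode("ascii")


def split_iv(blob: bytes, iv_len: int) -> tuple[bytes, bytes]:
    """Split an encrypted blob into its leading IV and the remaining bytes."""
    return blob[:iv_len], blob[iv_len:]


cbc_iv, cbc_body = split_iv(cbc_out, 16)  # CBC uses a 16-byte IV
gcm_iv, gcm_body = split_iv(gcm_out, 12)  # GCM uses a 12-byte IV

if __name__ == "__main__":
    print("key bytes:", len(key))
    print("CBC iv:", cbc_iv.hex(), "body bytes:", len(cbc_body))
    print("GCM iv:", gcm_iv.hex(), "body bytes:", len(gcm_body))
```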
-- assert_true
SELECT assert_true(0 < 1);
+--------------------------------------------+
|assert_true((0 < 1), '(0 < 1)' is not true!)|
+--------------------------------------------+
| NULL|
+--------------------------------------------+
-- bitmap_bit_position
SELECT bitmap_bit_position(1);
+----------------------+
|bitmap_bit_position(1)|
+----------------------+
| 0|
+----------------------+
SELECT bitmap_bit_position(123);
+------------------------+
|bitmap_bit_position(123)|
+------------------------+
| 122|
+------------------------+
-- bitmap_bucket_number
SELECT bitmap_bucket_number(123);
+-------------------------+
|bitmap_bucket_number(123)|
+-------------------------+
| 1|
+-------------------------+
SELECT bitmap_bucket_number(0);
+-----------------------+
|bitmap_bucket_number(0)|
+-----------------------+
| 0|
+-----------------------+
-- bitmap_count
SELECT bitmap_count(X '1010');
+---------------------+
|bitmap_count(X'1010')|
+---------------------+
| 2|
+---------------------+
SELECT bitmap_count(X 'FFFF');
+---------------------+
|bitmap_count(X'FFFF')|
+---------------------+
| 16|
+---------------------+
SELECT bitmap_count(X '0');
+-------------------+
|bitmap_count(X'00')|
+-------------------+
| 0|
+-------------------+
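The `bitmap_count` results above are simply the number of 1 bits across all bytes of the binary literal. A minimal Python sketch of that counting rule, using a same-named function that is a model and not Spark's implementation:

```python
def bitmap_count(bitmap: bytes) -> int:
    """Count the set bits across every byte of the bitmap."""
    return sum(bin(byte).count("1") for byte in bitmap)


if __name__ == "__main__":
    # The same byte strings as the X'1010', X'FFFF' and X'00' literals above.
    for literal in ("1010", "FFFF", "00"):
        print(literal, bitmap_count(bytes.fromhex(literal)))
```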
-- current_catalog
SELECT current_catalog();
+-----------------+
|current_catalog()|
+-----------------+
| spark_catalog|
+-----------------+
-- current_database
SELECT current_database();
+----------------+
|current_schema()|
+----------------+
| default|
+----------------+
-- current_schema
SELECT current_schema();
+----------------+
|current_schema()|
+----------------+
| default|
+----------------+
-- current_user
SELECT current_user();
+--------------+
|current_user()|
+--------------+
| runner|
+--------------+
-- hll_sketch_estimate
SELECT hll_sketch_estimate(hll_sketch_agg(col)) FROM VALUES (1), (1), (2), (2), (3) tab(col);
+--------------------------------------------+
|hll_sketch_estimate(hll_sketch_agg(col, 12))|
+--------------------------------------------+
| 3|
+--------------------------------------------+
-- hll_union
SELECT hll_sketch_estimate(hll_union(hll_sketch_agg(col1), hll_sketch_agg(col2))) FROM VALUES (1, 4), (1, 4), (2, 5), (2, 5), (3, 6) tab(col1, col2);
+-----------------------------------------------------------------------------------------+
|hll_sketch_estimate(hll_union(hll_sketch_agg(col1, 12), hll_sketch_agg(col2, 12), false))|
+-----------------------------------------------------------------------------------------+
| 6|
+-----------------------------------------------------------------------------------------+
-- input_file_block_length
SELECT input_file_block_length();
+-------------------------+
|input_file_block_length()|
+-------------------------+
| -1|
+-------------------------+
-- input_file_block_start
SELECT input_file_block_start();
+------------------------+
|input_file_block_start()|
+------------------------+
| -1|
+------------------------+
-- input_file_name
SELECT input_file_name();
+-----------------+
|input_file_name()|
+-----------------+
| |
+-----------------+
-- java_method
SELECT java_method('java.util.UUID', 'randomUUID');
+---------------------------------------+
|java_method(java.util.UUID, randomUUID)|
+---------------------------------------+
| f5afe97e-f9b7-45c...|
+---------------------------------------+
SELECT java_method('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
+-----------------------------------------------------------------------------+
|java_method(java.util.UUID, fromString, a5cf6c42-0c85-418f-af6c-3e4e5b1328f2)|
+-----------------------------------------------------------------------------+
| a5cf6c42-0c85-418...|
+-----------------------------------------------------------------------------+
-- monotonically_increasing_id
SELECT monotonically_increasing_id();
+-----------------------------+
|monotonically_increasing_id()|
+-----------------------------+
| 0|
+-----------------------------+
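The bit layout described for `monotonically_increasing_id` (partition ID in the upper 31 bits, record number in the lower 33 bits) can be sketched in plain Python. The function names `compose_id` and `decompose_id` are hypothetical and only model the documented layout.

```python
RECORD_BITS = 33  # the lower 33 bits hold the record number within a partition
RECORD_MASK = (1 << RECORD_BITS) - 1


def compose_id(partition_id: int, record_number: int) -> int:
    """Pack a partition ID and a record number into one 64-bit style ID."""
    return (partition_id << RECORD_BITS) | record_number


def decompose_id(value: int) -> tuple[int, int]:
    """Recover (partition_id, record_number) from a packed ID."""
    return value >> RECORD_BITS, value & RECORD_MASK


if __name__ == "__main__":
    # The first record of partition 0 gets ID 0, as in the example above.
    print(compose_id(0, 0))
    # The first record of partition 1 starts at 2**33, so IDs are unique
    # and increasing but not consecutive across partitions.
    print(compose_id(1, 0))
    print(decompose_id(compose_id(3, 7)))
```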
-- reflect
SELECT reflect('java.util.UUID', 'randomUUID');
+-----------------------------------+
|reflect(java.util.UUID, randomUUID)|
+-----------------------------------+
| 9e47fe2a-1619-44c...|
+-----------------------------------+
SELECT reflect('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
+-------------------------------------------------------------------------+
|reflect(java.util.UUID, fromString, a5cf6c42-0c85-418f-af6c-3e4e5b1328f2)|
+-------------------------------------------------------------------------+
| a5cf6c42-0c85-418...|
+-------------------------------------------------------------------------+
-- session_user
SELECT session_user();
+--------------+
|session_user()|
+--------------+
| runner|
+--------------+
-- spark_partition_id
SELECT spark_partition_id();
+--------------------+
|SPARK_PARTITION_ID()|
+--------------------+
| 0|
+--------------------+
-- try_aes_decrypt
SELECT try_aes_decrypt(unhex('6E7CA17BBB468D3084B5744BCA729FB7B2B7BCB8E4472847D02670489D95FA97DBBA7D3210'), '0000111122223333', 'GCM');
+------------------------------------------------------------------------------------------------------------------------------------+
|try_aes_decrypt(unhex(6E7CA17BBB468D3084B5744BCA729FB7B2B7BCB8E4472847D02670489D95FA97DBBA7D3210), 0000111122223333, GCM, DEFAULT, )|
+------------------------------------------------------------------------------------------------------------------------------------+
| [53 70 61 72 6B 2...|
+------------------------------------------------------------------------------------------------------------------------------------+
SELECT try_aes_decrypt(unhex('----------468D3084B5744BCA729FB7B2B7BCB8E4472847D02670489D95FA97DBBA7D3210'), '0000111122223333', 'GCM');
+------------------------------------------------------------------------------------------------------------------------------------+
|try_aes_decrypt(unhex(----------468D3084B5744BCA729FB7B2B7BCB8E4472847D02670489D95FA97DBBA7D3210), 0000111122223333, GCM, DEFAULT, )|
+------------------------------------------------------------------------------------------------------------------------------------+
| NULL|
+------------------------------------------------------------------------------------------------------------------------------------+
-- try_reflect
SELECT try_reflect('java.util.UUID', 'randomUUID');
+---------------------------------------+
|try_reflect(java.util.UUID, randomUUID)|
+---------------------------------------+
| f759bf62-0a08-4f1...|
+---------------------------------------+
SELECT try_reflect('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
+-----------------------------------------------------------------------------+
|try_reflect(java.util.UUID, fromString, a5cf6c42-0c85-418f-af6c-3e4e5b1328f2)|
+-----------------------------------------------------------------------------+
| a5cf6c42-0c85-418...|
+-----------------------------------------------------------------------------+
SELECT try_reflect('java.net.URLDecoder', 'decode', '%');
+-------------------------------------------+
|try_reflect(java.net.URLDecoder, decode, %)|
+-------------------------------------------+
| NULL|
+-------------------------------------------+
-- typeof
SELECT typeof(1);
+---------+
|typeof(1)|
+---------+
| int|
+---------+
SELECT typeof(array(1));
+----------------+
|typeof(array(1))|
+----------------+
| array<int>|
+----------------+
-- user
SELECT user();
+------+
|user()|
+------+
|runner|
+------+
-- uuid
SELECT uuid();
+--------------------+
| uuid()|
+--------------------+
|e200733e-f2c6-4a3...|
+--------------------+
-- version
SELECT version();
+--------------------+
| version()|
+--------------------+
|4.0.0 3c81f076ab9...|
+--------------------+
Generator Functions
Function | Description |
---|---|
collations() | Returns all of the Spark SQL string collations. |
explode(expr) | Separates the elements of array `expr` into multiple rows, or the elements of map `expr` into multiple rows and columns. Unless specified otherwise, uses the default column name `col` for elements of the array or `key` and `value` for the elements of the map. |
explode_outer(expr) | Separates the elements of array `expr` into multiple rows, or the elements of map `expr` into multiple rows and columns. Unlike `explode`, produces a row of nulls when `expr` is null or empty. Unless specified otherwise, uses the default column name `col` for elements of the array or `key` and `value` for the elements of the map. |
inline(expr) | Explodes an array of structs into a table. Uses column names col1, col2, etc. by default unless specified otherwise. |
inline_outer(expr) | Explodes an array of structs into a table. Unlike `inline`, produces a row of nulls when `expr` is null or empty. Uses column names col1, col2, etc. by default unless specified otherwise. |
posexplode(expr) | Separates the elements of array `expr` into multiple rows with positions, or the elements of map `expr` into multiple rows and columns with positions. Unless specified otherwise, uses the column name `pos` for position, `col` for elements of the array or `key` and `value` for elements of the map. |
posexplode_outer(expr) | Separates the elements of array `expr` into multiple rows with positions, or the elements of map `expr` into multiple rows and columns with positions. Unlike `posexplode`, produces a row of nulls when `expr` is null or empty. Unless specified otherwise, uses the column name `pos` for position, `col` for elements of the array or `key` and `value` for elements of the map. |
sql_keywords() | Returns the Spark SQL keywords. |
stack(n, expr1, ..., exprk) | Separates `expr1`, ..., `exprk` into `n` rows. Uses column names col0, col1, etc. by default unless specified otherwise. |
Examples
-- collations
SELECT * FROM collations() WHERE NAME = 'UTF8_BINARY';
+-------+-------+-----------+--------+-------+------------------+----------------+-------------+-----------+
|CATALOG| SCHEMA| NAME|LANGUAGE|COUNTRY|ACCENT_SENSITIVITY|CASE_SENSITIVITY|PAD_ATTRIBUTE|ICU_VERSION|
+-------+-------+-----------+--------+-------+------------------+----------------+-------------+-----------+
| SYSTEM|BUILTIN|UTF8_BINARY| NULL| NULL| ACCENT_SENSITIVE| CASE_SENSITIVE| NO_PAD| NULL|
+-------+-------+-----------+--------+-------+------------------+----------------+-------------+-----------+
-- explode
SELECT explode(array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+
SELECT explode(collection => array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+
SELECT * FROM explode(collection => array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+
-- explode_outer
SELECT explode_outer(array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+
SELECT explode_outer(collection => array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+
-- inline
SELECT inline(array(struct(1, 'a'), struct(2, 'b')));
+----+----+
|col1|col2|
+----+----+
| 1| a|
| 2| b|
+----+----+
-- inline_outer
SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')));
+----+----+
|col1|col2|
+----+----+
| 1| a|
| 2| b|
+----+----+
-- posexplode
SELECT posexplode(array(10,20));
+---+---+
|pos|col|
+---+---+
| 0| 10|
| 1| 20|
+---+---+
SELECT * FROM posexplode(array(10,20));
+---+---+
|pos|col|
+---+---+
| 0| 10|
| 1| 20|
+---+---+
-- posexplode_outer
SELECT posexplode_outer(array(10,20));
+---+---+
|pos|col|
+---+---+
| 0| 10|
| 1| 20|
+---+---+
SELECT * FROM posexplode_outer(array(10,20));
+---+---+
|pos|col|
+---+---+
| 0| 10|
| 1| 20|
+---+---+
-- sql_keywords
SELECT * FROM sql_keywords() LIMIT 2;
+-------+--------+
|keyword|reserved|
+-------+--------+
| ADD| false|
| AFTER| false|
+-------+--------+
-- stack
SELECT stack(2, 1, 2, 3);
+----+----+
|col0|col1|
+----+----+
| 1| 2|
| 3|NULL|
+----+----+
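The `stack` result above shows how three expressions are laid out in two rows, with the missing cell filled by NULL. A small Python model of that layout follows. The same-named function is only a sketch of the row-filling rule and uses `None` for NULL.

```python
import math
from typing import Any


def stack(n: int, *exprs: Any) -> list[tuple]:
    """Lay out exprs row by row into n rows, padding the tail with None."""
    # Each row needs enough columns to hold all values across n rows.
    width = math.ceil(len(exprs) / n)
    padded = list(exprs) + [None] * (n * width - len(exprs))
    return [tuple(padded[row * width:(row + 1) * width]) for row in range(n)]


if __name__ == "__main__":
    for row in stack(2, 1, 2, 3):
        print(row)
```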
Table Functions
Function | Description |
---|---|
range(start[, end[, step[, numSlices]]]) / range(end) | Returns a table of values within a specified range. |
Examples
-- range
SELECT * FROM range(1);
+---+
| id|
+---+
| 0|
+---+
SELECT * FROM range(0, 2);
+---+
| id|
+---+
| 0|
| 1|
+---+
SELECT * FROM range(0, 4, 2);
+---+
| id|
+---+
| 0|
| 2|
+---+
Variant Functions
Function | Description |
---|---|
is_variant_null(expr) | Checks whether a variant value is a variant null. Returns true if and only if the input is a variant null and false otherwise (including in the case of SQL NULL). |
parse_json(jsonStr) | Parses a JSON string as a Variant value. Throws an exception when the string is not a valid JSON value. |
schema_of_variant(v) | Returns the schema of a variant in SQL format. |
schema_of_variant_agg(v) | Returns the merged schema of a variant column in SQL format. |
to_variant_object(expr) | Converts a nested input (array/map/struct) into a variant in which maps and structs become variant objects, which are unordered, unlike SQL structs. Input maps can only have string keys. |
try_parse_json(jsonStr) | Parses a JSON string as a Variant value. Returns NULL when the string is not a valid JSON value. |
try_variant_get(v, path[, type]) | Extracts a sub-variant from `v` according to `path`, and then casts the sub-variant to `type`. When `type` is omitted, it defaults to `variant`. Returns null if the path does not exist or the cast fails. |
variant_explode(expr) | It separates a variant object/array into multiple rows containing its fields/elements. Its result schema is `struct<pos int, key string, value variant>`. `pos` is the position of the field/element in its parent object/array, and `value` is the field/element value. `key` is the field name when exploding a variant object, or is NULL when exploding a variant array. It ignores any input that is not a variant array/object, including SQL NULL, variant null, and any other variant values. |
variant_explode_outer(expr) | It separates a variant object/array into multiple rows containing its fields/elements. Its result schema is `struct<pos int, key string, value variant>`. `pos` is the position of the field/element in its parent object/array, and `value` is the field/element value. `key` is the field name when exploding a variant object, or is NULL when exploding a variant array. It ignores any input that is not a variant array/object, including SQL NULL, variant null, and any other variant values. |
variant_get(v, path[, type]) | Extracts a sub-variant from `v` according to `path`, and then casts the sub-variant to `type`. When `type` is omitted, it defaults to `variant`. Returns null if the path does not exist. Throws an exception if the cast fails. |
Examples
-- is_variant_null
SELECT is_variant_null(parse_json('null'));
+---------------------------------+
|is_variant_null(parse_json(null))|
+---------------------------------+
| true|
+---------------------------------+
SELECT is_variant_null(parse_json('"null"'));
+-----------------------------------+
|is_variant_null(parse_json("null"))|
+-----------------------------------+
| false|
+-----------------------------------+
SELECT is_variant_null(parse_json('13'));
+-------------------------------+
|is_variant_null(parse_json(13))|
+-------------------------------+
| false|
+-------------------------------+
SELECT is_variant_null(parse_json(null));
+---------------------------------+
|is_variant_null(parse_json(NULL))|
+---------------------------------+
| false|
+---------------------------------+
SELECT is_variant_null(variant_get(parse_json('{"a":null, "b":"spark"}'), "$.c"));
+----------------------------------------------------------------------+
|is_variant_null(variant_get(parse_json({"a":null, "b":"spark"}), $.c))|
+----------------------------------------------------------------------+
| false|
+----------------------------------------------------------------------+
SELECT is_variant_null(variant_get(parse_json('{"a":null, "b":"spark"}'), "$.a"));
+----------------------------------------------------------------------+
|is_variant_null(variant_get(parse_json({"a":null, "b":"spark"}), $.a))|
+----------------------------------------------------------------------+
| true|
+----------------------------------------------------------------------+
-- parse_json
SELECT parse_json('{"a":1,"b":0.8}');
+---------------------------+
|parse_json({"a":1,"b":0.8})|
+---------------------------+
| {"a":1,"b":0.8}|
+---------------------------+
-- schema_of_variant
SELECT schema_of_variant(parse_json('null'));
+-----------------------------------+
|schema_of_variant(parse_json(null))|
+-----------------------------------+
| VOID|
+-----------------------------------+
SELECT schema_of_variant(parse_json('[{"b":true,"a":0}]'));
+-------------------------------------------------+
|schema_of_variant(parse_json([{"b":true,"a":0}]))|
+-------------------------------------------------+
| ARRAY<OBJECT<a: B...|
+-------------------------------------------------+
-- schema_of_variant_agg
SELECT schema_of_variant_agg(parse_json(j)) FROM VALUES ('1'), ('2'), ('3') AS tab(j);
+------------------------------------+
|schema_of_variant_agg(parse_json(j))|
+------------------------------------+
| BIGINT|
+------------------------------------+
SELECT schema_of_variant_agg(parse_json(j)) FROM VALUES ('{"a": 1}'), ('{"b": true}'), ('{"c": 1.23}') AS tab(j);
+------------------------------------+
|schema_of_variant_agg(parse_json(j))|
+------------------------------------+
| OBJECT<a: BIGINT,...|
+------------------------------------+
-- to_variant_object
SELECT to_variant_object(named_struct('a', 1, 'b', 2));
+-------------------------------------------+
|to_variant_object(named_struct(a, 1, b, 2))|
+-------------------------------------------+
| {"a":1,"b":2}|
+-------------------------------------------+
SELECT to_variant_object(array(1, 2, 3));
+---------------------------------+
|to_variant_object(array(1, 2, 3))|
+---------------------------------+
| [1,2,3]|
+---------------------------------+
SELECT to_variant_object(array(named_struct('a', 1)));
+--------------------------------------------+
|to_variant_object(array(named_struct(a, 1)))|
+--------------------------------------------+
| [{"a":1}]|
+--------------------------------------------+
SELECT to_variant_object(array(map("a", 2)));
+-----------------------------------+
|to_variant_object(array(map(a, 2)))|
+-----------------------------------+
| [{"a":2}]|
+-----------------------------------+
-- try_parse_json
SELECT try_parse_json('{"a":1,"b":0.8}');
+-------------------------------+
|try_parse_json({"a":1,"b":0.8})|
+-------------------------------+
| {"a":1,"b":0.8}|
+-------------------------------+
SELECT try_parse_json('{"a":1,');
+-----------------------+
|try_parse_json({"a":1,)|
+-----------------------+
| NULL|
+-----------------------+
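The difference between `parse_json` (raises on invalid input) and `try_parse_json` (returns NULL) can be modelled with Python's standard `json` module. This is only an analogy: Python objects stand in for Variant values, and the function below is a hypothetical model, not Spark code. One limitation of the model is that the JSON literal `null` also parses to `None`, so it cannot tell a variant null from a parse failure.

```python
import json
from typing import Any, Optional


def try_parse_json(text: Optional[str]) -> Any:
    """Return the parsed value, or None when the text is not valid JSON."""
    if text is None:
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # parse_json would raise here; the try_ variant yields NULL instead.
        return None


if __name__ == "__main__":
    print(try_parse_json('{"a":1,"b":0.8}'))
    print(try_parse_json('{"a":1,'))
```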
-- try_variant_get
SELECT try_variant_get(parse_json('{"a": 1}'), '$.a', 'int');
+------------------------------------------+
|try_variant_get(parse_json({"a": 1}), $.a)|
+------------------------------------------+
|                                         1|
+------------------------------------------+
SELECT try_variant_get(parse_json('{"a": 1}'), '$.b', 'int');
+------------------------------------------+
|try_variant_get(parse_json({"a": 1}), $.b)|
+------------------------------------------+
|                                      NULL|
+------------------------------------------+
SELECT try_variant_get(parse_json('[1, "2"]'), '$[1]', 'string');
+-------------------------------------------+
|try_variant_get(parse_json([1, "2"]), $[1])|
+-------------------------------------------+
|                                          2|
+-------------------------------------------+
SELECT try_variant_get(parse_json('[1, "2"]'), '$[2]', 'string');
+-------------------------------------------+
|try_variant_get(parse_json([1, "2"]), $[2])|
+-------------------------------------------+
|                                       NULL|
+-------------------------------------------+
SELECT try_variant_get(parse_json('[1, "hello"]'), '$[1]');
+-----------------------------------------------+
|try_variant_get(parse_json([1, "hello"]), $[1])|
+-----------------------------------------------+
|                                        "hello"|
+-----------------------------------------------+
SELECT try_variant_get(parse_json('[1, "hello"]'), '$[1]', 'int');
+-----------------------------------------------+
|try_variant_get(parse_json([1, "hello"]), $[1])|
+-----------------------------------------------+
|                                           NULL|
+-----------------------------------------------+
-- variant_explode
SELECT * FROM variant_explode(parse_json('["hello", "world"]'));
+---+----+-------+
|pos| key|  value|
+---+----+-------+
|  0|NULL|"hello"|
|  1|NULL|"world"|
+---+----+-------+
SELECT * FROM variant_explode(parse_json('{"a": true, "b": 3.14}'));
+---+---+-----+
|pos|key|value|
+---+---+-----+
|  0|  a| true|
|  1|  b| 3.14|
+---+---+-----+
-- variant_explode_outer
SELECT * FROM variant_explode_outer(parse_json('["hello", "world"]'));
+---+----+-------+
|pos| key|  value|
+---+----+-------+
|  0|NULL|"hello"|
|  1|NULL|"world"|
+---+----+-------+
SELECT * FROM variant_explode_outer(parse_json('{"a": true, "b": 3.14}'));
+---+---+-----+
|pos|key|value|
+---+---+-----+
|  0|  a| true|
|  1|  b| 3.14|
+---+---+-----+
-- variant_get
SELECT variant_get(parse_json('{"a": 1}'), '$.a', 'int');
+--------------------------------------+
|variant_get(parse_json({"a": 1}), $.a)|
+--------------------------------------+
|                                     1|
+--------------------------------------+
SELECT variant_get(parse_json('{"a": 1}'), '$.b', 'int');
+--------------------------------------+
|variant_get(parse_json({"a": 1}), $.b)|
+--------------------------------------+
|                                  NULL|
+--------------------------------------+
SELECT variant_get(parse_json('[1, "2"]'), '$[1]', 'string');
+---------------------------------------+
|variant_get(parse_json([1, "2"]), $[1])|
+---------------------------------------+
|                                      2|
+---------------------------------------+
SELECT variant_get(parse_json('[1, "2"]'), '$[2]', 'string');
+---------------------------------------+
|variant_get(parse_json([1, "2"]), $[2])|
+---------------------------------------+
|                                   NULL|
+---------------------------------------+
SELECT variant_get(parse_json('[1, "hello"]'), '$[1]');
+-------------------------------------------+
|variant_get(parse_json([1, "hello"]), $[1])|
+-------------------------------------------+
|                                    "hello"|
+-------------------------------------------+