Hive: Misc

This tutorial will show you some misc usage for working with Hive. If you have no installed Hive yet please follow this tutorial.

NVL:
Check if value is null then substitute other value.

SELECT NVL(columnA, 'was null')
FROM test;

CAST:
If columnE was a date and you wanted it to be a string.

SELECT CAST(columnE AS STRING)
FROM test;

Concat:
This will concat the strings together giving “Test the code!”

SELECT CONCAT('Test', ' the ', 'code!');

CONCAT_WS:
This will concat the strings together starting at “Test” and use the first position as the seperator giving “Test the code”

SELECT CONCAT_WS(' ', 'Test', 'the', 'code');

MIN:
This will give the minimum value of columnC

SELECT MIN(columnC) AS Max_ColumnC
FROM test;

MAX:
This will give the maximum value of columnC

SELECT MAX(columnC) AS Max_ColumnC
FROM test;

DISTINCT:
This will select the distinct columnA and columnB

SELECT DISTINCT columnA, columnB
FROM test;

CASE:

SELECT CASE WHEN columnA = 'val' THEN 'Is Val' ELSE 'Not Val' END
FROM test;

COLLECT_SET:
This will collect all values in columnC and select the first index.

SELECT COLLECT_SET(columnC)[0]
FROM test;

CURRENT_DATE:
This will give you the current date in the format YYYY-MM-dd.

SELECT CURRENT_DATE();

UNIX_TIMESTAMP:
This will give you the current timestamp from EPOCH. IE: 1549591492

SELECT UNIX_TIMESTAMP();

FROM_UNIXTIME:
This will take a timestamp and display it in the format YYYY-MM-dd HH:MM:SS.

SELECT FROM_UNIXTIME(1549591492);

TO_DATE:
This will convert a date to YYYY-MM-dd.

SELECT TO_DATE('2019-02-01 01:01:01');

YEAR:

SELECT YEAR(columnE)
FROM test;

MONTH:

SELECT MONTH(columnE)
FROM test;

DAY:

SELECT DAY(columnE)
FROM test;

DATE_ADD:

SELECT DATE_ADD('2019-01-01', 4);

UPPER:
This will upper case columnA.

SELECT UPPER(columnA)
FROM test;

LOWER:
This will lower case columnA.

SELECT LOWER(columnA)
FROM test;

TRIM:
This will trim leading and trailing spaces in columnA.

SELECT TRIM(columnA)
FROM test;

LITERALS:
If a column contains a space you will need to use literal in order to use the AS keyword or when you are defining it in the create table command. Although I don’t recommend this it is possible.

SELECT columnA AS `test column`
FROM test;

 

 

 

Distinct Records in Object Array

Sometimes you need to determine the distinct objects in an array or distinct values in array. There are so many ways to do this. One way which I have used at times can be a bit slow depending on the size of your array.
From my investigation there is a lodash version that is much better. Once I do some testing I will update this but for now here is an example.
I expanded on the idea from Stack Exchange.

 var distinct = function(objectArray, param){
      var distinctResult = [];

      $.each(objectArray, function(i, currentObject){
            if (param !== null) {
                  if (distinctResult.filter(function(v) { return v[param] == currentObject[param]; }).length == 0)
                  {
                        distinctResult.push(currentObject);
                  }
            } else {
                  if(!exists(distinctResult, currentObject))
            {
                  distinctResult.push(currentObject);
            }
            }
      });

      return distinctResult;
};

var exists = function(arr, object){
    var compareToJson = JSON.stringify(object);
    var result = false;
    $.each(arr, function(i, existingObject){
        if(JSON.stringify(existingObject) === compareToJson) {
            result = true;
            return false; // break
        }
    });

    return result;
};