Snowflake JavaScript stored procedures

Snowflake JavaScript stored procedures

8-minute read

 

Snowflake JavaScript stored procedures

 

As an enterprise data warehouse tool, Snowflake has seen wide adoption for analytics workloads. It was built from the ground up as a cloud-native platform and, while it shares many traits with other database products, the Snowflake team took a pretty different path when it comes to stored procedures. This article specifically focuses on Snowflake stored procedures written in JavaScript. It will cover some of their strengths and limitations, followed by a few practical examples that will run on any trial Snowflake account.

 

Strengths

• JavaScript is very well documented and there are many examples online for nearly every scenario.

◦ While JavaScript has its intricacies, you can get a lot of mileage by using basic, entry-level JavaScript.

◦ JavaScript excels at string parsing, array handling, loops, control structures, and procedural programming in general, where SQL excels in set-based programming. With JavaScript stored procedures, you get the best of both worlds.

◦ Snowflake stored procedures written in JavaScript support the use of internal functions which is very helpful for complex scenarios.

• Stored procedures written in JavaScript support DDL and DML operations. This empowers developers to write stored procedures for a wide range of processes such as metadata driven frameworks, administrative functions, and loading tables just to name a few.

 

Limitations

The Snowflake documentation on JavaScript stored procedures is very good and is well worth a read. The list below calls out a few limitations that can be especially problematic.

• JavaScript stored procedures support only a single return value and are not able to return record-sets like traditional stored procedures. To work around this limitation, one of the below options may work, depending on your requirements:

◦ Return the data in JSON format. This works well if the total JSON object is under 16 MB in size.

◦ Use the stored procedure to load a landing table and query the data from there.

• Asynchronous operations (e.g., setTimeout, setInterval) are not supported.

• External JavaScript libraries are not supported. This is an unfortunate limitation if your stored procedure utilizes internal functions (example below). The potential for code redundancy is something to keep in mind when deciding whether or not to use stored procedures in Snowflake.

• Debugging complex JavaScript stored procedures can be a challenge. The code samples below include a few techniques that are useful for dealing with this.

 

Examples: setup

All of the following techniques/exercises will use data from the SNOWFLAKE_SAMPLE_DATA database that is provisioned with every Snowflake account. The SNOWFLAKE_SAMPLE_DATA database is read-only so, for training purposes, we’ll create a sandbox database and copy one of the tables over to it. These examples also assume the use of a trial account (link below) where the operator can use the built-in ACCOUNTADMIN role.

 

 
USE ROLE ACCOUNTADMIN;
USE WAREHOUSE COMPUTE_WH;
USE DATABASE SNOWFLAKE_SAMPLE_DATA;
USE SCHEMA TPCH_SF1;

CREATE DATABASE DEV_SANDBOX;
USE DATABASE DEV_SANDBOX;

CREATE SCHEMA TPCH_SF1;
USE SCHEMA TPCH_SF1;

CREATE TABLE DEV_SANDBOX.TPCH_SF1.CUSTOMER AS
SELECT * 
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER;

--Create a logging table for later use.
CREATE OR REPLACE TABLE TPCH_SF1.NEGATIVE_BALANCE_CUSTOMERS (
	 ROW_ID			NUMBER(38, 0) AUTOINCREMENT(1, 1)
	,CUSTOMER_KEY		NUMBER(38, 0)
	,CUSTOMER_NAME		VARCHAR(25)
	,MARKET_SEGMENT	VARCHAR(25)
	,ACCOUNT_BALANCE	NUMBER(12, 2)
	,LOG_TIMESTAMP		TIMESTAMP_NTZ(6) DEFAULT CURRENT_TIMESTAMP(6)
);

 

Starter query

 
--This starter query is the basis for some of the examples below.
SELECT
	 C_CUSTKEY
	,C_NAME
	,C_ADDRESS
	,C_MKTSEGMENT
FROM TPCH_SF1.CUSTOMER
WHERE 1=1
	AND C_NATIONKEY 	= 8
	AND C_MKTSEGMENT 	IN('AUTOMOBILE', 'BUILDING', 'MACHINERY')
AND C_ACCTBAL		> 9900;

 

Starter query—formatted as JSON

 
--This update to the starter query formats the data as a JSON object.
SELECT '[' || LISTAGG(CAST(D.CUSTOMER_ROWS AS VARCHAR), ',') || ']' AS CUSTOMER_DETAILS
FROM
(
	SELECT OBJECT_CONSTRUCT
	(
		 'customer_key', 	C.C_CUSTKEY
		,'customer_name', 	C.C_NAME
		,'customer_address', 	C.C_ADDRESS
		,'market_segment', 	C.C_MKTSEGMENT
	) AS CUSTOMER_ROWS
	FROM TPCH_SF1.CUSTOMER AS C
	WHERE 1=1 
		AND C_NATIONKEY 	= 8
		AND C_MKTSEGMENT 	IN('AUTOMOBILE', 'BUILDING', 'MACHINERY')
AND C_ACCTBAL		> 9900
) AS D;

 

Basic stored procedure

The basic stored procedure example below accepts a single parameter and simply returns a row count. It is used to demonstrate how all the pieces fit together.

 
--DROP PROCEDURE TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER(VARCHAR)
CREATE OR REPLACE PROCEDURE TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER(MARKET_SEGMENT VARCHAR(10))
	RETURNS FLOAT
	LANGUAGE JAVASCRIPT
	EXECUTE AS OWNER
AS
$$
/****************************************************************************
Project Name: 		Logic20/20 Training

Desc: 	This stored procedure runs a SQL statement and returns the number of rows in the CUSTOMER table for the specified market segment.

Inputs:	MARKET_SEGMENT - The market segment for which the row count should be returned.  Examples: AUTOMOBILE, BUILDING, MACHINERY.

Outputs:		The number of rows returned by the SQL statement.

Created Date: 		06/10/2022
Created By: 		Jason Kind

*****************************************************************************
CHANGE HISTORY
*****************************************************************************
Date		Author		Description
**********	*********	************************************************
06/10/2022	Jason Kind	Created stored procedure
*****************************************************************************/
	let retVal 	= -1;
	let sql 	= "SELECT * FROM TPCH_SF1.CUSTOMER WHERE C_MKTSEGMENT = '" + MARKET_SEGMENT + "';";
	let rs 		= snowflake.execute({sqlText: sql});
	
	retVal 		= rs.getRowCount();  //The Snowflake getRowCount() method returns the number of rows returned from a SQL query.

	return retVal;
$$;
/*
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER('AUTOMOBILE');
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER('BUILDING');
*/

 

Stored procedure with JSON output

This example expands on the first one by issuing a more complex SQL statement. A few call-outs:

• It is important to note the use of backticks (`) to enclose the multi-line SQL statement. Back-ticks (`) are different from single ticks (') and will not work for multi-line enclosures.

• Note the use of the “return” statement that is currently commented out in the code. When not commented out, it can be used to return, as a string, the SQL statement that will be executed. This debugging technique is particularly helpful when debugging complex, parameterized SQL statements.

• Snowflake is case-sensitive so passing 'MACHINERY' as an argument to the MARKET_SEGMENT parameter will be evaluated differently than 'Machinery'. See the next sample for a solution that makes this more flexible.

 

 
--DROP PROCEDURE TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER_JSON_OUTPUT(VARCHAR)
CREATE OR REPLACE PROCEDURE TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER_JSON_OUTPUT(MARKET_SEGMENT VARCHAR(10))
	RETURNS VARIANT
	LANGUAGE JAVASCRIPT
	EXECUTE AS OWNER
AS
$$
/****************************************************************************
Project Name: 		Logic20/20 Training

Desc: 	 		This stored procedure runs a SQL statement and returns the output data 
			in JSON format.

Inputs:			MARKET_SEGMENT - The market segment for which the JSON formatted data
			should be returned.  Examples: AUTOMOBILE, BUILDING, MACHINERY.

Outputs:		The output data in JSON format.

Created Date: 		06/10/2022
Created By: 		Jason Kind

*****************************************************************************
CHANGE HISTORY
*****************************************************************************
Date		Author		Description
**********	*********	************************************************
06/10/2022	Jason Kind	Created stored procedure
*****************************************************************************/
	let retVal 	= "";
	
	/*
Note: There are a few different options for handling complex, multi-line SQL statements.  Using backticks	is probably the cleanest approach.  See the following for more details: 	https://docs.snowflake.com/en/sql-reference/stored-procedures-javascript.html#line-continuation
	*/
let sql = `SELECT '[' || LISTAGG(CAST(D.CUSTOMER_ROWS AS VARCHAR), ',') || ']' AS CUSTOMER_DETAILS
		FROM
		(
			SELECT OBJECT_CONSTRUCT
			(
				 'customer_key', 	C.C_CUSTKEY
				,'customer_name', 	C.C_NAME
				,'customer_address', 	C.C_ADDRESS
				,'market_segment', 	C.C_MKTSEGMENT
			) AS CUSTOMER_ROWS
			FROM TPCH_SF1.CUSTOMER AS C
			WHERE 1=1 
				AND C_NATIONKEY 	= 8
				AND C_MKTSEGMENT 	= '` + MARKET_SEGMENT + `'
				AND C_ACCTBAL		> 9900
		) AS D;`
	
	/*
	Use the return statement temporarily throughout your code to output the results.  In
	this case, we want to see if the SQL string we're trying to execute is properly
	formatted.  This is a good debugging technique and becomes more important as the 
	SQL statements get more complex.
	*/
	//return sql;
	
	let rs 		= snowflake.execute({sqlText: sql});
	rs.next();	//Advance to the first and, in this case only, row in the recordset.
	
	retVal = JSON.parse(rs.getColumnValue("CUSTOMER_DETAILS"));

	return retVal;
$$;
/*
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER_JSON_OUTPUT('AUTOMOBILE');
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER_JSON_OUTPUT('MACHINERY');
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER_JSON_OUTPUT('MACHINERY  ');	//Invalid spaces in the MARKETING_SEGMENT argument - No data returned.
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_PARAMETER_JSON_OUTPUT('Machinery');		//Invalid casing in the MARKETING_SEGMENT argument - No data returned.
*/

 

Stored procedure with JSON output and internal functions

One compelling feature of Snowflake stored procedures is the ability to include internal JavaScript functions for utility purposes and to encapsulate business logic. This example builds on the above example and uses internal functions to perform some simple preparation tasks. It also demonstrates how to make a separate SQL call to get the current system date. These same techniques can be applied to call external stored procedures for logging, or other purposes.

 

 
--DROP PROCEDURE TPCH_SF1.SP_BASIC_SQL_CALL_WITH_INTERNAL_FUNCTIONS(VARCHAR)
CREATE OR REPLACE PROCEDURE TPCH_SF1.SP_BASIC_SQL_CALL_WITH_INTERNAL_FUNCTIONS(MARKET_SEGMENT VARCHAR(10))
	RETURNS VARIANT
	LANGUAGE JAVASCRIPT
	EXECUTE AS OWNER
AS
$$
/****************************************************************************
Project Name: 		Logic20/20 Training

Desc: 	 		This stored procedure runs a SQL statement and returns the output data 
			in JSON format.

Inputs:			MARKET_SEGMENT - The market segment for which the JSON formatted data
					  should be returned.  Examples: AUTOMOBILE, BUILDING, 
					  MACHINERY.

Outputs:		The output data in JSON format.

Created Date: 		06/10/2022
Created By: 		Jason Kind

*****************************************************************************
CHANGE HISTORY
*****************************************************************************
Date		Author			Description
**********	*********		*************************************************
06/10/2022	Jason Kind		Created stored procedure
*****************************************************************************/

	/********** BEGIN - Global functions **********/
	/*
	Applies logic to prep inputs for use.  Specifically, the following corrective
	actions are applied:
	 - Trims whitespace from the beginning and/or end
	 - Converts input to uppercase if needed
	*/
	function cleanInputs(inputToClean, convertToUpper) {
		let retVal = inputToClean.trim();
		
		if(convertToUpper) {
			retVal = retVal.toUpperCase();
		}
		
		return retVal;
	}
	
	/*
	Retrieve the current system date as a formatted string.
	*/
	function getCurrentDate() {
		let retVal 	= "";
		
		let rs 		= snowflake.execute({sqlText: "SELECT TO_VARCHAR(CURRENT_TIMESTAMP(), 'YYYY-MM-DD') AS CURRENT_DATE;"});
		rs.next();
		
		retVal 		= rs.getColumnValue("CURRENT_DATE");
		
		return retVal;
	};

	/********** END   - Global functions **********/


	let retVal 		= "";
	let currentDate 	= getCurrentDate();
	let sql 		= `SELECT '[' || LISTAGG(CAST(D.CUSTOMER_ROWS AS VARCHAR), ',') || ']' AS CUSTOMER_DETAILS
		FROM
		(
			SELECT OBJECT_CONSTRUCT
			(
				 'customer_key', 		C.C_CUSTKEY
				,'customer_name', 		C.C_NAME
				,'customer_address', 	C.C_ADDRESS
				,'market_segment', 		C.C_MKTSEGMENT
				,'system_date',			'` + currentDate + `'
			) AS CUSTOMER_ROWS
			FROM TPCH_SF1.CUSTOMER AS C
			WHERE 1=1 
				AND C_NATIONKEY 	= 8
				AND C_MKTSEGMENT 	= '` + cleanInputs(MARKET_SEGMENT, true) + `'
				AND C_ACCTBAL		> 9900
		) AS D;`

	/*
	Use the return statement temporarily throughout your code to output the results.  In
	this case, we want to see if the SQL string we're trying to execute is properly
	formatted.  This is a good debugging technique and becomes more important as the 
	SQL statements get more complex.
	*/
	//return sql;
	
	let rs 		= snowflake.execute({sqlText: sql});
	rs.next();	//Advance to the first and, in this case only, row in the recordset.
	
	retVal = JSON.parse(rs.getColumnValue("CUSTOMER_DETAILS"));

	return retVal;
$$;

/*
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_INTERNAL_FUNCTIONS('AUTOMOBILE');
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_INTERNAL_FUNCTIONS('MACHINERY');
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_INTERNAL_FUNCTIONS('MACHINERY  ');	//Invalid spaces in the MARKETING_SEGMENT argument
CALL TPCH_SF1.SP_BASIC_SQL_CALL_WITH_INTERNAL_FUNCTIONS('Machinery  ');	//Invalid casing in the MARKETING_SEGMENT argument
*/

 

Stored procedure with threshold logging

This stored procedure demonstrates how JSON data can be utilized, and even updated during runtime. In this contrived example, the JSON object is looped through to identify records where the account balance is at or below a specified threshold. The outcome is added to the JSON object and also logged to a separate table. A few call-outs:

• One advantage of JSON is that it can be looped through any number of times where a recordset object is forward-only. While this capability isn’t demonstrated here, it’s a good thing to remember if your use-case calls for multiple loops through the same dataset.

• As noted in the code comments, but worth calling out separately, the MINIMUM_BALANCE_THRESHOLD parameter expects a FLOAT data type rather than the NUMBER data type. This is because JavaScript will ultimately be working with the value provided and JavaScript doesn't support the NUMBER data type. Attempting to set the parameter to the NUMBER data type (e.g.,NUMBER(12,2)) will result in the following compilation error: SQL Error [90215] [42601]: Language JAVASCRIPT does not support type 'NUMBER(12,2)' for argument or return type.

 

 
--DROP PROCEDURE TPCH_SF1.SP_JSON_PROCESSING(VARCHAR, FLOAT)
CREATE OR REPLACE PROCEDURE TPCH_SF1.SP_JSON_PROCESSING(MARKET_SEGMENT VARCHAR(10), MINIMUM_BALANCE_THRESHOLD FLOAT)
	RETURNS VARIANT
	LANGUAGE JAVASCRIPT
	EXECUTE AS OWNER
AS
$$
/****************************************************************************
Project Name: 		Logic20/20 Training

Desc: 	This stored procedure runs a SQL statement to retrieve data in JSON format and then loops through it to add additional details.

Inputs:			MARKET_SEGMENT - The market segment for which the JSON formatted data
 			should be returned.  Examples: AUTOMOBILE, BUILDING, MACHINERY.
									 
MINIMUM_BALANCE_THRESHOLD - The minimum negative account balance allowed.  Any accounts that are <= to this number should be logged.  Note: The parameter expects a FLOAT data type rather than the NUMBER(12,2) data type. This is because JavaScript will ultimately be working with the value provided and JavaScript doesn't support the NUMBER data type.  Attempting to set the parameter to NUMBER(12,2) will result in the following compilation error:  SQL Error [90215] [42601]: Language JAVASCRIPT does not support type 'NUMBER(12,2)' for argument or return type.

Outputs:	The output data in JSON format, with a new attribute named flag_account_balance added.

Created Date: 		06/10/2022
Created By: 		Jason Kind

*****************************************************************************
CHANGE HISTORY
*****************************************************************************
Date		Author		Description
**********	*********	************************************************
06/10/2022	Jason Kind	Created stored procedure
*****************************************************************************/

	/********** BEGIN - Global functions **********/
	/*
	Applies logic to prep inputs for use.  Specifically, the following corrective
	actions are applied:
	 - Trims whitespace from the beginning and/or end
	 - Converts input to uppercase if needed
	*/
	function cleanInputs(inputToClean, convertToUpper) {
		let retVal = inputToClean.trim();
		
		if(convertToUpper) {
			retVal = retVal.toUpperCase();
		}
		
		return retVal;
	}
	



	/*
	Wrapper function to log accounts with balances below threshold.
	*/
	function logNegativeBalanceBelowThreshold
	(
		 customerKey
		,customerName
		,marketSegment
		,accountBalance
	) {
		let sql = `INSERT INTO TPCH_SF1.NEGATIVE_BALANCE_CUSTOMERS (CUSTOMER_KEY, CUSTOMER_NAME, MARKET_SEGMENT, ACCOUNT_BALANCE)
			VALUES (` + customerKey + `, '` + customerName + `', '` + marketSegment + `', ` + accountBalance + `);
		`
		snowflake.execute({sqlText: sql});
	}
	/********** END   - Global functions **********/


	let customerJSON	= {}; //Initialize JSON container variables to an empty JSON object.
	let sql 			= `SELECT '[' || LISTAGG(CAST(D.CUSTOMER_ROWS AS VARCHAR), ',') || ']' AS CUSTOMER_DETAILS
		FROM
		(
			SELECT OBJECT_CONSTRUCT
			(
				 'customer_key', 		C.C_CUSTKEY
				,'customer_name', 		C.C_NAME
				,'market_segment', 		C.C_MKTSEGMENT
				,'account_balance', 	C.C_ACCTBAL
			) AS CUSTOMER_ROWS
			FROM TPCH_SF1.CUSTOMER AS C
			WHERE 1=1 
				AND C_NATIONKEY 	= 8
				AND C_MKTSEGMENT 	= '` + cleanInputs(MARKET_SEGMENT, true) + `'
				AND C_ACCTBAL		< 0
		) AS D;
		`

	/*
	Use the return statement temporarily throughout your code to output the results.  In
	this case, we want to see if the SQL string we're trying to execute is properly
	formatted.  This is a good debugging technique and becomes more important as the 
	SQL statements get more complex.
	*/
	//return sql;
	
	let rs 		= snowflake.execute({sqlText: sql});
	rs.next();	//Advance to the first and, in this case only, row in the recordset.
	
	customerJSON = JSON.parse(rs.getColumnValue("CUSTOMER_DETAILS"));

	/*
Note: flag_account_balance is not an attribute in the original JSON.  JSON allows the addition or modification of attributes during runtime.
	*/
	for(customerRow in customerJSON) {
		if(customerJSON[customerRow].account_balance < MINIMUM_BALANCE_THRESHOLD) {
			customerJSON[customerRow].flag_account_balance = true;  
			
			logNegativeBalanceBelowThreshold
			(
				 customerJSON[customerRow].customer_key
				,customerJSON[customerRow].customer_name
				,customerJSON[customerRow].market_segment
				,customerJSON[customerRow].account_balance
			);
		}
		else {
			customerJSON[customerRow].flag_account_balance = false;
		}
		
//Convert the account balance to a string for nicer formatting.  Otherwise, it will be returned in scientific notation.
		customerJSON[customerRow].account_balance = customerJSON[customerRow].account_balance.toString();
	}

	return customerJSON;
$$;


/*
CALL TPCH_SF1.SP_JSON_PROCESSING('AUTOMOBILE', -980);
CALL TPCH_SF1.SP_JSON_PROCESSING('MACHINERY',  -980);

SELECT * FROM TPCH_SF1.NEGATIVE_BALANCE_CUSTOMERS ORDER BY ROW_ID;
*/

 

Helpful tools

While I won’t endorse any specific tools or sites, below are some that I have found useful and may be a good starting point to get you started.

  1. Snowflake free trial
  2. Snowflake stored procedure documentation
  3. Online JavaScript Editor: Useful for testing out your JavaScript code outside of Snowflake
  4. JavaScript tutorials:
  5. Notepad++: Full-featured text editor
  6. JSTool: Notepad++ plugin for formatting JSON code
  7. DBeaver - Universal Database Tool - Community Edition (free): While the Snowflake Web UI is perfectly functional for running SQL, using a third-party tool like DBeaver allows you to save your work locally and has numerous helper features. 

 

 

Explore developer careers at Logic20/20

 

Jason Kind

Jason Kind is a lead developer in the Logic20/20 Advanced Analytics practice.