Overview
Standard and Legacy BigQuery SQL don't always provide the functions or methods needed to return the desired results. UDFs are not always the best answer as they have a strict format that must be followed as well as some limits/limitations.
Many solutions can be found in existing scripts. Expanding the ability of a variety of scripting languages to interact with Google BigQuery data also opens the field for those who are proficient in using one over another. Magnus offers access to a list of scripting languages via Script Task. Scripts can be written in JavaScript, Lua, or Go to perform functions that are not available in BigQuery SQL or are easier to create in the scripting language.
There are a variety of ways to transform and interact with data. Some of these solutions may be found easier to program in a scripting language. To meet these needs Magnus features Script Task.
Script Task provides a selection of scripting languages (JavaScript, Lua, and Go) in a rich text editor and returns results to a parameter available for further use in the Workflow
Front Panel
Add Script Task
1. Under Add Task, select the desired position from the "Add to position" drop down list, then click +Script.
2. Select a script language from the drop down list. Supported languages are: JavaScript, Lua, and Go.
3. Provide the script body in the rich Editor. See Rich Editor for more details.
Parameters
4. Parameters can be used within the script body.
5. The returned value from the script is assigned to the Return Parameter specified. By default, the returned value is assigned to the parameter <var_taskId_return>
6. User can choose a custom parameter of type String, Number, Boolean, or Record for the Return parameter.
In this case, the returned value from the script will be assigned to the custom parameter chosen.
The selected custom parameter will become workflow-owned.
Expanding Textbox
7. The script textbox can be expanded to a popup mode by clicking the icon .
8. The script textbox can be displayed in full screen mode by clicking the icon .
Time out
9. Script task has a time out set for 5 minutes. If the script task runs for more than 5 minutes, the task will fail because of the time out limit
Standard Output
10 When multiple calls are placed to standard output, only the standard output from the last call will be available for viewing in Workflows History. For example with JavaScript:
There are three calls placed to standard output. Only the last one will be visible in Workflows History:
Script Task with JavaScript
1. Example of return value:
a. Script that returns a string:
b. Script that returns a number:
c. Script that returns a boolean:
d. Script that returns a record:
Example of referencing a custom parameter var_record of record type within the script body:
Support Packages
Packages | Description |
xml2js |
A simple XML to JavaScript object converter. To convert XML to JS object: xml2js.parseString(xml, function (err, result) { //code to work with result, a JS object }); |
FAQ - Script Task with JavaScript
1. Can I define a function and use it within the JavaScript?
Answer: Yes
2. Can I return multiple results from the JavaScript?
Answer: Yes, you can return the results in a form of an array.
3. Can I return an associative array from JavaScript? For example, can I do this?
Answer: No, this is not supported by Magnus.
4. Can I return from the JavaScript a multiple-dimensional table that is integer index based? For example, can I do this?
Answer: Yes, but the returned table will be flattened to one dimension, with string representation of each element. The return record type parameter will contain 2 elements:
var_record[1] will contain the string "[apple, red]"
var_record[2] will contain the string "[lemon, yellow]"
5. Can I work with XML?
Answer: Yes, you can use package xml2js to convert XML to JS object. For example:
Script Task with Go
mainReturn function
1. The wrapper function mainReturn is required for Go script task. The function mainReturn will be called and the value is returned when the script task is executed.
Go Types
2. These Go types are supported for the return value:
Supported golang type | Respective Magnus type |
bool | boolean |
string | string |
int, int8, int16, int32, int64 | number |
uint, uint8, uint16, uint32, uint64 | number |
float32, float64 | number |
slice of supported basic types (bool, string, int*, uint*, float*) | record |
Return Value
3. Example of return value:
a. Script that returns a string:
b. Script that returns a number:
c. Script that returns a boolean:
d. Script that returns a record of homogenous type:
e. Script that returns a record of mixed type:
f. This script will fail because Magnus supports Go type of slice of basic type only.
Custom Parameter Reference
4. Example of referencing a custom parameter var_record of record type within the script body:
Import Statements
5. Import statement is not supported in Golang script task. These Golang packages are supported: reflect,encoding/json,errors,time,math,strconv,strings,regexp,crypto/sha256,crypto/hmac,crypto/md5,bytes
Support Packages
6. The package fmt is supported via alias:
Supported fmt methods | Alias to use in Magnus script task |
fmt.Sprint | xSprint |
fmt.Sprintf | xSprintf |
fmt.Sprintln | xSprintln |
fmt.Fprint | xFprint |
fmt.Fprintf | xFprintf |
fmt.Fprintln | xFprintln |
7. Click on the icon to Open the Go script in Goliath:
Any used Parameters will also be carried over to Goliath.
Script Task with Lua
Return Value
1. Example of return value:
a. Script that returns a string:
b. Script that returns a number:
c. Script that returns a boolean:
d. Script that returns a record:
Custom Parameter Reference
2. Example of referencing a custom parameter var_record of record type within the script body:
JSON Parsing
3. For json parsing, Magnus provides a built-in Lua function: JSON.parse.
FAQ - Script Task with Lua
1. Can I return multiple results from Lua script?
Answer: Yes. In this case, the return parameter will be of type record. You can access the elements of the results by index starting with 1, not 0.
For example, you can access the element by var_my_record[1], var_my_record[2].
2. Can I return from the Lua script a table that is not integer index based? For example:
Answer: No, this is not supported. The task will not fail, but you will not be able to retrieve anything useful from the return parameter.
3. Can I return from the Lua script a multiple-dimensional table that is integer index based?
Answer: Yes, but the returned table will be flattened to one dimension, with string representation of each element.
For example, the return parameter <var_my_record> will contain 2 elements:
<var_my_record[1]> will contain the string "[apple,red]"
<var_my_record[2]> will contain the string "[kiwi, green]"
Limitation on memory usage
Magnus places memory usage limit on Script Task. If the Script Task's memory usage exceeds that of the limit, the Script Task will fail. The limits are:
- Lua: 64 MB
- JavaScript: 650 MB
- Go: 650 MB
Examples - JavaScript
Use Script Task and API Task to monitor BQ table metadata
a. Add an API call task to your workflow to get table metadata:
The Request URL: https://www.googleapis.com/bigquery/v2/projects/\[project\]/datasets/\[dataset\]/tables/\[tableId\]
b. Add a Script task to parse the json response:
Use Script Task and Loop Task to format query results
a. Add a BigQuery task to your workflow to query some data:
b. Add a Loop task to loop through the results and format each row:
To access the field of data from JavaScript, you can either access by array index or field name.
FAQ
Can I define a function and use if within the JavaScript?
Answer: Yes, for example:
Can I return multiple results from the JavaScript?
Answer: Yes, you can return the results in a form of an array.
Can I return an associative array from JavaScript?
For example, can I do this?
Answer: No, this is not supported by Magnus. You will get an error that says "Object not supported".
Can I return from the JavaScript a multiple-dimensional table that is integer index based?
For example, can I do this?
Answer: Yes, but the returned table will be flattened to one dimension, with string representation of each element.
The return record type parameter will contain 2 elements:
var_obj[1] will contain the string "[apple, red]"
var_obj[2] will contain the string "[lemon, yellow]"
Examples - Golang
Use Script Task and API Task to monitor BQ table metadata
a. Add an API call task to your workflow to get table metadata:
The Request URL: https://www.googleapis.com/bigquery/v2/projects/\[project\]/datasets/\[dataset\]/tables/\[tableId\]
b. Add a Script task to parse the json response:
Use Script Task and Loop Task to format query results
a. Add a BigQuery task to your workflow to query some data:
Add a Loop task to loop through the results and format each row:
To access the field of data from JavaScript, you can access by array index only, not be field name.
FAQ
Is there any golang tutorial?
Answer: https://tour.golang.org/list
What golang packages are supported?
Answer: reflect, encoding/json, errors, time, math, strconv, strings, regexp, crypto/sha256, crypto/hmac, crypto/md5, bytes, encoding/base64, encoding/xml. Reference: https://golang.org/pkg/
What about the fmt package?
Answer: The complete fmt package is not supported. But 3 methods are exposed for you to use through alias. They are:
Supported fmt methods | Alias to use in Magnus Script Task |
fmt.Sprint | xSprint |
fmt.Sprintf | xSprintf |
fmt.Sprintln | xSprintln |
fmt.Fprint | xFprint |
fmt.Fprintf | xFprintf |
fmt.Fprintln | xFprintln |
Advantages of golang over Javascript:
Programmatically golang is more efficient than JavaScript.
Some libraries are native to golang, for example, if you want to perform encryption or hashing, the libraries are readily available to you in golang.
Documentations are centralized in golang
Examples - Lua
Use Script Task to validate parameters
a. Add a script task that returns a boolean:
b. Validate parameters using Lua script:
c. Add Go-To task to check the results of the validations. If any parameter is invalid, fail the workflow:
Use Script Task and API Task to monitor BQ table metadata
(like last modified date of the table)
a. Add an API Call task to your workflow to get table metadata:
The Request URL: https://www.googleapis.com/bigquery/v2/projects/\[project\]/datasets/\[dataset\]/tables/\[tableId\]
For example: https://www.googleapis.com/bigquery/v2/projects/ord2-us-dc/datasets/dw_test/tables/clokTest
b. Add a Script task to parse the json response:
c. Parse the json response using Lua script:
d. Display the table metadata for debugging purpose:
Raise Failure:
numBytes=<var_parse_table_metadata_return[1]>;numRows=<var_parse_table_metadata_return[2]>;creationTime=<var_parse_table_metadata_return[3]>;lastModifiedTime=<var_parse_table_metadata_return[4]>;
e. Calculate how long it has been the BQ table was last modified:
Lua script:
f. Add an email task to report about this information:
Use Script Task to gather duration of specific tasks in a Loop Task
a. Say there is a BQ task inside the Loop task, and you want to gather the duration of the BQ task for each iteration of the loop.
b. Add a Script task right before the BQ task to get the starting timestamp.
c. Add a Script task after the BQ task to get duration of the BQ task:
d. Add a Script task before the end of the loop to append the durations gathered so far:
Lua script:
e. Add a BQ task to append these durations to a BQ table:
FAQ
Does the value of the Return Parameter of a Script task remember its value from last execution of the workflow?
Answer: Yes, only if the Return Parameter of the Script task is a custom parameter that you have created. If you create a custom parameter and set the Return Parameter of the Script task to this custom parameter, this custom dynamic parameter is now owned by the workflow and its value from the last execution of the workflow will be remembered.If the Return Parameter of the Script task is the default auto parameter that is created by Magnus, namely var_<script_task_id>_return, its last value will not be remembered.
Can I refer to custom or built-in parameters of the workflow from within the Lua script?
Answer: Yes, to refer to a custom or built-in parameter of the workflow from Lua script, use :var_parameter_name format, don't use the format <var_parameter_name>.
Can I define a function and use if within Lua script?
Answer: Yes, this is an example:
Can I return multiple results from the Lua script?
Answer: Yes. In this case, the Return Parameter will be of type record. You can access the elements of the results by index, starting with 1, not 0.
You can access the elements of the results by: var_s1_return[1] and var_s1_return[2]
Can I return from the Lua script table that is not integer index based?
For example, can I do this (workflow: clokTest_script_return)?
Answer: No, this is not supported by Magnus. The script task will not fail, but you will not be able to retrieve anything useful from the Return Parameter.
Can I return from the Lua script a multiple-dimensional table that is integer index based?
Answer: Yes, but the returned table will be flattened to one dimension, with string representation of each element.
For example(workflow: clokTest_script_return):
The return parameter will contain 2 elements:
var_return_table_6_return[1] will contain the string "[apple, lemon]"
var_return_table_6_return[1] will contain the string "[kiwi, berry]"