This article demonstrates how you can call Amazon web services (AWS) from Magnus workflows.
AWS like s3, ec2, iam require authenticating the requests using AWS Signature Version 4.
Magnus provides helper code in this respect.
AWS references
AWS signature version 4: http://docs.amazonaws.cn/en_us/AmazonS3/latest/API/sig-v4-header-based-auth.html
To obtain access and secret keys: http://docs.aws.amazon.com/general/latest/gr/getting-aws-sec-creds.html
To calculate signature key: http://docs.aws.amazon.com/general/latest/gr/signature-v4-examples.html#signature-v4-examples-python
Examples: http://docs.aws.amazon.com/general/latest/gr/sigv4-signed-request-examples.html
Other useful links:
http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
http://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html
Uploading data to Amazon S3 using single upload method
Reference: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT.html
Here is an example of uploading data to S3 in AWS request call.
The max data size limit is 10 MB using this method in Magnus.
1. Define the following custom parameters:
Parameter Name | Description |
var_bucket | The bucket where you want to load your data to |
var_region | The region of the bucket |
var_object | The name of the bucket object |
var_method | PUT |
var_service | s3 |
var_canonical_querystring | Leave this as empty |
var_access_key | Your access key (see reference) |
var_secret_key | Your access key (see reference) |
var_report | The content you want to upload |
var_payload_hash | The SHA256 hash of the report content |
var_params | This contains the parameters needed for the helper function |
var_authorization_header | The resulting authorization http header to be sent |
var_sys_double_eval_params | Set to true to indicate that this workflow requires double parameters substitution |
Generate the data to be uploaded:
2. Format the data into one string and store it in an Anonymous Destination Table:
3. Get payload has of the formatted report using BQ's Standard SQL function SHA256. Store the resulting has in var_payload_hash:
This Standard SQL query takes the formatted report from previous BQ Task and generates a SHA256 Hex string.
The query Text is:
CREATE TEMPORARY FUNCTION convertToHexString(x ARRAY<int64>)
RETURNS STRING
LANGUAGE js AS """
var result="";
for(var i=0; i < x.length; i++) {
//Make sure the hex string is padded with leading 0
var hex = "00" + Number(x[i]).toString(16);
result = result + hex.substr(-2);
}
return result;
""";
WITH t AS
(SELECT TO_CODE_POINTS(SHA256(data)) AS x
FROM `<var_format_report_output>`)
SELECT convertToHexString(x) AS hexString
FROM t;
4. Get the report out into the custom parameter var_report:
5. Initialize var_params, this contains inputs to the helper code to be used later.
amz_date is the UTC date in the format required by AWS
host is the host of the AWS
canonical_uri is the part of the AWS request URI from domain to querystring
canonical_headers are the headers to be sent to AWS
aws_auth_helper is the helper code from [projectId.datasetId.AWSHelper]. See helper code in AWS Auth Helper section below
The query text is:
SELECT
amz_date,
host,
'/<var_bucket>/<var_object>' as canonical_uri,
CONCAT(
'host:', host, '\n',
'x-amz-content-sha256:<var_payload_hash>\n',
'x-amz-date:', amz_date, '\n'
) as canonical_headers,
aws_auth_helper
FROM
(
SELECT
STRFTIME_UTC_USEC(NOW(), "%Y%m%dT%H%M%SZ") as amz_date,
CASE
WHEN '<var_region>' = 'us-east-1' THEN 's3.amazonaws.com'
ELSE 's3-<var_region>.amazonaws.com'
END AS HOST,
MIN(code) as aws_auth_helper
FROM [projectId.datasetId.AWSHelper]
)
See AWS Auth Code section below for help creating AWSHelper table.
6. Execute the helper code to get the authorization header; this requires var_sys_double_eval_params to be true because we are doing parameter substitution twice instead of just once.
These custom parameters are used as input fields to the helper code:
var_payload_hash
var_params[amz_date]
var_method
var_service
var_params[host]
var_region
var_canonical_querystring
var_params[canonical_uri]
var_params[canonical_headers]
var_access_key
var_secret_key
Store the returned authorization header in the parameter var_authorization_header.
This is what the helper code does:
- create a canonical request
- create the string to sign
- calculate the signature
- returns the authorization header
7. Call AWS to upload the data:
Request URL is https://<var_params[host]><var_params[canonical_uri]>?<var_canonical_querystring>
Request method is PUT.
Request body is <var_report>
In the advanced properties panel:
Make sure that "Include Google Authentication Token" is not checked.
Since we are calling AWS, we have to provide Amazon specific authentication information.
For this particular example, to upload content to s3, it requires these 3 headers:
x-amz-content-sha256: <var_payload_hash>
x-amz-date: <var_params[amz_date]>
Authorization: <var_authorization_header>
8. Clear report parameter for good practice to free up disk space
AWS Auth Helper
The following AWS Auth Helper Code is used in this How To. To use the Helper Code create a table and populate it with results from the below SELECT statement. In this How To the AWS AUTH Helper Code table is referred to as projectId.datasetId.AWSHelper.
SELECT
'get_authorization_header' AS name,
'SELECT\n authorization_header\nFROM\njs\n (\n //input table\n (\n select\n \'<var_payload_hash>\' as payload_hash,\n \'<var_params[amz_date]>\' as amz_date,\n \'<var_method>\' as method,\n \'<var_service>\' as service,\n \'<var_params[host]>\' as host,\n \'<var_region>\' as region,\n \'<var_canonical_querystring>\' as canonical_querystring,\n \'<var_params[canonical_uri]>\' as canonical_uri,\n \'<var_params[canonical_headers]>\' as canonical_headers,\n \'<var_access_key>\' as access_key,\n \'<var_secret_key>\' as secret_key\n ),\n\n //input columns\n payload_hash,\n amz_date,\n method,\n service,\n host,\n region,\n canonical_querystring,\n canonical_uri,\n canonical_headers,\n access_key,\n secret_key,\n\n //output schema\n \"[\n {name: \'authorization_header\', type:\'string\'}\n ]\",\n\n //function\n \"function(r, emit)\n {\n\n var CryptoJS=CryptoJS||function(h,s){var f={},g=f.lib={},q=function(){},m=g.Base={extend:function(a){q.prototype=this;var c=new q;a&&c.mixIn(a);c.hasOwnProperty(\'init\')||(c.init=function(){c.$super.init.apply(this,arguments)});c.init.prototype=c;c.$super=this;return c},create:function(){var a=this.extend();a.init.apply(a,arguments);return a},init:function(){},mixIn:function(a){for(var c in a)a.hasOwnProperty(c)&&(this[c]=a[c]);a.hasOwnProperty(\'toString\')&&(this.toString=a.toString)},clone:function(){return this.init.prototype.extend(this)}},\n r=g.WordArray=m.extend({init:function(a,c){a=this.words=a||[];this.sigBytes=c!=s?c:4*a.length},toString:function(a){return(a||k).stringify(this)},concat:function(a){var c=this.words,d=a.words,b=this.sigBytes;a=a.sigBytes;this.clamp();if(b%4)for(var e=0;e<a;e++)c[b+e>>>2]|=(\n d[e>>>2]>>>24-8*(e%4)&255)<<24-8*((b+e)%4);else if(65535<d.length)for(e=0;e<a;e+=4)c[b+e>>>2]=d[e>>>2];else c.push.apply(c,d);this.sigBytes+=a;return this},clamp:function(){var a=this.words,c=this.sigBytes;a[c>>>2]&=4294967295<<\n 32-8*(c%4);a.length=h.ceil(c/4)},clone:function(){var a=m.clone.call(this);a.words=this.words.slice(0);return a},random:function(a){for(var c=[],d=0;d<a;d+=4)c.push(4294967296*h.random()|0);return new r.init(c,a)}}),l=f.enc={},k=l.Hex={stringify:function(a){var c=a.words;a=a.sigBytes;for(var d=[],b=0;b<a;b++){var e=c[b>>>2]>>>24-8*(b%4)&255;d.push((e>>>4).toString(16));d.push((e&15).toString(16))}return d.join(\'\')},parse:function(a){for(var c=a.length,d=[],b=0;b<c;b+=2)d[b>>>3]|=parseInt(a.substr(b,\n 2),16)<<24-4*(b%8);return new r.init(d,c/2)}},n=l.Latin1={stringify:function(a){var c=a.words;a=a.sigBytes;for(var d=[],b=0;b<a;b++)d.push(String.fromCharCode(c[b>>>2]>>>24-8*(b%4)&255));return d.join(\'\')},parse:function(a){for(var c=a.length,d=[],b=0;b<c;b++)d[b>>>2]|=(a.charCodeAt(b)&255)<<24-8*(b%4);return new r.init(d,c)}},j=l.Utf8={stringify:function(a){try{return decodeURIComponent(escape(n.stringify(a)))}catch(c){throw Error(\'Malformed UTF-8 data\');}},parse:function(a){return n.parse(unescape(encodeURIComponent(a)))}},\n u=g.BufferedBlockAlgorithm=m.extend({reset:function(){this._data=new r.init;this._nDataBytes=0},_append:function(a){\'string\'==typeof a&&(a=j.parse(a));this._data.concat(a);this._nDataBytes+=a.sigBytes},_process:function(a){var c=this._data,d=c.words,b=c.sigBytes,e=this.blockSize,f=b/(4*e),f=a?h.ceil(f):h.max((f|0)-this._minBufferSize,0);a=f*e;b=h.min(4*a,b);if(a){for(var g=0;g<a;g+=e)this._doProcessBlock(d,g);g=d.splice(0,a);c.sigBytes-=b}return new r.init(g,b)},clone:function(){var a=m.clone.call(this);\n a._data=this._data.clone();return a},_minBufferSize:0});g.Hasher=u.extend({cfg:m.extend(),init:function(a){this.cfg=this.cfg.extend(a);this.reset()},reset:function(){u.reset.call(this);this._doReset()},update:function(a){this._append(a);this._process();return this},finalize:function(a){a&&this._append(a);return this._doFinalize()},blockSize:16,_createHelper:function(a){return function(c,d){return(new a.init(d)).finalize(c)}},_createHmacHelper:function(a){return function(c,d){return(new t.HMAC.init(a,\n d)).finalize(c)}}});var t=f.algo={};return f}(Math);\n (function(h){for(var s=CryptoJS,f=s.lib,g=f.WordArray,q=f.Hasher,f=s.algo,m=[],r=[],l=function(a){return 4294967296*(a-(a|0))|0},k=2,n=0;64>n;){var j;a:{j=k;for(var u=h.sqrt(j),t=2;t<=u;t++)if(!(j%t)){j=!1;break a}j=!0}j&&(8>n&&(m[n]=l(h.pow(k,0.5))),r[n]=l(h.pow(k,1/3)),n++);k++}var a=[],f=f.SHA256=q.extend({_doReset:function(){this._hash=new g.init(m.slice(0))},_doProcessBlock:function(c,d){for(var b=this._hash.words,e=b[0],f=b[1],g=b[2],j=b[3],h=b[4],m=b[5],n=b[6],q=b[7],p=0;64>p;p++){if(16>p)a[p]=\n c[d+p]|0;else{var k=a[p-15],l=a[p-2];a[p]=((k<<25|k>>>7)^(k<<14|k>>>18)^k>>>3)+a[p-7]+((l<<15|l>>>17)^(l<<13|l>>>19)^l>>>10)+a[p-16]}k=q+((h<<26|h>>>6)^(h<<21|h>>>11)^(h<<7|h>>>25))+(h&m^~h&n)+r[p]+a[p];l=((e<<30|e>>>2)^(e<<19|e>>>13)^(e<<10|e>>>22))+(e&f^e&g^f&g);q=n;n=m;m=h;h=j+k|0;j=g;g=f;f=e;e=k+l|0}b[0]=b[0]+e|0;b[1]=b[1]+f|0;b[2]=b[2]+g|0;b[3]=b[3]+j|0;b[4]=b[4]+h|0;b[5]=b[5]+m|0;b[6]=b[6]+n|0;b[7]=b[7]+q|0},_doFinalize:function(){var a=this._data,d=a.words,b=8*this._nDataBytes,e=8*a.sigBytes; d[e>>>5]|=128<<24-e%32;d[(e+64>>>9<<4)+14]=h.floor(b/4294967296);d[(e+64>>>9<<4)+15]=b;a.sigBytes=4*d.length;this._process();return this._hash},clone:function(){var a=q.clone.call(this);a._hash=this._hash.clone();return a}});s.SHA256=q._createHelper(f);s.HmacSHA256=q._createHmacHelper(f)})(Math);\n (function(){var h=CryptoJS,s=h.enc.Utf8;h.algo.HMAC=h.lib.Base.extend({init:function(f,g){f=this._hasher=new f.init;\'string\'==typeof g&&(g=s.parse(g));var h=f.blockSize,m=4*h;g.sigBytes>m&&(g=f.finalize(g));g.clamp();for(var r=this._oKey=g.clone(),l=this._iKey=g.clone(),k=r.words,n=l.words,j=0;j<h;j++)k[j]^=1549556828,n[j]^=909522486;r.sigBytes=l.sigBytes=m;this.reset()},reset:function(){var f=this._hasher;f.reset();f.update(this._iKey)},update:function(f){this._hasher.update(f);return this},finalize:function(f){var g=\n this._hasher;f=g.finalize(f);g.reset();return g.finalize(this._oKey.clone().concat(f))}})})();\n\n\t var getSignatureKey = function(key, dateStamp, regionName, serviceName) \n\t {\n\t\t var kDate = CryptoJS.HmacSHA256(dateStamp, \'AWS4\' + key, { asBytes: true});\n\t\t var kRegion = CryptoJS.HmacSHA256(regionName, kDate, { asBytes: true });\n\t\t var kService = CryptoJS.HmacSHA256(serviceName, kRegion, { asBytes: true });\n\t\t var kSigning = CryptoJS.HmacSHA256(\'aws4_request\', kService, { asBytes: true });\n\n\t\t return kSigning;\n\t };\n\n\t var hash256 = function(x)\n\t {\n\t\t var datahash = CryptoJS.SHA256(x);\n\t\t return datahash.toString(CryptoJS.enc.Hex);\n\t }; \n\n\t var extractHeaderNames = function(x)\n\t {\n\t\t var headerNames = \'\\\';\n\t\t var parts = x.split(\'\\n\');\n\t\t for(var pi = 0; pi < parts.length; pi++)\n\t\t {\n\t\t var subparts = parts[pi].split(\':\');\n\t\t\t if(subparts.length == 2 && subparts[0].length > 0)\n\t\t\t {\n\t\t\t if(headerNames.length > 0)\n\t\t\t\t {\n\t\t\t\t\t headerNames += \';\'\\n\t\t\t\t }\n\n\t\t\t\t headerNames += subparts[0];\n\t\t\t\t }\n\t\t\t } \n\t\t\t return headerNames;\n\t\t };\n\n\t\t var datestamp = r.amz_date.substring(0, 8);\n\n\t\t var signed_headers = extractHeaderNames(r.canonical_headers);\n\n\t\t var canonical_request = r.method + \'\\n\' +\n\t\t\t r.canonical_uri + \'\\n\' + r.canonical_querystring + \'\\n\' +\n\t\t\t r.canonical_headers + \'\\n\' + signed_headers + \'\\n\' + r.payload_hash;\n\t\t\t\n\t\t var algorithm = \'AWS4-HMAC-SHA256\';\n\n\t\t var credential_scope = datestamp + \'/\' + r.region + \'/\' + r.service + \'/\' + \'aws4_request\';\n\n\t\t var string_to_sign = algorithm + \'\\n\' + r.amz_date + \'\\n\' + credential_scope + \'\\n\' +\n\t\t\t hash256(canonical_request); var signing_key = getSignatureKey(r.secret_key, datestamp, r.region, r.service);\n\n\t\t var signature = CryptoJS.HmacSHA256(string_to_sign, signing_key, { asBytes: true });\n\n\t\tvar authorization_header = algorithm + \' \' +\n\t\t\t\t \'Credential=\' + r.access_key + \'/\' + credential_scope + \', \' +\n\t\t\t\t \'SignedHeaders=\' + signed_headers + \', \' + \'Signature=\' + signature.toString(CryptoJS.enc.Hex);\n\t\t emit({\n\t\t\t\t \'authorization_header\': authorization_header});\n\t }\"\n )' AS code