Overview
Google BigQuery data is customarily displayed as query results. Sometimes the data needs to be saved to an external location for sharing, storage, or analysis, and an FTP server is a popular destination. Magnus provides the FTP Export Task to satisfy this need.
Magnus features the FTP Export Task to transfer data to an FTP location.
The FTP Export Task accepts a data location and exports the data in a variety of formats, compressions, and delimiters to a destination FTP URI, supporting the SFTP, FTP, and FTPS protocols.
Front Panel
Flip Panel
Add an FTP Task
1. Under Add Task, select the desired position from the "Add to position" drop-down list, then click +FTP.
2. Specify a Source BQ Table.
Source BigQuery Table Options
a. Click the icon to open the BQ selector to select a BigQuery table as the source.
b. The source table can also be specified manually.
- It is of the format projectID:datasetID.tableID, where the project ID is optional.
- If the project ID is not provided (for example, myDataset.myResult), the user's billing project will be used.
c. Parameters can be used.
d. User must have read permission to the table.
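As a sketch, the optional-project rule above can be expressed as follows (the helper name and regex are illustrative, not Magnus internals):

```python
import re

def parse_table_spec(spec, billing_project):
    """Split a 'projectID:datasetID.tableID' spec; the project part is optional.

    Falls back to the user's billing project when no project is given,
    mirroring the behaviour described above. Illustrative only.
    """
    match = re.fullmatch(r"(?:([\w-]+):)?([\w$]+)\.([\w$]+)", spec)
    if not match:
        raise ValueError(f"not a valid table spec: {spec!r}")
    project, dataset, table = match.groups()
    return (project or billing_project, dataset, table)
```

For example, `parse_table_spec("myDataset.myResult", "myBillingProject")` resolves the project to `myBillingProject`, while a spec with an explicit project keeps it.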
3. Specify the Destination FTP URI.
a. Parameters can be used.
Processing Location
4. Specify the Processing Location
a. Behind the scenes, Magnus first submits a BQ job that extracts the BQ table to a temporary location in Google Cloud Storage; the file is then downloaded from Google Cloud Storage and sent to the destination FTP location. Select the location where the BQ extract job will run. BQ jobs that run in a specific location may only reference data in that location. For data in either the US or EU, you may choose auto-select to run the BQ job in the location where the data resides. For data in other locations, you must specify the processing location explicitly.
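The auto-select rule can be sketched as a small helper (a hypothetical function, not Magnus's actual implementation):

```python
def resolve_processing_location(data_location, selected):
    """Pick where the BQ extract job runs.

    'auto-select' is only valid for data in the US or EU; any other
    data location must be named explicitly, per the rule above.
    """
    if selected == "auto-select":
        if data_location in ("US", "EU"):
            return data_location  # run the job where the data resides
        raise ValueError(
            f"data in {data_location} requires an explicit processing location")
    return selected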
Export Options
5. Specify the Export Format.
a. Click on the link Export Format.
b. These export formats are supported:
i. CSV
ii. JSON
iii. Avro
c. For CSV export format:
i. User has the option to compress the data to GZIP format
ii. User can specify the field delimiter
iii. User has the option to include header
d. For JSON export format:
i. User has the option to compress the data to GZIP format
ii. Field delimiter does not apply
iii. Header does not apply
e. For Avro export format:
i. User has the option to compress the data to DEFLATE or SNAPPY format
ii. Field delimiter does not apply
iii. Header does not apply
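The compression and delimiter rules above can be summarized as a small validation sketch (the table and function names are illustrative, not part of Magnus):

```python
# Supported compression and delimiter applicability per export format,
# per the list above (illustrative, not Magnus internals).
SUPPORTED = {
    "CSV":  {"compressions": {"NONE", "GZIP"}, "delimiter": True},
    "JSON": {"compressions": {"NONE", "GZIP"}, "delimiter": False},
    "Avro": {"compressions": {"NONE", "DEFLATE", "SNAPPY"}, "delimiter": False},
}

def check_export_options(fmt, compression="NONE", delimiter=None):
    """Raise if the chosen compression/delimiter combination is invalid."""
    opts = SUPPORTED[fmt]
    if compression not in opts["compressions"]:
        raise ValueError(f"{compression} compression is not supported for {fmt}")
    if delimiter is not None and not opts["delimiter"]:
        raise ValueError(f"a field delimiter does not apply to {fmt}")
```

For instance, `check_export_options("Avro", "GZIP")` would be rejected, since Avro supports only DEFLATE or SNAPPY compression.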
FTP Type and Credentials
6. Provide FTP credentials.
a. Click on the link FTP Type and Credentials.
b. Specify the FTP type
c. Specify the Username and Password. Parameters can be used, though this is not recommended since the password would then be displayed as plain text.
d. Note: if the password contains a reference to a parameter and that parameter exists in the workflow, parameter substitution will apply. For example, if the password is p<var_n>wd and var_n happens to be a parameter, <var_n> will be substituted with the parameter's value.
e. The password cannot be longer than 245 characters. Because the password is stored encrypted, it cannot be edited; to change an existing password, a new password must be provided to replace it.
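The parameter substitution described in the note above might behave roughly like this (a hypothetical helper, assuming angle-bracket-style references; unknown names are left untouched):

```python
import re

def substitute_params(value, params):
    """Replace <name> references with parameter values when the parameter
    exists in the workflow; other text is left as-is. Illustrative only."""
    def repl(match):
        name = match.group(1)
        return str(params[name]) if name in params else match.group(0)
    return re.sub(r"<(\w+)>", repl, value)
```

With a parameter var_n set to "ass", the password `p<var_n>wd` would become `passwd`; without such a parameter, the text is unchanged.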
Internals
Behind the scenes, Magnus first submits a BQ job that extracts the BQ table to a temporary location in Google Cloud Storage.
The file is then downloaded from Google Cloud Storage and sent to the destination FTP location.
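The two-step pipeline can be sketched with stubbed I/O; the injected functions stand in for the BigQuery extract job, the GCS download, and the FTP upload (none of this is Magnus source code):

```python
def export_to_ftp(table, ftp_uri, extract_job, download, upload):
    """Two-step export: BQ table -> temporary GCS file -> FTP destination.

    extract_job, download, and upload are injected stand-ins for the
    BigQuery extract job, the GCS download, and the FTP upload.
    """
    gcs_uri = extract_job(table)   # step 1: extract the table to a temp GCS location
    payload = download(gcs_uri)    # step 2a: pull the file from GCS
    upload(ftp_uri, payload)       # step 2b: push it to the FTP destination
    return gcs_uri
```

Injecting the three operations keeps the sketch runnable without cloud credentials and makes the ordering of the two steps explicit.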
Limitations
- The FTP Export Task is limited to BigQuery tables of no more than 1 GB in size.
- The maximum file size that can be exported to the FTP destination is 32 MB.