Overview
Sometimes a process needs to be run for each value in a list. For example, each Company may need to receive an email customized with the Company's name and latest billing total. Repeating the same steps by hand for each Company is time-consuming. One way to speed up delivery is to create a generic template process that is cycled through for every Company in the list, so that each email is processed and sent. Magnus provides the Loop Task to meet this need.
Magnus features the Loop Task, which iterates through a set of records and processes data or actions in additional tasks.
The Loop Task supports For Each looping over a BigQuery table.
Loop failures can be handled by ignoring the failure and exiting the loop, or by skipping to the next iteration.
Other tasks can be added to the Loop Task and are cycled through for each iteration.
Front Panel
Add a Loop Task
- Under Add Task, select the desired position from the "Add to position" drop down list, then click +Loop.
- Specify the Iteration Data.
Iterate Data
- Iteration Data must be a BigQuery table.
- Click the icon to open the BQ selector to select a BigQuery table for the Iteration Data.
- The BigQuery table can be specified manually.
It is of the format projectID:datasetID.tableID with optional project ID, for example:
myDataset.myResult
If project ID is not provided, the user's billing project will be used.
- Parameters can be used.
- The BigQuery table can be the output of a previous BigQuery task.
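The table reference format described above can be illustrated with a small sketch. This is not Magnus code: the function name `parse_table_ref`, the `TableRef` type, and the `billing_project` argument are hypothetical, and the sketch only assumes that a missing project ID falls back to the user's billing project as stated above.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    project_id: str
    dataset_id: str
    table_id: str


def parse_table_ref(ref: str, billing_project: str) -> TableRef:
    """Split 'projectID:datasetID.tableID' (project ID optional) into parts."""
    project, colon, rest = ref.partition(":")
    if not colon:
        # No colon means the project ID was omitted: use the billing project.
        project, rest = billing_project, ref
    dataset, dot, table = rest.partition(".")
    if not (project and dot and dataset and table):
        raise ValueError(f"not a valid table reference: {ref!r}")
    return TableRef(project, dataset, table)


# 'myDataset.myResult' has no project ID, so the billing project is used.
short_ref = parse_table_ref("myDataset.myResult", billing_project="my-billing")
full_ref = parse_table_ref("myProject:myDataset.myResult", billing_project="my-billing")
```

Here `my-billing` and `myProject` are placeholder project IDs chosen for the example.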
Iteration Parameter
- Specify the Iteration Parameter.
- As each record of the BigQuery table is looped through, the record is assigned to the Iteration Parameter.
- Select a custom parameter of record from the drop down list.
- Each field of the record can be accessed by index (1-based) or field name:
<var_record[1]> or <var_record[fieldName]>
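To make the two access styles concrete, here is a minimal sketch of how such placeholders could be resolved against one record. It is an illustration only, not how Magnus substitutes parameters internally; the `render` function and the sample record are hypothetical, and the record is assumed to keep its fields in table column order.

```python
import re
from typing import Mapping

_PLACEHOLDER = re.compile(r"<(\w+)\[(\w+)\]>")


def render(template: str, params: Mapping[str, Mapping[str, object]]) -> str:
    """Replace <param[1]> or <param[fieldName]> with values from a record."""

    def substitute(match: re.Match) -> str:
        record = params[match.group(1)]
        key = match.group(2)
        if key.isdigit():
            index = int(key)
            if index < 1:
                raise IndexError("field indexes are 1-based")
            # [1] is the first field, so shift to Python's 0-based indexing.
            return str(list(record.values())[index - 1])
        return str(record[key])

    return _PLACEHOLDER.sub(substitute, template)


# Hypothetical record for one iteration, with fields in column order.
record = {"company": "Acme", "total": 125.5}
message = render(
    "Hello <var_record[1]>, your total is <var_record[total]>.",
    {"var_record": record},
)
```

With this record, `<var_record[1]>` and `<var_record[company]>` would resolve to the same value.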
Handling Loop Failures
- Specify the behavior when an error occurs.
- Ignore Failure, Exit Loop: when this box is checked and an error occurs, the Loop Task will exit and continue to the next task that follows right after the Loop Task.
- Skip to the Next Iteration: when this box is checked and an error occurs, the Loop Task will skip to the next iteration.
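The difference between the failure options can be sketched as a plain loop. This is a simplified model, not the Magnus implementation: the `OnFailure` enum and `run_loop` function are hypothetical names, and the case where neither box is checked is assumed to fail the workflow.

```python
from enum import Enum
from typing import Callable, Iterable, List


class OnFailure(Enum):
    FAIL_WORKFLOW = "fail"    # assumption: neither box is checked
    EXIT_LOOP = "exit"        # "Ignore Failure, Exit Loop"
    SKIP_ITERATION = "skip"   # "Skip to the Next iteration"


def run_loop(
    records: Iterable[object],
    body: Callable[[object], None],
    on_failure: OnFailure,
) -> List[object]:
    """Run body for each record and return the records that completed."""
    completed = []
    for record in records:
        try:
            body(record)
        except Exception:
            if on_failure is OnFailure.EXIT_LOOP:
                break      # leave the loop, continue with the task after it
            if on_failure is OnFailure.SKIP_ITERATION:
                continue   # drop this iteration, move on to the next record
            raise          # failure propagates and fails the workflow
        completed.append(record)
    return completed


def fails_on_three(record: object) -> None:
    if record == 3:
        raise RuntimeError("task failed")


exited = run_loop([1, 2, 3, 4], fails_on_three, OnFailure.EXIT_LOOP)
skipped = run_loop([1, 2, 3, 4], fails_on_three, OnFailure.SKIP_ITERATION)
```

In this model, exiting stops before record 4 is ever processed, while skipping only drops record 3.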
Additional Tasks in Loop
- Add tasks to the Loop.
- Tasks of any type except the Loop Task can be added to the Loop Task. Nested Loop is not supported.
- Tasks can be added by using the "Add Task" footer within the Loop Task.
Additional Tasks Navigation
- Tasks can be added by moving tasks into the Loop Task.
- Move a task into the Loop Task from above.
For example, to move the API task "get_meta" into the Loop Task, simply click the icon to move the API task into the Loop Task.
- Move a task into the Loop Task from below.
For example, to move the API task "get_meta" into the Loop Task, simply click the icon to move the API task into the Loop Task.
- All tasks in the Loop Task are surrounded by a green box.
Limitations
- Nested Loop Tasks (a Loop inside another Loop) are not supported.
- Only "For Each" loop type is supported.
- The Loop Task supports up to 999 iterations.
The 1000th iteration and beyond will fail with the error "Iteration data returned too many records".
FAQ
- How do I break out of a Loop Task when a certain condition is met?
Answer: Add a Go-To Task to the Loop Task, and configure it to jump to a task outside of the Loop Task when the condition is met.
Overview
Async Mode can be enabled on a Loop Task. It allows the loop to run multiple iterations at the same time, rather than one at a time in sequential order. Concurrent Iterations specifies how many loop iterations can run in parallel. For example, when Async Mode is enabled and Concurrent Iterations is 5, up to 5 loop iterations run in parallel; whenever one iteration finishes, another is started, so that up to 5 keep running until the loop is completed.
Async Mode is set on the Flip Side of the Loop Task.
When a Loop Task is set to Async Mode, an icon on the task's Front Side indicates this.
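The effect of Concurrent Iterations can be illustrated with a bounded worker pool. This is only an analogy: Magnus runs async iterations in isolated, queued environments rather than threads, and `run_async_loop`, `iteration`, and `state` are hypothetical names used for this sketch.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_async_loop(
    records: Iterable[int],
    body: Callable[[int], int],
    concurrent_iterations: int = 5,
) -> List[int]:
    """Run body for every record with at most N iterations in flight."""
    with ThreadPoolExecutor(max_workers=concurrent_iterations) as pool:
        # A new iteration starts as soon as a worker becomes free.
        return list(pool.map(body, records))


# Track how many iterations run at the same moment.
state = {"running": 0, "peak": 0}
lock = threading.Lock()


def iteration(record: int) -> int:
    with lock:
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
    time.sleep(0.01)  # stand-in for the real work of one iteration
    with lock:
        state["running"] -= 1
    return record * 2


results = run_async_loop(range(12), iteration, concurrent_iterations=5)
```

After the run, `state["peak"]` never exceeds the configured limit of 5, even though 12 iterations were processed.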
Tasks
When a Loop Task is in Async Mode, there are some differences or considerations in how the tasks within the loop will be executed.
Go-To Task
A Go-To Task within the body of an async loop has some restrictions. Jumping to a later task within the loop is allowed, as is fail-workflow; however, jumping out of the loop and exit-workflow are not allowed. Jumping out of the loop, which happens when one of the Go-To targets is a task after the loop body, is not possible because other iterations can be in any state at that moment, and leaving the loop would lead to an undefined state. The same applies to exit-workflow: the workflow cannot exit while iterations are still executing.
In the above screenshot, because the loop is in Async Mode, the Go-To to within_task is valid, but the Go-To (else) to after_task is not valid.
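The Go-To restrictions can be expressed as a small validation sketch. It is not Magnus code: the function name, the `check_condition` task name, and the string labels for fail-workflow and exit-workflow are assumptions made for illustration, and backward jumps within the loop are treated as invalid here only because the text mentions later tasks.

```python
from typing import List, Sequence


def invalid_async_goto_targets(
    goto_task: str,
    targets: Sequence[str],
    loop_body: Sequence[str],
) -> List[str]:
    """Return the Go-To targets that an async loop would not accept."""
    position = loop_body.index(goto_task)
    later_tasks = set(loop_body[position + 1:])
    invalid = []
    for target in targets:
        if target == "fail-workflow" or target in later_tasks:
            continue  # allowed: fail the workflow or jump ahead in the loop
        invalid.append(target)  # exit-workflow or a task outside the loop
    return invalid


# Loop body with a Go-To named check_condition followed by within_task;
# after_task sits after the loop body.
body = ["check_condition", "within_task"]
rejected = invalid_async_goto_targets(
    "check_condition", ["within_task", "after_task"], body
)
```

With these inputs only `after_task` is reported, which matches the valid and invalid targets described above.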
Execute Workflow Task
If an async loop contains an Execute Workflow Task, it is subject to the rules of that task. In particular, a workflow that is not a shared workflow can only be executed one run at a time; an attempt to run it again at the same time will wait for the current run to finish and may eventually time out with an error. A shared workflow (users are listed in Shared With) can be run multiple times at the same time.
API Task
Please be considerate when calling APIs, especially in this case when iterations can run at the same time and thus call the API multiple times in rapid succession.
When calling Magnus Remote Workflow Execution (v1), the workflow can only be executed one at a time. Please see Remote Workflow Execution for more info.
When calling Magnus Queued Remote Workflow Execution v2, the workflow can be executed multiple times and run concurrently. Please see Queued Remote Workflow Execution v2 for more info.
All Tasks
Please be considerate when calling APIs and other services, especially in this case when iterations can run at the same time and thus call the API multiple times in rapid succession.
See the documentation for each task used and how it may be impacted by being run in parallel.
Task Failures
- “Ignore Failure, Exit Loop” is not supported in Async Mode because other iterations can be in any state when the failure happens, and jumping out of the loop would lead to an undefined state.
- “Skip to Next Iteration” on failure is allowed in Async Mode. It ignores the failure and exits that iteration only, allowing other iterations to proceed.
- Otherwise, failures are not ignored, and a failure will cause the workflow to fail.
Workflow Parameters
Parameters can be referenced by the async loop as usual; however, any parameter updates within the async loop are accessible only within that iteration and do not persist to later iterations or after the loop completes.
Parameters used by an async loop are subject to a size limit; please see the Limitations section.
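The parameter isolation described above behaves roughly as if each iteration received its own copy of the workflow parameters. The sketch below models that with `copy.deepcopy`; the function and parameter names are hypothetical, and the iterations run one after another here only to keep the example short.

```python
import copy
from typing import Callable, Dict, Iterable, List


def run_isolated_iterations(
    params: Dict[str, object],
    records: Iterable[int],
    body: Callable[[int, Dict[str, object]], None],
) -> List[Dict[str, object]]:
    """Run body once per record, each time on a private copy of params."""
    snapshots = []
    for record in records:
        iteration_params = copy.deepcopy(params)  # private to this iteration
        body(record, iteration_params)
        snapshots.append(iteration_params)
    return snapshots


def add_to_total(record: int, params: Dict[str, object]) -> None:
    params["total"] += record  # visible only inside this iteration


workflow_params = {"total": 0}
snapshots = run_isolated_iterations(workflow_params, [5, 7], add_to_total)
```

Each iteration starts from the original value, and `workflow_params` is unchanged after the loop, so a running total cannot be accumulated this way.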
Appearance of Asynchronous Iterations
Because the iterations of an async loop are not as straightforward as a regular loop and can run at the same time, they appear differently than regular loops.
In the workflow’s history details, each async loop iteration has a separate link to execution details:
Clicking on the folder icon next to iteration details will pull up the details for just that iteration. Use the Iteration drop down box to select a specific iteration.
When viewing a single iteration’s details, the top left of the box will have a link back to the main workflow.
The dashboard will show async loop iterations separately alongside the workflow.
When to use Async Loop or Regular Loop
Regular loops run iterations sequentially, one after the other. This is very fast for simple loops. When loops become more complex, they may benefit from async mode, however this has an associated cost.
Async loops have to be queued, scheduled and given a dedicated environment for execution so that the iterations cannot interfere with each other, and finally communicate information to and from the main workflow. Because of this, it can take anywhere from a few seconds to, under certain circumstances, several minutes to start executing an iteration. This means that if a loop is already executing its iterations fairly quickly, such as under 30 seconds each, using async mode might only make things slower.
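A back-of-the-envelope estimate shows why short iterations may not benefit. The model below is a rough assumption, not a description of the Magnus scheduler: it treats iterations as running in full waves and charges a fixed startup delay per wave. The 60-second startup value is hypothetical, picked from within the range mentioned above.

```python
import math


def sequential_seconds(iterations: int, iteration_seconds: float) -> float:
    """Regular loop: iterations run one after the other."""
    return iterations * iteration_seconds


def async_seconds(
    iterations: int,
    iteration_seconds: float,
    startup_seconds: float,
    concurrent_iterations: int,
) -> float:
    """Async loop, modeled as waves that each pay the startup delay."""
    waves = math.ceil(iterations / concurrent_iterations)
    return waves * (startup_seconds + iteration_seconds)


# 20 quick iterations of 10 s each: async is slower in this model.
quick_regular = sequential_seconds(20, 10)
quick_async = async_seconds(20, 10, startup_seconds=60, concurrent_iterations=5)

# 20 long iterations of 600 s each: async is much faster in this model.
long_regular = sequential_seconds(20, 600)
long_async = async_seconds(20, 600, startup_seconds=60, concurrent_iterations=5)
```

Under these assumptions the quick loop takes 200 seconds sequentially versus 280 seconds async, while the long loop takes 12000 seconds sequentially versus 2640 seconds async.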
Here is a cheat sheet for quickly considering if async mode is appropriate for a loop. Note that this is just a starting point and may not consider all scenarios or nuances.
| Operation | Consider Async Loop? |
| --- | --- |
| Loop iterations typically complete under 30 seconds | No |
| Loop iterations usually take over 30 seconds to complete, and often wait on external jobs, workflows or resources | Yes |
| Need to jump out of the loop early (goto) in some cases | No |
| Need to update parameters within the loop and read them after the loop | No |
| One iteration depends on another iteration being completed | No |
| All iterations are independent of each other and can run at the same time | Yes |
| Loop iterations run a shared workflow via Workflow Task | Yes |
| Loop iterations run a non-shared workflow via Workflow Task | No |
| Loop iterations run a workflow via Remote Execution v2 (queued) and wait via Hub | Yes |
Async Loop Use Case
There are various scenarios where an async loop may be advantageous. Here are a few examples:
- Run a shared workflow multiple times concurrently. You can run a shared workflow with the Execute Workflow Task in the async loop. This makes things easier, as you won’t need to use the Magnus remote API or Hub Task.
- Run several large jobs and wait for the results. If you want to start multiple instances of a job, or wait on external resources that usually take minutes, an async loop fits well, provided that doing several of these at the same time is appropriate.
Execution of Asynchronous Iterations
Iterations of an async loop are run in parallel: multiple iterations run at the same time and are isolated from each other. They are run using the same features as the Fair Scheduler and Queued Remote Workflow Execution v2. For this reason, async loops are subject to the same limitations; please see Queued Remote Workflow Execution v2 for more info. In summary, all such queued workflows and async loop iterations are subject to quotas and can fail if too many are scheduled at once.
Limitations
- Async loops cannot be resumed. If the workflow fails and an async loop has already started, it will not be eligible for being resumed. If you need to be able to resume a workflow upon failure, either use a regular loop, or use logic within the workflow to check for failures and explicitly handle them within the workflow.
- Parameters used by an async loop are subject to a size limit. If they exceed 5 MB combined, execution will fail.
- Async loops are subject to the load on Magnus. If there are a lot of workflows queued or async loops running, a workflow with an async loop can wait longer than usual to execute iterations.
- Async loop iterations must complete within a reasonable time. The maximum time is currently 6 hours.
- Concurrent Iterations is currently limited to 10, meaning that at most 10 iterations can run in parallel. Depending on the use case, a lower limit may be preferable so that too many operations don’t happen at the same time, such as hitting a service too frequently, which can result in Too Many Requests, other errors or denial of service.
- For an async loop, granting a refresh token is required if any nested task requires an access token, such as a BQ Task, GS-Export Task, FTP Task, etc.
- Changes to a nested task in an async loop are picked up immediately by the currently running workflow. This behavior is different from a sync loop.
- If the workflow is running and the user changes the async loop to a sync loop, the currently running workflow will get the error “Cannot run archived workflow”.