In many use cases, a process needs to execute multiple tasks. We build microservices or serverless functions, such as Amazon Web Services (AWS) Lambda functions, to carry out these tasks. Most of these services are stateless, so we need queues or databases to maintain the state of the individual tasks and of the overall process. Writing code that orchestrates these tasks can be painful, hard to debug, and hard to maintain.
AWS Step Functions is a web service that enables you to coordinate the components of distributed applications and microservices using visual workflows. It was announced at re:Invent 2016 and has been around ever since. Step Functions:
1. Enables us to define visual workflows for processes by writing minimal or no code
2. Scales out automatically
3. Deals with failures and timeouts of tasks
4. Maintains an auditable log of all state changes
Step Functions is built around the concepts of tasks, states, and state machines. The tasks do all the work. A task can be a Lambda function, an activity fulfilled by other AWS-hosted resources, or even an activity running on our own machines.
A state machine is the workflow itself and the central component of Step Functions. State machines are defined as JavaScript Object Notation (JSON) text written in the syntax of the Amazon States Language.
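To give a feel for the syntax, here is a minimal, self-contained definition with a single Pass state. It's a toy example, not part of the workflow we build below; it only echoes a fixed result, but it shows the overall shape of a definition:
{
  "Comment": "Minimal state machine: one Pass state that returns a fixed result.",
  "StartAt": "HelloWorld",
  "States": {
    "HelloWorld": {
      "Type": "Pass",
      "Result": "Hello, Step Functions!",
      "End": true
    }
  }
}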
The states are the building blocks that represent individual actions. Here's the list of states available in the States Language (a short sketch combining a few of them follows the list):
1. Pass: does nothing; mainly used for debugging or as a placeholder state
2. Task: executes some code or runs an activity
3. Choice: adds branching logic to the state machine (like if/else)
4. Wait: waits for a specified time before moving on to the next state
5. Succeed: terminates the execution with a success
6. Fail: terminates the execution with a failure
7. Parallel: adds parallel branches to the state machine
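As a quick sketch of how states chain together, the fragment below combines Choice, Wait, and Succeed to check a hypothetical $.status field and wait 30 seconds while it is still PENDING. The state names and the status field are assumptions for illustration only; in a real polling loop a Task state would refresh $.status before re-entering the Choice.
{
  "StartAt": "CheckStatus",
  "States": {
    "CheckStatus": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.status",
          "StringEquals": "PENDING",
          "Next": "WaitAndRecheck"
        }
      ],
      "Default": "Done"
    },
    "WaitAndRecheck": {
      "Type": "Wait",
      "Seconds": 30,
      "Next": "CheckStatus"
    },
    "Done": {
      "Type": "Succeed"
    }
  }
}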
You can use the AWS Step Functions console to start building simple flows (state machines), and pre-configured sample projects are available to learn from. We'll go through an example that extracts Exchangeable Image File (EXIF) metadata from an image, resizes it, and saves the details. We'll assume the Lambda functions are already deployed; we'll use Step Functions to orchestrate them to process the images.
Goal: Extract metadata from the images, resize each image to a medium size and a thumbnail, and upload the details to a database.
We have two Lambda functions already set up:
1. get-exif-lambda: downloads the image from S3 and extracts its EXIF data
Input: S3 image URI
Output: image metadata
2. resize-image-lambda: downloads the image from S3 and resizes it to the desired size
Input: S3 image URI, desired size of the output image
Output: S3 URIs of the processed images
A state machine can be created with the AWS Command Line Interface (CLI) or the AWS Step Functions console. The console is easier because it shows a visual representation of the state machine, and it also makes it simple to set up the required Identity and Access Management (IAM) role that grants the state machine permission to the AWS resources it needs.
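For reference, creating the state machine from the CLI looks roughly like this; the name, the definition file, and the role ARN below are placeholders you would substitute with your own values:
aws stepfunctions create-state-machine \
    --name image-processing \
    --definition file://image-processing.json \
    --role-arn arn:aws:iam::123456789012:role/StepFunctionsImageRole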
Open the Step Functions console and create a new state machine with the following JSON.
{
  "StartAt": "GetExif",
  "States": {
    "GetExif": {
      "Type": "Task",
      "Resource": "<get-exif-lambda arn>",
      "Next": "ResizeImage"
    },
    "ResizeImage": {
      "Type": "Parallel",
      "Next": "WriteToDb",
      "ResultPath": "$.resizedLinks",
      "Branches": [
        {
          "StartAt": "MediumSize",
          "States": {
            "MediumSize": {
              "Type": "Task",
              "Resource": "<resize-image-lambda arn>",
              "Parameters": {
                "thumbnail": false,
                "source.$": "$.source",
                "maxHeight": 600,
                "maxWidth": 600
              },
              "End": true
            }
          }
        },
        {
          "StartAt": "Thumbnail",
          "States": {
            "Thumbnail": {
              "Type": "Task",
              "Resource": "<resize-image-lambda arn>",
              "Parameters": {
                "thumbnail": true,
                "source.$": "$.source",
                "maxHeight": 128,
                "maxWidth": 128
              },
              "End": true
            }
          }
        }
      ]
    },
    "WriteToDb": {
      "Type": "Task",
      "Resource": "arn:aws:states:::dynamodb:putItem",
      "Parameters": {
        "TableName": "image-details",
        "Item": {
          "key": {
            "S.$": "$.source.key"
          },
          "exif": {
            "S.$": "$.exif"
          },
          "mediumURL": {
            "S.$": "$.resizedLinks[0]"
          },
          "thumbnailURL": {
            "S.$": "$.resizedLinks[1]"
          }
        }
      },
      "End": true
    }
  }
}
In the next step, provide an IAM role for the state machine. Select “Create an IAM role,” give it a valid role name, and proceed. This creates a new IAM role that grants the state machine access only to the resources it uses. The created state machine will look like this:
The flow starts at the GetExif state, which executes a Lambda function to retrieve metadata from the image.
Then it executes two image conversion tasks, MediumSize and Thumbnail, in parallel.
This data is passed to WriteToDb, which writes the output to DynamoDB.
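To try the workflow out, we can start an execution from the CLI. The exact input shape depends on what the Lambda functions expect; assuming the source object carries the S3 bucket and key, a hypothetical invocation could look like this:
aws stepfunctions start-execution \
    --state-machine-arn <state machine arn> \
    --input '{"source": {"bucket": "my-images-bucket", "key": "photos/cat.jpg"}}'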
This is a simple example. It's very easy to design more complex workflows with other states available in the States Language.
We can also add error handling and retry functionality to the states.
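For example, a Task state can declare Retry and Catch fields. The sketch below retries the GetExif task on failure with exponential backoff and, once the retries are exhausted, routes to a hypothetical HandleFailure state that is not part of the machine above:
"GetExif": {
  "Type": "Task",
  "Resource": "<get-exif-lambda arn>",
  "Retry": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 2.0
    }
  ],
  "Catch": [
    {
      "ErrorEquals": ["States.ALL"],
      "Next": "HandleFailure"
    }
  ],
  "Next": "ResizeImage"
}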
The States Language also lets us manipulate and control the JSON data that flows between states. The following fields control how data moves from one state to the next:
1. InputPath
2. OutputPath
3. ResultPath
4. Parameters
These fields are applied to the JSON data in sequence: InputPath and Parameters shape the input before the task runs, and ResultPath and OutputPath shape the output after it returns.
Let's assume the state receives the following input.
{
  "message": {
    "title": "Msg Title",
    "content": "Hello World!"
  },
  "timestamp": 12312432
}
We can add InputPath and Parameters to the state.
"InputPath": "$.message",
"Parameters": {
"messageType": "text",
"messageTitle.$": "$.title",
"messageContent.$": "$.content"
}
This will give us the following as input to the worker.
{
  "messageType": "text",
  "messageTitle": "Msg Title",
  "messageContent": "Hello World!"
}
Assume the worker returns the following output for the input in the previous example.
"HELLO WORLD!"
We can add ResultPath to merge the worker's output into the state's input. Note that ResultPath is applied to the raw state input (the JSON the state originally received), not to the filtered input produced by InputPath and Parameters.
"ResultPath": "$.taskOutput"
This inserts the result of the worker into the original input:
{
  "message": {
    "title": "Msg Title",
    "content": "Hello World!"
  },
  "timestamp": 12312432,
  "taskOutput": "HELLO WORLD!"
}
The OutputPath field filters the data sent to the next state. In this case, "OutputPath": "$.taskOutput" will send "HELLO WORLD!" as input to the next state.
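Putting the four fields together, a complete Task state using them might look like the sketch below; the FormatMessage and NextState names and the Lambda ARN are placeholders for illustration. With this configuration, the next state receives only the worker's result ("HELLO WORLD!" in the example above).
"FormatMessage": {
  "Type": "Task",
  "Resource": "<format-message-lambda arn>",
  "InputPath": "$.message",
  "Parameters": {
    "messageType": "text",
    "messageTitle.$": "$.title",
    "messageContent.$": "$.content"
  },
  "ResultPath": "$.taskOutput",
  "OutputPath": "$.taskOutput",
  "Next": "NextState"
}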
Read the Input and Output Processing documentation for more details on this.
We can run tests, debug, and monitor Step Functions executions on the Step Functions console. If you are using Lambda functions to run tasks, logs will be delivered to the Lambda’s CloudWatch log group as usual.
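The execution history can also be pulled from the CLI if you prefer working outside the console; the execution ARN below is a placeholder:
aws stepfunctions get-execution-history --execution-arn <execution arn>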
It also integrates with CloudWatch metrics so you can monitor failures in production; the metrics are available under CloudWatch > Metrics > States.
Step Functions is an easy-to-use service for orchestrating backend workflows. Very complex flows can be designed easily with the States Language. It maintains the state of all the tasks, orchestrates them to run when needed, and scales automatically. Step Functions also plugs easily into existing architectures. Since an execution can run for up to one year, it can also be used for long-running workflows with activity workers.
Step Functions is probably the coolest console among all the AWS services.
A version of this blog was published earlier on Medium.