Overview

Q: What is AWS Step Functions?

AWS Step Functions is a fully managed service that makes it easier to coordinate the components of distributed applications and microservices using visual workflows. Building applications from individual components that each perform a discrete function helps you scale more easily and change applications more quickly.

Step Functions is a reliable way to coordinate components and step through the functions of your application. Step Functions provides a graphical console to arrange and visualize the components of your application as a series of steps. This makes it easier to build and run multi-step applications.

Step Functions automatically triggers and tracks each step and retries when there are errors, so your application executes in order and as expected. Step Functions logs the state of each step, so when things do go wrong, you can diagnose and debug problems more quickly. You can change and add steps without even writing code, so you can more easily evolve your application and innovate faster.

Q: What are the benefits of designing my application using orchestration?

Breaking an application into service components (or steps) helps ensure that the failure of one component does not bring the whole system down. Each component scales independently and can be updated without redeploying the entire system after each change.

Coordinating service components involves managing execution dependencies, scheduling, and concurrency in accordance with the logical flow of the application. In such an application, you can use service orchestration to do this and to handle failures.
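As a rough illustration of failure handling in an orchestrated step, the sketch below builds a single Task state in Amazon States Language as a Python dictionary. The Lambda ARN and the fallback state name NotifyFailure are hypothetical, and the helper retry_delays only approximates the wait times implied by the retry settings (it ignores any jitter or maximum delay settings).

```python
import json

# Hypothetical Lambda ARN, used only for illustration.
CHARGE_FN_ARN = "arn:aws:lambda:us-east-1:123456789012:function:ChargeCard"


def build_resilient_task(resource_arn, fallback_state):
    """Return a Task state that retries transient errors, then falls back."""
    return {
        "Type": "Task",
        "Resource": resource_arn,
        "TimeoutSeconds": 30,
        "Retry": [
            {
                "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }
        ],
        # Once retries are exhausted, route any error to a fallback state
        # and keep the error details under $.error.
        "Catch": [
            {
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.error",
                "Next": fallback_state,
            }
        ],
        "End": True,
    }


def retry_delays(retrier):
    """Approximate seconds waited before each retry: interval * rate**n."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


state = build_resilient_task(CHARGE_FN_ARN, "NotifyFailure")
print(json.dumps(state, indent=2))
print(retry_delays(state["Retry"][0]))
```

The state is only a fragment; in a full definition, NotifyFailure would have to exist as another state in the same state machine.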

Q: What are some common Step Functions use cases?

Step Functions helps with any computational problem or business process that can be subdivided into a series of steps. It’s also useful for creating end-to-end workflows to manage jobs with interdependencies. Common use cases include:

  • Data processing: consolidate data from multiple databases into unified reports, refine and reduce large data sets into useful formats, iterate and process millions of files in an Amazon Simple Storage Service (S3) bucket with high concurrency workflows, or coordinate multi-step analytics and machine learning workflows
  • Building serverless generative AI applications: leverage Step Functions for orchestrating interactions with Amazon Bedrock’s Foundation Models, prompt chaining, fine-tuning, and enriching with capabilities from over 220 AWS services
  • DevOps and IT automation: build tools for continuous integration and continuous deployment, or create event-driven applications that automatically respond to changes in infrastructure
  • E-commerce: automate mission-critical business processes, such as order fulfillment and inventory tracking
  • Web applications: implement robust user registration processes and sign-on authentication

For more details, explore AWS Step Functions use cases and customer testimonials.

Q: How does AWS Step Functions work?

When you use Step Functions, you define state machines that describe your workflow as a series of steps, their relationships, and their inputs and outputs. State machines contain a number of states, each of which represents a step in a workflow diagram.

States can perform work, make choices, pass parameters, initiate parallel execution, manage timeouts, or terminate your workflow with a success or failure.

The visual console automatically graphs each state in the order of execution, making it easier to design multi-step applications. The console highlights the real-time status of each step and provides a detailed history of every execution.

For more information, see How Step Functions Works in the Step Functions developer guide.
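To make the idea of states concrete, here is a minimal sketch of a state machine definition written as a Python dictionary and serialized to JSON. The workflow, the Lambda ARN, and the quantity field are assumptions made up for this example; the helper referenced_states is a small local sanity check, not part of any AWS API.

```python
import json

# Hypothetical Lambda ARN; the function is assumed to return {"quantity": <int>}.
CHECK_STOCK_ARN = "arn:aws:lambda:us-east-1:123456789012:function:CheckStock"

definition = {
    "Comment": "Illustrative order workflow",
    "StartAt": "CheckStock",
    "States": {
        # Perform work.
        "CheckStock": {"Type": "Task", "Resource": CHECK_STOCK_ARN, "Next": "InStock?"},
        # Make a choice based on the task output.
        "InStock?": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.quantity", "NumericGreaterThan": 0, "Next": "WaitForPacking"}
            ],
            "Default": "OutOfStock",
        },
        # Pause for a fixed time.
        "WaitForPacking": {"Type": "Wait", "Seconds": 5, "Next": "Done"},
        # Terminate with success or failure.
        "Done": {"Type": "Succeed"},
        "OutOfStock": {"Type": "Fail", "Error": "OutOfStock", "Cause": "No items left"},
    },
}


def referenced_states(defn):
    """Collect every state name that StartAt or a transition points to."""
    targets = {defn["StartAt"]}
    for state in defn["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets


# Names that are referenced but not defined; empty for a consistent definition.
missing = referenced_states(definition) - set(definition["States"])
print(json.dumps(definition, indent=2))
print("undefined states:", sorted(missing))
```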

Q: How does Step Functions connect to my resources?

You can orchestrate any AWS service using service integrations or any self-managed application component using Activity Tasks.

Service integrations help you construct calls to AWS services and include the response in your workflow. AWS SDK service integrations help you invoke one of over 9,000 AWS API actions from over 200 services directly from your workflow.

Optimized service integrations further simplify the use of common services such as AWS Lambda, Amazon Elastic Container Service (ECS), AWS Glue, or Amazon EMR, with capabilities including IAM policy generation and the Run a Job pattern, which automatically waits for asynchronous jobs to complete.
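The two kinds of service integration differ mainly in the Resource string of the Task state. The sketch below shows one Task of each kind as Python dictionaries; the state names, the Glue job name, and the choice of EC2 DescribeInstances as the SDK action are illustrative assumptions.

```python
import json


def sdk_task(service, action, parameters, next_state):
    """Task state that calls an AWS API action through an AWS SDK integration."""
    return {
        "Type": "Task",
        "Resource": f"arn:aws:states:::aws-sdk:{service}:{action}",
        "Parameters": parameters,
        "Next": next_state,
    }


def glue_run_a_job_task(job_name):
    """Task state using the optimized Glue integration.

    The .sync suffix asks Step Functions to wait for the job run to finish
    before moving to the next state.
    """
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::glue:startJobRun.sync",
        "Parameters": {"JobName": job_name},
        "End": True,
    }


definition = {
    "StartAt": "ListInstances",
    "States": {
        "ListInstances": sdk_task("ec2", "describeInstances", {}, "RunEtlJob"),
        "RunEtlJob": glue_run_a_job_task("example-etl-job"),  # hypothetical job name
    },
}
print(json.dumps(definition, indent=2))
```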

Activity Tasks integrate with activity workers that you run in a location of your choice, including in Amazon Elastic Compute Cloud (EC2), in Amazon ECS, on a mobile device, or on an on-premises server. The activity worker polls Step Functions for work, takes any inputs from Step Functions, performs the work using your code, and returns results. Because activity workers request work rather than receive it, it is easier to use workers that are deployed behind a firewall.

A Step Functions state machine can contain combinations of service integrations and Activity Tasks. Step Functions applications can also combine activity workers running in a data center with service tasks that run in the cloud. The workers in the data center continue to run as usual, along with any cloud-based service tasks.
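The poll, work, and report cycle of an activity worker can be sketched as follows. This is a minimal example, assuming a client object that exposes the Step Functions GetActivityTask, SendTaskSuccess, and SendTaskFailure calls (for instance a boto3 "stepfunctions" client); the worker name and the handler are hypothetical.

```python
import json


def poll_once(sfn_client, activity_arn, handler, worker_name="example-worker"):
    """Poll for one activity task, run the handler, and report the outcome.

    Returns True if a task was processed and False if the poll came back empty.
    """
    response = sfn_client.get_activity_task(
        activityArn=activity_arn, workerName=worker_name
    )
    token = response.get("taskToken")
    if not token:
        # The long poll ended without any work being available.
        return False
    try:
        result = handler(json.loads(response["input"]))
    except Exception as exc:
        sfn_client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
    else:
        sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    return True


# Possible usage (requires boto3, AWS credentials, and an existing activity):
#   import boto3
#   client = boto3.client("stepfunctions")
#   while True:
#       poll_once(client, "<your activity ARN>", lambda data: {"ok": True})
```

Because the worker only makes outbound calls, nothing in this loop needs an inbound network path to the machine running it.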

Q: How do I get started with Step Functions?

There are a number of ways you can get started with Step Functions:

Q: What language does Step Functions use?

AWS Step Functions state machines are defined in JSON using the declarative Amazon States Language.

To create an activity worker, you may use any programming language, as long as you can communicate with Step Functions using web service APIs.

For convenience, you may use an AWS SDK in the language of your choosing. Lambda supports code written in Node.js (JavaScript), Python, Golang (Go), C# (using the .NET Core runtime), and other languages. For more information on the Lambda programming model, see the Lambda Developer Guide.

Q: My workflow has some of the properties of Standard Workflows and some properties of Express Workflows. How do I get the best of both?

You can compose the two workflow types:

  • By running Express Workflows as a child workflow of Standard Workflows: The Express Workflow is invoked from a Task state in the parent orchestration workflow and succeeds or fails as a whole from the parent's perspective. It is subject to the parent's retry policy for that Task.
  • By calling Express Workflows from within an Express Workflow, so long as all workflows do not exceed the duration limit of the parent: You might choose to factor your workflows this way if your use case has a combination of long-running or exactly-once, and short-lived high-rate steps.

Q: How does Step Functions support parallelism?

Step Functions includes a Map state for dynamic parallelism. The Map state has two operating modes, Inline and Distributed, and both modes execute the same set of steps for each item in a collection. A Map in Inline mode supports up to 40 concurrent branches and is subject to the execution history limit of 25,000 events, or approximately 6,500 state transitions, in a workflow. In Distributed mode, you can run up to 10,000 parallel branches. The Distributed Map has been optimized for Amazon S3, helping you more easily iterate over objects in an S3 bucket; see the FAQ on processing large datasets in the Integration section. The iterations of a Distributed Map are split into parallel executions to help you overcome payload and execution history limits. You can also choose whether each iteration is performed by a Standard Workflow, which is idempotent, or an Express Workflow, which offers higher speed and lower cost but is not idempotent. Learn more about the Map state.
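For reference, an Inline Map state might look like the sketch below, written as a Python dictionary. The input path $.orders, the Lambda ARN, and the concurrency value are assumptions chosen for the example.

```python
import json

# Hypothetical Lambda ARN that processes a single item.
PROCESS_ITEM_ARN = "arn:aws:lambda:us-east-1:123456789012:function:ProcessItem"


def inline_map_state(items_path, item_task_arn, max_concurrency=10):
    """Map state that runs the same steps for every element of an input array."""
    return {
        "Type": "Map",
        "ItemsPath": items_path,            # where the array lives in the state input
        "MaxConcurrency": max_concurrency,  # upper bound on parallel iterations
        "ItemProcessor": {
            "ProcessorConfig": {"Mode": "INLINE"},
            "StartAt": "ProcessItem",
            "States": {
                "ProcessItem": {"Type": "Task", "Resource": item_task_arn, "End": True}
            },
        },
        "End": True,
    }


map_state = inline_map_state("$.orders", PROCESS_ITEM_ARN, max_concurrency=5)
print(json.dumps(map_state, indent=2))
```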

Comparisons

Q: When should I use Step Functions vs. Amazon Simple Queue Service (SQS)?

You should use AWS Step Functions when you need to coordinate service components in the development of highly scalable and auditable applications. Use Amazon Simple Queue Service (Amazon SQS) when you need a reliable, highly scalable, hosted queue for sending, storing, and receiving messages between services.
 
  • Step Functions keeps track of all tasks and events in an application, whereas SQS requires you to implement your own application-level tracking, especially if your application uses multiple queues.
  • The Step Functions console and visibility APIs provide an application-centric view that lets you search for executions, drill down into an execution's detail, and administer executions. SQS would require implementing additional functionality.
  • Step Functions offers several features that facilitate application development, such as passing data between tasks and flexibility in distributing tasks, whereas SQS would require you to implement application-level functionality.
  • Step Functions has out-of-the-box capabilities to build workflows to coordinate your distributed application. SQS allows you to build basic workflows, but has limited functionality.

Q: When should I use Step Functions vs. Amazon Simple Workflow Service (SWF)?

You should consider using Step Functions for all your new applications, since it provides a more productive and agile approach to coordinating application components using visual workflows. If you require external signals to intervene in your processes or you would like to launch child processes that return a result to a parent, then you should consider Amazon Simple Workflow Service (Amazon SWF).

With SWF, instead of writing state machines in declarative JSON, you can write a decider program to separate activity steps from decision steps. This provides you complete control over your orchestration logic, but increases the complexity of developing applications. You may write decider programs in the programming language of your choice, or you may use the Flow framework to use programming constructs that structure asynchronous interactions for you.

Q: How does Step Functions’ HTTPS endpoints integration relate to Amazon EventBridge’s API Destinations?

Amazon EventBridge is a serverless service that uses events to connect application components together making it easier for developers to build scalable event-driven applications. API Destinations is a feature of EventBridge that enables you to create rules to forward events to third-party endpoints to decouple event producers and consumers.

AWS Step Functions’ HTTPS endpoints integration will enable you to invoke HTTPS-based services and receive a response that can be used to control the flow of your execution based on your business logic. Amazon EventBridge focuses on routing events, whereas Step Functions focuses on the orchestration of workflows and management of state. EventBridge API Destinations and Step Functions' HTTPS endpoints integration can support a connection for authentication, so you can reuse authentication credentials across services. Both services can be used together to build highly scalable and robust distributed applications.

Integration

Q: How does Step Functions connect and coordinate other AWS services?

Workflows that you create with Step Functions can connect and coordinate over 200 AWS services using service integrations. For example, you can:

  • Invoke an AWS Lambda function
  • Run an ECS or AWS Fargate task
  • Get an existing item from an Amazon DynamoDB table or put a new item into a DynamoDB table
  • Submit an AWS Batch job and wait for it to complete
  • Invoke an Amazon Bedrock foundation model
  • Publish a message to an Amazon SNS topic
  • Send a message to an Amazon SQS queue
  • Start an AWS Glue job run
  • Create an Amazon SageMaker job to train a machine learning model or batch transform a data set

To learn more about using Step Functions to connect to other AWS services, see the Step Functions developer guide. You can also create tasks in your state machines that run applications; see the FAQ in the Overview section, "How does Step Functions connect to my resources?"
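As one possible combination of the integrations listed above, the sketch below chains a DynamoDB read with an SNS notification. The table name, key attribute, and topic ARN are hypothetical, and the definition is built as a Python dictionary purely for illustration.

```python
import json

# Hypothetical resource names used only for this example.
TABLE_NAME = "Orders"
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:order-updates"

definition = {
    "StartAt": "GetOrder",
    "States": {
        "GetOrder": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:getItem",
            "Parameters": {
                "TableName": TABLE_NAME,
                # Key value is taken from the execution input at $.orderId.
                "Key": {"orderId": {"S.$": "$.orderId"}},
            },
            # Keep the original input and attach the response under $.order.
            "ResultPath": "$.order",
            "Next": "NotifyTeam",
        },
        "NotifyTeam": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {
                "TopicArn": TOPIC_ARN,
                # Serialize the retrieved item into the message text.
                "Message.$": "States.JsonToString($.order.Item)",
            },
            "End": True,
        },
    },
}
print(json.dumps(definition, indent=2))
```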

For the most common use cases of Step Functions, visit the use cases page, which describes each case in detail alongside its architecture diagram.

Q: How does Step Functions integrate with third-party applications?

Using AWS Step Functions’ HTTPS endpoints integration you can directly integrate with HTTP-based services, including SaaS applications. Using a visual interface, you can build and orchestrate distributed applications composed of AWS services and SaaS applications.
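A Task state that calls an HTTPS endpoint could be sketched like this. The endpoint URL and the connection ARN are placeholders invented for the example; the connection is assumed to be an existing EventBridge connection that holds the credentials.

```python
import json


def http_task(api_endpoint, method, connection_arn, next_state=None):
    """Task state that calls an HTTPS API and returns its response to the workflow."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::http:invoke",
        "Parameters": {
            "ApiEndpoint": api_endpoint,
            "Method": method,
            # Credentials come from an EventBridge connection, not the definition.
            "Authentication": {"ConnectionArn": connection_arn},
        },
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


state = http_task(
    "https://api.example.com/v1/invoices",  # hypothetical SaaS endpoint
    "GET",
    "arn:aws:events:us-east-1:123456789012:connection/example/abc",  # hypothetical
)
print(json.dumps(state, indent=2))
```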

Q: How can I test, analyze, or debug my executions?

You can use the TestState API to test a single step of your workflow, which shortens feedback cycles and accelerates development. The TestState API lets you call services and endpoints directly, modify the input to mimic different scenarios, and review the response. You can access TestState through Workflow Studio, making it easy to test as you build without deploying your workflow. TestState accepts a single state definition and input, then synchronously returns the state output along with intermediate data transformations. After you run your workflow, you can analyze and debug executions through Amazon CloudWatch Logs, AWS X-Ray, and directly in the Step Functions console through a visual operator experience that helps you quickly identify problem areas.

Q: How can Step Functions help me process a large dataset in Amazon S3?

You can create workflows using a Map state in Distributed mode to perform large-scale processing of data such as logs, media files, sales transactions, or IoT sensor data. Step Functions iterates through the items and starts parallel workflow executions immediately, allowing you to build on-demand data processing at scale. The Distributed Map state has been optimized to work with S3. You can specify an S3 bucket with filter criteria, an S3 manifest file, or a JSON collection or CSV file stored in S3 as input for your workflow. You can also specify an S3 bucket for the execution outputs of a Distributed Map.
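A Distributed Map that reads its items from an S3 prefix and writes results back to S3 might be defined roughly as follows. Bucket names, the prefix, the Lambda ARN, and the concurrency value are assumptions for the example.

```python
import json

# Hypothetical Lambda ARN that processes one S3 object.
PROCESS_OBJECT_ARN = "arn:aws:lambda:us-east-1:123456789012:function:ProcessObject"


def distributed_map_state(bucket, prefix, item_task_arn, results_bucket,
                          execution_type="EXPRESS", max_concurrency=1000):
    """Map state in Distributed mode that iterates over objects under an S3 prefix."""
    if execution_type not in ("STANDARD", "EXPRESS"):
        raise ValueError("execution_type must be STANDARD or EXPRESS")
    return {
        "Type": "Map",
        # Input: list the objects in the bucket that match the prefix filter.
        "ItemReader": {
            "Resource": "arn:aws:states:::s3:listObjectsV2",
            "Parameters": {"Bucket": bucket, "Prefix": prefix},
        },
        "ItemProcessor": {
            # Each iteration runs as its own child workflow execution.
            "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": execution_type},
            "StartAt": "ProcessObject",
            "States": {
                "ProcessObject": {"Type": "Task", "Resource": item_task_arn, "End": True}
            },
        },
        "MaxConcurrency": max_concurrency,
        # Output: write the results of the child executions to S3.
        "ResultWriter": {
            "Resource": "arn:aws:states:::s3:putObject",
            "Parameters": {"Bucket": results_bucket, "Prefix": "results/"},
        },
        "End": True,
    }


state = distributed_map_state("example-logs", "2024/", PROCESS_OBJECT_ARN, "example-results")
print(json.dumps(state, indent=2))
```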

Q: How does Step Functions work with Amazon API Gateway?

You can associate your Step Functions APIs with Amazon API Gateway so that these APIs invoke your state machines when an HTTPS request is sent to an API method that you define.

You can use an API Gateway API to start Step Functions state machines that coordinate the components of a distributed backend application, and integrate human activity tasks into the steps of your application such as approval requests and responses.

You can also make serverless asynchronous calls to the APIs of services that your application uses. For more information, try our tutorial, Creating a Step Functions API Using API Gateway.
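When an API method is wired to the Step Functions StartExecution action, the request body carries the state machine ARN and the execution input as a JSON-encoded string. The helper below is a small sketch of building such a body; the function name, ARN, and payload are hypothetical.

```python
import json


def start_execution_body(state_machine_arn, payload, name=None):
    """Build a StartExecution request body.

    The "input" field is itself a JSON document encoded as a string,
    so the payload is serialized twice overall.
    """
    body = {
        "stateMachineArn": state_machine_arn,
        "input": json.dumps(payload),
    }
    if name:
        body["name"] = name  # optional execution name
    return json.dumps(body)


body = start_execution_body(
    "arn:aws:states:us-east-1:123456789012:stateMachine:Approval",  # hypothetical
    {"requestId": "r-1", "amount": 250},
)
print(body)
```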

Q: How does AWS Step Functions work with Amazon EventBridge?

Choreography and orchestration are two different models for how distributed services can communicate with one another. In orchestration, communication is more tightly controlled: Step Functions, an orchestration service, coordinates the interaction and the order in which services are invoked.

Choreography achieves communication without tight control. With Amazon EventBridge, events flow between services without centralized coordination. Many applications use both choreography and orchestration for different use cases.

Examples of how you may use Step Functions and EventBridge together could include sending an event or creating a schedule with EventBridge Scheduler to trigger an AWS Step Functions workflow, followed by emitting events at different steps of your workflow.
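Emitting an event from a workflow step can be sketched as a Task state that uses the EventBridge PutEvents integration. The bus name, source, detail type, and the orderId field below are illustrative assumptions.

```python
import json


def emit_event_state(bus_name, source, detail_type, next_state):
    """Task state that publishes a custom event, then continues the workflow."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::events:putEvents",
        "Parameters": {
            "Entries": [
                {
                    "EventBusName": bus_name,
                    "Source": source,
                    "DetailType": detail_type,
                    # Copy one field from the state input into the event detail.
                    "Detail": {"orderId.$": "$.orderId"},
                }
            ]
        },
        # Discard the PutEvents response so the state input passes through unchanged.
        "ResultPath": None,
        "Next": next_state,
    }


state = emit_event_state("example-bus", "example.orders", "OrderPacked", "ShipOrder")
print(json.dumps(state, indent=2))
```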

Q: What is the difference between AWS Step Functions and AWS Lambda?

AWS Lambda is a serverless, event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers. Step Functions is a serverless orchestration service that lets you easily coordinate multiple Lambda functions into flexible workflows that are easy to debug and change. Step Functions will keep your Lambda functions free of additional logic by triggering and tracking each step of your application for you.

Q: Is AWS Step Functions serverless?

Yes, Step Functions is a serverless orchestration service. Step Functions automatically scales the operations and underlying compute to run the steps of your application for you in response to changing workloads. Step Functions has built-in fault tolerance and maintains service capacity across multiple Availability Zones in each region to protect applications against individual machine or data center failures. This helps ensure high availability for both the service itself and for the application workflow it operates.
 
Step Functions offers a pay-for-use billing model to increase agility and optimize costs. Learn more about Step Functions Pricing.

Q: How does logging and monitoring work for Step Functions?

AWS Step Functions sends metrics to Amazon CloudWatch and AWS CloudTrail for application monitoring. CloudWatch collects and tracks metrics, sets alarms, and automatically reacts to changes in AWS Step Functions.

CloudTrail captures all API calls for Step Functions as events, including calls from the Step Functions console and from code calls to the Step Functions APIs. Step Functions also supports CloudWatch Events managed rules for each integrated service in your workflow, and will create and manage CloudWatch Events rules in your AWS account as needed.

For more information, see monitoring and logging in the Step Functions developer guide.

Q: What happens if my Express Workflow fails due to exhausted retries or an unmanaged exception?

By default, Express Workflows report all outcomes to CloudWatch Logs including workflow input, output, and completed steps. You may select different levels of logging to only log errors, and you can choose to not log input and output. Workflows that exhaust retries or have an unmanaged exception should be re-run from the start.
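The logging choices described above map to a logging configuration attached to the state machine. The sketch below builds one as a Python dictionary in the shape accepted when creating or updating a state machine through the API; the helper name and the log group ARN are hypothetical.

```python
import json

VALID_LEVELS = {"ALL", "ERROR", "FATAL", "OFF"}


def express_logging_config(log_group_arn, level="ERROR", include_execution_data=False):
    """Build a logging configuration for a state machine.

    level selects which events are logged; include_execution_data controls
    whether workflow input and output are written to the log.
    """
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {sorted(VALID_LEVELS)}")
    return {
        "level": level,
        "includeExecutionData": include_execution_data,
        "destinations": [
            {"cloudWatchLogsLogGroup": {"logGroupArn": log_group_arn}}
        ],
    }


config = express_logging_config(
    "arn:aws:logs:us-east-1:123456789012:log-group:/example/express:*"  # hypothetical
)
print(json.dumps(config, indent=2))
```

With these defaults only errors are logged and input and output are left out; passing level="ALL" and include_execution_data=True would log every outcome with its data.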

Q: How does Step Functions help you build generative AI applications?

Step Functions has an optimized integration with Amazon Bedrock. You can invoke Bedrock’s foundation models directly from your Step Functions workflow using natural language. This gives you the ability to:

  • Enrich your data processed by Step Functions with generative AI capabilities to reduce the complexity of handling your data, such as text summarization, image generation, or personalization.
  • Retrieve information from databases such as your latest product pricing and user personalization data and use Step Functions intrinsic functions to inject it into the prompt, making sure the LLM uses the most current data to improve the accuracy of the response.
  • Generate embeddings by having Step Functions go through docs, extract data, chunk the documents, and then transform the data from digital text to embedding as a multi-step process. This can be scheduled as a recurring process.
  • Use Step Functions workflows for prompt chaining. You can orchestrate multiple LLM calls and choose the best model for each stage of the chain, forming a customized chain of processing stages that produces more contextually aware and accurate responses from the foundation model.
  • Build human-in-the-loop (HITL) interactions into your generative AI workflow to moderate answers and avoid hallucination, or build in logic to handle responses that are not supported by the foundation model.
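Prompt chaining with the Bedrock integration can be sketched as two Task states in sequence, with an intrinsic function injecting workflow data into each prompt. This is only an outline under several assumptions: the model ID is a placeholder, the "prompt" body field and the $.summary.Body.text path are hypothetical because request and response schemas differ per model, and the prompt templates must not contain single quotes.

```python
import json

MODEL_ID = "example-model-id"  # placeholder; use a model ID available in your account


def bedrock_task(model_id, prompt_template, prompt_arg_path, result_path, next_state=None):
    """Task state that invokes a Bedrock model with a prompt built from workflow data."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::bedrock:invokeModel",
        "Parameters": {
            "ModelId": model_id,
            "Body": {
                # Hypothetical body field; States.Format fills {} with the value at the path.
                "prompt.$": f"States.Format('{prompt_template}', {prompt_arg_path})",
            },
        },
        # Store the model response next to the existing state data.
        "ResultPath": result_path,
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


definition = {
    "StartAt": "Summarize",
    "States": {
        "Summarize": bedrock_task(
            MODEL_ID, "Summarize this text: {}", "$.document", "$.summary", "Translate"
        ),
        # The second call consumes the first call's output (hypothetical response path).
        "Translate": bedrock_task(
            MODEL_ID, "Translate to French: {}", "$.summary.Body.text", "$.translation"
        ),
    },
}
print(json.dumps(definition, indent=2))
```

Each stage could use a different model ID, which is how a chain can pick the most suitable model per step.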

Security

Q: Can I access Step Functions from resources behind my Amazon VPC without connecting to the internet?

Yes. Step Functions supports VPC endpoints (VPCE) using AWS PrivateLink. You can access Step Functions from VPC-enabled AWS Lambda functions and other AWS services without traversing the public internet.

For more information, refer to the Amazon VPC Endpoints for Step Functions in the Step Functions developer guide.

Compliance

Q: What are the compliance standards supported by Step Functions?

Step Functions conforms to HIPAA, FedRAMP, SOC, GDPR, and other common compliance standards. See the AWS Cloud Security site to get a detailed list of supported compliance standards.
