AWS HPC Blog
Building an AI simulation assistant with agentic workflows
Simulations have become indispensable tools that enable organizations to predict outcomes, evaluate risks, and make informed decisions. They provide valuable insights that drive strategic decision-making – running the gamut from supply chain optimization to exploring design alternatives for products and processes.
But running and analyzing simulations can be a time-consuming task because it requires specialized teams of data scientists, analysts, and subject matter experts. In the manufacturing sector, these experts are in high demand to model and optimize complex production processes. This regularly leads to backlogs and delays in obtaining critical insights. In the healthcare industry, specialized teams of epidemiologists and statisticians run the simulations for infectious disease modeling that public health officials need to make decisions. The limited bandwidth of these specialists creates bottlenecks and inefficiencies that impact the ability to rapidly respond to emerging health crises in a data-driven manner.
In this post, we’ll examine a generative AI-based “Simulation Assistant” demo application built using LangChain Agents and Anthropic’s recently released Claude V3 large language model (LLM) on Amazon Bedrock.
By leveraging the latest advancements in LLMs and AWS, we’ll show you how to streamline and democratize your simulation workflows using a scalable and serverless architecture for an application with a chatbot-style interface. This will allow users to launch and interact with simulations using natural language prompts.
How does this help experts?
The Simulation Assistant demo offers a blueprint for providing significant value to organizations in two key ways. It:
- Democratizes simulation-driven problem solving: While regulated industries may still require certified personnel for final sign-off, this solution demonstrates a way to democratize simulation use beyond specialist teams. By enabling knowledgeable personnel across functions, such as analysts, managers, and decision-makers, to launch and analyze simulations under the guidance of experts, organizations can increase the utilization of simulation capabilities and free up the bandwidth of their simulation experts.
- Enhances efficiency for simulation experts: Allowing a wider user-base to run routine simulations lets experts focus on high-value tasks like performance tuning or building new simulations. Streamlined, automated workflows accessible through a single chatbot interface improve productivity, standardize processes, enable knowledge sharing, and increase result reliability.
Whether you’re a business analyst, product manager, researcher, or a simulation expert, this demonstration offers an intuitive and efficient way to harness the power of simulations by leveraging the capabilities of generative AI through Amazon Bedrock and the scalability of AWS – driving innovation and operational excellence across diverse industries.
Solution overview
Figure 1 depicts the architecture of the Simulation Assistant application. A web application, built using Streamlit, serves as the user interface. Streamlit is an open-source Python library that allows you to create interactive web applications for machine learning and data science use cases. We’ve containerized this app using Docker and stored it in an Amazon Elastic Container Registry (ECR) repository.
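To give a sense of how little code that interface layer requires, here is a minimal sketch of what such a chat loop can look like in Streamlit (the run_agent helper is a hypothetical stand-in for the agent logic we describe later in this post):

```python
# app.py – minimal Streamlit chat loop (sketch)
import streamlit as st

def run_agent(prompt: str) -> str:
    # Hypothetical placeholder: in the real application this would call the
    # LangChain agent executor described later in this post.
    return f"(agent response to: {prompt})"

st.title("Simulation Assistant")

if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay the conversation so far
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

# Accept a new natural language request and display the agent's reply
if prompt := st.chat_input("Ask me to run or analyze a simulation..."):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    answer = run_agent(prompt)
    st.session_state.messages.append({"role": "assistant", "content": answer})
    with st.chat_message("assistant"):
        st.markdown(answer)
```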
The containerized web application is deployed as a load-balanced AWS Fargate Service within an Amazon Elastic Container Service (ECS) cluster. AWS Fargate is a serverless compute engine that allows you to run containers without managing servers or clusters. By using Fargate, the Simulation Assistant application can scale its compute resources up or down automatically based on the incoming traffic, ensuring optimal performance and cost-efficiency.
The web application is fronted by an Application Load Balancer (ALB). The ALB distributes incoming traffic across multiple targets, like Fargate tasks, in a balanced manner. This load balancing mechanism ensures that user requests are efficiently handled, even during periods of high traffic, by dynamically routing requests to available container instances.
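As a rough sketch of that deployment in infrastructure-as-code form (using the AWS CDK in Python; the repository name, service names, and sizing below are illustrative rather than the demo’s actual values), the Fargate service and ALB can be stood up with a single construct:

```python
# Sketch: ALB-fronted Fargate service for the containerized Streamlit app (AWS CDK v2)
from aws_cdk import App, Stack
from aws_cdk import aws_ecr as ecr, aws_ecs as ecs, aws_ecs_patterns as ecs_patterns

class SimulationAssistantStack(Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        cluster = ecs.Cluster(self, "AssistantCluster")  # creates a VPC by default
        repo = ecr.Repository.from_repository_name(self, "AppRepo", "simulation-assistant")

        ecs_patterns.ApplicationLoadBalancedFargateService(
            self, "AssistantService",
            cluster=cluster,
            cpu=1024,
            memory_limit_mib=2048,
            desired_count=2,
            task_image_options=ecs_patterns.ApplicationLoadBalancedTaskImageOptions(
                image=ecs.ContainerImage.from_ecr_repository(repo),
                container_port=8501,  # Streamlit's default port
            ),
        )

app = App()
SimulationAssistantStack(app, "SimulationAssistant")
app.synth()
```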
Life cycle of a request
When a user accesses the Simulation Assistant application, their request is received by the ALB, which then forwards the request to one of the healthy Fargate tasks running the Streamlit web application. This serverless deployment approach, combined with the load-balancing capabilities of the ALB, provides a highly available and scalable architecture for the Simulation Assistant, allowing it to handle varying levels of user traffic without the need for manually provisioning and managing servers.
The Streamlit web application acts as the central hub, orchestrating the interaction between different AWS services to enable seamless simulation capabilities for users. Within the Streamlit app, we’ve used Amazon Bedrock to process user queries with state-of-the-art language models. Bedrock is a fully managed service that makes foundation models from Amazon and leading AI startups available via a unified API, while abstracting away complex model management.
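For reference, a direct call to Claude V3 Sonnet on Bedrock from Python looks roughly like this (a sketch using boto3; the prompt text is illustrative):

```python
# Sketch: invoking Claude 3 Sonnet on Amazon Bedrock via boto3
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [
        {"role": "user", "content": "Explain what a Monte Carlo simulation is in two sentences."}
    ],
})

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=body,
)
result = json.loads(response["body"].read())
print(result["content"][0]["text"])
```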
For simple simulations, like price inflation scenarios, the Streamlit app integrates with AWS Lambda functions. These serverless functions can encapsulate lightweight simulation logic, allowing for efficient execution and scalability without the need for provisioning and managing dedicated servers.
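A Lambda-hosted simulation of that kind can be very small. Here is a sketch of what an inflation-style handler might look like (the event fields are assumptions for illustration, not the demo’s actual schema):

```python
# lambda_function.py – sketch of a simple inflation simulation handler
import json
import random

def lambda_handler(event, context):
    """Project a price forward under a normally distributed annual inflation rate."""
    price = float(event.get("initial_price", 4.00))
    years = int(event.get("years", 20))
    mean = float(event.get("mean_inflation", 0.04))
    std_dev = float(event.get("std_dev", 0.02))

    trajectory = [round(price, 2)]
    for _ in range(years):
        price *= 1.0 + random.gauss(mean, std_dev)  # draw this year's inflation rate
        trajectory.append(round(price, 2))

    return {"statusCode": 200, "body": json.dumps({"price_trajectory": trajectory})}
```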
Additionally, we’re also leveraging Amazon Kendra, an intelligent search service, to enable retrieval augmented generation (RAG). Amazon Kendra indexes and searches through documents stored in an Amazon S3 bucket, acting as a source repository. This integration empowers the application to provide relevant information from existing documents, enhancing the simulation capabilities and enabling more informed decision-making.
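Under the hood, a RAG lookup against Kendra can be as simple as the following sketch (the index ID and query text are placeholders); the retrieved passages are then folded into the prompt sent to the LLM:

```python
# Sketch: retrieving supporting passages from Amazon Kendra for RAG
import boto3

kendra = boto3.client("kendra")

response = kendra.retrieve(
    IndexId="<your-kendra-index-id>",  # placeholder
    QueryText="What assumptions does the portfolio simulation make about fees?",
)

# Collect the most relevant passages to include as context in the LLM prompt
passages = [item["Content"] for item in response["ResultItems"][:3]]
context = "\n\n".join(passages)
```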
For more computationally intensive simulations, like running sets of investment portfolio simulations in a Monte Carlo-style manner, the Simulation Assistant uses AWS Batch. AWS Batch is a fully managed batch processing service that efficiently runs batch computing workloads across AWS resources. The Simulation Assistant submits jobs to AWS Batch, which then dynamically provisions the compute resources needed to run them in parallel, enabling faster execution times and scalability.
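Submitting such an ensemble from Python might look like the sketch below (the job queue, job definition, and environment variable names are assumptions, not the demo’s exact configuration). An array job of size 100 fans the ensemble out across parallel containers:

```python
# Sketch: launching a Monte Carlo ensemble as an AWS Batch array job
import boto3

batch = boto3.client("batch")

response = batch.submit_job(
    jobName="portfolio-monte-carlo",
    jobQueue="simulation-queue",        # assumed queue name
    jobDefinition="portfolio-sim:1",    # assumed job definition
    arrayProperties={"size": 100},      # 100 ensemble members run in parallel
    containerOverrides={
        "environment": [
            {"name": "STARTING_BALANCE", "value": "10000"},
            {"name": "CASH_FLOW", "value": "250"},
            {"name": "NUM_STEPS", "value": "50"},
        ]
    },
)
print(f"Submitted Batch job {response['jobId']}")
```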
Once the simulations are complete, the results are stored in an Amazon DynamoDB database, a fully managed NoSQL database service. DynamoDB provides fast and predictable performance with seamless scalability, making it well-suited for storing and retrieving simulation data efficiently. Furthermore, the application integrates with Amazon EventBridge, a serverless event bus service. When a simulation batch is finished, EventBridge triggers a notification, which is sent to the user via email using Amazon Simple Notification Service (SNS). This notification system keeps users informed about the completion of their simulation requests, allowing them to promptly access and analyze the results.
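A sketch of that notification wiring is shown below (the rule name, account ID, and topic ARN are placeholders). The event pattern matches the “Batch Job State Change” events that EventBridge receives when jobs finish, and routes them to an SNS topic that emails the user:

```python
# Sketch: notify users via SNS when a simulation batch finishes
import json
import boto3

events = boto3.client("events")

events.put_rule(
    Name="simulation-batch-complete",
    EventPattern=json.dumps({
        "source": ["aws.batch"],
        "detail-type": ["Batch Job State Change"],
        "detail": {"status": ["SUCCEEDED", "FAILED"]},
    }),
)

events.put_targets(
    Rule="simulation-batch-complete",
    Targets=[{
        "Id": "notify-user",
        "Arn": "arn:aws:sns:us-east-1:123456789012:simulation-results",  # placeholder topic
    }],
)
```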
LLM-based “agents” with “tools”
The Streamlit web application houses the logic behind the key technological concept underlying the Simulation Assistant: enabling what is called “agentic behavior” in an LLM. The precise definition of an LLM agent is elusive, because the field is relatively new and rapidly evolving. But the general idea is to augment the capabilities of LLMs by enabling the models to break down tasks into individual steps, make plans, take actions – including using tools – to solve specific tasks, and even work together as teams of multiple agents that collaborate and influence each other.
One design pattern for enabling agentic behavior is called “Tool Use”, in which an LLM is taught (through prompt engineering or fine-tuning) how to trigger pieces of additional software, wrapped in a standardized form like a function call. These additional pieces of software are called “tools”. The Simulation Assistant employs tools to augment the behavior of an underlying foundation model. In our demo, the underlying foundation model is Claude V3 Sonnet.
Tools help LLMs solve problems that are not reliably solved by direct generation using the underlying transformer network. LLMs are demonstrating an impressive ability to solve problems and perform mathematical reasoning – try asking Claude V3 Sonnet to “simulate the price of milk over the next 20 years with a dynamic inflation rate where the mean is 4% and standard deviation 2%”. But their ability to simulate more complex systems like financial markets or the spread of wildfires is still (for now) best left to trusted simulation codes. By introducing tools that can execute those trusted codes and instructing the LLM on how and when tools should be used, LLMs can gain profound new abilities.
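To make the “Tool Use” pattern concrete, here is a minimal sketch of a tool in LangChain form – a decorated Python function whose name, type hints, and docstring tell the model what the tool does and when to invoke it (the transform itself is purely illustrative):

```python
# Sketch: a minimal single-input LangChain tool
from langchain_core.tools import tool

@tool
def custom_transform(x: float) -> float:
    """Apply the organization's custom quadratic transform to a number.
    Use this whenever the user asks for the custom transform of a value."""
    # Illustrative stand-in for domain logic an LLM could not reliably reproduce on its own
    return 2.0 * x ** 2 - 3.0 * x + 1.0
```

The docstring becomes the tool description the agent sees at runtime, so a clear description matters as much as the implementation.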
There are many possibilities for what tools can be and do, and the application of tools and agentic behavior in general will have wide-ranging use cases far beyond simulation. Tools can help LLMs perform web searches, query a database, or schedule a meeting using your calendar system, just to name a few options.
How we applied agents and tools
The Simulation Assistant instantiates a LangChain agent, based on Claude V3 Sonnet. LangChain is an open-source framework for building applications with LLMs. With LangChain Agents and Tools, developers can create powerful generative AI-based applications that leverage LLM agentic behavior and integrate with existing systems and workflows.
There are several kinds of agents that can be constructed with LangChain, but not all agent types support tools with multiple inputs. Since simulations often require users to specify an array of input parameters to define the system mechanics to be simulated, we’ll want to choose an agent type that supports multi-input tools and can be used in concert with Amazon Bedrock. The Simulation Assistant demo uses a structured chat agent, which satisfies both of these requirements.
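Assembling those pieces looks roughly like the sketch below (the tool is illustrative, and hwchase17/structured-chat-agent is one of the publicly shared structured chat prompts on the LangChain Hub):

```python
# Sketch: a structured chat agent backed by Claude 3 Sonnet on Amazon Bedrock
from langchain import hub
from langchain.agents import AgentExecutor, create_structured_chat_agent
from langchain_aws import ChatBedrock
from langchain_core.tools import tool

@tool
def get_inflation_assumptions() -> str:
    """Return the default inflation assumptions used by the simulation tools."""
    return "mean annual inflation 4%, standard deviation 2%"

llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")
tools = [get_inflation_assumptions]  # the demo registers its seven tools here

prompt = hub.pull("hwchase17/structured-chat-agent")  # public structured chat prompt

agent = create_structured_chat_agent(llm=llm, tools=tools, prompt=prompt)
executor = AgentExecutor(agent=agent, tools=tools, handle_parsing_errors=True)

result = executor.invoke({"input": "What inflation assumptions do the simulations use?"})
print(result["output"])
```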
Building the Simulation Assistant involved addressing several key technical challenges:
- Designing and implementing an agent-tool architecture: To create an effective agent-tool system, careful design of the tool interfaces and integration with the LangChain agent framework was required. We handled multi-input tools by defining structured input schemas. We enabled communication between the LLM and the tools using LangChain’s tool abstraction layer.
- Prompt engineering for agentic behavior: Crafting prompts that guide the LLM to exhibit desired agentic behavior was an iterative process. The approach involved exploring the capabilities and limitations of Claude V3 Sonnet, designing prompts that promoted appropriate tool selection and usage, and refining the prompts based on performance in test scenarios.
- Scaling and managing simulation workloads: To handle computationally intensive simulations, we designed a scalable architecture using AWS Batch for running simulation jobs. This allowed us to efficiently manage compute resources and reliably execute simulations with varying workloads.
- Interpreting and visualizing simulation results: To help users interpret and visualize simulation outputs, we developed tools that process and summarize simulation data, and integrated visualizations within Simulation Assistant.
We gave the Simulation Assistant agent access to seven tools, including ones designed to perform a custom mathematical transform defined within the Streamlit application, run simple inflation simulations housed in an AWS Lambda function, launch Monte Carlo-style simulation ensembles via AWS Batch to mimic an investment portfolio and visualize results, and perform RAG over an internal database of documents.
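As an example of how one of those multi-input tools can be defined, here is a sketch of the portfolio-simulation launcher (the schema fields echo the Batch snippet earlier; the helper simply returns a confirmation string so the sketch stays self-contained):

```python
# Sketch: a multi-input tool with a structured input schema
from pydantic import BaseModel, Field
from langchain_core.tools import StructuredTool

class PortfolioSimInput(BaseModel):
    num_simulations: int = Field(description="Number of Monte Carlo ensemble members to run")
    starting_balance: float = Field(description="Initial portfolio value in dollars")
    cash_flow: float = Field(description="Cash added at each time step in dollars")
    num_steps: int = Field(description="Number of time steps to simulate")

def submit_portfolio_batch(num_simulations: int, starting_balance: float,
                           cash_flow: float, num_steps: int) -> str:
    # In the demo this is where the AWS Batch submission shown earlier would go.
    return f"Submitted {num_simulations} portfolio simulations to AWS Batch."

portfolio_tool = StructuredTool.from_function(
    func=submit_portfolio_batch,
    name="run_portfolio_simulations",
    description="Launch a Monte Carlo ensemble of investment portfolio simulations.",
    args_schema=PortfolioSimInput,
)
```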
These are simple examples, showing how tools can be built to handle some common elements of simulation workflows. But tools can do a lot more: they can also be designed to prepare configuration files, trigger pre- or post-processing jobs on related data – or even call other specialized LLMs that are able to interpret and summarize the results of simulations.
A sample workflow
When a user enters a natural language query into Simulation Assistant like, “run a set of 100 investment simulations starting at $10,000, with cash flow of $250, for 50 steps,” we pass the query to Amazon Bedrock inside a prompt specifically engineered to work well in an agent-tool setting. You can find examples of polished prompts you can use at the LangChain Hub – including prompts that work well with structured chat agents.
Behind the scenes, the agentic LLM breaks down the request into steps and decides whether each step can be completed with “bare hands” (i.e. without tools), or whether one of its tools can be used to complete the step. For our example request, the agentic LLM determines that it needs to use a tool that runs batches of investment portfolio simulations, extracts the key parameters like cash flow, and uses those parameters to trigger an AWS Batch job to run the simulations. It does this completely independently.
The agent then responds to the user telling them they’ll receive an email when the simulations are complete and provides some helpful information that can be used to analyze the simulation results.
Figure 2 is a graphical depiction of this process.
For a deeper look into how we used these tools, and to see the Simulation Assistant demo in action (including the batch simulation query described above), you can check out the video demo below. It will walk you through some potential interactions with the application, and shows the LLM-based agent interacting with various tools representing different aspects of a simulation workflow. You’ll see the agent list the available tools and provide instructions on how to use them when prompted.
Figure 3 – A video showing the Simulation Assistant demo in action. Click to play.
The core portion of the demo involves the user asking the agent to run a set of 100 investment portfolio simulations with specific parameters like starting amount, cash flow, and number of steps. The agent interprets this natural language query, extracts the necessary parameters, and triggers an AWS Batch job to execute the simulations in parallel. Once the simulations complete, the agent retrieves the results from a database and visualizes them using another tool.
Additionally, the video contrasts the agent’s capabilities with a standard LLM’s response to the same query, so you can see the enhanced abilities provided by the agent-tool architecture. You’ll also notice the agent breaking down a complex problem into steps and leveraging multiple tools in different orders to solve it, demonstrating the flexibility of the approach.
Future Work
While the Simulation Assistant demo achieves a lot, we want to extend it in the future to tackle some further technical challenges:
- Integrating with existing simulation codebases: A key goal of ours is to integrate existing simulation codebases as tools within the agent-tool architecture. This will require a deep understanding of the codebases and, in some cases, modifying them to fit the tool interface requirements. For example, to integrate OpenFOAM (a popular open-source computational fluid dynamics software) as a tool, the approach would involve wrapping OpenFOAM’s solver and utility executables as Python functions, defining input and output schemas, and potentially modifying the codebase to enable programmatic execution and data exchange.
- Ensuring simulation reproducibility and traceability: We’d like to enable comprehensive reproducibility and traceability of the simulation runs by implementing logging mechanisms that track input parameters, intermediate steps, and provide detailed documentation for each simulation execution.
- Establishing guardrails: Guardrails and safeguards are crucial to ensure the secure and responsible use of the simulation framework. This may involve setting limits on compute resources, enforcing access controls, and implementing validation checks to prevent potential misuse or unintended consequences. Additionally, ethical considerations should be addressed, like ensuring privacy and data protection, avoiding biases, and promoting transparency in the simulation processes.
Conclusion
This post introduces an AWS-native Simulation Assistant demo that we hope will provide inspiration, and a blueprint, for organizations looking to leverage generative AI and other cutting-edge techniques like agentic LLM behavior to help their businesses.
The demo shows the potential of these technologies for revolutionizing simulation workflows across various industries. Using LangChain Agents, Amazon Bedrock, and the scalability of AWS services, the Simulation Assistant demo can offer a glimpse into a future where simulations might be more accessible and interactive. We hope this will let organizations unlock new insights and drive better decision-making.
The application of agentic LLM frameworks extends far beyond simulations, and the concept of tools can be applied to countless domains and workflows. By enabling LLMs to interact with external systems, perform computations, and trigger actions, organizations can augment their existing processes, fostering innovation and operational excellence.
Solutions like this can also pave the way for new paradigms in human-machine collaboration, amplifying human capabilities and accelerating the pace of discovery.
If your organization is interested in exploring how to implement these techniques in concert with your workflows, we encourage you to reach out to your AWS account team, or send an email to ask-hpc@amazon.com.