Introducing Amazon Q support for network troubleshooting (preview)
An update was made on November 25, 2024: Amazon Q support for network troubleshooting is now generally available. Learn more.
This blog post explores how Amazon Q, the generative artificial intelligence (AI) powered assistant from AWS, helps you troubleshoot network-related issues by working with Amazon VPC Reachability Analyzer. These are exciting times for cloud networking! We’re a long way from the days of debugging connectivity issues with ping and traceroute. Now we ask questions in conversational language like “Why can’t I SSH to my server?” or “Do I have instances that can be accessed from the internet?” But, before we dive into how to use Amazon Q to diagnose connectivity problems in Amazon VPCs, let’s get to know Amazon Q a little better.
What is Amazon Q?
Amazon Q is a conversational assistant, powered by generative AI, that helps you improve productivity and build applications faster. Amazon Q helps you to get accurate answers, solve problems, generate content, and take actions using the data and expertise found in your company’s information repositories, including code-based and enterprise systems, eliminating undifferentiated work and allowing you to innovate faster than ever before.
Amazon Q is powerful. For example, drawing on expertise in the AWS Well-Architected Framework, best practices, documentation, and solutions, Amazon Q can help you explore new services, learn unfamiliar technologies, and architect solutions. Amazon Q in QuickSight makes business analysts and users more productive by using generative business intelligence (BI) capabilities to build compelling visuals, summarize insights, answer data questions, and build data stories. In the contact center, Amazon Connect and Amazon Q work together to help your support team provide your customers with excellent service. More details on the Amazon Q capabilities, features, and benefits can be found in the documentation.
Introducing Amazon Q network troubleshooting
Most applications use many AWS services and resources deployed across multiple AWS accounts, Amazon Virtual Private Clouds (VPCs), and AWS Regions. When you deploy your services within Amazon VPCs, you must configure VPC connectivity to destinations on the internet and on private and hybrid networks. To identify and resolve connectivity issues, you need data on the components that connect your VPC, for example, internet gateways, security groups, network ACLs, subnets, and route tables. Compiling all this information in one place can be challenging, which makes troubleshooting network problems time-consuming. And time is of the essence when these connectivity issues can cause application downtime, slow down deployments, or compromise security.
With Amazon Q network troubleshooting, you ask questions about your network in plain English. For example, you can ask questions like, “Why can’t my application communicate with my database?” or “Do I have instances that can be accessed from the internet?”
Amazon Q then uses large language models (LLMs) to understand the intent behind your questions. It looks at your current network configuration and metadata to infer context. For example, to answer a question like “why can’t my application reach the internet?”, it would look for an instance tagged as “web server” and the associated routing configuration. Amazon Q translates the natural language question and works with Reachability Analyzer to investigate the issue and provide a detailed answer with information on the underlying issue or how to troubleshoot further. More details on the Amazon Q network troubleshooting capabilities, features and benefits can be found in our documentation.
Prerequisites
We assume that you are familiar with the fundamentals of AWS networking, including Amazon VPCs, and VPC constructs like security groups, Network ACLs, route tables, and AWS Transit Gateway. We won’t focus on defining these components and services as we explore the capabilities of Amazon Q network troubleshooting. Rather, we review a few hypothetical use cases where Amazon Q helps you solve network connectivity problems.
Test setup
The architecture diagram shown in figure 1 describes our test environment for Amazon Q. The environment consists of a standard three-tier architecture deployment in an Amazon VPC, in the us-east-1 AWS Region.
Figure 1: Test three tier web application architecture used in this post
Before we start, our team tells us that the web, application, and database layers each have their own security group, and that each subnet has a default network access control list (network ACL). The web servers have an associated Elastic IP address, and NAT gateways in the public subnets provide internet access for the private subnets in each Availability Zone.
We are also provided with the connectivity requirements, as follows:
- SSH access must be allowed to the web servers in Availability Zone 1 (AZ1)
- App servers must be able to communicate with the app database
But these flows are not working! We decide to troubleshoot these connectivity issues using Amazon Q, and see if we can quickly fix them, without knowing more details about the setup and environment.
Amazon Q network troubleshooting in action
To explore the new Amazon Q network troubleshooting capabilities, navigate to the Amazon Q console and simply start asking questions about network connectivity. Amazon Q then guides you to the experience, currently in preview release, that provides you with detailed information and answers to your questions. We start by asking Amazon Q why we can’t SSH into our web server in AZ1 as shown in the following screenshot (figure 2):
Figure 2: Asking Amazon Q network connectivity questions in the AWS Console
Following the prompt, we see that Amazon Q has correctly identified the EC2 instance we’re trying to SSH into (figure 3):
Figure 3: Initial Amazon Q network troubleshooting response
In seconds, we get the answer to our question, accompanied by a full path analysis provided by VPC Reachability Analyzer (figure 4):
Figure 4: Amazon Q response to our reachability question regarding web server in AZ1
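For readers who want to reproduce a similar path analysis outside the chat, the following is a minimal sketch that calls Reachability Analyzer directly through the Amazon EC2 API. It does not call Amazon Q, which is only available in the console. We assume the ec2 argument is a boto3 EC2 client (for example, boto3.client("ec2")); the function name analyze_path, the polling settings, and any resource IDs you pass in are our own illustrative choices, not something taken from the test environment.

```python
import time


def analyze_path(ec2, source, destination, port, protocol="tcp",
                 poll_seconds=2.0, max_polls=30):
    """Run one Reachability Analyzer analysis.

    `ec2` is assumed to be a boto3 EC2 client. Returns a tuple of
    (path_found, explanation_codes).
    """
    # Define the path to analyze, for example internet gateway -> instance on TCP 22.
    path = ec2.create_network_insights_path(
        Source=source,
        Destination=destination,
        Protocol=protocol,
        DestinationPort=port,
    )
    path_id = path["NetworkInsightsPath"]["NetworkInsightsPathId"]

    # Start the analysis for that path.
    started = ec2.start_network_insights_analysis(NetworkInsightsPathId=path_id)
    analysis_id = started["NetworkInsightsAnalysis"]["NetworkInsightsAnalysisId"]

    # Poll until the analysis leaves the "running" state.
    for _ in range(max_polls):
        result = ec2.describe_network_insights_analyses(
            NetworkInsightsAnalysisIds=[analysis_id]
        )["NetworkInsightsAnalyses"][0]
        if result["Status"] != "running":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"analysis {analysis_id} did not finish in time")

    if result["Status"] != "succeeded":
        raise RuntimeError(f"analysis {analysis_id} ended with status {result['Status']}")

    # Explanations are only populated when no path was found.
    codes = [e.get("ExplanationCode") for e in result.get("Explanations", [])]
    return result["NetworkPathFound"], codes
```

Passing the client in as a parameter keeps the sketch free of credentials and Region settings. Note that Reachability Analyzer analyses are billed separately, so check the pricing before running this in a loop.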
Amazon Q discovered that there is no ingress rule configured in the security group associated with the web server in AZ1. We navigate to the security group configuration and discover that indeed, there are no inbound rules configured to allow SSH connections, as shown in Figure 5:
Figure 5: Web servers security group with no inbound allow rules
We add the necessary rule in the security group, as shown in Figure 6, and we can successfully SSH into the web server in AZ1:
Figure 6: Add security group rules to allow SSH
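To see why an empty inbound rule list blocks SSH, here is a simplified sketch of how security group inbound rules behave: they are allow-only, so traffic is accepted only if at least one rule matches. The InboundRule class, the sg_allows_inbound function, and the 203.0.113.0/24 source range are hypothetical, chosen for illustration; they are not the actual rule shown in Figure 6.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class InboundRule:
    protocol: str   # "tcp", "udp", or "-1" for all protocols
    from_port: int
    to_port: int
    cidr: str       # allowed source range


def sg_allows_inbound(rules, protocol, port, source_ip):
    """Return True if any inbound rule allows the traffic; otherwise deny."""
    src = ip_address(source_ip)
    for rule in rules:
        if rule.protocol not in (protocol, "-1"):
            continue
        # Port ranges only matter when the rule names a specific protocol.
        if rule.protocol != "-1" and not (rule.from_port <= port <= rule.to_port):
            continue
        if src in ip_network(rule.cidr):
            return True
    return False  # no matching allow rule means the traffic is dropped


# Hypothetical "before" and "after" states of the web server security group.
rules_before = []
rules_after = [InboundRule("tcp", 22, 22, "203.0.113.0/24")]

print(sg_allows_inbound(rules_before, "tcp", 22, "203.0.113.10"))  # False
print(sg_allows_inbound(rules_after, "tcp", 22, "203.0.113.10"))   # True
```

The sketch ignores rules that reference other security groups or prefix lists, and it does not model the stateful handling of return traffic.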
Next, the app servers still cannot reach the application database, so we ask Amazon Q what the issue is. Amazon Q identifies a Network Access Control List (Network ACL) rule that prevents traffic from reaching the database subnets, as shown in the following screenshot (figure 7):
Figure 7: Amazon Q analysis for connectivity issues between app servers and database
We navigate to the Network ACL configuration page and discover that rule number 90 is blocking inbound TCP traffic from the entire VPC IPv4 CIDR, 10.1.0.0/16, to the database subnets, as shown in Figure 8:
Figure 8: Network ACL configuration blocking app servers TCP traffic to databases
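Unlike security groups, network ACL rules are evaluated in ascending rule-number order, and the first match decides the outcome. The sketch below illustrates why a deny at rule 90 wins over a later allow. We assume an allow-all rule numbered 100 exists, as it does in a default network ACL; the AclRule class, the nacl_allows_inbound function, and the app server address 10.1.10.25 are hypothetical, and port ranges are left out for brevity.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AclRule:
    number: int
    protocol: str   # "tcp", "udp", or "-1" for all protocols
    cidr: str       # source range for inbound rules
    allow: bool


def nacl_allows_inbound(rules, protocol, source_ip):
    """Evaluate rules lowest number first; the first matching rule wins."""
    src = ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.protocol in (protocol, "-1") and src in ip_network(rule.cidr):
            return rule.allow
    return False  # the implicit final "*" rule denies everything else


database_acl = [
    AclRule(100, "-1", "0.0.0.0/0", True),     # assumed default allow-all rule
    AclRule(90, "tcp", "10.1.0.0/16", False),  # the deny rule from the post
]

# Rule 90 matches first, so TCP from the app server is denied.
print(nacl_allows_inbound(database_acl, "tcp", "10.1.10.25"))  # False

# With rule 90 removed, rule 100 matches and the traffic is allowed.
without_rule_90 = [r for r in database_acl if r.number != 90]
print(nacl_allows_inbound(without_rule_90, "tcp", "10.1.10.25"))  # True
```

Because network ACLs are stateless, a real fix also has to confirm that the outbound rules permit the return traffic.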
Troubleshooting connectivity between VPCs
After quickly resolving all network connectivity issues in VPC-01, we receive one more challenge from our team. The initial app in VPC-01 needs connectivity with a monitoring app hosted in VPC-02. We don’t know how the two VPCs are connected, or their routing configuration, so we decide to ask Amazon Q for help again, as shown in Figure 9:
Figure 9: Initial app connectivity issues to the monitoring app in VPC-02
Amazon Q analyzes the forward path between the initial app and the monitoring app, and discovers that connectivity is established through an AWS Transit Gateway. Moreover, it shows us two configuration errors that prevent reachability between the initial app and the monitoring app. First, the Network ACL associated with the app subnets in VPC-01 does not allow outbound traffic to the monitoring application in VPC-02, and second, the route table for private subnets in VPC-01, including the app subnets, does not have a route to the transit gateway.
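The missing-route finding is easier to picture with a small sketch of how a route table picks a target: the most specific (longest-prefix) matching route wins. The post does not state the VPC-02 CIDR or the exact routes, so we assume 10.2.0.0/16 for VPC-02 and a default route to a NAT gateway for the private subnets; the select_route function and the target labels are likewise hypothetical.

```python
from ipaddress import ip_address, ip_network


def select_route(routes, destination_ip):
    """Return the target of the most specific route matching the destination."""
    dst = ip_address(destination_ip)
    matches = [
        (ip_network(cidr), target)
        for cidr, target in routes.items()
        if dst in ip_network(cidr)
    ]
    if not matches:
        return None  # no route at all: the packet is dropped
    # The longest prefix is the most specific match.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Assumed private-subnet route table in VPC-01 before the fix.
routes_before = {
    "10.1.0.0/16": "local",
    "0.0.0.0/0": "nat-gateway",
}

# After the fix: an assumed route for VPC-02 through the transit gateway.
routes_after = {**routes_before, "10.2.0.0/16": "transit-gateway"}

monitoring_app_ip = "10.2.0.15"  # hypothetical address in VPC-02

print(select_route(routes_before, monitoring_app_ip))  # nat-gateway
print(select_route(routes_after, monitoring_app_ip))   # transit-gateway
```

In the "before" table the traffic still matches the default route, so it is sent toward the NAT gateway rather than the transit gateway and never reaches VPC-02.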
Amazon Q also tells us that this is just the forward path analysis, and that we need to check how the return path is configured. We ask Amazon Q about the reverse path too, as shown in Figure 10:
Figure 10: Monitoring app return path to the initial app in VPC-01
Amazon Q helps us identify that we are missing a route to the transit gateway in the route table of the monitoring app subnets in VPC-02, and that the security group associated with the initial app servers in VPC-01 does not allow inbound traffic from the monitoring app.
We quickly fix these configuration errors and establish connectivity between the two apps. To test the changes, we use Amazon Q once more. We first verify that the forward path between the initial app and the monitoring app is configured correctly, and receive confirmation, as shown in Figure 11:
Figure 11: Confirmation that the initial app can communicate with the monitoring app
Then, we confirm that the return path is also configured correctly, as shown in Figure 12:
Figure 12: Confirmation that the monitoring app can communicate with the initial app
With the help of Amazon Q, we solved all connectivity problems and are back up and running!
Things to know
- Amazon Q network troubleshooting is currently available in preview in the US East (N. Virginia) Region. For pricing, visit the Amazon Q pricing page.
- The AWS Management Console provides the only means to interact with Amazon Q network troubleshooting.
- There is a limit of 20 questions per day, per account. The limit resets every 24 hours.
- The reachability analysis generated by Amazon Q will only be available in the chat window in the AWS Management Console. Once the chat is cleared (using start new troubleshooting) or after 24 hours, the conversation, including the reachability analysis, will be deleted.
Conclusion
In this blog post, we reviewed how you can use Amazon Q, the recently launched generative AI assistant, to quickly identify and troubleshoot network-related issues in your AWS environment. Get started with Amazon Q network troubleshooting today by navigating to the Amazon Q console and asking questions about network connectivity. If you have questions about this post, start a new thread on AWS re:Post, or contact AWS Support.