Introducing Amazon Q support for network troubleshooting (preview)

This blog post explores how Amazon Q, the generative artificial intelligence (AI) powered assistant from AWS, helps you troubleshoot network-related issues by working with Amazon VPC Reachability Analyzer. These are exciting times for cloud networking! We’re a long way from the days of debugging connectivity issues with ping and traceroute. Now we ask questions in conversational language like “Why can’t I SSH to my server?” or “Do I have instances that can be accessed from the internet?” But, before we dive into how to use Amazon Q to diagnose connectivity problems in Amazon VPCs, let’s get to know Amazon Q a little better.

What is Amazon Q?

Amazon Q is a conversational assistant, powered by generative AI, that helps you improve productivity and build applications faster. Amazon Q helps you to get accurate answers, solve problems, generate content, and take actions using the data and expertise found in your company’s information repositories, including code-based and enterprise systems, eliminating undifferentiated work and allowing you to innovate faster than ever before.

Amazon Q is powerful. For example, drawing on expertise in the AWS Well-Architected Framework, best practices, documentation, and solutions, Amazon Q can help you explore new services, learn unfamiliar technologies, and architect solutions. Amazon Q in QuickSight makes business analysts and users more productive by using generative business intelligence (BI) capabilities to build compelling visuals, summarize insights, answer data questions, and build data stories. In the contact center, Amazon Connect and Amazon Q work together to help your support team provide your customers with excellent service. More details on Amazon Q capabilities, features, and benefits can be found in the documentation.

Introducing Amazon Q network troubleshooting

Most applications use many AWS services and resources deployed across multiple AWS accounts, Amazon Virtual Private Clouds (VPCs), and AWS Regions. When you deploy your services within Amazon VPCs, you must configure VPC connectivity to destinations on the internet and on private and hybrid networks. To identify and resolve connectivity issues, you need data on the VPC components that provide that connectivity, for example, internet gateways, security groups, network ACLs, subnets, route tables, and more. Compiling all this information in one place can be challenging, which makes troubleshooting network problems time-consuming. And time is of the essence when these connectivity issues can cause application downtime, slow down deployments, or compromise security.

With Amazon Q network troubleshooting, you ask questions about your network in plain English. For example, you can ask questions like, “Why can’t my application communicate with my database?” or “Do I have instances that can be accessed from the internet?”

Amazon Q then uses large language models (LLMs) to understand the intent behind your questions. It looks at your current network configuration and metadata to infer context. For example, to answer a question like “Why can’t my application reach the internet?”, it would look for an instance tagged as “web server” and the associated routing configuration. Amazon Q translates the natural language question and works with Reachability Analyzer to investigate the issue. It then provides a detailed answer that either identifies the underlying issue or explains how to troubleshoot further. More details on the Amazon Q network troubleshooting capabilities, features, and benefits can be found in our documentation.

Prerequisites

We assume that you are familiar with the fundamentals of AWS networking, including Amazon VPCs, and VPC constructs like security groups, Network ACLs, route tables, and AWS Transit Gateway. We won’t focus on defining these components and services as we explore the capabilities of Amazon Q network troubleshooting. Rather, we review a few hypothetical use cases where Amazon Q helps you solve network connectivity problems.

Test setup

The architecture diagram shown in Figure 1 describes our test environment for Amazon Q. The environment consists of a standard three-tier architecture deployed in an Amazon VPC in the us-east-1 AWS Region.

Figure 1: Test three tier web application architecture used in this post

Before we start, our team tells us that the web, application, and database layers each have their own security group, and that each subnet has a default network access control list (network ACL). The web servers have an associated Elastic IP address, and NAT gateways in the public subnets provide internet access for the private subnets in each Availability Zone.

We are also provided with the connectivity requirements, as follows:

  • SSH access must be allowed to the web servers in Availability Zone 1 (AZ1)
  • App servers must be able to communicate with the app database

But these flows are not working! We decide to troubleshoot these connectivity issues using Amazon Q, and see if we can quickly fix them, without knowing more details about the setup and environment.

Amazon Q network troubleshooting in action

To explore the new Amazon Q network troubleshooting capabilities, navigate to the Amazon Q console and simply start asking questions about network connectivity. Amazon Q then guides you to the experience, currently in preview release, that provides you with detailed information and answers to your questions. We start by asking Amazon Q why we can’t SSH into our web server in AZ1 as shown in the following screenshot (figure 2):

Figure 2: Asking Amazon Q network connectivity questions in the AWS Console

Following the prompt, we see that Amazon Q has correctly identified the EC2 instance we’re trying to SSH into (figure 3):

Figure 3: Initial Amazon Q network troubleshooting response

In seconds, we get the answer to our question, accompanied by a full path analysis provided by VPC Reachability Analyzer (figure 4):

Figure 4: Amazon Q response to our reachability question regarding web server in AZ1
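
For readers curious about the kind of path analysis that sits behind an answer like this, the following is a minimal sketch of running a Reachability Analyzer analysis directly through the Amazon EC2 API. It does not call Amazon Q, which this post describes as console-only. It assumes `ec2` is a boto3 EC2 client. The function name `analyze_path` and its parameters are hypothetical names chosen for illustration.

```python
import time


def analyze_path(ec2, source_id, destination_id, port, protocol="tcp",
                 poll_seconds=2.0, max_polls=30):
    """Run one Reachability Analyzer analysis and summarize the result.

    Assumption: `ec2` behaves like boto3.client("ec2").
    """
    # Define the path to test: source, destination, protocol, and port.
    path = ec2.create_network_insights_path(
        Source=source_id,
        Destination=destination_id,
        Protocol=protocol,
        DestinationPort=port,
    )
    path_id = path["NetworkInsightsPath"]["NetworkInsightsPathId"]

    # Start the analysis; it runs asynchronously.
    started = ec2.start_network_insights_analysis(NetworkInsightsPathId=path_id)
    analysis_id = started["NetworkInsightsAnalysis"]["NetworkInsightsAnalysisId"]

    # Poll until the analysis leaves the "running" state.
    for _ in range(max_polls):
        result = ec2.describe_network_insights_analyses(
            NetworkInsightsAnalysisIds=[analysis_id]
        )["NetworkInsightsAnalyses"][0]
        if result["Status"] != "running":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"analysis {analysis_id} did not finish")

    return {
        "reachable": result.get("NetworkPathFound", False),
        "status": result["Status"],
        "explanation_codes": [
            e.get("ExplanationCode") for e in result.get("Explanations", [])
        ],
    }
```

With real credentials, a call might look like `analyze_path(boto3.client("ec2", region_name="us-east-1"), source_id, destination_id, 22)`, where the source and destination are resource IDs from your own account, such as an internet gateway and an instance. Passing the client in as an argument keeps the sketch easy to exercise with a stub.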

Amazon Q discovered that there is no ingress rule configured in the security group associated with the web server in AZ1. We navigate to the security group configuration and discover that indeed, there are no inbound rules configured to allow SSH connections, as shown in Figure 5:

Figure 5: Web servers security group with no inbound allow rules

We add the necessary rule in the security group, as shown in Figure 6, and we can successfully SSH into the web server in AZ1:

Figure 6: Add security group rules to allow SSH
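
To see why an empty inbound rule set blocks SSH, recall that security groups contain only allow rules, so anything that no rule matches is denied. The following is a simplified model of that evaluation, not the actual AWS implementation. The `IngressRule` type, the `ingress_allowed` function, and the 203.0.113.0/24 source range are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    protocol: str   # "tcp", "udp", or "-1" for all protocols
    from_port: int
    to_port: int
    cidr: str


def ingress_allowed(rules, protocol, port, source_ip):
    """Return True if any rule allows the traffic; otherwise deny."""
    src = ip_address(source_ip)
    for rule in rules:
        protocol_ok = rule.protocol in ("-1", protocol)
        port_ok = rule.protocol == "-1" or rule.from_port <= port <= rule.to_port
        if protocol_ok and port_ok and src in ip_network(rule.cidr):
            return True
    return False  # no matching allow rule: implicit deny


# Hypothetical values: an empty rule set (as in Figure 5), then one SSH rule.
before = []
after = [IngressRule("tcp", 22, 22, "203.0.113.0/24")]

print(ingress_allowed(before, "tcp", 22, "203.0.113.10"))  # False
print(ingress_allowed(after, "tcp", 22, "203.0.113.10"))   # True
```

The model ignores rules that reference other security groups and the stateful handling of return traffic. It only illustrates the allow-list behavior that explains the finding above.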

Next, the app servers still cannot access the application database, so we ask Amazon Q what the issue is. Amazon Q identifies a network access control list (network ACL) rule that prevents traffic from reaching the database subnets, as shown in the following screenshot (Figure 7):

Figure 7: Amazon Q analysis for connectivity issues between app servers and database

We navigate to the Network ACL configuration page and discover that rule ID 90 is blocking inbound TCP traffic from the entire VPC IPv4 CIDR, 10.1.0.0/16, to the database subnets, as shown in Figure 8:

Figure 8: Network ACL configuration blocking app servers TCP traffic to databases
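
A low-numbered deny rule has this effect because network ACL rules are evaluated in ascending rule-number order, and the first matching rule decides the outcome. The following is a simplified sketch of that ordering that leaves out port ranges. Rule 90 mirrors the deny described above. Rule 100, the app server address 10.1.2.15, and the function name `evaluate_nacl` are assumptions added for illustration.

```python
from ipaddress import ip_address, ip_network

# (rule number, protocol, source CIDR, action); "-1" means all protocols.
# Rule 90 mirrors the deny described above; rule 100 is a hypothetical allow.
INBOUND_RULES = [
    (100, "-1", "0.0.0.0/0", "allow"),
    (90, "tcp", "10.1.0.0/16", "deny"),
]


def evaluate_nacl(rules, protocol, source_ip):
    """Return (rule number, action) of the first matching rule."""
    src = ip_address(source_ip)
    # Sorting the tuples orders them by rule number, lowest first.
    for number, rule_protocol, cidr, action in sorted(rules):
        if rule_protocol in ("-1", protocol) and src in ip_network(cidr):
            return number, action
    return None, "deny"  # the final "*" rule denies anything unmatched


# Hypothetical app server address inside the VPC CIDR.
print(evaluate_nacl(INBOUND_RULES, "tcp", "10.1.2.15"))
# Same traffic once rule 90 is removed.
print(evaluate_nacl(INBOUND_RULES[:1], "tcp", "10.1.2.15"))
```

In this model the first call matches rule 90 and is denied even though rule 100 would allow the traffic. Removing rule 90, or replacing it with a narrower rule, lets evaluation fall through to the allow rule.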

Troubleshooting connectivity between VPCs

After quickly resolving all network connectivity issues in VPC-01, we receive one more challenge from our team. The initial app in VPC-01 needs connectivity with a monitoring app hosted in VPC-02. We don’t know how the two VPCs are connected, or their routing configuration, so we decide to ask Amazon Q for help again, as shown in Figure 9:

Figure 9: Initial app connectivity issues to the monitoring app in VPC-02

Amazon Q analyzes the forward path between the initial app and the monitoring app, and discovers that connectivity is established through an AWS Transit Gateway. Moreover, it shows us two configuration errors that prevent reachability between the initial app and the monitoring app. First, the Network ACL associated with the app subnets in VPC-01 does not allow outbound traffic to the monitoring application in VPC-02, and second, the route table for private subnets in VPC-01, including the app subnets, does not have a route to the transit gateway.

Amazon Q also tells us that this is just the forward path analysis, and that we need to check how the return path is configured. We ask Amazon Q about the reverse path too, as shown in Figure 10:

Figure 10: Monitoring app return path to the initial app in VPC-01

Amazon Q helps us identify that we are missing a route to the transit gateway in the route table of the monitoring app subnets in VPC-02, and that the security group associated with the initial app servers in VPC-01 does not allow inbound traffic from the monitoring app.
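
The routing half of these findings comes down to how a route table picks a target: the most specific matching prefix wins, and both directions need a route that points at the transit gateway. The following is a minimal sketch of that lookup. Only the 10.1.0.0/16 CIDR comes from this post. The 10.2.0.0/16 CIDR for VPC-02, the host addresses, the target labels, and the function name `lookup_route` are hypothetical.

```python
from ipaddress import ip_address, ip_network


def lookup_route(route_table, destination_ip):
    """Return the target of the most specific route matching the destination."""
    dst = ip_address(destination_ip)
    matches = [
        (ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dst in ip_network(cidr)
    ]
    if not matches:
        return None  # no route: traffic is dropped
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical route tables before the fix.
vpc01_private = {"10.1.0.0/16": "local", "0.0.0.0/0": "nat-gateway"}
vpc02_private = {"10.2.0.0/16": "local"}

# The same tables with the missing transit gateway routes added.
vpc01_fixed = {**vpc01_private, "10.2.0.0/16": "transit-gateway"}
vpc02_fixed = {**vpc02_private, "10.1.0.0/16": "transit-gateway"}

print(lookup_route(vpc01_private, "10.2.0.25"))  # nat-gateway (forward path)
print(lookup_route(vpc02_private, "10.1.2.15"))  # None (return path)
print(lookup_route(vpc01_fixed, "10.2.0.25"))    # transit-gateway
print(lookup_route(vpc02_fixed, "10.1.2.15"))    # transit-gateway
```

Checking the forward and return tables separately mirrors the two questions asked above: a route in one direction says nothing about the other.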

We quickly fix these configuration errors and establish connectivity between the two apps. To test the changes, we use Amazon Q once more. We first verify if the forward path between the initial app and the monitoring app is configured correctly, and receive confirmation, as shown in Figure 11:

Figure 11: Confirmation that the initial app can communicate with the monitoring app

Then, we confirm that the return path is also configured correctly, as shown in Figure 12:

Figure 12: Confirmation that the monitoring app can communicate with the initial app

With the help of Amazon Q, we solved all connectivity problems and are back up and running!

Things to know

  • Amazon Q network troubleshooting is currently available in preview in the US East (N. Virginia) Region. For pricing, visit the Amazon Q pricing page.
  • The AWS Management Console provides the only means to interact with Amazon Q network troubleshooting.
  • There is a limit of 20 questions per day, per account. The limit resets every 24 hours.
  • The reachability analysis generated by Amazon Q will only be available in the chat window in the AWS Management Console. Once the chat is cleared (using start new troubleshooting) or after 24 hours, the conversation, including the reachability analysis, will be deleted.

Conclusion

In this blog post, we reviewed how you can use Amazon Q, the recently launched generative AI assistant, to quickly identify and troubleshoot network-related issues in your AWS environment. Get started with Amazon Q network troubleshooting today by navigating to the Amazon Q console and asking questions about network connectivity. If you have questions about this post, start a new thread on AWS re:Post, or contact AWS Support.

About the authors

Alexandra Huides

Alexandra Huides is a Principal Networking Specialist Solutions Architect within Strategic Accounts at Amazon Web Services. She focuses on helping customers build and develop networking architectures for highly scalable and resilient AWS environments. Alex is also a public speaker for AWS, and is helping customers adopt IPv6. Outside work, she loves sailing, especially catamarans, traveling, discovering new cultures, and reading.

Matt Lehwess

Matt Lehwess is a Senior Principal Solutions Architect for AWS. Matt has spent many years working as a network engineer in the network service provider space, building large-scale WAN networks in the Asia Pacific region and North America, as well as deploying data center technologies and their related network infrastructure. As a result, he is most at home working with Amazon VPC, AWS Direct Connect, and Amazon’s other infrastructure-focused products and services. Matt is also a public speaker for AWS, and he enjoys spending time helping customers solve large-scale problems using the AWS Cloud platform. Outside of work, Matt is an avid rock climber, both indoor and outdoor, and a keen surfer.

Nishant Kumar

Nishant Kumar is a Senior Product Manager in the Amazon VPC team. He is interested in areas of network observability and network management. Outside work, he loves Formula 1 racing, cooking, and exploring wildlife.