Setting up an AWS Site-to-Site VPN automated monitoring solution

In today’s interconnected world, businesses of all sizes rely on secure and efficient network connectivity to operate seamlessly across multiple locations. Amazon Web Services (AWS) Site-to-Site Virtual Private Networks (Site-to-Site VPN) offer a reliable way to extend a private network across public infrastructure such as the internet, enabling organizations to securely connect their offices, data centers, and Amazon Virtual Private Clouds (Amazon VPCs).

In the dynamic landscape of cloud networking, quick setup, seamless connectivity, and rapid issue resolution are paramount for businesses using Site-to-Site VPN. Traditionally, however, you would review your on-premises firewall logs and collaborate with AWS support engineers to analyze Site-to-Site VPN logs, aiming to pinpoint the root cause of issues and work toward resolution. This manual troubleshooting often takes significant time and effort. In this post, we show a real-time automated solution designed to streamline the troubleshooting process. This solution features early error detection, prompt notifications, and detailed resolution steps for each error.

Key benefits of automated Site-to-Site VPN monitoring

Recognizing the need for faster and more proactive support, we have implemented an automation solution that relieves your team of repetitive tasks. This allows you to focus on more complex issues and strategic initiatives, improving overall operational efficiency and productivity. You receive real-time notifications and detailed insights into your Site-to-Site VPN tunnel status, reducing the risk of undetected issues or downtime.

Implementing automated Site-to-Site VPN monitoring with AWS

The automated Site-to-Site VPN monitoring workflow follows a dual strategy. First, we use the existing Site-to-Site VPN logs in Amazon CloudWatch, which cover IP Security (IPsec) tunnel establishment, Internet Key Exchange (IKE) negotiations, and dead peer detection (DPD) protocol messages. Second, we have implemented a custom monitoring solution that uses a subscription filter to deliver CloudWatch log events to an AWS Lambda function for further processing and analysis. The Lambda function tracks the error state and timestamp in an Amazon DynamoDB table and promptly sends out a notification through Amazon Simple Notification Service (Amazon SNS) whenever the Site-to-Site VPN tunnel encounters an error.

Architecture

The following figure shows the architecture for this solution.

An AWS architecture diagram showcasing a workflow involving AWS Site-to-Site VPN, CloudWatch, Lambda, DynamoDB, and SNS services

Figure 1: Resources deployed in the AWS environment by the solution

How it works

The solution is deployed through an AWS CloudFormation template that launches the following components:

  1. DynamoDB table: A DynamoDB table that stores the time of the last notification sent for each VPN tunnel error. This table helps prevent duplicate notifications for the same error within a specified time frame.
  2. IAM role (optional): An AWS Identity and Access Management (IAM) role for the Lambda function execution with permissions to publish to the provided SNS topic and access the DynamoDB table. This role is created only if the CreateNewLambdaRole parameter in the CloudFormation template is set to yes.
  3. Lambda function: A Python Lambda function to monitor the CloudWatch logs for VPN tunnel errors, send notifications to the provided SNS topic, and update the DynamoDB table with the notification status.
  4. Lambda permission: Allows CloudWatch to invoke the Lambda function.
  5. CloudWatch log subscription filter: Subscribes the Lambda function to the provided CloudWatch log group for specific VPN tunnel error patterns.

Once a subscription filter with the error patterns is attached to the CloudWatch log group associated with a Site-to-Site VPN connection, it relays matching log events to Lambda. Lambda processes the CloudWatch log events, identifies the errors based on predefined patterns, and sends notifications with detailed troubleshooting steps to the provided SNS topic. It also checks whether you were already notified about the same error within the configured time frame. This reduces the number of email notifications if the error keeps recurring and is not addressed. Here’s a breakdown of the Lambda function’s functionality:

  1. Log event processing: The function decodes and processes the CloudWatch log events received from the log subscription filter.
  2. Error identification: The function matches the log messages against predefined error patterns.
  3. Notification handling: If an error is identified, then the function checks the DynamoDB table to determine if a notification has already been sent within the specified time frame (NotificationFrequency parameter). If not, then it sends a notification to the SNS topic with detailed troubleshooting steps for the specific error.
  4. DynamoDB update: The function updates the DynamoDB table with the notification status for each error to avoid sending duplicate notifications within the specified time frame.
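
The first two steps above can be sketched as follows. This is a minimal illustration, not the code shipped in the template: it relies on the standard CloudWatch Logs subscription payload (base64-encoded, gzip-compressed JSON under awslogs.data), while the function names and the ERROR_PATTERNS table are hypothetical. The first pattern is taken from the sample error shown later in this post, and the second is an assumed example.

```python
import base64
import gzip
import json

# Hypothetical pattern table: key -> lowercase substring to look for.
# The real solution's patterns are defined in its CloudFormation template.
ERROR_PATTERNS = {
    "NO_PROPOSAL_MATCH": "no proposal match found",
    "PEER_NOT_RESPONSIVE": "peer is not responsive",  # assumed example
}


def decode_log_events(event):
    """Decode the payload that a CloudWatch Logs subscription filter sends to Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    return payload.get("logEvents", [])


def identify_errors(log_events):
    """Return (pattern_key, timestamp) pairs for every log event that matches a pattern."""
    found = []
    for log_event in log_events:
        message = log_event.get("message", "").lower()
        for key, needle in ERROR_PATTERNS.items():
            if needle in message:
                found.append((key, log_event.get("timestamp")))
    return found
```

In a Lambda handler, you would pass the incoming event to decode_log_events and then feed the result to identify_errors before deciding whether to notify.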

Solution deployment

The following sections outline the solution deployment.

Prerequisites

Before we dive into the CloudFormation template, make sure that you have the following prerequisites in place:

  1. An existing SNS topic to receive the notifications.
  2. A CloudWatch log group associated with your VPN connection.
  3. (Optional) An existing IAM role with permissions to publish to the SNS topic and access the DynamoDB table, if you don’t want to create a new role.

Steps:

  1. Download the CloudFormation template from here
  2. Stack details:
    • On the “Specify stack details” page, enter a Stack name (such as VPNTunnelMonitoring).
    • Provide the necessary input parameters:
      • SnsTopicARN: Enter the Amazon Resource Name (ARN) of the SNS topic where you want to receive notifications.
      • VpnLogGroup: Enter the name of the CloudWatch log group associated with your VPN connection.
      • CreateNewLambdaRole: Enter yes if you want CloudFormation to create a new IAM role for the Lambda function, or no if you want to use an existing role.
      • ExistingLambdaRoleARN (optional): If you chose no for CreateNewLambdaRole, then enter the ARN of an existing IAM role with the necessary permissions for the Lambda function.
      • NotificationFrequency (optional): Enter the frequency (in hours) at which you want to receive notifications for a VPN error until it is fixed. An unresolved error can keep appearing in the CloudWatch log group at periodic intervals. For example, if you set this to 2, then you are notified once every two hours for as long as that error continues.

        Figure 2: CloudFormation template input parameters for the solution as they appear on the console

    • Select “Next”
  3. Review and Create:
  4. Stack Creation:
    • CloudFormation starts creating the stack and provisioning the necessary resources.
    • You can monitor the stack creation progress in the CloudFormation console or through the events tab.
  5. Stack Creation Complete:
    • Once the stack creation is complete, you should see the status “CREATE_COMPLETE” in the CloudFormation console.
    • Verify that the resources have been created successfully.

After completing these steps, the VPN Tunnel Error Monitoring and Notification automation is set up in your AWS account. Then, the Lambda function starts monitoring the specified CloudWatch log group and sends notifications to the SNS topic whenever a VPN tunnel error is detected.
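
If you prefer to script the deployment instead of using the console, the same parameters can be passed through the AWS SDK for Python (Boto3). The following is a sketch under assumptions: the parameter names come from the template as described in this post, but the helper names, the default stack name, and the CAPABILITY_IAM capability are our assumptions and may need adjusting for your template.

```python
def build_stack_parameters(sns_topic_arn, vpn_log_group, create_new_role="yes",
                           existing_role_arn="", notification_frequency_hours=None):
    """Build the Parameters list for CloudFormation create_stack (hypothetical helper)."""
    values = {
        "SnsTopicARN": sns_topic_arn,
        "VpnLogGroup": vpn_log_group,
        "CreateNewLambdaRole": create_new_role,
    }
    if create_new_role == "no":
        # An existing role is only meaningful when no new role is created.
        if not existing_role_arn:
            raise ValueError("ExistingLambdaRoleARN is required when CreateNewLambdaRole is no")
        values["ExistingLambdaRoleARN"] = existing_role_arn
    if notification_frequency_hours is not None:
        # CloudFormation parameter values are passed as strings.
        values["NotificationFrequency"] = str(notification_frequency_hours)
    return [{"ParameterKey": key, "ParameterValue": value} for key, value in values.items()]


def create_monitoring_stack(template_body, parameters, stack_name="VPNTunnelMonitoring"):
    """Create the stack; requires boto3 and AWS credentials (not called in this sketch)."""
    import boto3  # imported here so the helper above works without boto3 installed

    cloudformation = boto3.client("cloudformation")
    return cloudformation.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
        Capabilities=["CAPABILITY_IAM"],  # assumption: needed because the template can create an IAM role
    )
```

For example, build_stack_parameters("arn:aws:sns:us-east-1:111122223333:vpn-alerts", "/vpn/logs", notification_frequency_hours=2) produces the parameter list for a deployment that creates a new role and notifies at most once every two hours per error. The ARN and log group name here are placeholders.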

The following is a sample error notification email:

Figure 3: Notification email for a sample error “No Proposal Match Found by AWS”

Considerations:

  • You can customize the CloudFormation template according to your use case and business needs. For example, you can choose to enable custom encryption on the DynamoDB table and to run the Lambda function inside a VPC.
  • As some errors can continue at a regular frequency until they are fixed, you might receive multiple emails. The NotificationFrequency parameter lets you set how frequently you are notified about a recurring error. For example, if a specific error is discovered for the first time, then you are notified and the notification timestamp is stored in the DynamoDB table. If this specific error appears again within the NotificationFrequency period, then you are not notified. This gives you time to resolve the error. This also means that if the error is fixed but appears again due to misconfiguration, then you are not notified until the NotificationFrequency period has passed.
  • The solution creates a subscription filter on the CloudWatch log group associated with your Site-to-Site VPN. Therefore, if multiple VPN connections log to the same log group, then all of those VPN connections are monitored through this solution. If you have a non-production or testing VPN that you don’t want to cover under this solution, then you can send its logs to another CloudWatch log group.
  • There is no additional cost for the subscription filter, and you only pay for the data logged and the ingestion of the logs in CloudWatch. You can find the pricing related to CloudWatch Logs on the Amazon CloudWatch pricing page. For the pricing details of the Lambda and DynamoDB services, refer to their respective pricing pages: Lambda Pricing and DynamoDB Pricing.
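
The NotificationFrequency behavior described in the considerations above can be sketched as a small time-window check. This is an illustration only: a plain dictionary stands in for the DynamoDB table, the function names and the error key format are hypothetical, and treating an error that recurs exactly at the boundary as due for a new notification is our assumption.

```python
from datetime import datetime, timedelta, timezone


def should_notify(last_notified, now, frequency_hours):
    """Return True when no notification was sent within the last frequency_hours."""
    if last_notified is None:
        return True  # first time this error is seen
    return now - last_notified >= timedelta(hours=frequency_hours)


def record_and_check(store, error_key, now, frequency_hours):
    """Check the window and, if a notification is due, record its timestamp."""
    if should_notify(store.get(error_key), now, frequency_hours):
        store[error_key] = now
        return True
    return False


# Usage with an in-memory store standing in for the DynamoDB table.
store = {}
key = "tunnel-1#NO_PROPOSAL_MATCH"  # hypothetical key format
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(record_and_check(store, key, start, 2))                       # True: first occurrence
print(record_and_check(store, key, start + timedelta(hours=1), 2))  # False: within the window
print(record_and_check(store, key, start + timedelta(hours=3), 2))  # True: window has passed
```

Because the timestamp is only updated when a notification is actually sent, a steadily recurring error produces one notification per window instead of one per log event.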

Cleaning up

You can delete the solution by deleting the CloudFormation stack created as part of Step 4.

Conclusion: Elevating the VPN monitoring experience

In this post, we discussed a real-time monitoring solution for Site-to-Site VPN, using existing AWS VPN CloudWatch logs and setting up a customized solution to monitor availability and provide detailed insights into the status and issues of Site-to-Site VPN tunnels. Furthermore, we walked you through the steps to configure this automated solution using AWS Lambda, Amazon DynamoDB, and Amazon SNS services.

With real-time rapid notifications, detailed error messages, and resolution steps, businesses can facilitate a quicker response to outages, minimize downtime, improve operational efficiency, and maintain seamless connectivity across distributed environments.

If you have feedback about this post, submit comments in the Comments section. If you have questions about this post, contact AWS Support.

About the authors

Narinder Singh Kharbanda

Narinder Singh is a senior support engineer and networking expert at Amazon Web Services, Inc. He works with customers to design, troubleshoot, and implement cloud-based architectures. He enjoys working with customers and holds an MS in computer systems networking and telecommunications from George Mason University, specializing in computer networking.

Bhuvan Jain

Bhuvan is a Technical Account Manager at AWS, supporting independent software vendor (ISV) customers. He is passionate about helping customers build Well-Architected solutions on AWS, with a focus on enterprise-scale networking. As a subject matter expert in Site-to-Site VPN, Bhuvan offers guidance on designing VPN network architectures that are highly available, resilient, and cost-effective. He holds a Master’s degree in Electrical and Computer Engineering from the University of Illinois at Chicago (UIC). In his free time, he enjoys playing basketball and volleyball, as well as watching movies and TV series.

Shrikant Davange

Shrikant is a Sr. Cloud Support Engineer at AWS with expertise in Network Monitor and scale. Shrikant finds great pleasure in collaborating with customers to troubleshoot complex technical issues, identify opportunities for performance improvement, cost optimization, and resilience, design modern cloud architectures, and design automations. Outside of work, he has a passion for sports such as badminton and swimming, and for reading books.