Using AWS Network Manager Events to manage and monitor your global network

AWS Network Manager is a great tool that lets you monitor changes in your network and create automations. In this post, we cover how to leverage events in Network Manager to get notified about network changes and how to use AWS Serverless technologies to enrich the information provided by these events.

Let’s start with a simple question: What is Network Manager? Network Manager lets you centrally monitor and manage your network across AWS accounts, AWS Regions, and on-premises locations. With Network Manager, you can visualize your global networks using the AWS Management Console dashboard, obtain metrics for your resources and send them to Amazon CloudWatch, or deliver near-real-time events describing changes in your network to Amazon EventBridge.

We are focusing on the last functionality: how to use the events and enhance your visibility and operations in the network. To help you get started, you can deploy the use case covered using the following aws-samples repository.

Setting up Network Manager Events

The first step is to set up the global network. We focus on AWS Transit Gateway as the connectivity service, and we monitor and automate actions based on changes in VPC and Site-to-Site VPN attachments. Note that Network Manager’s home Region is US West (Oregon), so the EventBridge rules that capture Network Manager events must be created there. The service monitors events in all the Regions where you have resources.

From the Network Manager console, you can review the global networks created under Global Networks, as shown in figure 1.

View of the Network Manager console showing a global network information.

Figure 1. Network Manager console – global network

To leverage Network Manager Events for Transit Gateways, you should register them to a global network. Move to the Transit gateway network – Transit gateways section to check the Transit Gateways registered to the global network. In our use case, we already have a Transit Gateway registered, as shown in figure 2.

View of the Network Manager console showing the global network dashboard. In addition, there's a red circle at the left-hand side showing the Transit Gateway section.

Figure 2. Global Network dashboard – Moving to Transit gateways section

View of the Network Manager console showing the Transit Gateway section. A list of one Transit Gateway registered is shown.

Figure 3. Network Manager console – Transit Gateways registered

Before moving to the automation solution, note that if it’s the first time you are configuring Network Manager Events, then you need to enable them. Move to the Transit gateway network and select Onboard cloudwatch log insight (under Events) as shown in figure 4. Future global networks in the Account have this configuration already enabled.

View of the Network Manager console showing the Transit Gateway network section. A red circle in the center shows how to activate events.

Figure 4. Enabling Network Manager events

Now we are ready! Before we continue, let’s address a couple of the questions you may have:

  • Can I use Network Manager cross-Accounts? Yes, you can visualize Transit Gateways across multiple AWS Accounts within an AWS Organization. For more information, check the How to use AWS Network Manager to visualize Transit Gateways across multiple accounts in the AWS Organization post.
  • Can I use Network Manager to automate actions across AWS Regions? Yes, but you must give the automation solution appropriate permissions to perform cross-Region actions. We cover this later in the post.
  • Where can I find related Transit Gateway events’ information? You can check the documentation for the topology changes such as updates to Transit Gateway attachments, routing updates, and status updates.

Leveraging AWS Serverless with Network Manager Events

Once we have Transit Gateways registered in the global network, any event affecting those resources and their attachments is sent to EventBridge. From there, you can use Serverless technologies to curate the information received or to perform actions in the network. Figure 5 shows an example. Remember that you can deploy this architecture using the following aws-samples repository.

Architecture diagram showing a solution based on Network Manager events. First solution (on the top) shows the integration with Step Functions (and the use of AWS Lambda and Amazon SNS). Down, it shows the direct integration with Lambda and Amazon DynamoDB.

Figure 5. Event processing architecture diagram

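From the events delivered to EventBridge, two automations are deployed: one that creates Transit Gateway routing from VPC attachment events (using a Step Functions state machine), and one that tracks VPN actions (using Lambda and DynamoDB). The next two sections cover each in turn.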

Automating Transit Gateway actions from VPC attachment events

Let’s start by defining the actions the Step Functions state machine performs.

  • For created VPC attachments, it creates Transit Gateway association and propagation to the corresponding route table. After this, it sends a notification.
  • For deleted VPC attachments, it simply sends a notification.

In addition to our state machine, we are leveraging three AWS services to build the automation:

  • AWS Systems Manager Parameter Store is used to retrieve the Transit Gateway route table ID to which we associate and propagate the new VPC attachments.
    • The resource is created in the AWS Region where the Transit Gateway is located. In multi-Region environments you must consider creating Systems Manager parameters in each AWS Region.
    • We defined Tier – Standard, Type – String, and Data type – Text. For production environments, we recommend using SecureString to encrypt your parameter using AWS Key Management Service (AWS KMS).
View of Systems Manager console showing the description of a parameter created (/nm/automation/tgw-route-table)

Figure 6. Systems Manager parameter – Transit Gateway route table

  • An Amazon SNS topic for the notifications.
    • This resource is created in US West (Oregon) – the Region where the automation solution is built.
    • We defined Type – Standard.
View of the SNS console, showing the configuration of an SNS Topic (nm-sns-topic) with 1 subscription (email type).

Figure 7. Amazon SNS topic – Email subscription

  • A Lambda function to create the Transit Gateway routing, created in US West (Oregon).
    • General configuration:
      • Runtime Python 3.10.
      • 128 MB memory.
      • 512 MB ephemeral storage.
      • 1 min 30 sec timeout.
    • The function code can be found below. It obtains the Systems Manager parameter name (from an environment variable) and obtains the Transit Gateway route table ID. With this information—and the Transit Gateway VPC attachment ID obtained from the event—it creates the association and propagation with the Transit Gateway route table.
import json
import boto3
import os

def lambda_handler(event, context):
    try:
        # Obtaining information from the event
        region = event['detail']['transitGatewayAttachmentArn'].split(':')[3]
        tgw_attachment = event['detail']['transitGatewayAttachmentArn'].split('/')[1]
        
        # Obtaining SSM Parameter name from environment variables
        parameter = os.environ['PARAMETER_NAME']
        
        # Boto3 clients
        ssm = boto3.client('ssm', region_name=region)
        ec2 = boto3.client('ec2', region_name=region)

        # Obtaining the Transit Gateway route table ID from Systems Manager Parameter Store
        tgw_rt = ssm.get_parameter(Name=parameter)['Parameter']['Value']

        # We create Transit Gateway association and propagation to the route table
        tgw_association = ec2.associate_transit_gateway_route_table(
            TransitGatewayRouteTableId=tgw_rt,
            TransitGatewayAttachmentId=tgw_attachment
        )['Association']['ResourceId']
        tgw_propagation = ec2.enable_transit_gateway_route_table_propagation(
            TransitGatewayRouteTableId=tgw_rt,
            TransitGatewayAttachmentId=tgw_attachment
        )['Propagation']['ResourceId']
        
        response = {
            'transitGatewayAssociationId': tgw_association,
            'transitGatewayPropagationId': tgw_propagation
        }

        return {
            'statusCode': 200,
            'body': json.dumps(response)
        }
    
    except Exception as e:
        # Printing the error in logs
        print(e)
        return {
            'statusCode': 500,
            'body': json.dumps('Something went wrong. Please check the logs.')
        }
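
To make the ARN parsing at the top of the handler concrete, the following sketch runs the same two split operations against a hand-written event. The Region, account ID, and attachment ID are placeholder values chosen for illustration, and the helper name parse_attachment_arn is hypothetical; only the transitGatewayAttachmentArn field name comes from the function above.

```python
def parse_attachment_arn(arn):
    """Return (region, attachment_id) from a Transit Gateway attachment ARN."""
    # ARN layout: arn:partition:service:region:account-id:resource-type/resource-id
    region = arn.split(':')[3]
    attachment_id = arn.split('/')[1]
    return region, attachment_id


# Hypothetical event fragment; every value below is a placeholder.
sample_event = {
    'detail': {
        'changeType': 'VPC-ATTACHMENT-CREATED',
        'transitGatewayAttachmentArn': (
            'arn:aws:ec2:eu-west-1:111122223333:'
            'transit-gateway-attachment/tgw-attach-0123456789abcdef0'
        ),
    }
}

region, attachment_id = parse_attachment_arn(
    sample_event['detail']['transitGatewayAttachmentArn']
)
print(region, attachment_id)
```

Because the Region is read from the ARN, the boto3 clients in the handler target the Region where the Transit Gateway lives, even though the function itself runs in US West (Oregon).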

Regarding permissions, the AWS Identity and Access Management (IAM) role associated with the function should have permissions to perform cross-Region actions. Remember to attach the managed IAM policy AWSLambdaBasicExecutionRole to allow logging to Amazon CloudWatch Logs.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:AssociateTransitGatewayRouteTable",
                "ec2:EnableTransitGatewayRouteTablePropagation",
                "ssm:GetParameter"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}

With these resources ready, it is time to check the Step Functions state machine definition.

{
  "Comment": "TGW Routing Automation",
  "StartAt": "ActionType",
  "States": {
    "ActionType": {
      "Choices": [
        {
          "Next": "CreateTGWRouting",
          "StringEquals": "VPC-ATTACHMENT-CREATED",
          "Variable": "$.detail.changeType"
        }
      ],
      "Default": "SendNotification",
      "Type": "Choice"
    },
    "CreateTGWRouting": {
      "Next": "SendNotification",
      "Parameters": {
        "FunctionName": "arn:aws:lambda:us-west-2:{ACCOUNT_ID}:function:{FUNCTION_NAME}",
        "Payload.$": "$"
      },
      "Resource": "arn:aws:states:::lambda:invoke",
      "ResultPath": null,
      "Retry": [
        {
          "BackoffRate": 2,
          "ErrorEquals": [
            "Lambda.ServiceException",
            "Lambda.AWSLambdaException",
            "Lambda.SdkClientException",
            "Lambda.TooManyRequestsException"
          ],
          "IntervalSeconds": 1,
          "MaxAttempts": 3
        }
      ],
      "Type": "Task"
    },
    "SendNotification": {
      "End": true,
      "Parameters": {
        "Message.$": "$",
        "TopicArn": "arn:aws:sns:us-west-2:{ACCOUNT_ID}:{TOPIC_NAME}"
      },
      "Resource": "arn:aws:states:::sns:publish",
      "Type": "Task"
    }
  }
}
  • The first state ActionType is a choice state that checks the event’s changeType:
    • If VPC-ATTACHMENT-CREATED, it creates the Transit Gateway association and propagation.
    • Otherwise (VPC-ATTACHMENT-DELETED), it sends the email notification directly.
  • If VPC-ATTACHMENT-CREATED, it invokes the Lambda function (CreateTGWRouting) to create the Transit Gateway association and propagation. Once the routing has been created, it moves to the SendNotification state to send an email notification.
  • The SendNotification state uses the Amazon SNS Publish task.
State machine showing the automation steps. Three steps are shown: ActionType, CreateTGWRouting, and SendNotification.

Figure 8. State machine workflow
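
The branching in the state machine can be sketched locally as a plain function. This is a simplified model for reasoning about the workflow, not how Step Functions evaluates a Choice state internally; the function name next_state is hypothetical, while the state names and the changeType values come from the definition above.

```python
def next_state(event):
    """Mimic the ActionType Choice state: return the name of the next state."""
    # Equivalent of the single rule: $.detail.changeType StringEquals VPC-ATTACHMENT-CREATED
    if event['detail']['changeType'] == 'VPC-ATTACHMENT-CREATED':
        return 'CreateTGWRouting'
    # Equivalent of "Default": every other changeType goes straight to the notification
    return 'SendNotification'


created = {'detail': {'changeType': 'VPC-ATTACHMENT-CREATED'}}
deleted = {'detail': {'changeType': 'VPC-ATTACHMENT-DELETED'}}

print(next_state(created))
print(next_state(deleted))
```

Since the EventBridge rule only forwards created and deleted VPC attachment events, the default branch in practice handles the deleted case.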

With our state machine definition ready, the last thing we need to do is define its permissions. Given the cross-Region actions are performed by the Lambda function (and it already has those permissions configured), the IAM role associated to the state machine can be as granular as only allowing the invocation of the Lambda function and the use of the Amazon SNS topic.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeLambdaFunction",
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": "arn:aws:lambda:us-west-2:{ACCOUNT_ID}:function:{FUNCTION_NAME}"
        },
        {
            "Sid": "AllowSNSPublish",
            "Effect": "Allow",
            "Action": "sns:Publish",
            "Resource": "arn:aws:sns:us-west-2:{ACCOUNT_ID}:{TOPIC_NAME}"
        }
    ]
}

With the Step Functions state machine ready, the last thing is the EventBridge Rule that invokes it. In the following figure you can see the configuration in our example.

View of the EventBridge console, showing the definition of the rule nm-tgw-routing.

Figure 9. EventBridge rule (nm-tgw-routing)

  • Event bus: default.
  • Event pattern:
    • Detail-type: “Network Manager Topology Change”.
    • Source: “aws.networkmanager”.
    • Detail – ChangeType: “VPC-ATTACHMENT-CREATED” and “VPC-ATTACHMENT-DELETED”.
  • Target: Step Functions state machine. If you create these resources using Infrastructure as Code (IaC) (recommended option) you must provide permissions to EventBridge to invoke the state machine.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeStateMachine",
            "Effect": "Allow",
            "Action": "states:StartExecution",
            "Resource": "arn:aws:states:us-west-2:{ACCOUNT_ID}:stateMachine:{NAME}"
        }
    ]
}
View of the EventBridge console, showing a rule target - Step Functions state machine.

Figure 10. EventBridge rule target – Step Functions state machine
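
The rule configuration listed above can be written as an EventBridge event pattern. The sketch below builds that pattern as a Python dictionary and wraps rule creation in a helper that is defined but not called here. The helper name create_rule is hypothetical, and the field name changeType is taken from the state machine definition ($.detail.changeType). Attaching the state machine as a target (and the IAM role EventBridge uses to start it) is left out of this sketch.

```python
import json

# Event pattern matching the rule described above.
event_pattern = {
    "source": ["aws.networkmanager"],
    "detail-type": ["Network Manager Topology Change"],
    "detail": {
        "changeType": ["VPC-ATTACHMENT-CREATED", "VPC-ATTACHMENT-DELETED"]
    },
}


def create_rule(rule_name="nm-tgw-routing"):
    """Create the rule in US West (Oregon), where Network Manager events are delivered."""
    import boto3  # imported here so the pattern can be inspected without AWS access

    events = boto3.client("events", region_name="us-west-2")
    return events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(event_pattern),
        State="ENABLED",
    )


print(json.dumps(event_pattern, indent=2))
```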

Time to test the automation! If you create a new VPC attachment (the code in the aws-samples repository provides you an example), you should see a successful state machine execution (shown in figure 11), in addition to an email with the EventBridge event details (as shown in figure 12).

View of the Step Functions console, showing a successful execution of the state machine.

Figure 11. Successful state machine execution after Transit Gateway attachment creation

Screenshot of an email showing the EventBridge information. Red squares highlight the changeType and transitGatewayAttachmentArn parameters.

Figure 12. Email notification – New VPC attachment created

Finally, if you move to the VPC console (Transit Gateway route tables view) in the Region where the Transit Gateway has been created, then you can check the route table’s association and propagation of our recently created VPC attachment.

View of the VPC console, showing a Transit Gateway route table with one association - the VPC attachment created.

Figure 13. Transit Gateway route table associations

For multi-Account environments (when the owner of the VPC and Transit Gateway is different), this automation can help to create the Transit Gateway routing configuration at scale with each new VPC attachment. We have kept it simple with a single route table, but you can use services like AWS Secrets Manager to send more information about the VPC attachment between Accounts, and create complex automations. Furthermore, take into account that you should provide cross-Account permissions either in the Step Functions state machine or the Lambda function.

Tracking VPN actions using Lambda and DynamoDB

It’s time now for the second automation in our solution: we want to track all of the actions from our VPNs (new attachments, BGP or IPsec changes, etc.) in a DynamoDB table for further processing. This data can be used, for example, in troubleshooting activities.

As before, let’s start by checking the resources needed for the automation. In this case, as shown in figure 5, the EventBridge rule directly targets a Lambda function, which writes into the DynamoDB table. The Lambda function is created in US West (Oregon).

  • Runtime Python 3.10
  • 128 MB memory
  • 512 MB ephemeral storage
  • 1 min 30 second timeout

The function code can be found below. It obtains the DynamoDB table name from an environment variable and does a PutItem action against the table.

import json
import boto3
import os

def lambda_handler(event, context):
    try:
        # Obtaining information from the event
        timestamp = event['time']
        change_type = event['detail']['changeType']
        vpn_id = event['detail']['vpnConnectionArn']
        region = event['detail']['region']
        
        # Boto3 client
        dynamodb = boto3.client('dynamodb')
        # DynamoDB table
        table = os.environ['TABLE_NAME']
        # We log the event in the DynamoDB table
        dynamodb.put_item(
            TableName=table,
            Item={
                'vpn-id': {"S": vpn_id},
                'changeType': {"S": change_type},
                'awsRegion': {"S": region},
                'timestamp': {"S": timestamp}
            }
        )
        
        return {
            'statusCode': 200,
            'body': json.dumps('Event logged!')
        }
    
    except Exception as e:
        # Printing the error in logs
        print(e)
        return {
            'statusCode': 500,
            'body': json.dumps('Something went wrong. Please check the logs.')
        }

Regarding permissions, the IAM role associated with the function should have permissions to perform dynamodb:PutItem actions against the DynamoDB table created. Remember also to attach the managed IAM policy AWSLambdaBasicExecutionRole to allow logging to Amazon CloudWatch Logs.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "dynamodb:PutItem"
            ],
            "Resource": [
                "arn:aws:dynamodb:us-west-2:{ACCOUNT_ID}:table/{TABLE_NAME}"
            ],
            "Effect": "Allow"
        }
    ]
}

The DynamoDB table has been configured with vpn-id as partition key, and changeType as sort key.

View of the DynamoDB console, showing the configuration of the table network-manager-vpn-actions

Figure 14. DynamoDB table
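
As a rough sketch, a table with this key design could be created with boto3 as follows. The table name is taken from the console view above and the Region from the IAM policy; the on-demand billing mode and the helper name create_table are assumptions made for this example, not settings confirmed by the post.

```python
# Table definition: vpn-id as partition (HASH) key, changeType as sort (RANGE) key.
table_definition = {
    "TableName": "network-manager-vpn-actions",
    "AttributeDefinitions": [
        {"AttributeName": "vpn-id", "AttributeType": "S"},
        {"AttributeName": "changeType", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "vpn-id", "KeyType": "HASH"},
        {"AttributeName": "changeType", "KeyType": "RANGE"},
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
}


def create_table():
    """Create the table in US West (Oregon); defined here but not called."""
    import boto3  # imported here so the definition can be inspected without AWS access

    dynamodb = boto3.client("dynamodb", region_name="us-west-2")
    return dynamodb.create_table(**table_definition)


print(table_definition["KeySchema"])
```

One consequence of this key design is that PutItem replaces any existing item with the same vpn-id and changeType, so the table holds the most recent occurrence of each change type per VPN. If you need the full history, including the timestamp in the sort key would be one option.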

Just as before, with the automation built, it’s time to create the EventBridge rule that invokes the Lambda function.

View of the EventBridge console, showing the rule event pattern configuration (filtering to only VPN actions)

Figure 15. EventBridge rule (nm-vpn-actions)

  • Event bus: default
  • Event pattern:
    • Detail-type: “Network Manager Topology Change”
    • Source: “aws.networkmanager”
    • We want to filter by changeType, and trigger the rule anytime there’s an event related to a VPN attachment, connection, or tunnel. To simplify the definition, we use prefix matching. You can find more information about how to filter content in the EventBridge event patterns documentation page.
  • Target: Lambda function. If you create these resources using IaC (recommended option) you must provide permissions to EventBridge to invoke the function.
View of the EventBridge console, showing how the rule target configured is a Lambda function.

Figure 16. EventBridge rule target – Lambda function
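
A prefix filter of the kind described above might look like the following. The exact prefix used in the original rule is not shown in the text, so "VPN-" is an assumption here, as are the sample changeType values used to exercise it. The change_type_matches helper is a hypothetical, heavily simplified local check of the changeType condition only; it is not the EventBridge matching engine.

```python
import json

# Event pattern using EventBridge prefix matching on changeType.
vpn_event_pattern = {
    "source": ["aws.networkmanager"],
    "detail-type": ["Network Manager Topology Change"],
    "detail": {"changeType": [{"prefix": "VPN-"}]},  # assumed prefix
}


def change_type_matches(pattern, event):
    """Check only the changeType condition: exact strings or {"prefix": ...} entries."""
    change_type = event["detail"]["changeType"]
    for condition in pattern["detail"]["changeType"]:
        if isinstance(condition, dict) and "prefix" in condition:
            if change_type.startswith(condition["prefix"]):
                return True
        elif condition == change_type:
            return True
    return False


# Illustrative events; the VPN changeType value is a placeholder.
vpn_event = {"detail": {"changeType": "VPN-ATTACHMENT-CREATED"}}
vpc_event = {"detail": {"changeType": "VPC-ATTACHMENT-CREATED"}}

print(json.dumps(vpn_event_pattern))
print(change_type_matches(vpn_event_pattern, vpn_event))
print(change_type_matches(vpn_event_pattern, vpc_event))
```

With a single prefix entry, the rule does not need to be updated each time a new VPN-related change type should be captured.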

Time to test the new automation. We created a couple of VPN attachments in our Transit Gateway and performed some actions (BGP and IPsec up and down). In this VPN Gateway strongSwan aws-samples repository you can find the code we used to create the VPN connections. This is the information we have in our DynamoDB table:

View of the DynamoDB console showing a list of items - different VPN actions sorted via VPN ID and Change Type

Figure 17. VPN actions tracked in DynamoDB

As seen in figure 17, our event processing solution is tracking VPN events in DynamoDB. Now we have visibility into the different events happening to our VPNs, and when they happened. In our example, we have simply filtered some information provided by the event before adding it in the DynamoDB table. However, you can also use the Lambda function to enhance the event’s information or correlate several events to automate some actions.

Considerations

The following points are worth considering:

  • You can use Network Manager for multi-Account visibility with AWS Accounts inside the same Organization. Remember that you must enable trusted access for your delegated administrator.
  • Remember that Network Manager’s home Region is US West (Oregon). This means that you should process Network Manager Events in this AWS Region, regardless of where you have your resources.
  • If you want to leverage Network Manager Events for your Transit Gateway, you must first register them into a global network.
  • Network Manager is free of charge. However, when building the example we showed, you must consider the pricing of each service used: EventBridge, Step Functions, Lambda, Amazon SNS, and DynamoDB.

Conclusion

In this post, we have reviewed the events provided by Network Manager and how you can consume them using EventBridge rules. Additionally, we presented an example of how you can leverage AWS Serverless services to either get notifications or create automations in your environment.

To get started with the solution proposed in this post in your own AWS Account, clone the following aws-samples repository and check the resources created and their configuration.

Visit the Network Manager documentation page for additional information.

About the authors

Pablo Sánchez Carmona

Pablo is a Network Specialist Solutions Architect at AWS, where he helps customers to design secure, resilient and cost-effective networks. When not talking about Networking, Pablo can be found playing basketball or video-games. He holds a MSc in Electrical Engineering from the Royal Institute of Technology (KTH), and a Master’s degree in Telecommunications Engineering from the Polytechnic University of Catalonia (UPC).

Nishant Kumar

Nishant Kumar is a Senior Product Manager in the Amazon VPC team. He is interested in areas of network observability and network management. Outside work, he loves Formula 1 racing, cooking, and exploring wildlife.

Andy Taylor

Andy is a Network Specialist Solutions Architect at AWS with a passion for Network Automation. Prior to joining AWS, Andy worked as a Network Architect in a number of industry segments such as Oil & Gas, Finance, UK Government and public sector specialising in Data Centre networking and network automation. In his spare time Andy has participated and taught various martial arts for the last 25 years.