Using AWS Network Manager Events to manage and monitor your global network
AWS Network Manager is a great tool that lets you monitor changes in your network and create automations. In this post, we cover how to leverage events in Network Manager to get notified about network changes and how to use AWS Serverless technologies to enrich the information provided by these events.
Let’s start with a simple question: What is Network Manager? Network Manager lets you centrally monitor and manage your network across AWS accounts, AWS Regions, and on-premises locations. With Network Manager, you can visualize your global networks using the AWS Management Console dashboard, obtain metrics for your resources and send them to Amazon CloudWatch, or deliver near-real time events describing changes in your network to Amazon EventBridge.
We are focusing on the last functionality: how to use the events and enhance your visibility and operations in the network. To help you get started, you can deploy the use case covered using the following aws-samples repository.
Setting up Network Manager Events
The first step is to set up the global network. We focus on AWS Transit Gateway as the connectivity service, and we monitor and automate actions based on changes to VPC and Site-to-Site VPN attachments. Note that Network Manager’s home Region is US West (Oregon), so the EventBridge rules that capture Network Manager events must be created there. The service monitors events in all the Regions where you have resources.
From the Network Manager console, you can review the global networks created under Global Networks, as shown in figure 1.
To leverage Network Manager Events for Transit Gateways, you should register them to a global network. Move to the Transit gateway network – Transit gateways section to check the Transit Gateways registered to the global network. In our use case, we already have a Transit Gateway registered, as shown in figure 2.
Before moving to the automation solution, note that if this is the first time you are configuring Network Manager Events, you need to enable them. Move to the Transit gateway network and select Onboard cloudwatch log insight (under Events) as shown in figure 4. Global networks created later in the same Account already have this configuration enabled.
Now we are ready! Before we continue, let’s address a couple of the questions you may have:
- Can I use Network Manager cross-Accounts? Yes, you can visualize Transit Gateways across multiple AWS Accounts within an AWS Organization. For more information, check the How to use AWS Network Manager to visualize Transit Gateways across multiple accounts in the AWS Organization post.
- Can I use Network Manager to automate actions across AWS Regions? Yes, you must give appropriate permissions to the automation solution to perform cross-Region actions. We cover this later in the post.
- Where can I find related Transit Gateway events’ information? You can check the documentation for the topology changes such as updates to Transit Gateway attachments, routing updates, and status updates.
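For orientation, the events used in the rest of this post have roughly the following shape. This is a trimmed, illustrative sketch: the account ID, attachment ID, and timestamp are placeholders, and only the fields that the functions later in this post read are shown. Real events carry additional fields, so check the documentation for the complete format.

```python
# Illustrative Network Manager event as delivered to EventBridge.
# All IDs and the timestamp are placeholders (assumptions), not real resources.
SAMPLE_EVENT = {
    "detail-type": "Network Manager Topology Change",
    "source": "aws.networkmanager",
    "time": "2023-01-01T12:00:00Z",
    "detail": {
        "changeType": "VPC-ATTACHMENT-CREATED",
        "region": "eu-west-1",  # Region of the resource that changed
        "transitGatewayAttachmentArn": (
            "arn:aws:ec2:eu-west-1:111122223333:"
            "transit-gateway-attachment/tgw-attach-0123456789abcdef0"
        ),
    },
}


def summarize(event):
    """Return a one-line summary of a Network Manager event."""
    detail = event["detail"]
    return f'{event["time"]} {detail["changeType"]} in {detail["region"]}'


print(summarize(SAMPLE_EVENT))
```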
Leveraging AWS Serverless with Network Manager Events
Once we have Transit Gateways registered in the global network, any event affecting those resources and their attachments is sent to EventBridge. From there, you can use Serverless technologies to curate the information received or to perform actions in the network. Figure 5 shows an example. Remember that you can deploy this architecture using the following aws-samples repository.
From the events delivered to EventBridge, two automations are deployed:
- Any VPC attachment created or deleted invokes an AWS Step Functions state machine that automates the creation of Transit Gateway routing (we have default association and propagation disabled), and it also sends an email notification using Amazon Simple Notification Service (Amazon SNS).
- Any AWS Site-to-Site VPN event is processed using an AWS Lambda function and logged in an Amazon DynamoDB table.
Automating Transit Gateway actions from VPC attachment events
Let’s start by defining the actions the Step Functions state machine performs.
- For created VPC attachments, it creates Transit Gateway association and propagation to the corresponding route table. After this, it sends a notification.
- For deleted VPC attachments, it simply sends a notification.
In addition to our state machine, we are leveraging three AWS services to build the automation:
- AWS Systems Manager Parameter Store is used to retrieve the Transit Gateway route table ID to which we associate and propagate the new VPC attachments.
  - The resource is created in the AWS Region where the Transit Gateway is located. In multi-Region environments you must consider creating Systems Manager parameters in each AWS Region.
  - We defined Tier – Standard, Type – String, and Data type – Text. For production environments, we recommend using SecureString to encrypt your parameter using AWS Key Management Service (AWS KMS).
- An Amazon SNS topic for the notifications.
  - This resource is created in US West (Oregon) – the Region where the automation solution is built.
  - We defined Type – Standard.
- A Lambda function to create the Transit Gateway routing, created in US West (Oregon).
  - General configuration:
    - Runtime Python 3.10.
    - 128 MB memory.
    - 512 MB ephemeral storage.
    - 1 min 30 sec timeout.
  - The function code can be found below. It obtains the Systems Manager parameter name (from an environment variable) and obtains the Transit Gateway route table ID. With this information, and the Transit Gateway VPC attachment ID obtained from the event, it creates the association and propagation with the Transit Gateway route table.
import json
import boto3
import os


def lambda_handler(event, context):
    try:
        # Obtaining information from the event
        region = event['detail']['transitGatewayAttachmentArn'].split(':')[3]
        tgw_attachment = event['detail']['transitGatewayAttachmentArn'].split('/')[1]

        # Obtaining SSM Parameter name from environment variables
        parameter = os.environ['PARAMETER_NAME']

        # Boto3 clients
        ssm = boto3.client('ssm', region_name=region)
        ec2 = boto3.client('ec2', region_name=region)

        # Obtaining the Transit Gateway route table ID from Systems Manager Parameter Store
        tgw_rt = ssm.get_parameter(Name=parameter)['Parameter']['Value']

        # We create Transit Gateway association and propagation to the route table
        tgw_association = ec2.associate_transit_gateway_route_table(
            TransitGatewayRouteTableId=tgw_rt,
            TransitGatewayAttachmentId=tgw_attachment
        )['Association']['ResourceId']
        tgw_propagation = ec2.enable_transit_gateway_route_table_propagation(
            TransitGatewayRouteTableId=tgw_rt,
            TransitGatewayAttachmentId=tgw_attachment
        )['Propagation']['ResourceId']

        response = {
            'transitGatewayAssociationId': tgw_association,
            'transitGatewayPropagationId': tgw_propagation
        }
        return {
            'statusCode': 200,
            'body': json.dumps(response)
        }
    except Exception as e:
        # Printing the error in logs
        print(e)
        return {
            'statusCode': 500,
            'body': json.dumps('Something went wrong. Please check the logs.')
        }
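The function above derives both the target Region and the attachment ID from the attachment ARN in the event, which is what allows a function running in US West (Oregon) to act on a Transit Gateway in another Region. A minimal sketch of that parsing step, isolated so it can be checked locally, could look like the following. The helper name `parse_attachment_arn` and the sample ARN are hypothetical and not part of the original solution.

```python
def parse_attachment_arn(arn):
    """Split a Transit Gateway attachment ARN into (region, attachment_id).

    ARN layout: arn:partition:service:region:account-id:resource-type/resource-id
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or "/" not in parts[5]:
        raise ValueError(f"Unexpected ARN format: {arn!r}")
    region = parts[3]
    attachment_id = parts[5].split("/", 1)[1]
    return region, attachment_id


# Placeholder ARN for illustration only.
arn = (
    "arn:aws:ec2:eu-west-1:111122223333:"
    "transit-gateway-attachment/tgw-attach-0123456789abcdef0"
)
print(parse_attachment_arn(arn))
```

Validating the ARN shape up front turns a malformed event into a clear error message in the logs rather than an IndexError.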
Regarding permissions, the AWS Identity and Access Management (IAM) role associated with the function should have permissions to perform cross-Region actions. Remember to attach the managed IAM policy AWSLambdaBasicExecutionRole to allow logging to Amazon CloudWatch Logs.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:AssociateTransitGatewayRouteTable",
        "ec2:EnableTransitGatewayRouteTablePropagation",
        "ssm:GetParameter"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
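The policy above allows the three actions on all resources. If you want to narrow it, one option is to scope ssm:GetParameter to the specific parameter while leaving the EC2 actions unchanged. The sketch below builds such a policy document; the function name, the account ID, and the parameter name `/tgw/route-table-id` are hypothetical, and passing `*` as the Region is an assumption that suits a multi-Region setup where the same parameter name exists in each Region.

```python
import json


def build_routing_policy(region, account_id, parameter_name):
    """Build an IAM policy document for the routing function.

    ssm:GetParameter is limited to one parameter; the EC2 actions keep
    "Resource": "*" exactly as in the policy shown in the post.
    """
    # Parameter names may start with "/" (hierarchies). The ARN uses a
    # single slash after "parameter" either way.
    parameter_arn = (
        f"arn:aws:ssm:{region}:{account_id}:parameter/"
        f"{parameter_name.lstrip('/')}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "TransitGatewayRouting",
                "Effect": "Allow",
                "Action": [
                    "ec2:AssociateTransitGatewayRouteTable",
                    "ec2:EnableTransitGatewayRouteTablePropagation",
                ],
                "Resource": "*",
            },
            {
                "Sid": "ReadRouteTableParameter",
                "Effect": "Allow",
                "Action": "ssm:GetParameter",
                "Resource": parameter_arn,
            },
        ],
    }


policy = build_routing_policy("*", "111122223333", "/tgw/route-table-id")
print(json.dumps(policy, indent=2))
```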
With these resources ready, it is time to check the Step Functions state machine definition.
{
  "Comment": "TGW Routing Automation",
  "StartAt": "ActionType",
  "States": {
    "ActionType": {
      "Choices": [
        {
          "Next": "CreateTGWRouting",
          "StringEquals": "VPC-ATTACHMENT-CREATED",
          "Variable": "$.detail.changeType"
        }
      ],
      "Default": "SendNotification",
      "Type": "Choice"
    },
    "CreateTGWRouting": {
      "Next": "SendNotification",
      "Parameters": {
        "FunctionName": "arn:aws:lambda:us-west-2:{ACCOUNT_ID}:function:{FUNCTION_NAME}",
        "Payload.$": "$"
      },
      "Resource": "arn:aws:states:::lambda:invoke",
      "ResultPath": null,
      "Retry": [
        {
          "BackoffRate": 2,
          "ErrorEquals": [
            "Lambda.ServiceException",
            "Lambda.AWSLambdaException",
            "Lambda.SdkClientException",
            "Lambda.TooManyRequestsException"
          ],
          "IntervalSeconds": 1,
          "MaxAttempts": 3
        }
      ],
      "Type": "Task"
    },
    "SendNotification": {
      "End": true,
      "Parameters": {
        "Message.$": "$",
        "TopicArn": "arn:aws:sns:us-west-2:{ACCOUNT_ID}:{TOPIC_NAME}"
      },
      "Resource": "arn:aws:states:::sns:publish",
      "Type": "Task"
    }
  }
}
- The first state, ActionType, is a Choice state that checks the event’s changeType:
  - If it is VPC-ATTACHMENT-CREATED, the execution moves to the CreateTGWRouting state.
  - Otherwise (VPC-ATTACHMENT-DELETED), it moves directly to the SendNotification state.
- The CreateTGWRouting state invokes the Lambda function to create the Transit Gateway association and propagation. Once the routing has been created, it moves to the SendNotification state.
- The SendNotification state uses the Amazon SNS Publish task to send the email notification.
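To make the branching concrete, the routing logic of the state machine can be mirrored in a few lines of Python. This is only a local model of the definition above, useful for reasoning about which states an event visits. The function name `states_visited` is hypothetical, and nothing here calls Step Functions.

```python
def states_visited(event):
    """Return the ordered list of states an event passes through."""
    path = ["ActionType"]
    # Mirrors the single Choice rule on $.detail.changeType.
    if event["detail"]["changeType"] == "VPC-ATTACHMENT-CREATED":
        path.append("CreateTGWRouting")
    # Both the Choice default and CreateTGWRouting lead here.
    path.append("SendNotification")
    return path


created = {"detail": {"changeType": "VPC-ATTACHMENT-CREATED"}}
deleted = {"detail": {"changeType": "VPC-ATTACHMENT-DELETED"}}
print(states_visited(created))
print(states_visited(deleted))
```

Note that the Choice state's default branch accepts any changeType other than VPC-ATTACHMENT-CREATED, so it is the EventBridge rule, not the state machine, that restricts executions to the two VPC attachment events.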
With our state machine definition ready, the last thing we need to do is define its permissions. Because the cross-Region actions are performed by the Lambda function (which already has those permissions configured), the IAM role associated with the state machine can be as granular as only allowing the invocation of the Lambda function and the use of the Amazon SNS topic.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "InvokeLambdaFunction",
      "Effect": "Allow",
      "Action": "lambda:InvokeFunction",
      "Resource": "arn:aws:lambda:us-west-2:{ACCOUNT_ID}:function:{FUNCTION_NAME}"
    },
    {
      "Sid": "AllowSNSPublish",
      "Effect": "Allow",
      "Action": "sns:Publish",
      "Resource": "arn:aws:sns:us-west-2:{ACCOUNT_ID}:{TOPIC_NAME}"
    }
  ]
}
With the Step Functions state machine ready, the last thing is the EventBridge Rule that invokes it. In the following figure you can see the configuration in our example.
- Event bus: default.
- Event pattern:
  - Detail-type: “Network Manager Topology Change”.
  - Source: “aws.networkmanager”.
  - Detail – changeType: “VPC-ATTACHMENT-CREATED” and “VPC-ATTACHMENT-DELETED”.
- Target: Step Functions state machine. If you create these resources using Infrastructure as Code (IaC) (recommended option) you must provide permissions to EventBridge to invoke the state machine.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "InvokeStateMachine",
      "Effect": "Allow",
      "Action": "states:StartExecution",
      "Resource": "arn:aws:states:us-west-2:{ACCOUNT_ID}:stateMachine:{NAME}"
    }
  ]
}
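If you prefer to create the rule programmatically, the configuration listed above could be sketched with the boto3 EventBridge client as follows. This is a minimal sketch rather than the code in the aws-samples repository: the function name `create_rule`, the target ID `tgw-routing`, and the rule name are hypothetical. The client is passed in as an argument so the function can be exercised without AWS credentials.

```python
import json

# Event pattern matching the rule configuration described in the post.
VPC_ATTACHMENT_PATTERN = {
    "source": ["aws.networkmanager"],
    "detail-type": ["Network Manager Topology Change"],
    "detail": {
        "changeType": ["VPC-ATTACHMENT-CREATED", "VPC-ATTACHMENT-DELETED"]
    },
}


def create_rule(events_client, rule_name, state_machine_arn, role_arn):
    """Create the rule on the default event bus and target the state machine.

    role_arn is the IAM role EventBridge assumes to start executions
    (it needs states:StartExecution on the state machine).
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(VPC_ATTACHMENT_PATTERN),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": "tgw-routing",  # hypothetical target ID
                "Arn": state_machine_arn,
                "RoleArn": role_arn,
            }
        ],
    )


# Example usage (requires boto3 and credentials). The rule must live in
# us-west-2, the home Region of Network Manager:
#   import boto3
#   create_rule(boto3.client("events", region_name="us-west-2"),
#               "tgw-vpc-attachments", STATE_MACHINE_ARN, ROLE_ARN)
```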
Time to test the automation! If you create a new VPC attachment (the code in the aws-samples repository provides you an example), you should see a successful state machine execution (shown in figure 11), in addition to an email with the EventBridge event details (as shown in figure 12).
Finally, if you move to the VPC console (Transit Gateway route tables view) in the Region where the Transit Gateway has been created, then you can check the route table’s association and propagation of our recently created VPC attachment.
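The same check can be scripted instead of done in the console. The sketch below looks up the association and propagation states for one attachment using the EC2 API. The helper name `attachment_routing_state` is hypothetical, and for brevity it only reads the first page of results, so route tables with many attachments would need pagination.

```python
def attachment_routing_state(ec2_client, route_table_id, attachment_id):
    """Return association and propagation states for one attachment.

    Each value is the state string reported by EC2, or None when the
    attachment is not found in the (first page of the) response.
    """
    associations = ec2_client.get_transit_gateway_route_table_associations(
        TransitGatewayRouteTableId=route_table_id
    )["Associations"]
    propagations = ec2_client.get_transit_gateway_route_table_propagations(
        TransitGatewayRouteTableId=route_table_id
    )["TransitGatewayRouteTablePropagations"]

    def state_of(items):
        for item in items:
            if item.get("TransitGatewayAttachmentId") == attachment_id:
                return item.get("State")
        return None

    return {
        "association": state_of(associations),
        "propagation": state_of(propagations),
    }


# Example usage (requires boto3 and credentials). Create the client in the
# Region where the Transit Gateway lives:
#   import boto3
#   ec2 = boto3.client("ec2", region_name="eu-west-1")
#   print(attachment_routing_state(ec2, "tgw-rtb-...", "tgw-attach-..."))
```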
For multi-Account environments (when the owner of the VPC and Transit Gateway is different), this automation can help to create the Transit Gateway routing configuration at scale with each new VPC attachment. We have kept it simple with a single route table, but you can use services like AWS Secrets Manager to send more information about the VPC attachment between Accounts, and create complex automations. Furthermore, take into account that you should provide cross-Account permissions either in the Step Functions state machine or the Lambda function.
Tracking VPN actions using Lambda and DynamoDB
It’s time now for the second automation in our solution: we want to track all of the actions from our VPNs (new attachments, BGP or IPsec changes, etc.) in a DynamoDB table for further processing. This data can be used, for example, in troubleshooting activities.
As before, let’s start by checking the resources needed for the automation. In this case, as shown in figure 5, the EventBridge rule directly targets a Lambda function, which writes into the DynamoDB table. The Lambda function is created in US West (Oregon).
- Runtime Python 3.10
- 128 MB memory
- 512 MB ephemeral storage
- 1 min 30 second timeout
The function code can be found below. It obtains the DynamoDB table name from an environment variable and does a PutItem action against the table.
import json
import boto3
import os


def lambda_handler(event, context):
    try:
        # Obtaining information from the event
        timestamp = event['time']
        change_type = event['detail']['changeType']
        vpn_id = event['detail']['vpnConnectionArn']
        region = event['detail']['region']

        # Boto3 client
        dynamodb = boto3.client('dynamodb')

        # DynamoDB table
        table = os.environ['TABLE_NAME']

        # We log the event in the DynamoDB table
        dynamodb.put_item(
            TableName=table,
            Item={
                'vpn-id': {"S": vpn_id},
                'changeType': {"S": change_type},
                'awsRegion': {"S": region},
                'timestamp': {"S": timestamp}
            }
        )
        return {
            'statusCode': 200,
            'body': json.dumps('Event logged!')
        }
    except Exception as e:
        # Printing the error in logs
        print(e)
        return {
            'statusCode': 500,
            'body': json.dumps('Something went wrong. Please check the logs.')
        }
Regarding permissions, the IAM role associated with the function should have permissions to perform dynamodb:PutItem actions against the DynamoDB table created. Remember also to attach the managed IAM policy AWSLambdaBasicExecutionRole to allow logging to Amazon CloudWatch Logs.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "dynamodb:PutItem"
      ],
      "Resource": [
        "arn:aws:dynamodb:us-west-2:{ACCOUNT_ID}:table/{TABLE_NAME}"
      ],
      "Effect": "Allow"
    }
  ]
}
The DynamoDB table has been configured with vpn-id as partition key, and changeType as sort key.
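One consequence of this key schema is worth keeping in mind: PutItem replaces any existing item with the same primary key, so the table holds the most recent occurrence of each changeType per VPN rather than a full history. The sketch below models this with an in-memory dictionary standing in for the table. The helper names, the VPN ARN, and the changeType strings are illustrative assumptions, and no DynamoDB call is made.

```python
def build_item(event):
    """Build the item written by the function (DynamoDB attribute-value format)."""
    detail = event["detail"]
    return {
        "vpn-id": {"S": detail["vpnConnectionArn"]},
        "changeType": {"S": detail["changeType"]},
        "awsRegion": {"S": detail["region"]},
        "timestamp": {"S": event["time"]},
    }


def put_item(table, item):
    """Mimic PutItem: an item with the same (partition, sort) key is replaced."""
    key = (item["vpn-id"]["S"], item["changeType"]["S"])
    table[key] = item


def make_event(time, change_type):
    """Create a trimmed, illustrative VPN event (placeholder ARN)."""
    return {
        "time": time,
        "detail": {
            "changeType": change_type,
            "region": "eu-west-1",
            "vpnConnectionArn": (
                "arn:aws:ec2:eu-west-1:111122223333:"
                "vpn-connection/vpn-0123456789abcdef0"
            ),
        },
    }


table = {}
put_item(table, build_item(make_event("2023-01-01T12:00:00Z", "VPN-CONNECTION-IPSEC-DOWN")))
put_item(table, build_item(make_event("2023-01-01T13:00:00Z", "VPN-CONNECTION-IPSEC-DOWN")))
print(len(table))  # one item: the second event replaced the first
```

If you need every occurrence, one option (an assumption on our side, not what the table in this post uses) is a sort key that includes the timestamp.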
Just as before, with the automation built, it’s time to create the EventBridge rule that invokes the Lambda function.
- Event bus: default
- Event pattern:
  - Detail-type: “Network Manager Topology Change”
  - Source: “aws.networkmanager”
  - We want to filter by changeType, and trigger the rule anytime there’s an event related to a VPN attachment, connection, or tunnel. To simplify the definition, we use prefix matching. You can find more information about how to filter content in the EventBridge event patterns documentation page.
- Target: Lambda function. If you create these resources using IaC (recommended option) you must provide permissions to EventBridge to invoke the function.
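As an illustration of the prefix matching mentioned above, the event pattern could look like the dictionary below. The prefix value `VPN-` is an assumption based on the description (VPN attachment, connection, and tunnel events); check the change types in the Network Manager documentation before relying on it. The `matches` function is a simplified local check that only understands exact values and the `prefix` operator. It is not EventBridge's matching engine, only a way to sanity-check a pattern against sample events.

```python
VPN_PATTERN = {
    "source": ["aws.networkmanager"],
    "detail-type": ["Network Manager Topology Change"],
    "detail": {"changeType": [{"prefix": "VPN-"}]},  # assumed prefix
}


def matches(pattern, event):
    """Simplified matcher: exact values and {"prefix": ...} only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
            continue
        ok = False
        for candidate in expected:
            if isinstance(candidate, dict) and "prefix" in candidate:
                ok = isinstance(actual, str) and actual.startswith(candidate["prefix"])
            else:
                ok = actual == candidate
            if ok:
                break
        if not ok:
            return False
    return True


def sample(change_type):
    """Build a trimmed, illustrative event with the given changeType."""
    return {
        "source": "aws.networkmanager",
        "detail-type": "Network Manager Topology Change",
        "detail": {"changeType": change_type},
    }


print(matches(VPN_PATTERN, sample("VPN-CONNECTION-CREATED")))
print(matches(VPN_PATTERN, sample("VPC-ATTACHMENT-CREATED")))
```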
Time to test the new automation. We created a couple of VPN attachments in our Transit Gateway and performed some actions (BGP and IPsec up and down). In this VPN Gateway strongSwan aws-samples repository you can find the code we used to create the VPN connections. This is the information we have in our DynamoDB table:
As seen in figure 17, our event processing solution is tracking VPN events in DynamoDB. Now we have visibility into the different events happening to our VPNs, and when they happened. In our example, we have simply filtered some information provided by the event before adding it in the DynamoDB table. However, you can also use the Lambda function to enhance the event’s information or correlate several events to automate some actions.
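As one example of correlating several events, the sketch below pairs each "down" event with the next "up" event for the same VPN connection and sums the time in between. It is a hypothetical illustration, not part of the solution in the aws-samples repository: the function names are ours, and it assumes that status change types end in `-DOWN` and `-UP` (for example for IPsec and BGP) and that event timestamps use the `YYYY-MM-DDTHH:MM:SSZ` format.

```python
from datetime import datetime


def parse_time(value):
    """Parse a UTC timestamp such as 2023-01-01T12:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def downtime_seconds(events, down_suffix="-DOWN", up_suffix="-UP"):
    """Sum downtime per (VPN ARN, change kind) from paired DOWN/UP events.

    The change kind is the changeType without its suffix, so IPsec and BGP
    transitions are tracked separately.
    """
    went_down = {}
    totals = {}
    for event in sorted(events, key=lambda e: e["time"]):
        vpn = event["detail"]["vpnConnectionArn"]
        change = event["detail"]["changeType"]
        if change.endswith(down_suffix):
            key = (vpn, change[: -len(down_suffix)])
            # Keep the earliest unmatched DOWN for this key.
            went_down.setdefault(key, parse_time(event["time"]))
        elif change.endswith(up_suffix):
            key = (vpn, change[: -len(up_suffix)])
            if key in went_down:
                gap = parse_time(event["time"]) - went_down.pop(key)
                totals[key] = totals.get(key, 0) + int(gap.total_seconds())
    return totals


# Illustrative events with a placeholder ARN and assumed change types.
VPN = "arn:aws:ec2:eu-west-1:111122223333:vpn-connection/vpn-0123456789abcdef0"
events = [
    {"time": "2023-01-01T12:05:30Z",
     "detail": {"vpnConnectionArn": VPN, "changeType": "VPN-CONNECTION-IPSEC-UP"}},
    {"time": "2023-01-01T12:00:00Z",
     "detail": {"vpnConnectionArn": VPN, "changeType": "VPN-CONNECTION-IPSEC-DOWN"}},
]
print(downtime_seconds(events))
```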
Considerations
The following points are worth considering:
- You can use Network Manager for multi-Account visibility with AWS Accounts inside the same Organization. Remember that you must enable trusted access for your delegated administrator.
- Remember that Network Manager’s home Region is US West (Oregon). This means that you should process Network Manager Events in this AWS Region, regardless of where you have your resources.
- If you want to leverage Network Manager Events for your Transit Gateways, you must first register them in a global network.
- Network Manager is free of charge. However, when building the example we showed, you must consider the pricing of each specific Serverless service:
  - EventBridge AWS default service events are free.
  - Amazon SNS has a free tier for the first 1,000 notifications.
  - Lambda and DynamoDB have costs associated with their use.
  - Step Functions has a free tier for the first 4,000 state transitions.
  - AWS Systems Manager Parameter Store standard parameters are free.
Conclusion
In this post, we have reviewed the events provided by Network Manager and how you can consume them using EventBridge rules. Additionally, we presented an example of how you can leverage AWS Serverless services to either get notifications or create automations in your environment.
To get started with the solution proposed in this post in your own AWS Account, clone the following aws-samples repository and check the resources created and their configuration.
Visit the Network Manager documentation page for additional information.