AWS Partner Network (APN) Blog
Onboarding and Managing Agents in a SaaS Solution
Editor’s note: This is the first in a two-part series about managing agents in a SaaS solution. Read Part 2 >>
By Oren Reuveni, Sr. Partner Solutions Architect – AWS SaaS Factory
Software-as-a-service (SaaS) products frequently use agents to gather data, execute actions, communicate with remote components, and run other product-related tasks in remote environments. These agents can be deployed in multiple forms and for multiple purposes.
If you manage multi-tenant SaaS environments and use agents, you face some unique challenges. Implementing such a solution requires adequate design.
For instance, you have to be sure those agents are securely identified and associated with their tenants, and that they are prevented from accessing data outside their tenant context. The ability to configure tenant-related settings, such as tier type, and apply specific configuration changes can also be a challenge.
This post focuses on the deployment and management of agents in a SaaS environment. I will review the key considerations to keep in mind when building such solutions, and discuss the main challenges associated with registering a new agent in the system. I will also explore managing agents throughout their lifecycle in a multi-tenant SaaS environment.
In a follow-up post, I demonstrated the concept described here using a solution based on the AWS IoT Core managed service.
About SaaS Agents
SaaS agents are essentially pieces of logic that run remotely in the customer’s environment, where they gather essential data or execute actions on behalf of the SaaS application.
For example, an agent can send different metrics back to the SaaS environment, or run product-related tasks in the environment it’s deployed in. Let’s explore the concept by reviewing several key aspects.
Figure 1 – SaaS agents in a multi-tenant cloud environment.
Figure 1 depicts a scenario of managing multiple agents. The agents belong to different tenants.
At this stage, they are already registered in, and being managed by, a single SaaS environment. It’s critical for the SaaS provider to ensure each agent operates only within its scope, since these agents will be communicating with one shared, multi-tenant environment.
Two aspects of the SaaS agent model are key: registration and management. To register the agent, use the SaaS solution to generate credentials and registration data that enable the agent to establish a secure communication channel with the SaaS environment.
After you have registered the agent, the SaaS solution manages it throughout its lifecycle. This includes facilitating communication, ingestion of telemetry data, ongoing management of the agents, configuring and updating them, and disabling them if needed.
Challenges of Using Agents in a SaaS Environment
The introduction of agents into your SaaS model adds a new set of considerations to your application design and architecture. Agents add new dimensions to your solution footprint that can influence the security and performance of tenant environments.
Following is a list of key areas to keep in mind when using agents as part of a SaaS solution:
- Identity — Each agent has to be positively identified and correlated to its tenant. Security, activity metering, and service tiers are directly related to the agent’s identity.
- Isolation — The agent has to operate only within the scope of the tenant it belongs to, and cannot access another tenant’s data.
- Throttling and noisy neighbor mitigation — Since each tenant is sending data to, and receiving data from, its agents while using a shared environment, consider whether that traffic could introduce a noisy neighbor condition, where one tenant’s load impacts the experience of other tenants.
- Tenant management and configuration — Each tenant has to be managed and configured according to the customer’s requirements and contracted level of service. This point is strongly connected to automation.
- Automation — Automation reduces friction and that reduction is essential to agility. It can enable customers to set up agents using a self-service model or, in some cases, provide agent installation as a fully automated process during the onboarding stage. Your solution needs to support all phases of the agent’s lifecycle, including onboarding, management, and deletion.
This enables the SaaS provider to meet customer requirements and operate at larger scale.
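To make the throttling consideration above more concrete, here is a minimal sketch of per-tenant rate limiting with a token bucket. The class name `TenantThrottle`, its parameters, and the choice of a token bucket are assumptions made for illustration; a production system would more likely rely on a managed throttling feature or a shared store.

```python
import time


class TenantThrottle:
    """Per-tenant token bucket (hypothetical helper, not part of any AWS API)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens refilled per second, per tenant
        self.burst = burst         # maximum tokens a tenant can accumulate
        self.clock = clock         # injectable clock, which simplifies testing
        self._buckets = {}         # tenant_id -> (tokens, last_update_time)

    def allow(self, tenant_id):
        """Return True if the tenant may send one more message right now."""
        now = self.clock()
        tokens, last = self._buckets.get(tenant_id, (self.burst, now))
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[tenant_id] = (tokens - 1, now)
            return True
        self._buckets[tenant_id] = (tokens, now)
        return False


if __name__ == "__main__":
    throttle = TenantThrottle(rate_per_sec=1.0, burst=2)
    # A burst from one tenant is cut off without affecting another tenant.
    print([throttle.allow("tenant-a") for _ in range(3)])
    print(throttle.allow("tenant-b"))
```

Because each tenant has its own bucket, a burst from one tenant’s agents uses up only that tenant’s allowance.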
The first stage in working with agents is deploying them in their target environment and registering them with the SaaS solution. Registration ensures the agent:
- Is identified by the system and associated with the right tenant.
- Interacts with the tenant using the appropriate credentials and tenant scope.
While the SaaS provider designs and facilitates the agent registration process, the user is the entity that executes it in a self-service manner. The SaaS provider should make sure the agents are successfully going through the registration process, while ensuring they are coupled to the tenant they belong to, and that tenant data and configuration remain isolated.
As a first step, the agent’s installation package needs to be deployed in the target environment.
An agent can take several forms. It can be a code library or a binary deployed in a server. It can also be packaged as an AWS Lambda function, a container, or a server image like an Amazon Machine Image (AMI).
You can use different methods to deploy the agent. One common method is using an infrastructure as code tool such as AWS CloudFormation or Terraform. Another is using a script to automate deployment.
Figure 2 – Agent registration process.
After the agent is deployed in the target environment, provide it with registration data so it can authenticate and integrate with the SaaS environment. This data is generated in a self-service manner by the SaaS application user, interactively via the user interface, or programmatically via an API.
Communication with the agent needs to be secured, and you can do that in multiple ways. A common method is using a certificate to encrypt and sign the communication between the agent and SaaS environment. I will demonstrate the use of such a mechanism in the second part of this post.
Following is a possible format for such a token and registration data. Please note the token will most likely be encoded and encrypted prior to sending it over the wire.
This token is in JSON format:
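The original token listing is not reproduced in this text. As a purely illustrative sketch, the snippet below builds a token that carries the tenant and agent identifiers mentioned in this post and encodes it before it is sent. All field names (`tenant_id`, `agent_id`, `endpoint`), the example values, and the helper functions are assumptions, not the author’s actual format. Base64 here is only an encoding; it provides no encryption.

```python
import base64
import json


def build_registration_token(tenant_id: str, agent_id: str, endpoint: str) -> str:
    """Serialize hypothetical registration data to JSON and base64-encode it."""
    payload = {
        "tenant_id": tenant_id,  # assumed field: the tenant the agent belongs to
        "agent_id": agent_id,    # assumed field: unique ID of the agent
        "endpoint": endpoint,    # assumed field: SaaS API endpoint to contact
    }
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_registration_token(token: str) -> dict:
    """Reverse the encoding step and return the JSON payload as a dict."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(raw)


# Placeholder values for illustration only.
token = build_registration_token(
    "tenant-1234", "agent-0001", "https://api.example.com/register"
)
print(json.dumps(decode_registration_token(token), indent=2))
```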
Now you can start the registration process. In the suggested flow, the agent receives this token as an argument when it is launched for the first time. This usually happens as part of an automated process that registers one or many agents, but the SaaS solution user can also run it manually.
Once the agent has the token, it contacts the SaaS environment’s API endpoint and authenticates using the chosen secure communication mechanism. After authentication completes, the SaaS environment validates and processes the data in the token to register the agent in the system.
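The validation step on the SaaS side could look roughly like the sketch below. For brevity it checks an HMAC signature instead of the certificate-based mechanism mentioned earlier, so treat that as a swapped-in technique. The secret, the field names, and the `register_agent` function are all hypothetical.

```python
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical signing key; never hard-code real secrets


def sign(payload: dict) -> str:
    """Compute an HMAC-SHA256 signature over the canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()


def register_agent(payload: dict, signature: str, registry: dict, known_tenants: set) -> dict:
    """Validate the token data, then record the agent under its tenant."""
    if not hmac.compare_digest(sign(payload), signature):
        raise PermissionError("token signature mismatch")
    tenant_id = payload["tenant_id"]
    if tenant_id not in known_tenants:
        raise LookupError(f"unknown tenant: {tenant_id}")
    record = {"tenant_id": tenant_id, "status": "registered"}
    registry[payload["agent_id"]] = record
    return record


if __name__ == "__main__":
    registry = {}
    payload = {"tenant_id": "tenant-a", "agent_id": "agent-1"}
    print(register_agent(payload, sign(payload), registry, {"tenant-a"}))
```

A token whose tenant identifier was altered after signing fails the signature check, so the agent cannot register itself under a different tenant.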
As noted above, the agent’s token may contain the tenant identification data along with a unique ID that identifies the agent in the system. An alternative option is not sending this data at all, by using a unique identifier for the agent, and mapping it to the rest of the required data that will be stored on the SaaS environment side.
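The alternative described above, where the agent carries only an opaque identifier and the rest of the data stays on the SaaS side, might be sketched as follows. The `RegistrationStore` class and its fields are assumptions for illustration. A real implementation would use a durable store and probably an expiry time.

```python
import secrets


class RegistrationStore:
    """Maps opaque registration IDs to tenant data kept on the SaaS side."""

    def __init__(self):
        self._pending = {}  # registration_id -> tenant-related data

    def issue(self, tenant_id: str, tier: str) -> str:
        """Create a random ID that reveals nothing about the tenant."""
        registration_id = secrets.token_urlsafe(16)
        self._pending[registration_id] = {"tenant_id": tenant_id, "tier": tier}
        return registration_id

    def redeem(self, registration_id: str):
        """Return the stored data once; later attempts return None."""
        return self._pending.pop(registration_id, None)


if __name__ == "__main__":
    store = RegistrationStore()
    rid = store.issue("tenant-a", "premium")
    print(store.redeem(rid))  # the tenant data, looked up server-side
    print(store.redeem(rid))  # None: the identifier is single-use
```

With this design the token sent to the agent contains no tenant details, at the cost of a lookup on the SaaS side during registration.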
On the SaaS environment side, relevant elements like the agent’s profile and permissions are defined by the system according to the tenant it belongs to. As part of the implementation of this process, configuration data and tenant-specific data (like user-defined scripts or other configuration parameters) can also be sent to the agent during the onboarding stage.
At this point, the agent can communicate with the SaaS environment and vice versa.
Managing Agents Throughout Their Lifecycle
An agent’s lifecycle begins with the registration process, and continues through ongoing management, monitoring, versioning, and deployment of updates. It usually ends with removing the agent from the system. To manage agents effectively, you must be able to handle version upgrades, deploy configuration changes, and aggregate the data they send.
Backwards compatibility is also a requirement since different agents can have different versions and configurations.
In agent-based environments, it’s often important for SaaS providers to seamlessly deploy new versions. Automating these updates allows you to avoid downtime, deliver better value for your customers, and better scale your operation.
Deploying a new version to thousands or even millions of agents must be automated, because doing it manually would take more time than is financially viable and would be error-prone. Carefully document the upgrade so you can handle potential errors, bugs, and deployment rollbacks, if necessary.
As your agents evolve, you’ll want to ensure you have mechanisms in place to accurately track the version history of your agents. This is especially important since your system may support multiple versions of an agent at a given moment in time. Knowing the history of each version is essential to troubleshooting the various agents that could be deployed.
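One way to picture version tracking is a small ledger that records each version an agent reports and checks it against a minimum supported version. The `VersionLedger` class, the dotted numeric version format, and the minimum-version policy are all assumptions made for this sketch.

```python
from collections import defaultdict


def parse_version(version: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


class VersionLedger:
    """Tracks the versions each agent has reported (hypothetical helper)."""

    def __init__(self, min_supported: str):
        self.min_supported = parse_version(min_supported)
        self.history = defaultdict(list)  # agent_id -> versions in order seen

    def record(self, agent_id: str, version: str) -> None:
        """Append the version only when it differs from the last one seen."""
        versions = self.history[agent_id]
        if not versions or versions[-1] != version:
            versions.append(version)

    def is_supported(self, agent_id: str) -> bool:
        """True if the agent's latest version meets the minimum."""
        versions = self.history.get(agent_id)
        return bool(versions) and parse_version(versions[-1]) >= self.min_supported


if __name__ == "__main__":
    ledger = VersionLedger(min_supported="1.10.0")
    ledger.record("agent-1", "1.9.0")
    print(ledger.is_supported("agent-1"))
    ledger.record("agent-1", "1.10.0")
    print(ledger.is_supported("agent-1"), ledger.history["agent-1"])
```

Comparing parsed tuples rather than raw strings matters here, because a plain string comparison would rank "1.9.0" above "1.10.0".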
Data flow is another important topic to keep in mind. The agent can ship multiple types of data into the SaaS environment. It can be solution telemetry data (event log, for example), agent log data, or data that’s part of the product itself. The data that comes from the agents should be ingested automatically. It’s common for this data to be transformed and enriched prior to storing it in the system.
An example of that is splitting the data into per tenant partitions, or keeping different data types (say, logs and product-related data) in different data stores.
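As a rough illustration of that kind of transformation, the sketch below groups incoming records by data type and tenant and tags each one with a partition label. The record fields (`tenant_id`, `type`) and the `type/tenant=<id>` label format are assumptions, not a prescribed layout.

```python
from collections import defaultdict


def partition_records(records):
    """Group records by (data type, tenant) and tag each with a partition label."""
    stores = defaultdict(list)
    for record in records:
        key = (record["type"], record["tenant_id"])
        # Enrich a copy of the record instead of mutating the caller's data.
        enriched = {
            **record,
            "partition": f"{record['type']}/tenant={record['tenant_id']}",
        }
        stores[key].append(enriched)
    return dict(stores)


if __name__ == "__main__":
    incoming = [
        {"tenant_id": "t1", "type": "logs", "body": "agent started"},
        {"tenant_id": "t2", "type": "logs", "body": "agent started"},
        {"tenant_id": "t1", "type": "product", "body": "scan result"},
    ]
    for key, items in partition_records(incoming).items():
        print(key, [item["partition"] for item in items])
```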
Agents may also need a way to surface alerts and notifications. A notification mechanism can provide the user with valuable information, such as agent usage metrics, or inform the SaaS provider about events such as the registration of a new agent. By collecting and analyzing notifications about tenant events, the SaaS provider gains better insight into tenant activity patterns.
Alerts, which are defined with thresholds, can set off alarms to notify the SaaS provider of unusual activity, or trigger mitigation mechanisms, such as scaling and tenant activity throttling.
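A threshold-based alert can be as simple as counting events per tenant over a time window and flagging tenants that exceed a limit. The function name, the event shape, and the threshold value below are hypothetical. In practice this check would usually live in a monitoring service rather than in application code.

```python
from collections import Counter


def find_noisy_tenants(events, threshold):
    """Return tenant IDs whose event count in this window exceeds the threshold."""
    counts = Counter(event["tenant_id"] for event in events)
    return sorted(tenant for tenant, count in counts.items() if count > threshold)


if __name__ == "__main__":
    window = [
        {"tenant_id": "tenant-a"},
        {"tenant_id": "tenant-a"},
        {"tenant_id": "tenant-a"},
        {"tenant_id": "tenant-b"},
    ]
    # Tenants returned here could be throttled or reported to the provider.
    print(find_noisy_tenants(window, threshold=2))
```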
The lifecycle usually ends when a managed environment becomes deprecated, or, for example, when a user decides to stop using the solution. Either case results in the need to delete the agent from the system. Another relevant scenario for ending the lifecycle is quickly disabling access to the system for rogue agents which, for example, can reside in an environment that was compromised or whose performance was impacted.
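The difference between disabling and deleting an agent can be sketched with a small in-memory registry. The `AgentRegistry` class and its methods are assumptions for illustration only. With certificate-based communication, disabling would typically also involve revoking or deactivating the agent’s certificate.

```python
class AgentRegistry:
    """In-memory stand-in for the SaaS-side agent store (hypothetical)."""

    def __init__(self):
        self._agents = {}  # agent_id -> {"tenant_id": ..., "enabled": ...}

    def register(self, agent_id: str, tenant_id: str) -> None:
        self._agents[agent_id] = {"tenant_id": tenant_id, "enabled": True}

    def disable(self, agent_id: str) -> None:
        """Cut off access quickly while keeping the record for investigation."""
        self._agents[agent_id]["enabled"] = False

    def delete(self, agent_id: str) -> None:
        """Remove the agent from the system at the end of its lifecycle."""
        self._agents.pop(agent_id, None)

    def authorize(self, agent_id: str, tenant_id: str) -> bool:
        """Allow a request only from an enabled agent acting within its own tenant."""
        record = self._agents.get(agent_id)
        return bool(record) and record["enabled"] and record["tenant_id"] == tenant_id


if __name__ == "__main__":
    registry = AgentRegistry()
    registry.register("agent-1", "tenant-a")
    print(registry.authorize("agent-1", "tenant-a"))  # enabled, own tenant
    registry.disable("agent-1")
    print(registry.authorize("agent-1", "tenant-a"))  # disabled agent is rejected
```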
I have suggested a conceptual approach to registering and managing agents throughout their lifecycle. I talked about key topics to look into when building such solutions, reviewed the registration and onboarding flow, and discussed actions that are required during the ongoing work with the agents.
For SaaS environments that rely on agents, it’s essential to examine all the moving parts of the agent lifecycle. The goal here is to introduce agents without undermining the agility, security, or manageability of your SaaS environment. This means focusing on introducing all the mechanisms that ensure you can effectively deploy and manage your agents in a way that mitigates the friction they might introduce into your environment.
In a future post, I will use an example based on AWS IoT Core that implements the concept described here.
About AWS SaaS Factory
AWS SaaS Factory helps organizations at any stage of the SaaS journey. Whether looking to build new products, migrate existing applications, or optimize SaaS solutions on AWS, we can help. Visit the AWS SaaS Factory Insights Hub to discover more technical and business content and best practices.
SaaS builders are encouraged to reach out to their account representative to inquire about engagement models and to work with the AWS SaaS Factory team.