AWS for Industries

Accelerating DOCSIS 4.0 adoption with generative AI on AWS

Introduction

Data Over Cable Service Interface Specification (DOCSIS®) is a critical technology standard in modern cable broadband networks, enabling cable operators to deliver high-speed internet, voice, and IP video services over existing hybrid fiber-coaxial (HFC) infrastructure. Published by CableLabs, DOCSIS technology supports hundreds of millions of residential and enterprise broadband customers globally. DOCSIS networks also play a role in mobile backhaul and public Wi-Fi deployments, further expanding their reach and significance in the telecommunications landscape. As bandwidth demand surges with applications such as augmented reality, cloud gaming, and 8K video, operators must expand network capacity while maintaining service. Although the DOCSIS 4.0 specification promises up to 10 Gbps of downstream capacity, further enhancements in HFC infrastructure and network management are expected to drive even greater performance. As Multiple Service Operators (MSOs) look to evolve to DOCSIS 4.0 networks and beyond, we see artificial intelligence (AI) as a crucial component to enhance connectivity, network performance, and operational support.

This post outlines a practical framework for integrating generative AI into DOCSIS 4.0 operations using AWS services. Using Amazon Bedrock with state-of-the-art foundation models (FMs) such as Amazon Nova allows MSOs to streamline DOCSIS 4.0 network management while enhancing network observability and efficiency. Although many AI use cases are applicable to DOCSIS network operations, this post focuses on core generative AI architectural elements, including enhancing Knowledge Bases for Retrieval Augmented Generation (RAG) through advanced chunking strategies, developing Agentic intelligence with the ability to reason and take actions using tools, and applying Responsible AI practices with robust security controls and governance.

DOCSIS 4.0 landscape and MSO challenges

As the cable industry ramps up deployments of DOCSIS 4.0 networks, adopting the new standard presents multifaceted challenges across people, processes, and technology. MSOs face complex decisions in capacity forecasting, ongoing maintenance, and troubleshooting between the access network and the core, while continuously striving to improve the end customer experience. Planning network capacity involves deciding when to split nodes, allocate spectrum, and balance upstream/downstream bandwidth. Engineering teams must make sense of extensive, fragmented documentation such as industry specifications, vendor equipment operating manuals, and internal guides; extract intelligence; and apply technical domain expertise to make forward-looking decisions. Network Operations Centers (NOCs) manage vast telemetry, alarms, and performance data, necessitating swift anomaly diagnosis. The virtual cable modem termination system (vCMTS) evolution will further intensify telemetry volumes, with continuous data streaming at intervals as short as a few seconds, compared to traditional Simple Network Management Protocol (SNMP) polling, which can be as infrequent as every 15-30 minutes. Not all NOC engineers possess deep DOCSIS 4.0 expertise, and searching for troubleshooting procedures can slow adoption and impede ongoing support. We performed sample experiments to answer domain-specific questions, such as DOCSIS capacity planning. The results revealed that generic, widely available large language models (LLMs) can produce unreliable results, confusing European and North American standards and providing conflicting or incorrect guidance.

Generative AI framework for DOCSIS intelligence

Generative AI offers MSOs a platform to streamline DOCSIS planning, accelerate troubleshooting, and democratize domain expertise. In this post, we cover three foundational concepts: improving your generative AI knowledge bases, building AI Agents that combine Agentic RAG with tool use, and establishing guardrails for Responsible AI. We believe these three areas are foundational to building a robust generative AI-powered DOCSIS operational platform.

Figure 1: DOCSIS Intelligence – a high-level Agentic concept

Knowledge bases

Most MSOs have likely already experimented with knowledge search prototype proofs of concept (POCs) using some form of RAG pattern. One of the immediate generative AI applications is building intelligent assistants for consulting domain-specific resources, such as CableLabs DOCSIS specifications, whitepapers, and internal engineering guides. Powered by Amazon Bedrock, MSOs can quickly scale their prototype assistants to production for retrieval, summarization, and Q&A tasks, such as determining when to split nodes, allocating channels and widths, interpreting signal quality metrics, or gathering security requirements on Cable Modems and CMTSs. As several proof-of-concept projects have already covered basic RAG capabilities, we highlight only the key elements of RAG that emerged as differentiators in our testing.

Beyond the data itself, three factors stood out during our work: implementing data preprocessing, selecting (and iterating on) the right chunking strategy, and applying guardrails for governance (covered later in this post).

Data preprocessing
We noticed that the majority of the DOCSIS 4.0 specifications and other relevant data sources in our knowledge base included a distinct header and footer on every page. This added information, while appearing benign, occasionally contaminated the search context. Removing the extra header/footer text showed how small preprocessing steps contribute to improved qualitative outcomes. Data preprocessing goes beyond removing headers and footers, however; it requires an approach that evolves with the idiosyncrasies of each data source.
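As a simple illustration, the following sketch (in Python) removes lines that repeat as the first or last line on most pages, a common header/footer pattern. It assumes the source documents have already been extracted to per-page plain text; the repeat threshold, and any handling of page numbers or other source-specific quirks, are assumptions you would tune per data source.

from collections import Counter

def strip_repeated_headers_footers(pages, min_repeat_ratio=0.8):
    """Remove lines that appear as the first or last line on most pages
    (typical headers/footers) before the text is chunked and embedded."""
    edge_lines = Counter()
    for page in pages:
        lines = [line.strip() for line in page.splitlines() if line.strip()]
        if lines:
            edge_lines.update({lines[0], lines[-1]})

    threshold = min_repeat_ratio * len(pages)
    boilerplate = {line for line, count in edge_lines.items() if count >= threshold}

    cleaned_pages = []
    for page in pages:
        kept = [line for line in page.splitlines() if line.strip() not in boilerplate]
        cleaned_pages.append("\n".join(kept))
    return cleaned_pages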

Chunking strategy
As most documents are long, chunking is crucial to break down large documents into smaller, manageable pieces that fit within the context window. Dividing inputs into smaller chunks allows generative AI systems to process information more efficiently and faster. Chunking breaks down the text into smaller segments before embedding your knowledge sources. This is important to make sure you only retrieve content that is highly pertinent to the query, reduce noise, improve retrieval speed, and bring more relevant context into the retrieval process in RAG. The domain, content, query patterns, and LLM constraints heavily influence the ideal chunk size and method. For our use case with technical DOCSIS 4.0 specifications, we experimented with four different chunking methods offered by Amazon Bedrock Knowledge Bases. Each method offers distinct advantages and limitations in handling the complex technical content, detailed specifications, and intricate relationships present in the DOCSIS documentation, as well as in cost.

Fixed-size chunking represents the simplest approach of document segmentation, where the content is divided into chunks of a predetermined size measured in tokens (for example, 512 tokens per chunk for Cohere Embed English v3 model). This method includes a configurable overlap percentage between chunks to maintain some continuity. Although it offers the advantages of predictable chunk sizes (and thus costs), its main drawback is that it may split content mid-sentence or separate related information. Fixed-size chunking is particularly useful if the data is relatively uniform in length and structure, with limited context awareness and predictable low costs as a priority.

Default chunking splits the content into chunks of approximately 300 tokens while respecting sentence boundaries. This method automatically makes sure that sentences remain intact, making it more natural for text processing. It needs no special configuration and provides a good balance between simplicity and content integrity. However, it offers limited control over chunk size and context preservation. This method works well for basic text processing where maintaining complete sentences is important but sophisticated content relationships are less critical.

Hierarchical chunking creates a structured approach by establishing parent-child relationships within the content. During retrieval, the system initially retrieves child chunks, but replaces them with broader parent chunks to provide the model with more comprehensive context. For example, in the DOCSIS 4.0 Physical Layer Specification, a parent chunk might contain broader portions of the section “7.2.4.12 FDX Fidelity Requirements”. Child chunks would break down specific subsections such as Power Requirements, Table References, and Technical Parameters. If the question is about power requirements, searching over the smaller child embeddings is more precise and quicker. The system then replaces the child chunks with the broader parent chunk embeddings for comprehensive context retrieval that ties the power requirements to the broader Full Duplex (FDX) Fidelity requirements. This method excels at maintaining document structure and preserving contextual relationships between different levels of information. It works best with well-structured content and is particularly valuable for technical documentation, maintaining hierarchical relationships while retrieving relevant context quickly.

Semantic chunking divides text based on meaning and contextual relationships. It employs a buffer that considers surrounding text to maintain context. It uses three key parameters: maximum tokens per chunk, buffer size for surrounding sentences, and a breakpoint percentile threshold to determine chunk boundaries. This method is especially powerful for preserving meaning-based relationships in unstructured content. It enables related concepts to stay connected even when they appear in different places, say in a conversational text data source. Although this method demands more computational resources, with added costs for using an FM for semantic processing, it excels at maintaining the coherence of related concepts and their relationships. We found this approach more suitable for scenarios with natural language content in the knowledge base, for example conversation transcripts between a call center agent and a broadband subscriber, where related pieces of information are scattered throughout the text.

Due to the inherently organized nature of the DOCSIS documentation, which features well-defined sections, subsections, and clear parent-child relationships between technical concepts, we found that Hierarchical chunking was the most suitable approach for our use case. The method’s ability to keep related technical specifications together while preserving their relationship to broader sections proved particularly valuable for navigating and understanding the complex DOCSIS 4.0 specifications. One caveat is that broader parent chunks mean more input tokens and thus tend to be more expensive. Although it’s hard to give prescriptive guidance on which method works best for your use case, we recommend conducting a deeper validation of your data with tools such as the RAG evaluation and LLM-as-a-judge capabilities, now available as features (in preview as of December 2024) in Amazon Bedrock. Figure 2 shows a simplified representation across chunking strategies within Amazon Bedrock.

Figure 2: A simplified representation comparing chunking methods applied to DOCSIS 4.0 Specs
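For reference, the following is a minimal Boto3 sketch of how hierarchical chunking might be configured when creating a data source for Amazon Bedrock Knowledge Bases. The knowledge base ID, bucket ARN, and token sizes are placeholders, and the parameter names should be validated against the current Bedrock API; the other strategies (fixed-size, semantic, or the default) are selected through the same chunkingConfiguration block.

import boto3

bedrock_agent = boto3.client("bedrock-agent")

# Placeholder IDs and ARNs for illustration only
response = bedrock_agent.create_data_source(
    knowledgeBaseId="KB_ID_PLACEHOLDER",
    name="docsis-40-specifications",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::example-docsis-specs"},
    },
    vectorIngestionConfiguration={
        "chunkingConfiguration": {
            "chunkingStrategy": "HIERARCHICAL",
            "hierarchicalChunkingConfiguration": {
                # Larger parent chunks preserve section-level context;
                # smaller child chunks keep retrieval precise.
                "levelConfigurations": [
                    {"maxTokens": 1500},  # parent
                    {"maxTokens": 300},   # child
                ],
                "overlapTokens": 60,
            },
        }
    },
)
print(response["dataSource"]["dataSourceId"])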

In future posts, we will cover more Advanced RAG concepts applied to Pre-retrieval (Query rewriting, Query Expansion) and Post-retrieval (Reranking). We will also cover advanced applications with Structured Data Retrieval and Graph RAG, which help uncover relationships in network telemetry data.

AI Agents

Peter Norvig and Stuart Russell in their authoritative AI reference “Artificial Intelligence: A Modern Approach” articulate an agent as an artificial entity capable of perceiving its surroundings using sensors, making decisions, and then taking actions in response using actuators. For our DOCSIS 4.0 Intelligence framework, we adapt the AI Agent concept as an overarching intelligent autonomous entity. The Agentic framework is capable of planning, reasoning, and acting, with access to your curated DOCSIS knowledge base and guardrails to safeguard intelligent orchestration. In the following section, we demonstrate steps to build a sample agent that helps engineers calculate DOCSIS network capacity.

DOCSIS AI Agents: the why

We found that zero-shot chain-of-thought prompting of an LLM for domain-specific questions, such as DOCSIS network capacity calculations, led to inaccurate results. Interestingly, the zero-shot answer from Mistral Large 24.07 defaulted to European DOCSIS standards (Figure 3.1), while the answer from Anthropic Claude Sonnet 3.5 v2 defaulted to US DOCSIS standards (Figure 3.2).

Figure 3.1: Zero-shot output with Mistral Large 24.07

Figure 3.2: Zero-shot output with Claude Sonnet 3.5 v2

To accurately calculate DOCSIS 4.0 capacity with deterministic outcomes, we demonstrate how to build a DOCSIS AI Agent.

Figure 4: Build-time configuration for our DOCSIS Agent

Figure 4 shows the building blocks of the AI Agent design based on Amazon Bedrock Agents. An Agent is powered by one or more LLMs and is composed of Action Groups, Knowledge Bases, and Instructions (Prompts). The Agent determines the actions it needs to take based on user inputs and responds with an answer pertinent to the question asked.

Build-time configuration

1. Foundation model: The first step is to pick an FM that the agent invokes to interpret user input and subsequent prompts in its orchestration process. The agent also invokes the FM to generate responses and follow-up steps in its process. We picked the Amazon Nova Pro 1.0 model from the broad choice of state-of-the-art FMs available in Amazon Bedrock.

Sample instructions for the Agent (Prompt)

The next step is to write clear instructions that describe what the agent is designed to do. Advanced prompts allow you to customize instructions for the agent at every step of the orchestration and include AWS Lambda functions to parse each step’s output.

The following is a sample instruction prompt for our LLM Agent that specializes in DOCSIS 4.0, with a specific tool that it can use at run time. Instructions guide the agent regarding what it should do and how it should interact with users.

Figure 5.1: A sample AI Agent instruction (Prompt)

A prompt snippet is included for your reference (the actual prompt is slightly more complex and may vary per the chosen FM):

Figure 5.2: AI Agent prompt snippet
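The exact prompt is shown in Figures 5.1 and 5.2. As an illustration only, an instruction along the following lines captures the intent; the wording is hypothetical, and the function name calculate_docsis_capacity is a placeholder that should match your action group definition:

You are a DOCSIS 4.0 network engineering assistant for an MSO.
- Answer questions about DOCSIS 4.0 planning and operations, consulting the knowledge base when needed.
- When the user asks for a DOCSIS 4.0 capacity calculation, call the calculate_docsis_capacity action with the frequency_plan parameter, and the downstream or upstream parameters if the user provides them.
- If the frequency plan is missing, ask the user for it rather than guessing.
- Keep answers concise and cite the relevant specification section when available.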

Action groups

Action groups are composed of Actions. Actions are tools that implement a given piece of business logic. In our example, we write a deterministic Lambda function that takes a set of input parameters from the end user and performs a calculation based on a sample formula for DOCSIS 4.0 capacity.

Figure 6.1: Amazon Bedrock Agent action group

We recommend using the Quick create a new Lambda function option, which creates boilerplate request/response objects within the Lambda code along with the AWS Identity and Access Management (IAM) permissions needed for the Amazon Bedrock service to invoke the Lambda function.

Function details or API schema

Then we define the Function Details (or use an OpenAPI 3.0-compatible API schema). For our example, we have marked frequency_plan as a required parameter, while downstream and upstream are optional parameters. If the optional downstream and upstream parameters are not specified, then the Lambda code calculates capacity for both scenarios.

Figure 6.2: API schema
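As a sketch of what those Function Details could look like when defined programmatically with Boto3's create_agent_action_group (the agent ID, Lambda ARN, function name, and descriptions are illustrative placeholders):

import boto3

bedrock_agent = boto3.client("bedrock-agent")

bedrock_agent.create_agent_action_group(
    agentId="AGENT_ID_PLACEHOLDER",
    agentVersion="DRAFT",
    actionGroupName="docsis-capacity-calculator",
    actionGroupExecutor={
        # ARN of the Lambda function that implements the action
        "lambda": "arn:aws:lambda:us-east-1:111122223333:function:calc-docsis-capacity-nova"
    },
    functionSchema={
        "functions": [
            {
                "name": "calculate_docsis_capacity",
                "description": "Calculates DOCSIS 4.0 capacity for a given frequency plan.",
                "parameters": {
                    "frequency_plan": {
                        "type": "string",
                        "description": "The DOCSIS 4.0 frequency plan to use.",
                        "required": True,
                    },
                    "downstream": {
                        "type": "boolean",
                        "description": "Calculate downstream capacity.",
                        "required": False,
                    },
                    "upstream": {
                        "type": "boolean",
                        "description": "Calculate upstream capacity.",
                        "required": False,
                    },
                },
            }
        ]
    },
)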

Lambda function

The following is a sample Lambda snippet that implements the logic to calculate DOCSIS 4.0 capacity based on the input parameters.

Figure 6.3: Lambda snippet – calc-docsis-capacity-nova
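The actual function is shown in Figure 6.3. The following is a minimal, simplified sketch of the handler pattern that Amazon Bedrock Agents expects for a function-details action group; the capacity figures are placeholders for illustration, not the formula used in our implementation.

def lambda_handler(event, context):
    # Bedrock Agents passes parameters as a list of {"name": ..., "value": ...} dicts
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    frequency_plan = params.get("frequency_plan", "unknown")

    # Placeholder numbers for illustration only; replace with your own
    # spectrum-allocation and modulation-efficiency calculation.
    capacity = {
        "frequency_plan": frequency_plan,
        "downstream_gbps": 10.0,
        "upstream_gbps": 6.0,
    }
    body = f"Estimated DOCSIS 4.0 capacity for plan {frequency_plan}: {capacity}"

    # Response shape expected by Bedrock Agents for function-details action groups
    return {
        "messageVersion": event.get("messageVersion", "1.0"),
        "response": {
            "actionGroup": event.get("actionGroup"),
            "function": event.get("function"),
            "functionResponse": {
                "responseBody": {"TEXT": {"body": body}}
            },
        },
    }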

Runtime process

Figure 7: DOCSIS AI Agentic reference architecture

The runtime of the AI Agent is managed by the InvokeAgent API operation. This operation starts the agent sequence, which consists of three main steps: 1/ Pre-processing, 2/ Orchestration, and 3/ Post-processing. For brevity, we only cover the Orchestration steps in the following section (annotated with numbers in Figure 7).

The orchestration step captures the user input (step 1), interprets and reasons (step 2), invokes action groups (step 3), processes intermediate results (steps 4, 5, 6), and returns output to the user (steps 7 and 8). Each step is outlined in detail:

1. An authorized user initiates the AI Assistant through a chat or other interface.
2. The AI Agent interprets the input with an FM (Amazon Nova Pro 1.0) and generates a rationale that lays out the logic for the next step it should take.
3. The agent determines the applicable Action Group to invoke or the knowledge base to query, based on the rationale.
4. If the agent predicts that it needs to invoke an action, then the agent sends the parameters, which are determined from the user prompt, to the Lambda function configured for the action group or returns control by sending the parameters in the InvokeAgent response. Figure 7.1 shows the rationale available as trace in the Amazon Bedrock console.

Figure 7.1: Agent rationale from trace

The Agent may also query a knowledge base. However, in our case, the calc-docsis-capacity-nova tool (Lambda function) is what is needed to answer the user’s question.

5. The Lambda function returns the response to the calling Agent API.

6. The agent generates an output, known as an observation, from invoking an action and/or summarizing results from a knowledge base.

Figure 7.2: Agent Action output observation from trace

7. The agent uses the observation to augment the base prompt, which is then (once again) interpreted with an FM (Amazon Nova Pro 1.0). Then, the agent determines if it needs to reiterate the orchestration process. This loop continues until the agent returns a response to the user or until it needs to prompt the user for extra information.

8. During orchestration, the base prompt template is augmented with the agent instructions, action groups, and knowledge bases that you added to the agent. Then, the augmented base prompt is used to invoke the FM. The FM predicts the best possible steps and trajectory to fulfill the user input. At each iteration of orchestration, the FM predicts the API operation to invoke or the knowledge base to query.

Figure 7.3: Agent rationale and final response from trace
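For completeness, a minimal Boto3 sketch of starting this runtime sequence through the InvokeAgent API follows. The agent ID and alias ID are placeholders; the completion is returned as an event stream, with trace events included when tracing is enabled.

import uuid
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

response = agent_runtime.invoke_agent(
    agentId="AGENT_ID_PLACEHOLDER",
    agentAliasId="ALIAS_ID_PLACEHOLDER",
    sessionId=str(uuid.uuid4()),
    inputText="Calculate the DOCSIS 4.0 downstream capacity for the 1.8 GHz frequency plan.",
    enableTrace=True,  # surfaces the rationale and observations shown in Figures 7.1-7.3
)

# Assemble the agent's final answer from the streamed chunks
answer = ""
for event in response["completion"]:
    if "chunk" in event:
        answer += event["chunk"]["bytes"].decode("utf-8")
print(answer)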

Results

Implementing the preceding steps allowed us to have our first DOCSIS AI Agent, powered by Amazon Nova Pro 1.0, that is capable of invoking a tool for calculating DOCSIS capacity using a defined formula. In practice, multiple agents work in harmony on complex multi-step tasks and Knowledge Bases. We will visit Amazon Bedrock Multi-agent collaboration (in preview as of December 2024) in a future post.

Figure 7.4: Agent’s final response back to the user

Guardrails for governance and Responsible AI

As part of your Responsible AI strategy, we strongly encourage implementing safeguards from the ground up. To deliver relevant and safe user experiences aligned with an MSO’s organizational policies and principles, you can use Amazon Bedrock Guardrails. Bedrock Guardrails allow you to define policies to evaluate user inputs, conduct model-independent evaluations using contextual grounding checks, block denied topics with relevant content filters, block or redact sensitive information (PII), and make sure that responses adhere to configured policies. For example, you may want to block a front-line call center agent from using your RAG application to look up procedures that could manipulate a sensitive network configuration. Consider a hypothetical scenario in which a newly joined support engineer wants to disable MAC filtering on a subscriber’s modem to troubleshoot their internet service. A zero-shot prompt to an LLM would look something like the following:

Figure 8.1: A zero-shot prompt to an LLM generating a response that could compromise broadband subscriber’s security

Disabling MAC address filtering risks unauthorized network access that could compromise security. The following example shows a sample Bedrock Guardrail configured to deny sensitive changes, such as MAC address manipulation, and return a configured message to the end user.

Figure 8.2: Amazon Bedrock Guardrails Denied topic filter

Another example shows the sensitive information filter within Bedrock Guardrails. Consider that the user accidentally enters a MAC address (PII) into the chat prompt and no Bedrock Guardrails are configured. The following is the LLM response:

Figure 8.3: A zero-shot prompt with PII (MAC Address)

Although the LLM recognizes that this is sensitive information, the prompt should be intercepted and rejected before it even reaches the LLM. Based on the use case, you can selectively reject inputs containing sensitive information or redact them in FM responses.

The following example shows that the Bedrock Guardrail identified that the pattern entered by the user is a MAC address and, per the configured behavior, blocked the prompt and returned a message that you configure. You can also use a regular expression to define patterns for a guardrail to recognize and act upon.

Figure 8.4: Amazon Bedrock Guardrails Sensitive information filter that detects PII (MAC Address)
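As a sketch of how the two guardrails discussed above might be defined with Boto3 (the topic definition, regular expression, and blocked messages are illustrative, and the parameters should be validated against the current Amazon Bedrock Guardrails API):

import boto3

bedrock = boto3.client("bedrock")

bedrock.create_guardrail(
    name="docsis-operations-guardrail",
    description="Blocks sensitive network configuration changes and MAC address PII.",
    topicPolicyConfig={
        "topicsConfig": [
            {
                "name": "Sensitive network configuration changes",
                "definition": "Requests to disable or alter security controls such as "
                              "MAC address filtering on subscriber equipment.",
                "type": "DENY",
            }
        ]
    },
    sensitiveInformationPolicyConfig={
        "regexesConfig": [
            {
                "name": "MAC address",
                "pattern": r"([0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}",
                "action": "BLOCK",
            }
        ]
    },
    blockedInputMessaging="This request involves a restricted topic or sensitive information and cannot be processed.",
    blockedOutputsMessaging="The response was blocked by the configured guardrail policy.",
)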

With a consistent and standard approach used across FMs, Bedrock Guardrails deliver industry-leading safety protections. Although the preceding are just two applications of Guardrails, there are deeper features, such as Contextual grounding checks and Automated Reasoning checks (Symbolic AI), which help make sure that your outputs align with known facts and aren’t based on fabricated or inconsistent data. We will visit more advanced governance capabilities for DOCSIS 4.0 generative AI applications in a future post.

Call to action: embracing AI in the DOCSIS 4.0 transition

The shift to DOCSIS 4.0 represents a pivotal juncture for cable operators, and AI can significantly accelerate this transition. Our experience with leading MSOs has shown that effective AI implementation doesn’t need complex frameworks or specialized libraries. This aligns with the Amazon Invent and Simplify Leadership Principle. Instead, success comes from a direct, progressive approach:

1. Start simple: Build on foundational RAG implementations to improve employee productivity for industry- and domain-specific use cases.
2. Advance gradually: Move toward Agentic patterns for automated decision-making and complex task handling.

Integrating knowledge bases, AI agents, and robust guardrails allows MSOs to build secure, efficient, and future-ready AI applications as DOCSIS 4.0 and cable technology advance. In a future post, we will also address more advanced topics such as fine-tuning, continued pretraining, and their applications in the cable industry.

Conclusion

The digital transformation of the cable industry isn’t just ongoing; it’s accelerating. In this landscape, AI integration has shifted from a luxury to a competitive imperative. The operators that embrace these technologies today are better positioned to deliver superior service quality, optimize network performance, and drive the operational efficiency needed to adopt future technologies. Join us in shaping the future of cable broadband, where AI and human expertise combine to create more resilient, efficient, and intelligent networks.

Nameet Dutia

Nameet Dutia is a Senior Solutions Architect within the Amazon Web Services (AWS) Telecom Industry Business Unit. Nameet specializes in Artificial Intelligence and Machine Learning and has been driving transformation with AI & generative AI solutions in the telecom space. Overall, he has over 16 years of experience across various roles, including leadership of Software Engineering teams, driving innovation across advanced cloud computing, AI, and large-scale distributed systems. Nameet holds a Master’s degree from Southern Methodist University's Cox School of Business and Bachelor’s degree in Computer Engineering from University of Mumbai.

Dr. Jennifer Andreoli-Fang

Dr. Jennifer Andreoli-Fang is an accomplished technology leader with over two decades of expertise in leading network, cloud, and AI/ML technology development. She joined Amazon Web Services (AWS) in 2021 and currently leads the fixed networks segment. Jennifer is widely recognized for her influential work in cable broadband and at the 3GPP. She is the top female inventor in the cable industry, with more than 110 patents. Jennifer holds a PhD specializing in machine learning and wireless networks from UCSD.