Build well-architected IDP solutions with a custom lens – Part 5: Cost optimization
Building a production-ready solution in the cloud involves a series of trade-offs among resources, time, customer expectations, and business outcomes. The AWS Well-Architected Framework helps you understand the benefits and risks of decisions you make while building workloads on AWS.
An intelligent document processing (IDP) project usually combines optical character recognition (OCR) and natural language processing (NLP) to read and understand a document and extract specific terms or words. The IDP Well-Architected Custom Lens outlines the steps for performing an AWS Well-Architected review, and helps you assess and identify the risks in your IDP workloads. It also provides guidance to tackle common challenges, enabling you to architect your IDP workloads according to best practices.
This post focuses on the Cost Optimization pillar of the IDP solution. A cost-optimized workload fully utilizes all resources, achieves an outcome at the lowest possible price point, and meets your functional requirements. We start with an introduction to the Cost Optimization pillar and its design principles, and then dive deep into the four focus areas: financial management, resource provisioning, data management, and cost monitoring. By reading this post, you will learn about the Cost Optimization pillar in the Well-Architected Framework through the IDP case study.
Design principles
Cost optimization is a continual process of refinement and improvement over the span of a workload’s lifecycle. The practices in this post can help you build and operate cost-aware IDP workloads that achieve business outcomes while minimizing costs and allowing your organization to maximize its return on investment.
Several principles can help you improve cost optimization. Consider the different project phases: during project planning, invest in cloud financial management skills and tools, and align finance and tech teams to incorporate both business and technology perspectives. During development, we recommend adopting a consumption model and adjusting usage dynamically. When you're ready for production, continually monitor and analyze your spending.
Keep the following in mind as we discuss best practices:
- Implement cloud financial management – To achieve financial success and accelerate business value realization with your IDP solution, you must invest in cloud financial management. Your organization must dedicate the necessary time and resources for building capability in this new domain of technology and usage management.
- Cultivate a partnership between technology and finance – Involve finance and technology teams in cost and usage discussions while building your IDP solution and at all stages of your cloud journey. Teams should regularly meet and discuss topics such as organizational goals and targets with your IDP solution, current state of cost and usage, and financial and accounting practices.
- Adopt a consumption model and adjust dynamically – Provision resources and manage data with cost awareness, and manage your project stage and environment with cost optimization over time. Pay only for the resources you consume, and increase or decrease usage depending on business requirements. For example, development and test environments for your IDP solution are typically only used for 8 hours a day during the work week. By stopping development and test environment resources when not in use, such as outside of the 40 working hours per week, you can reduce costs by roughly 75% compared to running them continuously for 168 hours per week (see the quick check after this list).
- Monitor, attribute, and analyze expenditure – Measure the business output of the workload and the costs associated with delivery. Use this data to understand the gains you make from increasing output, increasing functionality, and reducing cost with your IDP workflow. AWS provides tools such as Amazon CloudWatch, cost allocation tags, and AWS CloudTrail that help you accurately identify the cost and usage of workloads, measure return on investment (ROI), and enable workload owners to optimize their resources and reduce costs.
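The savings figure in the consumption model example is straightforward to verify. The following minimal sketch checks the arithmetic, using the weekly hours from the example above:

```python
# Verify the consumption-model savings estimate from the example above.
hours_always_on = 24 * 7   # 168 hours per week, running continuously
hours_dev_test = 8 * 5     # 40 working hours per week

savings = 1 - hours_dev_test / hours_always_on
print(f"Cost reduction from stopping idle resources: {savings:.0%}")  # ~76%, roughly the 75% cited
```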
Focus areas
The design principles and best practices of the Cost Optimization pillar are based on insights gathered from our customers and our IDP technical specialist communities. Use them as guidance and support for your design decisions, and align these with the business requirements of your IDP solution. Applying the IDP Well-Architected Custom Lens helps you validate the resilience and efficiency of your IDP solution, and provides recommendations to address any gaps you might identify.
You might have encountered cases where the finance team performed financial planning for cloud usage independently, only to see that plan disrupted by technical complexity. It's also possible to overlook resource and data management while provisioning services, creating unexpected line items on your bill. In this post, we help you navigate these situations and provide guidelines for cost optimization with your IDP solution, so you don't have to learn these lessons the costly way. The following are four best practice areas for cost optimization of an IDP solution in the cloud: financial management, resource provisioning, data management, and cost monitoring.
Financial management
Establishing a team that can take responsibility for cost optimization is critical for successful adoption of cloud technology, and this is true for building an IDP solution as well. Relevant teams in both technology and finance within your organization must be involved in cost and usage discussions at all stages when building your IDP solution and along your cloud journey. The following are some key implementation steps to establish a dedicated cloud financial management team:
- Define key members – Make sure that all relevant parts of your organization contribute and have a stake in cost management. Most importantly, you need to establish collaboration between finance and technology. Consider the following general groups, and include members with domain expertise in financial and business areas, as well as in technology, to integrate the knowledge for better financial management:
- Financial leads – CFOs, financial controllers, financial planners, business analysts, procurement, sourcing, and accounts payable must understand the cloud model of consumption, purchasing options, and the monthly invoicing process. Finance needs to partner with technology teams to create and socialize an IT value story, helping business teams understand how technology spend is linked to business outcomes.
- Technology leads – Technology leads (including product and application owners) must be aware of financial requirements (for example, budget constraints) as well as business requirements (for example, service level agreements). This allows the workload to be implemented to achieve the desired goals of the organization.
- Define goals and metrics – The cloud financial management function needs to deliver value to the organization in different ways. Define these goals up front and let them continually evolve as the organization evolves. The function also needs to regularly report to the organization on its cost optimization capability.
- Establish regular cadence – The group should come together regularly to review their goals and metrics. A typical cadence involves reviewing the state of the organization, any programs or services currently running, and overall financial and optimization metrics.
Resource provisioning
Given the various configurations and pricing models of AWS services as part of the IDP solution, you should only provision resources based on what you need and adjust your provisioning over time to align with your business requirement or development stage. Additionally, make sure you take advantage of free services offered by AWS to lower your overall cost. When provisioning resources for your IDP solution, consider the following best practices:
- Decide between asynchronous and synchronous inference – Adopt synchronous inference for real-time processing of a single document. Choose asynchronous jobs to analyze large documents or multiple documents in one batch, because asynchronous jobs handle large batches more cost-effectively (see the sketch after this list).
- Manage Amazon Comprehend endpoint inference units – Depending on your needs, you can adjust the throughput of your Amazon Comprehend endpoint after creating it. This can be achieved by updating the endpoint’s inference units (IUs). If you’re not actively using the endpoint for an extended period, you should set up an auto scaling policy to reduce your costs. If you’re no longer using an endpoint, you can delete the endpoint to avoid incurring additional cost.
- Manage Amazon SageMaker endpoints – Similarly, for organizations that aim for inference type selection and endpoints running time management, you can deploy open source models on Amazon SageMaker. SageMaker provides different options for model inferences, and you can delete endpoints that aren’t being used or set up an auto scaling policy to reduce your costs on model endpoints.
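To make the synchronous-versus-asynchronous decision from the first item above concrete, the following is a minimal sketch using the Amazon Textract document analysis APIs with boto3. The bucket, key, and feature types are illustrative assumptions, not values from this post:

```python
import boto3

textract = boto3.client("textract")

def analyze_sync(image_bytes: bytes) -> dict:
    """Synchronous call: real-time analysis of a single, small document."""
    return textract.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["FORMS"],  # request only the features you need
    )

def analyze_async(bucket: str, key: str) -> str:
    """Asynchronous job: cost-effective for large or batched documents in Amazon S3."""
    job = textract.start_document_analysis(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["FORMS"],
    )
    # Poll textract.get_document_analysis(JobId=...) or subscribe to an
    # Amazon SNS notification to retrieve results when the job completes.
    return job["JobId"]
```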
Data management
Data plays a key role throughout your IDP solution, from building to delivery. Starting with the initial ingestion, data is pushed across the different stages of processing and is eventually returned as output to end-users. It's important to understand how your data management choices impact the overall cost of your IDP solution. Consider the following best practices:
- Adopt Amazon S3 Intelligent-Tiering – The Amazon S3 Intelligent-Tiering storage class is designed to optimize storage costs in Amazon Simple Storage Service (Amazon S3) by automatically moving data to the most cost-effective access tier when access patterns change, without operational overhead or impact on performance. There are two ways to move data into S3 Intelligent-Tiering:
- Directly PUT data into S3 Intelligent-Tiering by specifying INTELLIGENT_TIERING in the x-amz-storage-class header.
- Define S3 Lifecycle configurations to transition objects from S3 Standard or S3 Standard-Infrequent Access to S3 Intelligent-Tiering.
- Enforce data retention policies throughout the IDP workflow – Use S3 Lifecycle configurations on an S3 bucket to define actions for Amazon S3 to take during an object's lifecycle, as well as deletion at the end of the object's lifecycle, based on your business requirements (both tiering and retention are illustrated in the sketch after this list).
- Split documents into single pages for specific FeatureType processing – FeatureType is a parameter of the Document Analysis API calls (both synchronous and asynchronous) in Amazon Textract. As of this writing, it includes the following values: TABLES, FORMS, QUERIES, SIGNATURES, and LAYOUT. Amazon Textract charges based on the number of pages and images processed, and not all pages might include the information you need to extract. Splitting documents into single pages and only processing the pages with the FeatureType you need can help avoid unnecessary processing, thereby reducing your overall cost.
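As a sketch of the two Intelligent-Tiering paths and a lifecycle-based retention policy discussed above, the following uses boto3. The bucket name, prefixes, and day counts are placeholder assumptions:

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "idp-documents-bucket"  # placeholder bucket name

# Option 1: PUT an object directly into S3 Intelligent-Tiering.
with open("invoice-001.pdf", "rb") as f:
    s3.put_object(
        Bucket=BUCKET,
        Key="incoming/invoice-001.pdf",
        Body=f,
        StorageClass="INTELLIGENT_TIERING",  # sets the x-amz-storage-class header
    )

# Option 2: transition existing objects with an S3 Lifecycle rule, and
# enforce a retention policy by expiring processed output.
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-and-expire-idp-output",
                "Filter": {"Prefix": "processed/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "INTELLIGENT_TIERING"}
                ],
                "Expiration": {"Days": 365},  # retention period per business requirements
            }
        ]
    },
)
```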
So far, we've discussed best practices for implementing and deploying your IDP solution. When your IDP solution is deployed and ready for production, cost monitoring becomes an important way to observe and control costs directly. In the following section, we discuss how to best perform cost monitoring with your IDP solution.
Cost monitoring
Cost optimization begins with a granular understanding of the breakdown in cost and usage; the ability to model and forecast future spend, usage, and features; and the implementation of sufficient mechanisms to align cost and usage to your organization’s objectives. To improve the cost optimization of your IDP solution, follow these best practices.
Design cost monitoring for the lifetime of the IDP workflow
Define and implement a method to track resources and their associations with the IDP system over their lifetime. You can use tagging to identify the workload or function of the resource:
- Implement a tagging scheme – Implement a tagging scheme that identifies the workload the resource belongs to, verifying that all resources within the workload are tagged accordingly. Tagging helps you categorize resources by purpose, team, environment, or other criteria relevant to your business. For more detail on tagging use cases, strategies, and techniques, see Best Practices for Tagging AWS Resources.
- Tagging at the service level allows for more granular monitoring and control of your cost. For example, with Amazon Comprehend in an IDP workflow, you can use tags on Amazon Comprehend analysis jobs, custom classification models, custom entity recognition models, and endpoints to organize your Amazon Comprehend resources and provide tag-based cost monitoring and control (see the sketch after this list).
- When tagging at the service level isn't applicable, you can use other mechanisms for cost allocation reporting. For example, because Amazon Textract charges on a per-page basis, you can track the number of synchronous API calls to Amazon Textract for cost calculations (each synchronous API call maps to one page of the document). If you have large documents and want to utilize asynchronous APIs, you can use open source libraries to count the number of pages, or use Amazon Athena to query your CloudTrail logs and extract the page information for cost tracking.
- Implement workload throughput or output monitoring – Implement workload throughput monitoring or alarming, initiating on either input requests or output completions. Configure it to provide notifications when workload requests or outputs drop to zero, indicating the workload resources are no longer used. Incorporate a time factor if the workload periodically drops to zero under normal conditions.
- Group AWS resources – Create groups for AWS resources. You can use AWS resource groups to organize and manage your AWS resources that are in the same Region. You can add tags to most of your resources to help identify and sort your resources within your organization. Use Tag Editor to add tags to supported resources in bulk. Consider using AWS Service Catalog to create, manage, and distribute portfolios of approved products to end-users and manage the product lifecycle.
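As an example of service-level tagging, the following sketch tags an Amazon Comprehend custom classification endpoint at creation time so its cost appears under the workload's tags in cost allocation reports. The endpoint name, model ARN, and tag values are placeholders:

```python
import boto3

comprehend = boto3.client("comprehend")

# Tag the endpoint at creation so its cost can be attributed to the
# IDP workload in tag-based cost allocation reports.
response = comprehend.create_endpoint(
    EndpointName="idp-doc-classifier",  # placeholder name
    ModelArn="arn:aws:comprehend:us-east-1:111122223333:document-classifier/example",
    DesiredInferenceUnits=1,
    Tags=[
        {"Key": "workload", "Value": "idp"},
        {"Key": "environment", "Value": "production"},
    ],
)
print(response["EndpointArn"])
```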
Use monitoring tools
AWS offers a variety of tools and resources to monitor the cost and usage of your IDP solution. The following is a list of AWS tools that help with cost monitoring and control:
- AWS Budgets – Configure AWS Budgets on all accounts for your workload. Set budgets for the overall account spend and budgets for the workloads by using tags. Configure notifications in AWS Budgets to receive alerts when you exceed your budgeted amounts or when your estimated costs exceed your budgets (see the sketch after this list).
- AWS Cost Explorer – Configure AWS Cost Explorer for your workload and accounts to visualize your cost data for further analysis. Create a dashboard for the workload that tracks overall spend, key usage metrics for the workload, and forecasts of future costs based on your historical cost data.
- AWS Cost Anomaly Detection – Use AWS Cost Anomaly Detection on your accounts, core services, or the cost categories you created to monitor your cost and usage and detect unusual spend. You can receive alerts individually or in aggregated reports, delivered by email or through an Amazon Simple Notification Service (Amazon SNS) topic, so you can analyze the anomaly, determine its root cause, and identify the factor driving the cost increase.
- Advanced tools – Optionally, you can create custom tools for your organization that provide additional detail and granularity. You can implement advanced analysis capabilities using Athena and dashboards using Amazon QuickSight. Consider using Cloud Intelligence Dashboards for preconfigured, advanced dashboards. You can also work with AWS Partners and adopt their cloud management solutions to activate cloud bill monitoring and optimization in one convenient location.
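The following sketch configures an AWS Budgets cost budget scoped to a workload tag, with an email alert at 80% of actual spend. The account ID, budget amount, tag, and email address are placeholder assumptions:

```python
import boto3

budgets = boto3.client("budgets")

# Monthly cost budget filtered to resources tagged workload=idp,
# alerting by email once actual spend crosses 80% of the limit.
budgets.create_budget(
    AccountId="111122223333",  # placeholder account ID
    Budget={
        "BudgetName": "idp-workload-monthly",
        "BudgetType": "COST",
        "TimeUnit": "MONTHLY",
        "BudgetLimit": {"Amount": "500", "Unit": "USD"},
        "CostFilters": {"TagKeyValue": ["user:workload$idp"]},
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "finops@example.com"}
            ],
        }
    ],
)
```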
Cost attribution and analysis
The process of categorizing costs is crucial in budgeting, accounting, financial reporting, decision-making, benchmarking, and project management. By classifying and categorizing expenses, teams can gain a better understanding of the types of costs they will incur throughout their cloud journey, helping them make informed decisions and manage budgets effectively. To improve the cost attribution and analysis of your IDP solution, follow these best practices:
- Define your organization’s categories – Meet with stakeholders to define categories that reflect your organization’s structure and requirements. These will directly map to the structure of existing financial categories, such as business unit, budget, cost center, or department.
- Define your functional categories – Meet with stakeholders to define categories that reflect the functions within your business. This may be your IDP workload or application names and the type of environment, such as production, testing, or development.
- Define AWS cost categories – Use AWS Cost Categories to map your AWS costs and usage into meaningful categories, organizing your costs with a rule-based engine, as shown in the sketch after this list.
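As a sketch of a rule-based cost category, the following groups spend by an environment tag using the Cost Explorer API. The category name, tag key, and values are placeholder assumptions:

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer, which hosts the Cost Categories API

# Rule-based cost category that groups spend by an environment tag.
ce.create_cost_category_definition(
    Name="IDP-Environment",
    RuleVersion="CostCategoryExpression.v1",
    Rules=[
        {
            "Value": "Production",
            "Rule": {
                "Tags": {
                    "Key": "environment",
                    "Values": ["production"],
                    "MatchOptions": ["EQUALS"],
                }
            },
        },
        {
            "Value": "Development",
            "Rule": {
                "Tags": {
                    "Key": "environment",
                    "Values": ["development", "testing"],
                    "MatchOptions": ["EQUALS"],
                }
            },
        },
    ],
)
```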
Conclusion
In this post, we shared design principles, focus areas, and best practices for cost optimization in your IDP workflow.
To learn more about the IDP Well-Architected Custom Lens, explore the other posts in this series.
AWS is committed to the IDP Well-Architected Lens as a living tool. As IDP solutions and related AWS AI services evolve, and as new AWS services become available, we will update the IDP Well-Architected Lens accordingly.
To get started with IDP on AWS, refer to Guidance for Intelligent Document Processing on AWS to design and build your IDP application. For a deeper dive into end-to-end solutions that cover data ingestion, classification, extraction, enrichment, verification and validation, and consumption, refer to Intelligent document processing with AWS AI services: Part 1 and Part 2. Additionally, Intelligent document processing with Amazon Textract, Amazon Bedrock, and LangChain covers how to extend a new or existing IDP architecture with large language models (LLMs). You'll learn how to integrate Amazon Textract with LangChain as a document loader, use Amazon Bedrock to extract data from documents, and use generative AI capabilities within the various IDP phases.
If you require additional expert guidance, contact your AWS account team to engage an IDP Specialist Solutions Architect.
About the Authors
Suyin Wang is an AI/ML Specialist Solutions Architect at AWS. She has an interdisciplinary education background in Machine Learning, Financial Information Service and Economics, along with years of experience in building Data Science and Machine Learning applications that solved real-world business problems. She enjoys helping customers identify the right business questions and building the right AI/ML solutions. In her spare time, she loves singing and cooking.
Brijesh Pati is an Enterprise Solutions Architect at AWS. His primary focus is helping enterprise customers adopt cloud technologies for their workloads. He has a background in application development and enterprise architecture and has worked with customers from various industries such as sports, finance, energy and professional services. His interests include serverless architectures and AI/ML.
Mia Chang is a ML Specialist Solutions Architect for Amazon Web Services. She works with customers in EMEA and shares best practices for running AI/ML workloads on the cloud with her background in applied mathematics, computer science, and AI/ML. She focuses on NLP-specific workloads, and shares her experience as a conference speaker and a book author. In her free time, she enjoys hiking, board games, and brewing coffee.
Rui Cardoso is a Partner Solutions Architect at Amazon Web Services (AWS). He focuses on AI/ML and IoT. He works with AWS Partners and supports them in developing solutions on AWS. When not working, he enjoys cycling, hiking, and learning new things.
Tim Condello is a senior artificial intelligence (AI) and machine learning (ML) specialist solutions architect at Amazon Web Services (AWS). His focus is natural language processing and computer vision. Tim enjoys taking customer ideas and turning them into scalable solutions.
Sherry Ding is a senior artificial intelligence (AI) and machine learning (ML) specialist solutions architect at Amazon Web Services (AWS). She has extensive experience in machine learning with a PhD degree in computer science. She mainly works with public sector customers on various AI/ML related business challenges, helping them accelerate their machine learning journey on the AWS Cloud. When not helping customers, she enjoys outdoor activities.