AWS Spatial Computing Blog
AI-Powered Construction Document Analysis by Leveraging Computer Vision and Large Language Models
The AEC Industry and the Age of AI
The architectural, engineering, and construction (AEC) industry is experiencing a digital transformation, driven by the increasing adoption of artificial intelligence and machine learning technologies. At the forefront of this evolution is TwinKnowledge, a pioneering AEC AI company that’s reimagining how the industry designs and reviews construction drawings at scale.
In this blog post, we’ll explore how TwinKnowledge collaborated with the Amazon Web Services (AWS) Prototyping and Cloud Engineering (PACE) team to overcome a critical challenge: scaling their computer vision (CV) platform across clients to efficiently process thousands of architectural drawings in a matter of days while maintaining high accuracy. The solution combines the power of Large Language Models (LLMs) and CV models to create AI-augmented workflows that significantly enhance the human review process.
TwinKnowledge’s mission is to decrease errors in construction documents and improve overall project efficiency by leveraging AI to augment the human review process, enabling AEC professionals to find and correct more errors. However, the company faced scalability constraints that limited their ability to implement their broader Generative AI strategy. By leveraging Amazon SageMaker’s MLOps capabilities and building a robust pipeline for data ingestion and validation, the AWS PACE team helped TwinKnowledge break through these limitations.
We’ll detail how this solution not only solved immediate scaling challenges but also laid the foundation for enhanced search capabilities and AI-driven insights. Through this technical deep dive, you’ll learn how AWS services can be orchestrated to create powerful, scalable AI solutions that augment human expertise rather than replace it.
The Industry’s Information Problem
Today, two-dimensional drawing sets are the source of truth in construction. They are the synthesis of a project’s design work, composed by the architects and handed over to the general contractor to form the basis of the project’s execution in the field. A little more than 70% of the industry still uses them in their original form, as paper blueprints (see “Building the Future: Bluebeam AEC Technology Outlook 2025”, Bluebeam). These design documents represent the direct source of risk on construction projects. To mitigate this risk, companies strive to regulate these designs through control measures, such as lessons learned from past projects, detailed scope documents, requirements and standards documents, and extensive coordination intended to get things right. Companies appoint Quality Assurance (QA)/Quality Control (QC) Directors and teams of reviewers to ensure document quality. Despite these reviewers, it is common for these documents, often thousands of pages long, to contain noncompliance and design mistakes, and to be missing key design decisions and details.
According to a joint PlanGrid-FMI industry report, “Construction Disconnected”, nearly a quarter of all construction rework is due to either incorrect or hidden project information (22%), costing the U.S. construction industry over $14.3 billion a year. The majority of that information lives within PDF files – written contract documents, requests for information, and, most importantly, drawing sets. The report further indicates that AEC professionals spend:
- 5.5 hours weekly searching for project data
- 5 hours weekly resolving information-related conflicts
- 4 hours weekly addressing mistakes and rework
That’s a total of 14.5 hours a week of avoidable work related to information issues.
These problems are not new to the industry. Two-dimensional drawing sets have always been the standard for construction, but the AEC industry’s current workforce and project intelligence systems are hitting a plateau in their ability to perform QA/QC on drawing sets. This challenge is compounded by a critical workforce transition: 41% of construction professionals are expected to retire by 2031, according to the National Center for Construction Education and Research article “Skilled Labor: A Comeback Story”, taking with them decades of tacit design knowledge.
TwinKnowledge, a company applying AI to unlock valuable insights from construction documentation and detect errors in drawing sets, recognized an opportunity to address this challenge by combining AI capabilities with AEC professionals’ expertise to extend capabilities and enhance drawing QA/QC review.
Their approach integrates LLMs with CV to create a comprehensive system that maps design information within drawing sets to their intended design drivers: lessons learned, requirements, and standards. This integration required processing and combining both textual and graphical design information into a single knowledge base for analysis by LLMs.
However, off-the-shelf multi-modal AI models proved insufficient for graphical information processing. The graphical information needed to be organized, accessed, and presented to reviewers in a way that makes sense to them, e.g., in the structure of drawing sets. That meant parsing and processing graphical information according to how information in drawing sets is structured: as plan views with references to details, drawing details with references to schedules, and schedules with their own references.
To do so, TwinKnowledge built a proprietary data pipeline that first separates drawing set information from textual information and then employs an in-house CV model to parse drawing sets in their native format. However, the drawing set structure varies in its visual orientation and representation from project to project. To address this, TwinKnowledge fine-tunes its base CV model for each company and each project’s set of drawings, delivering information comprehension at near-human levels on a project-by-project basis.
This solution, while effective, revealed a new challenge: the need to scale and operationalize the fine-tuning process across TwinKnowledge’s rapidly expanding client base. With their client portfolio projected to double in the near term, they needed an efficient solution to scale their project artificial intelligence capabilities for the AEC industry. For this, they turned to the AWS PACE team.
Expanded TwinKnowledge Solution Statement
Behind all construction drawings lies an invisible web of knowledge – lessons learned, standards, coordination, and dependencies – a system of design drivers and the actual design that forms the complete source of truth for a project. Current industry practice involves spot-checking compliance within this system, an approach that contributes to the inefficiencies costing the industry billions of dollars annually. While human reviewers excel at the critical thinking required for compliance assessment, they face two key limitations:
- Vast amounts of project information to process
- Inadequate project information systems for facilitating design compliance mapping
LLMs, in contrast, excel at processing and storing vast amounts of textual data for near-immediate access. TwinKnowledge’s solution combines specially trained CV models with LLMs to chunk and transform graphical information from drawing sets into textual data while preserving its original structure and associations. In this way, TwinKnowledge leverages LLMs to embed, and thus semantically connect, all textual and graphical design information on a project into the same high-dimensional vector space, giving LLMs ready access to the connected information. TwinKnowledge’s LLM, fine-tuned to the language of construction, can then find and connect design-driving information with design graphics (i.e., details) in drawing sets. With TwinKnowledge, this can be done at 100% coverage, a significant increase over the industry’s current spot-check approach.
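To make this concrete, the sketch below shows how a single textual chunk, whether it came from a specification document or from a parsed drawing detail, might be embedded and stored alongside its drawing-set metadata. It is an illustration only: the embedding model (Amazon Titan Text Embeddings on Amazon Bedrock), the field names, and the sample chunk are assumptions for the example, not TwinKnowledge’s actual implementation.

```python
import json
import boto3

# Assumption: Amazon Titan Text Embeddings on Amazon Bedrock stands in for
# whatever embedding model TwinKnowledge actually uses.
bedrock = boto3.client("bedrock-runtime")

def embed_chunk(text: str) -> list[float]:
    """Embed one textual chunk (from a document or a parsed drawing detail)."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

# A chunk derived from a drawing detail keeps its drawing-set structure as
# metadata, so graphical and textual chunks land in the same vector space but
# remain traceable back to their sheet and detail. Field names are illustrative.
chunk = {
    "text": "Detail 5/A-501: typical parapet flashing at roof edge ...",
    "metadata": {
        "source": "drawing_set",          # vs. "specification", "rfi", ...
        "sheet_number": "A-501",
        "detail_title": "Typical Parapet Flashing",
    },
}
chunk["embedding"] = embed_chunk(chunk["text"])
# The chunk can now be written to a vector store and queried together with
# embeddings of requirements, standards, and lessons-learned text.
```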
TwinKnowledge’s pipelines for chunking and embedding project information
The diagram above shows TwinKnowledge’s two distinct pipelines for chunking and embedding project information: a text processing pipeline for document analysis and a CV pipeline for drawing set processing. The CV pipeline extracts drawing set structure information, such as the sheet name and number, the latest date in the title block, the view/detail/schedule title, and even the information’s physical boundaries on the page, and stores it as metadata.
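As an illustration of the kind of structure metadata the CV pipeline can emit, the sketch below defines one record per extracted view, detail, or schedule. The field names and types are assumptions for the example; TwinKnowledge’s actual schema is proprietary.

```python
from dataclasses import dataclass, asdict

@dataclass
class DrawingRegion:
    """Metadata for one view, detail, or schedule extracted from a sheet.

    Illustrative schema only; field names are assumptions, not TwinKnowledge's
    actual (proprietary) format.
    """
    sheet_number: str          # e.g. "A-501"
    sheet_name: str            # e.g. "ROOF DETAILS"
    title_block_date: str      # latest date found in the title block
    region_type: str           # "view" | "detail" | "schedule" | "titleblock"
    region_title: str          # e.g. "Typical Parapet Flashing"
    bbox: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) on the page

region = DrawingRegion(
    sheet_number="A-501",
    sheet_name="ROOF DETAILS",
    title_block_date="2024-11-15",
    region_type="detail",
    region_title="Typical Parapet Flashing",
    bbox=(120.0, 340.0, 610.0, 780.0),
)
item = asdict(region)  # ready to store as metadata next to the chunk's embedding
```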
To scale this solution for each client and each project effectively, TwinKnowledge engaged the AWS PACE team to design an extensible software architecture. The focus centered on scaling three critical aspects:
- Processing for labeling drawing set information, e.g., “detail”, “detail title”, and “titleblock”
- Training workflows over the general CV model
- Inference execution for client-specific CV models
The team’s long-term goal is to implement a comprehensive end-to-end MLOps pipeline that streamlines the entire machine learning lifecycle for AEC document processing (See “AWS Summit ANZ 2022 – End-to-end MLOps for architects” for in-depth pipeline architectures). This pipeline will encompass three key components: data ingestion and labeling, model training, and inference workflows. By integrating these components, the TwinKnowledge-AWS team aims to create a seamless process that enhances efficiency, ensures data quality, and accelerates model development and deployment.
Architecture Diagram for end-to-end MLOps pipeline
The diagram above illustrates the high-level architecture of the envisioned MLOps pipeline:
- Construction Document Ingestion and Labeling: This initial stage involves collecting, processing, and labeling AEC documents and drawings.
- Training Pipeline: Develop, train, and validate machine learning models using the prepared data.
- Inference Workflow: The final stage where trained models are deployed to analyze new AEC documents in real-time.
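To make the orchestration concrete, here is a minimal sketch of how the first two stages could be wired together with SageMaker Pipelines using the SageMaker Python SDK. The container image URIs, script name, S3 paths, IAM role, and pipeline name are placeholders for illustration; the pipeline TwinKnowledge ultimately builds may be structured differently.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.processing import ProcessingInput, ProcessingOutput, ScriptProcessor
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

raw_data = ParameterString(name="RawDrawingsS3Uri",
                           default_value="s3://example-bucket/raw-drawings/")

# 1. Ingestion and labeling output -> training-ready dataset
prep = ScriptProcessor(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/prep:latest",  # placeholder
    command=["python3"], role=role, instance_count=1, instance_type="ml.m5.xlarge",
)
prep_step = ProcessingStep(
    name="PrepareLabeledDrawings",
    processor=prep,
    inputs=[ProcessingInput(source=raw_data, destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(output_name="train", source="/opt/ml/processing/train")],
    code="prepare_dataset.py",  # placeholder script
)

# 2. Fine-tune the base CV model on the client/project-specific dataset
estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/cv-train:latest",  # placeholder
    role=role, instance_count=1, instance_type="ml.g5.2xlarge",
    output_path="s3://example-bucket/models/",
)
train_step = TrainingStep(
    name="FineTuneCVModel",
    estimator=estimator,
    inputs={"train": TrainingInput(
        s3_data=prep_step.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri)},
)

# 3. The resulting model artifact can then be registered and deployed for inference.
pipeline = Pipeline(name="twinknowledge-cv-finetune",  # placeholder name
                    parameters=[raw_data], steps=[prep_step, train_step],
                    sagemaker_session=session)
# pipeline.upsert(role_arn=role); pipeline.start()
```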
Expanded Overview of AWS PACE Prototype
The foundation of any successful machine learning project lies in well-organized, high-quality data. In the context of AEC document processing, this is particularly crucial due to the diverse nature of construction drawings and documentation. The data ingestion and labeling workflow is designed to address these challenges through intelligent automation and robust organization.
AWS Architecture Diagram for data ingestion and labeling
The architecture for our data ingestion and labeling process leverages several AWS services to ensure efficiency and scalability:
- From the Web Frontend, users can upload construction documents and trigger the processing pipeline (see the upload-handler sketch after this list).
- Amazon Cognito is used to authenticate users in the web application.
- AWS WAF (Web Application Firewall), using managed rule groups, protects the web application from common exploits.
- Amazon API Gateway is used to create and manage the REST API resources and methods and to handle communication with AWS Lambda.
- AWS Lambda provides serverless compute resources and facilitates communication to and from other services.
- Amazon Simple Storage Service (Amazon S3) stores the uploaded data and images extracted from the documents.
- When a new construction document set is uploaded to S3, an AWS Step Functions State Machine is started. An AWS Fargate cluster executes a document page-splitting and image generation job inside an Amazon Virtual Private Cloud private subnet (a sketch of this job follows this list).
- Amazon DynamoDB stores metadata about each construction document processing project.
- Amazon Simple Notification Service (Amazon SNS) sends notifications to users to let them know that the pipeline is complete.
- Amazon CloudFront is used to distribute the React frontend web application.
- Amazon Elastic Container Registry (Amazon ECR) stores the Docker container images used by the pipeline.
- AWS X-Ray and Amazon CloudWatch are used for tracing and monitoring.
- AWS CloudFormation is used for deploying resources as infrastructure as code (IaC).
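For the upload path at the top of this list, one common pattern is for the Lambda function behind API Gateway to hand the browser a presigned S3 URL rather than proxying the file bytes itself. The sketch below assumes that pattern; the bucket name, environment variable, and response shape are placeholders, and the prototype may handle uploads differently.

```python
import json
import os
import uuid
import boto3

s3 = boto3.client("s3")
BUCKET = os.environ["UPLOAD_BUCKET"]  # placeholder environment variable

def handler(event, context):
    """Return a presigned PUT URL so the React frontend can upload a PDF directly to S3.

    Sketch of one common pattern; the actual prototype may handle uploads differently.
    """
    body = json.loads(event.get("body") or "{}")
    filename = body.get("filename", "drawing_set.pdf")
    key = f"uploads/{uuid.uuid4()}/{filename}"

    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": key, "ContentType": "application/pdf"},
        ExpiresIn=900,  # 15 minutes
    )
    # The S3 event on this key later starts the Step Functions state machine.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"uploadUrl": url, "key": key}),
    }
```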
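The page-splitting and image-generation job run on Fargate could look roughly like the following, using PyMuPDF to render each sheet as an image before upload. This is a sketch only: the library choice, bucket layout, environment variables, and rendering settings are assumptions for illustration, not necessarily what the prototype container does.

```python
import os
import boto3
import fitz  # PyMuPDF

s3 = boto3.client("s3")

def split_document(bucket: str, key: str, output_prefix: str) -> list[str]:
    """Download a drawing-set PDF, render each sheet to a PNG, and upload the images.

    Illustrative only; bucket names, key layout, and rendering settings are
    assumptions for this sketch.
    """
    local_pdf = "/tmp/drawing_set.pdf"
    s3.download_file(bucket, key, local_pdf)

    image_keys = []
    with fitz.open(local_pdf) as doc:
        for page_number, page in enumerate(doc, start=1):
            # Render at roughly 2x scale so small detail titles stay legible
            # for the downstream CV model.
            pixmap = page.get_pixmap(matrix=fitz.Matrix(2, 2))
            local_png = f"/tmp/sheet_{page_number:04d}.png"
            pixmap.save(local_png)

            image_key = f"{output_prefix}/sheet_{page_number:04d}.png"
            s3.upload_file(local_png, bucket, image_key)
            image_keys.append(image_key)
            os.remove(local_png)
    return image_keys

if __name__ == "__main__":
    # The Step Functions state machine would pass these in as container overrides.
    split_document(
        bucket=os.environ["DOCS_BUCKET"],       # placeholder env vars
        key=os.environ["DOCUMENT_KEY"],
        output_prefix=os.environ["OUTPUT_PREFIX"],
    )
```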
By automating key aspects of data ingestion and labeling, the time and effort required to prepare AEC documents for machine learning is significantly reduced. This approach not only accelerates the development cycle but also ensures consistency in data preparation, leading to more reliable model training and improved inference results.
The emphasis on data organization within this workflow enables easier data versioning, experiment tracking, and model lineage – critical aspects for maintaining and improving machine learning models over time in the dynamic AEC industry.
Conclusion
Through this collaboration we’ve demonstrated how combining AWS services with innovative MLOps practices can transform document processing workflows in the AEC industry and help scale industry-transforming AI. The solution not only addressed TwinKnowledge’s immediate challenges but also established a robust foundation for future AI capabilities.
Key takeaways from this implementation include:
- The power of integrating MLOps best practices with Amazon SageMaker to create sustainable, scalable AI solutions
- How AI-augmented workflows can enhance, rather than replace, human expertise in document review processes
- The importance of building flexible architectures that can evolve with advancing AI technologies
As the AEC sector continues to digitize and embrace AI technologies, solutions like TwinKnowledge’s platform will play an increasingly crucial role in driving efficiency and innovation. The successful deployment of this solution marks just the beginning of what’s possible when combining domain expertise with AWS’s comprehensive machine learning capabilities.
To learn more about how AWS can support your organization’s AI initiatives or to explore similar solutions, contact your AWS account team or visit the AWS Solutions Library.