CoDetect: A Serverless AI-Powered Web App for Detecting Medical Conditions in CT Scans
By George Kanellopoulos, Solution Consultant at DXC Technology
In March 2020, the World Health Organization declared a pandemic as COVID-19 spread rapidly across the planet. As the seriousness of the situation became clear, health care systems around the world struggled to respond to an unprecedented crisis.
Within this tense climate, several organizations (including research facilities, universities, and commercial companies) started looking for ways to support doctors’ efforts to respond efficiently and quickly.
In that spirit, DXC Technology created a serverless artificial intelligence-powered solution to help detect manifestations of COVID-19 (and other medical conditions) in CT scans.
DXC Technology is an AWS Premier Consulting Partner and Managed Service Provider (MSP) that helps clients harness the power of innovation.
In this post, after examining the AWS services we chose for this solution, I will discuss two functional use cases and demonstrate the benefits of DXC’s CoDetect design and implementation approach.
CoDetect is a web-based app that allows users to submit CT scan studies for an artificial intelligence (AI) model analysis.
In any discussion that revolves around software solutions in a medical context, the first concern is always privacy.
Legal restrictions, and the sensitive nature of the data involved, create limitations for the software architect to consider and eventually solve.
Equally vital in an AI-powered solution is the model’s ability to function with a high confidence level. For that goal to materialize, relevant and correctly annotated training datasets must be available, ideally in abundance.
Finally, any software solution that’s to be used by the medical community in time-sensitive tasks or emergency-like environments should be easy to use, quick, and reliable.
By design, the CoDetect app is simple to use with only a handful of features. The analysis module allows users to search and organize their analyses, and statistical metrics and charts are available to all users.
Accessing the app requires a multi-step process involving a strict workflow and multiple actors. By default, two-factor authentication (2FA) is enabled for each profile and must be configured by the user during the initial login. Other security measures, such as a unique URL per customer (medical facility), are also present.
This array of features ensures CoDetect is both secure and privacy-oriented.
By following this paradigm, which we call the Serverless Native Mindset, we were able to focus on our business needs and go from design to proof of concept quickly. Furthermore, delegating as much as possible to managed services gave us breathing room in security and operational efficiency.
Additionally, we tried to expand that paradigm to other areas such as the frontend codebase. This endeavor meant we would have to adhere to three fundamental principles:
- Focus on meeting business goals and not on solving technology problems.
- Use managed services where possible. Change the architecture where it’s not.
- Abstract as much as possible.
The CoDetect solution comprises several contextual areas:
- AI interface
- Data/object stores
- Asynchronous operations
Utilizing a UI toolkit and abstracting state management to an external library (Redux) allowed us to focus almost entirely on building features and meeting business goals, thus fulfilling one of our architecture goals.
Using the AWS Amplify console as a hosting platform meant we could deploy the web interface seamlessly and continuously (CI/CD). Additionally, we could create and manage several environments quickly.
As a bonus, the AWS Amplify service offers domain management capabilities, thus freeing us from dedicating resources for this task alone.
The backend is a REST API, and this approach was chosen because we wanted both the frontend and potentially third-party software to use the available functionality.
The primary mechanism, as demonstrated in the diagram below, employs three AWS managed services. Amazon Cognito is used as a user repository and authorization mechanism; Amazon API Gateway is used for hosting the CoDetect API; and AWS Lambda functions have been developed for the core business logic functionality.
Figure 1 – Amazon Cognito as an authentication mechanism.
This architecture provides out-of-the-box security, scalability, and reliability without investing time in solving critical issues or creating and maintaining the relevant infrastructure.
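As a rough illustration of how the three services meet, the sketch below shows a Lambda proxy handler reading the caller's identity from the claims that API Gateway forwards after a Cognito user pool authorizer has validated the token. It is written in Python purely for illustration; the handler name and the response shape are assumptions, not CoDetect's actual code.

```python
import json


def handler(event, context):
    """Return the caller's identity as seen by a Lambda proxy integration."""
    # With a Cognito user pool authorizer on a REST API, API Gateway places the
    # validated token claims under requestContext.authorizer.claims.
    request_context = event.get("requestContext") or {}
    authorizer = request_context.get("authorizer") or {}
    claims = authorizer.get("claims")

    if not claims:
        # Defensive check only: API Gateway normally rejects unauthenticated
        # calls before the function is ever invoked.
        return {"statusCode": 401, "body": json.dumps({"message": "Unauthorized"})}

    # "sub" is the user's unique identifier in the user pool.
    body = {"user": claims.get("sub"), "email": claims.get("email")}
    return {"statusCode": 200, "body": json.dumps(body)}
```

Because the authorizer runs before the function, the business logic only has to read the claims; it never parses or verifies tokens itself.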
For hosting our AI engine, we selected Amazon SageMaker because it allows us to host our notebooks and train our models, and it integrates nicely with other services such as Lambda and Amazon Simple Storage Service (Amazon S3). Additionally, our AI service endpoint is hosted in Amazon SageMaker.
Lastly, for image classification purposes, we use a DenseNet-169 model, with a custom approach to how the training data are loaded. Hosting the AI engine this way provides predictable costs, scalability, and reliability.
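To make the Lambda-to-SageMaker integration concrete, here is a minimal sketch of calling a hosted endpoint through the SDK. It assumes a `sagemaker-runtime` client is passed in and that the model container accepts and returns JSON; the function name, the endpoint name, and the payload format are hypothetical, since the post does not describe the real contract.

```python
import json


def classify_scan(runtime_client, endpoint_name, payload):
    """Send one inference request and return the decoded JSON response.

    runtime_client is expected to behave like boto3.client("sagemaker-runtime").
    The JSON request/response format is an assumption for illustration.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    # The response body is a stream, so read it fully before decoding.
    return json.loads(response["Body"].read())
```

In a real function the client would typically be created once, outside the handler, for example with `boto3.client("sagemaker-runtime")`, and reused across invocations.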
Our primary datastore is Amazon DynamoDB, which we chose because it:
- Integrates with Lambda functions.
- Provides encryption at rest, an important feature when dealing with medical data.
- Offers low latency.
- Provides the Stream capability, which is useful for creating real-time analytics data.
We used a Relational Data Modeling approach, reducing cost and overall maintenance and implementation debt.
We used Amazon S3 for our main image store, and a dedicated S3 bucket is used to store the CT scan images submitted for analysis by the user. There are additional buckets that are used natively by Amazon SageMaker to store training and other AI-related data.
Amazon S3 integrates extensively with Lambda functions, and provides critical security and management features such as encryption at rest and object expiration.
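The two S3 features mentioned here can be sketched as follows, assuming an S3 client is passed in. The bucket name, the retention period, and the choice of SSE-S3 (`AES256`) are illustrative assumptions; the post does not say which encryption option or expiration window CoDetect uses.

```python
def expiration_rule(days, prefix=""):
    """Build a lifecycle rule that deletes objects a number of days after creation."""
    return {
        "ID": f"expire-after-{days}-days",
        "Filter": {"Prefix": prefix},  # an empty prefix matches every object
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }


def apply_scan_bucket_policies(s3_client, bucket, retention_days):
    """Enable default encryption at rest and object expiration on a bucket.

    s3_client is expected to behave like boto3.client("s3").
    """
    s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration={
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    )
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [expiration_rule(retention_days)]},
    )
```

Settings like these usually live in the infrastructure-as-code templates rather than in application code; the SDK form is shown only because it makes the two options explicit.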
As mentioned, one of our primary focus areas was the security of the solution related to privacy. With that goal in mind, we implemented strict user authorization workflows.
To enforce those mechanisms we used Amazon Cognito for authorization and authentication purposes, both in the UI and as an authorizer for the Amazon API Gateway endpoints. Email notifications are an integral part of those workflows, and for those purposes we used Amazon Simple Email Service (SES).
Apart from the fact that SES is easy to set up and use, it integrates with Amazon Route 53. As such, creating a domain-based email sending capability was an easy task. Furthermore, the service supports email authentication mechanisms such as DKIM and SPF, a crucial feature when dealing with medical data solutions.
The core functionality of CoDetect is to analyze CT scan images with the use of an AI model. This process can take from a few seconds to a few minutes, depending on the load and the capabilities (GPU, CPU, RAM) of the underlying infrastructure.
Naturally, the results publishing process should be asynchronous and non-blocking. For that purpose, we decided to use the WebSocket protocol and, more specifically, to create a WebSocket API in Amazon API Gateway to support those asynchronous operations.
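A minimal sketch of the server side of that WebSocket flow follows. It assumes the backend remembers the connection ID that API Gateway supplies when a client connects, and later pushes the result to that connection through the API Gateway Management API. The function names and the result payload are hypothetical.

```python
import json


def connection_id_from(event):
    """Extract the connection ID from a WebSocket route invocation event."""
    return event["requestContext"]["connectionId"]


def publish_result(management_client, connection_id, result):
    """Push an analysis result to a connected client without blocking the UI.

    management_client is expected to behave like
    boto3.client("apigatewaymanagementapi", endpoint_url=...), where the
    endpoint URL is the WebSocket API's stage URL.
    """
    management_client.post_to_connection(
        ConnectionId=connection_id,
        Data=json.dumps(result).encode("utf-8"),
    )
```

A production version would also handle the case where the client has already disconnected, which the service reports as an error on `post_to_connection`.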
Early on, we decided to utilize additional AWS services to enable our DevOps and intra-team communication needs. For our Git repositories, we used AWS CodeCommit. One obvious benefit is that our codebase and architecture repositories are hosted in the same AWS account/construct as the solution itself.
Additionally, by utilizing CodeCommit’s integration capabilities with other AWS services, we created intra-team automated processes. For example, by combining CodeCommit Triggers and Amazon Simple Notification Service (SNS), we can communicate changes in both design and code within the team in an automated fashion.
Since AWS Amplify (the frontend hosting platform) integrates with CodeCommit natively, our engineering team can deploy new features, bug fixes, and changes seamlessly. The same stands true for our backend team using CI/CD tooling in conjunction with CodeCommit.
Functional Use Cases
Use Case #1: Analyzing a CT Scan
Although requesting a new CT scan analysis from a user perspective is easy, the underlying process is quite complicated from a business logic perspective. However, by utilizing the principles of the Serverless Native Mindset, we were able to meet our business goals with minimum code footprint and practically zero maintenance debt.
Some examples of applying that mindset to this specific process are:
- A CT scan file follows the DICOM standard. A file of that format is usually several MBs in size for a single CT scan and, as a result, transporting files of that type through standard HTTP requests to the backend was prohibitive.
The Amazon S3 feature for uploading objects with pre-signed URLs allowed us to fulfill the business goal with a few lines of code without compromising the solution’s security principles. That’s because the developer can specify expiration times for the generated URLs, which limits the window in which a URL can be used.
- A CT scan analysis involves multiple steps once submitted to the backend. This means the end user (UI) cannot be in a “blocked” state while the analysis is taking place and the results are being prepared.
For that reason, we decided to use the WebSocket protocol so the client (UI) can receive the results asynchronously. The native support that Amazon API Gateway offers for creating WebSocket APIs helped us implement this feature with minimum effort. At the same time, no additional infrastructure had to be created or maintained.
- Security-wise, since we used Amazon Cognito as our primary user repository and authentication mechanism, we can impose the same security standards on our REST API by defining the Cognito user pool as an authorizer of our API endpoints.
This is a straightforward process that requires no coding. Thus, all the policy standards enforced while using the app (2FA, approval process) are, in essence, extended to the API’s security scheme.
- The use of AWS Software Development Kits (SDKs) complements the business logic within Lambda functions. This is another form of abstraction that allowed us to interact with AWS services, usually with a few lines of code, and fulfill our objectives.
By using managed services where possible and the SDK to interact with additional AWS services that are part of the solution, we could implement the main feature of CoDetect with fewer than 800 lines of code. At the same time, no infrastructure was required to be created or maintained by our teams.
The trust factor is higher, and the run cost is lower, since maintaining the underlying infrastructure for the services we use is the responsibility of the cloud vendor (AWS).
Figure 2 – CT scan analysis process.
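The pre-signed upload step in this use case can be sketched in a few lines. The example assumes an S3 client is passed in; the function name, bucket, object key, and the five-minute default expiry are hypothetical values chosen for illustration.

```python
def create_upload_url(s3_client, bucket, key, expires_in=300):
    """Return a time-limited URL the browser can use to PUT a scan file into S3.

    s3_client is expected to behave like boto3.client("s3"). The URL stops
    working once expires_in seconds have passed.
    """
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

The UI then uploads the DICOM file directly to S3 with an HTTP PUT against the returned URL, so the file itself never passes through the API or a Lambda function.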
Use Case #2: Analytics
One of the key features of CoDetect is the ability to provide data in real time.
Traditional approaches in this area suggest aggregations and metrics should be generated either on request or at pre-defined intervals by additional background workers. Those approaches, although functional, come with certain drawbacks.
In the case of CoDetect, we decided to utilize Amazon DynamoDB’s Stream feature to capture the user’s activity in real time (in essence, all new analysis DB records) and convert those events into insightful analytics data records.
This approach has significant benefits:
- The mechanism is reactive. We don’t have to create, maintain, and monitor background jobs to compile analytics data. The table activity we are interested in is passed as an event to our own Lambda function, and with a few lines of code, we can convert this data into analytics records.
- The approach does not cause resource issues in our primary database store. In traditional techniques, a job would have to query the primary database store to collect the necessary data and create the analytics data records.
With this methodology, table activity is captured while occurring and sent to a Lambda function, thus keeping the primary database uninvolved from the whole process with its resources remaining dedicated to its primary purposes.
- Since the database event is “streamed” into the designated Lambda function almost instantly, it’s possible to process that event immediately. Thus, in our case, the analytics data are updated continuously in real time.
- No interruption or unavailability of data occurs with this approach, in contrast to other methods, which might make the analytics data unavailable until new updates are incorporated.
Figure 3 – Real-time analytics data process.
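The stream-driven approach described in this use case might look like the sketch below: a function that receives a batch of DynamoDB stream records, keeps only newly inserted items, and tallies them by outcome. The `result` attribute and the function name are hypothetical, and the sketch assumes the stream is configured to include new item images.

```python
from collections import Counter


def summarize_new_analyses(event):
    """Count newly inserted analysis records per result value.

    event is a DynamoDB Streams batch as delivered to a Lambda function.
    """
    counts = Counter()
    for record in event.get("Records", []):
        # Only new analysis records matter here; skip MODIFY and REMOVE events.
        if record.get("eventName") != "INSERT":
            continue
        image = record.get("dynamodb", {}).get("NewImage", {})
        # Stream images use DynamoDB's typed attribute format, e.g. {"S": "text"}.
        result = image.get("result", {}).get("S", "unknown")
        counts[result] += 1
    return dict(counts)
```

A real handler would then write these counts to an analytics table, for example as atomic counter updates. The primary table is never queried at any point, which is the benefit the list above describes.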
As is true for any solution, CoDetect can be improved. In particular, we are planning the following as part of our next iterations:
- Utilize AWS Step Functions as a replacement for internal Lambda invocations in the analysis functionality.
- Implement an additional validation point using Amazon Comprehend Medical to analyze the free-text reference data users enter while submitting a new analysis.
- Replace the Serverless Framework (our current infrastructure as code tool) with AWS CDK.
- Optimize our Amazon SageMaker setup to be less dependent on custom Docker images and more oriented to native service features.
- Implement a UI for monitoring the solution by utilizing subject-related AWS services such as Amazon CloudWatch and AWS X-Ray.
- Upgrade our Lambda runtime from .NET Core 2.1 to 3.1 to utilize the new CloudWatch Lambda Insights feature for our Lambda functions.
We are also preparing to undergo the AWS Well-Architected Framework – Serverless Applications Lens review process to detect additional improvement areas.
While designing and implementing CoDetect, DXC Technology applied the Serverless Native Mindset by using:
- Managed services where possible.
- Abstraction libraries where available.
- AWS SDK as the primary way of interacting with managed services that are part of our solution.
As a result, we were able to implement CoDetect with a minimal code footprint and a clear focus on the business objectives. Simultaneously, our solution follows CI/CD and agile principles, has a predictable cost, and is production-ready while being able to scale with the number of users it needs to serve.
Regarding the maintenance of CoDetect, our involvement is minimal. The trust factor is high since maintaining the underlying infrastructure for the services is governed by the Shared Responsibility Model.
Security-wise, we take the steps necessary to protect our data, while the underlying services are secured by the AWS engineering teams. This combination creates a highly secure environment for our customers.
DXC Technology – AWS Partner Spotlight
DXC Technology is an AWS Premier Consulting Partner and MSP that understands the complexities of migrating workloads to AWS in large-scale environments, and the skills needed for success.