Portcast Scales Machine Learning Models at Sea Using Amazon SageMaker
Promoting Supply Chain Efficiency with Machine Learning (ML)
Ocean freight shipping is the main mode of transport for global trade, accounting for 90 percent of trade volume. Supply chain efficiency depends heavily on the ability to predict, and plan for, the arrival of goods at ports despite weather disruptions, customs delays, and other factors.
Singapore-based startup Portcast offers a predictive supply chain software-as-a-service (SaaS) via its machine learning (ML)-powered platform, helping shipping companies and manufacturers predict the arrival time of cargo at ports. Portcast wants to reduce its customers’ manual effort and cost of container tracking by 80 percent and 20 percent respectively. The company’s proprietary ML models use historical patterns of shipping and container movements and real-time data such as weather conditions, port traffic, and vessel location to enhance supply chain efficiency.
“With Amazon SageMaker, we can build a globally scalable platform for predictive logistics with efficient cloud storage and data computation.”
Lingxiao Xia, Cofounder and CTO, Portcast
Isolating ML Predictions to Scale Independently
Portcast uses a range of Amazon Web Services (AWS) services to support its underlying infrastructure and ability to run ML models at scale. To store the data collected for ML processing, it uses Amazon Relational Database Service (Amazon RDS) and Amazon Simple Storage Service (Amazon S3). It also uses Amazon Elasticsearch Service for search logs driving analytics. “Interaction between services on AWS is seamless, which allows us to constantly experiment with different ideas,” says Lingxiao Xia, cofounder and chief technology officer at Portcast.
Until 2019, Portcast trained and deployed its ML models using Amazon Elastic Compute Cloud (Amazon EC2) bare metal instances and self-hosted clusters. This approach worked well in the beginning, when the company was tracking a few hundred containers at a time. However, as the company grew, scaling its ML predictions became complex and costly, given that each vessel often carried up to 20,000 containers. Compounding the complexity was the need to provide up-to-date arrival predictions twice a day per container.
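The scaling pressure described above comes down to simple arithmetic. The sketch below is a rough illustration, not Portcast's own sizing model: it multiplies a container count by the two daily updates mentioned in the text. The function name and the figure of 300 containers (standing in for "a few hundred") are assumptions made for the example.

```python
def daily_predictions(containers: int, updates_per_day: int = 2) -> int:
    """Return how many arrival predictions are needed per day."""
    if containers < 0 or updates_per_day < 0:
        raise ValueError("counts must be non-negative")
    return containers * updates_per_day


# Early stage: a few hundred containers (300 is an assumed example value).
early_load = daily_predictions(300)

# A single vessel carrying up to 20,000 containers, updated twice a day.
vessel_load = daily_predictions(20_000)

print(early_load, vessel_load)
```

Under these assumptions, one fully loaded vessel alone needs 40,000 predictions per day, compared with 600 for the early workload.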
With a bare-metal setup, Portcast had to scale its entire ML infrastructure—including processing, predictions, and endpoints—collectively, which generated very high memory requirements. Portcast needed to isolate the prediction aspect of its ML models to scale independently; it solved the challenge by using Amazon SageMaker.
Running Processes in Parallel with Limitless Memory
Before deciding on Amazon SageMaker, Portcast requested a hands-on training session with AWS solutions architects. Portcast’s data team used Amazon SageMaker to optimize ML workloads by separating model training from predictions and processing, starting with a few instances carrying out ML model training. Now, Portcast uses Amazon SageMaker to speed up automation in its end-to-end ML cycle, from training to processing to predictions.
“By taking predictions out of the ML model as a separate service, we’re able to scale models independently and reduce memory requirements,” says Xia.
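One way to picture predictions running as a separate service is a worker pool that only forwards requests and never loads a model itself. The following is a minimal sketch under that assumption, not Portcast's actual code: `predict_in_parallel`, `fake_endpoint`, and the feature names are hypothetical, and the stub stands in for a call to a hosted model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Features = Dict[str, float]


def predict_in_parallel(
    containers: List[Features],
    predict_remote: Callable[[Features], float],
    max_workers: int = 8,
) -> List[float]:
    """Send each container's features to a prediction service concurrently.

    The workers only forward requests, so no model is held in their memory.
    Results are returned in the same order as the input list.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(predict_remote, containers))


def fake_endpoint(features: Features) -> float:
    """Hypothetical stand-in for a hosted model: a made-up ETA in hours."""
    return features["distance_nm"] / features["speed_knots"]


etas = predict_in_parallel(
    [
        {"distance_nm": 480.0, "speed_knots": 16.0},
        {"distance_nm": 900.0, "speed_knots": 18.0},
    ],
    fake_endpoint,
)
print(etas)
```

Because the callers hold no model state, the number of workers can grow without increasing memory use on the calling side; the hosting service scales separately.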
Previously, Portcast was limited by the number of processes it could run in parallel, as each model had to be loaded into memory. “With all our models hosted on Amazon SageMaker, we can run hundreds of processes in parallel with no memory limit. We have the potential to generate millions of predictions a day,” Xia adds. Portcast also leverages the multi-model endpoints feature in Amazon SageMaker to reduce costs by hosting multiple models on each instance, saving at least 60 percent on ML deployment.
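With a multi-model endpoint, the caller names the model artifact it wants on each request through the `TargetModel` parameter of the SageMaker Runtime `invoke_endpoint` call. The sketch below shows roughly what such a request could look like. The endpoint name, artifact file name, feature fields, and JSON response shape are all assumptions for illustration; the source does not describe Portcast's request format.

```python
import json
from typing import Any, Dict


def build_invoke_request(
    endpoint_name: str, target_model: str, features: Dict[str, Any]
) -> Dict[str, Any]:
    """Build keyword arguments for SageMaker Runtime invoke_endpoint."""
    return {
        "EndpointName": endpoint_name,
        # Relative path of the model artifact to load on the shared endpoint.
        "TargetModel": target_model,
        "ContentType": "application/json",
        "Body": json.dumps(features),
    }


def predict_arrival(
    runtime_client: Any,
    endpoint_name: str,
    target_model: str,
    features: Dict[str, Any],
) -> Any:
    """Invoke one model on a multi-model endpoint and decode a JSON reply."""
    request = build_invoke_request(endpoint_name, target_model, features)
    response = runtime_client.invoke_endpoint(**request)
    # The response body is a stream; this sketch assumes the model returns JSON.
    return json.loads(response["Body"].read())


# Usage with boto3 (requires AWS credentials and an existing endpoint):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   predict_arrival(runtime, "eta-endpoint", "lane-a.tar.gz", {"speed_knots": 16.0})
```

Passing the client in as an argument keeps the function easy to exercise with a stub, and the same endpoint can serve many models by changing only `target_model`.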
Speeding up Deployment and Automating Monitoring
Currently, Portcast monitors tens of thousands of containers per day—a scale not technically possible on its previous infrastructure. “With Amazon SageMaker, we can build a globally scalable platform for predictive logistics with efficient cloud storage and data computation,” Xia says.
As a fully managed service, Amazon SageMaker handles the underlying infrastructure that trains and runs ML models, so Portcast only has to determine the initial setup. ML models scale automatically, and Amazon CloudWatch sends alerts when abnormalities are detected. Portcast’s data team has a user interface that enables high visibility into processing jobs and their status without manual intervention. This saves the team at least 2–3 hours per week formerly spent on infrastructure monitoring.
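As an illustration of the alerting described above, the sketch below builds the parameters for a CloudWatch alarm on an endpoint's `ModelLatency` metric, which SageMaker publishes in the `AWS/SageMaker` namespace in microseconds. Which metrics and thresholds Portcast actually watches is not stated in the source; the helper name, the 500 ms threshold, the evaluation window, and the SNS topic ARN are assumptions.

```python
from typing import Any, Dict


def build_latency_alarm(
    endpoint_name: str,
    variant_name: str,
    threshold_ms: float,
    sns_topic_arn: str,
) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"{endpoint_name}-model-latency-high",
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",  # published in microseconds
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Average",
        "Period": 300,  # seconds per evaluation window (assumed)
        "EvaluationPeriods": 3,  # consecutive breaches before alarming (assumed)
        "Threshold": threshold_ms * 1000.0,  # convert ms to microseconds
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


params = build_latency_alarm(
    "eta-endpoint",  # hypothetical endpoint name
    "AllTraffic",  # hypothetical variant name
    500.0,
    "arn:aws:sns:ap-southeast-1:123456789012:ml-alerts",  # placeholder ARN
)

# To create the alarm (requires AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Once an alarm like this exists, notifications reach the team through the SNS topic, so nobody has to watch dashboards to catch a slow endpoint.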
Beyond time savings, Xia emphasizes the value of reducing context switching. “If our data scientists need to shift from analyzing models to monitoring tasks, the context switching cost is more than the time spent on that task,” he explains.
Improving Data Science/Development Workflows
The introduction of Amazon SageMaker has also reduced dependency between Portcast’s data science and development teams. Developers no longer need to set up infrastructure before the data team can update ML models or add new features. Data scientists can independently establish the infrastructure required for their jobs within Amazon SageMaker.
Some data scientists proficient in Amazon SageMaker have become internal champions. They regularly suggest or initiate projects to address common challenges such as processing bottlenecks. The data team also actively builds its knowledge of Amazon SageMaker through targeted sessions and discussions with AWS solutions architects on optimization strategies for scaling while controlling costs.
Scaling to Support Expansion
Nidhi Gupta, cofounder and chief executive officer of Portcast, feels that the best is yet to come for logistics innovation. “The next couple of years is just the inflection point for our industry, and we anticipate 10–20 times growth in the coming months,” she says. “With Amazon SageMaker, we can handle more containers on the same platform as we grow. This allows us to explore more business opportunities while optimizing our resources, which ultimately improves our bottom line.”
To learn more, visit Machine Learning on AWS.
About Portcast
Portcast offers predictive visibility and demand forecasting technology for logistics companies and manufacturers. Its customers can reduce manual planning by up to 80 percent and more accurately determine cargo demand and arrival times to control costs.
Benefits of AWS
- Runs hundreds of ML processes in parallel with no memory limit
- Saves at least 2–3 hours per week on infrastructure monitoring
- Scales from tracking hundreds to thousands of containers in 2 years
- Cuts ML deployment costs by at least 60%
- Reduces dependency between developer and data science teams
- Promotes a self-service culture to solve internal bottlenecks
AWS Services Used
Amazon SageMaker
Amazon SageMaker helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML.
Amazon CloudWatch
Amazon CloudWatch is a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), and IT managers.
Amazon Relational Database Service (RDS)
Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud.
Amazon Elasticsearch Service
Amazon Elasticsearch Service is a fully managed service that makes it easy for you to deploy, secure, and run Elasticsearch cost effectively at scale.