Canada's AI Innovators
Creating breakthroughs in Canada and beyond
Logan: Unlocking nature's secrets to fight microplastic pollution
Microplastic pollution is a man-made problem. But what if the solution already exists in nature, waiting to be uncovered by AI?
That’s the mission Artem Babaian has set for his research team.
Meet Artem Babaian
Assistant Professor of Molecular Genetics, the University of Toronto’s Donnelly Centre
When Artem Babaian visits the water's edge at Lake Ontario, he's torn between enjoying a peaceful moment—and worrying about a global problem lurking beneath the surface.
“Because of its proximity to the city of Toronto, high levels of microplastics are found in the fish and life throughout Lake Ontario,” he explains.
"Microplastics are in water, our food, and ultimately in our bodies.”
What was once overlooked has become one of the most pervasive environmental challenges of our time. But Babaian, Assistant Professor of Molecular Genetics at the University of Toronto’s Donnelly Centre, and his team of researchers at the Laboratory for RNA-Based Lifeforms believe nature holds the answer.
"We think of microplastics as being a very synthetic problem, but the irony is that the solution might already be in nature.”
A planetary-scale problem demands a planetary-scale solution
As a researcher, Babaian understood why scientists had not been able to make this discovery sooner.
For decades, scientists around the world have been collecting DNA and RNA samples, generating an extraordinary wealth of sequencing data that has been deposited in open databases.
"There are more than 39 million sequencing datasets with the answers to our planet's toughest problems inside," Babaian notes.
But the sheer volume of data had become its own barrier, creating a fundamental bottleneck.
"Genomics has long embraced open science,” he continues. “But access alone isn’t enough. How do we find the right enzyme?”
With a library of data so vast that no single computer could read it, and very few could access it, Babaian and his team were determined to build a tool that would let researchers search and interpret DNA data at global scale.
Democratizing access to genomic data
They developed a software program called Logan, an open-source, searchable index of the world’s public sequencing data. Fundamentally, it is a DNA search engine for accessing nature's genetic library.
“Logan is the world's largest biological sequence database—it unlocks enzymes that already exist in nature," Babaian says.
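To give a feel for what a "searchable index" of sequences can mean, here is a toy sketch in Python. It is not Logan's actual implementation, which the article does not describe. It assumes a simple k-mer lookup table (every substring of length k mapped to the datasets that contain it), and the dataset names and sequences are made up for illustration.

```python
from collections import defaultdict

def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(datasets, k):
    """Map each k-mer to the names of the datasets that contain it."""
    index = defaultdict(set)
    for name, seq in datasets.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index

def search(index, query, k):
    """Count how many of the query's k-mers appear in each dataset."""
    hits = defaultdict(int)
    for kmer in kmers(query, k):
        for name in index.get(kmer, ()):
            hits[name] += 1
    return dict(hits)

# Hypothetical toy data; real sequencing datasets are vastly larger.
K = 5
datasets = {
    "sample_A": "ATGGCGTACGTTAGC",
    "sample_B": "TTTTCCCCGGGGAAAA",
}
index = build_index(datasets, K)

print(search(index, "GCGTACG", K))  # {'sample_A': 3}
```

The point of the sketch is that once the index exists, a query never has to re-read the raw data. It only looks up short keys, which is what makes searching at very large scale plausible.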
Building Logan required infrastructure far beyond traditional computing. To process and index the data, Babaian's team deployed 2.2 million CPUs simultaneously using Amazon Web Services (AWS). This level of parallelization let them complete in six days what would previously have taken years.
Critically, the team also optimized their cloud workflow on AWS to drive down the cost of processing each dataset from several dollars to just cents—eliminating the need to choose subsets and making it feasible to index all of the world's public sequencing data.
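To put that drop in perspective, here is a back-of-the-envelope calculation in Python. It uses the per-dataset figures Babaian cites (two to three dollars before, about five cents after) and assumes, purely for illustration, that all of the roughly 39 million public datasets mentioned earlier were processed at those prices. The article does not state the project's actual total spend.

```python
# Assumption for illustration: every one of the ~39 million public
# datasets is processed at the quoted per-dataset price.
DATASETS = 39_000_000

def total_cost_usd(cents_per_dataset, n_datasets=DATASETS):
    """Total cost in dollars; prices are given in whole cents to keep the math exact."""
    return cents_per_dataset * n_datasets / 100

before_low = total_cost_usd(200)   # $2.00 per dataset
before_high = total_cost_usd(300)  # $3.00 per dataset
after = total_cost_usd(5)          # $0.05 per dataset

print(f"Before: ${before_low:,.0f} to ${before_high:,.0f}")
print(f"After:  ${after:,.0f}")
print(f"Reduction: {200 // 5}x to {300 // 5}x")
```

Under these assumptions, the total falls from the range of 78 to 117 million dollars to just under 2 million, a 40- to 60-fold reduction. That gap is the difference between sampling a subset and indexing everything.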
"We brought the cost per dataset down from two or three dollars to about five cents," says Babaian. "That meant we didn't have to limit ourselves to subsets—we could analyze everything and release it freely for everyone to use and reuse."
But indexing the data was only half the challenge—making sense of it required a different kind of intelligence.
AI-driven discovery of plastic-degrading enzymes
With Logan in place, AI becomes the discovery engine.
Once Logan identifies plastic-degrading proteins, the team uses Amazon Bedrock's industry-leading AI models to learn which environments these protein sequences are found in. This helps the team quickly identify the best protein sequences for specific environmental conditions, which accelerates testing and helps determine whether they could actually be effective at degrading microplastics.
In an initial pilot, the team searched for enzymes capable of breaking down polyethylene terephthalate (PET), a plastic commonly found in water bottles.
“These enzymes are hidden in bacteria, fungi, insects, and other organisms,” Babaian notes. “What we found in that first pilot outperformed anything ever designed in a lab.”
“We uncovered over a billion versions of plastic-degrading enzymes in a mere ten hours,” Babaian says. “AI is allowing us to zero in on the discoveries that matter.”
A future of open, equitable science
Babaian’s long-term vision is not just scientific advancement, but accessibility. His team is working to synthesize and test the most promising enzyme candidates, with the goal of deploying them in real-world systems such as water treatment. At the same time, he is committed to keeping Logan open and accessible to the global research community.
"The AWS Open Data Registry allows us to democratize access to Logan’s data and catalyze other research groups to make their discoveries possible. Our goal is to ensure equal access to humanity's genetic data."
By combining open science, cloud infrastructure, and AI, Logan represents a new model for discovery—one where the full scale of nature’s data becomes searchable, and its solutions accessible.
While plastic degradation is a powerful proof of concept, Logan’s potential extends far beyond environmental applications.
"We're starting with microplastics, but this is just the beginning," Babaian explains.
"Anywhere where life has found a way to do interesting chemistry, we now have a way to find it and help transform our world for the better.”