In 1916, the Minnesota Mining and Manufacturing Company, or 3M, opened its first research lab, a closet-sized space in its St. Paul sandpaper factory. A series of incidents during the company’s first 14 years (in one, a case of olive oil spilled onto a sandpaper shipment in transit and exposed the paper’s poor quality) inspired the then general manager, William McKnight, to create a space for testing products to improve quality control.
As McKnight’s influence grew (he would become chairman of 3M’s board in 1949), so did the fervor for quality. Over the years at 3M, entrepreneurial scientists transformed everything from wild ideas to incidents like the sandpaper shipment to even failed experiments into products that are now household staples, such as Scotch tape and Post-it notes.
Courtesy of Wired
Quality remains an intrinsic facet of the culture at 3M. Spurred by the success at the lab, 3M has significantly expanded its research facilities. Nearly six percent of the company’s revenue is now funneled into R&D. In St. Paul—where nearly 12,000 employees come together to create and launch new products and improve old ones—thousands of researchers and scientists in the corporate labs are striving to add to the innovation pipeline.
One of the most important topics on the 3M campus is machine learning. Using machine learning on Amazon Web Services (AWS), 3M is improving tried-and-tested products, like sandpaper, and driving innovation in new fields, like healthcare. Perhaps as a testament to the programs’ effectiveness, products that are less than five years old consistently contribute about 30 percent of the company’s revenue; every year 3M releases about 1,000 new products.
“There are not many companies that can combine what we have as a rich material foundation with the digital capability to really make something new,” said Hung Brown Ton, Chief Architect at the St. Paul Corporate Research Systems Lab. “That’s what’s exciting for us—leveraging these new cloud capabilities like machine learning.”
Revising a 100-Year-Old Product with Machine Learning
Since overcoming countless sandpaper-making hurdles of the company’s early days, 3M has continued to improve the abrasive capacity of its long-standing product. Until the recent introduction of machine learning techniques into the product development workflow, however, the process was extremely time-consuming.
The ideal grain of sand (which is actually a synthetic material called Cubitron) cuts best and lasts longest. Traditionally, to arrive at that ideal, a 3M technician would inspect a CT scan for each sheet of paper to assess the number of grains on a sheet. Then, the technician would test each sample against a rough surface to measure its effectiveness and try to correlate that effectiveness with the percentage of grains.
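The correlation step described above can be sketched in a few lines of Python. This is a minimal illustration, not 3M’s actual analysis: the function name `pearson`, the variable names, and every number in the sample data are hypothetical, and a plain Pearson correlation coefficient is assumed as the measure of how closely grain coverage tracks cutting effectiveness.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one entry per sample sheet (values invented for illustration).
grain_coverage_pct = [42.0, 47.5, 51.0, 55.5, 60.0, 64.5]  # share of sheet covered by grains
cut_rate = [1.8, 2.1, 2.6, 2.7, 3.3, 3.4]                  # material removed in a wear test

r = pearson(grain_coverage_pct, cut_rate)
print(f"correlation between grain coverage and cut rate: {r:.3f}")
```

A coefficient near 1 would suggest that sheets with more grain coverage cut better in this made-up data set; a value near 0 would suggest no linear relationship.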
“This involves a long development process, which takes weeks,” said Brown Ton, whose team collaborates with the research scientists developing the new abrasive samples and products (including the product still colloquially referred to as sandpaper).
With machine learning on AWS, which Brown Ton’s teams started implementing a little less than a year ago, the process is now much faster and more precise. The 3M team is currently testing models that use traditional image training as well as neural networks on Amazon SageMaker. While the technician still tests the samples, the models make image analyses significantly faster, helping her narrow down the best options. These machine learning models enable researchers to analyze how slight changes in grain shape, size, and orientation may improve abrasiveness and durability. In turn, those findings inform the manufacturing process.
Given the amount of data generated by these scans and tests (about 750 GB per palm-sized sheet), the team was initially blowing through the heavy-duty engineering laptops it had bought to run the analyses. “So it made perfect sense to move this capability to the cloud,” said Brown Ton, “because we were enormously hampered by the compute power of any conventional laptop or desktop we could purchase. The process today in AWS is orders of magnitude more efficient – and a delight to spend our time on understanding abrasives instead of waiting for data to be collected and tests to complete.”
Converting Reams of Unstructured Text into Billable Codes
While sandpaper is a 3M staple, as the manufacturing company has grown, it has expanded into new areas—including healthcare. 3M founded its subsidiary Health Information Systems (HIS) in 1983, not long after the first major electronic health record (EHR) system was developed. Today, 96 percent of hospitals are using EHRs, compared to just a sliver ten years ago, and in all that data HIS saw an opportunity to build something new: a set of machine learning-powered medical coding products.
To charge insurance providers for its services, a healthcare provider must translate EHRs into the appropriate billing codes. Errors in the process are common and can result in delayed payments or in over-billing, which can constitute fraud. In the U.S., most hospitals manage this process with the help of HIS’s Natural Language Processing (NLP) tools, which are powered by machine learning on AWS.
David Frazee, the research lab’s director and a 14-year 3M veteran, was previously HIS’s CTO. He said the traditional process of determining billing codes required that individuals known as coders review each record and, based on knowledge and experience, pick the right code from among literally 141,000 options. “Three weeks later, you could give the same coder the exact same records, and they may determine a different code,” said Frazee.
Since April 2016, HIS has combined that imperfect human expertise with machine learning models to reduce the error in the process. Much of an EHR is unstructured—as Frazee puts it, anything beyond a doctor’s scribble on a napkin probably qualifies as a record—so simply getting the models to understand what a record means is a feat.
To that end, linguists teach the NLP model to parse confusing records: for example, to know that a doctor’s note describing a body part as “cold” does not mean the patient has a cold. The coders sign off (or don’t) on the model’s decision. Their evaluation is fed back to the model so that it can improve in the next round. The model, which processes a staggering three million documents daily, is learning quickly, and for many procedures it selects the right code about 98 percent of the time. The model itself runs on high-powered Amazon EC2 instances and uses Amazon S3 for storage.
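The review loop described above, in which coders accept or reject the model’s suggestion and their decisions flow back as training signal, can be sketched as follows. This is a simplified illustration under stated assumptions, not HIS’s actual system: the `Review` class, both function names, the record IDs, and the placeholder codes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Review:
    record_id: str       # hypothetical identifier for a health record
    suggested_code: str  # billing code proposed by the model
    final_code: str      # billing code the human coder signed off on


def acceptance_rate(reviews):
    """Share of records where the coder kept the model's suggested code."""
    if not reviews:
        raise ValueError("no reviews to score")
    accepted = sum(1 for r in reviews if r.suggested_code == r.final_code)
    return accepted / len(reviews)


def corrections(reviews):
    """Records the coder overrode, paired with the corrected code.

    In a feedback loop these pairs would be added to the training data
    for the next round of model updates.
    """
    return [(r.record_id, r.final_code)
            for r in reviews
            if r.suggested_code != r.final_code]


# Placeholder data: the codes below are invented labels, not real billing codes.
sample_reviews = [
    Review("rec-1", "CODE-A", "CODE-A"),
    Review("rec-2", "CODE-B", "CODE-B"),
    Review("rec-3", "CODE-A", "CODE-C"),  # coder overrode the model
    Review("rec-4", "CODE-D", "CODE-D"),
]

print(f"acceptance rate: {acceptance_rate(sample_reviews):.0%}")
print("corrections to feed back:", corrections(sample_reviews))
```

Tracking the acceptance rate over time is one simple way to see whether each round of feedback is actually improving the model’s suggestions.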
Both Frazee and Brown Ton see machine learning on AWS proliferating across the entire company in the coming years.
“The abrasives R&D project is, I think, very representative of the future of 3M’s material science and data science collision,” Frazee says. “We are one of the best materials manufacturing companies in the world, but we have not taken advantage of the fact that we’ve got a lot of data about our materials.”
“And if you think about the data being aggregated in the cloud, collected through IoT, processed through machine learning—while also leveraging modeling and simulation and the ability to visualize vast amounts of data—all these things come together for us,” Brown Ton adds. “Continuing to leverage these new and rapidly evolving cloud capabilities is really exciting for us and our customers.”