AWS Startups Blog

Scalable, Automated Data Preparation with the Power of AI



Guest post by Bernhard Bicher, Co-founder & CEO, Onedot AG

At the mention of the term Artificial Intelligence (AI), the first image that comes to mind is often a movie-inspired one: self-developing robots that pose a danger to humans. However, existing AI is far removed from Hollywood creations and is embedded in our lives at a much calmer, everyday level. We see it all around us, implemented in use cases relevant to banking, healthcare, transportation, smartphone apps and most online services. And it all runs on data.

With the aim of making the overwhelming variety of master data simple to consume, Bernhard Bicher and Tobias Widmer founded Onedot in 2014 in Zürich, Switzerland, with a focus on processing structured and semi-structured product master data. The current team of enthusiasts are experts in e-commerce, machine learning, statistics, and text analytics for product data, offering SaaS for automated product data preparation using more than 30 proprietary probabilistic and statistical machine learning algorithms. The goal is to make data consumable across digital commerce. “Our first idea on how to make data consumable led to prototypes using a bunch of different approaches, from probabilistic and statistical methods to machine learning and AI,” Tobias Widmer recalls. “There was a need to design every AI version better, and it pushed us further.”

We built our first Minimum Viable Product (MVP) and our first algorithms, completed our first processing job and onboarded some of our first customers (like eBay). We had an interesting pipeline and recruited the initial core team, which was followed by funding from independent technology investors. Realising we needed to become more sales-oriented, we began working with large enterprises, started hiring delivery team members to keep up with some of our bigger projects, and focused on developing a stable sales pipeline and partnerships. During this time, we continued to develop our earlier AI prototype into a production-grade data processing SaaS, used today by companies like the MIGROS Group, Jelmoli and Zageno. A good product and a growing team meant we needed to step out of our existing work methodologies and collaborate better. We also needed stable and scalable computing infrastructure.

Product data is typically garbled and messy. There is not enough of it, descriptions are not detailed enough, or there is simply too much incorrect data, all of which are hurdles in the current market. It’s not enough to put something in an online store; you need to describe the product and add the details relevant to your customers and vendors to create a rich experience for each product. This is where Onedot steps in. In the first phase, Onedot analyzes and structures the supplier data; in the second phase, the data is transformed into the desired target structure.
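To make the two phases concrete, here is a minimal Python sketch of the idea. It is not Onedot's method: the proprietary algorithms are probabilistic and learned, whereas this example swaps in a fixed lookup table, and the names `FIELD_ALIASES`, `structure_record` and `transform_record`, as well as the sample fields, are hypothetical and chosen purely for illustration.

```python
from typing import Dict

# Hypothetical mapping from supplier-specific field names to a target schema.
# A real system would infer such mappings; here they are hard-coded.
FIELD_ALIASES = {
    "name": "title",
    "product name": "title",
    "colour": "color",
    "color": "color",
    "weight (kg)": "weight_kg",
}


def structure_record(raw: Dict[str, str]) -> Dict[str, str]:
    """Phase 1: normalize field names, trim values, drop empty fields."""
    return {
        key.strip().lower(): value.strip()
        for key, value in raw.items()
        if value and value.strip()
    }


def transform_record(structured: Dict[str, str]) -> Dict[str, str]:
    """Phase 2: rename known fields to the target schema, drop unknown ones."""
    return {
        FIELD_ALIASES[key]: value
        for key, value in structured.items()
        if key in FIELD_ALIASES
    }


# Illustrative supplier record with inconsistent naming and spacing.
raw = {" Product Name ": " Trail Shoe ", "Colour": "Red", "Weight (kg)": "0.4", "Notes": ""}
target = transform_record(structure_record(raw))
print(target)
```

Keeping the two phases as separate functions mirrors the description above: the first step only cleans up what the supplier sent, while the second step is the only place that knows about the target structure.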

“Our AI is capable of automatically preparing product data regardless of structure and format. The data sources can be manufacturers, suppliers or content providers; the target business needs high-quality data for ERP, PIM or online shop systems. Onedot developed proprietary algorithms, because the existing approaches just didn’t cut it,” says Tobias Widmer. With a reliable and secure cloud services platform providing enough compute power and database storage to process sizable amounts of data, we persuaded our first batch of pilot customers to try the software. And they were not disappointed.

In summary, Onedot deploys code quickly and frequently, and runs a variety of different workloads reliably. If you are aware of the challenges involved in capacity planning, configuration duplication and configuration drift, you will be better placed to avoid them. The Onedot technology stack is fully containerised, features a reactive microservices architecture and is deployed automatically and continuously, using an infrastructure-as-code paradigm. “Only if we have infrastructure-as-code, then we’re talking about true Infrastructure Automation,” concludes Tobias Widmer.