How BlueFin helped build a scalable IoT data platform for Walchem using Amazon Timestream
Walchem is a leading manufacturer of online analytical instruments and electronic metering pumps. Their mission is to provide the integration of sensor, pump, and electronic technologies for chemical control. Walchem’s Fluent platform provides cloud-based telemetry and monitoring within water treatment facilities, enabling facility operators to gather insights into how the facility is operating and whether changes need to be made, and enables them to make changes in real time. The Fluent platform consists of a variety of dashboards and monitors real-time data streaming from these devices. As a result, the Walchem team needed the Fluent application and underlying data platform to be robust to scale to their device fleet and large customer base.
In this post, we share how Walchem through its partnership with Bluefin Technology Partners, a leading solution provider for Innovative Connected Products and Amazon partner, built a scalable IoT data platform using Amazon Timestream.
Original architecture and challenges
The original architecture of Fluent used a MySQL relational database to capture the offloading of remote controller data. As the solution went live, the team quickly realized the volume and frequency of the data would quickly outpace the capabilities of MySQL both from a storage, performance, and cost perspective. The MySQL database, within 2 months of launch, had 10 billion records, totaling about 1.1 TB in size. The application processed 22 million records a day. On average, the 5,300 controllers in the field sent data in 5-minute intervals to the Fluent platform. It was difficult to estimate the adoption and usage of the Fluent platform because different tenants used the application at different scales. Therefore, the volume of data, usage of the application, as well as quantity of reports delivered were all much higher than expected. The original architecture was scoped to handle 250,000 transactions per day, but instead was processing 990,000 transactions. This was also limited the ability to scale to add additional controllers.
This resulted in the following challenges:
- Cost of infrastructure increased to accommodate future unexpected growth
- Performance wasn’t optimal because of the need to analyze huge volumes of historical data
- Operational overhead increased with more infrastructure management
- Reliability of application was unpredictable, especially during peak times, due to elasticity limitations
To address these challenges, Walchem needed a database that would scale on demand without impacting performance while keeping cost reasonable. Most of the reports that Fluent delivered to end-users required analysis of time series data, which is why they chose to use Timestream, a fast, scalable, and purpose-built time series database.
Timestream addressed all the cost, performance, scale, and maintenance challenges of their existing database solution. The serverless nature of the Timestream database not only addressed the unpredictable scale of performance and storage needs, but also removed the undifferentiated heavy lifting of managing an ever-growing database infrastructure.
The following diagram illustrates the solution architecture.
Migrating to Timestream
After Walchem decided to use Timestream as the database solution for the Fluent application, the next step was migrating the existing data from MySQL. Walchem developed a multi-threaded application to migrate existing data from MySQL. The application performed batch inserts using the Timestream API and migrated all required data over a period of approximately 2 days. During this period, the Fluent platform wrote to both the MySQL and Timestream databases to avoid any interruption and loss of data after switchover.
From a data modeling perspective, the team was able to easily migrate from their existing MySQL table to Timestream. Their MySQL data model, outlined in the following code, was a table with multiple indexes on
reported_time to allow for efficient temporal range queries:
When moving to Timestream, the data model had to be modified only slightly. Timestream utilizes a schema-less, narrow table data model, which means that each row contains a measure value, a set of one or more dimensions, and a measure name. Furthermore, the dimensions only need to be specified at write time. This allowed the Walchem team to easily move their data from MySQL to Timestream, and also simplified their management overhead by removing the need to manage the indexes.
The team modified their queries to use the built-in time series functionality and data model of Timestream.
The following code shows the original MySQL query:
The following code is the modified Timestream query:
With the migration, the team realized many measurable outcomes: increased scale and performance at lower cost, greater architectural simplicity, and a lower maintenance due to using Timestream as a managed database service.
Increased scale and performance at lower cost
With Timestream, the team achieved the necessary scalability by saving upwards of 50% on the data storage tier relative to the existing architecture. Furthermore, extended graphs and visualizations, which previously took 30–40 seconds to plot, could now be retrieved and plotted in under 2 seconds. For example, one use case was to show sensor data in a line graph for a duration at configurable intervals. The period start date was selectable by the user and could go back many years. The MySQL database was partitioned by month, so there was always a significant lag in fetching data from a partition that wasn’t in memory yet. Specifically, querying data from non-loaded partitions (graphs outside of the last 12 hours) was taking up to 40 seconds to return. With Timestream implemented, graphs rendered in less than 2 seconds.
As the table size grew, the MySQL instance size and associated cost for reasonable performance also grew. The underlying architecture in Timestream scales storage and compute independently. Its purpose-built query engine for such use cases helped the Fluent platform scale without compromising on performance or incurring a significant increase in cost.
Greater architectural simplicity
When faced with the limitations of MySQL, the team evaluated a variety of different options. For example, team evaluated an architecture using Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon DynamoDB in order to achieve the long-term storage and query capabilities. However, Timestream allowed the team to achieve the same outcome within a comparable level of short-term and long-term storage and query performance, all within a single data store.
Furthermore, the team managed several indexes and variants of the tables within MySQL to enable the specific views necessary for the application. Although the team needed to refactor the data model a bit to conform to the narrow table model in Timestream, they were able to eliminate all of those constructs. Because Timestream supports SQL queries, the switchover from the previous implementation to Timestream queries was nearly seamless.
Lower maintenance due to using a managed database service
Finally, the Walchem team no longer needs to manage a database.
With the original architecture, the team needed a dedicated database administrator for the maintenance of the application data store. Furthermore, any downtime or issues with the database ultimately resulted in downtime for their production application, which the Walchem team couldn’t afford. The managed nature of Timestream reduced the administration overhead for the team and allowed them to refocus their cycles from managing and caring for their database, to continuing to innovate on their platform and provide further value for their customers.
Conclusion and next steps
Walchem has been planning to expose the sensor time series data via APIs so that their tenants can build custom applications by consuming those APIs. However, the uncertainty of API usage volume and the scale requirements for relational database infrastructure didn’t allow Walchem to make those APIs available for external use.
With the introduction of Timestream, they don’t need to worry about scaling the database infrastructure and tuning the performance based on the demand and usage. Walchem is now able to save the upfront infrastructure investment cost associated with such new business initiatives, resulting in more focus on innovation.
About the Authors
John Gray is a Data Lab Solutions Architect at AWS based out of Seattle. In this role, John works with customers on their Analytics, Database and Machine Learning use cases, architects a solution to solve their business problems and helps them build a scalable prototype.
Piyush Bothrais a Senior Solutions Architect with AWS where he helps customers innovate, differentiate their business, and transform their customer experiences. He has over 15 years’ experience helping customers achieve their business outcomes with architecture leadership and digital transformation enablement. Outside of work, Piyush loves spending time with family and friends, music, playing golf, and watching any live sports event.