“We realized early on that the relational database was not going to be a good solution for us and looked for alternatives in distributed storage systems. Amazon SimpleDB looked like a great fit because of its simple sparse record structure, automated indexing, fast query and data partitioning facilities,” recalls Iskold, describing the SimpleDB model, which combines horizontal scale-out with indexing of each attribute-value pair to ensure rapid query capability, even on databases of a terabyte or more.
“Our choice to go with a SimpleDB over a relational database was driven by the fact that SimpleDB is designed from the ground up to address scaling massive amounts of data. This, in addition to the fact that our entire offering runs on Amazon Web Services stack made it an easy decision for us to choose SimpleDB over a relational database.”
SimpleDB enables this high scalability by promoting scale out, rather than up. Rather than allowing the size of the database to grow vertically, SimpleDB users are encouraged to spread the load horizontally across many domains. The service can then take multiple requests across all of these domains in parallel (i.e. multi-threaded access). In this way, both the total size of the dataset and the request throughput can grow without performance degradation.
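The scale-out pattern described above can be sketched roughly as follows. This is a minimal illustration, not SimpleDB client code: the domains are plain in-memory dictionaries, and the shard count, the `items_N` domain names, and every function name are assumptions made up for this example. It shows items being spread across several domains by a hash of the item name, and a query being fanned out to all domains in parallel threads.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_DOMAINS = 4  # hypothetical number of domains to spread the load across

# In-memory stand-ins for domains: {domain_name: {item_name: attributes}}
domains = {f"items_{i}": {} for i in range(NUM_DOMAINS)}


def domain_for(item_name):
    """Pick a domain from a stable hash of the item name."""
    digest = hashlib.md5(item_name.encode("utf-8")).hexdigest()
    return f"items_{int(digest, 16) % NUM_DOMAINS}"


def put_item(item_name, attributes):
    """Store the item in whichever domain its name hashes to."""
    domains[domain_for(item_name)][item_name] = dict(attributes)


def query_domain(domain_name, attribute, value):
    """Return item names in one domain whose attribute equals value.

    This stub scans the dictionary; a real service would consult its
    attribute index instead.
    """
    return [
        name
        for name, attrs in domains[domain_name].items()
        if attrs.get(attribute) == value
    ]


def query_all(attribute, value):
    """Send the same query to every domain in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=NUM_DOMAINS) as pool:
        parts = list(
            pool.map(lambda d: query_domain(d, attribute, value), sorted(domains))
        )
    return sorted(name for part in parts for name in part)


# Usage: three hypothetical items, then one fan-out query.
put_item("book:dune", {"type": "book"})
put_item("film:alien", {"type": "movie"})
put_item("book:emma", {"type": "book"})
books = query_all("type", "book")
```

Because each domain is queried independently, adding domains raises both the total capacity and the number of requests that can be in flight at once, which is the property the paragraph above attributes to scaling out.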
This horizontal scale-out approach is on display with Glue. To ensure high performance and to promote clarity of design, Glue partitions People and Things into distinct domains. However, an important concept for Glue is that of an Interaction, in which a Person interacts with a Thing. The key of the Interaction record is the combination of the Person key and the Thing key, a relationship that would be captured via a join in a relational database. To maintain this relationship in SimpleDB, each Interaction record is simply stored twice: once in the People domain and once in the Things domain. This redundancy allows Glue to mimic relational joins with SimpleDB.
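The double-write idea can be illustrated with a short sketch. It is an assumption-laden toy rather than Glue's actual code: the two domains are in-memory dictionaries, and the key separator, attribute names, and function names are all hypothetical. It shows one Interaction being written under the same composite key to both domains, so that each side of the relationship can be read from a single domain without a join.

```python
# In-memory stand-ins for the two domains: {interaction_key: record}
people_domain = {}
things_domain = {}


def interaction_key(person_key, thing_key):
    """Combine the Person key and the Thing key into one record key.

    The "|" separator is an assumption made for this sketch.
    """
    return f"{person_key}|{thing_key}"


def record_interaction(person_key, thing_key, attributes):
    """Store the same Interaction record once in each domain."""
    key = interaction_key(person_key, thing_key)
    record = {"person": person_key, "thing": thing_key, **attributes}
    people_domain[key] = dict(record)  # copy kept with People
    things_domain[key] = dict(record)  # redundant copy kept with Things
    return key


def things_for_person(person_key):
    """Answer "what did this Person interact with?" from the People domain only."""
    return sorted(
        r["thing"] for r in people_domain.values() if r["person"] == person_key
    )


def people_for_thing(thing_key):
    """Answer "who interacted with this Thing?" from the Things domain only."""
    return sorted(
        r["person"] for r in things_domain.values() if r["thing"] == thing_key
    )


# Usage with hypothetical keys and a hypothetical "action" attribute.
record_interaction("alice", "book:dune", {"action": "liked"})
record_interaction("bob", "book:dune", {"action": "visited"})
record_interaction("alice", "film:alien", {"action": "liked"})
```

The trade-off is the usual one for denormalized data: every write is performed twice and the two copies must be kept in step, in exchange for reads that touch only one domain.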
Amazon Web Services, particularly the storage solutions Amazon S3 and SimpleDB, facilitated a faster time to market. “To build an in-house distributed storage solution is non-trivial, if not impossible, so AWS was critical to the success and development of Glue. We are also very pleased about the cost, SimpleDB costs us less than $100 per month, yet it supports a large number of users and interactions.” An on-premises or hosted solution sized to meet the growth projections of Glue could easily have topped $1,000 per month in equipment, licensing, and DBA support, meaning that SimpleDB can offer up to a 10-to-1 total cost of ownership advantage.
In addition to SimpleDB, AdaptiveBlue also uses EC2 and S3 heavily. The Glue Web Service takes advantage of EC2's flexible ability to scale, while S3 is used to store and host over 30 million objects visited by Glue users. AdaptiveBlue has been a customer since the early days of AWS. Iskold states, “We have nothing but highest praise to say about the quality of AWS products and responsiveness and dedication of Amazon’s team. We know that we could not have built Glue without having AWS’ powerful infrastructure beneath it.”
To get started with Glue, visit http://getglue.com/
To read how AdaptiveBlue uses AWS to power BlueOrganizer, go to solutions/case-studies/adaptiveblue.