Active.com was looking for a way to analyze users’ click-streams in near real time so that it could deliver pertinent trending information promptly. One of the fundamental ways that Active.com enhances the user experience on its website is by understanding and anticipating user needs, surfacing relevant content dynamically whenever possible. This is reflected in the “Popular Near You” feature on the homepage and the “Events Near You” feature on channel pages such as active.com/running. There are many reasons to analyze Web site traffic patterns, including understanding page rank, customer click paths, and feature popularity, and detecting user problems or errors. Typically, Web site operators conduct such analysis offline, well after the events being tracked have occurred. While Active.com does use web analytics platforms to understand its business offline, it found that the delay associated with offline log processing was unacceptable when presenting users with an hourly snapshot of localized, trending information.
Prior to using Amazon SNS, Active.com implemented a message bus based on an Extensible Messaging and Presence Protocol (XMPP) server that relayed messages between back-end servers over HTTP. XMPP was developed for real-time internet communication, including chat, and has been applied to a range of applications including instant messaging, presence, multi-party chat, voice, and lightweight middleware using publish/subscribe plug-ins. While the XMPP architectural model was promising for Active.com, this message bus solution (based on an ejabberd XMPP server) suffered from poor reliability, limited scalability, and high operating cost.
When Jeremy Thomas, Director of Product Development at Active.com, learned about the Amazon Simple Notification Service, he recognized that some of the application challenges Active.com was experiencing with its existing solution might be alleviated by switching to Amazon SNS. Amazon SNS offered a push-based eventing model and instantaneous delivery, and it supported multiple protocols and large fan-outs. Although the application was still in beta, maintaining hosted XMPP servers in a highly reliable manner had proved challenging. Thomas said, “We decided to reduce our cost, complexity and management overhead by switching to a low-cost managed messaging service instead of running our own messaging servers. We realized we could cut down our monthly costs, realize a more bug-free experience, and reduce operational burden and achieve greater flexibility by moving our message bus solution to Amazon Simple Notification Service.”
Amazon SNS was ultimately chosen as the message bus solution. One of the core business drivers for the Active.com message bus was the ability to gain relevant insights by aggregating and slicing real-time data to support different business goals. Active.com used a common event model suitable for a wide range of activities on its Web site, from simply browsing articles on a particular sport, to searching for events of interest by location, to participating in community discussions. A subset of the click-stream events that are captured and relayed using Amazon SNS can be viewed and sorted by geographic location, sport, or activity at http://realtime.active.com.
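To make the idea of a common event model concrete, here is a minimal Python sketch of one event shape that can be aggregated and sliced by location. It is not Active.com's actual schema: the class name `ClickEvent`, its fields, and the `trending` helper are all hypothetical and chosen only for illustration.

```python
import json
from collections import Counter
from dataclasses import dataclass, asdict
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ClickEvent:
    """One tracked hit. Every field name here is a hypothetical choice."""
    event_type: str        # e.g. "registration", "article_view", "search"
    activity: str          # e.g. "running"
    city: str              # assumed to come from IP geo-location
    occurred_at: datetime

    def to_json(self) -> str:
        """Serialize to a JSON string suitable for a message body."""
        payload = asdict(self)
        payload["occurred_at"] = self.occurred_at.isoformat()
        return json.dumps(payload, sort_keys=True)


def trending(events, city, now, window=timedelta(hours=1)):
    """Count activities for one city inside the trailing window.

    Returns (activity, count) pairs, most popular first.
    """
    cutoff = now - window
    counts = Counter(
        e.activity
        for e in events
        if e.city == city and cutoff <= e.occurred_at <= now
    )
    return counts.most_common()


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [  # made-up sample data
        ClickEvent("search", "running", "San Diego", now - timedelta(minutes=5)),
        ClickEvent("registration", "running", "San Diego", now - timedelta(minutes=20)),
        ClickEvent("article_view", "cycling", "San Diego", now - timedelta(minutes=10)),
    ]
    print(trending(sample, "San Diego", now))
```

Because every kind of activity shares one shape, the same filter-and-count step could serve both an hourly local snapshot and other slices, such as by sport.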
Subsequently, the group built its core real-time Web event management solution around Amazon Simple Notification Service (Amazon SNS). Components of Active.com run on Amazon’s cloud and use AWS products in addition to Amazon SNS, including Amazon Elastic Load Balancing (ELB), Amazon Elastic Compute Cloud (EC2), and Amazon Relational Database Service (RDS). Active.com developed three major subsystems around Amazon SNS to meet its business objectives.
The architecture of the Active.com message bus using Amazon SNS is detailed in the diagram below.
After a "hit" comes in from one of the tracked events (registration, article view, search result) on the Active.com Web site, the user is geo-located based on their IP address. Active.com then makes a call to the “Giffer” over HTTP to notify the system of the hit, and the Giffer stores the event in the RDS database. The Publisher polls RDS for new events. When it finds one, it broadcasts a message over SNS to the Realtime Subscriber (via HTTP). The Realtime Subscriber takes that event, looks up additional metadata about it through the Active.com Search and Location APIs, and inserts the information it has collected into a separate RDS database for later display on http://realtime.active.com.
If a third party wants access to aggregate trending data, such as the most popular running events in LA, it can simply be given the subscription URL. This is precisely what www.active.com does with the “Popular Near You” widget on the homepage.
Realtime.active.com constantly polls the RDS database for newly added hits to display, creating the activity stream effect on the real-time map. Note that only a subset of hits is actually displayed, in order to deliver a more digestible user experience.
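The store, poll, publish, and subscribe steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Active.com's code: an in-memory SQLite database stands in for RDS, the table layout and all function names are hypothetical, and `publish` is any callable, so the sketch runs without AWS credentials. In a real deployment it could wrap the boto3 SNS client's `publish(TopicArn=..., Message=...)` call. The subscriber side relies on the JSON envelope that SNS posts to HTTP endpoints, which carries `Type` and `Message` fields.

```python
import json
import sqlite3


def make_store():
    """Create the event table. In-memory SQLite stands in for RDS (assumption)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def record_hit(db, hit):
    """Role of the Giffer: persist one incoming hit as JSON."""
    db.execute("INSERT INTO events (payload) VALUES (?)", (json.dumps(hit),))
    db.commit()


def publish_new_events(db, publish):
    """Role of the Publisher: one polling pass over unpublished events.

    `publish` is any callable that takes the message string.
    Returns the number of events published in this pass.
    """
    rows = db.execute(
        "SELECT id, payload FROM events WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, payload in rows:
        publish(payload)
        # Mark the row only after publish returns, so a failed publish
        # leaves the event in place for the next polling pass.
        db.execute("UPDATE events SET published = 1 WHERE id = ?", (event_id,))
        db.commit()
    return len(rows)


def handle_sns_post(body, on_event):
    """Role of the Realtime Subscriber: handle one HTTP POST body from SNS."""
    envelope = json.loads(body)
    kind = envelope.get("Type")
    if kind == "Notification":
        on_event(json.loads(envelope["Message"]))
        return "processed"
    if kind == "SubscriptionConfirmation":
        # A real endpoint would confirm by requesting envelope["SubscribeURL"].
        return "needs-confirmation"
    return "ignored"
```

A production subscriber would also verify the signature on each SNS message before trusting it, and the metadata lookup against the Search and Location APIs would happen inside `on_event`; both are left out of this sketch.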
Before SNS was introduced, publishers and subscribers each established a stateful session with the ejabberd XMPP server. Messages were produced and consumed through these sessions. Active.com discovered that these sessions would often lose connectivity to the XMPP server for no apparent reason. As a result, a separate thread had to be written to monitor sessions and reconnect them when this happened. Further, ejabberd documentation and support were not widespread, which increased the total cost of ownership of the XMPP-based messaging solution.
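For readers unfamiliar with this kind of workaround, the monitoring thread described above might look something like the following Python sketch. It does not reproduce Active.com's implementation: the `SessionWatchdog` class is hypothetical, and it assumes each session object exposes `is_connected()` and `reconnect()` methods, which is an invented interface rather than any real XMPP library's API.

```python
import threading


class SessionWatchdog:
    """Periodically reconnects dropped sessions on a background thread.

    Sessions are assumed (hypothetically) to provide is_connected()
    and reconnect().
    """

    def __init__(self, sessions, interval_seconds=5.0):
        self._sessions = list(sessions)
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def check_once(self):
        """Reconnect every dropped session; return how many were reconnected."""
        reconnected = 0
        for session in self._sessions:
            if not session.is_connected():
                session.reconnect()
                reconnected += 1
        return reconnected

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called,
        # so the loop checks sessions every interval until asked to stop.
        while not self._stop.wait(self._interval):
            self.check_once()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread.is_alive():
            self._thread.join()
```

Code like this is extra surface area that must be written, tested, and operated alongside the messaging server itself, which is part of the ownership cost the paragraph above describes.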
Active.com appreciates that AWS helps reduce the burden of infrastructure planning and provisioning. Thomas says, “With the help of AWS Web services, we have established a powerful business platform, while maintaining the flexibility we need to continue innovating with a relatively small team.”
Jeremy Thomas concludes, “SNS allowed Active.com to process tens of millions of events per month by just flipping a switch: no scaling worries or architecture headaches whatsoever. Once it was plugged in, Active.com had an instant, established, and reliable pipeline used to deliver data for mining real-time trends and displaying race registrations and searches as they happen on realtime.active.com.”
To learn more about Active.com, visit http://www.active.com/.
To learn more about Active.com's real-time Web site, visit http://realtime.active.com/.