Sinch provides a range of software development kits (SDKs) and application programming interfaces (APIs) that allow developers to integrate communication functions (specifically voice, verification, SMS, video, and instant messaging) into their mobile apps. The company is a spinoff of Rebtel, a Swedish voice-over-IP firm that provides cost-effective international calling services. Sinch has around 20,000 developers using its platform, and serves big brands such as Easy Taxi, Nimbuzz, Tango, and Truecaller, as well as startups and individual developers. The company has grown rapidly since it was established in 2014 and has its headquarters in San Francisco, with additional offices in Stockholm and London. To date, it has connected more than five billion voice minutes for more than 500 million users through its mobile communications technology.

Business intelligence (BI) is central to operations at Sinch. Its product, marketing, and sales departments, among others, use reports generated from big data analysis to support the decision-making process—and ultimately keep the company competitive. “BI plays an important role in helping us understand the business and identify new opportunities,” says Pantelis Parastatidis, business intelligence developer at Sinch. But gaining business insight was proving difficult with the firm’s on-premises architecture. To save storage space and enable faster querying, the data residing in its existing data warehouse was already aggregated—limiting the ability to drill down into the data as well as “not being best practice in data warehousing,” according to Parastatidis. He continues, “We just didn’t have the hardware or the staff resources to handle big data. As a result, our business teams couldn’t get a complete picture of the company’s operations, which in turn affected the quality of decision-making. We wanted to move to the cloud to support big data and our internal business units.”

With its business—and data—rapidly expanding, Sinch was looking for a highly available, terabyte-scale solution that was easy to manage and allowed it to get the most from its big data. The firm evaluated a number of cloud providers on the market, but chose Amazon Web Services (AWS) after being convinced of its ability to scale on demand and be available 24/7. “The clear cost structure of AWS was attractive. Plus, it looked like it offered the most stable services of all the cloud providers on the market,” says Parastatidis.

All Sinch AWS resources are orchestrated by AWS Data Pipeline. Records from Sinch SDKs are fed into Amazon Simple Storage Service (Amazon S3), from where the files are processed and parsed using Apache Pig. Each time the records are parsed, Amazon Elastic MapReduce (Amazon EMR) clusters are launched and then terminated once the parsing ends. From there, Amazon Elastic Compute Cloud (Amazon EC2) instances copy this data to corresponding tables in the staging area in Amazon Redshift. The data is subsequently hosted in an Amazon EC2 server. Amazon Simple Notification Service (Amazon SNS) provides notifications to the BI team so they can keep track of the solution easily through their email accounts.
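The launch, parse, terminate pattern described above can be illustrated with a minimal Python sketch. It makes no AWS calls: `transient_cluster`, `parse_records`, `run_batch`, and the cluster name are hypothetical stand-ins, and the tab-separated record format is an assumption rather than something the case study specifies.

```python
from contextlib import contextmanager

events = []  # ordered log of what the sketch did


@contextmanager
def transient_cluster(name):
    """Stand-in for a cluster that exists only for one parsing run."""
    events.append(f"launch {name}")
    try:
        yield name
    finally:
        # Terminate even if parsing raised, so no cluster is left running.
        events.append(f"terminate {name}")


def parse_records(raw_lines):
    """Hypothetical parser: split tab-separated records, skip blank lines."""
    return [line.split("\t") for line in raw_lines if line.strip()]


def run_batch(raw_lines):
    """One batch: parse on a short-lived cluster, then hand off to staging."""
    with transient_cluster("parse-cluster") as cluster:
        events.append(f"parse on {cluster}")
        rows = parse_records(raw_lines)
    # The copy step happens only after the cluster has been terminated.
    events.append("copy to staging")
    return rows
```

Calling `run_batch(["a\tb", "", "c\td"])` returns the two parsed rows and leaves `events` showing the cluster launched, used, and terminated before the copy step. Tying the cluster's lifetime to the parsing step is what keeps compute from sitting idle between batches.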

The illustration below shows the Sinch business intelligence architecture using AWS:


AWS technologies integrate with the firm’s on-premises architecture, which includes Microsoft SQL Server Integration Services (SSIS) for the extract, transform, load (ETL) process, and Microsoft SQL Server Report Builder 3.0 for reporting services.

“The project is ongoing but we’re already two-and-a-half months ahead of schedule and within our budget, which we defined as less than $50 a day for the whole BI architecture,” says Parastatidis. “This success is mainly due to how easy AWS is to use and the immediate support we’ve received.” Of AWS Support, he says, “It’s helped us out a number of times—we receive really detailed answers to our queries with good technical detail.”

With the BI solution migrated to AWS, Parastatidis and his team can help Sinch business units analyze big data and gain insight into operations. “The performance in Redshift is much better than in our on-premises data warehouse,” says Parastatidis. “It allows us to produce analytics, predictive models, and reporting in much greater detail.”

Processing is also fast, with data loading taking less than half the time. “In our on-premises architecture, loading data into the warehouse takes up to one hour and 10 minutes. This same process in the AWS infrastructure takes less than 30 minutes,” says Parastatidis.
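As a rough check on the load times quoted above, the short Python sketch below compares the two worst-case figures. The constant and function names are illustrative, and because the quote gives only upper bounds ("up to" 70 minutes and "less than" 30 minutes), the result is an approximation rather than a measured speedup.

```python
ON_PREM_LOAD_MIN = 70  # "up to one hour and 10 minutes"
AWS_LOAD_MIN = 30      # "less than 30 minutes", so an upper bound


def speedup(before_min, after_min):
    """How many times faster the new load is than the old one."""
    return before_min / after_min


def time_fraction(before_min, after_min):
    """New load time as a fraction of the old load time."""
    return after_min / before_min


if __name__ == "__main__":
    print(round(speedup(ON_PREM_LOAD_MIN, AWS_LOAD_MIN), 2))        # 2.33
    print(round(time_fraction(ON_PREM_LOAD_MIN, AWS_LOAD_MIN), 2))  # 0.43
```

On these figures the AWS load takes about 43 percent of the worst-case on-premises time, a speedup of roughly 2.3 times.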

Operating an infrastructure in the cloud is also helping Sinch save significant time and resources. “With hardware, we start with a cost-benefit analysis: what servers and disk storage we need and whether we have the money to buy them. This process is not only time consuming, but the price we eventually arrive at inevitably isn’t the whole cost. We pay $41 a day to run our data warehouse on Redshift and with Data Pipeline. AWS is a cost-effective option for us and pricing is transparent so we know what we’re spending. And because it’s so easy to use, I can set it all up and maintain it myself. I don’t have to spend money on a database administrator,” says Parastatidis.
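To put the daily figures in annual terms, the following Python sketch multiplies out the $41 running cost and the budget of less than $50 a day quoted earlier. The names are illustrative, and it assumes the daily cost stays flat over a 365-day year, which the case study does not claim.

```python
DAILY_COST_USD = 41    # quoted running cost for Redshift and Data Pipeline
DAILY_BUDGET_USD = 50  # quoted budget ceiling for the whole BI architecture
DAYS_PER_YEAR = 365    # assumption: flat cost across a non-leap year


def annual(daily_usd, days=DAYS_PER_YEAR):
    """Annualize a flat daily amount."""
    return daily_usd * days


def headroom_fraction(cost, budget):
    """Share of the budget left unspent."""
    return (budget - cost) / budget


if __name__ == "__main__":
    print(annual(DAILY_COST_USD))    # 14965
    print(annual(DAILY_BUDGET_USD))  # 18250
    print(round(headroom_fraction(DAILY_COST_USD, DAILY_BUDGET_USD), 2))  # 0.18
```

Under that assumption the warehouse costs about $14,965 a year against a ceiling of $18,250, leaving roughly 18 percent of the budget unspent.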

Although running costs were a big factor in the company’s move to the cloud, the decision wasn’t based purely on price. “High availability is vital for business intelligence, so having an environment that’s stable is very important. Our data warehouse needs to be available 24/7 to all stakeholders—both in Europe and the US—to enable them to get the reports they need when they need them,” says Parastatidis.

Sinch is also impressed with the on-demand scale it has with AWS. “It’s really straightforward to scale in Amazon Redshift,” says Parastatidis. “If we need additional servers, they’re only a few clicks away. It’s a completely different world compared to provisioning the hardware. For example, with Redshift, if we want to run SQL queries using EC2 instances, we just choose from the options in the drop-down list, and pick the type of instance we need. It takes seconds.” And although data is growing daily, Sinch doesn’t have to continually analyze whether it needs more hardware or staff to handle that growth. “Our Redshift storage is growing by around 2 to 2.5 GB a day,” says Parastatidis. “We currently hold 232 GB of data in Redshift, and this will grow as we add other types of business information to the system.”
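The growth figures in the quote lend themselves to a simple projection. The Python sketch below estimates how many days it would take to reach a given storage size at the quoted rates. The 1,000 GB target is a hypothetical threshold chosen only for illustration, and the linear model is an assumption. Because Sinch expects to add other types of data, actual growth could be faster.

```python
import math

CURRENT_GB = 232     # quoted current Redshift storage
GROWTH_LOW_GB = 2.0  # quoted lower daily growth rate
GROWTH_HIGH_GB = 2.5  # quoted upper daily growth rate


def days_until(target_gb, daily_growth_gb, current_gb=CURRENT_GB):
    """Whole days until storage reaches target_gb under linear growth."""
    remaining = target_gb - current_gb
    if remaining <= 0:
        return 0
    return math.ceil(remaining / daily_growth_gb)


if __name__ == "__main__":
    target = 1000  # hypothetical threshold, not a figure from the case study
    print(days_until(target, GROWTH_HIGH_GB))  # 308
    print(days_until(target, GROWTH_LOW_GB))   # 384
```

At the quoted rates, the warehouse would reach that illustrative threshold in roughly 308 to 384 days.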

Sinch is already hoping to expand its business intelligence solution and increase its use of AWS to provide even greater insight for its internal customers. Currently, the data held in AWS relates mainly to products, but the firm is looking to include marketing, sales, and financial information. “In the future we want to build more structure into our reporting as well as add data that’s related to different parts of the business. This will give us a more comprehensive view of operations and a better basis for decision-making,” says Parastatidis.

