Data visualization software company Tableau has been a perennial innovator in the data analytics space, offering some of the largest companies in the world a variety of ways to analyze and interact with data. When Tableau introduced Tableau Online, a fully managed software-as-a-service (SaaS) offering, the company observed an immediate and exponential increase in customer demand, but its hybrid data center infrastructure could not support Tableau Online’s rapid growth and expanding global footprint.
To solve those challenges, Tableau created a pilot point of delivery (PoD) module on a single cloud solution built on Amazon Web Services (AWS). As a result, Tableau improved performance, scaled to demand, and reduced operating costs. The success of the pilot convinced Tableau to go all in on AWS for Tableau Online, building a cost-effective infrastructure with near-infinite scale and redundancy and drastically improved performance. This solution also raised availability to 99.9 percent and provided the agility to launch quickly in new regions. Going all in on AWS ultimately enabled Tableau’s modernization journey, giving the company the tools to meet the high demand for Tableau Online and to fast-track its SaaS transformation. Tableau later extended its AWS migration to include all its developer productivity systems.
Opportunity | Breaking Out of an On-Premises System and into the Cloud
Founded in 2003, Tableau quickly established a reputation for sophisticated data visualizations that were easy to create. Traditionally, customers could choose Tableau and deploy where they wanted—on Windows or Linux, on premises or in public clouds—and could connect to data inside and outside their own data centers. To provide its customers with more flexibility and freedom, in 2013 Tableau launched Tableau Online, a fully hosted solution, so that customers would never have to configure servers, manage software upgrades, or scale hardware capacity. Customers could get up and running in minutes and seamlessly add users as their needs grew.
Initially, Tableau Online used an on-premises colocation data center and managed service provider facilities. Each of these facilities ran a few Tableau Online service instances, or PoDs. Over the next 3 years, Tableau Online expanded quickly, but the more its business grew, the more challenges it encountered with scaling its facilities. Tableau Online needed new PoDs in the US East and Europe (Ireland and London) regions, but setting them up using the hybrid data center infrastructure was challenging, and procuring hardware usually took 6–9 months—3 months if expedited.
Tableau wanted a single simple solution for Tableau Online that would improve performance and scalability, bring the PoD closer to customers, and offer them data locality, thereby increasing visualization speeds. Tableau saw that the global presence of AWS could facilitate data localities for Tableau’s widespread customers. Performance testing revealed that AWS virtual machines outdid Tableau’s existing data center virtual machines in memory performance and flexibility. Wanting to migrate Tableau Online to AWS before having to renew the lease extensions of its Santa Clara and Dublin data centers, Tableau began its first foray into the cloud in 2016.
“Migrating to AWS was seamless, and customers started having three to five times better performance across the board.”
Pankaj Dhingra, Vice President of Cloud Engineering, Tableau
Solution | Seeing Instant Results on AWS
Tableau didn’t waste any time with its lift-and-shift Tableau Online migration. “We did one clean swoop to move everything—all the customer workloads and processes for Tableau Online—to AWS,” says Pankaj Dhingra, vice president of cloud engineering at Tableau. First, the migration team designed its network layer, providing security and connectivity among its environments on Amazon Virtual Private Cloud (Amazon VPC), which helps businesses launch AWS resources in logically isolated virtual networks that they define.
The next step of the migration involved Tableau Online’s storage layer. The company opted for Amazon Relational Database Service (Amazon RDS) for PostgreSQL, which makes it simple to set up, operate, and scale PostgreSQL deployments in the cloud. For compute, Tableau Online relied on Amazon Elastic Compute Cloud (Amazon EC2), a web service that provides secure, resizable compute capacity in the cloud. After discovering that the cloud-native file system on Windows File Server didn’t meet its performance needs, Tableau decided to use a distributed replicated storage solution backed by performance-enhancing Amazon Elastic Block Store (Amazon EBS) volumes, which are durable block-level storage devices that can be attached to one instance or multiple instances at the same time. Amazon EBS is designed for use with Amazon EC2 instances to handle throughput-intensive and transaction-intensive workloads at any scale. As a result, queries into Tableau’s database, Hyper, became four times faster.
“Migrating to AWS was seamless,” says Dhingra, “and customers started having three to five times better performance across the board.” The longest load time (99th percentile) for visualizations dropped from 6.6 to 2.2 seconds, and shorter load times (50th percentile) dropped from 0.8 to 0.18 seconds. In one example, the median viz load time decreased from 41.10 seconds to 18.62 seconds, and the median load time distribution in the US-Seattle region decreased from 65.9 seconds to 42.6 seconds. “We got excellent help and resources from the AWS team,” says Dhingra. “AWS Solutions Architects helped us review our designs and future-proof the things that we were doing.” To make sure the cloud solution would perform as efficiently as possible, Tableau tested more than 20,000 Tableau vizzes, data import speeds, compute power, extract refreshes, and data center access speeds. Ultimately, the migration enabled Tableau to triple in scale: for example, the PoD in the US East (N. Virginia) region can support about 100,000 users, whereas the old architecture could handle 30,000–40,000 users per PoD.
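The improvement factors behind these figures can be checked with a little arithmetic. The following is a minimal sketch that only restates the numbers quoted above; the dictionary keys and the `speedup` helper are illustrative names, not anything taken from Tableau or AWS.

```python
# Load times in seconds as (before, after), copied from the figures quoted above.
load_times = {
    "p99 viz load": (6.6, 2.2),
    "p50 viz load": (0.8, 0.18),
    "median viz load (one example)": (41.10, 18.62),
    "median load, US-Seattle": (65.9, 42.6),
}


def speedup(before: float, after: float) -> float:
    """Return how many times faster the 'after' time is than the 'before' time."""
    return before / after


# Improvement factor for each metric, rounded to two decimal places.
speedups = {name: round(speedup(b, a), 2) for name, (b, a) in load_times.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor}x faster")

# Users per PoD: the old architecture handled 30,000-40,000; the new PoD about 100,000.
old_low, old_high, new_capacity = 30_000, 40_000, 100_000
scale_range = (round(new_capacity / old_high, 2), round(new_capacity / old_low, 2))
print(f"PoD capacity grew by {scale_range[0]}x to {scale_range[1]}x")
```

By this calculation the two percentile figures work out to roughly 3x and 4.4x, which sits inside the quoted three-to-five-times range, while the two individual median examples come out lower, at about 2.2x and 1.5x. The per-PoD capacity ratio of 2.5x to 3.3x is consistent with the description of tripling in scale.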
The company also gained redundancy and georeplication through high-bandwidth, low-latency networking between AWS Regions and Availability Zones. Visualization load times for Tableau Online improved by two to three times, an observation that finalized Tableau’s decision to go all in on AWS. The company can bring up a new PoD in a new geography in a matter of days using AWS, compared to months on premises. In addition to launching the US East region PoD for Tableau Online in 2019, the company launched two PoDs in 2020: one in the Asia Pacific (Southeast-2) region and one in the Asia Pacific (Northeast-1) region.
Migrating Developer Productivity Systems to AWS
In early 2017, Tableau’s foundational systems still ran in on-premises data centers. These systems included a grid of thousands of virtual machines for developers to develop and test code, as well as performance testing labs for benchmarking to ensure high-quality and trustworthy software. Running them required millions of dollars in new hardware every quarter, and the company was budgeting for peak load capacity even though almost 80 percent of the compute capacity remained unused at night. To gain flexibility, Tableau had purchased hundreds of racks, slotted thousands of servers, and kept building data center skills to manage it all. But the company wanted a fully managed, cost-effective solution: one that would not only provide flexibility and agility but also scale compute capacity quickly.
“Our developer productivity system needs were growing at a fast clip, and we spent most of our time procuring new hardware and adding it to our data centers,” says Dhingra. “And by the time we made it ready and functional, we had already outgrown that capacity.”
Tableau realized that although AWS would cost the same as its data centers, the move would be more beneficial: AWS managed services would take over hardware maintenance, including patching and updates, saving Tableau time (including a 20 percent reduction in time spent by the operations team) and resources.
Outcome | Improving Operations and Expanding Business on AWS
By moving to AWS, Tableau adopted a standardized compute solution that can be upgraded to the newest versions while enabling quick experiments to determine the exact compute the company needs. Tableau used this experimenting capability to rightsize Amazon EC2 instances for different workloads. For example, within Tableau Online, rightsizing saved the company about $1 million in 2019. Now, Tableau can do the equivalent of adding hardware in an on-premises system, but on AWS the company can provision capacity much more quickly and without interrupting service to internal or external customers. In 2020, Tableau Online consistently exceeded its 99.9 percent availability target.
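To make the idea of rightsizing concrete, here is a minimal sketch of one common approach: pick the cheapest size that still covers a workload's observed peak plus some headroom. This is not Tableau's actual method, which the case study does not describe. The size names, prices, `CATALOG`, and the `rightsize` function are all hypothetical and do not correspond to real Amazon EC2 instance types or pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSize:
    name: str           # hypothetical size label, not a real EC2 instance type
    vcpus: int
    memory_gib: int
    hourly_cost: float  # hypothetical price, for illustration only


# Hypothetical catalog of available sizes.
CATALOG = [
    InstanceSize("small", 2, 8, 0.10),
    InstanceSize("medium", 4, 16, 0.20),
    InstanceSize("large", 8, 32, 0.40),
    InstanceSize("xlarge", 16, 64, 0.80),
]


def rightsize(peak_vcpus, peak_memory_gib, headroom=0.2, catalog=CATALOG):
    """Return the cheapest size that covers the observed peak plus headroom."""
    need_cpu = peak_vcpus * (1 + headroom)
    need_mem = peak_memory_gib * (1 + headroom)
    candidates = [
        size for size in catalog
        if size.vcpus >= need_cpu and size.memory_gib >= need_mem
    ]
    if not candidates:
        raise ValueError("no size in the catalog fits this workload")
    return min(candidates, key=lambda size: size.hourly_cost)


# A workload peaking at 3 vCPUs and 10 GiB needs 3.6 vCPUs and 12 GiB with headroom.
choice = rightsize(peak_vcpus=3, peak_memory_gib=10)
print(choice.name)
```

In practice the peak figures would come from measured utilization over a representative period, and the headroom would be tuned per workload; both are assumptions of this sketch.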
On AWS, Tableau increased the performance, reliability, and scalability of Tableau Online while simultaneously reducing costs and streamlining operations. The AWS-powered Tableau Online compute environment also challenged customers’ perceptions of cloud versus on-premises capabilities. “When customers tried Tableau Online, they found the performance to be better than their on-premises hosted Tableau Server,” says Dhingra. “Even though Tableau Online is a multitenant environment, we were able to guarantee better performance.”
Tableau is a worldwide provider of business intelligence software to companies ranging from global enterprises to early-stage startups and small businesses. Headquartered in Seattle, Washington, Tableau has 17 international offices in Europe and the Asia-Pacific region.
AWS Services Used
Amazon Elastic Compute Cloud (Amazon EC2)
Amazon Elastic Compute Cloud (Amazon EC2) offers the broadest and deepest compute platform, with over 500 instances and choice of the latest processor, storage, networking, operating system, and purchase model to help you best match the needs of your workload.
Amazon Virtual Private Cloud (Amazon VPC)
Amazon Virtual Private Cloud (Amazon VPC) gives you full control over your virtual networking environment, including resource placement, connectivity, and security.
Amazon Elastic Block Store (Amazon EBS)
Amazon Elastic Block Store (Amazon EBS) is an easy-to-use, scalable, high-performance block-storage service designed for Amazon Elastic Compute Cloud (Amazon EC2).
Amazon RDS for PostgreSQL
Amazon RDS for PostgreSQL frees you up to focus on application development by managing time-consuming database administration tasks including backups, software patching, monitoring, scaling, and replication.