Migrating your Startup from Firebase to AWS
Guest post by J. Michael Bako, Sr. Solutions Architect, Startups, AWS
Getting a startup off the ground is all about rapid iteration and getting to market as fast as possible. The decision on where to build your product is often made quickly without lengthy evaluation or long-term strategic consideration, based on factors like credit offerings, investor partnerships, and founder familiarity so you can start building without delay.
As such, we occasionally run into startups that built their initial MVP on Firebase but want to switch to AWS to operate at scale with better data quality and reliability guarantees, and at lower cost. Because Firebase consists of proprietary services, APIs, and an SDK, a migration to AWS requires application refactoring: introducing a new architecture built on AWS services, and rewriting the parts of the codebase that use them.
To minimize the disruption of this refactoring, this guide will help you identify what AWS services are best suited for your startup’s new architecture along with some implementation strategies to ease and accelerate the cutover.
Translating Firebase Capabilities to AWS
Firebase consists of a collection of products and integrations that provide capabilities to your application. At its core, it enables a 2-tier web architecture: a client-side web or mobile front-end syncs data directly to Cloud Firestore or the Realtime Database. Additional products then enable supporting capabilities, such as user authentication, push notifications, and crash analytics.
AWS Amplify and the AWS CDK
The fastest way to build and deploy a similar set of capabilities on AWS is to leverage AWS Amplify and the AWS Cloud Development Kit (AWS CDK). Unlike Firebase, where the products and integrations are part of a single platform, Amplify and the AWS CDK provide an abstraction layer over many different AWS services, each with its own dedicated roadmap, support, and engineering teams. Amplify provides a development framework, SDK, code generation, and a DevOps pipeline that make it easy to define back-end AWS services and integrate them with your front-end web or mobile client. The AWS CDK is for authoring Infrastructure as Code (IaC) templates beyond those generated through Amplify, with many pre-built constructs for common architectures that you can reuse to minimize the coding effort. Used together, they let you quickly build a holistic architecture that follows best practices, minimizes operational overhead, and maximizes the ability of your application to scale in a cost-efficient way.
For an example, you can read about how fintech startup Stedi accelerated their AWS development with Amplify and the CDK to build a commercial trading network to automate trillions of dollars in B2B transactions.
The AWS Amplify Experience
Working with Amplify has a developer experience very similar to that of Firebase. It predominantly involves using the Amplify CLI to perform configuration, code generation, and deployment tasks for the various capabilities it supports.
For example, after installing the CLI you initialize Amplify within your Firebase application project:
$ amplify init
To add a back-end REST or GraphQL API:
$ amplify add api
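To deploy the back-end resources you have defined to your AWS account (once you have run amplify add hosting, amplify publish additionally builds and publishes the front-end to Amplify Hosting):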
$ amplify push
These commands generate the IaC templates needed to create the back-end AWS service resources, along with any configuration and other supporting files. The generated files are all conveniently located in an ./amplify folder off your project root directory.
After using the CLI to add capabilities to your application, you then update your code to make use of the Amplify SDK to integrate with them accordingly. For a more complete example, you can follow the “Getting Started” tutorial in the Amplify documentation.
Some of the primary integrations provided by AWS Amplify include:
Amplify provides a fully managed service for deploying and hosting a full-stack web application, with built-in CI/CD workflows. In addition to the back-end resources as depicted in the example above (and more), this includes hosting of front-end single page applications (SPA) using frameworks such as React, Angular, Vue, or Gatsby.
Amplify leverages Amazon Cognito to provide new user onboarding flows, a fully managed user directory, and pre-built sign-up, sign-in, multi-factor authentication, and password retrieval functions. Amazon Cognito also supports identity federation for both social providers, such as Facebook and Google, and any provider that supports the SAML or OpenID Connect (OIDC) protocols.
User Engagement and Analytics
Amplify uses Amazon Pinpoint to track user activity on your web/mobile application and helps you create marketing segments for targeted campaigns. Amazon Pinpoint enables communications over channels that include email, SMS, push notifications, and voice, with success metrics captured and presented in pre-built dashboards and reports.
Amplify simplifies incorporating many types of artificial intelligence into your application, without the need to train and deploy custom ML models. AWS has many fully managed AI services with capabilities including text translation, speech generation from text, entity recognition in images, text interpretation, and speech-to-text transcription, all of which are easily configured and consumed using Amplify. Through the Amazon Lex service, you can also incorporate conversational bots using the same intelligence that powers Alexa.
Check out the full list of features and capabilities managed through AWS Amplify.
From 2-tier to 3-tier
The 2-tier nature of Firebase invariably leads to having business logic reside in the client as your application grows in sophistication, which introduces complexity with version management and scalability over time. Server-side logic can be expressed using Cloud Functions, but only in response to changes in the database.
You can leverage Amplify and your migration to AWS to refactor into a 3-tier architecture – a web/mobile front-end, microservices for business logic, and any one (or more) of our 15 purpose-built databases as your data needs dictate. As shown in the example above, Amplify enables you to create both GraphQL APIs using AWS AppSync and REST APIs using Amazon API Gateway that integrate with your business logic defined in AWS Lambda functions. Not shown, you can also integrate with virtual machines on Amazon EC2 or containers that are orchestrated using Amazon ECS or Amazon EKS.
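As a rough illustration of moving a business rule out of the client and into the middle tier, here is a minimal sketch of a Lambda function behind an API Gateway proxy integration. The order-quote rule, the field names (`items`, `unit_price_cents`, `quantity`), and the tax rate are all hypothetical; only the proxy request/response shape (`body`, `statusCode`, `headers`) follows the API Gateway convention.

```python
import json


def quote_order(items, tax_rate_bps=1000):
    """Hypothetical business rule: total an order in integer cents.

    tax_rate_bps is the tax rate in basis points (1000 = 10%), an
    illustrative value only.
    """
    subtotal = sum(i["unit_price_cents"] * i["quantity"] for i in items)
    tax = subtotal * tax_rate_bps // 10000
    return {"subtotal_cents": subtotal, "tax_cents": tax, "total_cents": subtotal + tax}


def lambda_handler(event, context):
    """Entry point for an API Gateway Lambda proxy integration."""
    headers = {"Content-Type": "application/json"}
    try:
        # With proxy integration, the HTTP request body arrives as a string.
        payload = json.loads(event.get("body") or "{}")
        quote = quote_order(payload["items"])
    except (KeyError, TypeError, ValueError):
        return {"statusCode": 400, "headers": headers,
                "body": json.dumps({"error": "invalid order"})}
    return {"statusCode": 200, "headers": headers, "body": json.dumps(quote)}
```

Because the rule now lives server-side, every client version gets the same result, and changing the rule no longer requires shipping a new client build.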
We mentioned earlier that AWS Amplify makes it easy to integrate your front-end with the back-end AWS services common to web and mobile architectures. However, you are certainly not limited to the AWS services it supports. With the AWS CDK, you can provision any AWS service you wish and integrate it into your application with the standard AWS SDK. Some additional services that are commonly used include:
Mobile & Web Testing
AWS Device Farm provides access to run tests across a fleet of real mobile devices and desktop browsers. You can select from pre-built test suites or provide your own tests using popular frameworks such as Appium, Calabash, and Espresso. You can test multiple versions of desktop browsers in parallel across Chrome, IE, and Firefox using Selenium.
AWS X-Ray provides an end-to-end view of requests as they travel through each of the components of your architecture. The data captured and visualized helps with both performance troubleshooting and debugging, including detailed error information and latency reporting.
For customers using BigQuery on Google Cloud, as shown in the example above, or wishing to incorporate more advanced analytics capabilities into their applications, AWS provides an extensive set of services. Please see an overview of these, along with migration strategies.
As mentioned in the introduction, migrating to AWS from Firebase requires some refactoring of your application. To minimize the disruption to their business, our customers favor running both their Firebase and AWS environments in parallel for a short period of time while leveraging some techniques described below to keep them in sync.
Below is a description of the common migration strategies for various aspects of your application. Luckily, these strategies (and more) have been fully implemented by my colleague and fellow Sr. Startup SA, Ben Shank. You can find the strategies in his open-source repo hosted on GitHub.
In addition, Shank has created a new workshop where you can practice using his tools to refactor a demo application from Firebase to AWS using Amplify and the CDK.
If you encounter any problems or have feedback, Shank is eager to hear about your experiences – please do file issues, PRs, or contact him through GitHub as appropriate!
Migrating Firebase Users to Amazon Cognito
Firebase does not support exporting user passwords, and thus a bulk import to Cognito will flag users as requiring a password reset.
To work around this, Cognito provides the ability to invoke a Lambda function that migrates users the first time they try to sign in to the User Pool. You use the Firebase SDKs in this Lambda function to interface with the Firebase Admin API and authenticate the user with the credentials they supplied. If authentication succeeds, you return the user record to Cognito with its status set to Confirmed, which lets that user sign in seamlessly from then on. You can read more details about this implementation in our Cognito documentation.
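The flow above could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation from the referenced repo: to stay dependency-free it swaps the Firebase SDK for the Firebase Auth REST sign-in endpoint, and the `FIREBASE_WEB_API_KEY` environment variable and the `migrate_user` helper are hypothetical names. The `triggerSource`, `finalUserStatus`, and `messageAction` fields are the ones Cognito's user migration trigger uses.

```python
import json
import os
import urllib.error
import urllib.request

FIREBASE_SIGN_IN_URL = (
    "https://identitytoolkit.googleapis.com/v1/accounts:signInWithPassword"
)


def verify_with_firebase(email, password):
    """Return the Firebase account record if the credentials are valid, else None."""
    api_key = os.environ["FIREBASE_WEB_API_KEY"]  # assumed variable name
    body = json.dumps(
        {"email": email, "password": password, "returnSecureToken": True}
    ).encode()
    request = urllib.request.Request(
        f"{FIREBASE_SIGN_IN_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return json.load(response)
    except urllib.error.HTTPError:
        return None  # Firebase rejected the credentials


def migrate_user(event, verify=verify_with_firebase):
    """Fill in the Cognito response for a first-time sign-in."""
    if event["triggerSource"] != "UserMigration_Authentication":
        return event  # e.g. the forgot-password flow, not handled in this sketch
    account = verify(event["userName"], event["request"]["password"])
    if account is None:
        raise ValueError("Bad credentials")  # Cognito then fails the sign-in
    event["response"]["userAttributes"] = {
        "email": account["email"],
        "email_verified": "true",
    }
    event["response"]["finalUserStatus"] = "CONFIRMED"
    event["response"]["messageAction"] = "SUPPRESS"  # skip the welcome message
    return event


def lambda_handler(event, context):
    return migrate_user(event)
```

Passing the verifier in as a parameter keeps the Cognito-facing logic testable without network access; in production only `lambda_handler` would be wired to the trigger.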
Migrating Data from Cloud Firestore to Amazon DynamoDB
Cloud Firestore is a document-oriented database that is optimized for small documents. As such, the most common migration target on AWS is Amazon DynamoDB. There are two angles of attack for migrating data from Firestore to DynamoDB: a bulk load of your existing data, and ongoing replication of changes until you are ready for full cutover.
While bulk loading sounds like it may entail a simple export/import operation, the data models of the two databases are not fully compatible with each other. As such, this process requires an ETL (Extract, Transform, and Load) pipeline. Our recommendation is to build this pipeline on AWS using a combination of AWS Step Functions, Lambda, and Amazon SQS queues. You create a Lambda function for each of the ETL steps, have each one write its results into a corresponding SQS queue, and orchestrate the pipeline using Step Functions. This gives you durability and replayability of the jobs as they progress through the pipeline, minimizes operational overhead by relying on fully managed services, and keeps costs low.
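To make the data-model mismatch concrete, here is a minimal sketch of what the Transform step might do: convert an exported Firestore document (already parsed into plain Python values) into DynamoDB's typed attribute-value format. The `pk` key layout (`collection#doc_id`) and the function names are assumptions for illustration; a real table design would follow your access patterns.

```python
from datetime import datetime


def to_attribute_value(value):
    """Map one Python value onto a DynamoDB attribute value."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # DynamoDB transports numbers as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, datetime):
        return {"S": value.isoformat()}  # no native timestamp type in DynamoDB
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (list, tuple)):
        return {"L": [to_attribute_value(v) for v in value]}
    # Firestore-specific types (references, geo points, ...) need a decision.
    raise TypeError(f"no DynamoDB mapping for {type(value).__name__}")


def transform_document(collection, doc_id, fields):
    """Build a DynamoDB item from one Firestore document."""
    item = {name: to_attribute_value(value) for name, value in fields.items()}
    item["pk"] = {"S": f"{collection}#{doc_id}"}  # hypothetical key design
    return item
```

Raising on unmapped types, rather than silently stringifying them, surfaces the incompatible fields early so the pipeline can route those documents to a retry or review queue.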
Ongoing change replication is a bit simpler, leveraging the Firestore listener capability. You create a small application to run as a background process that provides a callback to the Firebase SDK onSnapshot() method to write the corresponding changes to DynamoDB. Optionally, you can trigger an AppSync no-op mutation to synchronize these changes to your connected clients.
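A minimal sketch of such a listener, written here in Python (where the SDK method is spelled `on_snapshot`), might look like the following. It assumes the `google-cloud-firestore` and `boto3` packages for the real wiring, which is shown only in comments; the `orders` collection and table names, the `pk` key, and the helper names are hypothetical.

```python
import json
from decimal import Decimal


def to_dynamo_item(doc_id, data):
    """Prepare a Firestore document dict for a boto3 Table.put_item call.

    A JSON round trip turns floats into Decimal (the boto3 resource API
    rejects float) and stringifies non-JSON values such as timestamps.
    """
    item = json.loads(json.dumps(data, default=str), parse_float=Decimal)
    item["pk"] = doc_id  # hypothetical partition key
    return item


def make_replicator(table):
    """Return a snapshot callback that mirrors each change into `table`."""

    def on_snapshot(collection_snapshot, changes, read_time):
        for change in changes:
            doc = change.document
            if change.type.name == "REMOVED":
                table.delete_item(Key={"pk": doc.id})
            else:  # ADDED or MODIFIED: overwrite the whole item
                table.put_item(Item=to_dynamo_item(doc.id, doc.to_dict()))

    return on_snapshot


# Wiring sketch (requires installed, credentialed SDKs; not run here):
#   import boto3
#   from google.cloud import firestore
#   table = boto3.resource("dynamodb").Table("orders")
#   watch = firestore.Client().collection("orders").on_snapshot(
#       make_replicator(table))
```

Writes are idempotent full-item overwrites, so replaying a snapshot after a restart of the background process converges to the same DynamoDB state.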
Migrating Object and Analytics Data
There are many utilities available to synchronize objects from Google Cloud Storage to Amazon S3, including some capabilities within the Google Cloud SDK itself. As mentioned earlier, we have a comprehensive blog post that covers migrating analytics data and processes from Google Cloud to AWS. Note that while it is free to load data into AWS, there are charges for exporting data out from Google Cloud that you’ll want to account for.
This post covered the common capabilities that Firebase provides and how to achieve similar outcomes on AWS. Using AWS Amplify and the AWS CDK accelerates the refactoring and deployment of your application, while minimizing your operational overhead. The migration techniques covered help you avoid downtime and allow you to dual-host your application until you are ready to perform a full cutover. We also encourage you to use the toolkit created by Ben Shank to reduce the overall coding effort required. Once fully running on AWS, we hope you enjoy the full breadth and depth of capability at your fingertips and the strong foundation you’ve created by following this guide. Have fun, and build on!
About the Author
J. Michael (“Jay”) is a Sr. Startup Solutions Architect at AWS with over 20 years of experience as an engineer, architect, and executive. Over the past 5 years, Jay has advised some of the world’s most recognized startup brands on migration strategy, architecture best practices, and optimization and governance on AWS. When not hacking away on his computer, Jay can be found hacking away at weeds on his 5-acre hobby farm outside Seattle, WA.