AWS Startups Blog
Scale your startup with Serverless on AWS
In 2019, cinch launched as a consumer-facing online platform that helped customers in the UK find, buy, and sell used cars. In late 2020, cinch spotted a trend in the used car sales market, with new players going direct-to-consumer in the US and Europe. The pandemic accelerated both the trend and customer demand for used cars. In response, cinch announced a shift in its business model toward a direct-to-consumer car retail marketplace. The opportunity was huge, but it required a domain (and organizational) transformation.
To deliver their new vision, the cinch team built a prototype on their existing containers platform. But with fewer than 30 employees, they got bogged down by managing it. After experimenting with different options, they decided to pivot to a serverless architecture on Amazon Web Services (AWS) to reduce complexity. “Once I saw that an application was one file in serverless framework against hundreds in Kubernetes, it was a no-brainer,” says Jaz Chana, Technology Director at cinch.
By using AWS to scale their infrastructure, the cinch team was able to focus their cognitive load on improving the platform, quickly release new features, and rebuild existing ones based on real-world customer insight.
With their new architecture, cinch was able to pivot their business to the new model in 6 months, increase traffic by more than 2.5x (from 6,000 to 16,000 requests per minute), and reduce latency. They went from selling hundreds of cars within days to growing by a factor of 100 within a few weeks.
Save time and effort with a walking skeleton
To save time in the initial architecture and design, the team built a “walking skeleton”: a tiny implementation of a larger system that performs a small end-to-end function. A walking skeleton doesn’t aim for perfection, but must be robust enough to pull together its main architectural components. The architecture and the functionality can then evolve in parallel.
As Figure 1 shows, the walking skeleton initially relied on Amazon DynamoDB, Amazon Simple Storage Service (Amazon S3), and an AWS Lambda function. The Lambda function loaded an array of vehicles from Amazon S3 and performed any sorting and filtering in memory. This approach worked for simple filtering and supported a few thousand cars.
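The in-memory approach described above can be sketched roughly as follows. This is a minimal illustration, not cinch's actual code: the vehicle fields (`id`, `make`, `price`), the function names, and the filter parameters are all assumptions made for the example. The S3 call uses boto3's `get_object` and is imported lazily so the filtering logic can run without AWS access.

```python
import json
from typing import Any, Optional

Vehicle = dict[str, Any]  # hypothetical shape: {"id": ..., "make": ..., "price": ...}


def load_vehicles_from_s3(bucket: str, key: str) -> list[Vehicle]:
    """Fetch a JSON array of vehicles from S3 (needs boto3 and AWS credentials)."""
    import boto3  # lazy import so the pure functions below run without it

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return json.loads(body)


def search_vehicles(
    vehicles: list[Vehicle],
    make: Optional[str] = None,
    max_price: Optional[int] = None,
    sort_by: str = "price",
) -> list[Vehicle]:
    """Filter and sort entirely in memory; every request scans the full list."""
    matches = [
        v for v in vehicles
        if (make is None or v["make"] == make)
        and (max_price is None or v["price"] <= max_price)
    ]
    return sorted(matches, key=lambda v: v[sort_by])


# Illustrative data standing in for the array stored in S3.
SAMPLE = [
    {"id": "a1", "make": "Ford", "price": 9500},
    {"id": "b2", "make": "BMW", "price": 18000},
    {"id": "c3", "make": "Ford", "price": 7200},
]

cheap_fords = search_vehicles(SAMPLE, make="Ford", max_price=10000)
print([v["id"] for v in cheap_fords])  # ['c3', 'a1']
```

A linear scan like this is easy to reason about and is usually adequate for a few thousand records, which is consistent with the scale the walking skeleton supported.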
But as the application gained popularity and the team gained real-world insights, they saw that the existing architecture couldn’t support the complex aggregation and facet counting across multiple selectable filters that customers wanted.
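To make "facet counting on multiple selectable filters" concrete, here is a small sketch of what it could involve if done in memory. It assumes one common convention, which the article does not confirm for cinch: the counts shown for a facet ignore that facet's own selection, so that picking "Ford" still shows how many BMWs are available. The field names and data are hypothetical.

```python
from collections import Counter
from typing import Any

Vehicle = dict[str, Any]
Selections = dict[str, set[str]]  # facet field -> values the customer has ticked


def matches(vehicle: Vehicle, selections: Selections, skip: str = "") -> bool:
    """True if the vehicle satisfies every selected facet except `skip`."""
    return all(
        vehicle[field] in chosen
        for field, chosen in selections.items()
        if field != skip and chosen
    )


def facet_counts(
    vehicles: list[Vehicle], facets: list[str], selections: Selections
) -> dict[str, dict[str, int]]:
    """Count values per facet, applying the selections made in the other facets."""
    return {
        facet: dict(
            Counter(v[facet] for v in vehicles if matches(v, selections, skip=facet))
        )
        for facet in facets
    }


SAMPLE = [
    {"make": "Ford", "fuel": "petrol"},
    {"make": "Ford", "fuel": "diesel"},
    {"make": "BMW", "fuel": "petrol"},
    {"make": "BMW", "fuel": "electric"},
]

counts = facet_counts(SAMPLE, ["make", "fuel"], {"make": {"Ford"}})
print(counts)
# {'make': {'Ford': 2, 'BMW': 2}, 'fuel': {'petrol': 1, 'diesel': 1}}
```

Each facet needs its own pass over the full vehicle list, so the work per request grows with both the number of vehicles and the number of facets. That is one reason this kind of query is usually delegated to a search engine.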
With serverless you can rebuild instead of refactor
As shown in Figure 2, to address the increased requirements, the team created a new repository instead of iterating on their existing codebase. They also introduced Amazon OpenSearch Service to deliver advanced filtering at scale. Amazon EventBridge decoupled the search service from the rest of the platform, which let the search team replace the service without affecting other teams.
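As an illustration of the kind of request a search engine can take over, the sketch below builds an OpenSearch query body that combines `terms` filters with `terms` aggregations for facet counts. The field names and the helper are hypothetical, and the fields are assumed to be mapped as `keyword`; the article does not describe cinch's actual index or queries.

```python
import json
from typing import Any


def build_search_body(
    selections: dict[str, list[str]], facets: list[str], size: int = 20
) -> dict[str, Any]:
    """Build a query body: filter on the selected values, count values per facet."""
    filters = [
        {"terms": {field: values}}
        for field, values in selections.items()
        if values  # skip facets with nothing selected
    ]
    return {
        "size": size,
        "query": {"bool": {"filter": filters}},
        # One terms aggregation per facet returns the value counts.
        "aggs": {facet: {"terms": {"field": facet}} for facet in facets},
    }


body = build_search_body({"make": ["Ford"], "fuel": []}, ["make", "fuel"])
print(json.dumps(body, indent=2))
```

In this simple form every filter also narrows every aggregation. A multi-select variant, where a facet's counts ignore its own selection, would typically move the filters into `post_filter` or wrap each aggregation in its own `filter` aggregation.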
With this revised architecture, cinch can autonomously maintain integrations across domain boundaries. And, as they continue to scale, the on-demand, usage-based pricing model of serverless services allows them to adopt or decommission services based on their requirements, without having to consider contract terms or the cost of running multiple architectures in parallel.
“It might sound like this approach involved more effort, but it ended up being much cleaner. Ultimately, I don’t even think it required more work. Starting from scratch without serverless would have taken much longer; we may not even have done it at all.” Bertie Blackman – Automation Engineer – cinch (Search Team)
Reducing cognitive load to save time
To maximize the output of the team, cinch aimed to optimize for cognitive load, or the amount of information that working memory can hold at one time.
There are three types of cognitive load:
- Germane cognitive load is the effort that is core to the topic. By increasing germane load, you can solve the problems that matter.
  - For example, “What service should offer vehicle details?”
- Intrinsic cognitive load is the effort associated with building a software platform. By decreasing intrinsic load, you’ll be able to scale the team faster and more easily.
  - For example, “How do I consume an event? How do I add something to a queue? How do I test a Lambda function?”
- Extraneous cognitive load is related to the environment in which the task is being done. You want to get rid of extraneous load through automation and by offloading everything that your application needs to do but that doesn’t increase its competitive advantage in the eyes of its customers.
  - For example, “How do I configure this service to scale with traffic? What operating system should I use?”
Serverless speeds up decision-making to optimize for germane load
Serverless frees up cognitive load and allows teams to focus on the business domain. To support this, cinch defined a path to maximize the working memory available for germane cognitive load, as summarized here and shown in Figure 3:
- Test-driven development: create automated test cases alongside the code, eliminating the need for a separate test team
- Observability: instrument the code with intent so that teams can measure the health of business transactions
- Pairing and mobbing: use agile software techniques that reduce bottlenecks and handoffs and ensure high quality
- Trunk-based development: use a source-control branching model in which developers collaborate on code in a single branch
- Serverless: build and run applications without managing infrastructure
- Event-driven: build architectures with EventBridge to decouple services without having to manage implementation details
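As a hedged illustration of the test-driven development item, a Lambda handler is an ordinary function that can be tested without deploying anything. The handler below is hypothetical and only assumes the API Gateway Lambda proxy event shape (`queryStringParameters` in, `statusCode` and `body` out). The tests are plain functions with bare assertions, so a runner such as pytest can collect them.

```python
import json
from typing import Any, Optional


def handler(event: dict[str, Any], context: Optional[Any]) -> dict[str, Any]:
    """Hypothetical search endpoint: requires a `make` query parameter."""
    # API Gateway sends None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    make = params.get("make")
    if not make:
        return {"statusCode": 400, "body": json.dumps({"error": "make is required"})}
    return {"statusCode": 200, "body": json.dumps({"make": make, "vehicles": []})}


def test_missing_make_is_rejected() -> None:
    response = handler({"queryStringParameters": None}, None)
    assert response["statusCode"] == 400


def test_make_is_echoed_back() -> None:
    response = handler({"queryStringParameters": {"make": "Ford"}}, None)
    assert response["statusCode"] == 200
    assert json.loads(response["body"])["make"] == "Ford"
```

Because the tests call the handler directly with a dictionary, they run in milliseconds on a developer's machine and fit naturally alongside the code in the same repository.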
Using Lambda, DynamoDB, and Amazon API Gateway encourages teams to build loosely coupled architectures and independently deployable services. With serverless, engineers don’t have to worry about managing AWS Availability Zones, scaling infrastructure, or operating system patching, which add extraneous cognitive load.
The cinch team uses EventBridge for inter-domain service communication. With EventBridge, when they have an event to share across domains, they can publish it to a shared event bus and not have to think about how it will be consumed.
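A minimal sketch of publishing to a shared bus might look like the following. The bus name, source, detail type, and payload are invented for illustration. The `put_events` call and its `Entries` fields (`EventBusName`, `Source`, `DetailType`, `Detail`) are part of the EventBridge API as exposed by boto3, which is imported lazily here so the entry builder can run on its own.

```python
import json
from typing import Any


def build_entry(
    source: str, detail_type: str, detail: dict[str, Any], bus_name: str
) -> dict[str, str]:
    """Build one PutEvents entry; Detail must be a JSON string."""
    return {
        "EventBusName": bus_name,
        "Source": source,
        "DetailType": detail_type,
        "Detail": json.dumps(detail),
    }


def publish(entry: dict[str, str]) -> None:
    """Send the entry to EventBridge (needs boto3 and AWS credentials)."""
    import boto3  # lazy import so build_entry can be used and tested without it

    response = boto3.client("events").put_events(Entries=[entry])
    if response["FailedEntryCount"]:
        raise RuntimeError(f"EventBridge rejected the event: {response['Entries']}")


# Hypothetical names: neither the bus nor the event type comes from the article.
entry = build_entry(
    source="cinch.inventory",
    detail_type="VehicleListed",
    detail={"vehicleId": "c3", "make": "Ford", "price": 7200},
    bus_name="shared-domain-events",
)
print(entry["Detail"])
```

The publisher names the event and the bus but says nothing about who consumes it, which is the decoupling the paragraph above describes.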
Equally, when a team wants to respond to an event from another team, they know the event will be on a central shared bus. Having a shared location and event structure further reduces intrinsic cognitive load.
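On the consuming side, a team could subscribe with an EventBridge rule and handle the event in a Lambda function. The sketch below is an assumption-laden example: the event pattern, the event names, and the shape of `detail` are hypothetical. It relies only on the standard EventBridge envelope, in which `source`, `detail-type`, and an already parsed `detail` object arrive at the top level of the Lambda event.

```python
from typing import Any, Optional

# Hypothetical rule pattern: match one event type from one source on the shared bus.
EVENT_PATTERN = {
    "source": ["cinch.inventory"],
    "detail-type": ["VehicleListed"],
}


def handler(event: dict[str, Any], context: Optional[Any]) -> dict[str, Any]:
    """Turn a VehicleListed event into a document for the search index."""
    detail = event["detail"]  # EventBridge delivers detail as a parsed object
    return {
        "id": detail["vehicleId"],
        "make": detail["make"],
        "price": detail["price"],
    }


# A sample event in the standard EventBridge envelope, trimmed to the relevant keys.
sample_event = {
    "version": "0",
    "source": "cinch.inventory",
    "detail-type": "VehicleListed",
    "detail": {"vehicleId": "c3", "make": "Ford", "price": 7200},
}

document = handler(sample_event, None)
print(document)  # {'id': 'c3', 'make': 'Ford', 'price': 7200}
```

Because every team reads from the same bus and the same envelope, the only thing a consumer needs to learn about a new event is the contents of `detail`.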
Conclusion
cinch achieved its ambitious goals by using AWS serverless technologies. When the new website launched, traffic quickly increased by more than 2.5x (from a peak of 6,000 requests per minute to 16,000 requests per minute).
With cinch’s serverless architecture, AWS manages scaling automatically. Even as the number of requests increased, latency went down. Traffic is highly variable, with anywhere between 250,000 and 4.5 million events a day, an 18x difference in volume between quiet and busy days.
With AWS scaling the infrastructure, cinch can focus their cognitive load on improving the platform. Now they can quickly release new features and rebuild existing ones based on real-world customer insight. This helps fuel exponential growth; cinch has gone from selling hundreds of cars within days to growing by a factor of 100 within a few weeks.