AWS Architecture Blog

Architecting a Low-Cost Web Content Publishing System

Introduction

When an IT team first contemplates reducing on-premises hardware they manage to support their workloads they often feel a tension between wanting to use cloud-native services versus taking a lift-and-shift approach. Cloud native services based on serverless designs could reduce costs and enable a solution that is easier to operate, but appears to be disruptive to end user processes and tools. A lift-and-shift migration, though it can eliminate on-premises hardware and maintain existing workflows, doesn’t eliminate the need to manage a server infrastructure, does nothing to improve a team’s agility in releasing enhancements after migration to the cloud, and may not optimize the cost of the resulting solution. Rather than settling for an either/or option that sacrifices cost savings and ease of operation in order to be non-intrusive to their web authors’ daily work, the University of Saint Thomas, Minnesota team implemented a creative hybrid approach that both avoids end user disruption and achieves the cost savings, agility, and simplified administration that a cloud-native solution can provide.

The Situation

University of St. Thomas wanted to reduce on-premises management of hardware for their university website. In addition, by migrating this functionality to the cloud, they intended to increase the website’s availability. The on-premises solution was deployed on an IIS server maintained by the IT team, but the content of the website was authored by staff members in departments across the university using two different Content Management Systems (CMS). The publishing process from these tools to the web server worked well, and there was no appetite for eliminating the distributed nature of the web site’s development nor the content management systems that the authors were comfortable with.

The IT team hoped to implement a serverless solution utilizing only Amazon Simple Storage Service (S3) to host the static website content. Not only would that reduce the cost of the solution, it would also eliminate having to manage web servers. One of the two content management systems could publish directly to S3, but unfortunately the other CMS could not.

A lift-and-shift migration approach would move the website onto an IIS server in Amazon Elastic Compute Cloud (EC2) and update the publishing process to write its outputs to this new server. This solution would avoid any impact to the authors because all the change would be accomplished behind the scenes by the IT team. However, this approach did not achieve the team’s goals of creating a solution that cost less and was easier to manage than the current on-premises one.

Rather than giving up on creating a cloud-native solution, the team worked from the constraints on the edges of the solution toward the middle.

Solution

Achieving the cost savings, management ease, and high availability for the solution depended upon using S3 to store the website’s contents (#1 in the diagram). If the CMS tools could have published directly to S3, the solution would have been completed by simply adjusting the CMS tools to target their output to S3. However, only one of the two CMS tools could do this. The other one expects to publish its output to a file system that is accessible to the on-premises server where the CMS tool runs. The team solved this problem by launching a t3.small EC2 instance (#2) to sit between the CMS tools and the S3 bucket that would store the website’s production content. Initially, it seemed like using two simple file sync processes could keep the file system of the EC2 instance synchronized with the CMS files. However, when the team first attempted this approach to build a copy of the website on EC2’s file system, they discovered that one of the sync processes would delete the other tool’s output rather than ignoring it when synchronizing updates from its tool to EC2.

To overcome this issue, the team created separate website roots in the EC2 file system into which each CMS would synchronize. Using Unionfs, a Linux utility that combines multiple directories into a single logical directory, a unified root folder for the website (#3) was created that could be easily pushed to S3 using the S3 CLI.

With this much of the solution in place the team had successfully created a new architecture for their website that was nearly as inexpensive as a static website hosted on S3, but that also maintained the tools and processes that their website authors were familiar with.

There was just one more technical issue to address: The IIS site contained internal metadata that redirected its users from virtual directories to the physical content located elsewhere in the website’s content. For example, https://..../law might be redirected to https://..../lawschool/ To achieve similar functionality, the IT team created one HTML file for each of these redirects and added them to a third website root directory in the EC2 instance (#4). These files include static HTML headers needed to redirect the user’s browser to the desired endpoint. Blending this directory with the other two through Unionfs creates a single logical copy of the website’s contents and that can be synchronized out to S3 with a S3 sync CLI command.

A final enhancement to the website was to use an Amazon CloudFront distribution (#5) to cache its contents providing improved response time for website visitors. The distribution object caching TTLs are set to defaults. The publishing process runs every 15 minutes, so to ensure that the website visitors would receive the latest content, the team wrote an AWS Lambda function (#6) that invalidates the cache each time an object is removed from (created in) the S3 bucket using S3 event notifications.

Conclusion

The University of Saint Thomas IT team found a creative way to implement a new solution for their university website that reduces the time and effort required to manage servers, achieves operational simplicity and cost savings by using cloud-native services, and yet doesn’t interfere with the web authoring tools and processes their customers were happy with. The mix of server-based and serverless components in their design illustrates how flexible cloud architectures can be and highlights the ingenuity of the team that built it.

Acknowledgements

Thank you to the following people at the University of Saint Thomas:

This solution was architected by Julian Mino, Cloud Architect. The creative use of Unionfs was suggested by William Bear, AVP for Applications and Infrastructure and former Linux administrator. Vicky Vue, Systems Engineer and Keith Ketchmark, Sr. Systems Engineer implemented the solution using Terraform, Ansible and Python. Daniel Strojny (Associate Director, Networks & IT Operations) helped resolved some internal DNS issues the team encountered.

Correction 2/13/2024 – This post originally referred to ‘Amazon Elastic Cloud Compute (EC2)’. This has been changed to the correct name: ‘Amazon Elastic Compute Cloud (EC2)’.

Craig Jordan

Craig Jordan

Craig Jordan is an AWS Solutions Architect and member of the AWS Data Solutions for Education team that is developing reusable architectural guidance to help teams like Maryville’s learn and apply AWS analytics and machine learning services to quickly build their own solutions. When he’s not working on technology solutions, he enjoys home improvement projects and opportunities to play piano.