AWS Public Sector Blog
Flexibility, cost-savings, and innovation: Kellogg School of Management chooses AWS
At the end of 2022, Will Thompson, lead computational research consultant at Northwestern University’s Kellogg School of Management, had a decision to make. The on-premises SQL server used by Kellogg faculty and students had reached the end of its life, and his team needed to identify a cost-effective way forward while ensuring that the datasets would remain highly available for researchers to use on demand. After weighing various options, Thompson worked with Amazon Web Services (AWS) to create a data lake that perfectly fit his institution and its unique needs.
Solving for an aging server and disparate datasets
Thompson’s computational research team plays a critical but often invisible role at Kellogg: maintaining a central collection of data to support research conducted by the school’s economists and graduate students. The datasets vary significantly in size and type, ranging from small personal databases to datasets of multiple terabytes. For years, this repository was hosted on a proprietary commercial database using high-performance SAN storage. But in 2022, the hardware urgently needed to be replaced to bring the server up to modern speed, availability, and security standards.
As Thompson considered replacing the SQL server or choosing a new, cloud-based solution, five primary factors played into his decision-making:
- Cost-effective – Replacing the on-premises hardware would be expensive, potentially costing Kellogg hundreds of thousands of dollars. The new data collection needed to be budget-friendly—both to install and maintain.
- User experience – Thompson didn’t want the user experience to change dramatically. Researchers’ process for querying should remain relatively consistent after the migration, allowing research to continue.
- Simple to maintain and secure – The ideal solution would reduce the maintenance and security burden on Thompson’s small team and his colleagues in cloud ops.
- Flexible storage – Under the old system, the large, static datasets were always online. “But that wasn’t our use case,” says Thompson. “Our databases don’t need to be queried all the time. Researchers will use a dataset for a few weeks, and then it will be untouched for years.” Kellogg needed a flexible solution that would store data for on-demand use.
- Future-ready – There was also an opportunity cost associated with staying on the SQL server. For example, migrating the data collection to the cloud would offer researchers artificial intelligence (AI) and machine learning (ML) capabilities.
Finding the ‘right fit’ solution in the AWS Cloud
After researching various options, Thompson’s team realized that the AWS Cloud met all five criteria. A data lake on AWS would provide the low-cost, flexible solution they sought while ensuring both continuity and opportunity for researchers. As a bonus, the attentiveness of the AWS team made Thompson feel assured they would not be left alone during the migration.
“The support from the AWS team made our decision to go with AWS an easy one,” explains Thompson. “We always knew we could get our questions answered and that we’d never get stuck in the process.”
With support from AWS, the migration required effort from only two members of the computational research team. Using custom code, they moved Kellogg’s datasets into a data lake on Amazon Simple Storage Service (Amazon S3) and then made them available for analysis with Amazon Athena.
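For readers curious what querying a data lake like this can look like in practice, here is a minimal Python sketch that assembles a request for Athena's StartQueryExecution API through boto3. It is not Kellogg's code, which the article does not show. The table, database, and results-bucket names are hypothetical, and the sketch assumes boto3 is installed and AWS credentials are configured before anything is actually submitted.

```python
from typing import Any, Dict


def build_athena_request(sql: str, database: str, output_location: str) -> Dict[str, Any]:
    """Assemble keyword arguments for Athena's StartQueryExecution call."""
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def submit_query(request: Dict[str, Any]) -> str:
    """Send the request to Athena and return the query execution ID."""
    # Imported lazily so the builder above can be used without AWS access.
    import boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical names for illustration only; none of them come from the article.
request = build_athena_request(
    sql="SELECT * FROM example_dataset LIMIT 10",
    database="research_catalog",
    output_location="s3://example-athena-results/queries/",
)
```

Calling `submit_query(request)` would start the query. Athena runs it against the files in Amazon S3 and writes the results to the output location, so no database server has to stay online between queries.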
The result? A flexible solution and up to 90% cost savings
The shift to the AWS Cloud not only saved Kellogg the cost of replacing expensive hardware, but also resulted in much more flexible storage. AWS allowed Thompson’s team to leverage intelligent tiering, meaning that data is automatically moved into cheaper cloud storage when it’s not being queried and used. “I expect the savings to be very significant,” says Thompson. “Up to 90 percent compared with completely refreshing our SQL Server infrastructure.”
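As an illustration of the tiering described above, the following Python sketch builds an Amazon S3 lifecycle configuration that transitions objects into the S3 Intelligent-Tiering storage class. The article does not say how Kellogg configured tiering, so this is one possible approach rather than the team's setup. The rule ID, prefix, and bucket name are assumptions, and applying the configuration requires boto3 and AWS credentials.

```python
from typing import Any, Dict


def intelligent_tiering_lifecycle(prefix: str = "", after_days: int = 0) -> Dict[str, Any]:
    """Build a lifecycle configuration that moves objects to Intelligent-Tiering."""
    if after_days < 0:
        raise ValueError("after_days must be zero or greater")
    return {
        "Rules": [
            {
                "ID": "move-to-intelligent-tiering",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "Transitions": [
                    {"Days": after_days, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    }


def apply_lifecycle(bucket: str, config: Dict[str, Any]) -> None:
    """Attach the lifecycle configuration to a bucket."""
    # Imported lazily so the builder above can be used without AWS access.
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


# Hypothetical prefix for illustration; not taken from the article.
config = intelligent_tiering_lifecycle(prefix="datasets/")
```

Once objects are in that storage class, S3 moves them between access tiers based on how recently they were read, which suits datasets that are queried for a few weeks and then left untouched for years.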
Along with cost savings, the flexibility of the AWS Cloud has eliminated many previous issues. For instance, Thompson is no longer worried about the reliability of old hardware or the cost and effort of storing backups. And for users, availability on AWS has improved “many times over,” says Thompson, ensuring that researchers and students have access to datasets whenever needed. As an added benefit, the automation features baked into the AWS Cloud have eased the computational research team’s workload, allowing them to focus on learning and innovation instead of maintenance.
A seamless and improved experience for researchers
The best indicator of migration success may have been that, for the most part, no one noticed. For researchers at Kellogg, the experience of querying and working with the data collection has remained mostly the same.
“You can still log in using your standard Northwestern ID and query the database using SQL,” says Thompson. “We didn’t want to add anything more to their plate. These folks are getting a PhD in economics, not software engineering.”
But, Thompson acknowledges, there’s potential for researchers to do much more with the new AWS data lake. With their datasets in the cloud, researchers interested in using AI and ML now have that capability. Thompson and his team are excited by the potential. “We have all of AWS available to use in the future. You don’t have to limit yourself.” Thompson is already planning to obtain additional AWS certifications for his team so that his staff can support researchers’ use of AI/ML.
“We want to be as expert as possible in the cloud,” Thompson says. “That’s what’s next for us—to keep working with AWS and learning more.”
Kellogg isn’t the only school within Northwestern University using the AWS Cloud – explore how Northwestern University Libraries is making research more efficient and accessible.
Learn more about how higher education institutions, research labs, and researchers around the world are accelerating time to science with the AWS Cloud. Watch 10-minute videos designed to help you use the cloud for research.