S&P Capital IQ provides multi-asset class data, research, and analytics to institutional investors, investment advisors, and wealth managers around the world. The company offers a broad suite of capabilities designed to help clients track performance, generate alpha, identify new trading and investment ideas, and analyze and mitigate risk.

Jeff Sternberg, Chief Data Scientist, says, “The S&P Capital IQ platform is unique in that we offer a broad set of quantitative and qualitative data about companies and people. We underpin this deep level of information with great analytic and visualization tools so our clients can really gain insights and derive value.”

Sternberg recalls the thought process when the data science team was considering Amazon Web Services (AWS): “AWS was a clear market leader in cloud services. We were very interested in getting into Hadoop, and AWS provides Amazon Elastic MapReduce (Amazon EMR), which allowed us to do that very easily. While we have an extensive, in-house IT infrastructure, we wanted to understand what it would take to ramp up. The cloud provided a great opportunity to prototype and evaluate quickly. We’re still in that evaluation period, but we think that our future state is probably a combination of internal infrastructure and cloud-based Hadoop computing.”

The team has also used Amazon Simple Storage Service (Amazon S3). Sternberg says, “With AWS, we have the ability to use an HTTPS-based API to push data into the cloud. It’s very easy for us to integrate from our data center. And we can also pull data back from the cloud when we are finished doing our analyses.”
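
The push-and-pull workflow Sternberg describes might look roughly like the sketch below. It assumes the boto3 SDK for Python, which the case study does not name, and the bucket, key, and file names in the usage comment are hypothetical placeholders. boto3 connects to Amazon S3 over HTTPS by default.

```python
def make_s3_client():
    """Create an S3 client (assumes boto3 is installed and AWS credentials
    are configured). boto3 is imported lazily so the helpers below can be
    used with any object that offers the same two methods."""
    import boto3
    return boto3.client("s3")  # HTTPS endpoints are the default


def push_to_s3(s3_client, local_path, bucket, key):
    """Upload a local file from the data center to S3; return its S3 URI."""
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


def pull_from_s3(s3_client, bucket, key, local_path):
    """Download an analysis result from S3 back to a local path."""
    s3_client.download_file(bucket, key, local_path)
    return local_path


# Hypothetical usage (names are illustrative only):
#   s3 = make_s3_client()
#   push_to_s3(s3, "prices.csv", "example-bucket", "input/prices.csv")
#   pull_from_s3(s3, "example-bucket", "output/results.csv", "results.csv")
```

Passing the client in as a parameter keeps the two helpers independent of how credentials and regions are configured.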

For the data science team, one of the major advantages of using AWS has been flexibility. Sternberg notes, “We don’t have to go through a long ramp-up time to bring hardware in-house, or go through a purchasing cycle, so we can get going really quickly with new ideas.”

Scalability is an essential component as well. “The scalable nature of the cloud allows us to fire up clusters of computers to process data using Hadoop,” says Sternberg. “And those clusters can be very configurable in terms of the number of nodes, the amount of storage, the amount of processing power. So it can be a very scalable solution.”
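
To illustrate the configurability Sternberg mentions (node count, storage, and processing power), here is a minimal sketch that builds a cluster request for Amazon EMR. It assumes the boto3 `run_job_flow` call as the launch mechanism, which is not necessarily what the team used; the function name, release label, instance type, volume size, and default role names are illustrative assumptions.

```python
def build_cluster_request(name, core_nodes, instance_type="m5.xlarge",
                          ebs_gb_per_node=100):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    core_nodes      -> number of worker nodes
    instance_type   -> processing power per node (assumed value)
    ebs_gb_per_node -> storage attached to each worker (assumed value)
    """
    if core_nodes < 1:
        raise ValueError("a cluster needs at least one core node")

    # One EBS volume of the requested size on every core node.
    ebs = {
        "EbsBlockDeviceConfigs": [{
            "VolumeSpecification": {"VolumeType": "gp2",
                                    "SizeInGB": ebs_gb_per_node},
            "VolumesPerInstance": 1,
        }]
    }
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",          # assumed EMR release
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": instance_type, "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": instance_type,
                 "InstanceCount": core_nodes,
                 "EbsConfiguration": ebs},
            ],
            # Shut the cluster down once its steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",   # assumed default roles
        "ServiceRole": "EMR_DefaultRole",
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("emr").run_job_flow(**build_cluster_request("experiment", 10))
```

Scaling an experiment up or down then amounts to changing the arguments rather than procuring hardware.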

Finally, security is vital. Sternberg explains, “With AWS, we’re able to use SSL to transfer data securely. We use Amazon’s Server Side Encryption Support (SSE) to encrypt data stored at rest in Amazon S3 as well. We appreciate the general ability to lock down and control access to our compute nodes. This is obviously very important to us, as we can’t risk any unauthorized access.”
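
As a rough sketch of the two protections Sternberg describes, the snippet below uploads a file over an encrypted connection and asks Amazon S3 to encrypt it at rest. It assumes the boto3 SDK for Python and S3-managed keys (the `AES256` setting); the case study does not say which SDK or key option the team used, and the file and bucket names in the usage comment are hypothetical.

```python
# Requests S3-managed server-side encryption (SSE-S3) for the stored object.
SSE_ARGS = {"ServerSideEncryption": "AES256"}


def make_secure_s3_client():
    """Create an S3 client that uses SSL/TLS for transfers (assumes boto3
    is installed and credentials are configured). use_ssl=True is already
    the boto3 default; it is spelled out here for clarity."""
    import boto3
    return boto3.client("s3", use_ssl=True)


def upload_encrypted(s3_client, local_path, bucket, key):
    """Upload a file and have S3 encrypt it at rest; return the args used."""
    s3_client.upload_file(local_path, bucket, key, ExtraArgs=SSE_ARGS)
    return SSE_ARGS


# Hypothetical usage (names are illustrative only):
#   s3 = make_secure_s3_client()
#   upload_encrypted(s3, "holdings.csv", "example-bucket", "secure/holdings.csv")
```

Controlling who can reach the data and the compute nodes is a separate concern, typically handled through access policies and network rules rather than in upload code.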

Sternberg anticipates continued use of AWS: “We have a long queue of experiments and analyses that we’re undertaking and we envision using Amazon Elastic MapReduce, Amazon EC2, Amazon S3, and some other services from AWS to help us get there. We can envision using those products consistently in the future.”

To find out more about how AWS can help you store and process big data, visit our Big Data details page: http://aws.amazon.com/big-data/.