External reviews
External reviews are not included in the AWS star rating for the product.
Easy way to run Dask and speed up model training
You get an integrated JupyterLab + Dask cluster management environment, which makes it straightforward to parallelize model training and get a big speedup. Collaboration is built in as well.
When you absolutely, positively need to parallelize all the data
Dask is a very powerful library that allows for parallel execution of Python code across essentially arbitrary compute resources. I've used Dask previously on a smaller scale for things like out-of-core processing of dataframes too big to fit into RAM on a respectable workstation.
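To illustrate the kind of parallel execution described above, here is a minimal sketch using `dask.delayed`. The function names, the chunk size, and the toy workload are hypothetical and not taken from this review. It runs on Dask's local threaded scheduler, not on a cluster.

```python
from dask import delayed


def chunk_sum(chunk):
    """Work done on one chunk; stands in for any per-partition computation."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, chunk_size=1000):
    # Split the input into independent chunks (chunk_size is an arbitrary choice).
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    # Build a lazy task graph: one task per chunk, then a final reduction.
    partials = [delayed(chunk_sum)(c) for c in chunks]
    total = delayed(sum)(partials)
    # Nothing executes until compute(); the threaded scheduler keeps the
    # sketch runnable on a single machine without any cluster setup.
    return total.compute(scheduler="threads")


result = parallel_sum_of_squares(list(range(10_000)))
```

The same task graph could in principle be sent to a distributed scheduler instead of local threads, which is where a managed cluster would come in.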
Dask can make almost any job as fast as you need, depending on the number of processing nodes and their network connections, and on your ability to create, debug, and maintain a distributed Dask cluster. The last of these can be quite a painful challenge to overcome.
We are very happy with the service that Saturn provides, as they solve both of these issues at once. Their distributed client can autoscale the number of nodes in its cluster using whatever EC2 instance type is needed, and it plays very nicely with CUDA, which can be quite tricky (frustrating) to configure properly.
Executing the same code across multiple nodes, each equipped with its own CPU/GPU/RAM, is what makes a supercomputer super. Saturn essentially makes it convenient to rent a Python-based supercomputer with whatever specifications you want, limited only by the hardware available on AWS and your VPC quota.
Low maintenance, high performance
Before Saturn, I wasted a ton of time trying to manage my team's JupyterHub. What began as a fun little project quickly turned into a maintenance nightmare. Saturn eliminated all the hassle. The environment just works. Within minutes we can go from one small, basic instance to multiple 64-core servers crunching big data. What's even more exciting is Saturn keeps getting better. New and useful features keep showing up, making it easier for my team to do great work. I'm looking forward to working with Saturn for a long time to come!
Fast Setup, Easy to Use
I used Saturn Cloud for a machine learning project that trained a network intrusion classifier using PCAP data. In a few minutes I was coding in a Jupyter notebook without having to worry about data privacy, and collaboration was simple. The ease of setup and the computational power available make this a great collaboration tool, and I will definitely be using it again.
Puts enterprise-level power in the hands of a team of academics
I'm a data scientist working with a team of marine chemists on a series of peer-reviewed journal articles. Saturn Cloud made getting them set up and going a snap.
We now have computational power equivalent to what I have used at a Fortune 50 company, at a tiny fraction of the cost and with a much faster time to get up and going.
We would have used Saturn Cloud for the built-in collaboration tools even if we didn't need the computational power. Versioning is easy, even for people who have no experience at all with Git.
We have moved our analytics from a hodgepodge of Matlab, Excel spreadsheets and statistics software to a reproducible pipeline in Python.
Super easy setup on AWS + Dask in one click
I was surprised at how easy it was to get Saturn up and running via the AWS Marketplace. It took me about 10 minutes from subscribing -> setting up -> email with user admin password -> spinning up a Jupyter notebook. I'd imagine the value here is Dask, and it was cool to see how easy it was to start up a Dask cluster for use from a Jupyter notebook instance. I'm excited to start moving some of my local projects onto Saturn Cloud's AWS instance: I used to worry about having my data in their public cloud, but now I know I have the privacy and security of my own VPC and can easily use Dask for the projects with larger datasets.