AWS Big Data Blog

Tag: RStudio

Deploy an Amazon EMR edge node with RStudio using AWS Systems Manager

RStudio is an integrated development environment (IDE) for R, a language and environment for statistical computing and graphics. As a data scientist, you may integrate R and Spark (a big data processing framework) to analyze large datasets. You can use an R package called sparklyr to offload filtering and aggregation of large datasets from your […]

Running sparklyr – RStudio’s R Interface to Spark on Amazon EMR

This post was last updated July 7th, 2021 (original version by Tom Zeng). The Sparklyr package by RStudio has made processing big data in R a lot easier. Sparklyr is an R interface to Spark that lets you use Spark as the backend for dplyr, one of the most popular data manipulation packages. Sparklyr also […]
