Integrating RStudio Workbench with Spark and sparklyr
sparklyr is an R interface for Apache Spark. It lets you install and connect to Spark, filter and aggregate Spark datasets using dplyr syntax, and then bring the results into R for analysis and visualization.
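As a rough illustration of that workflow, the sketch below installs a local copy of Spark, connects to it, aggregates a small table with dplyr verbs, and collects the result back into R. It assumes the sparklyr and dplyr packages are already installed and that a local Spark instance is acceptable for experimentation; the `mtcars` dataset, the table name `mtcars_spark`, and the variable names are illustrative choices, not part of any RStudio Workbench configuration.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark install is wanted for experimentation.
# On a cluster where Spark is already installed, skip this step.
spark_install()

# Assumption: "local" mode; a real cluster would use a different master.
sc <- spark_connect(master = "local")

# Copy a built-in R data frame into Spark (table name is illustrative).
mtcars_tbl <- copy_to(sc, mtcars, "mtcars_spark", overwrite = TRUE)

# The filter and aggregation run inside Spark; collect() brings
# the summarised result back into R as a local data frame.
avg_mpg_by_cyl <- mtcars_tbl %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg, na.rm = TRUE)) %>%
  collect()

print(avg_mpg_by_cyl)

spark_disconnect(sc)
```

Inside a Spark/Hadoop cluster, the `master` argument and any related connection settings would depend on how the cluster is deployed, so treat the local connection above only as a starting point.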
You can install RStudio Workbench, formerly RStudio Server Pro, within a Spark/Hadoop cluster and use sparklyr from R sessions.
The following articles describe how to integrate RStudio Workbench with a Spark cluster in different configurations:
- Using sparklyr with Cloudera CDH
- Using sparklyr with Amazon EMR
- Deployment and configuration options
Visit spark.rstudio.com for more information.