3 Getting Started: Configuration

This guide helps administrators configure RStudio Package Manager (RSPM) for their use cases.

3.1 Prerequisites

Before configuring RSPM, you must:

  • Install RSPM
  • Activate a license for RSPM

For more information about installing and licensing RSPM, see the Getting Started: Installation section.

Run the commands provided in the quick start guides as an appropriately privileged user. By default, that user must be a member of the rstudio-pm group.

Additionally, the instructions assume that you’ve:

  • Set up an alias: alias rspm='/opt/rstudio-pm/bin/rspm' or
  • Added the binary to your path
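Before running the commands in the quick start guides, it can help to confirm both prerequisites from the shell. The following is a minimal sketch, not part of RSPM: user_in_group is a hypothetical helper name, and rstudio-pm is the default group named above.

```shell
# Hypothetical helper (not part of RSPM): succeeds when USER belongs to GROUP.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# rstudio-pm is the default group named in this guide.
if user_in_group "$(id -un)" rstudio-pm; then
  echo "OK: $(id -un) is a member of rstudio-pm"
else
  echo "Warning: $(id -un) is not a member of rstudio-pm" >&2
fi

# Make the CLI available as rspm in the current interactive shell.
alias rspm='/opt/rstudio-pm/bin/rspm'
```

Aliases are normally expanded only in interactive shells, so scripts are usually better served by calling /opt/rstudio-pm/bin/rspm by its full path or by adding its directory to PATH.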

3.2 Overview

Once you have set up RSPM, share the URL to RSPM with your users. By following the instructions included in each repository’s Setup page, users can set up R or RStudio to use RSPM.

RStudio Server (Pro) can be configured to use RSPM without requiring user setup. For more information, see the configuration instructions.

Follow the quick start guide(s) that are best suited for your use case(s):

  • Serving CRAN Packages (3.3)
  • Distributing Local Packages (3.4)
  • Serving Local Packages from Git (3.5)
  • Distributing Local Packages along with CRAN Packages (3.6)
  • Supplementing CRAN with Bleeding Edge Packages from GitHub (3.7)
  • Serving a Subset of Approved CRAN Packages (3.8)
  • Serving a Subset of Approved CRAN Packages and Local Packages (3.9)

3.3 Serving CRAN Packages

A common use case for RSPM is making CRAN packages available in environments with restricted internet access.

To make the CRAN packages available:

  • Use the sync command to ensure that RSPM has the appropriate metadata. RSPM pulls packages and metadata from the RStudio CRAN service.

    Note: It is not necessary to configure an upstream CRAN URL.

  • Create a repository and subscribe it to the built-in source named “cran”.

# Initiate a sync:
rspm sync --wait

# Create a repository:
rspm create repo --name=prod-cran --description='Access CRAN packages'

# Subscribe the repository to the cran source:
rspm subscribe --repo=prod-cran --source=cran

Future updates occur on a schedule dictated in the configuration file. For more information, see the Updates from CRAN section.

After completing these steps, the prod-cran repository is available in the web interface.
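For repeatable setups, the three commands above can be chained so that a failure stops the sequence. This is only a sketch: setup_cran_repo and the RSPM_BIN variable are hypothetical names introduced here, and the default path is the one used for the alias in the prerequisites.

```shell
# Path to the CLI; override RSPM_BIN to point elsewhere (assumed variable name).
RSPM_BIN="${RSPM_BIN:-/opt/rstudio-pm/bin/rspm}"

# Hypothetical wrapper: sync, create a repository, and subscribe it to the
# built-in cran source, stopping at the first command that fails.
setup_cran_repo() {
  repo_name="$1"
  repo_description="$2"
  "$RSPM_BIN" sync --wait &&
    "$RSPM_BIN" create repo --name="$repo_name" --description="$repo_description" &&
    "$RSPM_BIN" subscribe --repo="$repo_name" --source=cran
}

# Example call:
# setup_cran_repo prod-cran 'Access CRAN packages'
```

Setting RSPM_BIN=echo before calling the function prints the commands instead of running them, which is a cheap way to review what would be executed.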

3.4 Distributing Local Packages

Many teams maintain a handful of internally built packages. This section covers packages that exist as local files. If your internal packages are tracked in Git, refer to the Serving Local Packages from Git section instead.

If your internal packages are local:

  • Create the bundled version of each package. For details, see R packages - Package Structure.

    Note: If you are unfamiliar with building the bundled version of a package, then reach out to the R developer that maintains the package.

  • Copy the resulting tar files to the RSPM server, then run the following commands:

# Create a local source:
rspm create source --name=prod-internal-src

# Add each local package tar file to the source:
#  The tar file must be rwx by the user running the CLI and the account running
#  RSPM (rstudio-pm by default)
rspm add --source=prod-internal-src --path='/path/to/package_1.0.tar.gz'

# Create a repository:
rspm create repo --name=prod-internal --description='Stable releases of our internal packages'

# Subscribe the repository to the source:
rspm subscribe --repo=prod-internal --source=prod-internal-src

RSPM automatically supports multiple versions of each package. When the R developers are ready for the next release of a package, simply run:

rspm add --source=prod-internal-src --path='/path/to/package_2.0.tar.gz'

RSPM ensures that version 2.0 is the default for new installations.

Note: RSPM keeps version 1.0 in the repository’s archive, so it remains available if you need to install the older version.
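When several package bundles are released at once, the add command can be run in a loop. The sketch below assumes the bundles sit in one directory and end in .tar.gz; add_tarballs and RSPM_BIN are hypothetical names, not part of RSPM.

```shell
# Path to the CLI; override RSPM_BIN to point elsewhere (assumed variable name).
RSPM_BIN="${RSPM_BIN:-/opt/rstudio-pm/bin/rspm}"

# Hypothetical helper: add every *.tar.gz in a directory to a local source.
add_tarballs() {
  source_name="$1"
  tarball_dir="$2"
  for tarball in "$tarball_dir"/*.tar.gz; do
    # Skip the unexpanded pattern when the directory has no tarballs.
    [ -e "$tarball" ] || continue
    # The CLI cannot add a file that the current user cannot read.
    if [ ! -r "$tarball" ]; then
      echo "Skipping unreadable file: $tarball" >&2
      continue
    fi
    "$RSPM_BIN" add --source="$source_name" --path="$tarball" || return 1
  done
  return 0
}

# Example call:
# add_tarballs prod-internal-src /path/to/bundles
```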

Most internal packages depend on packages from CRAN. In this case, the easiest option is to create a repository that includes the local packages and their dependencies.

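For more information about serving local packages along with CRAN packages, see sections 3.6 (Distributing Local Packages along with CRAN Packages) and 3.9 (Serving a Subset of Approved CRAN Packages and Local Packages).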

3.5 Serving Local Packages from Git

The previous configuration that uses a local source requires manual steps to add and update packages. If your organization uses Git to store internal R packages, then you can automate this process using a Git source.

Git sources require a valid R installation. For more information, see the Building R Packages section.

# Create a Git source:
rspm create source --type=git --name=prod-internal-src

# Add a Git endpoint, configured to surface tagged commits:
rspm add --git-url=https://bitbucket.example.com/r-pkg.git --source=prod-internal-src --git-build-trigger=tags

# Create a repository and subscribe it to the source:
rspm create repo --name=prod --description='Stable releases of our internal packages'
rspm subscribe --source=prod-internal-src --repo=prod

Packages can be built from Git endpoints accessed via HTTP(S) URLs, such as https://github.com/user/repo.git, or SSH URLs, such as git@github.com:user/repo.git.

If the Git URL uses SSH, then it requires an SSH key for authentication. In this case, first import the key and then use the add command.

SSH keys are not required to have a passphrase, but a key protected by a passphrase is recommended.

# Import the SSH key:
# The passphrase file is a plain text file containing only the key's passphrase.
# Using a file keeps the passphrase out of your shell history.
rspm import --name=read-r-pkg --path=/path/to/ssh/key --passphrase-path=/path/to/passphrase/file

# Optionally, remove the key from disk:
# rm /path/to/ssh/key

# Add the package:
rspm add --git-url=user@bitbucket.example.com/r-pkg.git --source=prod-internal-src --git-build-trigger=tags --git-ssh-key=read-r-pkg

For more information on Git sources and SSH key security, see chapters 15.6 and 15.6.1.1.
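One way to create the passphrase file without typing the passphrase on a command line is to read it from standard input. This is a sketch under assumptions: write_passphrase_file is a hypothetical helper, and it writes the passphrase followed by a single newline, as a text editor would.

```shell
# Hypothetical helper: read a passphrase from standard input and store it in
# a file that only the current user can read or write.
write_passphrase_file() {
  passphrase_file="$1"
  (
    umask 077                                   # new files get mode 0600
    : > "$passphrase_file"                      # create or truncate the file
    chmod 600 "$passphrase_file"                # tighten a pre-existing file
    IFS= read -r passphrase || [ -n "$passphrase" ]
    printf '%s\n' "$passphrase" > "$passphrase_file"
  )
}

# Interactive use: turn off terminal echo while typing the passphrase.
# stty -echo; write_passphrase_file /path/to/passphrase/file; stty echo
```

After the key has been imported, the passphrase file can be deleted in the same way as the key file.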

3.6 Distributing Local Packages along with CRAN Packages

For convenience, organizations frequently opt to distribute their local packages along with CRAN packages in a single repository. This setup gives a single URL for all the organization’s R packages.

To set up a local source or a Git source alongside the cran source, first follow section 3.3 and then section 3.4 or 3.5.

Run the following commands:

# Confirm sources exist:
rspm list sources  # should include: prod-internal-src, cran

# Create repository:
rspm create repo --name=prod --description='Production R packages from CRAN and our local packages'

# Subscribe the repository to the sources:
rspm subscribe --repo=prod --source=prod-internal-src
rspm subscribe --repo=prod --source=cran

In the final step, the order of subscriptions is important. If the same package exists in both sources, RSPM favors the source that was subscribed first, which here is the local source. For information about resolving conflicts, see the Repositories with Multiple Sources section.

3.7 Supplementing CRAN with Bleeding Edge Packages from GitHub

In addition to a production repository, some organizations allow advanced R users to access “bleeding edge” versions of packages available on GitHub.

To make packages from GitHub available to R users, execute the following commands:

# Create a Git source:
rspm create source --type=git --name=github

# Add the GitHub version of ggplot2:
rspm add --source=github --git-url=https://github.com/tidyverse/ggplot2.git --git-build-trigger=commits

# Sync CRAN:
rspm sync --wait

# Create a repository, and subscribe it to the GitHub source and CRAN:
rspm create repo --name=bleeding-edge --description='CRAN supplemented with bleeding edge packages from GitHub, not for production use!'
rspm subscribe --repo=bleeding-edge --source=github
rspm subscribe --repo=bleeding-edge --source=cran

The result of these steps is a repository that contains:

  • The GitHub version of the package
  • CRAN packages

Users install the GitHub package from RSPM with install.packages, not with devtools.

In the example above, the ordering of the source subscriptions is important. ggplot2 is available in both the cran source and the github source. When ggplot2 is requested, RSPM searches the sources in the order in which they were subscribed.

  • First, RSPM looks for ggplot2 in the github source. Since ggplot2 is in the set of bleeding edge packages, RSPM finds ggplot2 in the github source.
  • Then, it serves the bleeding edge version to users.
  • Any ggplot2 dependencies that are not in the github source are pulled from cran.
  • RSPM displays whether the package came from CRAN or an alternative source on the package page for the repository.
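The lookup order described above can be pictured with a small model. This is an illustration only, not RSPM's implementation: each source is represented by a text file listing one package name per line, and first_source_with is a hypothetical function that reports the first source containing a package.

```shell
# Illustrative model only, not RSPM's implementation: each "source" is a text
# file with one package name per line, checked in subscription order.
first_source_with() {
  package="$1"
  shift
  for source_file in "$@"; do
    if grep -qxF -- "$package" "$source_file"; then
      basename "$source_file"
      return 0
    fi
  done
  return 1
}

# Two stand-in sources: github holds only ggplot2; cran holds ggplot2 and more.
demo_dir="$(mktemp -d)"
printf 'ggplot2\n' > "$demo_dir/github"
printf 'ggplot2\nscales\nrlang\n' > "$demo_dir/cran"

first_source_with ggplot2 "$demo_dir/github" "$demo_dir/cran"   # prints: github
first_source_with scales  "$demo_dir/github" "$demo_dir/cran"   # prints: cran
```

Reversing the two file arguments makes the first call print cran, which mirrors what would happen if the repository were subscribed to cran before github.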

3.8 Serving a Subset of Approved CRAN Packages

Some organizations want to grant access only to an approved list of CRAN packages. A curated CRAN source lets administrators serve the approved packages and their dependencies, preview changes, add new packages, and run updates.

Create a file containing one package name per line. For example, /tmp/packages.csv:

plumber
shiny
ISLR
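The package list can be written and sanity-checked from the shell. In this sketch, PKG_LIST is an assumed variable name, and the pattern is only a rough approximation of the R package naming rules (letters, digits, and dots, starting with a letter).

```shell
# Location of the package list; override PKG_LIST to write elsewhere.
PKG_LIST="${PKG_LIST:-/tmp/packages.csv}"

# Write one package name per line.
cat > "$PKG_LIST" <<'EOF'
plumber
shiny
ISLR
EOF

# Print any line (with its line number) that does not look like a package name.
grep -vnE '^[A-Za-z][A-Za-z0-9.]*$' "$PKG_LIST" || echo "All lines look like package names"
```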

Create a curated-cran source.

Next, use the add command to preview the changes needed to add the packages.

# Ensure you have CRAN metadata:
rspm sync --wait

# Create the curated-cran source:
rspm create source --name=subset --type=curated-cran

# Dry run to see proposed packages:
rspm add --file-in='/tmp/packages.csv' --source=subset

The result lists every package that will be added. The required dependencies of the named packages are discovered automatically and included. To save the proposal to a CSV file, use the --csv-out flag. Optionally, use the --include-suggests flag to also discover and add suggested packages.

This action will add the following packages:

Name        Version  Path License                   Needs Compilation Dependency
BH          1.66.0-1      BSL-1.0                   false             true
crayon      1.3.4         MIT + file LICENSE        false             true
digest      0.6.15        GPL (>= 2)                false             true
htmltools   0.3.6         GPL (>= 2)                false             true
httpuv      1.4.3         GPL (>= 2) | file LICENSE false             true
ISLR        1.2           GPL-2                     false             false
jsonlite    1.5           MIT + file LICENSE        false             true
later       0.7.3         GPL (>= 2)                false             true
...         ...           ...                        ...               ...

To complete this operation, execute this command with the --transaction-id=281 flag.

If the proposal is acceptable, then run the command again, but use the transaction-id indicated in the dry run output.
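In a script, the id can be captured from the dry-run output instead of being copied by hand. The sketch below assumes the final line of the output has the form shown in the sample above; extract_transaction_id is a hypothetical helper name.

```shell
# Hypothetical helper: pull the numeric id out of dry-run output on stdin.
extract_transaction_id() {
  sed -n 's/.*--transaction-id=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Sample line copied from the dry-run output in this guide.
sample='To complete this operation, execute this command with the --transaction-id=281 flag.'
printf '%s\n' "$sample" | extract_transaction_id   # prints: 281
```

The same helper could be fed by piping the output of the dry-run add command into it, provided the message format matches. Review the proposal before committing it, since the point of the dry run is to let an administrator approve the change.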

Finally, create a repository and subscribe it to the source.

# Commit the changes:
rspm add --file-in='/tmp/packages.csv' --source=subset --transaction-id=281

# Create a repository:
rspm create repo --name=approved-cran --description='Approved packages from CRAN'

# Subscribe the repository to the source:
rspm subscribe --repo=approved-cran --source=subset

When the source is created, RSPM automatically pins the “subset” source to a frozen point in time on CRAN. New packages can be added from this frozen snapshot of CRAN by repeating the process.

To update the entire set of packages to the latest data available from CRAN, use the update command.

Note: The latest data will depend on the server’s sync schedule. For information about the sync schedule, see the Updates from CRAN section.

rspm update --source=subset

# Like the add command, update prints a preview of the changes along with
# a transaction-id. Use the id from that preview to commit the changes
# (281 is only a placeholder here):

rspm update --source=subset --transaction-id=281

To understand which points in time the add and update commands use, review the full description of curated CRAN sources.

3.9 Serving a Subset of Approved CRAN Packages and Local Packages

An extension of the previous use case is serving a subset of CRAN and a set of internal packages from within a single repository.

Start by following the steps to create a source with the subset of approved packages and a local source with the desired internal packages.

Next, for each internal package, obtain the list of the package’s dependencies from the package developer. Create a CSV file with one package name per line, e.g., internal_deps.csv:

xml2
jsonlite
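If each developer supplies a separate dependency list, the lists can be combined into a single file. This is a sketch with hypothetical names: merge_dep_lists is not an RSPM command, and the input file names in the example call are placeholders.

```shell
# Hypothetical helper: combine per-package dependency lists into one file with
# one unique package name per line, dropping blank lines and trailing spaces.
merge_dep_lists() {
  output_file="$1"
  shift
  cat "$@" | sed 's/[[:space:]]*$//' | grep -v '^$' | sort -u > "$output_file"
}

# Example call (input file names are placeholders):
# merge_dep_lists internal_deps.csv pkg_a_deps.txt pkg_b_deps.txt
```

The output is sorted alphabetically; the order of names in the file does not need to match the order in which the lists were supplied.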

Add additional dependencies to the curated CRAN source by running the add command.

# The list of sources should include prod-internal-src and subset:
rspm list sources

# Add the local packages' dependencies:
rspm add --source=subset --file-in='internal_deps.csv'

# The proposed changes are printed out. To commit them,
# run the command again with the transaction-id from that output:
rspm add --source=subset --file-in='internal_deps.csv' --transaction-id=283

Finally, create a repository and subscribe it to both sources. The order of the subscription commands is important. See the Repositories with Multiple Sources section for details.

# Create a repo:
rspm create repo --name=prod-pkgs --description='Stable release of our internal packages and approved CRAN packages'

# Subscribe the repo to both sources:
rspm subscribe --repo=prod-pkgs --source=prod-internal-src
rspm subscribe --repo=prod-pkgs --source=subset