D Air-Gapped RStudio Package Manager

RStudio Package Manager communicates with an RStudio CRAN service to access CRAN packages and metadata. In offline environments, you can download the necessary data directly from the RStudio CRAN service and then copy it to an offline RStudio Package Manager server.

This guide includes three sections:

  • Instructions for an initial air-gapped setup (D.1)
  • Instructions for performing regular updates (D.2)
  • Instructions for upgrading RStudio Package Manager in an air-gapped environment (D.3)

D.1 Initial Setup

  1. Install the AWS CLI tools on an online machine. Confirm the installation by running aws help.

  2. Run the command rspm air-gap on the offline RStudio Package Manager server. See section 6 for more information about the rspm commands.

The air-gap command will print information and output a command similar to:

MAJOR=`curl https://rspm-sync-staging.rstudio.com/v2/version.txt`; \
echo "aws s3 sync --no-sign-request --exclude=* --include=v2/version.txt \
--include=v2/${MAJOR}/* s3://rstudio-pm-sync/ ./"
  3. On the online machine, create a new directory, e.g. cran-data. Copy the command from step 2 and run it from within that directory. It prints a new command starting with aws s3 sync. Copy the entire output and run it in the same location. The sequence of input and output looks like:
# Input:
MAJOR=`curl https://rspm-sync-staging.rstudio.com/v2/version.txt`; \
echo "aws s3 sync --no-sign-request --exclude=* --include=v2/version.txt \
--include=v2/${MAJOR}/* s3://rstudio-pm-sync/ ./"

# Output:
aws s3 sync --no-sign-request --exclude=* --include=v2/version.txt \
--include=v2/2/* s3://rstudio-pm-sync/ ./

# Input:
aws s3 sync --no-sign-request --exclude=* --include=v2/version.txt \
--include=v2/2/* s3://rstudio-pm-sync/ ./

These steps begin the download, which includes more than 50 GB of data and may take some time to complete.

The result will be a directory with files prefixed by a version indicator, e.g.

cran-data/v2/version.txt
cran-data/v2/2
cran-data/v2/2/...
  4. Create a directory on the offline RStudio Package Manager server. If you have a cluster of nodes, use shared storage for this directory.
mkdir /var/lib/rstudio-pm/cran

Copy the data downloaded in step 3 from the online machine to the directory on the offline RStudio Package Manager server. For completely isolated servers, you may need to copy the data to a physical drive in order to move it to the offline environment.

Confirm by checking that the offline directory has the same number of files as the original directory on the online machine.

Finally, change ownership of the directory on the offline RStudio Package Manager server to the Unix account running RStudio Package Manager (rstudio-pm by default):

sudo chown -R rstudio-pm /var/lib/rstudio-pm/cran
  5. Next, configure the offline RStudio Package Manager server to use the downloaded data. To do so, modify the RStudio Package Manager configuration file to include the following properties in the CRAN configuration section:
; /etc/rstudio-pm/rstudio-pm.gcfg

[CRAN]
ManifestURL = A URL in the form, `file:///<the directory you created in step 4>`

For example, if your CRAN data directory is at /var/lib/rstudio-pm/cran, the file /etc/rstudio-pm/rstudio-pm.gcfg should contain:

; /etc/rstudio-pm/rstudio-pm.gcfg

[CRAN]
ManifestURL = file:///var/lib/rstudio-pm/cran

Once the file is updated, restart the RStudio Package Manager Service.

  6. Follow the remainder of the instructions in the admin guide for setting up sources and repositories using the rspm commands on the offline RStudio Package Manager server. The rspm sync command will use the downloaded data to populate the necessary CRAN data and packages.
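
The file-count check described in the setup steps can be sketched as a small shell helper. This is a minimal sketch, not part of RStudio Package Manager: the function name count_files is hypothetical, and the paths in the usage comments are the example paths used in this guide.

```shell
#!/bin/sh
# count_files DIR: print the number of regular files below DIR.
# Hypothetical helper for illustration; it is not an rspm command.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Usage: run on each machine and compare the two numbers by hand.
#   count_files ./cran-data                  # on the online machine
#   count_files /var/lib/rstudio-pm/cran     # on the offline server
```

Matching counts only suggest that the copy is complete; they do not detect files that were corrupted in transit.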

D.2 Regular Updates

It is important to regularly update the data available on the offline server. Updating this data does not automatically make new packages available to end users. The following process pushes updates from the RStudio CRAN service to RStudio Package Manager’s metadata. To make the updates available to end users, follow the instructions in the admin guide. See section 6 (specifically 6.4.2) for more information.

  1. On the offline RStudio Package Manager server, run the command rspm air-gap. This command will output information along with a command similar to:
MAJOR=`curl https://rspm-sync.rstudio.com/v2/version.txt`; \
echo "aws s3 sync --no-sign-request --exclude=* \
--include=v2/version.txt --include=v2/${MAJOR}/* s3://rstudio-pm-sync/ ./"
  2. On the online machine, navigate to the directory created during the initial setup, e.g. cran-data, and paste the command from step 1. A second command will be printed. Copy and run the second command in the same location. Any newly available data will be downloaded, but existing data will not be re-downloaded.

  3. Copy the directory from the online machine to the directory created on the offline RStudio Package Manager server during the initial setup, e.g. /var/lib/rstudio-pm/cran. Ensure that the directory is still owned by the Unix account running RStudio Package Manager (rstudio-pm by default).
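
One way to check ownership after an update is to list anything under the data directory that the service account does not own. The following is a minimal sketch: the function name list_not_owned_by is hypothetical, and the account and path in the usage comments are the defaults and examples from this guide.

```shell
#!/bin/sh
# list_not_owned_by USER DIR: print every path below DIR that USER does not own.
# Hypothetical helper for illustration; it prints nothing when ownership is correct.
list_not_owned_by() {
    find "$2" ! -user "$1"
}

# Usage on the offline server, with the default account and the example path:
#   list_not_owned_by rstudio-pm /var/lib/rstudio-pm/cran
# If any paths are printed, reapply ownership as in the initial setup:
#   sudo chown -R rstudio-pm /var/lib/rstudio-pm/cran
```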

D.3 Upgrading RStudio Package Manager

A new version of RStudio Package Manager may require data from a new version of the RStudio CRAN service. To ensure a smooth upgrade with limited downtime, we recommend the following steps:

  1. You will need a staging environment that mirrors your offline production server. After creating this environment, begin by upgrading the offline staging server to the latest RStudio Package Manager release.

  2. Follow the instructions for the Initial Setup of an Air-Gapped server in D.1, using the offline staging server.

  3. After you have validated that everything works as expected, copy the CRAN data, e.g. /var/lib/rstudio-pm/cran, from the offline staging server to the offline production server.

  4. Upgrade the offline production server to the new version of RStudio Package Manager.

  5. (Optional) After an upgrade, navigate to the directory storing CRAN data, e.g. /var/lib/rstudio-pm/cran. This directory will contain versioned folders, e.g.

/v2
/v3

Run the command rspm air-gap in the offline production server. The output will indicate the version of the RStudio CRAN service required by the current version of RStudio Package Manager, e.g.

Your RStudio Package Manager uses a CRAN Manifest Schema Version of “v3”.

You can safely remove all of the other folders in the directory. In this example, only /var/lib/rstudio-pm/cran/v3 is necessary.
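
Before removing anything, it can help to list which versioned folders would be affected. The following is a minimal sketch of such a dry run: the function name list_stale_versions is hypothetical, it deletes nothing, and the version to keep must be taken from the rspm air-gap output rather than assumed.

```shell
#!/bin/sh
# list_stale_versions DIR KEEP: print each subdirectory of DIR whose name is not KEEP.
# Hypothetical helper for illustration; it only lists candidates and deletes nothing.
list_stale_versions() {
    for path in "$1"/*/; do
        [ -d "$path" ] || continue   # skip the unexpanded pattern when DIR has no subdirectories
        name=$(basename "$path")
        [ "$name" = "$2" ] || printf '%s\n' "${path%/}"
    done
}

# Usage with the example directory and the version reported in the example above:
#   list_stale_versions /var/lib/rstudio-pm/cran v3
```

Review the printed paths before removing them manually.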