Air-Gapped Package Manager

Advanced

Package Manager communicates with the Posit Package Service to access CRAN and Bioconductor packages and metadata. In offline environments, it is possible to directly download the necessary data from the online Posit Package Service and then copy it to an offline Package Manager server.

Note

Package Manager does not support offline access for PyPI at this time. Support for offline access for PyPI is planned for a future release.

This guide includes three sections: Initial Setup, Regular Updates, and Upgrading Package Manager.

Initial Setup

  1. First, download the Package Manager offline downloader from the installation section.
  2. Run one of the following commands to learn about the available options:

    • rspm-offline-downloader get cran --help
    • rspm-offline-downloader get bioconductor --help
    • rspm-offline-downloader get vulns --help

    When ready, run the command with the appropriate flags to perform the full download. This downloads metadata, README files, package archives, and vulnerability data, and may take some time to complete. For CRAN, the download is more than 70 GB. For Bioconductor, all versions together total over 1.2 TB; a single Bioconductor version can be up to 170 GB.

    If you enable CRAN binaries in offline environments, the size of the download will be larger (250 GB+ per configuration) depending on the number of R versions and operating systems selected.

    The result will be a directory of files and subdirectories such as:

    /path/to/destination/v3/version.txt
    /path/to/destination/v3/1
    /path/to/destination/v3/1/...
    /path/to/destination/sysreqs
    /path/to/destination/distros
    /path/to/destination/vulns
    

    If the --include-binaries flag is used, additional directories will be created depending on the selected R versions and Linux distributions. For example, if the command is rspm-offline-downloader get cran ... --include-binaries --r-versions=4.2,4.3 --distributions=jammy,rhel9, then you should expect the following directories to be included:

    /path/to/destination/bindex
    /path/to/destination/bin/4.2-jammy
    /path/to/destination/bin/4.3-jammy
    /path/to/destination/bin/4.2-rhel9
    /path/to/destination/bin/4.3-rhel9
    

    Enabling binaries in offline environments is optional. See the Serving Package Binaries section for more information on serving binaries.

    For Bioconductor, you can optionally limit the downloads to specific Bioconductor versions with the --versions flag. If this flag is not included, all supported versions will be downloaded. For example, if you use the command rspm-offline-downloader get bioconductor ... --versions=3.11,3.10, then Bioconductor versions 3.11 and 3.10 will be downloaded.

    Tip

    You can use the devel and release aliases to download the current Bioconductor development and release versions. For example, to download both the development and release versions, along with an older version, you could use the command rspm-offline-downloader get bioconductor ... --versions=devel,release,3.10.

    For Bioconductor, the result will be a directory of files and subdirectories like the one below. Note that a separate directory is created for each Bioconductor version (3.11 and 3.10 in this example).

    /path/to/destination/bioc/v4/version.txt
    /path/to/destination/bioc/v4/1/3.11/bioc
    /path/to/destination/bioc/v4/1/3.11/data/annotation
    /path/to/destination/bioc/v4/1/3.11/data/experiment
    /path/to/destination/bioc/v4/1/3.11/workflows
    /path/to/destination/bioc/v4/1/3.10/bioc
    /path/to/destination/bioc/v4/1/3.10/data/annotation
    /path/to/destination/bioc/v4/1/3.10/data/experiment
    /path/to/destination/bioc/v4/1/3.10/workflows
    /path/to/destination/sysreqs
    /path/to/destination/distros
    /path/to/destination/vulns
    

    Is it ok to stop downloading old Bioconductor versions?

    When updating downloaded Bioconductor data, you may choose to download only versions for which you need new data. When copying the new data to the air-gapped server, be sure to keep the data for previous versions. If you remove any data for versions that are in use, errors will occur when attempting to access packages or metadata from those versions.

    Why do the get cran and get bioconductor commands both create a sysreqs and distros directory?

    The rspm-offline-downloader get cran and get bioconductor commands both download system requirements data to a sysreqs directory, and the lists of supported operating systems and R versions to a distros directory. CRAN and Bioconductor share this data, so it is safe to overwrite these directories when downloading either data set.

  3. Create a directory to store the data in the offline Package Manager server, such as /var/lib/rspm-offline-data. If you have a cluster of nodes, use shared storage for this directory.

    Terminal
    $ sudo mkdir -p /var/lib/rspm-offline-data
    

    Copy the data downloaded in step 2 from the online machine to the directory on the offline Package Manager server. For completely isolated servers, you may need to copy the data to a physical drive in order to move it to the offline environment.

    For example, if the downloaded data was located at /path/to/data:

    Terminal
    $ sudo cp -r /path/to/data/. /var/lib/rspm-offline-data
    

    Confirm that the offline data directory has all the files from the original data directory.

    Finally, modify the permissions on the directory in the offline Package Manager server, changing ownership to the Unix account running Package Manager, rstudio-pm by default:

    Terminal
    $ sudo chown -R rstudio-pm:rstudio-pm /var/lib/rspm-offline-data
    
  4. Next, configure the offline Package Manager server to use the downloaded data. To do so, modify the Package Manager Configuration file to include the following properties in the CRAN and/or Bioconductor configuration section:

    /etc/rstudio-pm/rstudio-pm.gcfg
    [Manifest]
    URL = A URL in the form, `file:///<the directory you created in step 3>`
    

    For example, if your offline data directory is at /var/lib/rspm-offline-data, the file /etc/rstudio-pm/rstudio-pm.gcfg should contain:

    /etc/rstudio-pm/rstudio-pm.gcfg
    [Manifest]
    URL = file:///var/lib/rspm-offline-data
    

    Once the file is updated, restart the Package Manager Service. If the configuration was successful, you should see a message like this in /var/log/rstudio/rstudio-pm/rstudio-pm.log:

    /var/log/rstudio/rstudio-pm/rstudio-pm.log
    Configured to serve CRAN/Bioconductor data from a directory. Checking path '/var/lib/rspm-offline-data'.
    
  5. Follow the Quick Start guide to make CRAN or Bioconductor packages available in the offline Package Manager server. The rspm sync command will now synchronize package data from the offline data directory (e.g., /var/lib/rspm-offline-data) rather than the online Posit Package Service.
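
The copy and confirmation parts of step 3 above can be sketched as two small shell functions. This is a minimal sketch, not an official tool: the function names copy_offline_data and verify_offline_copy are hypothetical helpers, and diff -rq is only one way to confirm that the copy is complete. Because diff reads every file, expect it to take a while on a full CRAN or Bioconductor download.

```shell
#!/usr/bin/env bash
# Sketch of step 3: copy the downloaded data, then confirm the copy is complete.
# Both function names are hypothetical helpers, not part of Package Manager.

# Copy everything under SRC (including hidden files) into DEST,
# creating DEST if it does not exist yet.
copy_offline_data() {
  local src="$1" dest="$2"
  mkdir -p "$dest" && cp -r "$src"/. "$dest"
}

# Succeed only if DEST has the same files and contents as SRC.
# diff -rq prints one line per missing or differing file.
verify_offline_copy() {
  local src="$1" dest="$2"
  diff -rq "$src" "$dest"
}
```

With the example paths from this guide, and run from a shell with sufficient privileges, this would be copy_offline_data /path/to/data /var/lib/rspm-offline-data followed by verify_offline_copy /path/to/data /var/lib/rspm-offline-data. No output and a zero exit status mean that the two directories match; the chown command from step 3 still needs to be run afterwards.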

Regular Updates

It is important to regularly update data available on the offline server. The Posit Package Service is typically updated with new packages each business day.

We recommend using the following steps to keep your offline server up to date:

  1. If you have kept the originally downloaded files, you can perform a relatively fast update by re-running the rspm-offline-downloader command. Subsequent runs add or update only the files that have changed, without re-downloading the entire set.

  2. Copy the directory from the online machine to the directory created in the offline Package Manager during the initial setup, e.g., /var/lib/rspm-offline-data. Ensure that the directory is still owned by the Unix account running Package Manager, rstudio-pm by default.

  3. Once the offline data directory has been updated, the Package Manager server will automatically synchronize the new data during the scheduled syncs. You may also manually synchronize the data by running the rspm sync command.

Note

If you manually update the offline data using an external drive, you can use the --starting-snapshot flag to only download new files since your last synchronization. Use the validate cran or validate bioconductor command in the rspm-offline-downloader tool to ensure that the destination directory is valid.
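
The ownership check in step 2 above can be sketched with the standard find utility. This is a sketch under assumptions: the function name list_not_owned_by is a hypothetical helper, and rstudio-pm is the default account named in this guide, so substitute your own account if it differs.

```shell
#!/usr/bin/env bash
# Sketch: list anything under DIR that is NOT owned by OWNER.
# OWNER may be a user name or a numeric UID. The function name is hypothetical.
list_not_owned_by() {
  local dir="$1" owner="$2"
  find "$dir" ! -user "$owner" -print
}
```

For example, list_not_owned_by /var/lib/rspm-offline-data rstudio-pm prints nothing when every file and directory is owned by rstudio-pm. If it prints any paths, re-run the chown command from the Initial Setup section.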

Upgrading Package Manager

A new version of Package Manager may require data from a new version of the Posit Package Service. To ensure a smooth upgrade with limited downtime, we recommend the following steps:

  1. You will need a staging environment that mirrors your offline production server. After creating this environment, begin by upgrading the offline staging server to the latest Package Manager release.
  2. Follow the instructions in the Initial Setup section, using the offline staging server.
  3. After you have validated that everything works as expected, copy the offline data, e.g., /var/lib/rspm-offline-data, from the offline staging server to the offline production server.
  4. Upgrade the offline production server to the new version of Package Manager.
  5. (Optional) After an upgrade, navigate to the directory storing offline data, e.g., /var/lib/rspm-offline-data. This directory will contain versioned directories, e.g.,

    /v3
    /v4
    

    The output from rspm-offline-downloader get [ cran | bioconductor ] will have indicated the version of the Posit Package Service required by the current version of Package Manager, e.g., Performing full download of schema version v4.

  6. In this example, only the following directories are necessary:

    • /var/lib/rspm-offline-data/v3 (CRAN)
    • /var/lib/rspm-offline-data/bioc/v4 (Bioconductor)
    • /var/lib/rspm-offline-data/sysreqs (System Requirements)
    • /var/lib/rspm-offline-data/distros (Supported binary OSes and R versions)
    • /var/lib/rspm-offline-data/bindex (CRAN Package Binary Index)
    • /var/lib/rspm-offline-data/bin (CRAN Package Binaries)
    • /var/lib/rspm-offline-data/vulns (Vulnerabilities)
  7. The following directories can be removed:

    • /var/lib/rspm-offline-data/v2 (old CRAN data)
    • /var/lib/rspm-offline-data/bioc/v3 (old Bioconductor data)
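
The optional cleanup above can be prepared by listing the versioned directories that are no longer needed. The following is a minimal sketch: the function name list_stale_schema_dirs is a hypothetical helper, and the version to keep is an assumption that you must take from the output of rspm-offline-downloader for your release. It only prints candidates and deletes nothing.

```shell
#!/usr/bin/env bash
# Sketch: print versioned schema directories (v<number>) directly under BASE,
# skipping the one named KEEP. Nothing is removed; review the output first.
# The function name is a hypothetical helper.
list_stale_schema_dirs() {
  local base="$1" keep="$2" dir
  for dir in "$base"/v[0-9]*; do
    # Skip non-directories, including the unexpanded pattern when nothing matches.
    if [ ! -d "$dir" ]; then
      continue
    fi
    if [ "$(basename "$dir")" != "$keep" ]; then
      printf '%s\n' "$dir"
    fi
  done
}
```

Using the example above, list_stale_schema_dirs /var/lib/rspm-offline-data v3 would list old CRAN directories such as v2, and list_stale_schema_dirs /var/lib/rspm-offline-data/bioc v4 would list old Bioconductor directories such as bioc/v3.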