E RStudio Connect Deployment Guide

E.1 Overview

This guide covers the details of the deployment process in RStudio Connect. Most users can safely ignore these details, because push-button publishing handles them automatically. However, some users may want to publish content programmatically using the rsconnect package, or may have run into an error during deployment.

E.2 Programmatic Deployment

To programmatically publish content to RStudio Connect, use the functions deployDoc, deployApp, deployAPI, and deploySite from the rsconnect package. Each of these functions requires a user account and a connected server. To set up an account on a server, use addConnectServer and connectUser. To view the currently configured accounts, use accounts. For more details, visit the rsconnect reference pages.
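As a minimal sketch of that workflow, the two helpers below wrap the rsconnect calls named above. The helper names, the server URL, the server name, and the account name are hypothetical placeholders. connectUser is assumed to need an interactive session for authentication, so the setup helper is meant to be run by hand once per machine.

```r
# Hypothetical helper: one-time registration of a Connect server and account.
# The server URL, server name, and account are placeholders supplied by the caller.
setup_connect_account <- function(server_url, server_name, account) {
  rsconnect::addConnectServer(url = server_url, name = server_name)
  rsconnect::connectUser(account = account, server = server_name)
  # Return the configured accounts so the result can be inspected.
  rsconnect::accounts()
}

# Hypothetical helper: deploy the content in `app_dir` under a given account.
deploy_to_connect <- function(app_dir, server_name, account) {
  rsconnect::deployApp(appDir = app_dir, account = account, server = server_name)
}

# Example usage (not run here; requires a reachable Connect server):
# setup_connect_account("https://connect.example.com", "myConnect", "user1")
# deploy_to_connect("targetDir", "myConnect", "user1")
```

Nothing is contacted until one of the helpers is called, so the definitions can be sourced safely in a script.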

Each of the deployment functions listed above can be supplied with optional arguments. If additional arguments are not supplied, defaults are determined based on the content being deployed. All of the deployment functions follow a similar, underlying process. This appendix explains the process in detail.

E.3 Step 1: Building the Bundle

rsconnect builds an application bundle for the content being deployed. The bundle contains the source code, any data files, and a manifest (a JSON file) with metadata about the bundle and its environment.

E.3.1 Application Metadata

rsconnect infers a number of attributes about the content including:

  1. appMode: static, shiny, rmd-static, rmd-shiny, api
  2. hasParameters: whether or not the R Markdown file includes parameters

For an R Markdown document, the YAML header is parsed. Otherwise, .R files are flagged as Shiny applications, and HTML and PDF files are flagged as static. (When a plot is published, the plot is wrapped in an HTML file.)
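The classification rules above can be sketched in a few lines of base R. This is a simplified illustration, not rsconnect's actual implementation: the function names are hypothetical, the YAML header is inspected with a regular expression rather than a real YAML parser, and the api mode is not handled.

```r
# Hypothetical helper: TRUE if the YAML header of an R Markdown file
# (given as a character vector of lines) declares runtime: shiny.
rmd_uses_shiny <- function(lines) {
  fences <- which(trimws(lines) == "---")
  if (length(fences) < 2 || fences[1] != 1) return(FALSE)
  yaml <- lines[fences[1]:fences[2]]
  any(grepl("^runtime:[[:space:]]*shiny", yaml))
}

# Hypothetical helper: guess the appMode from the file name and, for
# R Markdown, from the document's lines.
infer_app_mode <- function(file_name, lines = character()) {
  ext <- tolower(tools::file_ext(file_name))
  if (ext == "rmd") {
    return(if (rmd_uses_shiny(lines)) "rmd-shiny" else "rmd-static")
  }
  switch(ext,
    r = "shiny",
    html = ,
    htm = ,
    pdf = "static",
    NA_character_
  )
}

mode_app <- infer_app_mode("app.R")
mode_rmd <- infer_app_mode("report.Rmd",
                           c("---", "title: Report", "runtime: shiny", "---"))
mode_plot <- infer_app_mode("plot.html")
```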

E.3.2 List of Target Files

Next, rsconnect identifies the relevant files for the application. appFiles or appFileManifest can be passed as arguments to deployApp to specify the required files. Otherwise, rsconnect attempts to identify the required files using a number of heuristics.

For R Markdown documents and static HTML files, external dependencies are discovered using the rmarkdown function find_external_resources. This function searches for dependencies in the R Markdown file and the rendered HTML file. It can identify files in the YAML header (if a parameter is a file), logos, images, data files used within R code chunks, and HTML dependencies. This process includes a minimal, client-side “render” of the document: the Rmd is not fully rendered, but is converted to plain markdown and then rendered to HTML without running any R code. Think of this rendering as creating a skeleton of the final HTML document. During push-button deployment, this initial “render” will show up in the IDE R Markdown tab.

The dependencies for R Markdown websites are identified separately. Websites should be deployed by calling deploySite.

Troubleshooting: To avoid client-side rendering, deploy the content directly using deployApp with appFiles or appFileManifest.

For Shiny applications and Plumber APIs, rsconnect adds all the files in the project directory and its subdirectories, with a few exceptions: the .Rproj file, the packrat directory, and the rsconnect directory. Files are added up to the specified maximum bundle size: getOption("rsconnect.max.bundle.size").

Troubleshooting: Try rsconnect::listBundleFiles(appDir) to see the files identified for the bundle.
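The file selection for Shiny applications and Plumber APIs can be approximated in base R as follows. This is a sketch of the rules described above, not the rsconnect implementation: list_candidate_files is a hypothetical helper, the size cap is passed in by the caller, and the exclusions are limited to the three exceptions named in the text.

```r
# Hypothetical helper: list the files under `app_dir`, skipping the .Rproj
# file and the packrat and rsconnect directories, and fail if the total
# size exceeds `max_bytes`.
list_candidate_files <- function(app_dir, max_bytes = Inf) {
  files <- list.files(app_dir, recursive = TRUE)
  top_level <- vapply(strsplit(files, "/", fixed = TRUE), `[`, character(1), 1)
  keep <- !(top_level %in% c("packrat", "rsconnect")) & !grepl("\\.Rproj$", files)
  files <- files[keep]
  total <- sum(file.size(file.path(app_dir, files)))
  if (total > max_bytes) stop("bundle exceeds the maximum size")
  files
}

# Build a throwaway project that mirrors the layout used in this appendix.
app_dir <- tempfile("targetDir")
dir.create(file.path(app_dir, "dataDir"), recursive = TRUE)
dir.create(file.path(app_dir, "packrat"))
dir.create(file.path(app_dir, "rsconnect"))
writeLines("library(shiny)", file.path(app_dir, "app.R"))
writeLines("x,y", file.path(app_dir, "dataDir", "data.csv"))
writeLines("Version: 1.0", file.path(app_dir, "project.Rproj"))
writeLines("", file.path(app_dir, "packrat", "packrat.lock"))
writeLines("", file.path(app_dir, "rsconnect", "record.dcf"))

candidate_files <- list_candidate_files(app_dir)
```

In this example only app.R and dataDir/data.csv remain as candidates.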

E.3.3 Lint

After identifying the target file and dependency files, rsconnect applies a series of linters. The rsconnect linters attempt to identify common problems that might prevent an application that works locally from working after deployment. These checks ensure the application code does not contain:

  1. absolute paths
  2. invalid relative paths
  3. inconsistent capitalization among paths (the Connect server has case-sensitive file paths)

The linters currently do not check for database connections.

Troubleshooting: You can disable the linters by passing lint=FALSE to the deployment function.
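To illustrate the kind of check a linter performs, the sketch below flags lines that contain a quoted absolute path. It is a deliberately crude stand-in for the first check in the list above, not the rsconnect linter itself: find_absolute_paths is a hypothetical helper, and the pattern will also flag harmless strings such as a lone "/".

```r
# Hypothetical helper: return the indices of lines that contain a quoted
# string starting with "/", "~", or a Windows drive letter such as "C:/".
find_absolute_paths <- function(code_lines) {
  pattern <- "[\"'](/|~|[A-Za-z]:[/\\\\])[^\"']*[\"']"
  which(grepl(pattern, code_lines, perl = TRUE))
}

code_lines <- c(
  'babies <- read.csv("/home/user1/data.csv")',  # absolute path: flagged
  'babies <- read.csv("dataDir/data.csv")',      # relative path: fine
  'logo <- "C:/images/logo.png"'                 # absolute path: flagged
)
flagged <- find_absolute_paths(code_lines)
```

Here lines 1 and 3 are flagged, and the relative path on line 2 passes.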

E.3.4 Create Temporary Folder

If the files pass the linters, rsconnect creates the initial bundle by copying all of the files to a temporary directory.

E.3.5 Library Dependencies

Next, rsconnect attempts to identify the package dependencies required by the app. (This step is skipped for static content.) rsconnect does this by using packrat. Packrat is a dependency management tool for R designed to keep projects isolated, portable, and reproducible. rsconnect deployment does not use all of packrat’s functionality. (For example, the package sources are not installed on the client in the project’s packrat subdirectory.) For more information visit: https://github.com/rstudio/packrat

Packrat looks through the R code and makes note of any library() or require() calls. Packrat creates a list of the required packages and saves the list in the packrat.lock file. This lock file includes the package version and package dependencies. This process is recursive. In addition, the lock file also includes information on the version of R being used, the type of repository containing the package, and the specific URI for each type of repository. A few notes about this process:

Packrat searches in the order of .libPaths

For example, if the code includes library(babynames), Packrat will look for babynames inside the first library in .libPaths(). Imagine there are two libraries, A and B, and .libPaths() returns them in the order A, B. In A, babynames is version 1.0. In B, babynames is version 2.0. Packrat will assume the app depends on version 1.0. To understand this behavior, recall that a library is just a folder containing installed R packages. The most common scenario where this occurs is when the target directory is part of an existing packrat project.
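The first-match behavior can be reproduced with two throwaway libraries. This sketch assumes nothing about packrat's internals: first_installed_version and make_fake_library are hypothetical helpers, and the "installed packages" are just folders holding a minimal DESCRIPTION file.

```r
# Hypothetical helper: return the version of `pkg` from the first library
# in `libs` that contains it, searching in the order given.
first_installed_version <- function(pkg, libs = .libPaths()) {
  for (lib in libs) {
    desc <- file.path(lib, pkg, "DESCRIPTION")
    if (file.exists(desc)) {
      return(unname(read.dcf(desc, fields = "Version")[1, 1]))
    }
  }
  NA_character_
}

# Hypothetical helper: create a library folder containing one fake package.
make_fake_library <- function(pkg, version) {
  lib <- tempfile("lib")
  dir.create(file.path(lib, pkg), recursive = TRUE)
  writeLines(c(paste("Package:", pkg), paste("Version:", version)),
             file.path(lib, pkg, "DESCRIPTION"))
  lib
}

lib_a <- make_fake_library("babynames", "1.0")
lib_b <- make_fake_library("babynames", "2.0")

found_ab <- first_installed_version("babynames", c(lib_a, lib_b))
found_ba <- first_installed_version("babynames", c(lib_b, lib_a))
```

Swapping the order of the two libraries changes which version is found, which is why the order of .libPaths() matters at deployment time.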

Repositories

Most packages come from CRAN. In the packrat lockfile, packrat will record the names of packages originating from CRAN as well as a specific URL for CRAN (e.g. CRAN="https://cran.rstudio.com"). The URL is determined by the state of options("repos") during deployment. The same process is used for other repositories: GitHub, Bioconductor, and local repositories. In the case of a local repository, the repository URI may be a location on disk.

For the edge case of an internal package from a local repository, be sure the package’s Repository field (found in the package’s DESCRIPTION file) is mapped to a repo URI in the current options("repos"). For example, imagine a package called myPackage is stored in a local repo called myRepo. The myPackage DESCRIPTION file should include Repository: myRepo. options("repos") should define a URI for myRepo when the deployment runs, e.g. options(repos = list(myRepo="file://path_to_private_repo")).
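A short sketch of that configuration follows. The repository name and path are the placeholders from the example above, the CRAN URL is the one mentioned earlier in this section, and the repos option is set here as a named character vector, which is an assumption about the most common form rather than a requirement stated in this guide.

```r
# Set the repositories for this session, keeping the previous values so
# they can be restored afterwards.
old_options <- options(repos = c(
  CRAN   = "https://cran.rstudio.com",
  myRepo = "file://path_to_private_repo"  # placeholder path to the local repo
))

# Inspect what a deployment started now would see.
repos <- getOption("repos")
print(repos)

# Restore the previous repository settings.
options(old_options)
```

Checking getOption("repos") just before deploying is a quick way to confirm that every Repository name used by an internal package has a matching URI.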

Troubleshooting: Try rsconnect:::performPackratSnapshot(appDir). This command creates the packrat lock file, which helps to identify the dependencies, corresponding repos, and URLs expected for deployment.

Once the lock file is created, rsconnect copies the DESCRIPTION files for all of the packages listed in the packrat lock file. The files are copied into packrat/desc. Normally, a packrat lockfile would be enough to fully reproduce the package environment. This additional step is necessary in case the version of packrat on the client is significantly different from the version on the server.
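To make the dependency scan described earlier in this section more concrete, here is a deliberately simplified sketch that pulls package names out of library() and require() calls. It is not how packrat is implemented: find_library_calls is a hypothetical helper, it uses a regular expression instead of parsing the code, it finds at most one call per line, and it does not follow the recursive dependencies that the lock file records.

```r
# Hypothetical helper: extract package names from library() and require()
# calls in a character vector of code lines.
find_library_calls <- function(code_lines) {
  pattern <- "(?:library|require)\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)[\"']?"
  hits <- regmatches(code_lines, regexec(pattern, code_lines, perl = TRUE))
  # Each hit holds the full match followed by the captured package name.
  pkgs <- vapply(hits[lengths(hits) > 0], function(m) m[2], character(1))
  unique(pkgs)
}

app_code <- c(
  "library(babynames)",
  "library(shiny)",
  "top_names <- head(babynames)"
)
required_packages <- find_library_calls(app_code)
```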

E.3.6 Manifest

Next, rsconnect generates the actual manifest. This manifest includes a list of the relevant source code, package dependencies, and other metadata including the R version, the locale, the app mode, content category, etc. The R version is determined while building the manifest. The R version listed in the manifest will later be used by Connect to attempt to re-create a server-side environment consistent with the client. While creating the manifest, rsconnect will also attempt to determine the primary document (if not already listed). Checksums are stored for each file, including the packrat description files. Finally, the manifest is copied to the temporary bundle directory alongside the code and packrat directory.

For example, a target directory with the structure:

targetDir
- app.R
+ dataDir
  - data.csv

where app.R includes:

library(babynames)
library(shiny)

The final bundle will contain:

bundleDir
  - app.R
  - manifest.json
  - index.htm
  + dataDir
    - data.csv
  + packrat
    - packrat.lock
    + desc
      - babynames
      - shiny
      ...

The manifest.json file will include:

{
    "version" : 1,
    "locale" : "en_US",
    "platform" : "3.2.5",
    "metadata" : {
        "appmode" : "shiny",
        "primary_rmd" : null,
        "primary_html" : null,
        "content_category" : "application",
        "has_parameters" : false
    },
    "packages" : {
        ...
    },
    "files" : {
        "app.R" : {
            "checksum" : "bc81fad5645566fe5d228abf57bba444"
        },
        "packrat/desc/babynames" : {
            "checksum" : "ee14db463dc57f078fea1c3d74628104"
        },
        ...
    }
}

The packages entry will contain a version of each package’s DESCRIPTION file. The files entry will include a checksum for each package description file.

Troubleshooting: Try rsconnect::bundleApp(appDir, appFiles=rsconnect::listBundleFiles(appDir), ...). This command generates a tarball containing the application bundle.

E.4 Step 2: Push Bundle to Connect

In step 2, rsconnect publishes the bundle to the server. This is done with a POST request to an HTTP endpoint. rsconnect supports multiple protocols for making HTTP requests. rsconnect looks for the server address and account information created when the IDE is linked to Connect. Publisher privileges are required for a user to link the IDE to Connect and publish content. These privileges are checked when the user sets up an account for publishing (this process creates a public-private key pair unique to the user and Connect server).

Troubleshooting: Try rsconnect::accounts()

When an application bundle is successfully deployed, rsconnect generates a folder called rsconnect in the original target directory. This folder contains a DCF file with information on the deployed content (i.e. the name, title, server address, account, URL, and time). If you re-deploy the same directory, rsconnect checks for this file, which allows the deployed content to be updated. Redeployments will deploy and activate the new bundle for this application. You may use the “source versions” menu option in the dashboard to revert the application to a previous bundle. Redeployment will only work if the document is the same content type. For instance, you cannot redeploy an R Markdown document after adding runtime: shiny. Instead, deploy the document to a new endpoint by changing the appName.
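Because the deployment record is a DCF file, it can be read with base R. The sketch below writes and then reads a made-up record. The file name, field names, and values are assumptions chosen to mirror the list above, not the exact contents rsconnect writes.

```r
# Create a stand-in for the rsconnect folder with one hypothetical record.
record_dir <- tempfile("rsconnect")
dir.create(record_dir)
record_file <- file.path(record_dir, "myApp.dcf")

fields <- c(
  name    = "myApp",
  title   = "My App",
  server  = "connect.example.com",
  account = "user1",
  url     = "https://connect.example.com/content/1/",
  when    = "2017-01-01"
)
write.dcf(t(fields), file = record_file)

# read.dcf returns a character matrix with one row per record and one
# column per field.
record <- read.dcf(record_file)
deployed_account <- record[[1, "account"]]
deployed_url <- record[[1, "url"]]
```

For real deployments, the rsconnect package also exports a deployments function that reports the records it finds for an application directory. Check its reference page for the exact arguments.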

Currently, each deployed application is tied to an account. For example, imagine user1 deploys an app and shares the code with user2. If user2 deploys the app, a new copy of the app would be deployed. This is true even if user1 shares the rsconnect folder. (The only way for a different collaborator to deploy to the same app is for both collaborators to use a service account where the username and password are shared by both users. Both users would also need to go through the steps that link the IDE to Connect, which generate the public-private key pair.)

In some cases, a single user will have multiple accounts on one server, or an account on multiple servers. To deploy a bundle to a different server or under a different account, specify the account and server parameters in the deployApp function. After successful deployment, a new DCF file will be added to the rsconnect folder. If you deploy the same content from a new machine to the server, using the same account, rsconnect will prompt you asking whether or not the content is a redeploy. This occurs even if the rsconnect folder does not exist on the new machine.

E.5 Step 3: Bundle Is Deployed on Connect

Once the bundle is published to the server, Connect prepares the content to be deployed. This process follows a number of steps:

E.5.1 Parse the Manifest

The bundle is uncompressed at a unique location (assigned based on the app ID and bundle ID). The manifest from the uncompressed bundle is parsed to determine the type of content. The R version is also identified and matched against the R versions available on the Connect server. See the R version matching documentation for more details. Files are checked against the checksums listed in the manifest to ensure content was not lost or corrupted during transfer.
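The checksum comparison can be sketched in base R. This is an illustration of the idea, not Connect's server-side code: verify_checksums is a hypothetical helper, the manifest is represented as a plain R list shaped like the files entry shown earlier, and MD5 is an assumption based on the 32-character checksums in the sample manifest.

```r
# Hypothetical helper: compare each file in the unpacked bundle against the
# checksum recorded for it. Missing files count as failures.
verify_checksums <- function(bundle_dir, manifest_files) {
  actual <- tools::md5sum(file.path(bundle_dir, names(manifest_files)))
  expected <- vapply(manifest_files, function(f) f$checksum, character(1))
  ok <- !is.na(actual) & unname(actual) == unname(expected)
  names(ok) <- names(manifest_files)
  ok
}

# Build a one-file bundle and record its checksum, as the client would.
bundle_dir <- tempfile("bundleDir")
dir.create(bundle_dir)
app_file <- file.path(bundle_dir, "app.R")
writeLines("library(shiny)", app_file)
manifest_files <- list(
  "app.R" = list(checksum = unname(tools::md5sum(app_file)))
)

intact <- verify_checksums(bundle_dir, manifest_files)

# Simulate corruption during transfer and check again.
writeLines("library(shiny) # altered", app_file)
corrupted <- verify_checksums(bundle_dir, manifest_files)
```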

E.5.2 Packrat Restore

Packrat is used to ensure the required packages are available. For every package identified in the manifest:

Packrat checks to see if the required package is available in the global cache. (The cache is specific to the version of R matched previously).

If the package is available, a symlink is created that points to the package within the global cache. If a symlink is not possible, the package will be copied from the global cache.

If the package is not available, packrat attempts to install the package. The package is requested from the repo URL identified during bundling. The package is built from source and installed, and the installed package is added to the global cache.
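The three cases above (cache hit with a symlink, fallback to a copy, and install on a cache miss) can be sketched as follows. This models the decision logic only and is not packrat's implementation: restore_package and fake_install are hypothetical, and the "install" step just creates a folder with a DESCRIPTION file.

```r
# Hypothetical helper: make `pkg` available in `lib_dir`, installing it into
# the cache first if it is not already there.
restore_package <- function(pkg, cache_dir, lib_dir, install_fun) {
  cached <- file.path(cache_dir, pkg)
  status <- "cache hit"
  if (!dir.exists(cached)) {
    install_fun(pkg, cached)
    if (!dir.exists(cached)) stop("installation failed for ", pkg)
    status <- "installed"
  }
  # Prefer a symlink into the cache; fall back to copying the package.
  linked <- suppressWarnings(file.symlink(cached, file.path(lib_dir, pkg)))
  if (!linked) file.copy(cached, lib_dir, recursive = TRUE)
  paste(status, if (linked) "symlinked" else "copied", sep = ", ")
}

# Hypothetical stand-in for building and installing a package from source.
fake_install <- function(pkg, dest) {
  dir.create(dest, recursive = TRUE)
  writeLines(paste("Package:", pkg), file.path(dest, "DESCRIPTION"))
}

cache_dir <- tempfile("cache")
lib_dir <- tempfile("lib")
dir.create(cache_dir)
dir.create(lib_dir)
fake_install("shiny", file.path(cache_dir, "shiny"))  # already cached

status_shiny <- restore_package("shiny", cache_dir, lib_dir, fake_install)
status_babynames <- restore_package("babynames", cache_dir, lib_dir, fake_install)
```

Whether the result is a symlink or a copy depends on the platform and file system, so the sketch reports both outcomes rather than assuming one.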

Many R packages have system-level dependencies (Java, openssl, etc). If the package fails to install, be sure these system dependencies are installed and available.

All packages are installed as the default Applications.RunAs user (typically rstudio-connect). Connect ensures that the package libraries and uncompressed bundle have the appropriate permissions based on the application specific RunAs user.

E.5.3 R Markdown Render

If the deployed content is an R Markdown document (excluding documents with runtime: shiny), the Rmd file is rendered on the server. If the document is parameterized, the default parameters are used.

The application is presented as deployed. User input is currently required to publish the application and specify any server-side attributes (such as tuning runtime settings, permissions, etc).

E.6 Other Frequently Asked Questions

  1. My app deployed but does not run?

If the application is deployed but does not run, the error message will be caught and displayed in the application log (visible at the app URL in Connect, in the logs panel).

  2. Can I get more details about the deployment failure?

Yes, set the option “Show diagnostic information after publishing” in Tools -> Global Options -> Publishing.

  3. Will database connections work once deployed?

Database connections will only work if the same drivers (and potentially DSNs) are available on the client and on Connect. At this time there is not a linter to check for connection strings.

  4. I use a specific distribution of R (e.g. MRO). Will matching work?

The version of R written to the manifest will be the version used during runtime.

On the server side, Connect attempts to match the version of R in the manifest, as described in the R version matching documentation.

Currently, Connect matches based on the version only; no other supplemental information (such as distribution) is maintained. For that reason, to ensure a specific distribution is used on the server, install only that distribution for the desired version.

  5. Are bundles compressed?

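Yes. Bundles are compressed: rsconnect generates a tarball, and Connect uncompresses it on the server. Bundles do not need to be read completely into RAM during deployment. Typically the only bottleneck is upload speed. The maximum bundle size is controlled by the rsconnect.max.bundle.size option: set it with options() and check the current value with getOption("rsconnect.max.bundle.size").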