4 Files & Directories

4.1 Program Files

The RStudio Connect installers place all program files into the /opt/rstudio-connect directory.

You should not need to change any files in the /opt/rstudio-connect hierarchy. Any alterations will be overwritten by subsequent re-installs or upgrades of RStudio Connect.

4.2 Configuration

The main RStudio Connect configuration file is /etc/rstudio-connect/rstudio-connect.gcfg. This file is initially owned by root with permissions 0600. You will edit this file to properly configure RStudio Connect for your organization.

Restart RStudio Connect after altering the rstudio-connect.gcfg configuration file using the instructions in Section 5.1.

Configuration settings marked as “reloadable” do not require a full restart. See Section A to learn which properties are reloadable. You can find a “reload” command for your operating system in Section 5.1.

A configuration management tool like Puppet or Chef can be used to maintain the rstudio-connect.gcfg file. We recommend that it remain owned by root and have permissions 0600, as your configuration may need to contain passwords and other sensitive information.
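
As a quick check of the recommendation above, the following sketch reports the owner and mode of a file. The helper name config_owner_mode is hypothetical, and the stat flags assume GNU coreutils; other platforms may need different options.

```shell
# Hypothetical helper: print "owner mode" for a file, for example "root 600".
# Assumes GNU stat (-c); BSD and macOS stat use different flags.
config_owner_mode() {
    stat -c '%U %a' "$1"
}

# Usage (as root), restoring the recommended ownership and mode first:
#   chown root:root /etc/rstudio-connect/rstudio-connect.gcfg
#   chmod 0600 /etc/rstudio-connect/rstudio-connect.gcfg
#   config_owner_mode /etc/rstudio-connect/rstudio-connect.gcfg
```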

RStudio Connect upgrades will not overwrite customizations to the rstudio-connect.gcfg file. Similarly, the initial installation of RStudio Connect will not overwrite the rstudio-connect.gcfg file if it already exists.

If a new version of RStudio Connect requires changes to the configuration, a separate /etc/rstudio-connect/rstudio-connect-migration.gcfg file may be created automatically with the updated settings. It will have the same permissions on disk as the main configuration file.

Note: If RStudio Connect is using an alternate configuration file or path, the configuration migration is written to a file with the same name as the one currently in use plus the suffix -migration. For example, if the service is started with connect --config /path/to/rsc.gcfg, the migrated settings are written to /path/to/rsc-migration.gcfg. If RStudio Connect cannot create or write to the path where the migrated configuration would be placed, the migration happens in memory only. A different path for the migration file can be specified with the --migration-config option of the connect command.
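
The naming rule in the note above can be sketched as a small shell function. The name migration_path is hypothetical, and the sketch assumes the configuration file name ends in .gcfg; it only illustrates the default naming and does not cover the --migration-config override.

```shell
# Hypothetical helper: derive the default migration file path for a given
# configuration file by inserting "-migration" before the .gcfg extension.
# Assumes the configuration file name ends in ".gcfg".
migration_path() {
    config=$1
    printf '%s-migration.gcfg\n' "${config%.gcfg}"
}

# Usage:
#   migration_path /path/to/rsc.gcfg
#   migration_path /etc/rstudio-connect/rstudio-connect.gcfg
```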

See the configuration appendix A for details about the configuration files, their syntax, the available settings, and the migration process.

4.3 Server Log

The RStudio Connect server log is located at /var/log/rstudio-connect.log. This file is owned by root with permissions 0600.

If logrotate is available when RStudio Connect is installed, a logrotate configuration will be installed. The default configuration rotates the log file daily. Rotated log files are stored alongside the original with a numeric extension (.1, .2, etc.). The most recent rotated file (.1) is kept uncompressed; older files are compressed. Most systems use gzip for compression, giving log files with extensions like .2.gz and .3.gz. Logs are retained for 30 days.

The manual for logrotate has more information.

4.4 Access Logs

The RStudio Connect HTTP access logs are located at /var/log/rstudio-connect.access.log. This file is owned by root with permissions 0600. Log files are stored in Apache Combined Log Format. See http://httpd.apache.org/docs/2.2/logs.html#combined for a description of this format.
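
Because the access log uses the Combined Log Format, standard text tools can summarize it. The sketch below counts responses by HTTP status code. The function name status_counts is hypothetical, and it assumes each request line has the usual three tokens (method, path, protocol), which places the status code in the ninth whitespace-separated field.

```shell
# Count responses by HTTP status code from Combined Log Format lines on stdin.
# Assumes the quoted request line has three tokens (method, path, protocol),
# so the status code is whitespace-separated field 9.
status_counts() {
    awk '{ counts[$9]++ } END { for (s in counts) print s, counts[s] }' | sort
}

# Usage (as root, because the log is owned by root with permissions 0600):
#   status_counts < /var/log/rstudio-connect.access.log
```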

If logrotate is available when RStudio Connect is installed, a logrotate configuration will be installed. The default configuration rotates the log file daily. The old log file is compressed and stored alongside the original log file with a .1.gz extension (then .2.gz, etc.). Logs are retained for 30 days.

If you configure RStudio Connect to write to a different file, you will need to update the logrotate configuration accordingly.

4.5 Audit Log

The RStudio Connect audit log file is optional and, if configured, is located at /var/log/rstudio-connect.audit.log. This file is owned by root with permissions 0600. Audit log files are stored in either CSV or JSON format.

If logrotate is available when RStudio Connect is installed, a logrotate configuration will be installed. The default configuration rotates the audit log file monthly. The old log file is compressed and stored alongside the original log file with a .1.gz extension (then .2.gz, etc.). Logs are retained for the past 12 months.

If you configure RStudio Connect to write to a different file, you will need to update the logrotate configuration accordingly.

4.6 Application Logs

Each process launched by RStudio Connect produces output that is retained within the jobs subdirectory of the RStudio Connect data directory (see Section 4.7 for details). These directories and files are managed by the server. They are retained for 30 days and subsequently removed from the system.

Application logs are available in the RStudio Connect dashboard. The dashboard settings page for deployed content contains a Logs section containing execution details for each launched process. Standard output and standard error are captured and available.

4.7 Variable Data

RStudio Connect manages uploaded Shiny applications, Plumber and TensorFlow APIs, R Markdown documents, and plots. All of the variable data associated with this content is stored within the server’s data directory. This includes:

  • Deployment bundles as uploaded by the user.
  • Directories containing unpacked bundles, including R source code.
  • R packages, as demanded by the deployed code.
  • Rendered R Markdown documents.
  • Application images as uploaded by the user.
  • Git repository clones associated with applications.

The RStudio Connect data directory also contains information used by the server in managing your deployed content. This includes:

  • The RStudio Connect encryption key and SQLite database.
  • Process execution information including logged output.
  • Parameter overrides for R Markdown documents.

The default location for the RStudio Connect data directory is /var/lib/rstudio-connect. This can be customized by specifying an alternate Server.DataDir in your configuration file.

; /etc/rstudio-connect/rstudio-connect.gcfg
[Server]
DataDir = /mnt/rstudio-connect

The RStudio Connect SQLite database must exist on local storage. If Server.DataDir points to a networked location such as an NFS mount rather than local storage, set Database.Dir to a directory on local storage.

; /etc/rstudio-connect/rstudio-connect.gcfg
[Server]
DataDir = /mnt/rstudio-connect

[Database]
Dir = /var/lib/rstudio-connect/db

With the configuration above, all files that would otherwise be stored in the db subdirectory of the data directory (including the encryption key) are stored in the Database.Dir location instead. If you need to keep those other files where they are, for example on shared NFS storage, you can relocate only the SQLite database by using the SQLite.Dir setting instead.

; /etc/rstudio-connect/rstudio-connect.gcfg
[Server]
DataDir = /mnt/rstudio-connect

[SQLite]
Dir = /var/lib/rstudio-connect/db

4.7.1 Relocating Variable Data

Warning: Once RStudio Connect has run for the first time, any change to the settings Server.DataDir, Database.Dir, or SQLite.Dir requires all files under the locations pointed to by these settings to be moved manually. Failure to follow this instruction may lead to errors and potential loss of data.

Here are a few scenarios:

  • If SQLite.Dir is being customized, all files whose names begin with connect should be moved to the new location. All other files, including the encryption .key file, should not be moved.
  • If Database.Dir is being customized, all files that were previously stored in {Server.DataDir}/db should be moved to the new location.
  • If Server.DataDir is being customized, all files under the configured directory should be moved.

Note: The encryption key is especially sensitive to changes to these settings and may prevent RStudio Connect from starting up if it is not moved correctly. If the encryption key is lost, a new one will be created automatically, but all encrypted data will be unrecoverable. See B.2 for more information on how to allow RStudio Connect to run again in this situation using the command rscadmin configure --reset-secret-key.
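
The SQLite.Dir scenario above can be sketched as follows. The helper name move_sqlite_files and the target path in the usage comment are hypothetical. The sketch moves only files whose names begin with connect and leaves everything else, including the encryption .key file, in place. It assumes RStudio Connect is stopped while the files are moved; the ownership and mode of the new directory are not handled here and should be checked separately.

```shell
# Hypothetical helper: move the SQLite database files (names beginning with
# "connect") from the old database directory to a new SQLite.Dir location.
# All other files, including the encryption .key file, are left untouched.
move_sqlite_files() {
    old_dir=$1
    new_dir=$2
    mkdir -p "$new_dir" || return 1
    for f in "$old_dir"/connect*; do
        [ -e "$f" ] || continue   # no matches: the glob stays unexpanded
        mv -- "$f" "$new_dir"/ || return 1
    done
}

# Usage (as root, with RStudio Connect stopped; the target path is an example):
#   move_sqlite_files /var/lib/rstudio-connect/db /srv/connect-sqlite
```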

4.7.2 Permissions

Data directory permissions are established by RStudio Connect as files are created. This section documents the general ownership patterns you will find under the RStudio Connect data directory.

Directories directly accessed from R applications will usually be owned by the Applications.RunAs user. This setting defaults to use an rstudio-connect account created during RStudio Connect installation. The rstudio-connect account has a default primary group also named rstudio-connect. We use the account and group name rstudio-connect throughout this section instead of referencing the property name.

Directories used during metrics collection are owned by the rstudio-connect user (customizable via the Metrics.User setting).

Learn more about customizing metrics collection in Section 20.1.1.

Directories not accessed by R applications or by the monitoring system will be owned by root.

/var/lib/rstudio-connect is owned by root with permissions 0701.

The R subdirectory contains R packages used when content is deployed. The entire R directory hierarchy needs to be owned by rstudio-connect. Files must have 0600 permissions and directories need 0700 permissions.

The packrat subdirectory contains R packages installed on behalf of deployed content. These packages are installed when content is deployed and subsequently used when an application or report executes. The entire packrat directory hierarchy needs to be owned by the rstudio-connect user and the rstudio-connect group. Files must have 0640 permissions while directories need 0750 permissions.

The reports subdirectory is owned by root with 0711 permissions. This contains generated output for report content deployed with source. The reports tree contains a nested directory structure of the form: v2/A_ID/V_ID/R_ID. The A_ID directory corresponds to a content deployment (an R Markdown document). The V_ID directory represents a configuration of that document (a set of parameter values). The R_ID contains a single rendering of that document with the associated parameters. The directories v2, A_ID, and V_ID are all owned by root with 0711 permissions. The final directory, R_ID contains the actual rendered output and is owned by rstudio-connect with 0700 permissions. Files contained in the R_ID directory will have 0600 permissions.

The bookmarks directory contains a bookmarking state subdirectory for each Shiny application. The top-level directory is owned by root with 0711 permissions. Each bookmarks/A_ID subdirectory is owned by rstudio-connect and the rstudio-connect group with 0770 permissions.

Learn more about server-stored Shiny bookmarking state in this article.

The apps directory contains directories for each deployment. The top-level directory is owned by root with 0711 permissions. The first level of the apps hierarchy is a directory for each content deployment. These apps/A_ID directories are owned by rstudio-connect with 0700 permissions.

Beneath each apps/A_ID directory is a set of directories for each deployed bundle. The ownership and permissions for this hierarchy depend on whether or not the content is configured with a custom RunAs setting. Without a custom RunAs setting, permissions are simple: owned by rstudio-connect with directories having 0700 and files having 0600 permissions.

Learn more about using a custom RunAs in Section 14.5.

RStudio Connect needs a more complicated permission structure when content is configured with a custom RunAs setting. This is because the rstudio-connect user (Applications.RunAs) is used to install the necessary packages while the content-specific custom RunAs is used when running the deployed R code. The apps/A_ID/B_ID and reports/v2/A_ID/V_ID/R_ID directories are owned by the custom RunAs with group ownership set to rstudio-connect. Permissions on these directories are 0750. The packrat subdirectory is owned by rstudio-connect with group ownership of rstudio-connect. This directory and its subdirectories have 0750 permissions while files have 0640 permissions. Other than the packrat directory, all files underneath apps/A_ID/B_ID and reports/v2/A_ID/V_ID/R_ID have 0600 permissions and directories are given 0700.

All other data subdirectories are owned by root with 0700 permissions.
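
To spot-check the modes described in this section, a short find invocation can list entries that deviate from the expected values. The helper name mode_mismatches is hypothetical. The sketch only reports mismatches in mode and changes nothing; it does not check ownership.

```shell
# Hypothetical helper: list entries under a directory whose mode differs from
# the expected one. Nothing is modified.
#   $1 = directory, $2 = expected file mode, $3 = expected directory mode
mode_mismatches() {
    find "$1" \( -type f ! -perm "$2" \) -o \( -type d ! -perm "$3" \)
}

# Usage (as root), using the modes documented above:
#   mode_mismatches /var/lib/rstudio-connect/R 0600 0700
#   mode_mismatches /var/lib/rstudio-connect/packrat 0640 0750
```

An empty result means every file and directory under the given path has the expected mode.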

4.8 Backups

We recommend including the RStudio Connect configuration file in /etc/rstudio-connect as well as the variable data directory which defaults to /var/lib/rstudio-connect in your system backups. If you have configured the database to be stored outside the data directory, ensure that it is also included in the backup.

A running RStudio Connect server may be writing into the data directory if there are any active deployments, applications or documents. You should stop the RStudio Connect server before taking a backup.

sudo systemctl stop rstudio-connect
# Run appropriate backup steps here.
sudo systemctl start rstudio-connect

Your platform may use alternate commands to restart RStudio Connect. Please see Section 5.1 for instructions specific to your operating system version.
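
As one possible form of the backup step, the sketch below archives a set of directories with tar, which records ownership and mode bits in the archive. The helper name backup_connect and the archive location in the usage comment are hypothetical; substitute your own backup tooling and paths as appropriate.

```shell
# Hypothetical helper: archive the given directories into a gzip-compressed
# tarball. tar records ownership and mode bits for each entry.
#   $1 = archive path, remaining arguments = directories to include
backup_connect() {
    archive=$1
    shift
    tar -czf "$archive" "$@"
}

# Usage (as root; the archive path is an example):
#   systemctl stop rstudio-connect
#   backup_connect /backups/connect-$(date +%F).tar.gz \
#       /etc/rstudio-connect /var/lib/rstudio-connect
#   systemctl start rstudio-connect
```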

4.9 Server Migrations

There are a number of factors that must be considered before migrating your RStudio Connect installation from one server to another. We recommend making as few changes as possible during the initial migration. If, for instance, you will be migrating to a new server, upgrading to a new default version of R, and altering the default Applications.RunAs user, complete the migration first. Then upgrade R and alter the Applications.RunAs user in subsequent steps.

In order to migrate a server, you will follow the same steps as when you perform a backup in order to obtain a consistent copy of the data in the necessary directories. These directories can then be copied to the new server.

  1. Install RStudio Connect on the new server, then stop the service. RStudio Connect v1.5.6 introduced features that make server migrations more reliable; migrating servers in older versions is not supported, so the new server should have v1.5.6 or later.
  2. Mirror the Unix accounts used by RStudio Connect on the existing server to the new server. Consider the Applications.RunAs user and any other users that might have been selected as the user responsible for running any content on the server. These Unix accounts must all exist on the new server and continue to be members of the default Applications.RunAs user’s primary group as discussed in 14.5.
  3. Copy the config and data directories while preserving the permissions and file ownership. Not all file transfer clients are able to preserve these attributes, so consider using rsync with the -a flag to copy the data. Bear in mind that some content may have settings that alter how its files are stored on disk (for instance, a Shiny application that customizes the user account that runs its processes), so it is critical that ownership and permissions be preserved exactly during the migration.
  4. Update your /etc/rstudio-connect/rstudio-connect.gcfg file if you’ve changed settings like the path to your data directory.
  5. Sanity-check the permissions and ownership of the content working directories using the migrate repair-content-permissions command as documented in B.4.
  6. Install the same version(s) of R on the new server to mimic existing behavior. If you need additional versions or support for multiple versions of R, please see 16.
  7. On the new server, install any system dependency that may be used by an R package on the existing server. A list of recommended packages is available in 2.3. Whichever packages you chose to install on your existing server to support the R packages that users have deployed should also be installed on the new server. Otherwise, RStudio Connect will not be able to rebuild users’ deployed packages.
  8. Run migrate rebuild-packrat --force to delete the Packrat cache and rebuild it. This cache likely includes binaries that were compiled against particular versions of libraries on your existing system. Be aware that this step may take a very long time (easily multiple hours for large deployments with lots of content). We recommend starting the rebuild before you start RStudio Connect; you do not need to wait for it to finish and can start RStudio Connect once the rebuild is under way. If any application or report is executed in the meantime, the packrat directory for that application will be rebuilt at runtime.
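
The copy in step 3 can be planned with a sketch like the one below, which prints the rsync commands for review rather than running them. The helper name print_copy_plan, the root@ login, and the host name in the usage comment are assumptions. The -a flag is rsync's archive mode, which preserves permissions, ownership, and timestamps; ownership is only restored when the receiving side runs as root.

```shell
# Hypothetical helper: print the rsync commands that would copy the
# configuration and data directories to a new host. Nothing is executed.
#   $1 = new host, $2 = data directory (defaults to /var/lib/rstudio-connect)
print_copy_plan() {
    new_host=$1
    data_dir=${2:-/var/lib/rstudio-connect}
    for dir in /etc/rstudio-connect "$data_dir"; do
        # Trailing slashes copy the directory contents into the same path.
        printf 'rsync -a %s/ root@%s:%s/\n' "$dir" "$new_host" "$dir"
    done
}

# Usage (the host name is an example):
#   print_copy_plan new-server.example.com
#   print_copy_plan new-server.example.com /mnt/rstudio-connect
```

Run the printed commands as root on the existing server, with RStudio Connect stopped on both servers.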

If you are also migrating to a different database provider, see 10.3.