7 Files and Directories

7.1 Changing Ownership

Many of the files and directories mentioned in this chapter are, by default, owned by the rstudio-pm user. If you change the RunAs user for the RStudio Package Manager service, you will need to change ownership of these files and directories. See C for details on changing the RStudio Package Manager service RunAs user.
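After changing the RunAs user, it can help to list the files that are still owned by a different account. The following is a minimal Python sketch, not a tool shipped with RStudio Package Manager. The function name find_not_owned_by is hypothetical, and the paths in the usage comment are the default locations named in this chapter.

```python
import os


def find_not_owned_by(root, uid):
    """Return paths under root (including root itself) not owned by uid."""
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects a symbolic link itself rather than its target.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched


# Example usage (assumes the account exists on this system):
#   import pwd
#   uid = pwd.getpwnam("rstudio-pm").pw_uid
#   for path in find_not_owned_by("/var/lib/rstudio-pm", uid):
#       print(path)
```

The sketch only reports mismatches; changing ownership itself is left to your usual administration tools.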

7.2 Program Files

The RStudio Package Manager installers place all program files into the /opt/rstudio-pm directory.

You should not need to change any files in the /opt/rstudio-pm hierarchy. Any alterations will be overwritten by subsequent re-installs or upgrades of RStudio Package Manager.

7.3 Configuration

The RStudio Package Manager configuration file is /etc/rstudio-pm/rstudio-pm.gcfg. This file is initially owned by rstudio-pm with permissions 0640. You will edit this file to properly configure RStudio Package Manager for your organization.

A configuration management tool like Puppet or Chef can be used to maintain the rstudio-pm.gcfg file. We recommend that it remain owned by rstudio-pm and have permissions 0640, as your configuration may need to contain passwords and other sensitive information.
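If a configuration management tool rewrites the file, you may want to confirm afterwards that the recommended permissions are still in place. The following is a small Python sketch under that assumption; the helper name has_mode is hypothetical and is not part of RStudio Package Manager.

```python
import os
import stat


def has_mode(path, expected=0o640):
    """Return True if the permission bits of path equal expected."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected


# Example usage:
#   has_mode("/etc/rstudio-pm/rstudio-pm.gcfg")
```

The check compares only the permission bits; it does not verify the owner of the file.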

RStudio Package Manager upgrades will not overwrite customizations to the rstudio-pm.gcfg file.

7.4 Server Log

The RStudio Package Manager server log is located at /var/log/rstudio-pm.log. This file is owned by rstudio-pm with permissions 0600.

If logrotate is available when RStudio Package Manager is installed, a logrotate configuration will be installed. The default configuration rotates the log file daily. Rotated log files are stored alongside the original with a numeric extension (.1, .2, etc.). The most recent rotated file, .1, is kept uncompressed; older files are compressed. Most systems use gzip for compression, giving log files with extensions like .2.gz, .3.gz. Logs are retained for 30 days.

The manual for logrotate has more information.

7.5 Access Logs

The RStudio Package Manager HTTP access logs are located at /var/log/rstudio-pm.access.log. This file is owned by rstudio-pm with permissions 0600. Log files are stored in Apache Combined Log Format. See http://httpd.apache.org/docs/2.2/logs.html#combined for a description of this format.
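To illustrate the format, the following Python sketch splits one Combined Log Format line into named fields. It is a simplified example, not a tool provided with RStudio Package Manager: the names COMBINED and parse_line are hypothetical, the sample line in the comment is made up, and the pattern does not handle escaped quotes inside a field.

```python
import re

# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # The size field is "-" when no response body was sent.
    entry["size"] = int(entry["size"]) if entry["size"].isdigit() else 0
    return entry


# Example usage with a made-up line:
#   parse_line('127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] '
#              '"GET /index.html HTTP/1.1" 200 2326 "-" "curl/7.0"')
```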

If logrotate is available when RStudio Package Manager is installed, a logrotate configuration will be installed. The default configuration is to rotate the log file daily. The old log file will be compressed and stored alongside the original log file with a .1.gz extension (then .2.gz, etc.). Logs will be maintained for 30 days.

7.6 Variable Data

RStudio Package Manager manages R packages and repositories. All package source bundles are stored in the server’s data directory. RStudio Package Manager handles incoming requests for packages across repositories. Only a single copy of each package source is stored, even if the package is referenced in multiple repositories.

The RStudio Package Manager data directory also contains information used by the server to manage repositories, including the RStudio Package Manager SQLite databases and encryption key if SQLite is used.

The default location for the RStudio Package Manager data directory is /var/lib/rstudio-pm. This can be customized by specifying an alternate DataDir in the Server section of your configuration file.

; /etc/rstudio-pm/rstudio-pm.gcfg

[Server]
DataDir = /mnt/rstudio-pm

If you customize the RStudio Package Manager data directory, make sure that the rstudio-pm user has permission to read, write, and create directories in the data directory.
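One way to confirm these permissions is to run a short check as the rstudio-pm user. The following Python sketch is an illustration only; the function name can_manage_directory is hypothetical. Note that os.access reports permissions for the user running the script, so the result is meaningful only when the script runs as the service account.

```python
import os


def can_manage_directory(path):
    """Return True if the current user can read, write, and traverse path.

    Write plus execute permission on a directory is what allows creating
    files and subdirectories inside it.
    """
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK | os.X_OK)


# Example usage, run as the rstudio-pm user:
#   can_manage_directory("/mnt/rstudio-pm")
```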

The RStudio Package Manager SQLite databases must exist on local storage. If DataDir is not on local storage but on a networked location over NFS, set Dir in the SQLite section of your configuration file to a directory on local storage.

; /etc/rstudio-pm/rstudio-pm.gcfg

[Server]
DataDir = /mnt/rstudio-pm

[SQLite]
Dir = /var/lib/rstudio-pm/db

7.6.1 Permissions

/var/lib/rstudio-pm is owned by rstudio-pm with permissions 0700.

7.7 Variable Data Classes

All variable data storage locations default to subdirectories of the Server.DataDir setting. There are four classes of variable data, listed below:

  • Cache - Stores data to increase performance for computationally intensive operations. Certain operations, such as Git package building, also temporarily cache data here. Defaults to <DataDir>/cache.
  • Launcher - Stores data for Job Launcher operations. This location currently stores the stdout and stderr data associated with each Git package builder operation. Defaults to <DataDir>/launcher.
  • Metrics - Stores aggregated metrics data to improve Usage Stats performance. Defaults to <DataDir>/metrics.
  • Packages - Stores package tarballs, whether downloaded lazily or eagerly. Defaults to <DataDir>/packages.

You can customize the storage directory for each storage class. For example:

; /etc/rstudio-pm/rstudio-pm.gcfg

[FileStorage "cache"]
Location = /mnt/rstudio-pm-cache

[FileStorage "launcher"]
Location = /mnt/rstudio-pm-launcher

[FileStorage "metrics"]
Location = /mnt/rstudio-pm-metrics

[FileStorage "packages"]
Location = /mnt/rstudio-pm-packages

Again, if you customize any of the RStudio Package Manager storage directories, make sure that the rstudio-pm user has permission to read, write, and create directories in each data directory.
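The relationship between DataDir and the per-class locations can be summarized in a few lines of Python. This is a sketch of the defaulting rule described above, not the server's actual implementation; resolve_locations and STORAGE_CLASSES are hypothetical names.

```python
import posixpath

STORAGE_CLASSES = ("cache", "launcher", "metrics", "packages")


def resolve_locations(data_dir="/var/lib/rstudio-pm", overrides=None):
    """Map each storage class to its directory.

    A class uses its explicit Location override when one is given and
    otherwise falls back to <data_dir>/<class name>.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(STORAGE_CLASSES)
    if unknown:
        raise ValueError("unknown storage class: " + ", ".join(sorted(unknown)))
    return {
        name: overrides.get(name, posixpath.join(data_dir, name))
        for name in STORAGE_CLASSES
    }


# Example usage: only the cache location is customized.
#   resolve_locations("/mnt/rstudio-pm", {"cache": "/mnt/rstudio-pm-cache"})
```

A listing like this is a convenient way to enumerate every directory that needs the permissions described above.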

7.8 Destinations

IMPORTANT: AWS S3 support is in beta. Please do not use S3 for production data at this time.

The four variable storage classes (see section 7.7, above) default to storing data on disk. Each storage class can optionally be configured to store data on S3. For example, to configure all four variable data storage classes for S3, use the following configuration:

; /etc/rstudio-pm/rstudio-pm.gcfg

[Storage]
Cache = s3
Launcher = s3
Metrics = s3
Packages = s3

; Default S3 settings. This is the minimum-required setting for using S3.
[S3Storage]
Bucket = your-s3-bucket

; Override default S3 settings for the "packages" class. This demonstrates
; all the available S3 configuration settings.
[S3Storage "packages"]
Bucket = another-s3-bucket
Prefix = rspm-packages
Profile = dev-rspm
Region = us-west-1
EnableSharedConfig = true

RStudio Package Manager’s AWS S3 support utilizes the AWS S3 SDK, which documents configuration and credential standards for interacting with S3 services.

See chapter 8 for information on configuring your system to use AWS S3.