9 Database

RStudio Connect supports multiple database options. Currently, the supported databases are SQLite and PostgreSQL.

Customize the Database.Provider property with a database scheme appropriate for your organization. See Section A.7 for details

Here is a partial configuration which chooses to use SQLite

Provider = sqlite

9.1 SQLite

SQLite is the default database provider.

RStudio Connect will use SQLite database if the Database.Provider setting has a value of sqlite or if Provider is not present in the configuration file.

Provider = sqlite

You can also specify the directory to store the SQLite file on your file system. This can be done by specifying SQLite.Dir in the configuration file.

Dir = /mnt/connect/sqlite

If this field is not specified, it will default to {Server.DataDir}/db. This location must exist on local storage.

If the location for Server.DataDir is not local storage but a networked location over NFS, configure the SQLite.Dir setting so it still resides on some local volume.

9.1.1 SQLite Backups

RStudio Connect can be configured to periodically back up its database while running.

Backup = true
BackupFrequency = 24h
BackupRetentionLimit = 7

The above config will execute an online backup every 24 hours. RStudio Connect will retain up to 7 of those backups. If an eigth backup is created, the oldest of the previous backups will be deleted. This provides a grace period for an administrator implementing a data retention policy, such as a practice of copying backups to tape periodically.

Backups are stored in the same directory as the main database file: /var/lib/rstudio-connect/db by default. Backups are lexically sortable, because they are timestamped with the UNIX epoch time padded to 11 digits. For example, if the server’s database file is /var/lib/rstudio-connect/db/connect.db, a backup of that database could be /var/lib/rstudio-connect/db/connect.db.01508526538.

The same process is applied to the instrumentation database file. It will be backed up based on the same settings as for the main database file.

Automatic SQLite backups are not a complete backup solution for RStudio Connect. You should also make regular backups of the Server.DataDir directory. This is especially important because the Server.DataDir directory is expected to be kept in sync with the database.

Restoring a SQLite backup is straightforward:

  • Ensure that the backup is valid with the command sqlite3 <backup file name> "PRAGMA integrity_check;"
  • Stop the RStudio Connect service
  • Copy the current database file as well as its .wal file, if any. (If you wish to analyze them or send a copy to RStudio Support)
  • Replace the current database file with the backup
  • If they exist, delete any .wal and .shm files associated with the previous database. Failure to do this could lead to further downtime and possible data corruption.
  • Start the RStudio Connect service

Note also that RStudio Connect has no way of restoring applications deployed or changes made since the last backup. Restoring the backup file will cause these changes to be lost permanently.

9.2 PostgreSQL

PostgreSQL is an available database provider which is more powerful and performant than SQLite.

You must provide your own Postgres server which will likely be a separate box from your RStudio Connect server (but not required). We currently support any 9.x or 10.x version greater than or equal to 9.2. Your Postgres server does not have to be dedicated to RStudio Connect, but it must have its own dedicated database.

To use Postgres, select it as your provider with Database.Provider = postgres. You will also need to provide a fully qualified Postgres URL in Postgres.URL. The user credentials supplied in this URL must be able to create and alter database tables, in addition to read/write permissions in the database referenced as the path of the URL. Please ensure that you have already created a blank database with the name given.

9.2.1 SSL

RStudio Connect assumes by default that SSL is enabled on the Postgres server. If you do not have SSL enabled on your Postgres database, you should add ?sslmode=disable to the Postgres.URL.

Here is an example configuration that has Connect using a Postgres database with SSL connections:

Provider = postgres

URL = "postgres://username:password@db.seed.co/connect"

Here is an example configuration that has Connect using a Postgres database without SSL connections:

Provider = postgres

URL = "postgres://username:password@db.seed.co/connect?sslmode=disable"

9.2.2 Schemas

If you need to, you can tell Connect to restrict itself to keeping its tables within a specific schema. You control this by giving PostgreSQL a search path as part of the URL by adding options=-csearch_path=<schema-name> to the URL. If it’s the only item you’re adding, separate it from the rest of the URL with ? (just like the sslmode item above). Otherwise, separate it from other items with ‘&’.

Here is an example configuration that specifies a schema to use, with SSL:

Provider = postgres

URL = "postgres://username:password@db.seed.co/connect?options=-csearch_path=connect_schema"

Here is an example configuration that specifies a schema to use, with SSL disabled:

Provider = postgres

URL = "postgres://username:password@db.seed.co/connect?sslmode=disable&options=-csearch_path=connect_schema"

RStudio Connect will refuse to start when given a schema that does not already exist. The schema must be owned by the connecting user or by a group that contains the connecting user.

9.3 Changing Database Provider

Connect includes a migrate command for migrating data from one database to another.

The migration utility is installed at /opt/rstudio-connect/bin/migrate. It uses the configuration defined in /etc/rstudio-connect/rstudio-connect.gcfg unless you specify an alternate configuration file with the --config flag.

The migrate utility must be run as root.

The migrate utility can only be run when Connect is stopped. See Section 5.1 for information on stopping and restarting Connect.

Note: Migration from PostgreSQL to SQLite is not supported at this time.

Note: Migration of instrumentation event data is not supported at this time.

If you are also migrating your RStudio Connect installation to a new server, see 4.8.

9.3.1 Database Migration Checklist

Use this checklist to guide your migration process:

  1. Shut down Connect - 5.1
  2. Back up your data - 4.7
  3. Ensure that you have a Postgres configuration section - 9.2
  4. Run the migration - B.2
  5. Update the Database.Provider configuration setting to point to the new database - A.7
  6. Restart Connect - 5.1

9.3.2 Configuration Requirements

When migrating data, the configuration file must contain valid configuration sections for both SQLite and Postgres. The migration utility will connect to the SQLite and PostgreSQL databases specified in the configuration.