6 High Availability and Load Balancing (Experimental)

Multiple instances of RStudio Connect can share the same data in highly available (HA) and load-balanced configurations. In this document, we refer to these configurations as “HA” for brevity.

Using Connect in an HA configuration is currently considered experimental. Please report any issues to support@rstudio.com.

6.1 HA Checklist

Follow the checklist below to configure multiple RStudio Connect instances for HA:

  1. Install and configure the same version of RStudio Connect on each node - 2
  2. Migrate to a PostgreSQL database (if running SQLite) - 9.1. All nodes in the cluster must use the same PostgreSQL database.
  3. Configure each server’s Server.DataDir to point to the same shared location - 4.6 and 6.2.3
  4. Configure each server’s Server.LandingDir to point to the same shared location (if using a custom landing page) - C and 6.2.3
  5. Configure each server’s Metrics.DataPath directory to point to a unique-per-server location - A.18. Alternatively, consider using Graphite to write all metrics to a single location - 6.2.4
  6. Update each server’s configuration with LoadBalancing.EnforceMinRsconnectVersion = true to ensure that your clients use a compatible version of rsconnect - 6.2.6
  7. Configure your load balancer to route traffic to your RStudio Connect nodes with sticky sessions - 6.2.6
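
As a rough illustration of items 3, 5, and 6 above, the following Python sketch compares the settings of several nodes. It is not part of RStudio Connect. It assumes each node's configuration can be read as INI-style text with [Server], [Metrics], and [LoadBalancing] sections, and that Metrics.DataPath is set explicitly on every node. The paths, node names, and function names are hypothetical.

```python
import configparser

# Hypothetical configuration text for one node; the paths are examples only.
NODE_A = """
[Server]
DataDir = /mnt/connect-shared/data

[Metrics]
DataPath = /var/lib/connect-metrics-a

[LoadBalancing]
EnforceMinRsconnectVersion = true
"""
NODE_B = NODE_A.replace("connect-metrics-a", "connect-metrics-b")


def parse_node_config(text):
    """Parse one node's configuration text (assumed to be INI-style)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of keys such as DataDir
    parser.read_string(text)
    return parser


def check_ha_settings(configs):
    """Return a list of problems found across a {node name: config} mapping."""
    problems = []

    # Checklist item 3: every node points at the same shared data directory.
    data_dirs = {c.get("Server", "DataDir", fallback=None) for c in configs.values()}
    if None in data_dirs or len(data_dirs) != 1:
        problems.append("Server.DataDir must be the same shared location on every node")

    # Checklist item 5: every node writes metrics to its own location.
    metrics = [c.get("Metrics", "DataPath", fallback=None) for c in configs.values()]
    if None in metrics or len(set(metrics)) != len(metrics):
        problems.append("Metrics.DataPath must be set and unique per node")

    # Checklist item 6: every node enforces the minimum rsconnect version.
    for name, c in configs.items():
        if not c.getboolean("LoadBalancing", "EnforceMinRsconnectVersion", fallback=False):
            problems.append(name + ": LoadBalancing.EnforceMinRsconnectVersion is not true")

    return problems


configs = {"node-a": parse_node_config(NODE_A), "node-b": parse_node_config(NODE_B)}
for problem in check_ha_settings(configs):
    print(problem)
```

With the two example configurations above, the check reports no problems. Giving both nodes the same Metrics.DataPath would produce a warning.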

6.2 HA Limitations

6.2.1 Node Management

RStudio Connect nodes in an HA configuration are not aware of one another. Your load balancer assumes the full load-balancing responsibility: it directs requests to specific nodes and checks whether nodes are available to accept requests.

6.2.2 Database Requirements

RStudio Connect only supports HA when using a PostgreSQL database. If you are using SQLite, please switch to PostgreSQL. See 9.1.

6.2.3 Shared Data Directory Requirements

RStudio Connect manages uploaded content within the server’s data directory. This data directory must be a shared location, and each node’s Server.DataDir must point to the same shared location. See 4.6 for more information on the server’s data directory. We recommend and support NFS version 3 for file sharing.

6.2.4 Metrics Requirements

By default, RStudio Connect writes metrics to a set of RRD files. We do not support metrics aggregation, and each server must maintain a separate set of RRD files to avoid conflicts. The admin dashboard for a specific node will only show metrics for that node. See A.18 for information on configuring a unique Metrics.DataPath for each server.

RStudio Connect includes optional support for writing metrics to Graphite. If you wish to aggregate metrics, consider using Graphite or any monitoring tool compatible with the Carbon protocol. See 16 for more information.

6.2.5 Shiny Applications

Shiny applications depend on a persistent connection to a single server. Please configure your load balancer to use cookie-based sticky sessions to ensure that Shiny applications function properly when using HA.
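
To make the idea of cookie-based sticky sessions concrete, here is a minimal Python sketch of the routing decision a load balancer makes. It does not describe any particular load balancer product. The class name, the cookie name connect_node, and the node names are hypothetical, and real load balancers typically use opaque or signed cookie values.

```python
import itertools


class StickyRouter:
    """Toy model of cookie-based sticky routing (illustrative only)."""

    def __init__(self, nodes, cookie_name="connect_node"):
        self.nodes = list(nodes)
        self.cookie_name = cookie_name  # hypothetical cookie name
        self.healthy = set(self.nodes)
        self._round_robin = itertools.cycle(self.nodes)

    def mark_down(self, node):
        """Record that a health check failed for this node."""
        self.healthy.discard(node)

    def route(self, cookies):
        """Return (node, cookies_to_set) for a request carrying `cookies`."""
        pinned = cookies.get(self.cookie_name)
        if pinned in self.healthy:
            # The client already has a session on a healthy node: keep it there.
            return pinned, {}
        if not self.healthy:
            raise RuntimeError("no healthy nodes available")
        # New client, or the pinned node is down: pick the next healthy node
        # and ask the client to remember it.
        for node in self._round_robin:
            if node in self.healthy:
                return node, {self.cookie_name: node}


router = StickyRouter(["node-a", "node-b"])
node, set_cookies = router.route({})        # first request: no cookie yet
same_node, _ = router.route(set_cookies)    # later requests stay on that node
```

A client that does not send the cookie back, such as an older rsconnect release, would be treated as a new client on every request and could land on a different node each time.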

6.2.6 rsconnect Cookie Support

For cookie-based sticky session support, you will need to ensure that your clients use rsconnect version 0.8.3 or later. Versions of rsconnect prior to 0.8.3 did not include support for cookies. Please update each server’s configuration with the LoadBalancing.EnforceMinRsconnectVersion = true setting to ensure that clients must use a version of rsconnect with cookie support.
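
The minimum-version rule can be pictured with the short Python sketch below. It only illustrates the comparison and is not how RStudio Connect implements the check; the function names are hypothetical. Version components are compared numerically, because a plain string comparison would wrongly rank 0.10.0 below 0.8.3.

```python
import re

MIN_RSCONNECT_VERSION = (0, 8, 3)  # first rsconnect version with cookie support


def parse_version(text):
    """Split an R package version such as '0.8.3' or '0.8.3-1' into integers."""
    return tuple(int(part) for part in re.split(r"[.-]", text))


def supports_cookies(version_text):
    """Return True if this rsconnect version is 0.8.3 or later."""
    return parse_version(version_text) >= MIN_RSCONNECT_VERSION


for version in ("0.8.2", "0.8.3", "0.10.0"):
    print(version, supports_cookies(version))
```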

If you cannot enforce a minimum rsconnect version, you can consider alternatives like:

  • Non-cookie-based sticky sessions, or
  • Providing a separate host name for deployment from rsconnect to a single node in the cluster. Content deployed to a specific node will be available to the cluster assuming the database and shared storage are appropriately configured.

6.3 Updating HA Nodes

When applying updates to the RStudio Connect nodes in your HA configuration, follow these steps to avoid errors due to an inconsistent database schema:

  1. Stop all RStudio Connect nodes in your cluster.
  2. Upgrade one RStudio Connect node. The first update will upgrade the database schema (if necessary) and start RStudio Connect on that instance - 5.4.
  3. Upgrade the remaining nodes.

If you forget to stop any RStudio Connect nodes while upgrading another node, these nodes will be using a binary that expects an earlier schema version, and will be subject to unexpected and potentially serious errors. These nodes will detect an out-of-date database schema within 30 seconds and shut down automatically.
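
The reason for this ordering can be sketched in Python as follows. This is a simplified model, not RStudio Connect's actual logic: it assumes that schema versions are plain integers and that a node compares the version its binary expects with the version found in the shared database. The function name and return values are hypothetical.

```python
def node_action(binary_schema_version, database_schema_version):
    """Decide what a node does after comparing schema versions.

    Both arguments are hypothetical integer schema versions.
    """
    if database_schema_version < binary_schema_version:
        # First upgraded node to start: it migrates the shared database.
        return "upgrade_schema"
    if database_schema_version > binary_schema_version:
        # A node still running the old binary: the schema no longer matches
        # what it expects, so it shuts down.
        return "shut_down"
    return "run"


# Old binaries expect schema 7; the upgraded binary expects schema 8.
print(node_action(8, 7))  # upgraded node starts first
print(node_action(7, 8))  # node that was left running on the old binary
print(node_action(8, 8))  # remaining nodes after their upgrade
```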

6.4 Downgrading

If you wish to move from an HA environment to a single-node environment, please follow these steps:

  1. Stop all Connect services on all nodes.
  2. Reconfigure your network to route traffic directly to one of the nodes, unless you wish to continue using a load balancer.
  3. If you wish to move all shared file data to the node:
    1. Configure the server’s Server.DataDir to point to a location on the node, and copy all the data from the NFS share to this location - 4.6
    2. If using a custom landing page, configure the server’s Server.LandingDir to point to a location on the node, and copy the custom landing page data from the NFS share to this location - C
    3. Configure the server’s Metrics.DataPath directory to point to an appropriate location. If necessary, copy the data from the NFS share to this location. - 6.2.4
  4. If you wish to move the database to this node, install PostgreSQL on the node and copy the data. Moving the PostgreSQL database from one server to another is beyond the scope of this guide. Please note that we do not support migrating from PostgreSQL back to SQLite.
  5. Start the Connect process - 5.1

6.5 HA Details

6.5.1 Concurrent Scheduled Document Rendering

The Applications.ScheduleConcurrency configuration setting specifies the number of scheduled jobs that can run concurrently on a node. This setting defaults to 2 and can be adjusted to suit your needs. This setting will not affect ad-hoc rendering requests, hosted APIs, or Shiny applications.
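
The effect of a per-node concurrency limit can be illustrated with a small Python sketch. It models scheduled jobs as plain functions run by a worker pool whose size equals the limit. This is an analogy under stated assumptions, not RStudio Connect's scheduler, and the function name is hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

SCHEDULE_CONCURRENCY = 2  # mirrors the documented default


def run_scheduled_jobs(jobs, concurrency=SCHEDULE_CONCURRENCY):
    """Run all jobs with at most `concurrency` at once; return the peak observed."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def run(job):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        try:
            job()
        finally:
            with lock:
                running -= 1

    # The pool size is the limit: extra jobs wait until a worker is free.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(run, jobs))
    return peak


# Six short jobs never run more than two at a time on this "node".
peak = run_scheduled_jobs([lambda: time.sleep(0.05)] * 6)
```

Because the limit applies per node, adding nodes raises the number of scheduled jobs the cluster as a whole can run at once.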

6.5.2 Concurrent Shiny Applications and Ad-Hoc Rendering

Each R process associated with Shiny applications, hosted APIs, ad-hoc rendering requests, and bundle deployments runs on the server where the request was initiated. We depend on your load balancer to distribute these requests to an appropriate Connect node. The minimum and maximum process limits for Shiny applications are enforced per server. For example, if a Shiny application allows a maximum of 10 processes, a maximum of 10 processes per server will be enforced. See A.16 for more information.
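
Because the limits are enforced per server, the upper bound for the whole cluster grows with the number of nodes. The following Python sketch shows the arithmetic; the function name is hypothetical, and the actual number of processes at any moment depends on how your load balancer distributes requests.

```python
def cluster_process_ceiling(max_processes_per_server, node_count):
    """Upper bound on processes for one application across the cluster."""
    if max_processes_per_server < 0 or node_count < 0:
        raise ValueError("limits and node counts must not be negative")
    return max_processes_per_server * node_count


# A maximum of 10 processes per server on a three-node cluster
# allows up to 30 processes for that application in total.
ceiling = cluster_process_ceiling(10, 3)
```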

6.5.3 Polling

RStudio Connect nodes poll the data directory for new scheduled jobs:

  • Every 5 seconds, and
  • After every completed scheduled job.
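
The two polling triggers above can be pictured with a short Python sketch that lists when a single node would poll. It assumes a fixed 5-second schedule counted from time zero and integer timestamps in seconds. The real timer may behave differently, for example by restarting after each poll. The function name is hypothetical.

```python
POLL_INTERVAL_SECONDS = 5


def poll_times(job_completion_times, horizon_seconds):
    """Return the sorted times (in seconds) at which a node polls.

    Polls happen on every interval tick and after each completed job.
    """
    ticks = range(POLL_INTERVAL_SECONDS, horizon_seconds + 1, POLL_INTERVAL_SECONDS)
    after_jobs = [t for t in job_completion_times if t <= horizon_seconds]
    return sorted(set(ticks) | set(after_jobs))


# Jobs finishing at 7s and 12s add two polls to the regular 5s ticks.
print(poll_times([7, 12], 15))  # [5, 7, 10, 12, 15]
```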

6.5.4 Abandoned R Processes

While processing a scheduled job, the RStudio Connect node periodically updates the job’s metadata in the database with a “heartbeat”. If the node goes offline and the “heartbeat” ceases, another node will eventually claim the abandoned job and run it again. Hence, if a server goes offline or the Connect process gets shut down while a scheduled report is running, it is possible that the scheduled job could run twice.
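
The heartbeat mechanism described above can be sketched in Python as follows. This is a simplified in-memory model rather than RStudio Connect's implementation: the real metadata lives in the database, and the timeout value, class, and function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

HEARTBEAT_TIMEOUT_SECONDS = 60.0  # hypothetical value, for illustration only


@dataclass
class ScheduledJob:
    job_id: str
    owner: Optional[str] = None   # node currently running the job
    last_heartbeat: float = 0.0   # time of the owner's last heartbeat
    runs: int = 0                 # how many times the job has been started


def heartbeat(job, node, now):
    """The owning node periodically refreshes the job's heartbeat."""
    if job.owner == node:
        job.last_heartbeat = now


def try_claim(job, node, now):
    """Claim an unowned or abandoned job; return True if the claim succeeded."""
    abandoned = (
        job.owner is not None
        and now - job.last_heartbeat > HEARTBEAT_TIMEOUT_SECONDS
    )
    if job.owner is None or abandoned:
        job.owner = node
        job.last_heartbeat = now
        job.runs += 1
        return True
    return False


job = ScheduledJob("report-1")
try_claim(job, "node-a", now=0.0)    # node-a starts the job
heartbeat(job, "node-a", now=30.0)   # node-a is still alive
try_claim(job, "node-b", now=60.0)   # refused: the heartbeat is recent
try_claim(job, "node-b", now=120.0)  # node-a went silent: node-b reruns the job
```

After the last call, job.runs is 2, which is the "run twice" outcome the paragraph above warns about.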

6.5.5 Abandoned Shiny Applications

A Shiny application depends on a persistent connection to a single server. If the server associated with a particular Shiny application session goes down, the Shiny application will fail. However, simply refreshing the application should result in a new session on an available server, assuming your load balancer detects the failed node and points you to a working one.

Shiny applications that support client-side reconnects using the session$allowReconnect(TRUE) feature will automatically reconnect the Shiny application to a working node. See https://shiny.rstudio.com/articles/reconnecting.html