Multi-server Installation

Workbench | Advanced

These directions describe a multi-server installation for Posit Workbench in a load-balanced cluster.

For alternative installation instructions, see our recommended installation paths.

This page includes instructions for downloading Posit professional products. Download and/or use of these products is governed under the terms of the Posit End User License Agreement. By downloading, you agree to the terms posted there.

The same instructions apply if you still use a legacy RStudio Server Pro license configuration (no launcher enabled).

Prerequisites

  • Workbench: Workbench installed on two or more nodes.
  • Postgres: A Postgres database that Workbench will use to store metadata about itself.
  • Shared Storage: For Workbench session data and users’ home directories.
  • External Load Balancer: (Optional) An external load balancer that will provide users with a consistent entry point and cluster resilience.

Install Workbench on each node

To run Workbench in a load-balanced configuration, Workbench must be installed on two or more nodes.

On each node, refer to the Install Workbench section to install R, Python, and Workbench.

Postgres database

A Postgres database is required to run Workbench in a load-balanced configuration.

  • You must create an empty database for the rstudio-server process to connect to.
  • This database must not be shared with other products or services.
  • If you’d like to use SSL certificate authorization in place of a password, you’ll also need to configure Postgres to use SSL with your certificates.

For detailed requirements, please refer to PostgreSQL.
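As an illustration, a minimal sketch of creating the empty database with psql, assuming you administer the PostgreSQL server and plan to use the rstudio database and role names referenced later in this guide; adapt the names and authentication method to your environment:

# Run on the PostgreSQL server (requires administrative access)
sudo -u postgres psql <<'EOF'
CREATE ROLE rstudio WITH LOGIN PASSWORD 'replace-with-a-strong-password';
CREATE DATABASE rstudio OWNER rstudio;
EOF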

Shared storage

In a load-balanced configuration, Workbench requires POSIX-compliant shared storage. Shared storage is used to persist users’ home directories and a shared directory that Workbench uses for features such as project sharing.

For the purposes of this guide, we assume that your NFS server has the following exports:

/etc/exports
# Shared storage for user home directories
/var/nfs/workbench/home             *(rw,sync,no_subtree_check,no_root_squash)

# Shared storage for project sharing
/var/nfs/workbench/shared-storage   *(rw,sync,no_subtree_check,no_root_squash)
Tip

This guide uses an NFS server for shared storage. However, other POSIX-compliant shared storage solutions work as well.
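If you are standing up the NFS server yourself, a rough sketch for an Ubuntu host, assuming the export paths shown above, looks like the following; adjust the paths and export options for your environment:

# On the NFS server: install the server and create the exported directories
sudo apt-get update
sudo apt-get install -y nfs-kernel-server
sudo mkdir -p /var/nfs/workbench/home /var/nfs/workbench/shared-storage

# After adding the exports shown above to /etc/exports, re-export them
sudo exportfs -ra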

External load balancer

Workbench includes an internal load balancer. We also recommend using an external front-end load balancer (such as Nginx or Apache) for stronger cluster resilience. When configuring the external load balancer (see the example sketch below), ensure that:

  • Sticky sessions are enabled. They make operation more efficient and are required if you use SAML for authentication.
  • The load-balancing method distributes sessions across the available nodes, for example, using an IP hash or round robin.
  • Websockets are forwarded correctly between the proxy server and Workbench so that all Workbench functions work correctly. See Forwarding websockets for more details.
  • Sufficient timeouts are configured, with a minimum of 60 seconds. See Check the connection timeout.

To learn more, refer to the load balancing documentation in the Workbench Admin Guide.
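For illustration only, here is a minimal Nginx sketch under the assumptions of this guide: two nodes reachable as workbench-node-1 and workbench-node-2 (hypothetical host names) serving HTTP on port 8787, IP-hash sticky sessions, websocket forwarding, and generous timeouts. Adapt host names, TLS, and timeouts to your environment.

/etc/nginx/conf.d/workbench.conf
# Example sketch only; not a hardened production configuration
upstream workbench {
    ip_hash;                          # sticky sessions by client IP
    server workbench-node-1:8787;     # replace with your node addresses
    server workbench-node-2:8787;
}

server {
    listen 80;

    location / {
        proxy_pass http://workbench;
        proxy_http_version 1.1;

        # Forward websockets
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;

        # Timeouts of at least 60 seconds
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
}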

Workbench configuration

This section covers configuring Workbench to run in a load-balanced configuration.

Networking

The nodes running Workbench must have the following ports exposed for ingress:

  • 22: Port 22 is exposed for SSH access.
  • 8787: If using HTTP, Workbench runs on port 8787 by default. This guide uses HTTP.
  • 443: If using HTTPS, Workbench runs on port 443.
  • 5559: By default, Launcher runs on port 5559.
  • Ephemeral port range (ip_local_port_range), for example 32000 - 65535: When using the Local Launcher, a wide range of ephemeral ports must be open for Launcher to claim. This range is determined by the ip_local_port_range kernel parameter; to determine the range for your instance, run sudo cat /proc/sys/net/ipv4/ip_local_port_range. Then, adjust the firewall to allow node-to-node communication across that range. For more details, please refer to the launcher-local-proxy setting.

For additional information, please refer to Networking.
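As an illustration only, assuming an Ubuntu node protected by ufw (your environment may rely on security groups or different firewall tooling instead), opening these ports could look like the following:

# Example only: adjust to your firewall and your ip_local_port_range
sudo ufw allow 22/tcp
sudo ufw allow 8787/tcp          # or 443/tcp if serving HTTPS
sudo ufw allow 5559/tcp
sudo ufw allow 32000:65535/tcp   # node-to-node ephemeral port range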

Mount shared storage

Workbench requires two different directories to be mounted on each node running Workbench:

  • Home directories: The users’ home directories must exist in shared storage and be accessible to each node.
  • Shared storage: An explicit directory must exist in shared storage and be accessible to each node to be used by Workbench.

Run the following commands on each Workbench node to install nfs-common and mount the NFS directories. If you are using existing shared storage, change the local directories and NFS paths below as required:

# Install nfs-common
sudo apt-get update
sudo apt-get install -y nfs-common

# Replace this value with the IP address for your NFS server
NFS_HOST="<REPLACE-WITH-YOUR-NFS-HOST>"

# Create the directories for the NFS mount
sudo mkdir -p /nfs/workbench/home
sudo mkdir -p /nfs/workbench/shared-storage

# Define the mount configuration
sudo tee -a /etc/fstab <<EOF
${NFS_HOST}:/var/nfs/workbench/home             /nfs/workbench/home              nfs auto,noac,nofail,noatime,nolock,intr,tcp,actimeo=1800 0 0
${NFS_HOST}:/var/nfs/workbench/shared-storage   /nfs/workbench/shared-storage    nfs auto,noac,nofail,noatime,nolock,intr,tcp,actimeo=1800 0 0
EOF

# Mount all of the mount points in /etc/fstab
sudo mount -av

Lastly, change the owner of the directories:

  • /nfs/workbench/home: Should be owned by root:root.
  • /nfs/workbench/shared-storage: Should be owned by nobody:nogroup.

sudo chown -R root:root /nfs/workbench/home
sudo chown -R nobody:nogroup /nfs/workbench/shared-storage

For more information on the recommended mount options, refer to Project sharing and NFS.
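To confirm that both mounts are active on a node, you can list the NFS mounts (the filesystem type may report as nfs or nfs4 depending on your setup):

# Verify that the NFS mounts are present
findmnt -t nfs,nfs4
df -h /nfs/workbench/home /nfs/workbench/shared-storage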

User provisioning

Because Workbench requires each user to have a Unix account and a home directory, each node in the cluster must have the following:

  • A Unix account for each user with consistent IDs.
  • Access to the users’ home directories.

This guide presents two common options for provisioning users: creating them manually with useradd, or automating provisioning with SSSD.

For example, say you have two users: user1 and user2. To manually provision these users, do the following:

  • On one node only, run the following commands to create the user and a home directory in shared storage:
# Create user1
NAME="user1"
PASSWORD="password"
sudo useradd --create-home --home-dir /nfs/workbench/home/$NAME -s /bin/bash $NAME
echo -e "${PASSWORD}\n${PASSWORD}" | sudo passwd $NAME

# Create user2
NAME="user2"
PASSWORD="password"
sudo useradd --create-home --home-dir /nfs/workbench/home/$NAME -s /bin/bash $NAME
echo -e "${PASSWORD}\n${PASSWORD}" | sudo passwd $NAME
  • On all other nodes, run the following commands to create the users. You do not need to re-create the home directories; however, to keep IDs consistent across the cluster, ensure each account is created with the same UID and GID as on the first node (for example, by passing explicit --uid and --gid values to useradd):
# Create user1
NAME="user1"
PASSWORD="password"
sudo useradd --home-dir /nfs/workbench/home/$NAME -s /bin/bash $NAME
echo -e "${PASSWORD}\n${PASSWORD}" | sudo passwd $NAME

# Create user2
NAME="user2"
PASSWORD="password"
sudo useradd --home-dir /nfs/workbench/home/$NAME -s /bin/bash $NAME
echo -e "${PASSWORD}\n${PASSWORD}" | sudo passwd $NAME
Caution

Do not use password as your password; set a unique and secure password for each user.
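Because the cluster requires consistent IDs, it is also worth confirming that each account resolves to the same UID and GID on every node, for example:

# Run on every node; the uid/gid values must match across the cluster
id user1
id user2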

It is common for customers to automate user provisioning with SSSD. For more details, see User Provisioning.

Below is an example of an SSSD configuration file. Your actual configuration will depend on your Active Directory/LDAP setup. Note that:

  • override_homedir ensures home directories are created in the shared storage.
/etc/sssd/sssd.conf
[sssd]
config_file_version = 2
services = nss, pam
domains = LDAP

[nss]
filter_users = root,named,avahi,haldaemon,dbus,radiusd,news,nscd
filter_groups =

[pam]

[domain/LDAP]
default_shell = /bin/bash
override_homedir = /nfs/workbench/home/%u
id_provider = ldap
auth_provider = ldap
chpass_provider = ldap
sudo_provider = ldap
enumerate = true
cache_credentials = false
ldap_schema = rfc2307
ldap_uri = "<REPLACE-WITH-YOUR-VALUE>"
ldap_search_base = dc=example,dc=org
ldap_user_search_base = dc=example,dc=org
ldap_user_object_class = posixAccount
ldap_user_name = uid
ldap_group_search_base = dc=example,dc=org
ldap_group_object_class = posixGroup
ldap_group_name = cn
ldap_id_use_start_tls = false
ldap_tls_reqcert = never
ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
ldap_default_bind_dn = cn=admin,dc=example,dc=org
ldap_default_authtok = admin
access_provider = ldap
ldap_access_filter = (objectClass=posixAccount)
min_id = 1
max_id = 0
ldap_user_uuid = entryUUID
ldap_user_shell = loginShell
ldap_user_uid_number = uidNumber
ldap_user_gid_number = gidNumber
ldap_group_gid_number = gidNumber
ldap_group_uuid = entryUUID
ldap_group_member = memberUid
ldap_auth_disable_tls_never_use_in_production = true
use_fully_qualified_names = false
ldap_access_order = filter
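As a rough sketch for an Ubuntu node (the package names and file permissions below are assumptions; follow your distribution's documentation), installing SSSD and applying the configuration above might look like the following:

# Install SSSD and its LDAP backend
sudo apt-get install -y sssd sssd-ldap

# The configuration file must be owned by root and not world-readable
sudo chown root:root /etc/sssd/sssd.conf
sudo chmod 600 /etc/sssd/sssd.conf

# Restart SSSD to pick up the configuration
sudo systemctl restart sssd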

Configuration files

Set the following configuration files for each node on the Workbench cluster.

/etc/rstudio/rserver.conf

Ensure that server-shared-storage-path is set to a location in shared storage:

sudo tee -a /etc/rstudio/rserver.conf <<EOF
server-shared-storage-path=/nfs/workbench/shared-storage
EOF

Modify rserver.conf to set load-balancing-enabled=1. This setting is already explicitly set to load-balancing-enabled=0, so open the file and edit the existing line rather than appending a new one.

load-balancing-enabled=1
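If you prefer to script this change rather than editing the file by hand, a one-liner such as the following works, assuming the file currently contains the explicit load-balancing-enabled=0 line:

sudo sed -i 's/^load-balancing-enabled=0/load-balancing-enabled=1/' /etc/rstudio/rserver.conf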

See rserver.conf for all configuration options.

/etc/rstudio/launcher.conf

Modify launcher.conf to set address=0.0.0.0. This setting is already explicitly set to address=localhost, so open the file and edit the existing line rather than appending a new one.

address=0.0.0.0

See Launcher Configuration for all configuration options.

/etc/rstudio/database.conf

Before creating the database configuration file, determine whether you’ll be authenticating Workbench with the PostgreSQL server via password or SSL certificate authorization.

Run the appropriate bash snippet in a shell to generate or append to /etc/rstudio/database.conf, then distribute both this file and the secure-cookie-key file to all nodes in the cluster.

For password authorization
# Replace the variables with the appropriate value for your database
POSTGRES_HOST="localhost"
POSTGRES_DB="rstudio"
POSTGRES_USER="rstudio"
POSTGRES_PASSWORD="<plain-text-password>"

POSTGRES_PASSWORD_ENCRYPTED=$(echo "$POSTGRES_PASSWORD" | sudo rstudio-server encrypt-password)

sudo tee -a /etc/rstudio/database.conf <<EOF
provider=postgresql
password=${POSTGRES_PASSWORD_ENCRYPTED}
connection-uri=postgresql://${POSTGRES_USER}@${POSTGRES_HOST}:5432/${POSTGRES_DB}?sslmode=allow
EOF
For SSL certificate authorization
# Replace the variables with the appropriate value for your database
POSTGRES_HOST="<REPLACE-WITH-YOUR-VALUE>"
POSTGRES_DB="<REPLACE-WITH-YOUR-VALUE>"
POSTGRES_USER="<REPLACE-WITH-YOUR-VALUE>"
POSTGRES_SSL_CERT="<REPLACE-WITH-PATH-TO-CERT>"
POSTGRES_ROOT_CERT="<REPLACE-WITH-PATH-TO-CERT>"
POSTGRES_SSL_KEY="<REPLACE-WITH-PATH-TO-KEY>"


sudo tee -a /etc/rstudio/database.conf <<EOF
provider=postgresql
connection-uri=postgresql://${POSTGRES_USER}@${POSTGRES_HOST}:5432/${POSTGRES_DB}?sslcert=${POSTGRES_SSL_CERT}&sslkey=${POSTGRES_SSL_KEY}&sslrootcert=${POSTGRES_ROOT_CERT}
EOF

See PostgreSQL for additional PostgreSQL configuration options.
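Optionally, before restarting the cluster, you can confirm that a node can reach the database. For password authorization, a quick check with the psql client (assuming it is installed on the node, and reusing the POSTGRES_* variables defined in the snippets above) looks like:

# Should prompt for the password and open a psql session; \q to exit
psql "postgresql://${POSTGRES_USER}@${POSTGRES_HOST}:5432/${POSTGRES_DB}"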

(Optional) /etc/rstudio/load-balancer

Configure any custom settings in /etc/rstudio/load-balancer. See Configuration for detailed options.

Restart the cluster

After you have finished configuring Workbench:

  • Restart the Workbench service (run on all nodes):

    sudo systemctl restart rstudio-server
  • Restart the Launcher service (run on all nodes):

    sudo systemctl restart rstudio-launcher
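After restarting, you can confirm on each node that both services came back up before moving on to verification:

# Both units should report "active"
systemctl is-active rstudio-server rstudio-launcher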

Verify

After completing the steps above, Workbench should be running in a load-balanced configuration.

Run the following command from one of the Workbench nodes to verify that Workbench is aware of all the nodes in the cluster:

sudo rstudio-server list-nodes

To verify that Workbench is operating correctly, do the following:

  • Open Workbench in a new browser tab:

    • If using an external load balancer, use the appropriate URL for the load balancer.
    • If not using an external load balancer, use the appropriate URL for one of the nodes in the cluster.
  • Log into Workbench using your username and password.

  • Click the + New Session button. Clear the Join session when ready check box. Then, launch four RStudio Pro sessions.

  • Once the sessions are running, visit each session and verify that you can use the IDE.

  • SSH into any one of the Workbench nodes and run the following commands:

    curl http://localhost:8787/load-balancer/status
    # ip-172-31-29-53:8787 - 172.31.29.53  Load: 0.0063, 0.019, 0
    #    48751 - user1
    #    39972 - user1
    # 
    # ip-172-31-28-224:8787 - 172.31.28.224  Load: 0.051, 0.041, 0.029
    #    39959 - user1
    #    40055 - user1

    You should see four sessions running. These sessions should be distributed across the two nodes.

  • If all sessions are on the same node, execute an R script to consume node resources in one of the sessions. For example:

    test.R
    df <- data.frame(
        x = c(1:1000000),
        y = c(1:1000000)
    )
    
    for (i in c(1:500)) {
        print(i)
        fit <- lm(x ~ y, df)
    }
  • Then, continue to create new sessions until you can verify that the sessions are being distributed.

  • Next, launch one session in each of the other three IDEs:

    • Jupyter Notebook
    • JupyterLab
    • VS Code
  • Once the sessions have started running, visit each session and verify that you can use the IDE.

Additional information

This guide refers to the following Workbench concepts:

  • Job Launcher: The tooling that allows Posit Workbench to start processes locally. Job Launcher is required to start JupyterLab, Jupyter Notebook, and VS Code sessions. This guide focuses on the Local Plugin; Workbench can also be configured with other plugins, such as the Slurm Plugin and the Kubernetes Plugin.
  • Local Plugin: A Job Launcher plugin that launches executables on the local machine (the same machine on which the Launcher is running).
  • Load balancing: Workbench can be configured to load balance sessions across two or more nodes within a cluster, providing both increased capacity and higher availability.

When combined with load balancing, the Job Launcher and Local Plugin can launch sessions across different nodes in the cluster. Launcher with the Local Plugin is enabled by default.

flowchart LR
accTitle: Mermaid diagram
accDescr {
A mermaid diagram showing a map example of how the job launcher and local plugin combined with load balancing allows you to launch sessions across different nodes in a cluster.
}
u1(User)
u2(User)
u3(User)
b1(Browser)
b2(Browser)
b3(Browser)
workbench1(Workbench)
workbench2(Workbench)
session(RStudio Session)
jupyter(Jupyter Session)
vscode(VS Code Session)
job(Workbench Job)
lb(External Load Balancer)
nfs(Shared Storage)
pg(Postgres)
u1---b1
u2---b2
u3---b3
b1---lb
b2---lb
b3---lb
lb---workbench1
lb---workbench2
server1-.-nfs
server2-.-nfs
server1-.-pg
server2-.-pg
subgraph server1 [Linux Server]
    workbench1---jupyter
    workbench1---job
end
subgraph server2 [Linux Server]
    workbench2---session
    workbench2---vscode
end
workbench1-.-workbench2
classDef server fill:#e27e53,stroke:#ab4d26
classDef product fill:#447099,stroke:#213D4F,color:#F2F2F2
classDef session fill:#7494B1,color:#F2F2F2,stroke:#213D4F
classDef req fill:#72994E,stroke:#1F4F4F
class server1,server2 server
class workbench1,workbench2 product
class session,jupyter,vscode,job session
class u1,u2,u3,b1,b2,b3,lb element
class pg,nfs req