1 Job Launcher

1.1 Overview

The RStudio Job Launcher allows RStudio applications, such as RStudio Server Pro and RStudio Connect, to start processes within batch processing systems (e.g. IBM Spectrum LSF) and container orchestration platforms (e.g. Kubernetes). RStudio products integrate with the Job Launcher so that you can use your existing cluster hardware for maximum process isolation and operational efficiency.

1.2 Configuration Options

To configure the Job Launcher, create and modify the /etc/rstudio/launcher.conf file. Configuration options are listed below.

Server Options

There should be one [server] section in the configuration file (see sample config below).

Config Option | Description | Required (Y/N) | Default Value
address | IPv4 or IPv6 address, or path to Unix domain socket | Y |
port | Port number (0-65535) | Y (when using an IP address) |
enable-ssl | Toggle usage of SSL encryption for connections | N | 0
certificate-file | Certificate chain file of public certificates to present to incoming connections | Y (only when SSL is enabled) |
certificate-key-file | Certificate private key file used for encryption | Y (only when SSL is enabled) |
server-user | User to run the executable as. The Launcher should be started as root, and will lower its privilege to this user for normal execution. | N | rstudio-server
authorization-enabled | Enables/disables authorization, which is required for all but test systems. Can be 1 (enabled) or 0 (disabled) | N | 1
admin-group | Group whose members can see and control all other users' jobs in the system. If used with RStudio Pro, this must match the group of the server-user specified in rserver.conf. | N | Empty
thread-pool-size | Size of the thread pools used by the launcher | N | Number of CPUs * 2
request-timeout-seconds | Number of seconds a plugin has to process a request before it is considered timed out | N | 120
bootstrap-timeout-seconds | Number of seconds a plugin has to bootstrap before it is considered a failure | N | 120
max-message-size | Maximum allowed size, in bytes, of messages sent by plugins. It is strongly recommended that you do not change this, but it may be raised if you run into the limit. | N | 5242880
enable-debug-logging | Enables/disables verbose debug logging. Can be 1 (enabled) or 0 (disabled) | N | 0
scratch-path | Scratch directory where the launcher and its plugins write temporary state | N | /var/lib/rstudio-launcher
secure-cookie-key-file | Location of the secure cookie key, which is used to perform authorization/authentication. It is strongly recommended that you do not change this. | N | /etc/rstudio/secure-cookie-key

Cluster Options

There should be one [cluster] section in the configuration file per cluster to connect to / plugin to load (see sample config below).

Config Option | Description | Required (Y/N) | Default Value
name | Friendly name of the cluster (for human consumption, display purposes) | Y |
type | The plugin type. Can be one of Local, Kubernetes | Y |
exe | Path to the plugin executable for this cluster | N | If using an RStudio plugin such as Local or Kubernetes, this is inferred from the value of type. If using a custom plugin, you must provide the path to its executable in this option.
config-file | Path to the configuration file for the plugin | N | Each plugin has its own default config location
allowed-groups | Comma-separated list of user groups that may access this cluster | N | Empty (all groups may access)

1.2.1 Sample Configuration

/etc/rstudio/launcher.conf

[server]
address=127.0.0.1
port=5559
server-user=rstudio-server
admin-group=devops
authorization-enabled=1
thread-pool-size=4
enable-debug-logging=1

[cluster]
name=Local
type=Local
exe=/usr/lib/rstudio-server/bin/rstudio-local-launcher
allowed-groups=devs,admins

1.2.2 Job Launcher Plugin Configuration

Each specific cluster plugin can be additionally configured via its own configuration file, and some plugins (such as the Kubernetes plugin) require additional configuration. Documentation for all plugins created by RStudio can be found in the following sections.

1.2.2.1 Local Plugin

The Local Job Launcher Plugin launches executables on the local machine (the same machine the Launcher is running on). It can also run arbitrary PAM profiles. All sandboxing is provided via rsandbox.

The local plugin does not require configuration, and it is recommended you do not change any of the defaults.

/etc/rstudio/launcher.local.conf

Config Option | Description | Required (Y/N) | Default Value
server-user | User to run the executable as. The plugin should be started as root, and will lower its privilege to this user for normal execution. | N | rstudio-server
thread-pool-size | Size of the thread pool used by the plugin | N | Number of CPUs * 2
enable-debug-logging | Enables/disables verbose debug logging. Can be 1 (enabled) or 0 (disabled) | N | 0
scratch-path | Scratch directory where the plugin writes temporary state | N | /var/lib/rstudio-launcher
job-expiry-hours | Number of hours before completed jobs are removed from the system | N | 24
save-unspecified-output | Enables/disables saving of stdout/stderr that was not specified in submitted jobs. This allows users to view their output even if they do not explicitly save it, at the cost of disk space. | N | 1
rsandbox-path | Location of the rsandbox executable | N | /usr/lib/rstudio-server/bin/rsandbox
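
If you do need to override a default, the file uses the same key=value format as launcher.conf. The following fragment is purely illustrative (the value shown is a hypothetical override, not a recommendation):

```ini
# /etc/rstudio/launcher.local.conf -- optional; the defaults are recommended.
# Hypothetical override: keep completed jobs for 48 hours instead of 24.
job-expiry-hours=48
```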

1.2.2.2 Kubernetes Plugin

The Kubernetes Job Launcher Plugin provides the capability to launch executables on a Kubernetes cluster.

It is recommended that you leave the default values (which come from the Job Launcher itself) unchanged and configure only the required fields outlined below.

/etc/rstudio/launcher.kubernetes.conf

Config Option | Description | Required (Y/N) | Default Value
server-user | User to run the executable as. The plugin should be started as root, and will lower its privilege to this user for normal execution. | N | rstudio-server
thread-pool-size | Size of the thread pool used by the plugin. | N | Number of CPUs * 2
enable-debug-logging | Enables/disables verbose debug logging. Can be 1 (enabled) or 0 (disabled). | N | 0
scratch-path | Scratch directory where the plugin writes temporary state. | N | /var/lib/rstudio-launcher
job-expiry-hours | Number of hours before completed jobs are removed from the system. | N | 24
profile-config | Path to the user and group profiles configuration file (explained in more detail below). | N | /etc/rstudio/launcher.kubernetes.profiles.conf
api-url | The Kubernetes API base URL. This can be an HTTP or HTTPS URL, and should include everything up to, but not including, the /api endpoint. | Y | Example: https://192.168.99.100:8443
auth-token | The auth token for the job-launcher service account, used to authenticate with the Kubernetes API. This should be base-64 encoded. See below for more information. | Y |
kubernetes-namespace | The Kubernetes namespace to create jobs in. Note that the account specified by the auth-token setting must have full API privileges within this namespace. See Kubernetes Cluster Requirements below for more information. | N | rstudio
verify-ssl-certs | Whether or not to verify SSL certificates when connecting to api-url. Only applicable when connecting over HTTPS. For production use, this should always be enabled, but it can be disabled for testing purposes. | N | 1
watch-timeout-seconds | Number of seconds before watch calls to Kubernetes stop. This helps prevent job status updates from hanging in some environments. It is recommended that you keep the default, but it can be raised if job status hangs are not apparent, or disabled by setting this to 0. | N | 300
fetch-limit | The maximum number of objects to request per API call from the Kubernetes service for GET collection requests. It is recommended that you change this only if you run into size issues with the returned payloads. | N | 500
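
Since only api-url and auth-token are required, a minimal configuration file might look like the following sketch. The URL matches the example from the table above, and the token is a placeholder you must replace with your own cluster's value:

```ini
# /etc/rstudio/launcher.kubernetes.conf -- minimal sketch; replace both
# values with ones from your own cluster.
api-url=https://192.168.99.100:8443
auth-token=<token-from-kubectl>
```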

In order to retrieve the auth-token value, run the following commands. Note that the account must first be created and given appropriate permissions (see Kubernetes Cluster Requirements below).

KUBERNETES_AUTH_SECRET=$(kubectl get serviceaccount job-launcher --namespace=rstudio -o jsonpath='{.secrets[0].name}')
kubectl get secret $KUBERNETES_AUTH_SECRET --namespace=rstudio -o jsonpath='{.data.token}' | base64 -d

1.2.2.2.1 User and Group Profiles

The Kubernetes plugin also allows you to specify user and group configuration profiles, similar to RStudio Server Pro’s profiles, in the configuration file /etc/rstudio/launcher.kubernetes.profiles.conf (or any arbitrary file as specified in profile-config within the main configuration file; see above). These are entirely optional.

Profiles are divided into sections of three different types:

Global ([*])

Per-group ([@groupname])

Per-user ([username])

Here’s an example profiles file that illustrates each of these types:

/etc/rstudio/launcher.kubernetes.profiles.conf

[*]
placement-constraints=node,region:us,region:eu
default-cpus=1
default-mem-mb=512
max-cpus=2
max-mem-mb=1024
container-images=r-session:3.4.2,r-session:3.5.0
allow-unknown-images=0

[@rstudio-power-users]
default-cpus=4
default-mem-mb=4096
max-cpus=20
max-mem-mb=20480
container-images=r-session:3.4.2,r-session:3.5.0,r-session:preview
allow-unknown-images=1

[jsmith]
max-cpus=3

This configuration specifies that, by default, users may launch jobs with a maximum of 1024 MB of memory and a choice of only two R containers. Members of the rstudio-power-users group are allowed considerably more resources, can additionally see the r-session:preview image, and may run any image they specify.

Note that the profiles file is processed from top to bottom (i.e. settings matching the current user that occur later in the file always override ones that appeared prior). For example, given the file above, if jsmith is a member of rstudio-power-users, the [jsmith] section's max-cpus=3 overrides the group's max-cpus=20 because it appears later in the file. The settings available in the file are described in more depth in the table below.

/etc/rstudio/launcher.kubernetes.profiles.conf

Config Option | Description | Required (Y/N) | Default Value
container-images | Comma-separated string of allowed images that users may see and run. | N |
default-container-image | The default container image to use for the job if none is specified. | N |
allow-unknown-images | Whether users may run any image they want within their job containers (1), or must use one of the images specified in container-images (0). | N | 1
placement-constraints | Comma-separated string of available placement constraints in the form key1:value1,key2:value2,..., where the :value part is optional to indicate a free-form field. See the next section for more details. | N |
default-cpus | Number of CPUs available to a job by default if not specified by the job. | N | 0.0 (infinite - managed by Kubernetes)
default-mem-mb | Number of MB of RAM available to a job by default if not specified by the job. | N | 0.0 (infinite - managed by Kubernetes)
max-cpus | Maximum number of CPUs available to a job. | N | 0.0 (infinite - managed by Kubernetes)
max-mem-mb | Maximum number of MB of RAM available to a job. | N | 0.0 (infinite - managed by Kubernetes)

1.2.2.2.2 Kubernetes Cluster Requirements

In order for the Kubernetes plugin to run correctly, the following assumptions about the Kubernetes cluster must be true:

  • The Kubernetes API must be enabled and reachable from the machine running the Job Launcher
  • There must be a namespace to create jobs in, which can be specified via the kubernetes-namespace configuration mentioned above (this defaults to rstudio)
  • There must be a service account that has full API access for all endpoints and API groups underneath the aforementioned namespace, and the account’s auth token must be supplied to the plugin via the auth-token setting
  • The service account must have access to view the nodes list via the API (optional; without it, the IP addresses returned for a job are restricted to the node's internal IP, since the /nodes endpoint is needed to fetch a node's external IP address)
  • The cluster must have the metrics-server addon running and working properly to provide job resource utilization streaming

In order to use placement constraints, you must attach labels to the nodes that match the configured placement constraints. For example, if you have a node with the label az=us-east and a placement constraint defined as az:us-east, incoming jobs that specify the az:us-east placement constraint will be routed to that node. For more information on placement in Kubernetes, see the Kubernetes documentation on assigning Pods to nodes.
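
As a sketch of the node-labeling step just described (the node name worker-1 is hypothetical), a label matching an az:us-east constraint could be attached as follows; the command is skipped when kubectl is unavailable:

```shell
# Attach a label matching the az:us-east placement constraint to a node.
# "worker-1" is a hypothetical node name; substitute one from your cluster.
if command -v kubectl >/dev/null 2>&1; then
  kubectl label node worker-1 az=us-east
else
  echo "kubectl not found; skipping"
fi
```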

The following sample script can be run to create a job-launcher service account and rstudio namespace, granting the service account (and thus, the launcher) full API access to manage RStudio jobs:

kubectl create namespace rstudio
kubectl create serviceaccount job-launcher --namespace rstudio
kubectl create rolebinding job-launcher-admin \
   --clusterrole=cluster-admin \
   --group=system:serviceaccounts:rstudio \
   --namespace=rstudio
kubectl create clusterrole job-launcher-clusters \
   --verb=get,watch,list \
   --resource=nodes
kubectl create clusterrolebinding job-launcher-list-clusters \
  --clusterrole=job-launcher-clusters \
  --group=system:serviceaccounts:rstudio
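
After running the script above, it can be useful to confirm that the service account actually has the expected access. The following sketch uses kubectl auth can-i (pods is just one example of an API object the Launcher manages in the namespace); the check is skipped when kubectl is not on the PATH:

```shell
# Hypothetical post-setup check: confirm the job-launcher service account
# can create resources in the rstudio namespace.
if command -v kubectl >/dev/null 2>&1; then
  kubectl auth can-i create pods \
    --namespace=rstudio \
    --as=system:serviceaccount:rstudio:job-launcher
else
  echo "kubectl not found; skipping check"
fi
```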

1.3 Running the Service

Once configured, you can run the Job Launcher as a service by executing the command sudo rstudio-launcher start. The Launcher service needs root privilege to perform authentication and authorization, and to provide child plugin processes with root privilege as needed. After initial setup, the Job Launcher lowers its privilege to the server user (see Configuration Options for more information).

If the Job Launcher service fails to start or does not keep running, one of its plugins likely exited with an error and is probably misconfigured. When setting up for the first time, it is often easier to run the Job Launcher directly in the terminal, so you can more easily see any reported errors and more quickly test configuration changes. To run from the console, execute the command sudo /usr/lib/rstudio-server/rstudio-launcher. If you are still having trouble starting the service, see Logging and Troubleshooting.

The service is not automatically configured to start on system startup, and you must enable this manually if desired by using the following commands:

systemd

systemctl enable rstudio-launcher.service

System V

chkconfig --add rstudio-launcher

1.4 Logging and Troubleshooting

By default, the Job Launcher and its plugins write logs to the system logger. If the service fails to start, check the system log to see if there are any errors, which should help you determine what is going wrong. In general, errors are usually a result of misconfiguration of the Job Launcher or one of its plugins. When initially setting up the Launcher, it is sometimes helpful to run it directly from the command line, as opposed to running it via the service. See Running the Service for more information.

When running into issues that you are unable to resolve, make sure to enable debug logging for the Job Launcher by adding the line enable-debug-logging=1 to /etc/rstudio/launcher.conf. This will cause the Launcher and all of its plugins to emit debug output. This debug output can be seen on the console (if running the Job Launcher manually in the terminal), or in a debug log file located under the /var/lib/rstudio-launcher folder for the Job Launcher service, and under the plugin’s subdirectory for plugin-specific logging.

1.5 Load Balancing and Monitoring

The Job Launcher can be load balanced. It is recommended that you use an active/active setup for maximum throughput and scalability. This means that you should have multiple Job Launcher nodes pointed to your specific cluster back-ends, and have a load balancer configured to round-robin traffic between them.

Whether the Job Launcher can be load balanced effectively depends on each plugin's design. For example, the Local plugin does not provide load balancing capabilities; it should only be used in specific deployment scenarios and should not be used in most cases. The other RStudio plugins work properly in a load-balanced setup. Before deploying, verify which of your plugins support load balancing; RStudio cannot provide load balancing guarantees for third-party plugins.

The /status endpoint of the Job Launcher can be used to get the current health status and other connection information. Unlike other Job Launcher endpoints, this endpoint does not require authorization and may be queried by any monitoring or load balancing software to determine the health of a specific Job Launcher node. The status field indicates whether a node is experiencing no issues (“Green”), one or more plugins are restarting or unavailable (“Yellow”) or all plugins have failed and service shutdown is imminent (“Red”). It is recommended that you reroute traffic to another launcher node if you receive a “Yellow” or “Red” status, or if the page fails to load.
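
A monitoring probe can act on the status field directly. The following sketch assumes the /status response body contains the status value as the literal words Green, Yellow, or Red; verify the exact payload format against your Launcher version before relying on it:

```shell
# check_launcher_status: map a /status response body to an exit code that a
# load balancer health check can act on. The response format (a body
# containing the words Green/Yellow/Red) is an assumption.
check_launcher_status() {
  case "$1" in
    *Green*)  return 0 ;;  # healthy: keep routing traffic here
    *Yellow*) return 1 ;;  # degraded: reroute to another node
    *Red*)    return 2 ;;  # failing: reroute immediately
    *)        return 3 ;;  # unknown or unreachable
  esac
}

# Usage (assuming the sample config above, listening on 127.0.0.1:5559):
#   check_launcher_status "$(curl -s http://127.0.0.1:5559/status)"
```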