8 Auditing and Monitoring

8.1 Auditing Configuration

8.1.1 R Console Auditing

RStudio Server can be optionally configured to audit all R console activity by writing console input and output to a central location (the /var/lib/rstudio-server/audit/r-console directory by default). This feature can be enabled using the audit-r-console setting. For example:

/etc/rstudio/rserver.conf

audit-r-console=input

This will audit all R console input. If you wish to record both console input and output then you can use the all setting. For example:

/etc/rstudio/rserver.conf

audit-r-console=all

Note that if you choose to record both input and output you’ll need considerably more storage available than if you record input only. See the Storage Options section below for additional discussion of storage requirements and configuration.

8.1.1.1 Data Format

The R console activity for each user is written into individual files within the r-console data directory (by default /var/lib/rstudio-server/audit/r-console). The following fields are included:

session_id  Unique identifier for the R session where this action occurred.
project     Path to the RStudio project directory, if the action occurred within a project.
pid         Unix process ID where this console action occurred.
username    Unix user who executed this console action.
timestamp   Timestamp of the action in milliseconds since the epoch.
type        Console action type (prompt, input, output, or error).
data        Console data associated with this action (e.g. output text).

The session_id field refers to a concurrent R session as described in the section on Multiple R Sessions (i.e. it can span multiple projects and/or pids).

The default format for the log file is CSV (Comma Separated Values). It’s also possible to write the data to Newline Delimited JSON by using the audit-r-console-format option. For example:

audit-r-console-format=json

Note that when using the JSON format the file as a whole is not a valid JSON document; instead, each line is a separate JSON object. This follows the Newline Delimited JSON specification supported by several libraries including the R jsonlite package.
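As a rough illustration of consuming the JSON format, the following Python sketch reads newline-delimited records and converts the millisecond timestamp into a date. It assumes that the JSON keys match the field names listed above; the sample records and the function name are invented for the example.

```python
import json
from datetime import datetime, timezone


def read_console_audit(lines):
    """Parse newline-delimited JSON audit records (one JSON object per line)."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        # The timestamp field is in milliseconds since the epoch.
        record["time"] = datetime.fromtimestamp(
            int(record["timestamp"]) / 1000, tz=timezone.utc
        )
        records.append(record)
    return records


# Hypothetical sample data; key names are assumed to match the documented fields.
sample = [
    '{"session_id": "a1b2", "project": "", "pid": 4321, "username": "alice", '
    '"timestamp": 1500000000000, "type": "input", "data": "summary(cars)"}',
    '{"session_id": "a1b2", "project": "", "pid": 4321, "username": "alice", '
    '"timestamp": 1500000000500, "type": "output", "data": "..."}',
]

records = read_console_audit(sample)
inputs = [r["data"] for r in records if r["type"] == "input"]
print(inputs)
```

In practice the lines would come from an open file handle on one of the per-user files in the r-console data directory.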

8.1.1.2 Storage Options

You can customize both the location where audit data is written as well as the maximum amount of data to log per-user (by default this is 50 MB). To specify the root directory for audit data you use the audit-data-path setting. For example:

/etc/rstudio/rserver.conf

audit-data-path=/audit-data

Note that this path affects the location of both R console auditing and R session auditing data.

To specify the maximum amount of data to write to an individual user’s R console log file you use the audit-r-console-user-limit-mb setting. For example:

/etc/rstudio/rserver.conf

audit-r-console-user-limit-mb=100

The default maximum R console log file size is 50 megabytes per user. To remove the size limit entirely, set the value to 0. For example:

/etc/rstudio/rserver.conf

audit-r-console-user-limit-mb=0

If you wish for RStudio to automatically roll the log files once the maximum size is reached, set the audit-r-console-user-limit-months setting. For example:

/etc/rstudio/rserver.conf

audit-r-console-user-limit-months=2

This will cause log files to be rolled over once the maximum size is reached, and only two months of data will be kept. Note that this setting is not set by default.

Note that if the month limit is not set, then log files will not be rolled automatically. Depending on the number of users and their activity level, you should create a scheduled (e.g. cron) job to periodically move the files off the server onto auxiliary storage, ensure that the volume they are stored on has sufficient capacity, or both.
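One possible shape for such a scheduled job is sketched below in Python. It is only an assumption about how you might archive the files: the function name, the 30-day threshold, and the idea of moving only files that have not been modified recently (to avoid touching files the server is still writing to) are illustrative choices, not RStudio behavior.

```python
import shutil
import time
from pathlib import Path


def archive_audit_files(audit_dir, archive_dir, min_age_days=30, now=None):
    """Move files not modified within min_age_days from audit_dir to archive_dir.

    Returns the names of the files that were moved.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # seconds in a day
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(Path(audit_dir).iterdir()):
        # Only plain files whose last modification is older than the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            shutil.move(str(path), str(archive / path.name))
            moved.append(path.name)
    return moved
```

A cron entry could then call, for example, archive_audit_files("/var/lib/rstudio-server/audit/r-console", "/mnt/archive/r-console"), where the archive path is a placeholder for your own auxiliary storage.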

8.1.2 R Session Auditing

RStudio Server can be optionally configured to write an audit log of session related events (e.g. login/logout, session start/suspend/exit) to a central location (the /var/lib/rstudio-server/audit/r-sessions directory by default). This feature can be enabled using the audit-r-sessions setting. For example:

/etc/rstudio/rserver.conf

audit-r-sessions=1

Note that this is enabled by default if you are using named user licenses.

8.1.2.1 Data Format

The R session event log is written by default to the file at /var/lib/rstudio-server/audit/r-sessions/r-sessions.csv. The following fields are included:

pid        Unix process ID the event is associated with (for auth events this will be the main rserver process, for session events the rsession process).
username   Unix user that the event is associated with.
timestamp  Timestamp of event in milliseconds since the epoch.
type       Event type (see documentation on event types below).
data       Administrative user that initiated event (only applies to admin events and auth_login for login-as-user by admin).

The following values are valid for the event type field:

auth_login               User logged in to RStudio Server
auth_logout              User logged out of RStudio Server
auth_login_failed        User login attempt failed
session_start            R session started
session_suicide          R session exiting due to suicide (internal error)
session_suspend          R session exiting due to suspend
session_quit             R session exiting due to user quit
session_exit             R session exited
session_admin_suspend    Administrator attempt to suspend R session
session_admin_terminate  Administrator attempt to terminate R session
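To show how the event types can be used, here is a small Python sketch that counts failed login attempts per user from the CSV log. It assumes the columns appear in the order of the field list above; whether the file starts with a header row is not stated here, so the code is written to work either way. The sample rows are invented.

```python
import csv
from collections import Counter

# Assumed column order, taken from the field list above.
FIELDS = ["pid", "username", "timestamp", "type", "data"]


def failed_logins(lines):
    """Count auth_login_failed events per username in CSV session-event lines."""
    counts = Counter()
    for row in csv.DictReader(lines, fieldnames=FIELDS):
        # A header row, if present, has type == "type" and is skipped here.
        if row["type"] == "auth_login_failed":
            counts[row["username"]] += 1
    return counts


# Hypothetical sample rows for illustration only.
sample = [
    "1201,alice,1500000000000,auth_login,",
    "1201,bob,1500000001000,auth_login_failed,",
    "1201,bob,1500000002000,auth_login_failed,",
    "3377,alice,1500000003000,session_start,",
]

print(failed_logins(sample))
```

The same function accepts an open file object, since csv.DictReader iterates over lines.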

The default format for the log file is CSV (Comma Separated Values). It’s also possible to write the data to Newline Delimited JSON by using the audit-r-sessions-format option. For example:

audit-r-sessions-format=json

Note that when using the JSON format the file as a whole is not a valid JSON document; instead, each line is a separate JSON object. This follows the Newline Delimited JSON specification supported by several libraries including the R jsonlite package.

8.1.2.2 Storage Options

You can customize both the location where audit data is written as well as the maximum amount of R session event data to log (by default this is 1 GB). To specify the root directory for audit data you use the audit-data-path setting. For example:

/etc/rstudio/rserver.conf

audit-data-path=/audit-data

Note that this path affects the location of both R console auditing and R session auditing data.

To specify the maximum amount of R session event data to log you use the audit-r-sessions-limit-mb setting. For example:

/etc/rstudio/rserver.conf

audit-r-sessions-limit-mb=2048

The default maximum R session event log file size is 1 GB (1024 MB). To remove the size limit entirely, set the value to 0. For example:

/etc/rstudio/rserver.conf

audit-r-sessions-limit-mb=0

If you wish for RStudio to automatically roll the log files once the maximum size is reached, use the audit-r-sessions-limit-months setting, which defaults to 13 months. To set it explicitly, for example:

/etc/rstudio/rserver.conf

audit-r-sessions-limit-months=13

This will cause log files to be rolled over once the maximum size is reached, and only thirteen months of data will be kept. We do not recommend you change this setting if using named user licenses.

Note that if the month limit is not set, then log files will not be rolled automatically. In that case you should create a scheduled (e.g. cron) job to periodically move the file off the server onto auxiliary storage, ensure that the volume it is stored on has sufficient capacity, or both.

In any case, the amount of data written to the R session event log file is not large (less than 1 KB per session) so a large number of session events can be stored within the default 1 GB maximum log file size.
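The arithmetic behind that statement can be checked with a few lines of Python. The only inputs are the figures given above (a 1024 MB limit and at most 1 KB per session); the variable names are illustrative.

```python
# Figures from the text: default limit of 1 GB (1024 MB), under 1 KB per session.
limit_mb = 1024
max_bytes_per_session = 1024  # upper bound, so the result is a lower bound

limit_bytes = limit_mb * 1024 * 1024
min_sessions = limit_bytes // max_bytes_per_session

print(f"At least {min_sessions:,} sessions fit within the default limit")
```

Because each session writes less than 1 KB, the default limit accommodates more than a million sessions.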

8.2 Monitoring Configuration

8.2.1 System and Per-User Resources

RStudio Server monitors the use of resources (CPU, memory, etc.) on both a per-user and system wide basis. By default, monitoring data is written to a set of RRD (http://oss.oetiker.ch/rrdtool/) files and can be viewed using the Administrative Dashboard.

The storage of system monitoring data requires about 20 MB of disk space and the storage of user monitoring data requires about 3.5 MB per user. This data is stored by default at /var/lib/rstudio-server/monitor. If you have a large number of users you may wish to specify an alternate volume for monitoring data. You can do this using the monitor-data-path setting. For example:

/etc/rstudio/rserver.conf

monitor-data-path=/monitor-data
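When sizing the volume for monitoring data, the approximate figures above (about 20 MB for system data plus about 3.5 MB per user) can be turned into a quick estimate. The Python sketch below does just that; the function name is invented and the result is only as accurate as those rounded figures.

```python
SYSTEM_MB = 20.0     # approximate system monitoring data, from the text
PER_USER_MB = 3.5    # approximate per-user monitoring data, from the text


def monitoring_storage_mb(users):
    """Estimate disk space (in MB) needed for RRD monitoring data."""
    return SYSTEM_MB + PER_USER_MB * users


for users in (50, 1000):
    print(users, "users:", monitoring_storage_mb(users), "MB")
```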

You also might wish to disable monitoring with RRD entirely. You can do this using the monitor-rrd-enabled setting. For example:

/etc/rstudio/rserver.conf

monitor-rrd-enabled=0

Note that changes to the configuration will not take effect until the server is restarted.

8.2.1.1 Analyzing RRD files

The RRD files powering RStudio’s Administrative Dashboard are available for your own analysis, too. You can find them in /var/lib/rstudio-server/monitor/rrd (unless you’ve changed monitor-data-path as described above); they store all the metrics you can see on the dashboard, so you can use the information for your own reports and insights.

More information on how to read and visualize RRD data from R is available in the following blog post:

Reading and analysing log files in the RRD database format

8.2.2 Using Graphite

If you are managing several servers it might be convenient to send server monitoring data to a centralized database and graphing facility as opposed to local RRD files. You can do this by configuring the server to send monitoring data to Graphite (or any other engine compatible with the Carbon protocol). This can be done in addition to or entirely in place of RRD.

There are four settings that control interaction with Graphite:

monitor-graphite-enabled    Write monitoring data to Graphite (defaults to 0)
monitor-graphite-host       Host running Graphite (defaults to 127.0.0.1)
monitor-graphite-port       Port Graphite is listening on (defaults to 2003)
monitor-graphite-client-id  Optional client ID for sender

For example, to enable Graphite monitoring on a remote host with the default Graphite port you would use these settings:

/etc/rstudio/rserver.conf

monitor-graphite-enabled=1
monitor-graphite-host=134.47.22.6

If you are using a service like hostedgraphite.com that requires you to provide an API key as part of reporting metrics, you can use the monitor-graphite-client-id setting. For example:

/etc/rstudio/rserver.conf

monitor-graphite-enabled=1
monitor-graphite-host=carbon.hostedgraphite.com
monitor-graphite-client-id=490662a4-1d8c-11e5-b06d-000c298f3d04

Note that changes to the configuration will not take effect until the server is restarted.
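For orientation, the Carbon plaintext protocol carries one metric per line in the form "path value timestamp". The Python sketch below formats such a line, which can help when checking what a Carbon-compatible receiver should expect. The metric path is hypothetical (the actual metric names sent by the server are not listed here), and prefixing the client ID to the path reflects the convention used by hosted Graphite services, assumed rather than confirmed for RStudio.

```python
def carbon_line(path, value, timestamp, client_id=None):
    """Format one metric in the Carbon plaintext protocol: 'path value timestamp'."""
    if client_id:
        # Assumption: the client ID (e.g. an API key) is prefixed to the path.
        path = f"{client_id}.{path}"
    return f"{path} {value} {int(timestamp)}\n"


# Hypothetical metric name, for illustration only.
line = carbon_line("rstudio.example.cpu_percent", 12.5, 1500000000)
print(line, end="")
```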

8.3 Server Health Checks

8.3.1 Enabling Health Checks

You may wish to periodically poll RStudio Server to ensure that it’s still responding to requests as well as to examine various indicators of server load. You can enable a health check endpoint using the server-health-check-enabled setting. For example:

/etc/rstudio/rserver.conf

server-health-check-enabled=1

After restarting the server, the following health-check endpoint will be available:

http://<server-address-and-port>/health-check

By default, the output of the health check will appear as follows:

active-sessions: 1
idle-seconds: 0
cpu-percent: 0.0
memory-percent: 64.2
swap-percent: 0.0
load-average: 4.1
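A monitoring script could poll this endpoint and parse the default output. The following Python sketch assumes the "name: value" layout shown above; the function names are illustrative, and the URL passed to fetch_health_check must be replaced with your own server address and port.

```python
from urllib.request import urlopen


def parse_health_check(text):
    """Parse 'name: value' lines from the default health check output."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that do not contain a colon
            metrics[key.strip()] = float(value)
    return metrics


def fetch_health_check(url, timeout=5):
    """Fetch and parse the health check from the given URL."""
    with urlopen(url, timeout=timeout) as response:
        return parse_health_check(response.read().decode("utf-8"))


SAMPLE = """active-sessions: 1
idle-seconds: 0
cpu-percent: 0.0
memory-percent: 64.2
swap-percent: 0.0
load-average: 4.1
"""

print(parse_health_check(SAMPLE))
```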

8.3.2 Customizing Responses

The response to the health check is determined by processing a template that includes several variables. The default template is:

active-sessions: #active-sessions#
idle-seconds: #idle-seconds#
cpu-percent: #cpu-percent#
memory-percent: #memory-percent#
swap-percent: #swap-percent#
load-average: #load-average#
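Conceptually, each #name# placeholder is replaced with the current value of that variable. The Python sketch below mimics this substitution so that a custom template can be previewed offline; it is an approximation of the behavior described above, not the server's actual implementation, and leaving unknown placeholders untouched is a choice made for this example.

```python
import re


def render_template(template, values):
    """Replace #name# placeholders with values; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return re.sub(r"#([a-z-]+)#", substitute, template)


template = "active-sessions: #active-sessions#\ncpu-percent: #cpu-percent#\n"
print(render_template(template, {"active-sessions": 1, "cpu-percent": 0.0}), end="")
```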

You can customize this template to return an alternate format (e.g. XML or JSON) that is parseable by an external monitoring system. To do this, create a template and copy it to /etc/rstudio/health-check. For example, an XML format:

/etc/rstudio/health-check

<?xml version="1.0" encoding="UTF-8"?>
<health-check>
  <active-sessions>#active-sessions#</active-sessions>
  <idle-seconds>#idle-seconds#</idle-seconds>
  <cpu-percent>#cpu-percent#</cpu-percent>
  <memory-percent>#memory-percent#</memory-percent>
  <swap-percent>#swap-percent#</swap-percent>
  <load-average>#load-average#</load-average>
</health-check>

You can also expose the metrics in a form that Prometheus can scrape. Prometheus is an open-source systems monitoring and alerting toolkit with its own text-based input format:

/etc/rstudio/health-check

# HELP active_sessions health_check metric Active RStudio sessions
# TYPE active_sessions gauge
active_sessions #active-sessions#
# HELP idle_seconds health_check metric Time since active RStudio sessions
# TYPE idle_seconds gauge
idle_seconds #idle-seconds#
# HELP cpu_percent health_check metric cpu (percentage)
# TYPE cpu_percent gauge
cpu_percent #cpu-percent#
# HELP memory_percent health_check metric memory used (percentage)
# TYPE memory_percent gauge
memory_percent #memory-percent#
# HELP swap_percent health_check metric swap used (percentage)
# TYPE swap_percent gauge
swap_percent #swap-percent#
# HELP load_average health_check metric cpu load average
# TYPE load_average gauge
load_average #load-average#
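To sanity-check what a template like this produces, the rendered response can be parsed with a few lines of Python. The sketch below handles only the simple "name value" sample lines used in the template above (no labels or trailing timestamps) and skips the # HELP and # TYPE comment lines; the sample response is invented.

```python
def parse_prometheus_text(text):
    """Parse simple 'name value' sample lines, ignoring '#' comment lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, HELP or TYPE comment
        name, value = line.split(None, 1)
        samples[name] = float(value)
    return samples


# Hypothetical rendered response, for illustration only.
rendered = """# HELP active_sessions health_check metric Active RStudio sessions
# TYPE active_sessions gauge
active_sessions 1
# HELP cpu_percent health_check metric cpu (percentage)
# TYPE cpu_percent gauge
cpu_percent 0.0
"""

print(parse_prometheus_text(rendered))
```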

8.3.3 Changing the URL

It’s also possible to customize the URL used for health checks. RStudio Server uses the first file in the /etc/rstudio directory whose name begins with health-check as the template, and requires that the full file name be specified in the URL. For example, a health check template located at the following path:

/etc/rstudio/health-check-B64C900E

Would be accessed using this URL:

http://<server-address-and-port>/health-check-B64C900E

Note that changes to the health check template will not take effect until the server is restarted.
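If the goal of a custom URL is to make the endpoint hard to guess, a random suffix like the one in the example above can be generated programmatically. This Python sketch is one way to do so; the eight-character uppercase hexadecimal suffix simply mirrors the example file name and is not a requirement stated by RStudio.

```python
import secrets


def health_check_filename(nbytes=4):
    """Return a template file name with a random uppercase hex suffix."""
    return "health-check-" + secrets.token_hex(nbytes).upper()


name = health_check_filename()
print("/etc/rstudio/" + name)
```

The printed path is where the template would be saved, and the same file name is then appended to the server address to form the health check URL.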