The pins R package provides a way for R
users to easily share data sets, models, and other R objects. Your resources
may be text files (CSV, JSON, etc.), R objects (.Rda, etc.), or any
other format you use to share data. Pinned objects can be stored on a variety
of "boards", including local folders (to share on a networked drive or with
Dropbox), RStudio Connect, Amazon S3, and more.
Sharing data can be useful in many situations, for example:
Multiple pieces of content require the same input data. Rather than copying that data, each piece of content references a single source of truth hosted on RStudio Connect.
Content depends on data or model objects that need to be regularly updated. Rather than redeploying the content each time the data changes, use a pinned resource and update only the data. The data update can occur using a scheduled R Markdown document. Your content will read the newest data on each run.
You need to share resources that aren't structured for traditional tools like databases. For example, models saved as R objects aren't easy to store in a database. Rather than using email or file systems to share data files, use RStudio Connect to host these resources as pins.
Pins and large data sets
An important factor in determining whether or not to use a pin is the size of the data or object in use. As a general rule of thumb, we don't recommend using pins with files over 500 MB. If you find yourself routinely pinning data larger than this, then you might need to reconsider your data engineering pipeline. Please see Reading and writing data to learn more.
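As a rough way to apply this rule of thumb before pinning, the sketch below serializes an object to a temporary .rds file and compares the file size to the 500 MB guideline. The helper name `pin_size_mb` and the variable `max_pin_mb` are hypothetical, and the size of the stored pin may differ depending on the pin type and compression, so treat the result as an estimate only.

```r
# Hypothetical helper: serialize an object to a temporary .rds file and
# report the resulting file size in megabytes.
pin_size_mb <- function(x) {
  path <- tempfile(fileext = ".rds")
  on.exit(unlink(path), add = TRUE)  # clean up the temporary file
  saveRDS(x, path)
  file.size(path) / 1024^2
}

# Assumed threshold, taken from the 500 MB rule of thumb.
max_pin_mb <- 500

size_mb <- pin_size_mb(mtcars)
small_enough <- size_mb < max_pin_mb

if (!small_enough) {
  warning("Object is about ", round(size_mb), " MB; consider another storage option.")
}
```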
Recommended Resource: Pro Tips for Pins
Create a Pin Board
RStudio Connect is easy to use as a board for pinning R objects. Create a board
with pins::board_rsconnect(). This function takes an auth argument that
determines how you will authenticate to RStudio Connect. Use
auth = "envvar" if you have already defined CONNECT_SERVER and
CONNECT_API_KEY as environment variables in your R session.
library(pins)
board <- board_rsconnect(auth = "envvar")
RStudio Connect will automatically apply values for these environment variables for deployed content at run time, so there is no need to include them in your code (hard-coding credentials is never a best practice) or specify them in the Vars Pane unless your server administrator has disabled that function.
The automatic generation of these environment variables may be disabled for security reasons. Reach out to your RStudio Connect server administrator or review the Admin Guide for additional details.
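Before creating the board, it can help to confirm that the variables are actually present in the session. The sketch below uses only base R. The helper `missing_connect_vars` is hypothetical; CONNECT_SERVER is included alongside CONNECT_API_KEY because the pins documentation lists both as the variables read by auth = "envvar".

```r
# Hypothetical helper: return the names of any required variables that are unset.
missing_connect_vars <- function(vars = c("CONNECT_SERVER", "CONNECT_API_KEY")) {
  values <- Sys.getenv(vars, unset = "")
  vars[values == ""]
}

missing <- missing_connect_vars()
if (length(missing) > 0) {
  message("Not set: ", paste(missing, collapse = ", "))
} else {
  message("Both variables are set; board_rsconnect(auth = \"envvar\") can use them.")
}
```

If a variable is reported as not set in deployed content, that may indicate the administrator has disabled automatic injection, in which case the values would need to be supplied through the Vars Pane.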
Read and Write Pins
Once you have a pin board, you can write data to it with pin_write():
mtcars <- tibble::as_tibble(mtcars)
board %>% pin_write(mtcars, "mtcars")
The first argument is the object to save (usually a data frame, but it can be
any R object), and the second argument gives the "name" of the pin. On RStudio
Connect, this name will be used along with your username to retrieve or read
data from the pin. Running the code above should yield a success message that
looks something like this:
Writing to pin 'my.username/mtcars'.
After you’ve pinned an object, you can read it back with pin_read():
board %>% pin_read("my.username/mtcars")
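To try the write and read steps without a Connect server, the following sketch uses `board_temp()`, a throwaway local board from the pins package, as a stand-in. The pin name "cars" and the `type = "rds"` choice are illustrative assumptions. On a local board the pin name is not prefixed with a username.

```r
library(pins)

# board_temp() creates a temporary local board; it stands in for RStudio Connect here.
board <- board_temp()

# Write a small data frame as an .rds pin.
cars <- head(mtcars)
pin_write(board, cars, "cars", type = "rds")

# Read the pin back and compare it to the original object.
cars_back <- pin_read(board, "cars")
all.equal(cars, cars_back)
```

Swapping `board_temp()` for `board_rsconnect(auth = "envvar")` should leave the rest of the calls unchanged, apart from the username prefix when reading.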
Every pin is accompanied by some metadata that you can access with pin_meta():
board %>% pin_meta("my.username/mtcars")
This will return the metadata generated by default. Pins from RStudio Connect
will have a
When creating the pin, you can override the default description or provide additional metadata that is stored with the data:
board %>% pin_write(
  mtcars,
  description = "Data extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).",
  metadata = list(
    source = "Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391–411."
  )
)
Learn more about Pin Metadata.
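The sketch below shows one way to check that a custom description and metadata were stored, using `pin_meta()`, the pins function for reading a pin's metadata. It runs against a temporary local board rather than RStudio Connect, and the shortened description and source strings are placeholders chosen for illustration.

```r
library(pins)

# Temporary local board used as a stand-in for RStudio Connect.
board <- board_temp()

pin_write(
  board, mtcars, "mtcars",
  type = "rds",
  description = "Motor Trend car road tests",
  metadata = list(source = "Henderson and Velleman (1981)")
)

meta <- pin_meta(board, "mtcars")
meta$description   # the description supplied above
meta$user$source   # custom metadata is returned under the `user` field
```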
Using a Pin
Once a pin has been deployed, it is easy to share the pin with colleagues. You can either share the link to the pin in RStudio Connect, or colleagues can search for resources using the pins package within RStudio. See Sharing tidied data to learn more.
RStudio Connect provides a preview of pinned data objects, their metadata, and a direct download link, all of which can be accessed at the content URL.
You can manage content settings for deployed pins just like you would for other content types. For example, you can manage access controls to pins to determine who should be able to view and utilize the resource.
Pins are objects; they are not backed by source code, so they cannot be
directly scheduled. A common pattern for updating pinned data on a schedule is
to call pins::pin_write() inside a scheduled R Markdown document. Writing to the same pin multiple times creates a version history, which can be accessed under the "More" button dropdown menu.
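The chunk below is a minimal sketch of what such a scheduled document might run, shown against a temporary versioned board so it can be tried locally. The function `refresh_data` and the pin name "daily_data" are hypothetical stand-ins for a real data pull, and the `Sys.sleep()` call only simulates time passing between two scheduled runs.

```r
library(pins)

# Temporary local board with versioning turned on; a stand-in for RStudio Connect.
board <- board_temp(versioned = TRUE)

# Hypothetical refresh step: a real document would query a database or an API here.
refresh_data <- function() {
  data.frame(refreshed_at = format(Sys.time()), value = runif(1))
}

# First scheduled run.
pin_write(board, refresh_data(), "daily_data", type = "rds")

# Pause so the second write gets a distinct timestamp.
Sys.sleep(1)

# Second scheduled run writes to the same pin name.
pin_write(board, refresh_data(), "daily_data", type = "rds")

# List the stored versions of the pin.
versions <- pin_versions(board, "daily_data")
nrow(versions)
```

Each write with changed contents should add one row to the version listing. A specific version can then be requested through the `version` argument of `pin_read()`.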