B Package Ecosystem

The R package ecosystem has a few key components: packages, repositories, Git-hosted sources, and libraries.

B.1 Packages

Packages are the primary extension mechanism for R. They can be used to share functions, datasets, and documentation. An R package can exist in a few states:

B.1.1 Source

An R package is composed of a series of directories and files. The source of an R package is just a top-level directory containing the components of the package. Package authors work with source packages during development. Git repositories, including those hosted on GitHub, commonly store source packages.

B.1.2 Bundle

A bundled package is a source package that has been compressed into a single file. By convention, package bundles in R use the extension .tar.gz.
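To make the "single compressed file" idea concrete, here is a minimal sketch that packs a toy source directory into a .tar.gz file and lists what is inside. The package name demopkg and its DESCRIPTION fields are hypothetical. Real bundles are normally produced with R CMD build, which performs additional steps that this sketch does not attempt.

```r
# Hypothetical minimal source package: a directory holding a DESCRIPTION file.
work_dir <- tempfile("pkg-demo-")
dir.create(file.path(work_dir, "demopkg"), recursive = TRUE)
writeLines(
  c("Package: demopkg", "Version: 0.0.1", "Title: Demo"),
  file.path(work_dir, "demopkg", "DESCRIPTION")
)

# Compress the directory into one .tar.gz file using R's internal tar.
old_wd <- setwd(work_dir)
bundle <- "demopkg_0.0.1.tar.gz"
tar(bundle, files = "demopkg", compression = "gzip", tar = "internal")

# List the files stored inside the archive without extracting them.
bundle_contents <- untar(bundle, list = TRUE, tar = "internal")
setwd(old_wd)

print(bundle_contents)
```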

B.1.3 Binary

A binary package is the result of building a source package for a specific operating system. Binary packages are single files that are ready for installation on their specific operating systems.

B.1.4 Installed

An installed package is a binary package that has been decompressed into a package library and is ready for use by R.
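As an illustration, an installed package can be inspected as an ordinary directory. The sketch below uses the stats package only because it ships with R and can be assumed to be present. Any other installed package would work the same way.

```r
# Locate the directory where the installed package lives.
installed_dir <- system.file(package = "stats")
print(installed_dir)

# An installed package keeps its metadata in a DESCRIPTION file.
has_description <- file.exists(file.path(installed_dir, "DESCRIPTION"))
print(has_description)

# The same metadata is available through packageDescription().
desc <- packageDescription("stats")
print(desc$Version)
```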

B.2 Repositories

Repositories organize R packages for distribution to end users. Repositories contain package bundles and binaries that are organized in a specific way so that users can install packages from the repository using R’s install.packages function. CRAN and Bioconductor are examples of R repositories.
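The "specific way" a repository is organized can be seen with contrib.url, which computes where install.packages looks for bundles and binaries under a repository URL. This is a small sketch, not a complete description of the repository layout. The CRAN URL is real, and the package name in the commented-out call is an arbitrary example.

```r
repo <- "https://cran.r-project.org"

# Source bundles (.tar.gz) live under src/contrib.
src_url <- contrib.url(repo, type = "source")
print(src_url)

# Binary packages live in platform- and R-version-specific directories.
win_url <- contrib.url(repo, type = "win.binary")
print(win_url)

# Installing from the repository requires network access, so the call is
# left commented out here:
# install.packages("jsonlite", repos = repo)
```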

B.3 Git(hub)

Many R package sources are stored in version-controlled directories. A popular version control tool is Git, and GitHub, a hosting service for Git repositories, houses many package sources. The devtools R package includes convenience functions for installing packages from the source contained in a Git repository, including repositories hosted on GitHub. Used in this manner, Git repositories and GitHub are one way to distribute R packages, but they are not R package repositories.
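A minimal sketch of this kind of installation follows, assuming the devtools package is installed and network access is available. The wrapper name install_from_github and the "owner/repo" value are hypothetical placeholders. devtools::install_github is the convenience function being illustrated.

```r
# Hypothetical wrapper around devtools::install_github().
install_from_github <- function(repo) {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("The devtools package is required for this sketch.")
  }
  # Downloads the package source from GitHub, then builds and installs it.
  devtools::install_github(repo)
}

# Example call (needs network access; "owner/repo" is a placeholder):
# install_from_github("owner/repo")
```

Note that this path installs straight from source, so none of the repository structure used by install.packages is involved.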

B.4 Libraries

End users of R typically interact with installed packages that live in libraries. Package libraries are just directories containing installed packages. When a package is requested by R, R searches the different library directories to find the installed package.
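The library search can be observed directly from an R session. The sketch below lists the library directories R knows about and shows which one holds a given installed package. The stats package is used only because it is always available.

```r
# Library directories, in the order R searches them.
lib_dirs <- .libPaths()
print(lib_dirs)

# find.package() returns the installed package's directory. Its parent
# directory is the library the package was found in.
pkg_dir <- find.package("stats")
pkg_lib <- dirname(pkg_dir)
print(pkg_lib)
```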

R libraries are very flexible. In the past, R users have set up libraries for specific projects or set up a system-wide library used across multiple projects. On multi-tenant servers, it has been common to have both a system library shared by all users and user-specific libraries.

A best practice is to set up per-project libraries alongside a package cache.
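One way to sketch a per-project library with base R alone is to create a directory inside the project and put it at the front of the library search path. The project location below is a temporary directory chosen for illustration. The package cache is not shown, since base R does not provide one. Tools such as renv add a shared cache on top of this idea.

```r
# Hypothetical project directory with its own library folder.
project_dir <- tempfile("my-project-")
project_lib <- file.path(project_dir, "library")
dir.create(project_lib, recursive = TRUE)

# Put the project library first so it is searched, and installed into,
# before any other library.
old_paths <- .libPaths()
.libPaths(c(project_lib, old_paths))
print(.libPaths()[1])

# install.packages() installs into the first library by default, so a call
# such as the following would now target the project library:
# install.packages("jsonlite")
```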