B Package Ecosystem
The R package ecosystem has three key components: packages, repositories, and libraries.
Packages are the primary extension mechanism for R. They can be used to share functions, datasets, and documentation. An R package can exist in a few states:
A source package is a top-level directory containing the directories and files that make up the package. Package authors work with source packages during development. Git and GitHub repositories store source packages.
A bundled package is a source package that has been compressed into a single file. By convention, package bundles in R use the extension .tar.gz.
A binary package is the result of building a source package for a specific operating system. Binary packages are single files that are ready for installation on their specific operating systems.
An installed package is a binary package that has been decompressed into a package library and is ready for use by R.
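The path from a source package to an installed package can be sketched in a few lines of R. This is a minimal illustration, not a recommended workflow: the package name demopkg, its single function hello(), the author details, and the temporary directories are all hypothetical, and the install step assumes R can run R CMD INSTALL on the local machine.

```r
# Source package: a top-level directory holding the package's files.
# "demopkg" and hello() are hypothetical names used only for illustration.
src <- file.path(tempdir(), "demopkg")
dir.create(file.path(src, "R"), recursive = TRUE, showWarnings = FALSE)
writeLines(c(
  "Package: demopkg",
  "Version: 0.1.0",
  "Title: Demonstration Package",
  "Description: A tiny package used to illustrate package states.",
  "Author: Example Author",
  "Maintainer: Example Author <author@example.com>",
  "License: GPL-3"
), file.path(src, "DESCRIPTION"))
writeLines("export(hello)", file.path(src, "NAMESPACE"))
writeLines('hello <- function() "hello from demopkg"',
           file.path(src, "R", "hello.R"))

# Installed package: install the source directory into a throwaway library.
lib <- file.path(tempdir(), "demo-library")
dir.create(lib, showWarnings = FALSE)
install.packages(src, repos = NULL, type = "source", lib = lib, quiet = TRUE)

# The installed package is now a directory inside the library,
# and R can load it from there.
installed <- dir.exists(file.path(lib, "demopkg"))
library(demopkg, lib.loc = lib)
greeting <- hello()
```

The bundled and binary states are skipped here; installing straight from the source directory is enough to show that an installed package is simply a directory inside a library.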
Repositories organize R packages for distribution to end users. Repositories
contain package bundles and binaries that are organized in a specific way so that
users can install packages from the repository using R’s install.packages
command. CRAN and Bioconductor are examples of R repositories.
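The "specific way" a repository is organized can be inspected with contrib.url() from the utils package, which computes the sub-directory where R expects to find packages of a given type. The sketch below assumes the CRAN cloud mirror as the repository URL; any other repository URL could be substituted.

```r
# Point R at a repository (here, the CRAN cloud mirror).
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Where R looks for source bundles (.tar.gz files) in that repository.
src_url <- contrib.url(getOption("repos"), type = "source")
print(src_url)

# Where R looks for Windows binaries; the path depends on the
# version of R that is running.
bin_url <- contrib.url(getOption("repos"), type = "win.binary")
print(bin_url)

# With the option set, install.packages("somepackage") would download
# from this repository (not run here because it needs network access).
```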
Many R package sources are stored in version-controlled directories. A popular
version control tool is Git. GitHub, a hosting service built on Git, houses many
package sources. The devtools R package includes convenience functions for
installing packages from the package source contained in a Git repository,
including GitHub. Used in this manner, Git repositories and GitHub are one way
to distribute R packages, but GitHub and Git repositories are not R package
repositories.
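As a rough sketch, installing from a GitHub-hosted source might look like the following. The wrapper install_from_github() is a hypothetical helper, not part of devtools; it assumes devtools is already installed and that the machine has network access. It is defined but not called here.

```r
# Hypothetical helper: install a package from its source on GitHub.
# `repo` is an "owner/repository" string supplied by the caller.
install_from_github <- function(repo) {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("the devtools package is required for this sketch")
  }
  # devtools fetches the source package and installs it locally.
  devtools::install_github(repo)
}
```

Because the source is fetched and installed directly, none of the repository structure that install.packages relies on is involved.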
End users of R typically interact with installed packages that live in libraries. Package libraries are just directories containing installed packages. When a package is requested by R, R searches the different library directories to find the installed package.
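The library search can be observed from an R session. The sketch below uses the stats package only because it ships with every R installation; the variable names are illustrative.

```r
# The library directories R searches, in order.
libs <- .libPaths()
print(libs)

# Locate an installed package by searching those libraries.
stats_dir <- find.package("stats", lib.loc = libs)
print(stats_dir)

# The installed package is a directory that sits inside one of the libraries.
in_library <- normalizePath(dirname(stats_dir)) %in% normalizePath(libs)
```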
R libraries are very flexible. In the past, R users have set up libraries for specific projects or set up a system-wide library used across multiple projects. On multi-tenant servers it has been common to have both a system library shared by all users and user-specific libraries.
A best practice is to set up per-project libraries alongside a package cache.
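A per-project library can be approximated by hand with .libPaths(). The following is only a sketch: the project directory is a hypothetical path under tempdir(), and no package cache is implemented.

```r
# Hypothetical project with its own library directory.
project <- file.path(tempdir(), "my-project")
project_lib <- file.path(project, "library")
dir.create(project_lib, recursive = TRUE, showWarnings = FALSE)

# Put the project library first so R searches it, and installs into it,
# before the user or system libraries.
old_paths <- .libPaths()
.libPaths(c(project_lib, old_paths))
first <- .libPaths()[1]
print(first)

# Restore the previous search path when leaving the project.
.libPaths(old_paths)
```

Placing the project library first keeps the shared libraries available as a fallback while isolating anything installed for the project.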