This CRAN task view lists packages for accessing different databases from R. It does not cover data import/export or data management.
As datasets grow larger, it becomes impractical to store them in traditional file formats such as spreadsheets and raw text files, which may not fit on devices with limited storage and are not easily shared among collaborators. Instead, people increasingly store data in databases for more scalable and reliable data management.
Database systems are often classified by the database models that they support. Relational databases became dominant in the 1980s. The data in relational databases is modeled as rows and columns in a series of tables, with SQL used to express the logic for writing and querying data. The tables are related to one another: for example, a user uses your software, and that software in turn has creators and contributors. Non-relational databases, often grouped under the label NoSQL, have become popular in recent years due to the huge demand for storing unstructured data. Users generally don't need to define the data schema up front, so when an application's requirements keep changing, non-relational databases can be much easier to use and manage.
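As a rough illustration of the relational model described above, the following sketch builds two related tables in an in-memory SQLite database and joins them with SQL. It assumes the DBI and RSQLite packages are installed; the table names, column names, and data are invented for the example.

```r
library(DBI)

# In-memory SQLite database; nothing is written to disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Two related tables (hypothetical data): each software row
# points at its creator through the creator_id column.
dbWriteTable(con, "users", data.frame(
  user_id = c(1L, 2L),
  name    = c("ann", "bob")
))
dbWriteTable(con, "software", data.frame(
  software_id = c(10L, 11L, 12L),
  title       = c("pkgA", "pkgB", "pkgC"),
  creator_id  = c(1L, 1L, 2L)
))

# SQL expresses the relationship as a join on the shared key.
created <- dbGetQuery(con, "
  SELECT u.name, COUNT(*) AS n_software
  FROM users AS u
  JOIN software AS s ON s.creator_id = u.user_id
  GROUP BY u.name
  ORDER BY u.name
")

dbDisconnect(con)
print(created)
```

The same DBI calls should work against other DBI-compliant back ends by swapping the driver passed to dbConnect, although connection arguments differ between drivers.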
The content presented in this Task View is undergoing rapid change in both industry and academia. Please send any suggestions to the task view maintainer, or submit a pull request or issue to the GitHub repository of this task view.
The ctv package supports these Task Views. Its functions install.views and update.views allow, respectively, installation or update of packages from a given Task View; the option coreOnly can restrict operations to packages labeled as core below.
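A minimal sketch of how these functions might be called is shown below. It assumes that this Task View is published under the name "Databases" and that a CRAN mirror is reachable; the calls are guarded so that nothing is installed in a non-interactive session.

```r
# Install the helper package once (needs network access to a CRAN mirror):
# install.packages("ctv")

task_view <- "Databases"  # assumed name of this Task View

if (requireNamespace("ctv", quietly = TRUE) && interactive()) {
  # Install only the packages labeled as core in the Task View.
  ctv::install.views(task_view, coreOnly = TRUE)

  # Later, bring the installed packages from the view up to date.
  ctv::update.views(task_view)
}
```

Leaving coreOnly at its default installs every package listed in the view, which can take a long time and may require system libraries for some database drivers.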
Suggestions and corrections by Achim Zeileis, Kirill Müller, Hannes Mühleisen, Rich FitzJohn, Dirk Eddelbuettel, and Hadley Wickham (as well as others I may have forgotten to add here) are gratefully acknowledged. Thanks to Dirk Eddelbuettel, who made the initial .ctv file and the Markdown conversion script available at the GitHub repository of the CRAN Task View for High Performance Computing. Last but not least, thanks to Achim Zeileis, who helped me get started on organizing this task view.
Relational Databases
This section includes packages that provide access to relational databases within R.
- The DBI package provides a database interface definition for communication between R and relational database management systems. It's worth noting that some packages try to follow this interface definition (DBI-compliant) but many existing packages don't.
- The RODBC package provides access to databases through an ODBC interface.
- The RMariaDB package provides a DBI-compliant interface to MariaDB and MySQL.
- The RMySQL package provides the interface to MySQL. Note that this is the legacy DBI interface to MySQL and MariaDB, based on old code ported from S-PLUS. A modern MySQL client based on Rcpp is available from the RMariaDB package listed above.
- Packages for PostgreSQL, an open-source relational database:
  - The RPostgreSQL package and the RPostgres package both provide fully DBI-compliant Rcpp-backed interfaces to PostgreSQL.
  - The rpostgis package provides the interface to its spatial extension PostGIS.
  - The RGreenplum package provides a fully DBI-compliant interface to Greenplum, an open-source parallel database on top of PostgreSQL.
- The ROracle package is a DBI-compliant Oracle database driver based on the OCI. The ora package provides convenience functions to query and browse a database through the ROracle connection.
- Packages for SQLite, a self-contained, high-reliability, embedded, full-featured, public-domain SQL database engine:
  - The RSQLite package embeds the SQLite database engine in R and provides an interface compliant with the DBI package.
  - The filehashSQLite package is a simple key-value database using SQLite as the backend.
  - The liteq package provides temporary and permanent message queues for R, built on top of SQLite.
- The bigrquery package provides the interface to Google BigQuery, Google's fully managed, petabyte-scale, low-cost analytics data warehouse.
- The RDruid package provides the interface to Apache Druid, a high-performance analytics data store for event-driven data.
- The RH2 package provides the interface to H2 Database Engine, the Java SQL database.
- The influxdbr package provides the interface to InfluxDB, a time series database designed to handle high write and query loads.
- The odbc package provides a DBI-compliant interface to drivers of Open Database Connectivity (ODBC), which is a low-level, high-performance interface that is designed specifically for relational data stores.
- The RPresto package implements a DBI-compliant interface to Presto, an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes, ranging from gigabytes to petabytes.
- The RJDBC package is an implementation of R's DBI interface using JDBC as a back-end. This allows R to connect to any DBMS that has a JDBC driver.
- The implyr package provides the back-end for Apache Impala, which enables low-latency SQL queries on data stored in the Hadoop Distributed File System (HDFS), Apache HBase, Apache Kudu, Amazon Simple Storage Service (S3), Microsoft Azure Data Lake Store (ADLS), and Dell EMC Isilon.
- The dbx package provides intuitive functions for high-performance batch operations and safe inserts/updates/deletes without writing SQL, on top of DBI. It is designed for both research and production environments and supports multiple database backends such as Postgres, MySQL, MariaDB, and SQLite.
- The sparklyr package provides a dplyr interface to Apache Spark DataFrames as well as an R interface to Spark's distributed machine learning pipelines.
- The RClickhouse package is a DBI interface for Yandex Clickhouse, which is a high-performance relational column-store database to enable big data exploration and scaling to petabytes of data. It provides basic dplyr support by auto-generating SQL commands using dbplyr.
Non-Relational Databases
This section includes packages that provide access to non-relational databases within R.
- Packages for Redis, an open-source, in-memory data structure store that can be used as a database, cache and message broker:
  - The RcppRedis package provides an interface to Redis using the hiredis library.
  - The redux package provides a low-level interface to Redis, allowing execution of arbitrary Redis commands with almost no interface, and a high-level generated interface to more than 200 Redis commands.
- Packages for Elasticsearch, an open-source, RESTful, distributed search and analytics engine:
  - The elastic package provides a general-purpose interface to Elasticsearch.
  - The uptasticsearch package is an Elasticsearch client tailored to data science workflows.
- The mongolite package provides a high-level, high-performance MongoDB client based on libmongoc, including support for aggregation, indexing, map-reduce, streaming, SSL encryption and SASL authentication.
- The R4CouchDB package provides a collection of functions for basic database and document management operations in CouchDB.
- The RCassandra package provides a direct interface (without the use of Java) to the most basic functionality of Apache Cassandra, such as login, updates and queries.
- The aws.dynamodb package provides access to Amazon DynamoDB.
- The rrocksdb package provides access to RocksDB.
Database Tools
This section includes packages that provide tools for working with and testing databases, manipulating database tables, and so on.
- The pool package enables the creation of object pools, which make it less computationally expensive to fetch a new object.
- The DBItest package is a helper that tests DBI back ends for conformity to the interface.
- The dbplyr package is a dplyr back-end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a DBI back-end; more advanced features require SQL translation to be provided by the package author.
- The sqldf package provides functionality to manipulate R data frames using SQL.
- The pointblank package provides tools to validate data tables in databases such as PostgreSQL and MySQL.
- The TScompare package provides utilities for comparing the equality of series on two databases.
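To make the dbplyr entry in the list above more concrete, here is a small sketch that treats a database table like a data frame. It assumes the dplyr, dbplyr, DBI, and RSQLite packages are installed and uses R's built-in mtcars dataset copied into an in-memory SQLite database purely for illustration.

```r
library(dplyr)  # dbplyr must also be installed; dplyr uses it for database tables

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "mtcars", mtcars)

# A lazy reference to the remote table; no rows are fetched yet.
cars_db <- tbl(con, "mtcars")

# Ordinary dplyr verbs are translated to SQL behind the scenes.
by_cyl <- cars_db %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  arrange(cyl)

show_query(by_cyl)               # print the generated SQL
local_result <- collect(by_cyl)  # run the query and pull the rows into R

DBI::dbDisconnect(con)
```

Because the pipeline stays lazy until collect is called, the aggregation runs inside the database and only the summarised rows are transferred to R.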
R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN (the Comprehensive R Archive Network).[1][2] The large number of packages available for R, and the ease of installing and using them, has been cited as a major factor in driving the widespread adoption of the language in data science.[3][4][5][6]
Compared to libraries in other programming languages, R packages must conform to a relatively strict specification.[3] The Writing R Extensions manual[7] specifies a standard directory structure for R source code, data, documentation, and package metadata, which enables them to be installed and loaded using R's in-built package management tools.[3] Packages distributed on CRAN must meet additional standards.[3][8] According to John Chambers, whilst these requirements 'impose considerable demands' on package developers, they improve the usability and long-term stability of packages for end users.[3]
Repositories
Comprehensive R Archive Network (CRAN)
The Comprehensive R Archive Network (CRAN) is R's central software repository, supported by the R Foundation.[9] It contains an archive of the latest and previous versions of the R distribution, documentation, and contributed R packages.[10] It includes both source packages and pre-compiled binaries for Windows and macOS.[11] As of November 2020, more than 16,000 packages are available.[12] CRAN was created by Kurt Hornik and Friedrich Leisch in 1997,[13][14] with the name paralleling other early packaging systems such as TeX's CTAN (released 1992) and Perl's CPAN (released 1995).[15] As of 2021, it is still maintained by Hornik and a team of volunteers.[9] The master site is located at the Vienna University of Economics and Business and is mirrored on servers around the world.[10]
The 'Task Views' page (subject list) on the CRAN website[16] lists a wide range of tasks (in fields such as Finance, Genetics, High Performance Computing, Machine Learning, Medical Imaging, Social Sciences and Spatial Statistics) for which R packages are available. Another way to browse CRAN packages is provided by Metacran,[17] which also maintains lists of featured, most downloaded, trending or most depended upon packages.
The number of CRAN packages has grown exponentially for many years,[18] and as of 2018 an average of 21 submissions of new or updated packages were made every day.[6] Since each submission is manually reviewed by a small team of CRAN maintainers, many of whom, according to R core developer Peter Dalgaard, are 'approaching pensionable age', there is a concern that this system is not sustainable in the long term.[6] The growth of CRAN has exposed limitations of its dependency management infrastructure, particularly the fact that it assumes that dependencies always refer to the latest version of a package, meaning that new releases of CRAN packages must always be backwards compatible,[19] and that CRAN packages cannot have dependencies that are not on CRAN.[20] It has also led to concerns about declining quality of packages.[21]
MRAN and RStudio Package Manager
The Microsoft R Application Network (MRAN) is a mirror of CRAN maintained by Microsoft which is based on the company's downstream distribution of R, Microsoft R Open (formerly Revolution R Open).[22] It also includes an archive of daily CRAN snapshots, branded as the 'CRAN Time Machine', which enables users of MRAN to bypass the dependency versioning limitations of CRAN by installing a fixed set of R package versions via the package checkpoint.[23][24]
RStudio Package Manager is a similar tool produced by RStudio, which in addition to CRAN snapshots includes an archive of R packages from Bioconductor and Python packages from the Python Package Index.[25] It also distributes pre-compiled binary packages for Linux (only Windows and macOS binaries are included on CRAN).[26]
Other repositories
The Bioconductor project provides R packages for the analysis of genomic data.
R itself ships with fourteen base packages: base, compiler, datasets, grDevices, graphics, grid, methods, parallel, splines, stats, stats4, tcltk, tools, and utils.[29]
In addition, there are fifteen 'recommended packages' from CRAN which are included with binary distributions of R: KernSmooth, MASS, Matrix, boot, class, cluster, codetools, foreign, lattice, mgcv, nlme, nnet, rpart, spatial, and survival.[29]
Other packages
A group of packages called the Tidyverse, which can be considered a 'dialect of the R language', is increasingly popular in the R ecosystem. As of 2020-06-13, Metacran[17] listed 7 of the 8 core packages of the Tidyverse in the list of most downloaded R packages. The group of packages strives to provide a cohesive collection of functions to deal with common data science tasks, including data import, cleaning, transformation and visualisation (notably with the ggplot2 package).
The R Infrastructure packages[30] support coding and the development of R packages and as of 2021-05-04, Metacran[17] lists 16 of these packages among the 25 most downloaded packages.
Other R packages include datasets.load, written by Bastiaan Quast, which adds graphical and command-line interfaces for loading datasets from installed packages.[31]
References
- [1] Hornik, Kurt (2020-02-20). 'Frequently Asked Questions on R'. The Comprehensive R Archive Network. 7.29: What is the difference between package and library?. Retrieved 2 November 2020.
- [2] Wickham, Hadley; Bryan, Jennifer. 'Introduction'. R Packages (2nd ed.).
- [3] Chambers, John M. (2020). 'S, R, and Data Science'. The R Journal. 12 (1): 462–476. ISSN 2073-4859.
- [4] Vance, Ashlee (2009-01-06). 'Data Analysts Captivated by R's Power'. New York Times.
- [5] Tippmann, Sylvia (2014-12-29). 'Programming tools: Adventures with R'. Nature News. 517 (7532): 109. doi:10.1038/517109a.
- [6] Thieme, Nick (2018). 'R generation'. Significance. 15 (4): 14–19. doi:10.1111/j.1740-9713.2018.01169.x. ISSN 1740-9713.
- [7] R Core Team. 'Writing R Extensions'. The Comprehensive R Archive Network. Retrieved 2020-11-02.
- [8] CRAN Repository Maintainers. 'CRAN Repository Policy'. The Comprehensive R Archive Network. Retrieved 2020-11-02.
- [9] CRAN Repository Maintainers. 'CRAN Repository Policy'. The Comprehensive R Archive Network. R Project. Retrieved 20 November 2020.
- [10] Hornik, Kurt (2020-02-20). 'Frequently Asked Questions on R'. The Comprehensive R Archive Network. 2.1: What is CRAN?. R Project. Retrieved 20 November 2020.
- [11] CRAN Repository Maintainers. 'The Comprehensive R Archive Network'. R Project. Retrieved 20 November 2020.
- [12] CRAN Repository Maintainers. 'CRAN - Contributed Packages'. The Comprehensive R Archive Network. CRAN. Retrieved 20 November 2020.
- [13] Hornik, Kurt (1997-04-23). 'ANNOUNCE: CRAN'. r-announce (Mailing list). Retrieved 20 November 2020.
- [14] Thieme, Nick (2018). 'R generation'. Significance. 15 (4): 14–19. doi:10.1111/j.1740-9713.2018.01169.x. ISSN 1740-9713.
- [15] Fitzgerald, Brian (2016-02-09). 'A Survey of Programming Language Package Systems'. Some Things Are Obvious. Retrieved 4 May 2021.
- [16] 'CRAN Task Views'. cran.r-project.org. Retrieved 2018-09-16.
- [17] 'Metacran'.
- [18] Asay, Matt (2016-04-21). 'Exponential growth of R's open source community threatens commercial competitors'. TechRepublic. Retrieved 2020-11-02.
- [19] Ooms, Jeroen (2013). 'Possible Directions for Improving Dependency Versioning in R'. The R Journal. 5 (1): 197–206. ISSN 2073-4859.
- [20] Decan, A.; Mens, T.; Claes, M.; Grosjean, P. (2016). 'When GitHub Meets CRAN: An Analysis of Inter-Repository Package Dependency Problems'. 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER). 1: 493–504. doi:10.1109/SANER.2016.12.
- [21] Hornik, Kurt (2012). 'Are There Too Many R Packages?'. Austrian Journal of Statistics. 41 (1): 59–66. doi:10.17713/ajs.v41i1.188. ISSN 1026-597X.
- [22] 'Welcome to MRAN'. Microsoft R Application Network. Microsoft. Retrieved 4 May 2021.
- [23] 'Reproducibility: Using Fixed CRAN Repository Snapshots'. Microsoft R Application Network. Microsoft. Retrieved 4 May 2021.
- [24] Smith, David (2019-05-22). 'MRAN snapshots, and you'. Revolutions. Revolution Analytics. Retrieved 4 May 2021.
- [25] Lopp, Sean (2020-12-07). 'RStudio Package Manager 1.2.0 - Bioconductor & PyPI'. RStudio Blog. RStudio. Retrieved 4 May 2021.
- [26] Lopp, Sean (2020-07-01). 'Announcing Public Package Manager and v1.1.6'. RStudio Blog. RStudio. Retrieved 4 May 2021.
- [27] Huber, W; Carey, VJ; Gentleman, R; Anders, S; Carlson, M; Carvalho, BS; Bravo, HC; Davis, S; Gatto, L; Girke, T; Gottardo, R; Hahne, F; Hansen, KD; Irizarry, RA; Lawrence, M; Love, MI; MacDonald, J; Obenchain, V; Oleś, AK; Pagès, H; Reyes, A; Shannon, P; Smyth, GK; Tenenbaum, D; Waldron, L; Morgan, M (2015). 'Orchestrating high-throughput genomic analysis with Bioconductor'. Nature Methods. Nature Publishing Group. 12 (2): 115–121. doi:10.1038/nmeth.3252. PMC 4509590. PMID 25633503.
- [28] 'R-Forge: Welcome'. Retrieved 2018-09-16.
- [29] Hornik, Kurt (2020-02-20). 'Frequently Asked Questions on R'. The Comprehensive R Archive Network. 5.1: Which add-on packages exist for R?. Retrieved 2 November 2020.
- [30] 'R infrastructure'.
- [31] Laux, Michael (2017-01-05). 'R Packages worth a look'. Data Analytics & R. Archived from the original on 2020-01-03. Retrieved 2020-05-01.
Thectv package supports these Task Views. Its functionsinstall.views andupdate.views allow, respectively, installation or update of packages from a given Task View; the optioncoreOnly can restrict operations to packages labeled as core below.
Suggestions and corrections by Achim Zeileis, Kirill Müller, Hannes Mühleisen, Rich FitzJohn, Dirk Eddelbuettel, and Hadley Wickham (as well as others I may have forgotten to add here) are gratefully acknowledged. Thanks to Dirk Eddelbuettel who made the initial.ctv file and the Markdown conversion script available at the Github repository of CRAN Task View for High Performance Computing here . Last but not least, thanks to Achim Zeileis who helped me get started on organizing this task view.
Relational Databases
This section includes packages that provides access to relational databases within R.
- TheDBI package provides a database interface definition for communication between R and relational database management systems. It's worth noting that some packages try to follow this interface definition (DBI-compliant) but many existing packages don't.
- TheRODBC package provides access to databases through an ODBC interface.
- TheRMariaDB package provides a DBI-compliant interface to MariaDB and MySQL .
- TheRMySQL package provides the interface to MySQL. Note that this is the legacy DBI interface to MySQL and MariaDB based on old code ported from S-PLUS. A modern MySQL client based on Rcpp is available from the RMariaDB package we listed above.
- Packages for PostgreSQL , an open-source relational database:
- TheRPostgreSQL package andRPostgres package both provide fully DBI-compliant Rcpp-backed interfaces to PostgreSQL.
- Therpostgis package provides the interface to its spatial extension PostGIS .
- TheRGreenplum provides a fully DBI-compliant interface to Greenplum , an open-source parallel database on top of PostgreSQL.
- TheROracle package is a DBI-compliant Oracle database driver based on the OCI. Theora package provides convenience functions to query and browse a database through theROracle connection.
- Packages for SQLite , a self-contained, high-reliability, embedded, full-featured, public-domain, SQL database engine:
- TheRSQLite package embeds the SQLite database engine in R and provides an interface compliant with the DBI package.
- ThefilehashSQLite package is a simple key-value database using SQLite as the backend.
- Theliteq package provides temporary and permanent message queues for R, built on top of SQLite.
- Thebigrquery package provides the interface to Google BigQuery , Google's fully managed, petabyte scale, low cost analytics data warehouse.
- The RDruid package provides the interface to Apache Druid , a high performance analytics data store for event-driven data.
- TheRH2 package provides the interface to H2 Database Engine , the Java SQL database.
- Theinfluxdbr package provides the interface to InfluxDB , a time series database designed to handle high write and query loads.
- Theodbc package provides a DBI-compliant interface to drivers of Open Database Connectivity (ODBC) , which is a low-level, high-performance interface that is designed specifically for relational data stores.
- TheRPresto package implements a DBI-compliant interface to Presto , an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.
- TheRJDBC package is an implementation of R's DBI interface using JDBC as a back-end. This allows R to connect to any DBMS that has a JDBC driver.
- Theimplyr package provides the back-end for Apache Impala , which enables low-latency SQL queries on data stored in the Hadoop Distributed File System (HDFS), Apache HBase, Apache Kudu, Amazon Simple Storage Service (S3), Microsoft Azure Data Lake Store (ADLS), and Dell EMC Isilon.
- Thedbx package provides intuitive functions for high performance batch operations and safe inserts/updates/deletes without writing SQL on top ofDBI. It is designed for both research and production environments and supports multiple database backends such as Postgres, MySQL, MariaDB, and SQLite.
- Thesparklyr package provides provides adplyr interface to Apache Spark DataFrames as well as an R interface to Spark's distributed machine learning pipelines.
- TheRClickhouse is aDBI interface for Yandex Clickhouse , which is a high-performance relational column-store database to enable big data exploration and scaling to petabytes of data. It provides basicdplyr support by auto-generating SQL-commands usingdbplyr.
Non-Relational Databases
R Cran Download
This section includes packages that provides access to non-relational databases within R.
- Packages for Redis , an open-source, in-memory data structure store that can be used as a database, cache and message broker:
- TheRcppRedis package provides interface to Redis using the hiredis library .
- Theredux package provides a low-level interface to Redis, allowing execution of arbitrary Redis commands with almost no interface, and a high-level generated interface to more than 200 redis commands.
- Packages for Elasticsearch , an open-source, RESTful, distributed search and analytics engine:
- Theelastic package provides a general purpose interface to Elasticsearch.
- Theuptasticsearch package is a Elasticsearch client tailored to data science workflows.
- Themongolite package provides a high-level, high-performance MongoDB client based on libmongoc , including support for aggregation, indexing, map-reduce, streaming, SSL encryption and SASL authentication.
- TheR4CouchDB package provides a collection of functions for basic database and document management operations in CouchDB .
- TheRCassandra package provides a direct interface (without the use of Java) to the most basic functionality of Apache Cassanda such as login, updates and queries.
- The aws.dynamodb package provides access to Amazon DynamoDB .
- The rrocksdb package provides access to RocksDB .
Databases Tools
This section includes packages that provides tools for working and testing with databases, databases table manipulations, etc.
- Thepool package enables the creation of object pools, which make it less computationally expensive to fetch a new object.
- TheDBItest package is a helper that tests DBI back ends for conformity to the interface.
- Thedbplyr package is adplyr back-end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features works with any database that has a DBI back-end; more advanced features require SQL translation to be provided by the package author.
- Thesqldf package provides functionalities to manipulate R Data Frames Using SQL.
- Thepointblank package provides tools to validate data tables in databases such as PostgreSQL and MySQL.
- TheTScompare package provides utilities for comparing the equality of series on two databases.
R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN (the Comprehensive R Archive Network).[1][2] The large number of packages available for R, and the ease of installing and using them, has been cited as a major factor in driving the widespread adoption of the language in data science.[3][4][5][6]
Compared to libraries in other programming language, R packages must conform to a relatively strict specification.[3] The Writing R Extensions manual[7] specifies a standard directory structure for R source code, data, documentation, and package metadata, which enables them to be installed and loaded using R's in-built package management tools.[3] Packages distributed on CRAN must meet additional standards.[3][8] According to John Chambers, whilst these requirements 'impose considerable demands' on package developers, they improve the usability and long-term stability of packages for end users.[3]
Repositories[edit]
Comprehensive R Archive Network (CRAN)[edit]
The Comprehensive R Archive Network (CRAN) is R's central software repository, supported by the R Foundation.[9] It contains an archive of the latest and previous versions of the R distribution, documentation, and contributed R packages.[10] It includes both source packages and pre-compiledbinaries for Windows and macOS.[11] As of November 2020, more than 16,000 packages are available.[12] CRAN was created by Kurt Hornik and Friedrich Leisch in 1997,[13][14] with the name paralleling other early packing systems such as TeX's CTAN (released 1992) and Perl's CPAN (released 1995).[15] As of 2021, it is still maintained by Hornik and a team of volunteers.[9] The master site is located at the Vienna University of Economics and Business and is mirrored on servers around the world.[10]
The 'Task Views' page (subject list) on the CRAN website[16] lists a wide range of tasks (in fields such as Finance, Genetics, High Performance Computing, Machine Learning, Medical Imaging, Social Sciences and Spatial Statistics) for which R packages are available. Another way to browse CRAN packages is provided by Metacran,[17] which also maintains lists of featured, most downloaded, trending or most depended upon packages.
The number of CRAN packages has grown exponentially for many years,[18] and as of 2018 an average of 21 submissions of new or updated packages were made every day.[6] Since each submission is manually reviewed by a small team of CRAN maintainers, many of whom, according to R core developer Peter Dalgaard, are 'approaching pensionable age', there is a concern that this system is not sustainable in the long term.[6] The growth of CRAN has exposed limitations of its dependency management infrastructure, particularly the fact that it assumes that dependencies always refer to the latest version of a package, meaning that new releases of CRAN packages must always be backwards compatible,[19] and that CRAN packages cannot have dependencies that are not on CRAN.[20] It has also led to concerns about declining quality of packages.[21]
MRAN and RStudio Package Manager[edit]
The Microsoft R Application Network (MRAN) is a mirror of CRAN maintained by Microsoft which is based on the company's downstream distribution of R, Microsoft R Open (formerly Revolution R Open).[22] It also includes an archive of daily CRAN snapshots, branded as the 'CRAN Time Machine', which enables users of MRAN to bypass the dependency versioning limitations of CRAN by installing a fixed set of R package versions via the package checkpoint.[23][24]
RStudio Package Manager is a similar tool produced by RStudio, which in addition to CRAN snapshots includes an archive of R packages from Bioconductor and Python packages from the Python Package Index.[25] It also distributes pre-compiled binary packages for Linux (only Windows and macOS binaries are included on CRAN).[26]
Other repositories[edit]
The Bioconductor project provides R packages for the analysis of genomic data. This includes object-oriented: base, compiler, datasets, grDevices, graphics, grid, methods, parallel, splines, stats, stats4, tcltk, tools, and utils.[29]
In addition, there are fifteen 'recommended packages' from CRAN which are included with binary distributions of R: KernSmooth, MASS, Matrix, boot, class, cluster, codetools, foreign, lattice, mgcv, nlme, nnet, rpart, spatial, and survival.[29]
Other packages[edit]
A group of packages called the Tidyverse, which can be considered a 'dialect of the R language', is increasingly popular in the R ecosystem. As of 2020-06-13, Metacran[17] listed 7 of the 8 core packages of the Tidyverse in the list of most download R packages. The group of packages strives to provide a cohesive collection of functions to deal with common data science tasks, including data import, cleaning, transformation and visualisation (notably with the ggplot2 package).
The R Infrastructure packages[30] support coding and the development of R packages and as of 2021-05-04, Metacran[17] lists 16 of these packages among the 25 most downloaded packages.
Other R packages include datasets.load, written by Bastiaan Quast, which adds graphical and command-line interfaces for loading datasets from installed packages.[31]
See also[edit]
References[edit]
1. Hornik, Kurt (2020-02-20). 'Frequently Asked Questions on R'. The Comprehensive R Archive Network. 7.29: What is the difference between package and library?. Retrieved 2 November 2020.
2. Wickham, Hadley; Bryan, Jennifer. 'Introduction'. R Packages (2nd ed.).
3. Chambers, John M. (2020). 'S, R, and Data Science'. The R Journal. 12 (1): 462–476. ISSN 2073-4859.
4. Vance, Ashlee (2009-01-06). 'Data Analysts Captivated by R's Power'. New York Times.
5. Tippmann, Sylvia (2014-12-29). 'Programming tools: Adventures with R'. Nature News. 517 (7532): 109. doi:10.1038/517109a.
6. Thieme, Nick (2018). 'R generation'. Significance. 15 (4): 14–19. doi:10.1111/j.1740-9713.2018.01169.x. ISSN 1740-9713.
7. R Core Team. 'Writing R Extensions'. The Comprehensive R Archive Network. Retrieved 2020-11-02.
8. CRAN Repository Maintainers. 'CRAN Repository Policy'. The Comprehensive R Archive Network. Retrieved 2020-11-02.
9. CRAN Repository Maintainers. 'CRAN Repository Policy'. The Comprehensive R Archive Network. R Project. Retrieved 20 November 2020.
10. Hornik, Kurt (2020-02-20). 'Frequently Asked Questions on R'. The Comprehensive R Archive Network. 2.1: What is CRAN?. R Project. Retrieved 20 November 2020.
11. CRAN Repository Maintainers. 'The Comprehensive R Archive Network'. R Project. Retrieved 20 November 2020.
12. CRAN Repository Maintainers. 'CRAN - Contributed Packages'. The Comprehensive R Archive Network. CRAN. Retrieved 20 November 2020.
13. Hornik, Kurt (1997-04-23). 'ANNOUNCE: CRAN'. r-announce (Mailing list). Retrieved 20 November 2020.
14. Thieme, Nick (2018). 'R generation'. Significance. 15 (4): 14–19. doi:10.1111/j.1740-9713.2018.01169.x. ISSN 1740-9713.
15. Fitzgerald, Brian (2016-02-09). 'A Survey of Programming Language Package Systems'. Some Things Are Obvious. Retrieved 4 May 2021.
16. 'CRAN Task Views'. cran.r-project.org. Retrieved 2018-09-16.
17. 'Metacran'.
18. Asay, Matt (2016-04-21). 'Exponential growth of R's open source community threatens commercial competitors'. TechRepublic. Retrieved 2020-11-02.
19. Ooms, Jeroen (2013). 'Possible Directions for Improving Dependency Versioning in R'. The R Journal. 5 (1): 197–206. ISSN 2073-4859.
20. Decan, A.; Mens, T.; Claes, M.; Grosjean, P. (2016). 'When GitHub Meets CRAN: An Analysis of Inter-Repository Package Dependency Problems'. 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER). 1: 493–504. doi:10.1109/SANER.2016.12.
21. Hornik, Kurt (2012). 'Are There Too Many R Packages?'. Austrian Journal of Statistics. 41 (1): 59–66. doi:10.17713/ajs.v41i1.188. ISSN 1026-597X.
22. 'Welcome to MRAN'. Microsoft R Application Network. Microsoft. Retrieved 4 May 2021.
23. 'Reproducibility: Using Fixed CRAN Repository Snapshots'. Microsoft R Application Network. Microsoft. Retrieved 4 May 2021.
24. Smith, David (2019-05-22). 'MRAN snapshots, and you'. Revolutions. Revolution Analytics. Retrieved 4 May 2021.
25. Lopp, Sean (2020-12-07). 'RStudio Package Manager 1.2.0 - Bioconductor & PyPI'. RStudio Blog. RStudio. Retrieved 4 May 2021.
26. Lopp, Sean (2020-07-01). 'Announcing Public Package Manager and v1.1.6'. RStudio Blog. RStudio. Retrieved 4 May 2021.
27. Huber, W; Carey, VJ; Gentleman, R; Anders, S; Carlson, M; Carvalho, BS; Bravo, HC; Davis, S; Gatto, L; Girke, T; Gottardo, R; Hahne, F; Hansen, KD; Irizarry, RA; Lawrence, M; Love, MI; MacDonald, J; Obenchain, V; Oleś, AK; Pagès, H; Reyes, A; Shannon, P; Smyth, GK; Tenenbaum, D; Waldron, L; Morgan, M (2015). 'Orchestrating high-throughput genomic analysis with Bioconductor'. Nature Methods. Nature Publishing Group. 12 (2): 115–121. doi:10.1038/nmeth.3252. PMC 4509590. PMID 25633503.
28. 'R-Forge: Welcome'. Retrieved 2018-09-16.
29. Hornik, Kurt (2020-02-20). 'Frequently Asked Questions on R'. The Comprehensive R Archive Network. 5.1: Which add-on packages exist for R?. Retrieved 2 November 2020.
30. 'R infrastructure'.
31. Laux, Michael (2017-01-05). 'R Packages worth a look'. Data Analytics & R. Archived from the original on 2020-01-03. Retrieved 2020-05-01.[self-published source]
Further reading
- Claes, M.; Mens, T.; Grosjean, P. (2014). 'On the maintainability of CRAN packages'. 2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE): 308–312. doi:10.1109/CSMR-WCRE.2014.6747183.
- Decan, Alexandre; Mens, Tom; Claes, Maelick; Grosjean, Philippe (2015-09-07). 'On the Development and Distribution of R Packages: An Empirical Analysis of the R Ecosystem'. Proceedings of the 2015 European Conference on Software Architecture Workshops. ECSAW '15. Dubrovnik, Cavtat, Croatia: Association for Computing Machinery: 1–6. doi:10.1145/2797433.2797476. ISBN 978-1-4503-3393-1.
- Fox, John (2009). 'Aspects of the Social Organization and Trajectory of the R Project'. The R Journal. 1 (2): 5–13. ISSN 2073-4859.
- Fox, John; Leanage, Allison (12 September 2016). 'R and the Journal of Statistical Software'. Journal of Statistical Software. 73 (1): 1–13. doi:10.18637/jss.v073.i02. ISSN 1548-7660.
- Plakidas, Konstantinos; Schall, Daniel; Zdun, Uwe (2017). 'Evolution of the R software ecosystem: Metrics, relationships, and their impact on qualities'. Journal of Systems and Software. 132: 119–146. doi:10.1016/j.jss.2017.06.095. ISSN 0164-1212.
External links
- METACRAN, a directory of R packages
- CRAN Task Views, listing of CRAN packages by topics