Does CRAN (or any of its relatives) have an API?

Time:03-23

I am interested in retrieving machine-readable meta information about R packages.

For example, when I go to CRAN I can see a short description about the package, before I download it: https://cran.r-project.org/web/packages/MASS/

I could not find a way to get anything other than HTML from the CRAN server. I would like to avoid parsing HTML and instead retrieve meta information about packages in a more convenient format (e.g., JSON).

I saw that each R package (at least to my knowledge) has a yaml-like (?) description text inside its source code package (the file is called DESCRIPTION). However, so far I could only find this kind of description inside tar archives, which means that I would have to download the package before I can access its description.

Here is an example of the DESCRIPTION file from the MASS package:

Package: MASS
Priority: recommended
Version: 7.3-55
Date: 2022-01-12
Revision: $Rev: 3559 $
Depends: R (>= 3.3.0), grDevices, graphics, stats, utils
Imports: methods
Suggests: lattice, nlme, nnet, survival
Authors@R: c(person("Brian", "Ripley", role = c("aut", "cre", "cph"),
                    email = "[email protected]"),
         person("Bill", "Venables", role = "ctb"),
         person(c("Douglas", "M."), "Bates", role = "ctb"),
         person("Kurt", "Hornik", role = "trl",
                     comment = "partial port ca 1998"),
         person("Albrecht", "Gebhardt", role = "trl",
                     comment = "partial port ca 1998"),
         person("David", "Firth", role = "ctb"))
Description: Functions and datasets to support Venables and Ripley,
  "Modern Applied Statistics with S" (4th edition, 2002).
Title: Support Functions and Datasets for Venables and Ripley's MASS
LazyData: yes
ByteCompile: yes
License: GPL-2 | GPL-3
URL: http://www.stats.ox.ac.uk/pub/MASS4/
Contact: <[email protected]>
NeedsCompilation: yes
Packaged: 2022-01-13 05:06:37 UTC; ripley
Author: Brian Ripley [aut, cre, cph],
  Bill Venables [ctb],
  Douglas M. Bates [ctb],
  Kurt Hornik [trl] (partial port ca 1998),
  Albrecht Gebhardt [trl] (partial port ca 1998),
  David Firth [ctb]
Maintainer: Brian Ripley <[email protected]>
Repository: CRAN
Date/Publication: 2022-01-13 08:05:04 UTC

Any suggestions on how to get that directly in a convenient, machine-readable form?

I tried to look it up, but search engines have not turned up anything useful so far.

Edit / Clarification: I am looking for a solution that does not rely on R: a web API that can be used from any framework or language to retrieve the metadata.

CodePudding user response:

Does tools::CRAN_package_db() have all the information you want? (See here for some discussion)

> dd <- tools::CRAN_package_db()
> names(dd)
 [1] "Package"                 "Version"                
 [3] "Priority"                "Depends"                
 [5] "Imports"                 "LinkingTo"              
 [7] "Suggests"                "Enhances"               
 [9] "License"                 "License_is_FOSS"        
[11] "License_restricts_use"   "OS_type"                
[13] "Archs"                   "MD5sum"                 
[15] "NeedsCompilation"        "Additional_repositories"
[17] "Author"                  "Authors@R"              
[19] "Biarch"                  "BugReports"             
[21] "BuildKeepEmpty"          "BuildManual"            
[23] "BuildResaveData"         "BuildVignettes"         
[25] "Built"                   "ByteCompile"            
[27] "Classification/ACM"      "Classification/ACM-2012"
[29] "Classification/JEL"      "Classification/MSC"     
[31] "Classification/MSC-2010" "Collate"                
[33] "Collate.unix"            "Collate.windows"        
[35] "Contact"                 "Copyright"              
[37] "Date"                    "Description"            
[39] "Encoding"                "KeepSource"             
[41] "Language"                "LazyData"               
[43] "LazyDataCompression"     "LazyLoad"               
[45] "MailingList"             "Maintainer"             
[47] "Note"                    "Packaged"               
[49] "RdMacros"                "StagedInstall"          
[51] "SysDataCompression"      "SystemRequirements"     
[53] "Title"                   "Type"                   
[55] "URL"                     "UseLTO"                 
[57] "VignetteBuilder"         "ZipData"                
[59] "Published"               "Path"                   
[61] "X-CRAN-Comment"          "Reverse depends"        
[63] "Reverse imports"         "Reverse linking to"     
[65] "Reverse suggests"        "Reverse enhances"       

CodePudding user response:

One option is the METACRAN API, available at https://crandb.r-pkg.org/, which serves package metadata as JSON.
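
As a rough sketch of how this could be queried without R, the Python snippet below builds a request URL and decodes the JSON reply using only the standard library. It assumes that `https://crandb.r-pkg.org/<package>` returns a JSON object for the latest release and that its keys mirror DESCRIPTION fields such as `Version` and `Title`; check the API's own documentation before relying on that. The helper names `crandb_url`, `fetch_package` and `summarize` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

CRANDB_BASE = "https://crandb.r-pkg.org"  # base URL of the METACRAN API


def crandb_url(package, base=CRANDB_BASE):
    """Build the URL for one package (assumed pattern: <base>/<package>)."""
    return base.rstrip("/") + "/" + urllib.parse.quote(package, safe="")


def fetch_package(package, timeout=10.0):
    """Download and decode the JSON record for a package (needs network access)."""
    with urllib.request.urlopen(crandb_url(package), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(record, fields=("Package", "Version", "Title")):
    """Pick a few fields from a decoded record; keys that are missing are skipped."""
    return {key: record[key] for key in fields if key in record}
```

With network access, `summarize(fetch_package("MASS"))` should return a small dict with whichever of the requested fields the service provides.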

CodePudding user response:

You can download https://cloud.r-project.org/src/contrib/PACKAGES.gz (or the uncompressed https://cloud.r-project.org/src/contrib/PACKAGES). It describes all currently available packages in DCF format (the same format DESCRIPTION files use), with some of the fields from DESCRIPTION plus a few others.

You don't need to use cloud.r-project.org; any of the CRAN mirrors will do.
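
Since DCF is a simple text format, it can be read without R. The following is a minimal Python sketch, not a complete DCF implementation: it assumes records are separated by blank lines, that each field starts with `Name:` in the first column, and that lines beginning with whitespace continue the previous field. The function names `parse_dcf` and `parse_packages_gz` are hypothetical.

```python
import gzip


def parse_dcf(text):
    """Parse DCF text into a list of dicts, one dict per record."""
    records = []
    current = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record.
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line[0] in " \t":
            # Indented line: continuation of the previous field's value.
            if last_key is not None:
                current[last_key] = (current[last_key] + " " + line.strip()).strip()
        else:
            # "Name: value"; split on the first colon only.
            key, _, value = line.partition(":")
            last_key = key.strip()
            current[last_key] = value.strip()
    if current:
        records.append(current)
    return records


def parse_packages_gz(data):
    """Decompress the bytes of a PACKAGES.gz file and parse them as DCF."""
    text = gzip.decompress(data).decode("utf-8", errors="replace")
    return parse_dcf(text)
```

Passing the downloaded bytes of PACKAGES.gz to `parse_packages_gz` gives one dict per package, which can then be serialized to JSON or indexed by the `Package` field.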
