dataone Package Overview

dataone R Package Overview

The dataone R package enables R scripts to search, download and upload science data and metadata to the DataONE Federation. This package calls DataONE web services that allow client programs to interact with Member Nodes (MN) and DataONE Coordinating Nodes (CN).

Quick Start

See the full manual (help dataone) for documentation.

To search the DataONE Federation Member Node Knowledge Network for Biocomplexity (KNB) for a dataset:

library(dataone)
cn <- CNode("PROD")
mn <- getMNode(cn, "urn:node:KNB")
mySearchTerms <- list(q="abstract:salmon+AND+keywords:acoustics+AND+keywords:\"Oncorhynchus nerka\"",
                      fl="id,title,dateUploaded,abstract,size",
                      fq="dateUploaded:[2013-01-01T00:00:00.000Z TO 2014-01-01T00:00:00.000Z]",
                      sort="dateUploaded+desc")
result <- query(mn, solrQuery=mySearchTerms, as="data.frame")
result[1,c("id", "title")]
pid <- result[1,'id']

The metadata file located in the above search can be downloaded with the commands:

library(XML)
metadata <- rawToChar(getObject(mn, pid))

The metadata file that describes the located research can be viewed in an XML viewer or text editor, once it is written to a disk file. This file details a data file (CSV) that can be obtained using the listed identifier, using the commands:

dataRaw <- getObject(mn, "df35d.443.1")
dataChar <- rawToChar(dataRaw)
theData <- textConnection(dataChar)
df <- read.csv(theData, stringsAsFactors=FALSE)
df[1,]

Uploading a CSV file to a DataONE Member Node requires user authentication. DataONE user authentication is described in the vignette DataONE-Federation.

Once the authentication steps have been followed, uploading is done with:

library(datapack)
library(uuid)
d1c <- D1Client("STAGING", "urn:node:mnStageUCSB2")
id <- paste("urn:uuid:", UUIDgenerate(), sep="")
testdf <- data.frame(x=1:10,y=11:20)
csvfile <- paste(tempfile(), ".csv", sep="")
write.csv(testdf, csvfile, row.names=FALSE)
# Build a DataObject containing the csv, and upload it to the Member Node
d1Object <- new("DataObject", id, format="text/csv", filename=csvfile)
uploadDataObject(d1c, d1Object, public=TRUE)

Additional Resources

The dataone R package vignettes can be viewed using the R vignette command, for example vignette("dataone-overview").

The dataone vignettes describe these topics:

  • Searching for data in DataONE described in vignette searching-dataone.

  • Downloading data from DataONE described in vignette download-data

  • Uploading data to DataONE is described in vignettes upload-data and update-package.

  • The DataONE Federation is described in the vignettes DataONE-Federation.

Acknowledgements

Work on this package was supported by:

  • NSF-ABI grant #1262458 to C. Gries, M. Jones, and S. Collins.
  • NSF-DATANET grants #0830944 and #1430508 to W. Michener, M. Jones, D. Vieglais, S. Allard and P. Cruse

Additional support was provided for working group collaboration by the National Center for Ecological Analysis and Synthesis, a Center funded by the University of California, Santa Barbara, and the State of California.