Title: | Utilities for the Arctic Data Center |
---|---|
Description: | A set of utilities for working with the Arctic Data Center (https://arcticdata.io). |
Authors: | Bryce Mecum [aut, cre], Matt Jones [ctb], Jesse Goldstein [ctb] (Maintainer), Jeanette Clark [ctb] (Maintainer), Dominic Mullen [ctb], Emily O'Dean [ctb], Robyn Thiessen-Bock [ctb], Derek Strong [ctb], Rachel Sun [ctb], Jasmine Lai [ctb] |
Maintainer: | Bryce Mecum <[email protected]> |
License: | Apache License (== 2.0) |
Version: | 0.7.0 |
Built: | 2024-11-06 05:30:03 UTC |
Source: | https://github.com/NCEAS/arcticdatautils |
This package contains code for doing lots of useful stuff that's too specific for the dataone package, primarily functions that streamline Arctic Data Center operations.
Leave style=NA if you want to use the default ISO-to-EML stylesheet.
convert_iso_to_eml(path, style = NA)
path |
(character) Path to the file to convert. |
style |
(xslt) The XSLT object to be used for transformation. |
(character) Location of the converted file.
## Not run:
iso_path <- "~/Documents/ISO_metadata.xml"
eml_path <- convert_iso_to_eml(iso_path)
## End(Not run)
Create a test data.frame of attributes.
create_dummy_attributes_dataframe(numberAttributes, factors = NULL)
numberAttributes |
(integer) Number of attributes to be created in the table. |
factors |
(character) Optional vector of factor names to include. |
(data.frame) A data.frame of attributes.
## Not run:
# Create dummy attribute dataframe with 6 attributes and 2 factors
attributes <- create_dummy_attributes_dataframe(6, c("Factor1", "Factor2"))
## End(Not run)
Create a test data.frame of enumeratedDomains.
create_dummy_enumeratedDomain_dataframe(factors)
factors |
(character) Vector of factor names to include. |
(data.frame) A data.frame of factors.
## Not run:
# Create dummy dataframe of 2 factors/enumerated domains
attributes <- create_dummy_enumeratedDomain_dataframe(c("Factor1", "Factor2"))
## End(Not run)
Create a test EML metadata object.
create_dummy_metadata(mn, data_pids = NULL)
mn |
(MNode) The Member Node. |
data_pids |
(character) Optional. PIDs for data objects the metadata documents. |
(character) The PID of the published metadata document.
## Not run:
# Set environment
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pid <- create_dummy_metadata(mn)
## End(Not run)
Create a test data object. Make sure the member node you use is not a production node.
create_dummy_object(mn)
mn |
(MNode) The Member Node. |
(character) The PID of the dummy object.
## Not run:
# Set environment
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pid <- create_dummy_object(mn)
## End(Not run)
Create a full test data package containing data objects and one metadata object. size is the total number of objects in the package: the number of data objects you want in the dummy package plus 1 for the metadata object.
create_dummy_package(mn, size = 2)
mn |
(MNode) The Member Node. |
size |
(numeric) The number of files in the package, including the metadata file. |
(list) The PIDs for all elements in the data package.
## Not run:
# Set environment
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
# Create dummy package with 5 data objects and 1 metadata object
pids <- create_dummy_package(mn, 6)
## End(Not run)
Creates a more complete package than create_dummy_package()
but is otherwise based on the same concept. This dummy
package includes multiple data objects, responsible parties,
geographic locations, method steps, etc.
create_dummy_package_full(mn, title = "A Dummy Package")
mn |
(MNode) The Member Node. |
title |
(character) Optional. Title of package. Defaults to "A Dummy Package". |
(list) The PIDs for all elements in the data package.
Create a test parent data package. Make sure the node is not a production node.
create_dummy_parent_package(mn, children)
mn |
(MNode) The Member Node. |
children |
(character) Child package (resource maps) PIDs. |
(list) The resource map PIDs for both the parent and child packages.
## Not run:
# Set environment
## End(Not run)
This function first generates a new resource map RDF/XML document locally and then uses the dataone::createObject() function to create the object on the specified MN.
create_resource_map( mn, metadata_pid, data_pids = NULL, child_pids = NULL, check_first = TRUE, ... )
mn |
(MNode) The Member Node |
metadata_pid |
(character) The PID of the metadata object to go in the package. |
data_pids |
(character) The PID(s) of the data objects to go in the package. |
child_pids |
(character) The resource map PIDs of the packages to be nested under the package. |
check_first |
(logical) Optional. Whether to check the PIDs passed in as
arguments exist on the MN before continuing. This speeds up the function,
especially when |
... |
Additional arguments that can be passed into |
If you only want to generate resource map RDF/XML, see generate_resource_map().
(character) The PID of the created resource map.
## Not run:
cn <- CNode('STAGING2')
mn <- getMNode(cn, "urn:node:mnTestKNB")
meta_pid <- 'urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe'
dat_pid <- c('urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1',
             'urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe')
create_resource_map(mn, metadata_pid = meta_pid, data_pids = dat_pid)
## End(Not run)
Creates a formatted EML annotation for a discipline from the ADCAD ontology, reducing the amount of copy-pasting needed.
eml_adcad_annotation(valueLabel)
valueLabel |
(character) One of the disciplines found in ADCAD |
list - a formatted EML annotation
eml_adcad_annotation("Ecology")
Adds a landing page URL to the dataset, and corrects the metadata identifier by replacing the existing identifier with that which is passed. Note that this function constructs landing page URLs for the Arctic Data Center only and will not work correctly on other repositories.
eml_add_distribution(doc, identifier)
doc |
(emld) An EML document |
identifier |
(character) A pre-issued, unassigned identifier (as from |
doc (emld) An EML document with distribution added
## Not run:
library(EML)
d1c <- dataone::D1Client("STAGING", "mnTestARCTIC")
# read in any EML document
doc <- read_eml(system.file("extdata/strix-pacific-northwest.xml", package="dataone"))
# generate a doi
id <- generateIdentifier(d1c@mn, "doi")
doc <- eml_add_distribution(doc, id)
## End(Not run)
This function adds system information to entities in a document
eml_add_entity_system(doc)
doc |
(emld) An EML document |
(emld) An EML document
## Not run:
# Add system information to the entities in an existing document
doc <- eml_add_entity_system(doc)
## End(Not run)
This function adds Arctic Data Center publisher information to an EML document
eml_add_publisher(doc)
doc |
(emld) An EML document |
(emld) An EML document
## Not run:
# Add publisher information to an existing document
doc <- eml_add_publisher(doc)
## End(Not run)
Creates an annotation from the Arctic Report Card ontology and inserts the annotation into the EML document doc while retaining any existing annotations such as the sensitivity annotations or dataset categorization. For a list of available essay topics or key variables, see the Arctic Report Card ontology.
eml_arcrc_add_annotation(doc, property, label)
doc |
(emld) An EML document |
property |
(character) One of two properties: "isAbout" for key variables or "influenced" for essay topics |
label |
(character) One or more labels in title case from the ARCRC ontology. |
doc (emld) An EML document with annotation added
library(EML)
# read in any EML document
doc <- read_eml(system.file("extdata/strix-pacific-northwest.xml", package="dataone"))
# add the key variable annotations
doc <- eml_arcrc_add_annotation(doc, "isAbout", c("sea ice thickness", "sea surface temperature"))
Creates a formatted EML annotation for an essay topic from the ARCRC ontology, reducing the amount of copy-pasting needed.
eml_arcrc_essay_annotation(valueLabel)
valueLabel |
(character) One of the essay topics found in ARCRC |
list - a formatted EML annotation
eml_arcrc_essay_annotation("Sea Ice Indicator")
Creates a formatted EML annotation for a key variable from the ARCRC ontology, reducing the amount of copy-pasting needed.
eml_arcrc_key_variable_annotation(valueLabel)
valueLabel |
(character) One of the key variables found in ARCRC |
list - a formatted EML annotation
eml_arcrc_key_variable_annotation("age of sea ice")
See eml_party() for details.
eml_associated_party(...)
... |
Arguments passed on to |
(associatedParty) The new associatedParty.
## Not run:
eml_associated_party("test", "user", email = "[email protected]", role = "Principal Investigator")
## End(Not run)
Creates an annotation from the ADC Academic Disciplines ontology and inserts the annotation into the EML document doc while retaining any existing annotations such as the sensitivity annotations. For a list of available disciplines, see the ADC Academic Disciplines ontology.
eml_categorize_dataset(doc, discipline)
doc |
(emld) An EML document |
discipline |
(character) One or more disciplines in title case from the ADCAD ontology. |
doc (emld) An EML document with annotation added
library(EML)
# read in any EML document
doc <- read_eml(system.file("extdata/strix-pacific-northwest.xml", package="dataone"))
# add the dataset categories
doc <- eml_categorize_dataset(doc, c("Soil Science", "Ecology"))
See eml_party() for details.
eml_contact(...)
... |
Arguments passed on to |
Please use the constructors in the EML package instead
(contact) The new contact.
## Not run:
eml_contact("test", "user", email = "[email protected]")
eml_creator("creator", "Bryce", "Mecum", userId = "https://orcid.org/0000-0002-0381-3766")
eml_creator("creator", c("Dominic", "'Dom'"), "Mullen", c("NCEAS", "UCSB"),
            c("Data Scientist", "Programmer"))
## End(Not run)
eml_creator(...)
... |
Arguments passed on to |
Please use the constructors in the EML package instead
(creator) The new creator.
## Not run:
eml_creator("test", "user", email = "[email protected]")
eml_creator("creator", "Bryce", "Mecum", userId = "https://orcid.org/0000-0002-0381-3766")
eml_creator("creator", c("Dominic", "'Dom'"), "Mullen", c("NCEAS", "UCSB"),
            c("Data Scientist", "Programmer"))
## End(Not run)
Creates a formatted EML annotation for a label from the ECSO ontology, reducing the amount of copy-pasting needed.
eml_ecso_annotation(valueLabel)
valueLabel |
(character) the label for the annotation found in ECSO |
list - a formatted EML annotation
eml_ecso_annotation("latitude coordinate")
This function populates a spatialRaster element with the required elements by reading in a local raster file. The coord_name argument can be found by examining the data.frame that get_coord_list() returns against the proj4string of the raster file.
eml_get_raster_metadata(path, coord_name = NULL, attributes)
path |
(char) Path to a raster file |
coord_name |
(char) horizCoordSysDef name |
attributes |
(dataTable) attributes for raster |
This function is a convenience wrapper around EML::eml_get() which returns the output as a simple list, as opposed to an object of type emld, by removing the attributes and context from the object. If an element containing children is returned, all of its children will be flattened into a named character vector. This function is best used to extract values from elements that have no children.
eml_get_simple(doc, element)
doc |
(list) An EML object or child/descendant object |
element |
(character) Name of the element to be extracted. If multiple occurrences are found, will extract all. |
out (vector) A list of values contained in the given element.
## Not run:
cn <- dataone::CNode('PROD')
adc <- dataone::getMNode(cn, 'urn:node:ARCTIC')
doc <- EML::read_eml(dataone::getObject(adc, 'doi:10.18739/A2S17SS1M'))
datatable_names <- eml_get_simple(doc$dataset$dataTable, element = "entityName")
## End(Not run)
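The flattening of children into a named character vector can be pictured with base R alone. The following is a minimal sketch, not the package's implementation: it assumes the flattening resembles what unlist() does to a nested list, and table_entry is a made-up stand-in for an EML element.

```r
# Hypothetical stand-in for an EML element that has children
table_entry <- list(
  entityName = "sites.csv",
  physical = list(objectName = "sites.csv", size = "1024")
)

# An element without children yields a plain value
entity_name <- table_entry$entityName

# An element with children collapses into a named character vector
flat <- unlist(table_entry$physical)
print(flat)
```

This is why the function is most convenient for leaf elements such as entityName: the result needs no further unpacking.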
This function takes a list of NSF award numbers and uses it to query the NSF API to get the award title, PIs, and coPIs. The return value is an EML project section. The function supports one or more award numbers.
eml_nsf_to_project(awards, eml_version = "2.2")
awards |
(list) A list of NSF award numbers as characters |
eml_version |
(char) EML version to use (2.1.1 or 2.2.0) |
project (emld) An EML project section
awards <- c("1203146", "1203473", "1603116")
proj <- eml_nsf_to_project(awards, eml_version = "2.1.1")
me <- list(individualName = list(givenName = "Jeanette", surName = "Clark"))
doc <- list(packageId = "id", system = "system",
            dataset = list(title = "A Minimal Valid EML Dataset", creator = me, contact = me))
doc$dataset$project <- proj
EML::eml_validate(doc)
Convert an EML 'otherEntity' object to a 'dataTable' object. This will convert an otherEntity object as currently constructed - it does not add a physical or add attributes. However, if these are already in their respective slots, they will be retained.
eml_otherEntity_to_dataTable(doc, index, validate_eml = TRUE)
doc |
(list) An EML document. |
index |
(integer) The indices of the otherEntities to be transformed. |
validate_eml |
(logical) Optional. Whether or not to validate the EML after
completion. Setting this to |
Dominic Mullen [email protected]
## Not run:
doc <- read_eml(system.file("example-eml.xml", package = "arcticdatautils"))
doc <- eml_otherEntity_to_dataTable(doc, 1)
## End(Not run)
eml_party( type = "associatedParty", given_names = NULL, sur_name = NULL, organization = NULL, position = NULL, email = NULL, phone = NULL, address = NULL, userId = NULL, role = NULL )
type |
(character) The type of party (e.g. 'contact'). |
given_names |
(character) The party's given name(s). |
sur_name |
(character) The party's surname. |
organization |
(character) The party's organization name. |
position |
(character) The party's position. |
email |
(character) The party's email address(es). |
phone |
(character) The party's phone number(s). |
address |
(character) The party's address(es) as a valid EML address |
userId |
(character) The party's ORCID, in format https://orcid.org/WWWW-XXXX-YYYY-ZZZZ. |
role |
(character) The party's role. |
Please use the constructors in the EML package instead
You will usually want to use the high-level functions such as eml_creator() and eml_contact() but using this is fine.
The userId argument assumes an ORCID so be sure to adjust for that.
(party) An instance of the party specified by the type argument.
## Not run:
eml_party("creator", "Test", "User")
eml_party("creator", "Bryce", "Mecum", userId = "https://orcid.org/0000-0002-0381-3766")
eml_party("creator",
          given_names = list("Dominic", "'Dom'"),
          sur_name = "Mullen",
          list("NCEAS", "UCSB"),
          position = list("Data Scientist", "Programmer"),
          address = eml$address(deliveryPoint = "735 State St",
                                city = "Santa Barbara",
                                administrativeArea = "CA",
                                postalCode = "85719"))
## End(Not run)
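Because the userId argument is expected in the form https://orcid.org/WWWW-XXXX-YYYY-ZZZZ, it can help to check a value before passing it in. The helper below is a hypothetical sketch in base R and is not part of arcticdatautils; it only tests the shape of the string (the final character of an ORCID iD may be the check character X) and does not verify that the ORCID exists.

```r
# Hypothetical helper (not part of the package): TRUE when x has the shape
# https://orcid.org/WWWW-XXXX-YYYY-ZZZZ
is_orcid_url <- function(x) {
  grepl("^https://orcid\\.org/[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$", x)
}

is_orcid_url("https://orcid.org/0000-0002-0381-3766")  # TRUE
is_orcid_url("0000-0002-0381-3766")                    # FALSE: bare identifier, no URL prefix
```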
eml_set_reference(element_to_reference, element_to_replace)
element_to_reference |
(list) An EML element to reference. |
element_to_replace |
(list) An EML element to replace with a reference. |
Please add references directly instead.
This function creates a new object with the same class as element_to_replace using a reference to element_to_reference.
Dominic Mullen [email protected]
## Not run:
cn <- dataone::CNode('PROD')
adc <- dataone::getMNode(cn, 'urn:node:ARCTIC')
doc <- EML::read_eml(dataone::getObject(adc, 'doi:10.18739/A2S17SS1M'))
# Set the first contact as a reference to the first creator
doc$dataset$contact[[1]] <- eml_set_reference(doc$dataset$creator[[1]],
                                              doc$dataset$contact[[1]])
# This is also useful when we want to set references to a subset of
# 'dataTable' or 'otherEntity' objects
# Add a few more objects first to illustrate the use:
doc$dataset$dataTable[[3]] <- doc$dataset$dataTable[[1]]
doc$dataset$dataTable[[4]] <- doc$dataset$dataTable[[1]]
# Add references to the second and third elements only (not the 4th):
for (i in 2:3) {
  doc$dataset$dataTable[[i]]$attributeList <- eml_set_reference(
    doc$dataset$dataTable[[1]]$attributeList,
    doc$dataset$dataTable[[i]]$attributeList)
}
# If we print the entire 'dataTable' list we see elements 2 and 3 have
# references while 4 does not.
doc$dataset$dataTable
## End(Not run)
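Since the note above recommends adding references directly, here is a rough sketch of what that can look like on plain lists. It assumes that a reference is a list holding only a references entry that points at the id of the referenced element; set_reference_sketch, creator and contact are illustrative names, and this is not the package's code.

```r
# Hypothetical sketch: replace an element with a reference to another element
set_reference_sketch <- function(element_to_reference, element_to_replace) {
  if (is.null(element_to_reference$id)) {
    stop("element_to_reference needs an 'id' before it can be referenced")
  }
  new_element <- list(references = element_to_reference$id)
  # Keep the class of the element being replaced
  class(new_element) <- class(element_to_replace)
  new_element
}

creator <- list(id = "creator-1",
                individualName = list(givenName = "Test", surName = "User"))
contact <- list(individualName = list(givenName = "Test", surName = "User"))

contact <- set_reference_sketch(creator, contact)
str(contact)
```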
Get the current environment name.
env_get()
(character) The environment name.
Find the newest object, based on dateUploaded, within the given set of objects.
find_newest_object(node, identifiers, rows = 1000)
node |
(MNode/CNode) The Member Node to query. |
identifiers |
(character) One or more identifiers. |
rows |
(numeric) Optional. Specify the size of the query result set. |
(character) The PID of the newest object. In the case of a tie (very unlikely) the first element, in natural order, is returned.
## Not run:
mn <- MNode(...)
find_newest_object(mn, c("PIDX", "PIDY", "PIDZ"))
## End(Not run)
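The selection rule (newest dateUploaded wins, first element on a tie) can be illustrated offline. This is a minimal sketch under stated assumptions: the objects table and its timestamps are made up, and the real function obtains dateUploaded by querying the node rather than from a local data.frame.

```r
# Hypothetical query result: one row per identifier
objects <- data.frame(
  identifier = c("PIDX", "PIDY", "PIDZ"),
  dateUploaded = c("2020-01-01T00:00:00Z",
                   "2021-06-15T12:30:00Z",
                   "2019-03-02T08:00:00Z"),
  stringsAsFactors = FALSE
)

# Parse the timestamps so they compare as times rather than as strings
uploaded <- as.POSIXct(objects$dateUploaded,
                       format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# which.max() returns the first maximum, so ties resolve to the earliest row
newest <- objects$identifier[which.max(as.numeric(uploaded))]
newest  # "PIDY"
```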
Returns the format ID for the given version of EML.
format_eml(version)
version |
The version of EML ('2.1.1' or '2.2.0') |
(character) The format ID for the requested EML version.
format_eml("2.1.1")
## Not run:
# Upload a local EML 2.1.1 file:
env <- env_load()
publish_object(env$mn, "path_to_some_EML_file", format_eml("2.1.1"))
## End(Not run)
Returns the ISO 19139 format ID.
format_iso()
(character) The format ID for ISO 19139.
format_iso()
## Not run:
# Upload a local ISO19139 XML file:
env <- env_load()
publish_object(env$mn, "path_to_some_EML_file", format_iso())
## End(Not run)
This is a convenience wrapper around the constructor of the ResourceMap class from DataPackage.
generate_resource_map( metadata_pid, data_pids = NULL, child_pids = NULL, other_statements = NULL, resolve_base = "https://cn.dataone.org/cn/v2/resolve", resource_map_pid = NULL )
metadata_pid |
(character) PID of the metadata object. |
data_pids |
(character) PID(s) of the data objects. |
child_pids |
(character) Optional. PID(s) of child resource maps. |
other_statements |
(data.frame) Extra statements to add to the resource map. |
resolve_base |
(character) Optional. The resolve service base URL. |
resource_map_pid |
(character) The PID of a resource map. |
(character) Absolute path to the resource map on disk.
## Not run:
generate_resource_map("X", "Y", "Z",
  other_statements = data.frame(subject = "http://example.com/me",
                                predicate = "http://example.com/foo",
                                object = "http://example.com/bar"))
## End(Not run)
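To make the subject/predicate/object layout of other_statements more concrete, the sketch below builds a similar table for the relationship between a metadata object and its data objects. It rests on assumptions that the text above does not confirm: that PIDs are turned into URIs by appending the URL-encoded PID to resolve_base, and that the CiTO predicates documents and isDocumentedBy express the relationship. The PIDs "X", "Y" and "Z" are placeholders.

```r
resolve_base <- "https://cn.dataone.org/cn/v2/resolve"

# Hypothetical helper: build a resolve URI for each PID
resolve_url <- function(pids) {
  encoded <- vapply(pids, utils::URLencode, character(1),
                    reserved = TRUE, USE.NAMES = FALSE)
  paste0(resolve_base, "/", encoded)
}

metadata_pid <- "X"
data_pids <- c("Y", "Z")
n <- length(data_pids)

# One "documents" statement and one "isDocumentedBy" statement per data object
statements <- data.frame(
  subject = c(rep(resolve_url(metadata_pid), n), resolve_url(data_pids)),
  predicate = c(rep("http://purl.org/spar/cito/documents", n),
                rep("http://purl.org/spar/cito/isDocumentedBy", n)),
  object = c(resolve_url(data_pids), rep(resolve_url(metadata_pid), n)),
  stringsAsFactors = FALSE
)
print(statements)
```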
Get the PIDs of all versions of an object.
get_all_versions(node, pid)
node |
(MNode) The Member Node to query. |
pid |
(character) Any object in the chain. |
(character) A vector of PIDs in the chain, in order.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pid <- "urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1"
ids <- get_all_versions(mn, pid)
## End(Not run)
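The idea of recovering a whole version chain from any object in it can be sketched without a node. The example assumes each object's system metadata records which PID it obsoletes and which PID obsoletes it; sysmeta is a made-up local table standing in for those lookups, and walk_chain is a hypothetical name rather than the package's implementation.

```r
# Hypothetical system metadata table: three versions of one object
sysmeta <- data.frame(
  pid = c("A", "B", "C"),
  obsoletes = c(NA, "A", "B"),
  obsoletedBy = c("B", "C", NA),
  stringsAsFactors = FALSE
)

walk_chain <- function(sysmeta, pid) {
  lookup <- function(p, field) sysmeta[[field]][match(p, sysmeta$pid)]
  chain <- pid
  # Walk backwards to the oldest version
  previous <- lookup(pid, "obsoletes")
  while (!is.na(previous)) {
    chain <- c(previous, chain)
    previous <- lookup(previous, "obsoletes")
  }
  # Walk forwards to the newest version
  following <- lookup(pid, "obsoletedBy")
  while (!is.na(following)) {
    chain <- c(chain, following)
    following <- lookup(following, "obsoletedBy")
  }
  chain
}

walk_chain(sysmeta, "B")  # "A" "B" "C"
```

Starting from "A" or "C" gives the same ordered vector, which matches the statement that any object in the chain can be passed in.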
Get a data.frame of EML coordinate reference systems that can be searched and filtered more easily than the raw XML file.
get_coord_list()
Get the base URL of a Member Node.
get_mn_base_url(mn)
mn |
(character) The Member Node. |
(character) The URL.
## Not run:
cn <- CNode('STAGING2')
mn <- getMNode(cn, "urn:node:mnTestKNB")
get_mn_base_url(mn)
## End(Not run)
Get a data.frame of attributes from a NetCDF object.
get_ncdf4_attributes(nc)
nc |
(ncdf4/character) Either a ncdf4 object or a file path. |
(data.frame) A data.frame of the attributes.
## Not run:
get_ncdf4_attributes("./path/to/my.nc")
## End(Not run)
Takes an ontology and returns a dataframe with all the URIs and labels. This is mainly used for MOSAiC because the ontology is modeled differently.
get_ontology_concepts(ontology)
ontology |
(list) the list form of an OWL file |
dataframe
mosaic <- read_ontology("mosaic")
get_ontology_concepts(mosaic)
get_package(node, pid, file_names = FALSE, rows = 5000)
node |
(MNode/CNode) The Coordinating/Member Node to run the query on. |
pid |
(character) The resource map PID of the package. |
file_names |
(logical) Whether to return file names for all objects. |
rows |
(numeric) The number of rows to return in the query. This is only useful to set if you are warned about the result set being truncated. Defaults to 5000. |
Please use dataone::getDataPackage() when possible
Get a structured list of PIDs for the objects in a package, including the resource map, metadata, and data objects.
(list) A structured list of the members of the package.
## Not run:
# Set environment
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pid <- "resource_map_urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1"
ids <- get_package(mn, pid)
## End(Not run)
Get the currently set authentication token.
get_token(node)
node |
(MNode/CNode) The Member/Coordinating Node to query. |
(character) The token.
## Not run:
cn <- CNode('STAGING2')
mn <- getMNode(cn, "urn:node:mnTestKNB")
get_token(mn)
## End(Not run)
Guess format from filename for a vector of filenames.
guess_format_id(filenames)
filenames |
(character) A vector of filenames. |
(character) DataONE format IDs.
formatid <- guess_format_id("temperature_data.csv")
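Guessing a format from a filename usually comes down to a lookup on the file extension. The sketch below shows that idea in base R; the lookup table is a small illustrative subset chosen for this example, the fallback to application/octet-stream is an assumption, and guess_format_sketch is a hypothetical name rather than the package's function.

```r
# Illustrative extension-to-format table (assumed, deliberately incomplete)
format_lookup <- c(
  csv = "text/csv",
  txt = "text/plain",
  png = "image/png"
)

guess_format_sketch <- function(filenames) {
  ext <- tolower(tools::file_ext(filenames))
  ids <- unname(format_lookup[ext])
  # Unknown or missing extensions fall back to a generic binary format
  ids[is.na(ids)] <- "application/octet-stream"
  ids
}

guess_format_sketch(c("temperature_data.csv", "README", "map.PNG"))
# "text/csv" "application/octet-stream" "image/png"
```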
Check if the user has authorization to perform an action on an object.
is_authorized(node, ids, action)
node |
(MNode/CNode) The Member/Coordinating Node to query. |
ids |
(character) The PID or SID to check. |
action |
(character) One of read, write, or changePermission. |
(logical)
## Not run:
cn <- CNode('STAGING2')
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
is_authorized(mn, pids, "write")
## End(Not run)
Test whether the object is obsoleted by another object
is_obsolete(node, pids)
node |
(MNode|CNode) The Coordinating/Member Node to run the query on. |
pids |
(character) One or more PIDs to query against. |
(logical) Whether or not the object is obsoleted by another object.
## Not run:
# Set environment
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pid <- "urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1"
is_obsolete(mn, pid)
## End(Not run)
Check whether objects have public read access. No token needs to be set to use this function.
is_public_read(mn, pids, use.names = TRUE)
mn |
(MNode) The Member Node. |
pids |
(character) The PIDs of the objects to check for public read access. |
use.names |
(logical) If |
(logical) Whether an object has public read access.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
is_public_read(mn, pids)
## End(Not run)
Determine whether the set token is expired.
is_token_expired(node)
node |
(character) The Member Node. |
(logical)
## Not run:
cn <- CNode('STAGING2')
mn <- getMNode(cn, "urn:node:mnTestKNB")
is_token_expired(mn)
## End(Not run)
Test whether a token is set.
is_token_set(node)
node |
(MNode/CNode) The Member/Coordinating Node to query. |
(logical)
## Not run:
cn <- CNode('STAGING2')
mn <- getMNode(cn, "urn:node:mnTestKNB")
is_token_set(mn)
## End(Not run)
This function scores a metadata document against a MetaDIG suite. The default suite is for the Arctic Data Center.
mdq_run(document, suite_id = "arctic.data.center.suite.1")
document |
(eml/character) Either an EML object or path to a file on disk. |
suite_id |
(character) Specify a suite ID. Should be one of https://quality.nceas.ucsb.edu/quality/suites. |
(data.frame) A sorted data.frame of check results.
## Not run:
# Check an EML document you are authoring
library(EML)
mdq_run(new("eml"))

# Check an EML document that is saved to disk
mdq_run(system.file("examples", "example-eml-2.1.1.xml", package = "EML"))
## End(Not run)
Add a MOSAiC (https://mosaic-expedition.org/) attribute annotation. The returned object does not include the id slot.
mosaic_annotate_attribute(eventLabel)
eventLabel |
(character) the event ID provided by the researcher |
(list) the attribute level annotation
mosaic_annotate_attribute("PS122/2_14-270")
The basis might differ depending on the campaign if it does not follow the pattern PS122/#. This function assumes the use of the Polarstern as the basis. Please verify this field before adding the annotation.
mosaic_annotate_dataset(campaign)
campaign |
(character vector) the campaign number (can be derived from the eventID), PS122/# |
(list) the dataset level annotation
# With one campaign
mosaic_annotate_dataset("PS122/2")

# Multiple campaigns
mosaic_annotate_dataset(c("PS122/2", "PS122/1"))
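The campaign argument notes that the campaign can be derived from the event ID. A minimal base R sketch of that derivation follows. It assumes event labels begin with the campaign in the form PS122/#, as the example label "PS122/2_14-270" does. The helper name campaign_from_event_label is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of arcticdatautils): pull the leading
# "PS122/#" campaign out of a MOSAiC event label.
campaign_from_event_label <- function(event_label) {
  hit <- regexpr("^PS122/[0-9]+", event_label)
  found <- as.integer(hit) > 0
  # Labels that do not follow the assumed pattern yield NA
  out <- rep(NA_character_, length(event_label))
  out[found] <- regmatches(event_label, hit)
  out
}

campaign_from_event_label("PS122/2_14-270")
```

The result could then be passed to mosaic_annotate_dataset(), after verifying the basis as noted in the description.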
The function only selects the annotations that are used for method/devices (there are 500+ options). Copy and paste the output into a portal document's choice filters.
mosaic_portal_filter(class)
class |
(character) a class in the MOSAiC ontology to get the filters from |
character
mosaic_portal_filter("Method/Device")
mosaic_portal_filter("Basis")
mosaic_portal_filter("Campaign")
Generate a new UUID PID.
new_uuid()
(character) A new UUID PID.
id <- new_uuid()
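The PIDs used in this manual's examples have the form urn:uuid: followed by a UUID. The base R check below is a sketch for that shape. The helper is_uuid_pid is hypothetical, and whether new_uuid() always returns exactly this form is an assumption based on those examples.

```r
# Hypothetical helper (not part of arcticdatautils): TRUE when a string looks
# like "urn:uuid:" followed by an 8-4-4-4-12 lowercase hexadecimal UUID.
is_uuid_pid <- function(x) {
  pattern <- "^urn:uuid:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
  grepl(pattern, x)
}

is_uuid_pid("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1")
# A resource map PID carries a prefix, so it does not match this pattern
is_uuid_pid("resource_map_urn:uuid:6b2e5753-4a94-4e6f-971c-36420a446ecb")
```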
This is a simple check for the HTTP status of a /meta/{PID} call on the provided Member Node.
object_exists(node, pids)
node |
(MNode) The Member Node to query. |
pids |
(character) The PID(s) to check the existence of. |
(logical) Whether the object exists.
## Not run:
# Set environment
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")

object_exists(mn, pids)
## End(Not run)
Parse a resource map into a data.frame.
parse_resource_map(path)
path |
(character) Path to the resource map (an RDF/XML file). |
(data.frame) The statements in the resource map.
## Not run:
# Set environment
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
rm_pid <- "resource_map_urn:uuid:6b2e5753-4a94-4e6f-971c-36420a446ecb"

# Write resource map to file
writeBin(getObject(mn, rm_pid), "~/Documents/resource_map.rdf")
df <- parse_resource_map("~/Documents/resource_map.rdf")
## End(Not run)
Create EML entity with physical section from any DataONE PID
pid_to_eml_entity(mn, pid, entity_type = "otherEntity", ...)
mn |
(MNode) Member Node where the PID is associated with an object. |
pid |
(character) The PID of the object to create the sub-tree for. |
entity_type |
(character) What kind of object to create from the input. One of "dataTable", "spatialRaster", "spatialVector", "storedProcedure", "view", or "otherEntity". |
... |
(optional) Additional arguments to be passed to |
(list) The entity object.
## Not run:
# Generate EML otherEntity
pid_to_eml_entity(mn, pid,
                  entity_type = "otherEntity",
                  entityName = "Entity Name",
                  entityDescription = "Description about entity")
## End(Not run)
This function creates a data object's physical.
pid_to_eml_physical(mn, pid, num_header_lines = 1)
mn |
(MNode) Member Node where the PID is associated with an object. |
pid |
(character) The PID of the object to create the physical for. |
num_header_lines |
(double) The number of header lines in a CSV/Excel file. Defaults to 1. |
(list) A physical object.
## Not run:
# Generate EML physical sections for an object in a data package
phys <- pid_to_eml_physical(mn, pid, num_header_lines = 1)
## End(Not run)
Use sensible defaults to publish an object on a Member Node. If pid is provided, use it; otherwise generate a UUID. If clone_pid is provided, retrieve the System Metadata for that identifier and use it to provide the rightsHolder, accessPolicy, and replicationPolicy metadata. Note that this function only uploads the object to the Member Node and does not add it to a data package, which can be done separately.
publish_object(
  mn,
  path,
  format_id = NULL,
  pid = NULL,
  sid = NULL,
  clone_pid = NULL,
  public = TRUE
)
mn |
(MNode) The Member Node to publish the object to. |
path |
(character) The path to the file to be published. |
format_id |
(character) Optional. The format ID to set for the object.
When not set, |
pid |
(character) Optional. The PID to use with the object. |
sid |
(character) Optional. The SID to use with the new object. |
clone_pid |
(character) PID of object to clone System Metadata from. |
public |
(logical) Whether object should be given public read access. |
pid (character) The PID of the published object.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
my_path <- "/home/Documents/myfile.csv"
pid <- publish_object(mn, path = my_path, format_id = "text/csv", public = FALSE)
## End(Not run)
Publish an update to a data package after updating data files or metadata.
publish_update(
  mn,
  metadata_pid,
  resource_map_pid,
  data_pids = NULL,
  child_pids = NULL,
  metadata_path = NULL,
  identifier = NULL,
  use_doi = FALSE,
  parent_resmap_pid = NULL,
  parent_metadata_pid = NULL,
  parent_data_pids = NULL,
  parent_child_pids = NULL,
  public = TRUE,
  check_first = TRUE,
  format_id = NULL
)
mn |
(MNode) The Member Node to update the object on. |
metadata_pid |
(character) The PID of the EML metadata document to be updated. |
resource_map_pid |
(character) The PID of the resource map for the package. |
data_pids |
(character) PID(s) of data objects that will go in the updated package. |
child_pids |
(character) Optional. Child packages resource map PIDs. |
metadata_path |
(character or eml) Optional. An eml class object or a path to a metadata file to update with. If this is not set, the existing metadata document will be used. |
identifier |
(character) Manually specify the identifier for the new metadata object. |
use_doi |
(logical) Generate and use a DOI as the identifier for the updated metadata object. |
parent_resmap_pid |
(character) Optional. PID of a parent package to be updated. Not optional if a parent package exists. |
parent_metadata_pid |
(character) Optional. Identifier for the metadata document of the parent package. Not optional if a parent package exists. |
parent_data_pids |
(character) Optional. Identifier for the data objects of the parent package. Not optional if the parent package contains data objects. |
parent_child_pids |
(character) Optional. Resource map identifier(s) of child packages in the parent package. |
public |
(logical) Optional. Make the update public. If |
check_first |
(logical) Optional. Whether to check the PIDs passed in as arguments exist on the MN before continuing.
Checks that objects exist and are of the right format type. This speeds up the function, especially when |
format_id |
(character) Optional. When omitted, the updated object will have the same formatId as |
This function can be used for a variety of tasks:
Publish an existing package with a DOI
Update a package with new data objects
Update a package with new metadata
The metadata_pid and resource_map_pid provide the identifiers of an EML metadata
document and its associated resource map, and the data_pids vector provides the
PIDs of the data objects in the package. The function updates the metadata file
and resource map by generating a new identifier (a DOI if use_doi = TRUE) and
updating the Member Node with a public version of the object. If metadata_path is
provided, it should be an edited version of the metadata to be used to update the
original. If parent_resmap_pid is provided, it indicates the PID of a parent
package that should be updated as well, using parent_metadata_pid,
parent_data_pids, and parent_child_pids as members of the updated package. In all
cases, the objects are made publicly readable.
(character) Named character vector of PIDs in the data package, including PIDs for the metadata, resource map, and data objects.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")

rm_pid <- "resource_map_urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe"
meta_pid <- "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe"
data_pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
               "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
meta_path <- "/home/Documents/myMetadata.xml"

# Pass the metadata path by name; the fifth positional argument is child_pids
publish_update(mn, meta_pid, rm_pid, data_pids,
               metadata_path = meta_path, public = TRUE)
## End(Not run)
Get an OWL file from GitHub.
read_ontology(ontology_name)
ontology_name |
the name of the ontology to read; one of mosaic or ecso |
list
read_ontology("mosaic")
read_ontology("ecso")
Read a shapefile as an sf object from a PID that points to the zipped directory of the shapefile and associated files on a given Member Node.
read_zip_shapefile(mn, pid)
mn |
(MNode) A DataONE Member Node |
pid |
(character) An object identifier |
shapefile (sf) The shapefile as an sf object
Jeanette Clark [email protected]
## Not run:
cn <- dataone::CNode('PROD')
adc <- dataone::getMNode(cn, 'urn:node:ARCTIC')
pid <- "urn:uuid:294a365f-c0d1-4cc3-a508-2e16260aa70c"
shapefile <- read_zip_shapefile(adc, pid)
## End(Not run)
Recovers failed submissions and writes the new, valid EML to a given path
recover_failed_submission(node, pid, path)
node |
(MNode) The Member Node to publish the object to. |
pid |
The PID of the EML metadata document to be recovered. |
path |
path to write XML. |
Recovers and writes the valid EML to the indicated path.
Rachel Sun [email protected]
## Not run:
# Set environment
cn <- dataone::CNode("STAGING2")
mn <- dataone::getMNode(cn, "urn:node:mnTestKNB")
pid <- "urn:uuid:b1a234f0-eed5-4f58-b8d5-6334ce07c010"
path <- tempfile("file", fileext = ".xml")
recover_failed_submission(mn, pid, path)
eml <- EML::read_eml(path)
## End(Not run)
Reformat the fileName field in an object's system metadata to follow Arctic Data Center system metadata naming conventions. publish_object() calls this function to rename the fileName field in system metadata.
reformat_file_name(path, sysmeta)
path |
(character) full file path |
sysmeta |
(S4) A system metadata object |
Remove the given subjects from the access policy for the given objects on the given Member Node. For each type of permission, this function checks if the permission is already set and only updates the System Metadata when a change is needed.
remove_access(
  mn,
  pids,
  subjects,
  permissions = c("read", "write", "changePermission")
)
mn |
(MNode) The Member Node. |
pids |
(character) The PIDs of the objects to set permissions for. |
subjects |
(character) The identifiers of the subjects to set permissions for, typically an ORCID or DN. |
permissions |
(character) Optional. The permissions to set. Defaults to read, write, and changePermission. |
(logical) Whether an update was needed.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
remove_access(mn, pids, subjects = "http://orcid.org/0000-000X-XXXX-XXXX",
              permissions = c("read", "write", "changePermission"))
## End(Not run)
Remove public read access for an object.
remove_public_read(mn, pids)
mn |
(MNode) The Member Node. |
pids |
(character) The PIDs of the objects to remove public read access for. |
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
remove_public_read(mn, pids)
## End(Not run)
This function takes a named list of data objects, such as what is
returned from get_package, and reorders them according to the order
they are given in the EML document.
reorder_pids(pid_list, doc)
pid_list |
(list) A named list of data pids |
doc |
(list) an |
ordered_pids (list) A list of reordered pids
## Not run:
cn <- dataone::CNode('PROD')
adc <- dataone::getMNode(cn, 'urn:node:ARCTIC')
ids <- get_package(adc, 'resource_map_doi:10.18739/A2S17SS1M', file_names = TRUE)
doc <- EML::read_eml(dataone::getObject(adc, ids$metadata))

# return all entity types
ordered_pids <- reorder_pids(ids$data, doc)
## End(Not run)
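The reordering idea can be illustrated in base R without a Member Node. The sketch below assumes the list names are file names and that the EML order is available as a character vector of those same names. Both inputs are made up for illustration, and this is not the package's implementation.

```r
# Hypothetical inputs: a named list of PIDs (names are file names) and the
# entity order as it might appear in an EML document.
pid_list <- list(
  "b.csv" = "urn:uuid:bbbb",
  "a.csv" = "urn:uuid:aaaa",
  "c.csv" = "urn:uuid:cccc"
)
eml_order <- c("a.csv", "b.csv", "c.csv")

# match() gives, for each name in eml_order, its position in pid_list
ordered_pids <- pid_list[match(eml_order, names(pid_list))]
names(ordered_pids)
```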
Set the access policy for the given subjects for the given objects on the given Member Node. For each type of permission, this function checks if the permission is already set and only updates the System Metadata when a change is needed.
set_access(
  mn,
  pids,
  subjects,
  permissions = c("read", "write", "changePermission")
)
mn |
(MNode) The Member Node. |
pids |
(character) The PIDs of the objects to set permissions for. |
subjects |
(character) The identifiers of the subjects to set permissions for, typically an ORCID or DN. |
permissions |
(character) Optional. The permissions to set. Defaults to read, write, and changePermission. |
(logical) Whether an update was needed.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
set_access(mn, pids, subjects = "http://orcid.org/0000-000X-XXXX-XXXX",
           permissions = c("read", "write", "changePermission"))
## End(Not run)
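The "only update when a change is needed" behaviour described for set_access() can be sketched in base R. The example assumes an access policy can be represented as a data.frame with subject and permission columns. The helper add_rules_if_missing is hypothetical and does not talk to a Member Node.

```r
# Hypothetical helper (not part of arcticdatautils or dataone): add access
# rules to a policy table only where they are missing.
add_rules_if_missing <- function(policy, subject, permissions) {
  changed <- FALSE
  for (perm in permissions) {
    already_set <- any(policy$subject == subject & policy$permission == perm)
    if (!already_set) {
      policy <- rbind(
        policy,
        data.frame(subject = subject, permission = perm, stringsAsFactors = FALSE)
      )
      changed <- TRUE
    }
  }
  list(policy = policy, changed = changed)
}

# Assumed starting policy: public read only
policy <- data.frame(subject = "public", permission = "read", stringsAsFactors = FALSE)
orcid <- "http://orcid.org/0000-000X-XXXX-XXXX"

first <- add_rules_if_missing(policy, orcid, c("read", "write"))
second <- add_rules_if_missing(first$policy, orcid, c("read", "write"))
first$changed   # TRUE: two rules were added
second$changed  # FALSE: nothing left to change
```

Returning a changed flag mirrors the documented return value, "Whether an update was needed", and would let a caller skip the System Metadata update when nothing changed.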
Set the file name for an object.
set_file_name(mn, pid, name)
mn |
(MNode) The Member Node. |
pid |
(character) The PID of the object to set the file name on. |
name |
(character) The file name. |
(logical) Whether the update succeeded.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")

pid <- "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe"
set_file_name(mn, pid, "myfile.csv")
## End(Not run)
Set public read access for an object.
set_public_read(mn, pids)
mn |
(MNode) The Member Node. |
pids |
(character) The PIDs of the objects to set public read access for. |
(logical) Whether an update was needed.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
set_public_read(mn, pids)
## End(Not run)
Set public READ access on all versions of PIDs in data package.
set_public_read_all_versions(mn, resource_map_pid)
mn |
(MNode) The Member Node to query. |
resource_map_pid |
(character) The resource map identifier (PID). |
## Not run:
cn_staging <- CNode('STAGING')
adc_test <- getMNode(cn_staging, 'urn:node:mnTestARCTIC')

# Create a dummy package then create another version with 'publish_update()'
pkg <- create_dummy_package(adc_test)
remove_public_read(adc_test, unlist(pkg))
pkg_v2 <- publish_update(adc_test, pkg$metadata, pkg$resource_map, pkg$data, public = FALSE)

# Set public read on all versions
set_public_read_all_versions(adc_test, pkg$resource_map)
## End(Not run)
Set the given subject as the rights holder and with given permissions for the given objects. This function only updates the existing System Metadata when a change is needed.
set_rights_and_access(
  mn,
  pids,
  subject,
  permissions = c("read", "write", "changePermission")
)
mn |
(MNode) The Member Node. |
pids |
(character) The PIDs of the objects to set the rights holder and access policy for. |
subject |
(character) The identifier of the new rights holder, typically an ORCID or DN. |
permissions |
(character) Optional. The permissions to set. Defaults to read, write, and changePermission. |
(logical) Whether an update was needed.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
set_rights_and_access(mn, pids, "http://orcid.org/0000-000X-XXXX-XXXX",
                      permissions = c("read", "write", "changePermission"))
## End(Not run)
Set the rights holder to the given subject for the given objects on the given Member Node. This function checks if the rights holder is already set and only updates the System Metadata when a change is needed.
set_rights_holder(mn, pids, subject)
mn |
(MNode) The Member Node. |
pids |
(character) The PIDs of the objects to set the rights holder for. |
subject |
(character) The identifier of the new rights holder, typically an ORCID or DN. |
(logical) Whether an update was needed.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
          "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")
# The argument is named 'subject' (singular) in the usage above
set_rights_holder(mn, pids, subject = "http://orcid.org/0000-000X-XXXX-XXXX")
## End(Not run)
Show the indexing status of a set of PIDs.
show_indexing_status(mn, pids)
mn |
(MNode) The Member Node to query. |
pids |
(character/list) One or more PIDs. |
NULL
## Not run:
# Create a package then check its indexing status
library(dataone)
mn <- MNode(...)
pkg <- create_dummy_package(mn)
show_indexing_status(mn, pkg)
## End(Not run)
This function creates an EML physical object based on what's in the System Metadata of an object. Note that it sets an Online Distribution URL of the DataONE v2 resolve service for the PID.
sysmeta_to_eml_physical(sysmeta)
sysmeta |
(SystemMetadata) One or more System Metadata objects. |
(list) A list of physical objects.
## Not run:
# Generate EML physical object from a system metadata object
sm <- getSystemMetadata(mn, pid)
sysmeta_to_eml_physical(sm)
## End(Not run)
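The description mentions an Online Distribution URL that points at the DataONE v2 resolve service. The sketch below shows how such a URL might be assembled in base R. The base URL is an assumption (the production Coordinating Node), and the exact string the function writes may differ.

```r
# Assumption: the production DataONE Coordinating Node resolve endpoint.
resolve_base <- "https://cn.dataone.org/cn/v2/resolve/"
pid <- "urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1"

# Percent-encode reserved characters such as ":" so the PID stays a single
# path segment in the URL
resolve_url <- paste0(resolve_base, utils::URLencode(pid, reserved = TRUE))
resolve_url
```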
Format the EML file name based on the dataset title.
title_to_file_name(title)
title |
(character) title of the dataset |
(character) file path with underscores and extension (.xml)
title_to_file_name("Example title here")
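The return value is described as a file name with underscores and an .xml extension. A rough base R sketch of that kind of transformation follows. The helper sketch_title_to_file_name is hypothetical, and the package's actual rules (for example, handling of punctuation or length limits) may differ.

```r
# Hypothetical helper (not the package's implementation): turn a dataset
# title into an underscore-separated file name ending in .xml.
sketch_title_to_file_name <- function(title) {
  stopifnot(is.character(title), length(title) == 1)
  # Replace each run of non-alphanumeric characters with a single underscore
  name <- gsub("[^[:alnum:]]+", "_", title)
  # Trim underscores left at either end
  name <- gsub("^_+|_+$", "", name)
  paste0(name, ".xml")
}

sketch_title_to_file_name("Example title here")
# "Example_title_here.xml"
```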
This is a convenience wrapper around dataone::updateObject() which copies in
fields from the old object's System Metadata such as the rightsHolder and
accessPolicy and updates only what needs to be changed.
update_object(mn, pid, path, format_id = NULL, new_pid = NULL, sid = NULL)
mn |
(MNode) The Member Node to update the object on. |
pid |
(character) The PID of the object to update. |
path |
(character) The full path to the file to update with. |
format_id |
(character) Optional. The format ID to set for the object.
When not set, |
new_pid |
(character) Optional. Specify the PID for the new object. Defaults to automatically generating a new, random UUID-style PID. |
sid |
(character) Optional. Specify a Series ID (SID) to use for the new object. |
(character) The PID of the updated object.
## Not run:
cn <- CNode("STAGING2")
mn <- getMNode(cn, "urn:node:mnTestKNB")
pid <- "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe"
my_path <- "/home/Documents/myfile.csv"
new_pid <- update_object(mn, pid, my_path, format_id = "text/csv")
## End(Not run)
This function first generates a new resource map RDF/XML document locally and
then uses the dataone::updateObject() function to update an object on the
specified MN.
update_resource_map(
  mn,
  resource_map_pid,
  metadata_pid,
  data_pids = NULL,
  child_pids = NULL,
  other_statements = NULL,
  identifier = NULL,
  public = TRUE,
  check_first = TRUE
)
mn |
(MNode) The Member Node. |
resource_map_pid |
(character) The PID of the resource map to be updated. |
metadata_pid |
(character) The PID of the metadata object to go in the package. |
data_pids |
(character) The PID(s) of the data objects to go in the package. |
child_pids |
(character) The resource map PIDs of the packages to be nested under the package. |
other_statements |
(data.frame) Extra statements to add to the resource map. |
identifier |
(character) Manually specify the identifier for the new metadata object. |
public |
(logical) Whether or not to make the new resource map public read. |
check_first |
(logical) Optional. Whether to check the PIDs passed in as
arguments exist on the MN before continuing. This speeds up the function,
especially when |
If you only want to generate resource map RDF/XML, see generate_resource_map().
This function can also be used to add new child packages to a parent package. For example, if you have:
Parent
  A
  B
and want to add C as a sibling package to A and B, e.g.:
Parent
  A
  B
  C
then you could use this function.
Note: This function currently replaces the rightsHolder on the resource map temporarily to allow updating but sets it back to the rightsHolder that was in place before the update.
(character) The PID of the updated resource map.
## Not run:
cn <- CNode('STAGING2')
mn <- getMNode(cn, "urn:node:mnTestKNB")

rm_pid <- "resource_map_urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe"
meta_pid <- "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe"
data_pids <- c("urn:uuid:3e5307c4-0bf3-4fd3-939c-112d4d11e8a1",
               "urn:uuid:23c7cae4-0fc8-4241-96bb-aa8ed94d71fe")

rm_new <- update_resource_map(mn, rm_pid, meta_pid, data_pids)
## End(Not run)
which_in_eml(doc, element, test)
doc |
(list) An EML object. |
element |
(character) Element to evaluate. |
test |
(function/character) A function to evaluate (see examples). If test is a character,
will evaluate if |
Please use eml_get_simple() and which() together instead.
This function returns indices within an EML list that contain an instance where
test == TRUE. See examples for more information.
Mitchell Maier [email protected]
## Not run:
# Question: Which creators have a surName "Smith"?
n <- which_in_eml(eml$dataset$creator, "surName", "Smith")
# Answer:
eml$dataset$creator[n]

# Question: Which dataTables have an entityName that begins with "2016"
n <- which_in_eml(eml$dataset$dataTable, "entityName", function(x) {grepl("^2016", x)})
# Answer:
eml$dataset$dataTable[n]

# Question: Which attributes in dataTable[[1]] have a numberType "natural"?
n <- which_in_eml(eml$dataset$dataTable[[1]]$attributeList$attribute, "numberType", "natural")
# Answer:
eml$dataset$dataTable[[1]]$attributeList$attribute[n]

# Question: Which dataTables have at least one attribute with a numberType "natural"?
n <- which_in_eml(eml$dataset$dataTable, "numberType", function(x) {"natural" %in% x})
# Answer:
eml$dataset$dataTable[n]
## End(Not run)
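The note above recommends combining eml_get_simple() with which(). The same idea can be shown on a plain nested list in base R. The sketch below uses vapply() as a stand-in for the extraction step, and the creators list is made-up data, not a real EML document.

```r
# Made-up stand-in for eml$dataset$creator: a list of creator entries
creators <- list(
  list(individualName = list(givenName = "Ana", surName = "Smith")),
  list(individualName = list(givenName = "Bo",  surName = "Jones")),
  list(individualName = list(givenName = "Cy",  surName = "Smith"))
)

# Extract one value per entry, then ask which() for the matching indices
sur_names <- vapply(creators, function(x) x$individualName$surName, character(1))
n <- which(sur_names == "Smith")
n
# 1 3
```

The indices can then subset the original list, for example creators[n].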