Title: | Harvest and register data package citations |
---|---|
Description: | Harvests data package citations from several API sources, including PLOS, Scopus, and Springer. This package uses modified functions from `rplos`, which is no longer maintained. |
Authors: | Jeanette Clark [aut, cre], Matthew B. Jones [aut], Maya Samet [aut], Althea Marks [aut] |
Maintainer: | Jeanette Clark <[email protected]> |
License: | Apache License (>= 2.0) |
Version: | 1.1.0 |
Built: | 2024-10-25 05:32:46 UTC |
Source: | https://github.com/DataONEorg/scythe |
Search for citations in text across all APIs
citation_search(identifiers, sources = c("plos", "scopus", "springer", "xdd"))
identifiers |
a vector of identifiers to be searched for |
sources |
a vector indicating which sources to query (one or more of plos, scopus, springer, xdd) |
tibble of matching dataset and publication identifiers
## Not run:
identifiers <- c("10.18739/A22274", "10.18739/A2D08X", "10.5063/F1T151VR")
result <- citation_search(identifiers, sources = c("plos"))
## End(Not run)
This function searches for citations in PLOS. Requests are throttled at one identifier every 6 seconds so as to not overload the PLOS API. This function uses modified source code from the 'rplos' package, which is no longer maintained.
citation_search_plos(identifiers)
identifiers |
a vector of identifiers to be searched for |
tibble of matching dataset and publication identifiers
## Not run:
identifiers <- c("10.18739/A22274", "10.18739/A2D08X", "10.5063/F1T151VR")
result <- citation_search_plos(identifiers)
## End(Not run)
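The throttling described for citation_search_plos can be pictured as a simple loop that pauses between identifiers. The following is a minimal sketch in base R, not the package's actual implementation: `throttled_lookup` and `query_fun` are hypothetical names, and the only detail taken from the text above is the 6-second pause between identifiers.

```r
# Hypothetical helper (not part of scythe): call query_fun once per
# identifier, pausing wait_seconds between consecutive calls.
throttled_lookup <- function(identifiers, query_fun, wait_seconds = 6) {
  results <- vector("list", length(identifiers))
  for (i in seq_along(identifiers)) {
    results[[i]] <- query_fun(identifiers[[i]])
    # No pause is needed after the final identifier.
    if (i < length(identifiers)) Sys.sleep(wait_seconds)
  }
  names(results) <- identifiers
  results
}

# Stand-in query that only counts characters; a real query would call an API.
demo <- throttled_lookup(
  c("10.18739/A22274", "10.5063/F1T151VR"),
  query_fun = nchar,
  wait_seconds = 0
)
```

Passing the query as a function keeps the pacing logic separate from whatever request is being made, so the same loop could in principle be reused with a different interval for another source.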
This function searches for citations in Scopus. Requests are throttled to 9 requests per second so as not to overload the Scopus API.
citation_search_scopus(identifiers)
identifiers |
a vector of identifiers to be searched for |
tibble of matching dataset and publication identifiers
## Not run:
identifiers <- c("10.18739/A22274", "10.18739/A2D08X", "10.5063/F1T151VR")
result <- citation_search_scopus(identifiers)
## End(Not run)
This function searches for citations from Springer. It requires that an API key be obtained from [Springer](https://dev.springernature.com/) and set using 'scythe_set_key()'. Requests are throttled at one identifier every second so as not to overload the Springer API.
citation_search_springer(identifiers)
identifiers |
a vector of identifiers to be searched for |
tibble of matching dataset and publication identifiers
## Not run:
identifiers <- c("10.18739/A22274", "10.18739/A2D08X", "10.5063/F1T151VR")
result <- citation_search_springer(identifiers)
## End(Not run)
This function searches for citations in xDD. It uses the 'snippets/term' function of the xDD API and searches the entire xDD corpus (not limited to full-text documents).
citation_search_xdd(identifiers)
identifiers |
a vector of identifiers to be searched for, without the protocol prefix ("https://" or "http://") |
tibble of publications, and their identifiers, that contain the searched dataset identifiers
## Not run:
identifiers <- c("10.18739/A22274", "10.18739/A2D08X", "10.5063/F1T151VR", "10.18739/A29K97")
result <- citation_search_xdd(identifiers)
## End(Not run)
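Because citation_search_xdd expects identifiers without "https://" or "http://", it may help to normalize inputs beforehand. The sketch below assumes a hypothetical helper, `strip_protocol`, written in base R; it is not part of scythe. It removes only the protocol prefix, since the text above does not say whether a resolver host such as doi.org should also be dropped.

```r
# Hypothetical helper (not part of scythe): drop a leading "http://" or
# "https://" from each identifier and leave everything else untouched.
strip_protocol <- function(identifiers) {
  sub("^https?://", "", identifiers)
}

ids <- c(
  "https://doi.org/10.18739/A22274",
  "http://doi.org/10.5063/F1T151VR",
  "10.18739/A2D08X"
)
clean_ids <- strip_protocol(ids)
```

Identifiers that already lack a protocol prefix pass through unchanged, so the helper is safe to apply to mixed input.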
This function is from the 'rplos' package, which is no longer maintained.
ploscompact(l)
l |
a list |
Report estimated wait for rate limited queries
report_est_wait(n_queries, wait_seconds)
n_queries |
number of queries |
wait_seconds |
wait time in seconds between queries |
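The arithmetic behind an estimated wait can be sketched as follows. This is an illustration only: `estimate_wait_seconds` is a hypothetical function, and the assumption that the total is simply the number of queries multiplied by the wait per query is not confirmed by the documentation above.

```r
# Hypothetical helper (not part of scythe): estimate the total wait,
# assuming one pause of wait_seconds is charged per query.
estimate_wait_seconds <- function(n_queries, wait_seconds) {
  stopifnot(n_queries >= 0, wait_seconds >= 0)
  n_queries * wait_seconds
}

# Three identifiers at the PLOS interval of 6 seconds between identifiers.
est <- estimate_wait_seconds(3, 6)
message(sprintf("Estimated wait: %g seconds (about %.1f minutes)", est, est / 60))
```

Under this assumption, three PLOS queries at 6 seconds each come to 18 seconds.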
Look for API keys for services, which are represented as character strings.
scythe_get_key(source)
source |
the name of the source service to look up |
Secrets are typically stored in a keyring named "scythe" (see the keyring package), or, alternatively, in an environment variable with a name identical to "source".
character: the secret value, or NA if not set
## Not run:
scythe_get_key("scopus_key")
## End(Not run)
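The environment-variable route described above can be illustrated with base R alone. This sketch covers only that fallback, not the keyring lookup, and `get_key_from_env` is a hypothetical name rather than a scythe function. It assumes an unset or empty variable should be reported as NA, matching the documented return value.

```r
# Hypothetical helper (not part of scythe): read a secret from an
# environment variable whose name equals `source`; return NA if absent.
get_key_from_env <- function(source) {
  value <- Sys.getenv(source, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) NA_character_ else value
}

# Set a placeholder value for demonstration; never hard-code a real key.
Sys.setenv(scopus_key = "example-secret")
found <- get_key_from_env("scopus_key")
missing <- get_key_from_env("scythe_demo_unset_variable")
```

In practice the variable would be set outside the script, for example in a user-level `.Renviron` file, so the secret never appears in code.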
This function sets API keys using the 'keyring' package. 'keyring' uses your operating system's credential store to securely keep track of key-value pairs. Running this function for the first time will prompt you to set a password for your keyring, should you need to lock or unlock it. See 'keyring::keyring_unlock' for more details.
scythe_set_key(source, secret)
source |
(char) Key source, one of "scopus" or "springer" |
secret |
(char) API key value |
This function is adapted from the 'searchplos' function in the 'rplos' package, which is no longer maintained.
searchplos(
  q = NULL,
  fl = "id",
  fq = NULL,
  sort = NULL,
  start = 0,
  limit = 10,
  sleep = 6,
  errors = "simple",
  proxy = NULL,
  callopts = list(),
  progress = NULL,
  ...
)
q |
Search terms, e.g. field:query |
fl |
Fields to return |
fq |
Fields to filter query on |
sort |
Sort results according to field |
start |
Record to start at for pagination |
limit |
Number of results to return for pagination |
sleep |
Seconds to wait between requests |
errors |
One of simple or complete |
proxy |
List of args for proxy connection |
callopts |
Optional curl options |
progress |
Optional logic for progress bar |
... |
Additional Solr arguments |
This function is from the 'rplos' package, which is no longer maintained.
strextract(str, pattern)
str |
A string |
pattern |
A regex pattern |
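To show what a helper with this signature might do, here is a minimal base R sketch of regex extraction. It is an assumption about the behavior, not the package's code: `extract_first_match` is a hypothetical name, and it returns the first match of the pattern in each string that contains one.

```r
# Hypothetical helper (not part of scythe): return the first substring of
# `str` matching `pattern`; strings with no match are dropped by regmatches().
extract_first_match <- function(str, pattern) {
  regmatches(str, regexpr(pattern, str))
}

# Pull a DOI-like token out of a longer string.
doi <- extract_first_match(
  "doi:10.1371/journal.pone.0213037",
  "10\\.[0-9]+/[^ ]+"
)
```

Note that `regmatches()` combined with `regexpr()` returns a zero-length character vector when nothing matches, so callers that need a placeholder should check the length of the result.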
Write citation pairs
write_citation_pairs(citation_list, path, extra_fields = NULL)
citation_list |
(data.frame) data.frame of citation pairs containing variables article_id and dataset_id |
path |
(char) path to write JSON citation pairs to |
extra_fields |
(char) list of extra fields to pass to bib2df::bib2df |
## Not run:
pairs <- data.frame(article_id = "10.1371/journal.pone.0213037", dataset_id = "10.18739/A22274")
write_citation_pairs(citation_list = pairs, path = "citation_pairs.json")
## End(Not run)