Query the TSL Zenodo community for available TSP datasets.

Usage

list_datasets(
  community = "tsl-structures",
  sandbox = FALSE,
  query = NULL,
  full = FALSE,
  refresh = FALSE
)

Arguments

community

Zenodo community identifier. Default "tsl-structures".

sandbox

If TRUE, query sandbox.zenodo.org instead of the production site (zenodo.org).

query

Optional free-text search query, matched against record titles and descriptions through the Zenodo API.

full

If TRUE, fetch full metadata from each dataset's datapackage.json. This returns additional columns like predictors, structure_count, and formats. Results are cached for 24 hours.

refresh

If TRUE, force a refresh instead of reusing cached results. When full = TRUE, this re-fetches the cached datapackage.json metadata.
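
As a rough illustration of how community, sandbox, and query might map onto a Zenodo records search, the sketch below builds a request URL in base R. The helper name build_search_url is hypothetical and not part of this package. The /api/records endpoint with its communities and q parameters is an assumption based on Zenodo's public REST API, so the request the package actually sends may differ.

```r
# Hypothetical helper for illustration only; not exported by this package.
build_search_url <- function(community = "tsl-structures",
                             sandbox = FALSE,
                             query = NULL) {
  # Assumption: sandbox = TRUE changes only the host name.
  host <- if (sandbox) "sandbox.zenodo.org" else "zenodo.org"

  # Percent-encode values so spaces and reserved characters are URL-safe.
  url <- paste0(
    "https://", host, "/api/records?communities=",
    utils::URLencode(community, reserved = TRUE)
  )

  # Append the free-text search term only when one was supplied.
  if (!is.null(query)) {
    url <- paste0(url, "&q=", utils::URLencode(query, reserved = TRUE))
  }
  url
}

build_search_url(sandbox = TRUE, query = "arabidopsis")
```

Keeping query as NULL by default means no q parameter is sent at all, which matches the documented behaviour of listing every dataset in the community.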

Value

A tibble with dataset information. Basic columns (always returned):

  • record_id: Zenodo record ID

  • title: Dataset title

  • doi: Dataset DOI

  • version: Version string

  • created: Creation date

  • description: Short description

Additional columns when full = TRUE:

  • name: TSP package name

  • profile_version: TSP specification version

  • predictors: List of prediction sources (e.g., "alphafold3", "boltz2")

  • structure_count: Number of structures in dataset

  • formats: List of structure formats (e.g., "pdb", "cif")

  • total_size_bytes: Total dataset size in bytes

  • licenses: List of license information

  • contributors: List of contributors

  • tsp_created: TSP package creation timestamp
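
Columns described as lists (predictors, formats, licenses, contributors) are presumably list-columns holding one vector per dataset, so they need element-wise handling rather than a plain comparison. The base R sketch below shows one way to do that on a mock table. The values are invented for illustration, and the assumption that each predictors entry is a character vector is not confirmed by this page.

```r
# Mock of the result shape described above; all values are made up.
datasets <- data.frame(
  record_id = c(101L, 102L, 103L),
  structure_count = c(250L, 40L, 1200L),
  total_size_bytes = c(5.2e8, 7.5e7, 3.1e9)
)

# Assumption: each list-column entry is a character vector.
datasets$predictors <- list(
  c("alphafold3", "boltz2"),
  "boltz2",
  "alphafold3"
)

# Test each row's vector separately; `==` on a list-column would not work here.
uses_af3 <- vapply(
  datasets$predictors,
  function(p) "alphafold3" %in% p,
  logical(1)
)
af3 <- datasets[uses_af3, ]

# Convert sizes to gigabytes for display (1 GB taken as 1e9 bytes).
af3$size_gb <- af3$total_size_bytes / 1e9
af3[, c("record_id", "structure_count", "size_gb")]
```

The has_predictor() helper used in the Examples likely wraps a similar membership test, though its implementation is not shown here.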

Examples

if (FALSE) { # \dontrun{
# List datasets from production community
list_datasets()

# List datasets from sandbox (for testing)
list_datasets(sandbox = TRUE)

# Search for datasets
list_datasets(query = "arabidopsis")

# Get full metadata, then keep datasets that include AlphaFold 3 predictions
# (filter() comes from dplyr)
list_datasets(full = TRUE) |>
  dplyr::filter(has_predictor(predictors, "alphafold3"))
} # }