tsp-maker
Create TSP (TSL Structure Package) datasets from protein structure predictions.
tsp-maker is a command-line tool that converts outputs from structure prediction tools (AlphaFold2, AlphaFold3, Boltz2) into the TSP format for distribution via Zenodo.
What is TSP?
TSP (TSL Structure Package) is a data standard for distributing protein structure prediction datasets. It extends the Frictionless Data Package specification for structural biology.
A TSP package contains:
- metadata.parquet - Per-structure statistics and annotations
- structures/ - Batched structure archives (PDB/mmCIF)
- predictions/scores.parquet - Prediction confidence scores
- predictions/pae/ - PAE (Predicted Aligned Error) matrices
- datapackage.json - Package manifest
TSP packages are consumed by the tslstructures R package.
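The layout listed above can be checked with a few file tests. The following is a minimal sketch, not part of tsp-maker: `check_tsp_layout` is a hypothetical helper, and it assumes all five entries are required in every package, which this overview does not state (PAE matrices, for example, may be optional).

```shell
#!/usr/bin/env bash
# Sketch only: check_tsp_layout is a hypothetical helper, not a tsp-maker command.
# Assumption: every entry listed in the package description must be present.
check_tsp_layout() {
  local pkg="$1" missing=0 entry

  # Regular files named in the package description
  for entry in metadata.parquet predictions/scores.parquet datapackage.json; do
    if [ ! -f "$pkg/$entry" ]; then
      echo "missing file: $entry" >&2
      missing=1
    fi
  done

  # Directories named in the package description
  for entry in structures predictions/pae; do
    if [ ! -d "$pkg/$entry" ]; then
      echo "missing directory: $entry" >&2
      missing=1
    fi
  done

  # 0 when everything was found, 1 otherwise
  return "$missing"
}
```

For example, `check_tsp_layout /my-dataset && echo "layout looks complete"` prints the message only when every entry is found. This checks presence only; it says nothing about whether the contents conform to the TSP standard.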
Workflow Overview
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Predictor    │     │  Intermediate   │     │       TSP       │
│     Outputs     │────▶│     Format      │────▶│     Package     │
│  (AF2/AF3/BZ2)  │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
  tsp-maker parse         tsp-maker build        tsp-maker upload
                                                         │
                                                         ▼
                                                ┌─────────────────┐
                                                │     Zenodo      │
                                                │ tsl-structures  │
                                                │    community    │
                                                └─────────────────┘
Quick Example
# Parse AlphaFold3 outputs
tsp-maker parse af3 /data/af3_predictions /intermediate

# Build TSP package
tsp-maker build /intermediate /my-dataset \
    --name my-structures \
    --title "My Structure Dataset"

# Upload to Zenodo
tsp-maker upload /my-dataset --token "$ZENODO_TOKEN" --publish
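When the three steps run unattended, it can help to stop at the first failure so that a broken parse or build never reaches Zenodo. The sketch below chains the same commands and flags as the example; `run_pipeline` is a hypothetical wrapper, not something tsp-maker provides, and it assumes each tsp-maker command exits non-zero on failure.

```shell
#!/usr/bin/env bash
# Sketch only: run_pipeline is a hypothetical wrapper around the commands above.
# Assumption: tsp-maker returns a non-zero exit status when a step fails.
run_pipeline() {
  local src="$1" work="$2" out="$3"

  # Refuse to start without a token, so parse and build are not wasted
  # on a run that cannot upload.
  if [ -z "${ZENODO_TOKEN:-}" ]; then
    echo "ZENODO_TOKEN is not set" >&2
    return 1
  fi

  # Each step consumes the previous step's output directory;
  # && stops the chain at the first failing command.
  tsp-maker parse af3 "$src" "$work" &&
    tsp-maker build "$work" "$out" \
      --name my-structures \
      --title "My Structure Dataset" &&
    tsp-maker upload "$out" --token "$ZENODO_TOKEN" --publish
}
```

Calling `run_pipeline /data/af3_predictions /intermediate /my-dataset` reproduces the example above. Because `--publish` is passed, a successful run publishes the record, so dropping that flag while testing is worth considering.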
Features
- Multi-predictor support - AlphaFold2, AlphaFold3, Boltz2
- Automatic batching - Structures grouped into manageable archives
- PAE extraction - Full PAE matrices preserved
- Zenodo integration - Direct upload with metadata
- Validation - Check TSP package conformance
Getting Started
- Install tsp-maker
- Follow the Quick Start guide
- Read the command reference