Boltz2 Format¶
Expected input format for tsp-maker parse boltz2.
Directory Structure¶
boltz2_outputs/
├── P12345/                      ← Folder name becomes protein ID
│   └── boltz_results_P12345/
│       ├── predictions/
│       │   └── P12345/
│       │       ├── P12345_model_0.pdb
│       │       ├── P12345_model_1.pdb
│       │       ├── confidence_P12345_model_0.json
│       │       ├── confidence_P12345_model_1.json
│       │       ├── plddt_P12345_model_0.npz
│       │       ├── plddt_P12345_model_1.npz
│       │       ├── pae_P12345_model_0.npz
│       │       └── pae_P12345_model_1.npz
│       └── processed/
│           └── manifest.json
├── Q67890/                      ← Folder name becomes protein ID
│   └── ...
└── ...
Folder Names = Protein IDs
The top-level folder name (e.g., P12345) becomes the protein ID in your dataset. Ensure folders are named with the identifiers you want. See Protein ID Rules.
Directory Detection¶
The parser looks for Boltz2 outputs in this order:
- A directory named boltz_results_*
- A directory containing predictions/
- The input directory itself, if it matches either of the above
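The lookup order above can be sketched as follows. This is a minimal illustration, not the parser's actual code: the function name find_boltz_results_dir is hypothetical, and it assumes that the first two rules apply to direct children of the input directory.

```python
from pathlib import Path
from typing import Optional


def find_boltz_results_dir(input_dir: Path) -> Optional[Path]:
    """Return the directory holding Boltz2 results, or None (hypothetical helper)."""
    # 1. A child directory named boltz_results_* (assumed: direct children only)
    for child in sorted(input_dir.glob("boltz_results_*")):
        if child.is_dir():
            return child
    # 2. A child directory that contains predictions/
    for child in sorted(p for p in input_dir.iterdir() if p.is_dir()):
        if (child / "predictions").is_dir():
            return child
    # 3. The input directory itself, if it matches either rule
    if input_dir.name.startswith("boltz_results_") or (input_dir / "predictions").is_dir():
        return input_dir
    return None
```

Calling find_boltz_results_dir(Path("boltz2_outputs/P12345")) on the layout shown earlier would return the boltz_results_P12345 folder under this sketch.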
Required Files¶
Model Files¶
For each model:
| File | Description |
|---|---|
| {name}_model_{i}.pdb | Structure file |
| confidence_{name}_model_{i}.json | Confidence metrics |
Optional Files¶
| File | Description |
|---|---|
| plddt_{name}_model_{i}.npz | Per-residue pLDDT |
| pae_{name}_model_{i}.npz | PAE matrix |
| pde_{name}_model_{i}.npz | PDE matrix |
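To check a prediction folder against the required and optional file patterns before parsing, something like the following could be used. It is a sketch only: check_model_files is a hypothetical helper and not part of tsp-maker, and it assumes all files for a model sit in the same predictions/{name}/ folder.

```python
from pathlib import Path

# Prefixes of the optional .npz files listed above
OPTIONAL_PREFIXES = ("plddt", "pae", "pde")


def check_model_files(pred_dir: Path, name: str, i: int) -> dict:
    """Report missing required files and present optional files for model i."""
    stem = f"{name}_model_{i}"
    required = {
        "structure": pred_dir / f"{stem}.pdb",
        "confidence": pred_dir / f"confidence_{stem}.json",
    }
    missing = [key for key, path in required.items() if not path.is_file()]
    optional = [
        prefix
        for prefix in OPTIONAL_PREFIXES
        if (pred_dir / f"{prefix}_{stem}.npz").is_file()
    ]
    return {"missing_required": missing, "optional_present": optional}
```

An empty missing_required list means both required files were found for that model.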
confidence JSON¶
{
  "confidence_score": 0.85,
  "ptm": 0.82,
  "iptm": 0.78,
  "complex_plddt": 85.2,
  "complex_pde": 1.2,
  "chains_ptm": [0.85, 0.80],
  "pair_chains_iptm": [[0.0, 0.78], [0.78, 0.0]]
}
manifest.json¶
Used to determine whether a prediction is a monomer or a multimer.
Ranking¶
Models are ranked by confidence_score:
- All *_model_*.pdb files are found
- The corresponding confidence JSON is loaded for each
- Models are sorted by confidence_score (descending)
- The top N models are extracted
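The ranking steps above could look roughly like this in code. This is an illustrative sketch rather than the parser's implementation: rank_models and top_n are hypothetical names, and skipping models that lack a confidence JSON is an assumption.

```python
import json
from pathlib import Path


def rank_models(pred_dir: Path, top_n: int) -> list:
    """Return (pdb_path, confidence_score) pairs, best first (hypothetical helper)."""
    scored = []
    for pdb in pred_dir.glob("*_model_*.pdb"):
        conf = pred_dir / f"confidence_{pdb.stem}.json"
        if not conf.is_file():
            continue  # assumption: models without a confidence JSON are skipped
        with conf.open() as fh:
            score = json.load(fh)["confidence_score"]
        scored.append((pdb, score))
    # Highest confidence_score first, then keep only the top N
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]
```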
Extracted Metrics¶
| Metric | Source |
|---|---|
| confidence_score | confidence JSON |
| ranking_score | Same as confidence_score |
| ptm | confidence JSON |
| iptm | confidence JSON (multimers) |
| complex_plddt | confidence JSON |
| complex_pde | confidence JSON |
| plddt_mean/min/median | From plddt npz |
| pae_mean/min/max | From pae npz |
Multimer Metrics¶
For multimers (detected from manifest or iptm > 0):
- protein_iptm
- ligand_iptm
- complex_iplddt
- complex_ipde
- chains_ptm
- pair_chains_iptm
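A sketch of the iptm > 0 check and the multimer metric lookup is shown below. The manifest-based check is omitted because the manifest layout is not described here. The helper names are hypothetical, and reading every multimer metric from the confidence JSON is an assumption.

```python
# Multimer-only metrics listed above
MULTIMER_KEYS = (
    "protein_iptm",
    "ligand_iptm",
    "complex_iplddt",
    "complex_ipde",
    "chains_ptm",
    "pair_chains_iptm",
)


def looks_like_multimer(confidence: dict) -> bool:
    """True when the confidence dict reports iptm > 0 (manifest check not covered)."""
    iptm = confidence.get("iptm")
    return iptm is not None and iptm > 0


def multimer_metrics(confidence: dict) -> dict:
    """Collect whichever multimer metrics are present (assumed to live in the JSON)."""
    if not looks_like_multimer(confidence):
        return {}
    return {key: confidence[key] for key in MULTIMER_KEYS if key in confidence}
```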
Example Command¶
Output Naming¶
Output files are named with a _BZ2 tag after the protein ID:
- P12345_BZ2_1.pdb
- P12345_BZ2.json
- P12345_BZ2_1.npy
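The naming pattern can be expressed as a small helper. This is a sketch inferred from the three example names only: output_names is hypothetical, and it assumes the trailing number is a 1-based model rank and that the JSON file is shared by all models of a protein.

```python
def output_names(protein_id: str, rank: int) -> dict:
    """Build output file names for one ranked model (hypothetical helper)."""
    return {
        "structure": f"{protein_id}_BZ2_{rank}.pdb",
        "metrics": f"{protein_id}_BZ2.json",  # assumed: one JSON per protein
        "array": f"{protein_id}_BZ2_{rank}.npy",
    }
```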
Ligand Support¶
Boltz2 supports protein-ligand predictions. Ligand information is preserved in the confidence metrics, while the structure files remain in standard PDB format.