Loading Data

Functions for loading protein sequences and effector prediction scores from FASTA files and omnieff output.

load_proteins()
Load proteins from multiple assemblies
load_fasta()
Load a single FASTA file
load_scores()
Load effector scores from CSV
load_name_mapping()
Load name mapping from CSV
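
To illustrate what loading a FASTA file involves, here is a minimal base R sketch. It is not the package's load_fasta() implementation; the helper name read_fasta_sketch() and the example records are hypothetical, and it assumes each record ID is the header text up to the first whitespace.

```r
# Hypothetical helper, for illustration only (not part of the package).
# Takes the lines of a FASTA file and returns a named character vector
# of sequences, one element per record.
read_fasta_sketch <- function(lines) {
  lines <- lines[nzchar(lines)]                 # drop empty lines
  is_header <- startsWith(lines, ">")           # header lines start with ">"
  record_id <- cumsum(is_header)                # running record index per line
  headers <- sub("^>", "", lines[is_header])

  # Group sequence lines by record; records with no sequence lines stay empty
  groups <- split(
    lines[!is_header],
    factor(record_id[!is_header], levels = seq_along(headers))
  )
  seqs <- vapply(groups, paste, character(1), collapse = "")

  # Assumption: the ID is the header up to the first whitespace
  names(seqs) <- sub("\\s.*$", "", headers)
  seqs
}

fasta_lines <- c(">prot1 example description", "MKT", "LLA", ">prot2", "MSS")
seqs <- read_fasta_sketch(fasta_lines)
print(seqs)
```

For a real file, the lines could come from readLines(path); the package function may return a richer object than a plain character vector.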

Creating Objects

Constructors for the core S3 classes used throughout the package.

new_protein_set()
Create a protein_set object
new_protein_collection()
Create a protein_collection object
new_orthogroup_result()
Create an orthogroup_result object
new_pa_matrix()
Create a pa_matrix object
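
The new_*() naming follows a common R convention for low-level S3 constructors. The sketch below shows that general pattern with a made-up class, toy_set; the class name, its fields, and the print method are assumptions for illustration and do not describe the actual structure of protein_set or the other classes listed above.

```r
# Hypothetical S3 constructor, for illustration only.
new_toy_set <- function(sequences = character(), assembly = character(1)) {
  # Validate inputs before attaching the class attribute
  stopifnot(
    is.character(sequences),
    is.character(assembly),
    length(assembly) == 1
  )
  structure(
    list(sequences = sequences, assembly = assembly),
    class = "toy_set"
  )
}

# A print method dispatches on the class set by the constructor
print.toy_set <- function(x, ...) {
  cat("<toy_set>", length(x$sequences), "sequences from", x$assembly, "\n")
  invisible(x)
}

x <- new_toy_set(c(p1 = "MKT", p2 = "MSS"), assembly = "asmA")
print(x)
```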

Clustering

Functions for clustering proteins into orthogroups using sequence similarity.

cluster_proteins()
Cluster proteins across assemblies

Matrix Building

Build presence/absence matrices from clustering results.

build_pa_matrix()
Build presence/absence matrix from clustering results
filter_by_score()
Filter presence/absence matrix by score
`[`(<pa_matrix>)
Subset a pa_matrix object
as.data.frame(<pa_matrix>)
Convert pa_matrix to data.frame
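
As a rough picture of what a presence/absence matrix encodes, the base R sketch below builds one from a toy clustering table. It does not use build_pa_matrix(); the data frame and its column names (orthogroup, assembly) are assumptions, and the real pa_matrix object may carry additional metadata.

```r
# Hypothetical clustering output: one row per clustered protein
clusters <- data.frame(
  orthogroup = c("OG1", "OG1", "OG1", "OG2", "OG2", "OG3"),
  assembly   = c("asmA", "asmB", "asmB", "asmA", "asmC", "asmC"),
  stringsAsFactors = FALSE
)

# Count proteins per orthogroup (rows) and assembly (columns)
counts <- table(clusters$orthogroup, clusters$assembly)

# Collapse counts to presence (1) / absence (0)
pa <- unclass(counts) > 0
storage.mode(pa) <- "integer"

print(pa)
```

Rows are orthogroups and columns are assemblies; a cell is 1 when the assembly contributes at least one protein to that orthogroup.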

Singletons

Functions for working with unclustered proteins (singletons).

get_singletons()
Get singletons from clustering result
n_singletons()
Count singletons
singletons_by_assembly()
Summarize singletons by assembly
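
The idea of a singleton can be sketched in base R as a protein whose cluster has exactly one member. This is an illustration under assumed column names (protein, cluster, assembly), not the logic of get_singletons() or singletons_by_assembly().

```r
# Hypothetical clustering table: one row per protein
clusters <- data.frame(
  protein  = c("p1", "p2", "p3", "p4", "p5"),
  cluster  = c("c1", "c1", "c2", "c3", "c4"),
  assembly = c("asmA", "asmB", "asmA", "asmB", "asmB"),
  stringsAsFactors = FALSE
)

# Size of the cluster each protein belongs to
cluster_size <- ave(seq_len(nrow(clusters)), clusters$cluster, FUN = length)

# Singletons are proteins alone in their cluster
singletons <- clusters[cluster_size == 1, ]

# Tally singletons per assembly
per_assembly <- table(singletons$assembly)

print(singletons)
print(per_assembly)
```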

Visualization

Publication-ready plots for exploring presence/absence patterns.

plot_heatmap()
Plot presence/absence heatmap
plot_upset()
Plot UpSet diagram of orthogroup sharing
plot_scores()
Plot effector score distributions
plot_dendro()
Plot assembly clustering dendrogram
plot_pan_structure()
Plot pan-genome structure
plot_assembly_composition()
Plot assembly composition

Utilities

Helper functions for locating external tools and checking that they are installed.

check_tool_installed()
Check if an external tool is installed
find_tool()
Find an external tool
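
One common way to check for an external program from R is base R's Sys.which(), which returns the path of an executable on the PATH or an empty string when none is found. The sketch below shows that approach; the helper name tool_available() is hypothetical, and whether check_tool_installed() or find_tool() work this way is an assumption.

```r
# Hypothetical helper, for illustration only (not part of the package).
# Returns TRUE if `tool` resolves to an executable on the PATH.
tool_available <- function(tool) {
  path <- Sys.which(tool)      # "" when the tool cannot be found
  unname(nzchar(path))
}

# A name chosen to be very unlikely to exist on any system
missing <- tool_available("definitely-not-a-real-tool-xyz123")
print(missing)
```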