Tools (gd.tl)

The tools module provides functions for model training, computing projections, dimensionality reduction, imputation, and differential analysis.

Model Training

`gedi`	Run GEDI batch correction and dimensionality reduction.

gedi2py.tools.gedi(adata, batch_key, *, n_latent=10, layer=None, layer2=None, max_iterations=100, track_interval=5, mode='Bsphere', ortho_Z=True, C=None, H=None, key_added='gedi', random_state=None, verbose=True, n_jobs=-1, copy=False)

Run GEDI batch correction and dimensionality reduction.

Gene Expression Decomposition for Integration (GEDI) learns shared metagenes and sample-specific factors for batch effect correction.

Parameters:

adata (AnnData) – Annotated data matrix with cells as observations.
batch_key (str) – Key in adata.obs for batch/sample labels.
n_latent (int, default: 10) – Number of latent factors (K).
layer (str | None, default: None) – Layer to use instead of adata.X. If None, uses adata.X. For paired data (e.g., CITE-seq), this is the first count matrix.
layer2 (str | None, default: None) – Second layer for paired count data (M_paired mode). When specified along with layer, GEDI models the log-ratio: Yi = log((M1+1)/(M2+1)). This is useful for CITE-seq ADT/RNA ratios or similar paired assays.
max_iterations (int, default: 100) – Maximum number of optimization iterations.
track_interval (int, default: 5) – Interval for tracking convergence metrics.
mode (Literal['Bl2', 'Bsphere'], default: 'Bsphere') – Normalization mode for B matrices: “Bsphere” (recommended) or “Bl2”.
ortho_Z (bool, default: True) – Whether to orthogonalize Z matrix.
C (ndarray | None, default: None) – Gene × pathway prior matrix for pathway analysis. Optional.
H (ndarray | None, default: None) – Covariate × sample prior matrix. Optional.
key_added (str, default: 'gedi') – Base key for storing results. Results are stored as:
- adata.obsm[f'X_{key_added}']: Cell embeddings
- adata.varm[f'{key_added}_Z']: Gene loadings
- adata.uns[key_added]: Parameters and metadata
random_state (int | None, default: None) – Random seed for reproducibility. If None, uses global settings.
verbose (bool, default: True) – Whether to print progress messages.
n_jobs (int, default: -1) – Number of parallel jobs. -1 uses all available cores.
copy (bool, default: False) – Whether to return a copy of adata.

Return type:

AnnData | None

Returns:

Returns ``None`` if ``copy=False``, else returns an AnnData object.
Sets the following fields:
``.obsm['X_gedi']`` (numpy.ndarray) – Cell embeddings (n_cells × n_latent).
``.varm['gedi_Z']`` (numpy.ndarray) – Shared metagenes (n_genes × n_latent).
``.uns['gedi']`` (dict) – Model parameters and metadata.

Examples

Standard usage with log-transformed data:

>>> import gedi2py as gd
>>> import scanpy as sc
>>> adata = sc.read_h5ad("data.h5ad")
>>> gd.tl.gedi(adata, batch_key="sample", n_latent=10)
>>> sc.pp.neighbors(adata, use_rep="X_gedi")
>>> sc.tl.umap(adata)
>>> gd.pl.embedding(adata, color="sample")

Paired data mode (e.g., CITE-seq with two count layers):

>>> # adata.layers['adt'] = ADT counts
>>> # adata.layers['rna'] = RNA counts (for same features)
>>> gd.tl.gedi(
...     adata,
...     batch_key="sample",
...     layer="adt",
...     layer2="rna",
...     n_latent=10
... )

Example

import gedi2py as gd

# Basic usage
gd.tl.gedi(adata, batch_key="sample", n_latent=10)

# With more options
gd.tl.gedi(
    adata,
    batch_key="sample",
    n_latent=20,
    max_iterations=200,
    mode="Bsphere",
    ortho_Z=True,
    key_added="gedi",
)

# Paired data mode (e.g., CITE-seq with ADT/RNA counts)
# GEDI models the log-ratio: Yi = log((M1+1)/(M2+1))
gd.tl.gedi(
    adata,
    batch_key="sample",
    layer="adt",       # First count matrix (numerator)
    layer2="rna",      # Second count matrix (denominator)
    n_latent=10,
)
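
The log-ratio used in paired mode can be reproduced on toy data with plain NumPy. This is a minimal sketch of the formula Yi = log((M1+1)/(M2+1)) only; the two small matrices are made-up values standing in for the "adt" and "rna" layers, and nothing here calls gedi2py itself.

```python
import numpy as np

# Hypothetical paired count matrices (cells x features) with identical shapes.
m1 = np.array([[0, 3, 10], [5, 0, 2]], dtype=float)  # stands in for adata.layers["adt"]
m2 = np.array([[1, 3, 0], [5, 9, 2]], dtype=float)   # stands in for adata.layers["rna"]

# Log-ratio with a pseudocount of 1 added to both matrices.
y = np.log((m1 + 1.0) / (m2 + 1.0))

# Equivalent form as a difference of log1p terms.
y_alt = np.log1p(m1) - np.log1p(m2)
```

Entries where both counts are equal map to 0, positive values mean the first layer dominates, and negative values mean the second layer dominates.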

Stored Results

adata.obsm['X_gedi']: Cell embeddings (n_cells × n_latent)
adata.varm['gedi_Z']: Gene loadings (n_genes × n_latent)
adata.uns['gedi']: Model parameters and metadata

Projections

`get_projection`	Compute and retrieve GEDI projections.
`compute_zdb`	Compute ZDB (shared manifold) projection.
`compute_db`	Compute DB (latent factor embedding) projection.
`compute_adb`	Compute ADB (pathway activity) projection.

gedi2py.tools.get_projection(adata, which='zdb', *, key='gedi', key_added=None, copy=False)

Compute and retrieve GEDI projections.

Projections transform the learned GEDI parameters into interpretable representations:

ZDB: Shared manifold projection (genes × cells) = Z @ diag(D) @ B
DB: Latent factor embedding (K × cells) = diag(D) @ B
ADB: Pathway activity projection (pathways × cells) = C_rot @ A @ diag(D) @ B

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results in .uns[key].
which (Literal['zdb', 'db', 'adb'], default: 'zdb') – Which projection to compute: "zdb", "db", or "adb".
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
key_added (str | None, default: None) – Key to store the result in adata.obsm. If None, defaults to X_{key}_{which} (e.g., X_gedi_zdb).
copy (bool, default: False) – If True, return a copy of the projection array instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns the projection as a numpy array.
Otherwise, stores the result in ``adata.obsm[key_added]`` and returns ``None``.

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", n_latent=10)
>>> gd.tl.get_projection(adata, "zdb")
>>> adata.obsm["X_gedi_zdb"]  # (n_cells, n_genes)

gedi2py.tools.compute_zdb(adata, *, key='gedi', key_added=None, copy=False)

Compute ZDB (shared manifold) projection.

Alias for get_projection(adata, "zdb", ...).

See get_projection() for full documentation.

Return type: AnnData | ndarray | None

gedi2py.tools.compute_db(adata, *, key='gedi', key_added=None, copy=False)

Compute DB (latent factor embedding) projection.

Alias for get_projection(adata, "db", ...).

See get_projection() for full documentation.

Return type: AnnData | ndarray | None

gedi2py.tools.compute_adb(adata, *, key='gedi', key_added=None, copy=False)

Compute ADB (pathway activity) projection.

Alias for get_projection(adata, "adb", ...).

See get_projection() for full documentation.

Return type: AnnData | ndarray | None

Projection Types

Type	Shape	Description
zdb	(n_genes, n_cells)	Full projection: shared manifold
db	(n_latent, n_cells)	Latent factors: batch-corrected cell embeddings
adb	(n_pathways, n_cells)	Pathway activity scores (requires C matrix)
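
The three projections are plain matrix products, so their shapes can be checked on random stand-in matrices. The sketch below is illustrative only: the sizes and the shapes chosen for C_rot and A are assumptions picked so that the products conform, and n_rot is a hypothetical name. The final transposes reflect that AnnData requires arrays in adata.obsm to have cells on the first axis, which matches the (n_cells, n_genes) shape shown for X_gedi_zdb.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_cells, k = 50, 200, 5
n_pathways, n_rot = 8, 6  # n_rot is a hypothetical inner dimension

Z = rng.normal(size=(n_genes, k))             # shared metagenes
D = rng.uniform(0.5, 2.0, size=k)             # per-factor scaling
B = rng.normal(size=(k, n_cells))             # cell loadings
C_rot = rng.normal(size=(n_pathways, n_rot))  # assumed shape
A = rng.normal(size=(n_rot, k))               # assumed shape

# diag(D) @ B without building the diagonal matrix explicitly.
DB = D[:, None] * B        # (k, n_cells)
ZDB = Z @ DB               # (n_genes, n_cells)
ADB = C_rot @ A @ DB       # (n_pathways, n_cells)

# Cell-first layout, as needed for storage in adata.obsm.
zdb_obsm, db_obsm, adb_obsm = ZDB.T, DB.T, ADB.T
```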

Example

# Get DB projection (latent factors)
gd.tl.get_projection(adata, which="db")
db = adata.obsm['X_gedi_db']

# Compute ZDB (full projection) and return it instead of storing it
zdb = gd.tl.compute_zdb(adata, copy=True)

Embeddings

`svd`	Compute factorized SVD from GEDI decomposition.
`pca`	Compute PCA coordinates from GEDI decomposition.
`umap`	Compute UMAP embedding from GEDI results.

gedi2py.tools.svd(adata, *, key='gedi', copy=False)

Compute factorized SVD from GEDI decomposition.

Computes SVD while preserving GEDI’s factorized structure: SVD(Z) × SVD(middle) × SVD(DB). This maintains biological interpretability by respecting the decomposition structure.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results in .uns[key].
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
copy (bool, default: False) – If True, return the SVD result as a dict instead of storing in adata.

Return type:

dict | None

Returns:

If ``copy=True``, returns dict with keys ``'d'``, ``'u'``, ``'v'``.
Otherwise, stores results in ``adata.uns[key]['svd']`` and returns ``None``.
The SVD components are:
- d: Singular values (K,)
- u: Left singular vectors (n_genes, K) - gene loadings
- v: Right singular vectors (n_cells, K) - cell embeddings

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", n_latent=10)
>>> gd.tl.svd(adata)
>>> adata.uns["gedi"]["svd"]["d"]  # singular values
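
One way to read "SVD(Z) × SVD(middle) × SVD(DB)" is the standard trick of decomposing each thin factor separately and then decomposing the small K × K matrix that links them. The sketch below shows that idea on random stand-in matrices; it is an assumption about the general technique, not a copy of the gedi2py implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_cells, k = 40, 120, 5
Z = rng.normal(size=(n_genes, k))
DB = rng.uniform(0.5, 2.0, size=(k, 1)) * rng.normal(size=(k, n_cells))

# Thin SVDs of the two outer factors.
uz, sz, vzt = np.linalg.svd(Z, full_matrices=False)    # (J, K), (K,), (K, K)
ub, sb, vbt = np.linalg.svd(DB, full_matrices=False)   # (K, K), (K,), (K, N)

# Small K x K "middle" matrix: diag(sz) @ vzt @ ub @ diag(sb).
middle = (sz[:, None] * vzt) @ (ub * sb[None, :])
um, d, vmt = np.linalg.svd(middle)

# Recombine into the SVD of Z @ DB.
u = uz @ um          # (n_genes, K) left singular vectors
v = (vmt @ vbt).T    # (n_cells, K) right singular vectors
```

The full genes × cells matrix is never decomposed directly; only the two thin factors and the K × K middle matrix are.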

gedi2py.tools.pca(adata, *, n_components=None, key='gedi', key_added=None, copy=False)

Compute PCA coordinates from GEDI decomposition.

PCA coordinates are computed as V @ diag(d) from the factorized SVD, where V are the right singular vectors (cell embeddings).

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results in .uns[key].
n_components (int | None, default: None) – Number of PCs to compute. If None, uses all K latent factors.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
key_added (str | None, default: None) – Key to store PCA in adata.obsm. Defaults to X_{key}_pca.
copy (bool, default: False) – If True, return the PCA coordinates instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns PCA coordinates as numpy array (n_cells, n_components).
Otherwise, stores in ``adata.obsm[key_added]`` and returns ``None``.

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", n_latent=10)
>>> gd.tl.pca(adata, n_components=5)  # at most K = n_latent components
>>> adata.obsm["X_gedi_pca"]
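
The relation "PCA coordinates = V @ diag(d)" can be checked on a toy low-rank matrix. In this sketch M is a random rank-K stand-in for the ZDB projection, and no extra centering is applied; both are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes, n_cells, k = 30, 100, 6

# Rank-K stand-in for the genes x cells projection.
M = rng.normal(size=(n_genes, k)) @ rng.normal(size=(k, n_cells))

u, d, vt = np.linalg.svd(M, full_matrices=False)
u, d, v = u[:, :k], d[:k], vt[:k].T   # keep the K non-trivial components

# Scale each right singular vector by its singular value, then truncate.
n_components = 3
pca_coords = (v * d)[:, :n_components]   # (n_cells, n_components)
```

Projecting the cells onto the left singular vectors (M.T @ u) gives the same coordinates, which is why V @ diag(d) can be used directly.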

gedi2py.tools.umap(adata, *, n_neighbors=15, min_dist=0.1, n_components=2, metric='euclidean', input_key='pca', key='gedi', key_added=None, random_state=None, copy=False)

Compute UMAP embedding from GEDI results.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
n_neighbors (int, default: 15) – Size of local neighborhood for UMAP.
min_dist (float, default: 0.1) – Minimum distance between points in the embedding.
n_components (int, default: 2) – Dimensionality of the UMAP embedding.
metric (str, default: 'euclidean') – Distance metric for neighbor search.
input_key (Literal['pca', 'db', 'zdb'], default: 'pca') – Which GEDI representation to use as input:
- "pca": PCA coordinates (default)
- "db": DB latent factor embedding
- "zdb": ZDB shared manifold projection
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
key_added (str | None, default: None) – Key to store UMAP in adata.obsm. Defaults to X_{key}_umap.
random_state (int | None, default: None) – Random seed for reproducibility. If None, uses settings.random_state.
copy (bool, default: False) – If True, return UMAP coordinates instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns UMAP coordinates as numpy array (n_cells, n_components).
Otherwise, stores in ``adata.obsm[key_added]`` and returns ``None``.

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", n_latent=10)
>>> gd.tl.umap(adata, n_neighbors=30)
>>> gd.pl.embedding(adata, basis="X_gedi_umap", color="cell_type")

Example

# Compute all embeddings
gd.tl.pca(adata)
gd.tl.umap(adata)

# Access results
pca_coords = adata.obsm['X_gedi_pca']
umap_coords = adata.obsm['X_gedi_umap']

Imputation

`impute`	Compute imputed expression values from GEDI model.
`variance`	Compute gene variance explained by GEDI model.
`dispersion`	Compute gene dispersion from GEDI model.

gedi2py.tools.impute(adata, *, samples=None, key='gedi', layer_added=None, copy=False)

Compute imputed expression values from GEDI model.

Imputed values are computed as the expected expression under the GEDI model: Y_imputed = Z @ Q_i @ D @ B_i + o + o_i + s_i for each sample.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
samples (list[int] | None, default: None) – List of sample indices to impute. If None, imputes all samples.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
layer_added (str | None, default: None) – Layer name to store imputed values. If None, defaults to {key}_imputed.
copy (bool, default: False) – If True, return the imputed matrix instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns imputed expression matrix (n_cells, n_genes).
Otherwise, stores in ``adata.layers[layer_added]`` and returns ``None``.

Notes

The imputed expression is computed sample-by-sample to preserve the sample-specific components (Q_i, o_i, s_i).

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", n_latent=10)
>>> gd.tl.impute(adata)
>>> adata.layers["gedi_imputed"]

gedi2py.tools.variance(adata, *, key='gedi', copy=False)

Compute gene variance explained by GEDI model.

Variance is computed across the imputed expression values, representing the systematic variation captured by the model.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
copy (bool, default: False) – If True, return variance vector instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns variance vector (n_genes,).
Otherwise, stores in ``adata.var['{key}_variance']`` and returns ``None``.

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", n_latent=10)
>>> gd.tl.variance(adata)
>>> adata.var["gedi_variance"]

gedi2py.tools.dispersion(adata, *, key='gedi', copy=False)

Compute gene dispersion from GEDI model.

Dispersion is computed as variance / mean from the imputed expression values. This ratio is the Fano factor (index of dispersion), not the squared coefficient of variation, which would be variance / mean².

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
copy (bool, default: False) – If True, return dispersion vector instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns dispersion vector (n_genes,).
Otherwise, stores in ``adata.var['{key}_dispersion']`` and returns ``None``.

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", n_latent=10)
>>> gd.tl.dispersion(adata)
>>> adata.var["gedi_dispersion"]
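
Per-gene variance and the variance / mean dispersion can be computed from any cells × genes matrix. The sketch below uses random positive values as a stand-in for the imputed layer and assumes population variance (ddof=0); it also computes variance / mean² for comparison, since the two ratios are easy to confuse.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for adata.layers["gedi_imputed"]: cells x genes, strictly positive.
imputed = rng.gamma(shape=2.0, scale=1.5, size=(500, 4))

mean = imputed.mean(axis=0)       # per-gene mean across cells
variance = imputed.var(axis=0)    # per-gene variance (ddof=0, an assumption)

fano = variance / mean            # variance / mean
cv2 = variance / mean**2          # squared coefficient of variation
```

The two quantities differ by a factor of the mean, so they only agree for genes whose mean is 1.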

Example

# Impute denoised expression
gd.tl.impute(adata)
imputed = adata.layers['gedi_imputed']

# Compute variance and dispersion
gd.tl.variance(adata)
gd.tl.dispersion(adata)

Differential Expression

`differential`	Compute differential expression effects from GEDI model.
`diff_q`	Compute cell-specific differential expression (diffQ).
`diff_o`	Compute global offset differential effect (diffO).

gedi2py.tools.differential(adata, contrast, *, mode='full', include_offset=True, key='gedi', key_added=None, copy=False)

Compute differential expression effects from GEDI model.

Uses the sample-level covariate matrix H and learned regression coefficients (Rk, Ro) to compute differential expression effects for a given contrast.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
contrast (ndarray | list) – Contrast vector of length L (number of covariates). Specifies the linear combination of covariate effects to compute.
mode (Literal['full', 'offset', 'metagene'], default: 'full') – Type of differential effect to compute:
- "full": Full cell-specific effect (diffQ, J × N matrix)
- "offset": Global gene offset effect (diffO, J vector)
- "metagene": Effect on metagene loadings
include_offset (bool, default: True) – If True and mode is "full", add the offset effect (diffO) to the cell-specific effect.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
key_added (str | None, default: None) – Key to store results. Defaults depend on mode:
- "full": adata.layers['{key}_diff']
- "offset": adata.var['{key}_diff_offset']
copy (bool, default: False) – If True, return the result instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

Depending on mode and copy:
- ``mode="full"``, ``copy=True``: (n_cells, n_genes) array
- ``mode="offset"``, ``copy=True``: (n_genes,) array
- Otherwise: ``None`` (stores in adata)

Notes

Differential effects require that a covariate matrix H was provided during model training via gd.tl.gedi(..., H=covariate_matrix).

The contrast vector defines a linear combination of covariate effects. For example, with covariates [treatment, sex], a contrast of [1, 0] computes the treatment effect, while [1, -1] computes the difference between the treatment and sex effects.

Examples

>>> import gedi2py as gd
>>> import numpy as np
>>> # Assuming H is (n_samples, 2) with [treatment, control]
>>> gd.tl.gedi(adata, batch_key="sample", H=H_matrix)
>>> contrast = np.array([1, -1])  # treatment - control
>>> gd.tl.differential(adata, contrast)
>>> adata.layers["gedi_diff"]  # (n_cells, n_genes)
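
A contrast is a weighted sum of per-covariate effects. The sketch below illustrates that on a small made-up coefficient matrix; R and contrast_effect are hypothetical names used for illustration and do not correspond to the Rk or Ro parameters stored by the model.

```python
import numpy as np

# Hypothetical per-gene coefficients: rows are genes, columns are covariates.
R = np.array([
    [0.5, 0.1],
    [-0.2, 0.3],
    [0.0, -0.4],
])

def contrast_effect(coefs, contrast):
    """Return the linear combination of covariate effects for each gene."""
    contrast = np.asarray(contrast, dtype=float)
    if contrast.shape != (coefs.shape[1],):
        raise ValueError("contrast must have one entry per covariate")
    return coefs @ contrast

first_only = contrast_effect(R, [1, 0])    # effect of covariate 0 alone
difference = contrast_effect(R, [1, -1])   # covariate 0 minus covariate 1
```

The length check mirrors the requirement that the contrast has length L, the number of covariates.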

gedi2py.tools.diff_q(adata, contrast, *, include_offset=True, key='gedi', key_added=None, copy=False)

Compute cell-specific differential expression (diffQ).

Alias for differential(adata, contrast, mode="full", ...).

See differential() for full documentation.

Return type: AnnData | ndarray | None

gedi2py.tools.diff_o(adata, contrast, *, key='gedi', key_added=None, copy=False)

Compute global offset differential effect (diffO).

Alias for differential(adata, contrast, mode="offset", ...).

See differential() for full documentation.

Return type: AnnData | ndarray | None

Example

import numpy as np

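# Create contrast over the covariates in H: covariate 0 vs covariate 1
# (the contrast has length L, the number of covariates, not the number of samples)
n_covariates = 2  # set to the number of covariates in the H passed to gd.tl.gedi
contrast = np.zeros(n_covariates)
contrast[0] = 1
contrast[1] = -1

# Compute differential expression (requires a model trained with H)
gd.tl.differential(adata, contrast=contrast)

# Access results (default key for mode="full")
de = adata.layers['gedi_diff']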

Pathway Analysis

`pathway_associations`	Compute pathway-gene associations from GEDI model.
`pathway_scores`	Compute per-cell pathway activity scores.
`top_pathway_genes`	Get top genes associated with a pathway.

gedi2py.tools.pathway_associations(adata, *, sparse_output=False, key='gedi', key_added=None, copy=False)

Compute pathway-gene associations from GEDI model.

Uses the pathway coefficient matrix A and gene loading matrix Z to compute associations between pathways and genes, accounting for the learned latent structure.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
sparse_output (bool, default: False) – If True, return a sparse matrix. Useful for large pathway sets.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
key_added (str | None, default: None) – Key to store results in adata.varm. Defaults to {key}_pathway_assoc.
copy (bool, default: False) – If True, return the association matrix instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns pathway association matrix (n_genes, n_pathways).
Otherwise, stores in ``adata.varm[key_added]`` and returns ``None``.

Notes

Pathway associations require that a gene-level prior matrix C was provided during model training via gd.tl.gedi(..., C=pathway_matrix).

The association matrix represents how strongly each gene is associated with each pathway through the learned latent factors.

Examples

>>> import gedi2py as gd
>>> # C is (n_genes, n_pathways) pathway membership matrix
>>> gd.tl.gedi(adata, batch_key="sample", C=pathway_matrix)
>>> gd.tl.pathway_associations(adata)
>>> adata.varm["gedi_pathway_assoc"]  # (n_genes, n_pathways)

gedi2py.tools.pathway_scores(adata, *, key='gedi', key_added=None, copy=False)

Compute per-cell pathway activity scores.

Uses the ADB projection (pathway activity) to compute pathway scores for each cell.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
key_added (str | None, default: None) – Key to store results in adata.obsm. Defaults to X_{key}_pathway_scores.
copy (bool, default: False) – If True, return the score matrix instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns pathway score matrix (n_cells, n_pathways).
Otherwise, stores in ``adata.obsm[key_added]`` and returns ``None``.

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", C=pathway_matrix)
>>> gd.tl.pathway_scores(adata)
>>> adata.obsm["X_gedi_pathway_scores"]  # (n_cells, n_pathways)

gedi2py.tools.top_pathway_genes(adata, pathway_idx, *, n_genes=20, key='gedi')

Get top genes associated with a pathway.

Returns the genes with the highest association scores for a given pathway, based on the pathway association matrix.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
pathway_idx (int) – Index of the pathway to query.
n_genes (int, default: 20) – Number of top genes to return.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.

Returns:

List of gene names with highest pathway associations.

Examples

>>> import gedi2py as gd
>>> gd.tl.pathway_associations(adata)
>>> top_genes = gd.tl.top_pathway_genes(adata, pathway_idx=0, n_genes=10)
>>> print(top_genes)
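
Selecting the top genes for a pathway amounts to sorting one column of the association matrix. The sketch below is a minimal illustration on made-up scores; the top_genes helper and the gene names are hypothetical, and it ranks by signed score, which is an assumption (ranking by absolute value is another reasonable choice).

```python
import numpy as np

gene_names = np.array(["A", "B", "C", "D", "E"])

# Hypothetical genes x pathways association matrix.
assoc = np.array([
    [0.1, 0.9],
    [0.7, -0.2],
    [0.3, 0.4],
    [0.9, 0.0],
    [-0.5, 0.8],
])

def top_genes(assoc, gene_names, pathway_idx, n_genes=20):
    """Return the names of the n_genes highest-scoring genes for one pathway."""
    scores = assoc[:, pathway_idx]
    order = np.argsort(scores)[::-1][:n_genes]  # indices sorted by descending score
    return gene_names[order].tolist()

top = top_genes(assoc, gene_names, pathway_idx=0, n_genes=3)
```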

Example

# Run GEDI with pathway prior
# (load_pathway_matrix is a placeholder for your own loading code)
C = load_pathway_matrix()  # genes × pathways
gd.tl.gedi(adata, batch_key="sample", C=C)

# Get pathway associations
gd.tl.pathway_associations(adata)

# Get top genes per pathway
top_genes = gd.tl.top_pathway_genes(adata, pathway_idx=0, n_genes=10)

Dynamics / Trajectory

`vector_field`	Compute vector field for trajectory between two conditions.
`gradient`	Compute gradient of pathway activity across cells.
`pseudotime`	Compute pseudotime ordering based on GEDI embeddings.

gedi2py.tools.vector_field(adata, start_contrast, end_contrast, *, n_steps=10, key='gedi', key_added=None, copy=False)

Compute vector field for trajectory between two conditions.

Uses the differential expression framework to compute a vector field representing the transcriptional trajectory from one condition to another.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
start_contrast (ndarray | list) – Contrast vector defining the starting condition.
end_contrast (ndarray | list) – Contrast vector defining the ending condition.
n_steps (int, default: 10) – Number of interpolation steps between conditions.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
key_added (str | None, default: None) – Key to store results. Defaults to {key}_vector_field.
copy (bool, default: False) – If True, return the vector field dict instead of storing in adata.

Return type:

AnnData | dict | None

Returns:

If ``copy=True``, returns dict with keys:
- vectors: (n_steps, n_genes) array of expression changes
- positions: (n_steps, L) array of contrast positions
Otherwise, stores in ``adata.uns[key_added]`` and returns ``None``.

Notes

The vector field represents the direction and magnitude of transcriptional change at each point along the trajectory from start to end condition.

This requires that a covariate matrix H was provided during model training.

Examples

>>> import gedi2py as gd
>>> import numpy as np
>>> # Define trajectory from control to treatment
>>> start = np.array([0, 1])  # control
>>> end = np.array([1, 0])    # treatment
>>> gd.tl.vector_field(adata, start, end, n_steps=20)
>>> adata.uns["gedi_vector_field"]
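
The documented output shapes suggest that the contrast is interpolated between the start and end vectors in n_steps steps. The sketch below shows one plausible reading under that assumption: positions are linearly spaced contrasts, and each vector is the linear effect of that contrast under a made-up coefficient matrix R. How gedi2py actually derives the vectors is not stated in this page, so treat this purely as an illustration of the shapes.

```python
import numpy as np

# Hypothetical genes x covariates coefficient matrix.
R = np.array([
    [0.5, 0.1],
    [-0.2, 0.3],
    [0.0, -0.4],
])

start = np.array([0.0, 1.0])  # e.g. control
end = np.array([1.0, 0.0])    # e.g. treatment
n_steps = 5

# Linearly interpolated contrasts, one row per step: (n_steps, L).
positions = np.linspace(start, end, n_steps)

# Effect of each interpolated contrast on every gene: (n_steps, n_genes).
vectors = positions @ R.T
```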

gedi2py.tools.gradient(adata, pathway_idx=None, *, key='gedi', copy=False)

Compute gradient of pathway activity across cells.

Uses the pathway coefficient matrix A and cell loadings B to compute the gradient of pathway activity in the latent space.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
pathway_idx (int | None, default: None) – Index of the pathway to compute gradient for. If None, computes gradients for all pathways.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
copy (bool, default: False) – If True, return the gradient array instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``:
- If pathway_idx is specified: (n_cells, K) gradient array
- If pathway_idx is None: (n_cells, K, n_pathways) gradient array
Otherwise, stores in ``adata.obsm['{key}_gradient']`` and returns ``None``.

Notes

The gradient represents the direction in latent space that maximally increases pathway activity. This can be used for trajectory analysis or identifying cells transitioning along a pathway.

This requires that a pathway prior matrix C was provided during model training.

Examples

>>> import gedi2py as gd
>>> gd.tl.gedi(adata, batch_key="sample", C=pathway_matrix)
>>> gd.tl.gradient(adata, pathway_idx=0)
>>> adata.obsm["gedi_gradient"]  # direction to increase pathway 0

gedi2py.tools.pseudotime(adata, start_cells, *, use_rep=None, key='gedi', key_added='gedi_pseudotime', copy=False)

Compute pseudotime ordering based on GEDI embeddings.

Uses diffusion-based pseudotime computation on the GEDI latent representation to order cells along a trajectory.

Parameters:

adata (AnnData) – Annotated data matrix with GEDI results.
start_cells (ndarray | list) – Indices or boolean mask of cells to use as trajectory start points.
use_rep (str | None, default: None) – Representation to use. Defaults to X_{key}_pca or X_{key}.
key (str, default: 'gedi') – Key in adata.uns where GEDI results are stored.
key_added (str, default: 'gedi_pseudotime') – Key to store pseudotime in adata.obs.
copy (bool, default: False) – If True, return pseudotime array instead of storing in adata.

Return type:

AnnData | ndarray | None

Returns:

If ``copy=True``, returns pseudotime array (n_cells,).
Otherwise, stores in ``adata.obs[key_added]`` and returns ``None``.

Notes

This is a simple geodesic distance-based pseudotime. For more sophisticated trajectory inference, consider using specialized tools like scanpy’s PAGA or scvelo.

Examples

>>> import gedi2py as gd
>>> # Use cells in cluster 0 as starting point
>>> start_cells = adata.obs["cluster"] == "0"
>>> gd.tl.pseudotime(adata, start_cells)
>>> adata.obs["gedi_pseudotime"]
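
A geodesic distance-based pseudotime, as mentioned in the Notes, can be sketched as shortest-path distance on a k-nearest-neighbor graph. The function below is a generic illustration under that assumption; geodesic_pseudotime, its n_neighbors parameter, and the scaling to [0, 1] are hypothetical choices and not the gedi2py implementation.

```python
import heapq
import numpy as np

def geodesic_pseudotime(X, start_cells, n_neighbors=5):
    """Shortest-path distance on a kNN graph from the start cells, scaled to [0, 1]."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    start_cells = np.asarray(start_cells)
    # Accept either a boolean mask or integer indices.
    starts = np.flatnonzero(start_cells) if start_cells.dtype == bool else start_cells.astype(int)

    # Dense pairwise Euclidean distances (adequate for a small sketch).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    knn = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]  # skip self at position 0

    # Symmetrized adjacency lists.
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in knn[i]:
            neighbors[i].add(int(j))
            neighbors[int(j)].add(i)

    # Multi-source Dijkstra from all start cells.
    best = np.full(n, np.inf)
    heap = []
    for s in starts:
        best[s] = 0.0
        heap.append((0.0, int(s)))
    heapq.heapify(heap)
    while heap:
        d, i = heapq.heappop(heap)
        if d > best[i]:
            continue  # stale heap entry
        for j in neighbors[i]:
            nd = d + dist[i, j]
            if nd < best[j]:
                best[j] = nd
                heapq.heappush(heap, (nd, j))

    scale = best[np.isfinite(best)].max()
    return best / scale if scale > 0 else best

# Toy embedding: cells laid out along a curve, starting at the first cell.
t = np.linspace(0, 1, 30)
X = np.column_stack([t, t**2])
pt = geodesic_pseudotime(X, start_cells=(t == 0))
```

Cells that cannot be reached from any start cell keep an infinite value, which makes disconnected groups easy to spot.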

Example

import numpy as np

# Define start and end contrasts
start = np.array([1, 0, 0, 0])
end = np.array([0, 0, 0, 1])

# Compute vector field
gd.tl.vector_field(adata, start_contrast=start, end_contrast=end)

# Compute pseudotime from a set of starting cells (boolean mask or indices)
start_cells = adata.obs["cluster"] == "0"
gd.tl.pseudotime(adata, start_cells)