Parses and processes spliced and unspliced gene expression matrices from one or more Velocyto output directories. The function applies barcode filtering using an external whitelist or filtered barcodes file, and optionally merges the results across samples into unified matrices.

Usage

make_velo_count(
  velocyto_dirs,
  sample_ids,
  whitelist_barcodes = NULL,
  use_internal_whitelist = TRUE,
  merge_counts = FALSE
)

Arguments

velocyto_dirs

A character vector or list of strings. Each element should be a path to a Velocyto output directory. Each directory must contain subdirectories (typically filtered or raw) with the required matrix files: spliced.mtx, unspliced.mtx, barcodes.tsv, and genes.tsv or features.tsv.
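
The layout described above can be sketched with a toy directory. This is a minimal illustration, not the function's own reader: the helper read_toy_matrix, the gene names, and the barcodes are hypothetical, and features.tsv is simplified to a single column (real files often carry several).

```r
library(Matrix)

# Build a toy directory that mimics the documented layout.
# The "raw" subdirectory and file names follow the argument description;
# genes, barcodes, and counts are invented for illustration.
toy_dir <- file.path(tempfile("velocyto_"), "raw")
dir.create(toy_dir, recursive = TRUE)

genes    <- c("GeneA", "GeneB", "GeneC")
barcodes <- c("AAAC", "AAAG", "AAAT", "AACA")

spliced <- sparseMatrix(
  i = c(1, 2, 3, 1), j = c(1, 2, 3, 4), x = c(5, 2, 7, 1),
  dims = c(length(genes), length(barcodes))
)
writeMM(spliced, file.path(toy_dir, "spliced.mtx"))
writeLines(barcodes, file.path(toy_dir, "barcodes.tsv"))
writeLines(genes, file.path(toy_dir, "features.tsv"))

# Hypothetical helper: read one matrix and attach gene and barcode names.
read_toy_matrix <- function(dir, file) {
  m <- as(readMM(file.path(dir, file)), "CsparseMatrix")
  rownames(m) <- readLines(file.path(dir, "features.tsv"))
  colnames(m) <- readLines(file.path(dir, "barcodes.tsv"))
  m
}

spliced_in <- read_toy_matrix(toy_dir, "spliced.mtx")
dim(spliced_in)
```

Genes are stored as rows and barcodes as columns here, which matches the usual 10x convention; whether make_velo_count returns the same orientation is an assumption.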

sample_ids

A character vector or list of unique sample identifiers corresponding to each entry in velocyto_dirs.

whitelist_barcodes

A list of character vectors. Each element provides the barcodes to retain for the corresponding sample. If NULL (default) and use_internal_whitelist = TRUE, the function attempts to use the filtered barcodes file found in the Velocyto directory.

use_internal_whitelist

Logical (default TRUE). If TRUE, and whitelist_barcodes is NULL, the function uses the filtered barcode file (if present in the directory). If FALSE, all barcodes from the raw matrix will be used unless a whitelist is explicitly provided.
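
The interaction between whitelist_barcodes and use_internal_whitelist can be pictured as a priority rule. The sketch below is one reading of the argument descriptions, not the package's implementation: select_barcodes is a hypothetical helper, and falling back to all raw barcodes when no filtered file exists is an assumption.

```r
library(Matrix)

# Hypothetical helper: decide which barcodes to keep for one sample.
# Priority, as read from the argument descriptions:
#   1. an explicit whitelist, if supplied
#   2. the filtered barcodes, if use_internal_whitelist is TRUE and they exist
#   3. otherwise every barcode in the raw matrix (assumed fallback)
select_barcodes <- function(raw_barcodes,
                            whitelist = NULL,
                            filtered_barcodes = NULL,
                            use_internal_whitelist = TRUE) {
  if (!is.null(whitelist)) {
    return(intersect(raw_barcodes, whitelist))
  }
  if (use_internal_whitelist && !is.null(filtered_barcodes)) {
    return(intersect(raw_barcodes, filtered_barcodes))
  }
  raw_barcodes
}

# Toy raw matrix: 2 genes x 4 barcodes (values invented for illustration).
raw_barcodes <- c("AAAC", "AAAG", "AAAT", "AACA")
counts <- sparseMatrix(
  i = c(1, 2, 1, 2), j = c(1, 2, 3, 4), x = c(3, 1, 4, 2),
  dims = c(2, 4),
  dimnames = list(c("GeneA", "GeneB"), raw_barcodes)
)

# No explicit whitelist, so the filtered barcodes are used.
keep <- select_barcodes(raw_barcodes, filtered_barcodes = c("AAAG", "AACA"))
filtered_counts <- counts[, keep, drop = FALSE]
colnames(filtered_counts)
```

Using intersect() keeps only barcodes that are actually present in the raw matrix, so a whitelist containing unknown barcodes does not cause a subsetting error.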

merge_counts

Logical (default FALSE). If TRUE, spliced and unspliced matrices across all samples are merged into two combined matrices (one for spliced, one for unspliced). If FALSE, the results are returned per sample.

Value

A list containing processed gene expression matrices:

  • If merge_counts = FALSE, returns a named list of sample-specific matrices. Each entry contains:

    spliced

    Sparse matrix of spliced transcript counts.

    unspliced

    Sparse matrix of unspliced transcript counts.

  • If merge_counts = TRUE, returns a list with two elements:

    spliced

    Merged sparse matrix of spliced counts across all samples.

    unspliced

    Merged sparse matrix of unspliced counts across all samples.
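
To show how the per-sample return shape might be consumed, the sketch below builds a mock of the merge_counts = FALSE result and computes the unspliced fraction for each sample. The sample IDs, genes, barcodes, and counts are invented, and toy_counts is a hypothetical helper; only the spliced and unspliced element names come from the description above.

```r
library(Matrix)

# Hypothetical helper: a 2 x 2 sparse count matrix with named dimensions.
toy_counts <- function(x, barcodes) {
  sparseMatrix(
    i = c(1, 2), j = c(1, 2), x = x,
    dims = c(2, 2),
    dimnames = list(c("GeneA", "GeneB"), barcodes)
  )
}

# Mock of the documented shape: one entry per sample, each holding
# `spliced` and `unspliced` sparse matrices.
res <- list(
  sampleA = list(
    spliced   = toy_counts(c(6, 2), c("AAAC", "AAAG")),
    unspliced = toy_counts(c(2, 2), c("AAAC", "AAAG"))
  ),
  sampleB = list(
    spliced   = toy_counts(c(3, 1), c("AAAC", "AATT")),
    unspliced = toy_counts(c(1, 3), c("AAAC", "AATT"))
  )
)

# Fraction of all counts that are unspliced, per sample.
unspliced_fraction <- vapply(res, function(s) {
  sum(s$unspliced) / (sum(s$spliced) + sum(s$unspliced))
}, numeric(1))
unspliced_fraction
```

With merge_counts = TRUE the same computation would apply directly to the two top-level elements, since the list then holds spliced and unspliced without the per-sample layer.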

Details

The function assumes that each Velocyto directory follows a 10x-like layout: raw and/or filtered subdirectories holding the count matrices together with their barcode and gene (or feature) files. Barcode filtering ensures that only high-quality or explicitly selected barcodes are retained for downstream RNA velocity analysis.

When merging matrices, barcodes are prefixed with their corresponding sample ID to avoid collisions and preserve traceability.
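
The prefixing step can be sketched as follows. This is an illustration under stated assumptions rather than the function's actual code: the "_" separator is assumed, and all samples are assumed to share the same genes in the same row order so that columns can be bound directly.

```r
library(Matrix)

# Two toy per-sample matrices that share the barcode "AAAC"
# (sample IDs, genes, barcodes, and counts are invented).
per_sample <- list(
  sampleA = sparseMatrix(
    i = c(1, 2), j = c(1, 2), x = c(4, 1), dims = c(2, 2),
    dimnames = list(c("GeneA", "GeneB"), c("AAAC", "AAAG"))
  ),
  sampleB = sparseMatrix(
    i = c(2, 1), j = c(1, 2), x = c(3, 2), dims = c(2, 2),
    dimnames = list(c("GeneA", "GeneB"), c("AAAC", "AATT"))
  )
)

# Prefix every barcode with its sample ID (assumed separator: "_").
prefixed <- Map(function(m, id) {
  colnames(m) <- paste(id, colnames(m), sep = "_")
  m
}, per_sample, names(per_sample))

# Bind columns; this relies on identical gene rows across samples.
merged <- do.call(cbind, unname(prefixed))
colnames(merged)
```

Without the prefix, the barcode AAAC would appear twice in the merged matrix and the two cells could no longer be told apart; with it, each column name also records which sample the cell came from.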

Dependencies

Requires the Matrix package for sparse matrix operations and data.table for efficient file parsing.
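
A quick way to confirm that both dependencies are available before calling the function is sketched below. It uses only base R; the variable names are illustrative.

```r
# Check that the packages named above can be loaded.
required <- c("Matrix", "data.table")
available <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!available]

if (length(missing_pkgs) > 0) {
  message("Install before use: ", paste(missing_pkgs, collapse = ", "))
}
```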