make_m2 (Integrated with Automatic Batching)

Creates the M2 matrix from a given m1_inclusion_matrix and eventdata, with automatic memory management. The function detects when the operation would exceed the configured limit (see memory_threshold) and switches to a batched sparse-matrix approach.
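
The switch described above can be pictured as a threshold comparison. The sketch below is illustrative only: choose_m2_strategy() and its estimated_rows input are hypothetical names that are not part of the package, and the way make_m2 actually estimates the size is not documented here. It only mirrors the documented roles of memory_threshold and force_fast.

```r
# Hypothetical helper, NOT part of the package: mirrors the documented
# behaviour of memory_threshold and force_fast.
choose_m2_strategy <- function(estimated_rows,
                               memory_threshold = 2e9,
                               force_fast = FALSE) {
  # force_fast bypasses the size check entirely
  if (isTRUE(force_fast)) {
    return("fast")
  }
  # Assumption: strictly exceeding the threshold triggers batching
  if (estimated_rows > memory_threshold) "batched" else "fast"
}

choose_m2_strategy(1e6)                     # small input: fast path
choose_m2_strategy(3e9)                     # above the default threshold: batched
choose_m2_strategy(3e9, force_fast = TRUE)  # size check skipped: fast path
```

Under these assumptions, force_fast = TRUE on a large input takes the fast path even though the estimate exceeds the threshold, which is why the argument carries a memory warning.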

Usage

make_m2(
  m1_inclusion_matrix,
  eventdata,
  batch_size = 5000,
  memory_threshold = 2e+09,
  force_fast = FALSE,
  multi_thread = FALSE,
  n_threads = 1,
  use_cpp = TRUE,
  verbose = FALSE
)

Arguments

m1_inclusion_matrix: A sparse matrix to be modified and used for creating the M2 matrix.
eventdata: A data.table containing event information with at least group_id and an index column.
batch_size: An integer specifying the number of groups to process per batch (default: 5000). Only used when batched processing is triggered.
memory_threshold: A numeric value giving the maximum number of rows allowed in the summary before switching to batched processing (default: 2e9, about 93% of 2^31, which keeps the fast path below the 32-bit integer limit).
force_fast: A logical flag to force fast processing regardless of size estimates (default: FALSE). WARNING: This may cause memory errors on large datasets.
multi_thread: A logical flag to enable parallel processing for batched operations (default: FALSE). Only used when batched processing is triggered. Requires the parallel package.
n_threads: An integer specifying the number of threads for C++ parallel processing (default: 1). Only used when use_cpp = TRUE.
use_cpp: A logical flag to use the fast C++ implementation (default: TRUE). If FALSE, the R implementation is used instead.
verbose: A logical flag for detailed progress reporting (default: FALSE).
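
The defaults for memory_threshold and batch_size can be checked with a little base R arithmetic. This is a sketch, not package code: the 12000 groups are an assumed figure chosen for illustration, and splitting group ids with split() is a generic idiom that may differ from how make_m2 forms its batches internally.

```r
# Largest value an R integer can hold: 2^31 - 1
int_max <- .Machine$integer.max

# Default memory_threshold expressed as a percentage of 2^31
threshold_pct <- round(2e9 / 2^31 * 100, 1)
threshold_pct                      # 93.1

# Assumed example: 12000 groups processed with the default batch_size
group_ids  <- seq_len(12000)
batch_size <- 5000

# Assign each group to a batch of at most batch_size groups
batches <- split(group_ids, ceiling(seq_along(group_ids) / batch_size))
unname(lengths(batches))           # 5000 5000 2000
```

With the toy data in the Examples section (16 groups), the same split would yield a single batch, so batch_size only matters for much larger inputs.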

Value

A sparse matrix M2 with the dummy row removed and proper adjustments made.

Examples

junction_abundance_object <- load_toy_SJ_object()
m1_obj <- make_m1(junction_ab_object = junction_abundance_object)
#> Starting M1 matrix creation...
#> Combined eventdata from 1 samples
#> Found 915 unique junctions
#> Creating coordinate groups...
#> Start coordinate alternative events: 25 
#> End coordinate alternative events: 31 
#> Combined eventdata has 56 alternative splicing events
#> Processing junction abundance matrices...
#> Applying count threshold filtering...
#> Filtered from 56 to 23 events
#> Events removed: 33 
#> Finished processing M1.
#> 
#> Summary:
#>   Input junctions: 915 
#>   Alternative splicing events: 56 
#>   Events passing threshold: 23 
#>   Total cells: 4120 

# Extract the M1 inclusion matrix and the eventdata
m1 <- m1_obj$m1_inclusion_matrix
eventdata <- m1_obj$event_data
m2 <- make_m2(m1_inclusion_matrix = m1, eventdata = eventdata)
#> Starting M2 matrix creation...
#> +-- Using C++ implementation for faster computation
#> |   |-- Events:  23 
#> |   |-- Cells:  4120 
#> |   +-- Groups:  16 
#> Finished M2 matrix creation.