Class for representing a group of TileDB groups/arrays that constitute a SOMA (stack of matrices, annotated). It includes the public fields listed below.

Public fields

obs

AnnotationDataframe object containing observation-aligned annotations

var

AnnotationDataframe object containing variable-aligned annotations

X

named list of AssayMatrix objects containing matrix-like assay data with string dimensions obs_id and var_id that align to the dimensions of the obs and var arrays, respectively.

obsm

named list of AnnotationMatrix objects aligned with obs

varm

named list of AnnotationMatrix objects aligned with var

obsp

named list of AnnotationPairwiseMatrix objects aligned with obs

varp

named list of AnnotationPairwiseMatrix objects aligned with var

uns

Named list of unstructured objects.

Methods

Inherited methods


Method new()

Create a new SOMA. If an array group already exists at the specified URI, it is opened; otherwise, a new array group is created.

Usage

SOMA$new(uri, verbose = TRUE, config = NULL, ctx = NULL)

Arguments

uri

URI of the TileDB group

verbose

Print status messages

config

optional configuration

ctx

optional tiledb context


Method set_query()

Set query parameters to slice by dimension values or filter by attribute values.

Usage

SOMA$set_query(
  obs_ids = NULL,
  var_ids = NULL,
  obs_attr_filter = NULL,
  var_attr_filter = NULL
)

Arguments

obs_ids, var_ids

character vector containing observation- or variable-identifiers.

obs_attr_filter, var_attr_filter

a TileDB query condition for attribute filtering pushdown.

Details

A SOMA can be filtered in two ways:

  1. dimension slicing: vectors of cell- or feature-identifiers passed to obs_ids and/or var_ids, respectively, which are applied to the selected ranges of member arrays with the appropriate dimension(s).

  2. attribute filtering: logical expressions that reference attributes within the obs and var arrays are applied to each array's query condition.

Dimension slicing is applied whenever an array member is accessed, causing only data for the specified identifiers to be read into memory.

Attribute filters are applied immediately to obs and/or var. The identifiers that pass the specified conditions are then applied to the selected ranges of member arrays with the appropriate dimension(s).

Filters are applied automatically to all members of a SOMA, with the exception of uns.
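
The two filtering modes can be illustrated with plain in-memory R objects. The sketch below is only an analogy: the obs data frame, the X matrix, and the helper slice_by_ids() are hypothetical stand-ins, not the tiledbsc implementation, and no TileDB arrays are read.

```r
# Hypothetical in-memory stand-ins for the obs array and one X layer.
obs <- data.frame(
  groups = c("g1", "g2", "g1", "g2"),
  row.names = c("cell1", "cell2", "cell3", "cell4")
)
X <- matrix(
  1:12, nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), rownames(obs))
)

# Dimension slicing: keep only the requested identifiers (NULL keeps all).
slice_by_ids <- function(mat, obs_ids = NULL, var_ids = NULL) {
  if (is.null(obs_ids)) obs_ids <- colnames(mat)
  if (is.null(var_ids)) var_ids <- rownames(mat)
  mat[var_ids, obs_ids, drop = FALSE]
}

# 1. Dimension slicing with explicit identifiers.
sliced <- slice_by_ids(X, obs_ids = c("cell1", "cell3"), var_ids = "geneB")

# 2. Attribute filtering: evaluate the condition on obs first, then reuse
#    the passing identifiers as a dimension slice on the other members.
passing_ids <- rownames(obs)[obs$groups == "g2"]
filtered <- slice_by_ids(X, obs_ids = passing_ids)
```

In both cases the result is restricted by identifier; the attribute filter only changes how the identifiers are obtained.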


Method reset_query()

Reset query dimensions and attribute filters.

Usage

SOMA$reset_query()

Returns

NULL


Method from_seurat_assay()

Convert a Seurat Assay to a TileDB-backed SOMA.

Usage

SOMA$from_seurat_assay(
  object,
  obs = NULL,
  var = TRUE,
  layers = c("counts", "data", "scale.data")
)

Arguments

object

A SeuratObject::Assay object

obs

An optional data.frame containing annotations for cell/sample-level observations. If no annotations are provided and the obs array doesn't yet exist, an array with 0 attributes is created.

var

Should the Assay's feature-level annotations be ingested into the var array? If FALSE and the var array does not yet exist, then var is created as an array with 0 attributes.

layers

A vector of assay layer names to ingest. Must be some combination of "counts", "data", "scale.data".

Details

Assay data

The SeuratObject::Assay class stores different transformations of an assay in the counts, data, and scale.data slots. Data from each of these slots is ingested into a separate layer of the X group, named for the corresponding slot.

By default Seurat populates the data slot with a reference to the same data stored in counts. To avoid ingesting redundant data, we check to see if counts and data are identical and skip the data slot if they are.
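
The redundancy check can be sketched in base R. This is a minimal illustration assuming the slots are available as a named list of matrices; select_layers() and the example data are hypothetical and do not reproduce the package's internal code.

```r
# Hypothetical stand-ins for the slots of a Seurat Assay.
counts <- matrix(
  c(0, 3, 1, 0), nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("cell1", "cell2"))
)
slots <- list(counts = counts, data = counts, scale.data = scale(counts))

# Decide which layers to ingest, dropping "data" when it duplicates "counts".
select_layers <- function(slots, layers = c("counts", "data", "scale.data")) {
  layers <- intersect(layers, names(slots))
  if (all(c("counts", "data") %in% layers) &&
      identical(slots[["counts"]], slots[["data"]])) {
    layers <- setdiff(layers, "data")
  }
  layers
}

# "data" is identical to "counts" here, so it is skipped.
untransformed <- select_layers(slots)

# After normalization the two slots differ, so both are kept.
normalized <- slots
normalized$data <- log1p(counts)
transformed <- select_layers(normalized)
```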

Annotations

Cell- and feature-level annotations are stored in the obs and var arrays, respectively. These arrays are always created during the initial ingestion in order to maintain the full set of cell and feature identifiers in the array dimension.

Variable features

Variable features in the var.features slot are maintained by creating a highly_variable attribute in var that records 1 or 0 for each feature indicating whether it was a variable feature or not.
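
As a rough sketch of this encoding, assuming the feature identifiers and the variable features are plain character vectors (the names below are hypothetical and the data frame stands in for the var array):

```r
# Hypothetical feature identifiers and variable-feature selection.
feature_ids <- c("geneA", "geneB", "geneC", "geneD")
var_features <- c("geneC", "geneA")

# Stand-in for the var array: one row per feature, no attributes yet.
var <- data.frame(row.names = feature_ids)

# Record 1 for variable features and 0 for all others.
var$highly_variable <- as.integer(feature_ids %in% var_features)

# The variable features can be recovered from the flag. Note that this
# simple encoding keeps membership only, not the original ordering.
recovered <- rownames(var)[var$highly_variable == 1L]
```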

Metadata
  • key (optional): Contains the value of the Seurat Assay's key slot, if it is set.


Method to_seurat_assay()

Convert to a SeuratObject::Assay object.

Usage

SOMA$to_seurat_assay(
  layers = c("counts", "data", "scale.data"),
  min_cells = 0,
  min_features = 0,
  check_matrix = FALSE,
  batch_mode = FALSE,
  ...
)

Arguments

layers

A vector of assay layer names to retrieve. Must match one or more of the available X AssayMatrix layers.

min_cells

Include features detected in at least this many cells. Will subset the counts matrix as well. To reintroduce excluded features, create a new object with a lower cutoff.

min_features

Include cells where at least this many features are detected.

check_matrix

Check counts matrix for NA, NaN, Inf, and non-integer values

batch_mode

logical, if TRUE, batch query mode is enabled for retrieving X layers. See AssayMatrix$to_dataframe() for more information.

...

Arguments passed to SeuratObject::as.sparse


Method add_seurat_dimreduction()

Convert a SeuratObject::DimReduc object and store its results as AnnotationMatrix arrays in the obsm/varm groups.

Usage

SOMA$add_seurat_dimreduction(object, technique = NULL, metadata = NULL)

Arguments

object

A SeuratObject::DimReduc object

technique

Name of the dimensional reduction technique. By default, the key slot is used to determine the technique.

metadata

Named list of metadata to add.

Details

On-Disk Format

Seurat DimReduc objects contain a variety of slots to accommodate the various types of results produced by each of the supported dimensional reduction techniques. Each slot is stored as an AnnotationMatrix object in the obsm or varm slot group for the assay, depending on whether the data is observation- or variable-aligned. The individual arrays are named dimreduction_<technique>.

Metadata
  • dimreduction_technique: Name of the dimensional reduction technique used.

  • dimreduction_key: String prefix used in the dimensional reduction results column names (required by Seurat)


Method get_seurat_dimreduction()

Convert to a SeuratObject::DimReduc object.

Usage

SOMA$get_seurat_dimreduction(technique = NULL, batch_mode = FALSE)

Arguments

technique

Name of the dimensionality reduction technique. Used to identify which obsm/varm array will be retrieved. If NULL, we default to the first obsm/dimreduction_ array.

batch_mode

logical, if TRUE, batch query mode is enabled for retrieving the obsm/varm arrays. See AssayMatrix$to_dataframe() for more information.


Method get_seurat_dimreductions_list()

Retrieve a list of all SeuratObject::DimReduc objects.

Usage

SOMA$get_seurat_dimreductions_list(batch_mode = FALSE)

Arguments

batch_mode

logical, if TRUE, batch query mode is enabled for retrieving the obsm/varm arrays. See AssayMatrix$to_dataframe() for more information.


Method to_seurat_object()

Convert to a SeuratObject::Seurat object.

Usage

SOMA$to_seurat_object(project = "SeuratProject")

Arguments

project

SeuratObject::Project name for the Seurat object


Method to_summarized_experiment()

Convert to a SummarizedExperiment::SummarizedExperiment object.

Usage

SOMA$to_summarized_experiment(
  layers = c("counts", "data", "scale.data"),
  batch_mode = FALSE
)

Arguments

layers

A vector of assay layer names to retrieve. Must match one or more of the available X AssayMatrix layers. If layers is named (e.g., c(logdata = "counts")) the assays will adopt the names of the layers vector.

batch_mode

logical, if TRUE, batch query mode is enabled for retrieving X layers. See AssayMatrix$to_dataframe() for more information.

Details

Layers

Note that SummarizedExperiment::Assays() requires that all assays share identical dimensions, so the conversion will fail if the requested layers include a scale.data layer that was created with only a subset of features.
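
One way to anticipate this failure is to compare layer dimensions before converting. The sketch below uses plain matrices as hypothetical stand-ins for retrieved X layers; same_dims() is an illustrative helper, not part of the package.

```r
# Hypothetical layers: scale.data was computed on 2 of the 4 features only.
counts <- matrix(0, nrow = 4, ncol = 3)
scaled <- matrix(0, nrow = 2, ncol = 3)

# TRUE when every matrix in the list has the same dimensions as the first.
same_dims <- function(mats) {
  dims <- lapply(mats, dim)
  all(vapply(dims, identical, logical(1), dims[[1]]))
}

compatible <- same_dims(list(counts = counts, data = counts))
incompatible <- same_dims(list(counts = counts, scale.data = scaled))
```

If the check fails, dropping "scale.data" from layers avoids the mismatch.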


Method to_single_cell_experiment()

Convert to a Bioconductor SingleCellExperiment::SingleCellExperiment object.

Usage

SOMA$to_single_cell_experiment(
  layers = c("counts", "data"),
  batch_mode = FALSE
)

Arguments

layers

A vector of assay layer names to retrieve. Must match one or more of the available X AssayMatrix layers. If layers is named (e.g., c(logdata = "counts")) the assays will adopt the names of the layers vector.

batch_mode

logical, if TRUE, batch query mode is enabled for retrieving X layers. See AssayMatrix$to_dataframe() for more information.


Method get_annotation_matrix_arrays()

Retrieve AnnotationMatrix arrays in obsm/varm groups.

Usage

SOMA$get_annotation_matrix_arrays(prefix = NULL)

Arguments

prefix

String prefix to filter the array names.

Returns

A list with "obsm"/"varm" slots containing arrays matching the prefix.
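
The shape of the return value can be pictured with nested lists. This is a sketch under assumptions: the member names and filter_by_prefix() are hypothetical, and the real method returns array objects rather than strings.

```r
# Keep only the list elements whose names start with the given prefix.
filter_by_prefix <- function(arrays, prefix = NULL) {
  if (is.null(prefix)) return(arrays)
  arrays[startsWith(as.character(names(arrays)), prefix)]
}

# Hypothetical members of the obsm and varm groups.
members <- list(
  obsm = list(
    dimreduction_pca = "pca embeddings",
    dimreduction_umap = "umap embeddings",
    custom_embedding = "other"
  ),
  varm = list(dimreduction_pca = "pca loadings")
)

# Result has "obsm" and "varm" slots holding only the matching arrays.
matches <- lapply(members, filter_by_prefix, prefix = "dimreduction_")
```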


Method get_annotation_pairwise_matrix_arrays()

Retrieve AnnotationPairwiseMatrix arrays in obsp/varp groups.

Usage

SOMA$get_annotation_pairwise_matrix_arrays(prefix = NULL)

Arguments

prefix

String prefix to filter the array names.

Returns

A list with "obsp"/"varp" slots containing arrays matching the prefix.


Method clone()

The objects of this class are cloneable with this method.

Usage

SOMA$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.