NEWS.md
- CsparseMatrix validation issue present 1.5.2
- r-lib/actions/setup-r-dependencies@v2 is now used to install dependencies
- A tag, soma_legacy_validity, is now attached to all arrays created by SOMA objects. By default this value is "false" unless the TileDB-R legacy validity mode was enabled at creation time (i.e., r.legacy_validity_mode). When reading arrays from disk, the AnnotationDataFrame class will check for this tag on initialization and when performing reads or writes. If the tag is present and set to "true", legacy validity mode is enabled globally (as it's not possible to set on a per-array basis). Legacy validity mode is also enabled when reading AnnotationDataFrame arrays that lack the tag, as this indicates the array was created with an older version of the package. These checks are limited to AnnotationDataFrame arrays because the incorrect validity map values only affect nullable string attributes. See TileDB-R’s release notes for more information.
- AnnotationMatrix’s to_matrix() method now supports batched reads via the batch_mode argument. This functionality can also be leveraged from SOMA’s get_seurat_dimreductions_list() and get_seurat_dimreduction() methods. (#86)
- SOMACollection’s to_seurat() method gains a somas argument that makes it possible to select a subset of SOMAs and X layers to be retrieved. (#89)
- setup-r GitHub Action to v2 (#90)
- batch_mode option to methods that read X layers (i.e., AssayMatrix objects) into memory. When enabled, batch mode leverages the family of Batched classes added to tiledb-r in version 0.14.0 to detect partial query results and resubmit until all results are retrieved. This feature is currently disabled by default and only applies to X layers (which are typically the largest arrays). You can enable batch mode from the following methods:
  - SOMACollection$to_seurat()
  - SOMA$to_seurat_assay()
  - SOMA$to_summarized_experiment()
  - SOMA$to_single_cell_experiment()
  - AssayMatrix$to_dataframe()
  - AssayMatrix$to_matrix()
- TileDBGroups with remove_member()
- vignette("quickstart") which provides new users with a high-level overview of the package
- dataset_seurat_pbmc3k() to download the pbmc 3k dataset from 10X and import as a Seurat object without requiring any extra dependencies. This dataset is used in the new vignette
- Makefile to add targets for generating pre-computed vignettes and performing common dev operations
- CONTRIBUTING.md to reference TileDB’s CoC and document the Makefile
- scaled.data from a Seurat Assay as an attribute of the X array, along with counts/data. This is no longer necessary as each layer is now ingested into a separate array within the X group (#73).
- dgtmatrix_to_dataframe() was replaced with matrix_to_coo(), which converts Matrix-like objects to COO data frames much more efficiently (#75).
- pad_matrix() can now pad a matrix by adding empty rows (#79).
- has_dimnames() was replaced with is_labeled_matrix() for clarity (#79).
- AssayMatrix now respects the verbose option
- SOMA now looks for a raw group and warns the user it will be ignored. Currently tiledbsc-py creates a raw group when converting anndata objects where .raw is populated. However, Seurat/BioC objects do not have an obvious place to store this data, so ignoring it improves compatibility.
- dgtmatrix_to_dataframe() function used to convert unordered dgTMatrix objects to COO data frames (#73).
- TileDBObject has been improved so that the class name is displayed first (#79).
- Seurat object (#80, thanks @dan11mcguire)

This release changes the names of the two top-level classes in the tiledbsc package to follow new nomenclature adopted by the single-cell data model specification, which was implemented here. You can read more about the rationale for this change here.
Additionally, the misc slot has been renamed to uns. See below for details.
## New class names

- SCGroup is replaced by SOMA (stack of matrices, annotated)
- SCDataset is replaced by SOMACollection
There are no functional changes to either class. SOMA is a drop-in replacement for SCGroup and SOMACollection is a drop-in replacement for SCDataset. However, to match the new names, one field and one method of SOMACollection have been renamed:

- scgroups field is now somas
- scgroup_uris() is now soma_uris()
To ease the transition, the SCDataset and SCGroup classes are still available as aliases for SOMACollection and SOMA, respectively. However, they have been deprecated and will be removed in the future.
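
A deprecated alias of this kind can be sketched in base R. This is only an illustration of the pattern, not how tiledbsc defines its classes: `new_soma()` and `new_scgroup()` are hypothetical constructors invented for the example.

```r
# Hypothetical stand-ins for illustration only; these constructors are not
# part of tiledbsc.
new_soma <- function(uri) {
  structure(list(uri = uri), class = "SOMA")
}

# Deprecated alias: emit a deprecation warning, then delegate to the
# replacement so both names return the same kind of object.
new_scgroup <- function(uri) {
  .Deprecated(new = "new_soma", old = "new_scgroup")
  new_soma(uri)
}

soma <- new_soma("path/to/soma")
legacy <- suppressWarnings(new_scgroup("path/to/soma"))
identical(class(soma), class(legacy))
```

The old name keeps working for existing scripts, while the warning points users toward the new name before the alias is removed.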
Previously, the SCDataset and SCGroup classes included a TileDB group called misc that was intended for miscellaneous/unstructured data. To better align with the SOMA specification, this group has been renamed to uns. In practice, new SOMAs and SOMACollections will create TileDB groups named uns rather than misc. These groups are accessed through the uns field of either class (e.g., SOMA$uns).
For backwards compatibility:

- if a misc group exists within a SOMACollection or SOMA on disk, it will be accessible via the uns field of the parent class
- the deprecated SCDataset and SCGroup will continue to provide a misc field (actually an active binding that aliases the uns slot) so users can continue to use the old name
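
The aliasing behavior described above can be sketched with a base R active binding. This is a minimal illustration, not the package's implementation: the `soma` environment below is a hypothetical stand-in for a real object, and only the field names uns and misc come from the notes above.

```r
# Hypothetical stand-in for a SOMA-like object; not the tiledbsc implementation.
soma <- new.env()
soma$uns <- list(note = "unstructured data")

# `misc` is an active binding: every read and write is forwarded to `uns`,
# so both names always refer to the same data.
makeActiveBinding(
  "misc",
  function(value) {
    if (missing(value)) {
      soma$uns
    } else {
      soma$uns <- value
      invisible(value)
    }
  },
  env = soma
)

old_read <- soma$misc$note          # read through the old name
soma$misc <- list(note = "updated") # write through the old name
new_read <- soma$uns$note           # the change is visible under the new name
```

Because the binding stores nothing itself, the two names cannot drift out of sync.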
It’s now possible to read only a specific subset of data into memory.
The following classes now have a set_query() method:
- TileDBArray and its subclasses
- AnnotationGroup and its subclasses
- SOMA
- SOMACollection

With set_query() you can specify:
See the new Filtering vignette for details.
- TileDBObject base class to provide fields and methods common to both TileDBArray- and TileDBGroup-based classes
- array_exists() and group_exists() methods have been deprecated in favor of the more general exists()
- TileDBGroup class, TileDBArray now maintains a reference to the underlying array pointer
- objects field to provide direct access to the underlying TileDB objects
- config/ctx fields to AnnotationGroup
- AnnotationDataframe gains ids() to retrieve all values from the array’s dimension
- soma_object_type and soma_encoding_version metadata are written to groups/arrays at write time
- AnnotationDataframe$from_dataframe() no longer coerces logical columns to integers, as TileDB 2.10 provides support for BOOL data types
- AnnotationArrays so updates will overwrite existing cells
- Improve handling of Seurat objects with empty cell identities (#58).
tiledbsc now uses the enhanced Group APIs introduced in TileDB v2.8 and TileDB-R 0.12.0.
Note: The next version of tiledbsc will migrate to the new SOMA-based naming scheme described here.

## On-disk changes
Group-level metadata is now natively supported by TileDB so TileDBGroup-based classes no longer create nested __tiledb_group_metadata arrays for the purpose of storing group-level metadata.
See TileDB 2.8 release notes for additional changes.
- TileDBGroup and its child classes:
  - arrays field has been replaced with members, which includes both TileDB arrays and groups
  - get_array() has been replaced with get_member(), which adds a type argument to filter by object type
  - count_members(), list_members(), list_member_uris(), and add_member()
- NEWS.md file to track changes to the package
- SCGroup’s from_seurat_assay() method gained two new arguments: layers, to specify which Seurat Assay slots should be ingested, and var, to control whether feature-level metadata is ingested
- SCGroup’s from_seurat_assay() method will no longer ingest the data slot if it is identical to counts
- TileDBURI class for handling various URI formats
- uri field for all TileDB(Array|Group)-based classes is now an active binding that retrieves the URI from the private tiledb_uri field
- X, obs, and var arrays more efficiently on disk (#50)
- active_ident attribute of the obs array (#56)