NEWS.md
- `CsparseMatrix` validation issue present 1.5.2
- `r-lib/actions/setup-r-dependencies@v2` is now used to install dependencies
- A tag, `soma_legacy_validity`, is now attached to all arrays created by `SOMA` objects. By default this value is `"false"` unless the TileDB-R legacy validity mode was enabled at creation time (i.e., `r.legacy_validity_mode`). When reading arrays from disk, the `AnnotationDataFrame` class will check for this tag on initialization and when performing reads or writes. If the tag is present and set to `"true"`, legacy validity mode is enabled globally (as it’s not possible to set it on a per-array basis). Legacy validity mode is also enabled when reading `AnnotationDataFrame` arrays that lack the tag, as this indicates the array was created with an older version of the package. These checks are limited to `AnnotationDataFrame` arrays because the incorrect validity map values only affect nullable string attributes. See TileDB-R’s release notes for more information.
- `AnnotationMatrix`’s `to_matrix()` method now supports batched reads via the `batch_mode` argument. This functionality can also be leveraged from `SOMA`’s `get_seurat_dimreductions_list()` and `get_seurat_dimreduction()` methods. (#86)
- `SOMACollection`’s `to_seurat()` method gains a `somas` argument that makes it possible to select a subset of `SOMA`s and `X` layers to be retrieved. (#89)
- `setup-r` GitHub Action to v2 (#90)
- `batch_mode` option to methods that read `X` layers (i.e., `AssayMatrix` objects) into memory. When enabled, batch mode leverages the family of `Batched` classes added to tiledb-r in version 0.14.0 to detect partial query results and resubmit until all results are retrieved. This feature is currently disabled by default and only applies to `X` layers (which are typically the largest arrays). You can enable batch mode from the following methods:
  - `SOMACollection$to_seurat()`
  - `SOMA$to_seurat_assay()`
  - `SOMA$to_summarized_experiment()`
  - `SOMA$to_single_cell_experiment()`
  - `AssayMatrix$to_dataframe()`
  - `AssayMatrix$to_matrix()`
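
The resubmit-until-complete behaviour behind batch mode can be illustrated with a small base R sketch. This is not the TileDB-R `Batched` API: `make_mock_query()` and `read_batched()` are hypothetical helpers, and the mock query simply returns a limited number of rows per submission together with a status flag, standing in for a read whose results do not fit in one buffer.

```r
# Mock "query" (hypothetical, for illustration only): each submission returns
# at most `buffer_rows` rows plus a status flag describing whether more remain.
make_mock_query <- function(data, buffer_rows) {
  offset <- 0L
  list(
    submit = function() {
      remaining <- nrow(data) - offset
      n <- min(buffer_rows, remaining)
      rows <- data[offset + seq_len(n), , drop = FALSE]
      offset <<- offset + n
      list(
        rows = rows,
        status = if (offset < nrow(data)) "INCOMPLETE" else "COMPLETE"
      )
    }
  )
}

# Keep resubmitting until the query reports that all results were retrieved,
# then combine the partial results into a single data frame.
read_batched <- function(query) {
  batches <- list()
  repeat {
    result <- query$submit()
    batches[[length(batches) + 1L]] <- result$rows
    if (result$status == "COMPLETE") break
  }
  do.call(rbind, batches)
}

x <- data.frame(i = 1:10, value = (1:10) * 2)
out <- read_batched(make_mock_query(x, buffer_rows = 4L))
nrow(out)
```

With the package itself, the equivalent step would be passing `batch_mode = TRUE` to one of the methods listed above.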
- `TileDBGroup`s with `remove_member()`
- `vignette("quickstart")` which provides new users with a high-level overview of the package
- `dataset_seurat_pbmc3k()` to download the pbmc 3k dataset from 10X and import as a `Seurat` object without requiring any extra dependencies. This dataset is used in the new vignette
- `Makefile` to add targets for generating pre-computed vignettes and performing common dev operations
- `CONTRIBUTING.md` to reference TileDB’s CoC and document the `Makefile`
- `scaled.data` from a Seurat `Assay` as an attribute of the `X` array, along with `counts`/`data`. This is no longer necessary as each layer is now ingested into a separate array within the `X` group (#73).
- `dgtmatrix_to_dataframe()` was replaced with `matrix_to_coo()`, which converts Matrix-like objects to COO data frames much more efficiently (#75).
- `pad_matrix()` can now pad a matrix by adding empty rows (#79).
- `has_dimnames()` was replaced with `is_labeled_matrix()` for clarity (#79).
- `AssayMatrix` now respects the `verbose` option
- `SOMA` now looks for a `raw` group and warns the user it will be ignored. Currently tiledbsc-py creates a `raw` group when converting anndata objects where `.raw` is populated. However, Seurat/BioC objects do not have an obvious place to store this data, so ignoring it improves compatibility.
- `dgtmatrix_to_dataframe()` function used to convert unordered `dgTMatrix` objects to COO data frames (#73).
- `TileDBObject` has been improved so that the class name is displayed first (#79).
- `Seurat` object (#80, thanks @dan11mcguire)

This release changes the names of the two top-level classes in the tiledbsc package to follow new nomenclature adopted by the single-cell data model specification, which was implemented here. You can read more about the rationale for this change here.

Additionally, the `misc` slot has been renamed to `uns`. See below for details.
New class names
- `SCGroup` is replaced by `SOMA` (stack of matrices, annotated)
- `SCDataset` is replaced by `SOMACollection`

There are no functional changes to either class. `SOMA` is a drop-in replacement for `SCGroup` and `SOMACollection` is a drop-in replacement for `SCDataset`. However, with the new names, two of `SOMACollection`’s members have been renamed accordingly:

- `scgroups` field is now `somas`
- `scgroup_uris()` is now `soma_uris()`

To ease the transition, the `SCDataset` and `SCGroup` classes are still available as aliases for `SOMACollection` and `SOMA`, respectively. However, they have been deprecated and will be removed in the future.
Previously, the `SCDataset` and `SCGroup` classes included a TileDB group called `misc` that was intended for miscellaneous/unstructured data. To better align with the SOMA specification, this group has been renamed to `uns`. Practically, this means new `SOMA`s and `SOMACollection`s will create TileDB groups named `uns` rather than `misc`. These groups can be accessed from the `SOMA` and `SOMACollection` classes using `SOMA$uns`.

For backwards compatibility:

- if a `misc` group exists within a `SOMACollection` or `SOMA` on disk, it will be accessible via the `uns` field of the parent class
- the deprecated `SCDataset` and `SCGroup` will continue to provide a `misc` field (actually an active binding that aliases the `uns` slot) so users can continue to use the old name
It’s now possible to read only a specific subset of data into memory.
The following classes now have a `set_query()` method:

- `TileDBArray` and its subclasses
- `AnnotationGroup` and its subclasses
- `SOMA`
- `SOMACollection`

With `set_query()` you can specify:
See the new Filtering vignette for details.
- `TileDBObject` base class to provide fields and methods common to both `TileDBArray`- and `TileDBGroup`-based classes
- `array_exists()` and `group_exists()` methods have been deprecated in favor of the more general `exists()`
- `TileDBGroup` class, `TileDBArray` now maintains a reference to the underlying array pointer
- `objects` field to provide direct access to the underlying TileDB objects
- `config`/`ctx` fields to `AnnotationGroup`
- `AnnotationDataframe` gains `ids()` to retrieve all values from the array’s dimension
- `soma_object_type` and `soma_encoding_version` metadata are written to groups/arrays at write time
- `AnnotationDataframe$from_dataframe()` no longer coerces `logical` columns to `integer`s, as TileDB 2.10 provides support for `BOOL` data types
- `AnnotationArray`s so updates will overwrite existing cells

Improve handling of Seurat objects with empty cell identities (#58).
tiledbsc now uses the enhanced Group APIs introduced in TileDB v2.8 and TileDB-R 0.12.0.

Note: The next version of tiledbsc will migrate to the new SOMA-based naming scheme described here.

## On-disk changes

Group-level metadata is now natively supported by TileDB, so `TileDBGroup`-based classes no longer create nested `__tiledb_group_metadata` arrays for the purpose of storing group-level metadata.

See TileDB 2.8 release notes for additional changes.
`TileDBGroup` and its child classes:

- `arrays` field has been replaced with `members`, which includes both TileDB arrays and groups
- `get_array()` has been replaced with `get_member()`, which adds a `type` argument to filter by object type
- `count_members()`, `list_members()`, `list_member_uris()`, and `add_member()`
- `NEWS.md` file to track changes to the package
- `SCGroup`’s `from_seurat_assay()` method gained two new arguments: `layers`, to specify which Seurat `Assay` slots should be ingested, and `var`, to control whether feature-level metadata is ingested
- `SCGroup`’s `from_seurat_assay()` method will no longer ingest the `data` slot if it is identical to `counts`
- `TileDBURI` class for handling various URI formats
- `uri` field for all TileDB(Array|Group)-based classes is now an active binding that retrieves the URI from the private `tiledb_uri` field
- `X`, `obs`, and `var` arrays more efficiently on disk (#50)
- `active_ident` attribute of the `obs` array (#56)
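
The rule of skipping the `data` slot when it duplicates `counts`, combined with a `layers` selection, can be sketched in base R as follows. This is an illustrative assumption about the logic, not the package's code: `layers_to_ingest()` is a hypothetical helper and a plain named list of matrices stands in for the slots of a Seurat `Assay`.

```r
# Hypothetical helper: given a named list of assay slots and the requested
# layers, return the names of the layers that would actually be ingested.
layers_to_ingest <- function(assay_slots, layers = names(assay_slots)) {
  # Keep only the requested layers that are present
  selected <- assay_slots[intersect(layers, names(assay_slots))]

  # Drop `data` when it carries exactly the same values as `counts`
  if (all(c("counts", "data") %in% names(selected)) &&
      identical(selected[["counts"]], selected[["data"]])) {
    selected[["data"]] <- NULL
  }
  names(selected)
}

counts <- matrix(
  c(1, 0, 2, 3),
  nrow = 2,
  dimnames = list(c("g1", "g2"), c("c1", "c2"))
)

# `data` duplicates `counts`, so only `counts` is kept
layers_to_ingest(list(counts = counts, data = counts))

# `data` was transformed, so both layers are kept
layers_to_ingest(list(counts = counts, data = log1p(counts)))
```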