utilities.consolidate

cloud.utilities.consolidate

Functions

Name Description
consolidate Consolidate fragments.
consolidate_and_vacuum Consolidate and vacuum commits and fragment metadata, with an option to vacuum fragments as the first step.
consolidate_fragments Consolidate fragments in an array.
group_fragments Get a list of fragment info objects, optionally grouping fragments that have the same value for the first dimension.

consolidate

cloud.utilities.consolidate.consolidate(
    array_uri,
    fragments,
    *,
    config=None,
    max_fragment_size=MAX_FRAGMENT_SIZE_BYTES,
)

Consolidate the given fragments of an array.

Parameters

Name Type Description Default
array_uri str array URI required
fragments Sequence[tiledb.FragmentInfo] list of fragments required
config Optional[Mapping[str, Any]] config dictionary, defaults to None None
max_fragment_size int max size of consolidated fragments, defaults to MAX_FRAGMENT_SIZE_BYTES MAX_FRAGMENT_SIZE_BYTES
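
The following is a minimal sketch of how consolidate might be combined with group_fragments, consolidating one group at a time. It assumes the module is importable as tiledb.cloud.utilities.consolidate and that each inner list returned by group_fragments can be passed directly as fragments; the source does not state whether consolidate is meant to be called directly rather than through consolidate_fragments. The helper name consolidate_by_group and its api parameter are hypothetical and exist only for illustration.

```python
import importlib
from typing import Any, Mapping, Optional


def consolidate_by_group(
    array_uri: str,
    *,
    config: Optional[Mapping[str, Any]] = None,
    api: Any = None,
) -> int:
    """Consolidate each fragment group separately and return the group count.

    `api` is a hypothetical seam: any object exposing `group_fragments` and
    `consolidate` with the signatures documented on this page.
    """
    if api is None:
        # Assumption: the documented module lives under the `tiledb` package.
        api = importlib.import_module("tiledb.cloud.utilities.consolidate")

    # One inner list per distinct value of the first dimension.
    groups = api.group_fragments(array_uri, config=config, group_by_first_dim=True)
    for fragments in groups:
        api.consolidate(array_uri, fragments, config=config)
    return len(groups)
```

Calling consolidate_by_group("tiledb://my-namespace/my-array") would need TileDB Cloud credentials; the URI is a placeholder.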

consolidate_and_vacuum

cloud.utilities.consolidate.consolidate_and_vacuum(
    array_uri,
    *,
    config=None,
    vacuum_fragments=False,
)

Consolidate and vacuum commits and fragment metadata, with an option to vacuum fragments as the first step.

Parameters

Name Type Description Default
array_uri str array URI required
config Optional[Mapping[str, Any]] config dictionary, defaults to None None
vacuum_fragments bool vacuum fragments first, defaults to False False
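
To make the order of operations concrete, here is a rough sketch of an equivalent sequence written against the core tiledb package. It is not the library's implementation: the ordering of commits before fragment metadata is an assumption, and maintenance_steps and run_maintenance are hypothetical names. The config keys sm.consolidation.mode and sm.vacuum.mode are standard TileDB settings.

```python
from typing import Any, List, Mapping, Optional, Tuple


def maintenance_steps(vacuum_fragments: bool = False) -> List[Tuple[str, str]]:
    """Return (operation, mode) pairs in the order they would run."""
    steps: List[Tuple[str, str]] = []
    if vacuum_fragments:
        # Optional first step: remove fragments already consolidated away.
        steps.append(("vacuum", "fragments"))
    # Assumed order: commits first, then fragment metadata.
    for mode in ("commits", "fragment_meta"):
        steps.append(("consolidate", mode))
        steps.append(("vacuum", mode))
    return steps


def run_maintenance(
    array_uri: str,
    *,
    config: Optional[Mapping[str, Any]] = None,
    vacuum_fragments: bool = False,
) -> None:
    """Apply the steps above with tiledb.consolidate and tiledb.vacuum."""
    import tiledb  # imported lazily so the plan can be inspected without it

    for operation, mode in maintenance_steps(vacuum_fragments):
        settings = dict(config or {})
        if operation == "consolidate":
            settings["sm.consolidation.mode"] = mode
            tiledb.consolidate(array_uri, config=tiledb.Config(settings))
        else:
            settings["sm.vacuum.mode"] = mode
            tiledb.vacuum(array_uri, config=tiledb.Config(settings))
```

Keeping the plan separate from its execution lets the sequence be checked without touching an array.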

consolidate_fragments

cloud.utilities.consolidate.consolidate_fragments(
    array_uri,
    *,
    acn=None,
    config=None,
    group_by_first_dim=False,
    graph=None,
    dependencies=None,
    consolidate_resources=None,
    namespace=None,
    max_fragment_size=MAX_FRAGMENT_SIZE_BYTES,
)

Consolidate fragments in an array.

If group_by_first_dim is True, fragments with the same value for the first dimension will be consolidated together. Otherwise, all fragments will be consolidated together.

If graph is provided, the consolidation task nodes will be submitted to the graph. If dependencies is provided, the consolidation nodes will depend on the nodes in the list.

If graph is not provided, a new graph will be created and submitted to TileDB Cloud.

Parameters

Name Type Description Default
array_uri str array URI required
acn Optional[str] Access Credentials Name (ACN) registered in TileDB Cloud (ARN type), defaults to None None
config Optional[Mapping[str, Any]] config dictionary, defaults to None None
group_by_first_dim bool group fragments by first dimension, defaults to False False
graph Optional[dag.DAG] graph to submit nodes to, defaults to None None
dependencies Optional[Sequence[dag.Node]] list of nodes in the graph to depend on, defaults to None None
consolidate_resources Optional[Mapping[str, str]] resources for the consolidate node, defaults to None None
namespace Optional[str] TileDB Cloud namespace, defaults to the user’s default namespace None
max_fragment_size int max size of consolidated fragments, defaults to MAX_FRAGMENT_SIZE_BYTES MAX_FRAGMENT_SIZE_BYTES
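
The graph and dependencies parameters can be illustrated with a short sketch that adds consolidation after an earlier node in a caller-owned graph. Several points here are assumptions rather than documented behavior: the import path under the tiledb package, the use of a batch-mode graph, and the expectation that the caller starts the graph when one is passed in. The function name consolidate_after and the prepare callable are hypothetical.

```python
from typing import Callable, Optional


def consolidate_after(
    array_uri: str,
    prepare: Callable[[], object],
    *,
    namespace: Optional[str] = None,
):
    """Run `prepare` as a graph node, then consolidate fragments after it."""
    # Imported lazily; both require the TileDB Cloud client to be installed.
    from tiledb.cloud import dag
    from tiledb.cloud.utilities.consolidate import consolidate_fragments  # assumed path

    # Assumption: a batch-mode graph is the appropriate kind to pass in.
    graph = dag.DAG(
        name="consolidate-example",
        namespace=namespace,
        mode=dag.Mode.BATCH,
    )
    first = graph.submit(prepare, name="prepare")

    # The consolidation nodes are added to `graph` and wait for `first`.
    consolidate_fragments(
        array_uri,
        graph=graph,
        dependencies=[first],
        group_by_first_dim=True,
        namespace=namespace,
    )

    # Assumption: with a caller-supplied graph, the caller starts it.
    graph.compute()
    graph.wait()
    return graph
```

When no existing graph is involved, calling consolidate_fragments(array_uri) on its own creates and submits a new graph, as described above.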

group_fragments

cloud.utilities.consolidate.group_fragments(
    array_uri,
    *,
    config=None,
    group_by_first_dim=True,
)

Get a list of fragment info objects, optionally grouping fragments that have the same value for the first dimension.

Parameters

Name Type Description Default
array_uri str array URI required
config Optional[Mapping[str, Any]] config dictionary, defaults to None None
group_by_first_dim bool group by first dimension, defaults to True True

Returns

Name Type Description
Sequence[Sequence[tiledb.FragmentInfo]] list of lists of fragment info objects
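
The shape of the return value can be illustrated without a TileDB array. The sketch below groups stand-in fragment objects by the non-empty domain of their first dimension. It is an illustration of the idea, not the library's code: FragmentStub is a hypothetical replacement for tiledb.FragmentInfo, the grouping key is an assumption, and returning a single group when grouping is disabled is also assumed.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class FragmentStub:
    """Hypothetical stand-in for tiledb.FragmentInfo (only the fields used here)."""

    uri: str
    nonempty_domain: Tuple[Tuple[int, int], ...]  # one (low, high) pair per dimension


def group_by_first_dim(
    fragments: Sequence[FragmentStub], group: bool = True
) -> List[List[FragmentStub]]:
    """Return a list of lists, mirroring the documented return type."""
    if not group:
        # Assumption: without grouping, all fragments form a single group.
        return [list(fragments)]
    groups: Dict[Tuple[int, int], List[FragmentStub]] = defaultdict(list)
    for fragment in fragments:
        # Key on the (low, high) range of the first dimension.
        groups[fragment.nonempty_domain[0]].append(fragment)
    return list(groups.values())  # groups appear in first-seen order


frags = [
    FragmentStub("frag_a", ((0, 0), (0, 99))),
    FragmentStub("frag_b", ((1, 1), (0, 99))),
    FragmentStub("frag_c", ((0, 0), (100, 199))),
]
grouped = group_by_first_dim(frags)
print([[f.uri for f in g] for g in grouped])  # [['frag_a', 'frag_c'], ['frag_b']]
```

Each inner list is a set of fragments that share a first-dimension value and could be consolidated together.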