soma.mapper

client.soma.mapper

Functions

Name Description
build_collection_mapper_workflow_graph The primary entry point for the mapper module.
experiment_to_anndata_slice This function is not to be called directly: please use run_collection_mapper_workflow or build_collection_mapper_workflow_graph.
experiment_to_axis_counts Returns a tuple of (obs_counts, var_counts); used when counts_only is True.
run_collection_mapper_workflow This is an asynchronous entry point, which launches the task graph and returns tracking information.

build_collection_mapper_workflow_graph

client.soma.mapper.build_collection_mapper_workflow_graph(
    soma_collection_uri=None,
    soma_experiment_uris=None,
    experiment_names=None,
    measurement_name,
    X_layer_name,
    obs_query_string=None,
    var_query_string=None,
    obs_attrs=None,
    var_attrs=None,
    callback=lambda x: x,
    args_dict=None,
    extra_tiledb_config=None,
    platform_config=None,
    workspace=None,
    task_graph_name='SOMAExperiment Collection Mapper',
    counts_only=False,
    use_batch_mode=False,
    resource_class=None,
    resources=None,
    access_credentials_name=None,
    verbose=False,
)

The primary entry point for the mapper module. The caller passes in either a sequence of SOMAExperiment URIs or the URI of a SOMACollection, which is simply a collection of SOMAExperiment objects. The caller also passes in query terms and a callback, which is called on the to_anndata output of each experiment’s query. The result is a dictionary mapping each experiment name to the callback’s output for that experiment.

For example, if the lambda maps an anndata object to its .shape, then with SOMA experiments A and B, the task graph would return the dict {"A": (56868, 43050), "B": (23539, 42044)}.
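To make the shape of the result concrete, here is a minimal local sketch of the mapping behavior described above. It does not call the library: the map_over_experiments helper and the SimpleNamespace stand-ins for AnnData objects are hypothetical, and the shapes are the ones from the example.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the AnnData objects that each experiment's
# query would produce; only the .shape attribute is modeled here.
slices = {
    "A": SimpleNamespace(shape=(56868, 43050)),
    "B": SimpleNamespace(shape=(23539, 42044)),
}


def map_over_experiments(slices, callback=lambda x: x):
    """Local model only: apply the callback to each experiment's slice
    and key the outputs by experiment name."""
    return {name: callback(adata) for name, adata in slices.items()}


result = map_over_experiments(slices, callback=lambda ad: ad.shape)
print(result)
```

In the real workflow each callback invocation runs on its own UDF node rather than in a local loop; the sketch only illustrates how callback outputs are keyed by experiment name.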

Parameters for input data:
:param soma_collection_uri: URI of a SOMACollection containing SOMAExperiment objects to be processed. Please specify only one of soma_collection_uri or soma_experiment_uris.
:param soma_experiment_uris: List/tuple of URIs of SOMAExperiment objects to be processed.
:param experiment_names: Optional list of experiment names. If not provided, all SOMAExperiment objects are processed as specified by soma_collection_uri or soma_experiment_uris. If provided, experiment_names can be used to further subset/restrict which SOMAExperiment objects will be processed.
:param measurement_name: Which SOMAMeasurement to query within the specified SOMAExperiment objects. For example, "RNA".
:param X_layer_name: Which X layer to query within the specified SOMAMeasurement objects. For example, "data", "raw", "normalized".

Query parameters:
:param obs_query_string: Optional query string for obs. For example: 'cell_type == "liver"'.
:param var_query_string: Optional query string for var. For example: 'n_cells > 100'.
:param obs_attrs: Optional list of obs attributes to return as query output. Default: all.
:param var_attrs: Optional list of var attributes to return as query output. Default: all.

Parameters for data processing:
:param callback: Your code to run on each UDF node, one for each SOMAExperiment. On each node, tiledbsoma.AxisQuery is run, using the parameters you specify as above, and then query.to_anndata is run on that query output. Your callback function receives that query-output AnnData object. For example: lambda ad: ad.obs.shape.
:param args_dict: Optional additional arguments to be passed to your callback. If provided, this must be a dict from string experiment name to a dict of key-value pairs.
:param counts_only: If True, only return obs/var counts, not the result of the provided callback. Default: False.
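The expected shape of args_dict can be sketched as follows. The check_args_dict helper and the min_genes key are hypothetical and used only for illustration; how the library hands these values to the callback is not specified here.

```python
def check_args_dict(args_dict):
    """Hypothetical helper: verify the documented shape of args_dict,
    a dict from string experiment name to a dict of key-value pairs."""
    if args_dict is None:
        return True  # args_dict is optional
    if not isinstance(args_dict, dict):
        return False
    return all(
        isinstance(name, str) and isinstance(kwargs, dict)
        for name, kwargs in args_dict.items()
    )


# One entry per experiment name; the inner keys are illustrative only.
args_dict = {
    "A": {"min_genes": 200},
    "B": {"min_genes": 500},
}

ok = check_args_dict(args_dict)
bad = check_args_dict({"A": 200})  # inner value is not a dict
print(ok, bad)
```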

TileDB configs:
:param extra_tiledb_config: Currently unused; reserved for future use.
:param platform_config: Currently unused; reserved for future use.

Cloud configs:
:param workspace: TileDB workspace in which to run the UDFs.
:param task_graph_name: Optional name for your task graph, so you can find it more easily among other runs.

Real-time vs batch modes:
:param use_batch_mode: If False (the default), uses real-time UDFs. These have lower latency but fewer resource options.
:param resource_class: "standard" or "large". Only valid when use_batch_mode is False.
:param resources: Only valid when use_batch_mode is True. Example: resources={"cpu": "2", "memory": "8Gi"}.
:param access_credentials_name: Only valid when use_batch_mode is True.
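The mode constraints above can be summarized as a small pre-flight check. This is a sketch only: check_mode_args is a hypothetical helper, not part of the library, and the library may validate these arguments differently or not at all.

```python
def check_mode_args(
    use_batch_mode=False,
    resource_class=None,
    resources=None,
    access_credentials_name=None,
):
    """Hypothetical pre-flight check; returns a list of problems found."""
    problems = []
    if use_batch_mode:
        # Batch mode: resource_class does not apply.
        if resource_class is not None:
            problems.append(
                "resource_class is only valid when use_batch_mode is False"
            )
    else:
        # Real-time mode: resources and access credentials do not apply.
        if resources is not None:
            problems.append(
                "resources is only valid when use_batch_mode is True"
            )
        if access_credentials_name is not None:
            problems.append(
                "access_credentials_name is only valid when use_batch_mode is True"
            )
    if resource_class is not None and resource_class not in ("standard", "large"):
        problems.append('resource_class must be "standard" or "large"')
    return problems


print(check_mode_args(use_batch_mode=True, resources={"cpu": "2", "memory": "8Gi"}))
print(check_mode_args(use_batch_mode=False, resources={"cpu": "2"}))
```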

Other:
:param verbose: If True, enable verbose logging. Default: False.
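Putting the parameter groups together, the keyword arguments for a real-time run over a collection might be assembled like this. The URI and the task graph name are placeholders, not real resources; the other values are the examples given in the parameter descriptions.

```python
# Placeholder URI: substitute the URI of a real SOMACollection.
kwargs = dict(
    soma_collection_uri="tiledb://example-namespace/example-collection",
    measurement_name="RNA",
    X_layer_name="data",
    obs_query_string='cell_type == "liver"',
    var_query_string="n_cells > 100",
    callback=lambda ad: ad.obs.shape,
    task_graph_name="liver cell shapes",  # placeholder name
)

# These would then be passed as
# build_collection_mapper_workflow_graph(**kwargs).
print(sorted(kwargs))
```

Only soma_collection_uri is set here, in line with the instruction to specify one of soma_collection_uri or soma_experiment_uris.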

Return value: A DAG object. The graph is built but not yet run: if you have assigned it to a variable named dag, call dag.compute(), dag.wait(), and dag.end_results() to run it and collect the results.
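The three calls named above can be wrapped in a small helper. In this sketch, run_and_collect and FakeDAG are hypothetical: FakeDAG is a local stand-in so the example runs without a cloud connection, and it simply returns whatever results it was given.

```python
def run_and_collect(dag):
    """Run a task graph and return its end results, using only the
    three methods named in the return-value description."""
    dag.compute()             # launch the task graph
    dag.wait()                # block until it finishes
    return dag.end_results()  # collect the outputs


class FakeDAG:
    """Hypothetical stand-in for the returned DAG object."""

    def __init__(self, results):
        self._results = results
        self.calls = []

    def compute(self):
        self.calls.append("compute")

    def wait(self):
        self.calls.append("wait")

    def end_results(self):
        self.calls.append("end_results")
        return self._results


dag = FakeDAG({"A": (56868, 43050), "B": (23539, 42044)})
results = run_and_collect(dag)
print(results)
```

With a real DAG returned by build_collection_mapper_workflow_graph, the same helper would apply; what end_results() contains is determined by the library.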

experiment_to_anndata_slice

client.soma.mapper.experiment_to_anndata_slice(
    exp,
    *,
    measurement_name,
    X_layer_name,
    obs_query_string=None,
    var_query_string=None,
    obs_attrs=None,
    var_attrs=None,
)

This function is not to be called directly: please use run_collection_mapper_workflow or build_collection_mapper_workflow_graph. This is the function that runs as a UDF node for each SOMAExperiment you specify.

experiment_to_axis_counts

client.soma.mapper.experiment_to_axis_counts(
    exp,
    *,
    measurement_name,
    X_layer_name,
    obs_query_string=None,
    var_query_string=None,
    obs_attrs=None,
    var_attrs=None,
)

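Returns a tuple of (obs_counts, var_counts) for the query. This is the per-experiment result when counts_only is True. Like experiment_to_anndata_slice, it is meant to be invoked through run_collection_mapper_workflow or build_collection_mapper_workflow_graph.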

run_collection_mapper_workflow

client.soma.mapper.run_collection_mapper_workflow(
    soma_collection_uri=None,
    soma_experiment_uris=None,
    experiment_names=None,
    measurement_name,
    X_layer_name,
    obs_query_string=None,
    var_query_string=None,
    obs_attrs=None,
    var_attrs=None,
    callback=lambda x: x,
    args_dict=None,
    extra_tiledb_config=None,
    platform_config=None,
    workspace=None,
    task_graph_name='SOMAExperiment Collection Mapper',
    counts_only=False,
    use_batch_mode=False,
    resource_class=None,
    resources=None,
    access_credentials_name=None,
    verbose=False,
)

This is an asynchronous entry point, which launches the task graph and returns tracking information. It is not the primary use case; most callers should use build_collection_mapper_workflow_graph instead. Please see build_collection_mapper_workflow_graph for information about arguments and return value.