array
cloud.array
Classes
Name | Description |
---|---|
ArrayList | This class incrementally builds a list of UDFArrayDetails |
ArrayList
cloud.array.ArrayList(self)
This class incrementally builds a list of UDFArrayDetails (list[UDFArrayDetails]) for use in multi-array UDFs.
Methods
Name | Description |
---|---|
add | Adds an array to the list |
get | Returns the list of UDFArrayDetails |
add
cloud.array.ArrayList.add(uri=None, ranges=None, buffers=None, layout=None)
Adds an array to the list
get
cloud.array.ArrayList.get()
Returns the list of UDFArrayDetails
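As a rough illustration of the incremental pattern, the sketch below adds two arrays, one call per array, and then retrieves the accumulated list. `FakeArrayList` is a hypothetical local stand-in so the snippet runs without a TileDB Cloud connection: it only mirrors the documented `add(uri, ranges, buffers, layout)` and `get()` signatures and stores plain dicts rather than UDFArrayDetails objects. The URIs are placeholders. With the real library, an instance of `tiledb.cloud.array.ArrayList` would presumably be passed to `build_inputs` instead.

```python
class FakeArrayList:
    """Hypothetical stand-in mirroring the documented add()/get() methods."""

    def __init__(self):
        self._entries = []

    def add(self, uri=None, ranges=None, buffers=None, layout=None):
        # The real class builds UDFArrayDetails; a plain dict is used here.
        self._entries.append(
            {"uri": uri, "ranges": ranges, "buffers": buffers, "layout": layout}
        )

    def get(self):
        return self._entries


def build_inputs(array_list):
    """Add two placeholder arrays, then return the accumulated list."""
    array_list.add("tiledb://my_namespace/dense", [(1, 4), (1, 4)], ["a"])
    array_list.add("tiledb://my_namespace/sparse", [(1, 2), (1, 4)], ["a"])
    return array_list.get()


entries = build_inputs(FakeArrayList())
print(len(entries))
print(entries[0]["uri"])
```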
Functions
Name | Description |
---|---|
apply | Apply a user-defined function to an array, synchronously. |
apply_async | Apply a user-defined function to an array, asynchronously. |
apply_base | Apply a user-defined function to an array, and return data and metadata. |
array_activity | Fetch array activity |
delete_array | Deregister the array from the tiledb cloud service, then delete the physical array from disk. |
deregister_array | Deregister the array from the tiledb cloud service. |
exec_multi_array_udf | Apply a user-defined function to multiple arrays, synchronously. |
exec_multi_array_udf_async | Apply a user-defined function to multiple arrays, asynchronously. |
exec_multi_array_udf_base | Apply a user-defined function to multiple arrays. |
info | Returns the cloud metadata |
list_shared_with | Return array sharing policies |
parse_ranges | Takes a list of the following objects per dimension: |
register_array | Register this array with the tiledb cloud service |
share_array | Shares array with given namespace and permissions |
unshare_array | Removes sharing of an array from given namespace |
update_file_properties | Update an array to indicate it is a file and has the given properties. |
update_info | Update an array’s info |
apply
cloud.array.apply(*args, **kwargs)
Apply a user-defined function to an array, synchronously.
All arguments are exactly as in apply_base, but this returns the data only.
Example:
>>> import tiledb, tiledb.cloud, numpy
>>> def median(df):
...     return numpy.median(df["a"])
>>> # Open the array then run the UDF
>>> tiledb.cloud.array.apply("tiledb://TileDB-Inc/quickstart_dense", median, [(0,5), (0,5)], attrs=["a", "b", "c"])
2.0
apply_async
cloud.array.apply_async(*args, **kwargs)
Apply a user-defined function to an array, asynchronously.
All arguments are exactly as in apply_base, but this returns the data as a future-like AsyncResponse.
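The difference between the synchronous and asynchronous variants can be sketched locally. The snippet below uses hypothetical stand-ins (`fake_apply`, `fake_apply_async`) and made-up data so that it runs without a TileDB Cloud connection. AsyncResponse is described here only as future-like, so the stand-in returns a standard `concurrent.futures.Future`; the way a value is retrieved from the real object may differ.

```python
from concurrent.futures import ThreadPoolExecutor
import statistics

_pool = ThreadPoolExecutor(max_workers=2)
_FAKE_DATA = {"a": [1, 2, 3, 4, 5]}  # made-up values


def median(df):
    return statistics.median(df["a"])


def fake_apply(uri, func, ranges=(), attrs=()):
    """Stand-in for a synchronous call: blocks and returns the data."""
    return func({name: _FAKE_DATA[name] for name in attrs})


def fake_apply_async(*args, **kwargs):
    """Stand-in for an asynchronous call: returns a future immediately."""
    return _pool.submit(fake_apply, *args, **kwargs)


uri = "tiledb://my_namespace/my_array"  # placeholder URI
sync_result = fake_apply(uri, median, [(0, 5)], attrs=["a"])
future = fake_apply_async(uri, median, [(0, 5)], attrs=["a"])
async_result = future.result()  # block only when the value is needed
_pool.shutdown()
print(sync_result, async_result)
```

The asynchronous form is mainly useful when several UDF calls can be submitted before any of their results are needed.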
apply_base
cloud.array.apply_base(
    uri,
    func=None,
    ranges=(),
    name=None,
    attrs=(),
    layout=None,
    image_name='default',
    http_compressor='deflate',
    include_source_lines=True,
    task_name=None,
    v2=None,
    result_format=models.ResultFormat.NATIVE,
    result_format_version=None,
    store_results=False,
    stored_param_uuids=(),
    timeout=None,
    resource_class=None,
    _download_results=True,
    namespace=None,
    _server_graph_uuid=None,
    _client_node_uuid=None,
    **kwargs
)
Apply a user-defined function to an array, and return data and metadata.
Parameters
Name | Type | Description | Default |
---|---|---|---|
uri | str | The tiledb://... URI of the array to apply the function to. | required |
func | Union[str, Callable, None] | The function to run. This can be either a callable function, or the name of a registered user-defined function | None |
ranges | Sequence | ranges to issue query on | () |
name | Optional[str] | Deprecated. If func is None, the name of the registered user-defined function to call. | None |
attrs | Sequence | list of attributes or dimensions to fetch in query | () |
layout | Optional[str] | tiledb query layout | None |
image_name | str | udf image name to use, useful for testing beta features | 'default' |
http_compressor | str | set http compressor for results | 'deflate' |
include_source_lines | bool | True to send the source code of your UDF to the server with your request. (This means it can be shown to you in stack traces if an error occurs.) False to send only compiled Python bytecode. | True |
task_name | str | optional name to assign the task for logging and audit purposes | None |
v2 | | Ignored. | None |
result_format | ResultFormat | result serialization format | models.ResultFormat.NATIVE |
result_format_version | | Deprecated and ignored. | None |
store_results | bool | True to temporarily store results on the server side for later retrieval (in addition to downloading them). | False |
timeout | int | Timeout for UDF in seconds | None |
resource_class | Optional[str] | The name of the resource class to use. Resource classes define maximum limits for cpu and memory usage. | None |
_download_results | bool | True to download and parse results eagerly. False to not download results by default and only do so lazily (e.g. for an intermediate node in a graph). | True |
namespace | Optional[str] | The namespace to execute the UDF under. | None |
_server_graph_uuid | Optional[uuid.UUID] | If this function is being executed within a DAG, the server-generated ID of the graph’s log. Otherwise, None. | None |
_client_node_uuid | Optional[uuid.UUID] | If this function is being executed within a DAG, the ID of this function’s node within the graph. Otherwise, None. | None |
kwargs | Any | named arguments to pass to function | {} |
Example
>>> import tiledb, tiledb.cloud, numpy
>>> def median(df):
...     return numpy.median(df["a"])
>>> # Open the array then run the UDF
>>> tiledb.cloud.array.apply_base("tiledb://TileDB-Inc/quickstart_dense", median, [(0,5), (0,5)], attrs=["a", "b", "c"]).result
2.0
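The `kwargs` entry says that extra named arguments are passed to the function. A minimal sketch of that forwarding follows, assuming the UDF receives the queried data as its first positional argument and the extra names as keyword arguments. `fake_apply_base` is a hypothetical local stand-in with made-up data, not the library's implementation, and it accepts only a few of the documented parameters.

```python
import statistics

_FAKE_DATA = {"a": [2, 4, 6, 8], "b": [1, 1, 1, 1]}  # made-up values


def scaled_median(df, attr="a", scale=1.0):
    """UDF with two extra named parameters besides the data."""
    return statistics.median(df[attr]) * scale


def fake_apply_base(uri, func=None, ranges=(), attrs=(), task_name=None, **kwargs):
    """Stand-in: named parameters are consumed here, the rest go to func."""
    data = {name: _FAKE_DATA[name] for name in attrs}
    return func(data, **kwargs)


result = fake_apply_base(
    "tiledb://my_namespace/my_array",  # placeholder URI
    scaled_median,
    [(0, 3)],
    attrs=["a", "b"],
    task_name="median-demo",  # consumed by the call itself
    attr="a",  # forwarded to scaled_median
    scale=10.0,  # forwarded to scaled_median
)
print(result)
```

One consequence of this calling convention is that a UDF parameter sharing its name with one of the documented parameters (for example `name` or `layout`) would be consumed by the call rather than forwarded.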
array_activity
cloud.array.array_activity(uri, async_req=False)
Fetch array activity
Parameters
Name | Type | Description | Default |
---|---|---|---|
uri | | | required |
async_req | | return future instead of results for async support | False |
Returns
Name | Type | Description |
---|---|---|
delete_array
cloud.array.delete_array(uri, *, async_req=False)
Deregister the array from the tiledb cloud service, then delete the physical array from disk.
All access to the array and cloud metadata will be removed.
Parameters
Name | Type | Description | Default |
---|---|---|---|
async_req | | return future instead of results for async support | False |
deregister_array
cloud.array.deregister_array(uri, async_req=False)
Deregister the array from the tiledb cloud service. This does not physically delete the array; it will remain in your bucket. All access to the array and cloud metadata will be removed.
Parameters
Name | Type | Description | Default |
---|---|---|---|
async_req | | return future instead of results for async support | False |
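Since `deregister_array` leaves the data in the bucket while `delete_array` also removes it, a caller may want to make that choice explicit. The sketch below is one way to do so. `remove_array` and its `purge` flag are hypothetical, and the two operations are passed in as callables so the snippet runs locally with stand-ins; with the real library, `tiledb.cloud.array.deregister_array` and `tiledb.cloud.array.delete_array` would presumably be passed instead.

```python
def remove_array(uri, deregister_fn, delete_fn, purge=False):
    """Deregister only by default; also delete the data when purge=True."""
    if purge:
        return delete_fn(uri)
    return deregister_fn(uri)


# Local stand-ins that record which operation would have been called.
calls = []


def fake_deregister(uri):
    calls.append(("deregister", uri))


def fake_delete(uri):
    calls.append(("delete", uri))


uri = "tiledb://my_namespace/scratch"  # placeholder URI
remove_array(uri, fake_deregister, fake_delete)
remove_array(uri, fake_deregister, fake_delete, purge=True)
print(calls)
```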
exec_multi_array_udf
cloud.array.exec_multi_array_udf(*args, **kwargs)
Apply a user-defined function to multiple arrays, synchronously.
All arguments are exactly as in exec_multi_array_udf_base.
exec_multi_array_udf_async
cloud.array.exec_multi_array_udf_async(*args, **kwargs)
Apply a user-defined function to multiple arrays, asynchronously.
All arguments are exactly as in exec_multi_array_udf_base.
exec_multi_array_udf_base
cloud.array.exec_multi_array_udf_base(
    func=None,
    array_list=None,
    namespace=None,
    name=None,
    layout=None,
    image_name='default',
    http_compressor='deflate',
    include_source_lines=True,
    task_name=None,
    result_format=models.ResultFormat.NATIVE,
    result_format_version=None,
    store_results=False,
    stored_param_uuids=(),
    resource_class=None,
    _download_results=True,
    _server_graph_uuid=None,
    _client_node_uuid=None,
    **kwargs
)
Apply a user-defined function to multiple arrays.
Parameters
Name | Type | Description | Default |
---|---|---|---|
func | | The function to run. This can be either a callable function, or the name of a registered user-defined function | None |
array_list | | The list of arrays to run the function on, as an already-built ArrayList object. | None |
namespace | | namespace to run udf under | None |
layout | | Ignored. | None |
image_name | | udf image name to use, useful for testing beta features | 'default' |
http_compressor | | set http compressor for results | 'deflate' |
task_name | str | optional name to assign the task for logging and audit purposes | None |
result_format | ResultFormat | result serialization format | models.ResultFormat.NATIVE |
result_format_version | str | Deprecated and ignored. | None |
store_results | | True to temporarily store results on the server side for later retrieval (in addition to downloading them). | False |
_server_graph_uuid | | If this function is being executed within a DAG, the server-generated ID of the graph's log. Otherwise, None. | None |
_client_node_uuid | | If this function is being executed within a DAG, the ID of this function's node within the graph. Otherwise, None. | None |
resource_class | | The name of the resource class to use. Resource classes define maximum limits for cpu and memory usage. | None |
kwargs | | named arguments to pass to function | {} |
Returns
Name | Type | Description |
---|---|---|
| | A future containing the results of the UDF. |
Example:
>>> import numpy as np
>>> from tiledb.cloud import array
>>> import tiledb.cloud
>>> dense_array = "tiledb://andreas/quickstart_dense_local"
>>> sparse_array = "tiledb://andreas/quickstart_sparse_local"
>>> def median(numpy_ordered_dictionary):
...     return np.median(numpy_ordered_dictionary[0]["a"]) + np.median(numpy_ordered_dictionary[1]["a"])
>>> array_list = array.ArrayList()
>>> array_list.add(dense_array, [(1, 4), (1, 4)], ["a"])
>>> array_list.add(sparse_array, [(1, 2), (1, 4)], ["a"])
>>> namespace = "namespace"
>>> res = array.exec_multi_array_udf(median, array_list, namespace)
>>> print("Median Multi UDF:\n{}\n".format(res))
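The UDF in this example indexes its argument by position, with `[0]` and `[1]` presumably corresponding to the arrays in the order they were added to the ArrayList. A sketch for checking such a function locally before submitting it is shown below. It uses made-up data and a plain list of dicts in place of whatever container the service actually passes, which is an assumption based only on the indexing in the example. `statistics.median` replaces `np.median` solely to keep the snippet free of third-party dependencies.

```python
import statistics


def median_sum(arrays):
    """Same shape as the UDF in the example: one entry per array."""
    return statistics.median(arrays[0]["a"]) + statistics.median(arrays[1]["a"])


# Made-up stand-ins for the data read from the dense and sparse arrays.
dense_data = {"a": [1, 2, 3, 4, 5]}
sparse_data = {"a": [10, 20, 30]}

res = median_sum([dense_data, sparse_data])
print("Median Multi UDF:\n{}\n".format(res))
```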
info
cloud.array.info(uri, async_req=False)
Returns the cloud metadata
Parameters
Name | Type | Description | Default |
---|---|---|---|
async_req | | return future instead of results for async support | False |
Returns
Name | Type | Description |
---|---|---|
| | metadata object |
parse_ranges
cloud.array.parse_ranges(ranges)
Takes a list of the following objects per dimension:
- scalar index
- (start,end) tuple
- list of either of the above types
Parameters
Name | Type | Description | Default |
---|---|---|---|
ranges | | list of (scalar, tuple, list) | required |
builder | | function taking arguments (dim_idx, start, end) | required |
Returns
Name | Type | Description |
---|---|---|
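To make the accepted input shapes concrete, the sketch below walks one entry per dimension and reports each range through a callback taking (dim_idx, start, end), the shape the `builder` parameter is described as having. `expand_ranges` is a hypothetical helper written for illustration, not the library's implementation. In particular, treating a scalar index as a range whose start and end are equal is an assumption.

```python
def expand_ranges(ranges, builder):
    """Call builder(dim_idx, start, end) for every range of every dimension."""
    for dim_idx, entry in enumerate(ranges):
        # A list holds several scalars or tuples; anything else is one item.
        items = entry if isinstance(entry, list) else [entry]
        for item in items:
            if isinstance(item, tuple):
                start, end = item
            else:
                # Assumption: a scalar index selects the single cell [x, x].
                start = end = item
            builder(dim_idx, start, end)


collected = []
expand_ranges(
    [3, (0, 5), [1, (7, 9)]],  # scalar, tuple, list of both
    lambda dim_idx, start, end: collected.append((dim_idx, start, end)),
)
print(collected)
```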
register_array
cloud.array.register_array(
    uri,
    namespace=None,
    array_name=None,
    description=None,
    access_credentials_name=None,
    async_req=False,
    dest_uri=None
)
Register this array with the tiledb cloud service
Parameters
Name | Type | Description | Default |
---|---|---|---|
namespace | str | The user or organization to register the array under. If unset will default to the user | None |
array_name | str | name of array | None |
description | str | optional description | None |
access_credentials_name | str | optional name of access credentials to use, if left blank default for namespace will be used | None |
async_req | | return future instead of results for async support | False |
dest_uri | Optional[str] | If set, the tiledb:// URI of the destination. | None |
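The URIs in this reference appear to follow the pattern `tiledb://<namespace>/<array name>`, judging by examples such as `tiledb://TileDB-Inc/quickstart_dense`. Under that assumption, a small helper built on the standard library can split or compose such a URI, for instance when preparing a value for `dest_uri`. `split_tiledb_uri` and `make_tiledb_uri` are hypothetical names and are not part of the package.

```python
from urllib.parse import urlparse


def split_tiledb_uri(uri):
    """Return (namespace, array_name) for a tiledb://namespace/name URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "tiledb":
        raise ValueError(f"not a tiledb:// URI: {uri!r}")
    name = parsed.path.lstrip("/")
    if not parsed.netloc or not name:
        raise ValueError(f"expected tiledb://<namespace>/<name>, got {uri!r}")
    return parsed.netloc, name


def make_tiledb_uri(namespace, array_name):
    """Compose a URI of the assumed form."""
    return f"tiledb://{namespace}/{array_name}"


print(split_tiledb_uri("tiledb://TileDB-Inc/quickstart_dense"))
print(make_tiledb_uri("my_namespace", "my_array"))
```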
update_file_properties
cloud.array.update_file_properties(
    uri,
    file_type=None,
    file_properties=None,
    async_req=False
)
Update an array to indicate it is a file and has the given properties. Any properties set are returned with the array info.
Parameters
Name | Type | Description | Default |
---|---|---|---|
uri | str | uri of array to update | required |
file_type | str | file type to set | None |
file_properties | dict | dictionary of properties to set | None |
Returns
Name | Type | Description |
---|---|---|
update_info
cloud.array.update_info(
    uri,
    array_name=None,
    description=None,
    access_credentials_name=None,
    tags=None,
    async_req=False
)
Update an array’s info
Parameters
Name | Type | Description | Default |
---|---|---|---|
namespace | str | The username or organization that owns the array. If unset, will use the logged-in user. | required |
array_name | str | name of array to rename to | None |
description | str | optional description | None |
access_credentials_name | str | The access credentials to use when accessing the backing array. Leave unset to not change. | None |
tags | list | tags to update to | None |
file_type | str | array represents given file type | required |
file_properties | str | set file properties on array | required |
async_req | | return future instead of results for async support | False |