object_readers.ObjectReader
vector_search.object_readers.ObjectReader()
Abstract class that can be used to read Objects from different sources and formats.
Methods
Name | Description |
---|---|
get_partitions | Returns a list of ObjectPartitions for the reader. |
init_kwargs | Returns a dictionary containing kwargs that can be used to re-initialize the ObjectReader. |
metadata_array_uri | Returns the URI of a TileDB array that can be used to read Object metadata. |
metadata_attributes | Returns a list of TileDB Attributes describing the metadata of the Objects. |
partition_class_name | Returns the class name of ObjectPartition generated by this ObjectReader. |
read_objects | Reads the objects corresponding to an ObjectPartition. |
read_objects_by_external_ids | Reads the objects corresponding to a list of external_ids. |
get_partitions
vector_search.object_readers.ObjectReader.get_partitions(**kwargs)
Returns a list of ObjectPartitions for the reader. Each partition can be read independently and used for distributed embedding and ingestion.
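As a rough illustration of what "independently readable partitions" can mean, the sketch below splits a fixed number of objects into contiguous index ranges. `RangePartition`, `objects_per_partition`, and the free function `get_partitions` are hypothetical names chosen for this example; they are not part of the library, and a real reader would return its own ObjectPartition subclass.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RangePartition:
    """Hypothetical partition: a half-open range [start, end) of object indices."""

    partition_id: int
    start: int
    end: int


def get_partitions(num_objects: int, objects_per_partition: int) -> List[RangePartition]:
    """Split num_objects into contiguous ranges that can be read independently."""
    if objects_per_partition <= 0:
        raise ValueError("objects_per_partition must be positive")
    partitions = []
    for partition_id, start in enumerate(range(0, num_objects, objects_per_partition)):
        # The last range may be shorter than objects_per_partition.
        end = min(start + objects_per_partition, num_objects)
        partitions.append(RangePartition(partition_id, start, end))
    return partitions


partitions = get_partitions(num_objects=10, objects_per_partition=4)
```

Because the ranges do not overlap and together cover every object, each one could be handed to a separate worker for embedding.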
init_kwargs
vector_search.object_readers.ObjectReader.init_kwargs()
Returns a dictionary containing kwargs that can be used to re-initialize the ObjectReader.
This is used to serialize the ObjectReader and pass it as argument to UDF tasks.
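The round trip this enables might look like the sketch below. `InMemoryReader` is a hypothetical stand-in that does not subclass the real ObjectReader (so the example runs without the library), and the use of JSON as the transport format is an assumption made only for illustration.

```python
import json


class InMemoryReader:
    """Hypothetical reader that stands in for an ObjectReader subclass."""

    def __init__(self, texts, objects_per_partition=2):
        self.texts = list(texts)
        self.objects_per_partition = objects_per_partition

    def init_kwargs(self):
        # Return exactly the constructor arguments, using only plain values
        # so the dictionary can be serialized and shipped to a remote task.
        return {
            "texts": self.texts,
            "objects_per_partition": self.objects_per_partition,
        }


reader = InMemoryReader(["a", "b", "c"])

# Serialize the kwargs (JSON is an assumption here), then rebuild the reader
# on the "other side" by passing them back to the constructor.
payload = json.dumps(reader.init_kwargs())
rebuilt = InMemoryReader(**json.loads(payload))
```

The point of the pattern is that the reader itself never has to be pickled: only its constructor arguments travel, so they should stay small and serializable.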
metadata_array_uri
vector_search.object_readers.ObjectReader.metadata_array_uri()
Returns the URI of a TileDB array that can be used to read Object metadata. This array should have a single external_id dimension, and its attributes should be the TileDB attributes returned by metadata_attributes.
Returns None if a metadata array does not exist and should be materialized by object ingestion.
metadata_attributes
vector_search.object_readers.ObjectReader.metadata_attributes()
Returns a list of TileDB Attributes describing the metadata of the Objects.
Returns None if the Objects have no metadata.
partition_class_name
vector_search.object_readers.ObjectReader.partition_class_name()
Returns the class name of ObjectPartition generated by this ObjectReader.
The ObjectPartition class should be defined in the same Python file as the ObjectReader.
read_objects
vector_search.object_readers.ObjectReader.read_objects(partition)
Reads the objects corresponding to an ObjectPartition.
Returns a tuple containing the object data and the object metadata, in that order. Both are OrderedDicts with a structure similar to TileDB-Py read results. Each should contain at least an external_id dimension that identifies the individual objects.
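The return shape described above can be sketched as follows. Everything here is illustrative: `TEXTS`, the `(start, end)` tuple used as a partition, and the `text` and `length` fields are assumptions, and plain lists are used where a real reader would more likely return NumPy arrays, as TileDB-Py reads do.

```python
from collections import OrderedDict

# Hypothetical in-memory source; the list index doubles as the external_id.
TEXTS = ["red apple", "green pear", "yellow banana"]


def read_objects(partition):
    """Read the objects in a hypothetical (start, end) partition.

    Returns (data, metadata): two column-oriented OrderedDicts that both
    carry an external_id column identifying each object.
    """
    start, end = partition
    external_ids = list(range(start, end))
    texts = TEXTS[start:end]
    data = OrderedDict([("text", texts), ("external_id", external_ids)])
    metadata = OrderedDict(
        [("length", [len(t) for t in texts]), ("external_id", external_ids)]
    )
    return data, metadata


data, metadata = read_objects((1, 3))
```

Keeping external_id in both dictionaries lets a consumer join data and metadata for the same object without relying on row order.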
read_objects_by_external_ids
vector_search.object_readers.ObjectReader.read_objects_by_external_ids(ids)
Reads the objects corresponding to a list of external_ids.
Returns an OrderedDict containing the object data, with a structure similar to TileDB-Py read results.
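A minimal sketch of a lookup by external_id, under stated assumptions: `OBJECTS` is a hypothetical in-memory store, the `text` field is illustrative, and silently skipping unknown ids is a choice made for this example rather than documented library behavior.

```python
from collections import OrderedDict

# Hypothetical store keyed by external_id.
OBJECTS = {10: "red apple", 11: "green pear", 12: "yellow banana"}


def read_objects_by_external_ids(ids):
    """Return the object data for the requested external_ids.

    Results follow the order of the request; ids that are not present in
    the store are skipped (an assumption of this sketch).
    """
    found = [i for i in ids if i in OBJECTS]
    return OrderedDict(
        [("text", [OBJECTS[i] for i in found]), ("external_id", found)]
    )


result = read_objects_by_external_ids([12, 10, 99])
```

Unlike read_objects, only the data dictionary is returned here; the external_id column still lets the caller match each row to its request.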