Create a TileDB dense or sparse array from a given data.frame object
Source: R/DataFrame.R
fromDataFrame.Rd
The supplied data.frame object is (currently) limited to integer,
numeric, or character columns. In addition, three datetime column types
are supported, using the R representations Date, POSIXct and nanotime.
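As a rough pre-flight check, the column classes of a data.frame can be compared against the types listed above before calling fromDataFrame. This is a minimal base R sketch, not part of the package: the helper name unsupported_columns and the sample data are hypothetical, and it assumes that matching on class name is sufficient. Factor columns are reported here too, although the Details section notes they are converted to character.

```r
# Classes named in the description above (assumption: a class-name match
# is a good enough test for this quick check).
supported <- c("integer", "numeric", "character", "Date", "POSIXct", "nanotime")

# Hypothetical helper: return the names of columns whose class is not listed.
unsupported_columns <- function(df) {
  ok <- vapply(df, function(col) inherits(col, supported), logical(1))
  names(df)[!ok]
}

# Illustrative data with one logical column, which is not in the list.
df <- data.frame(
  id = 1:3,
  score = c(0.5, 1.5, 2.5),
  label = c("a", "b", "c"),
  day = as.Date("2024-01-01") + 0:2,
  stamp = as.POSIXct("2024-01-01 00:00:00", tz = "UTC") + 0:2,
  flag = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

bad <- unsupported_columns(df)
print(bad)
```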
Usage
fromDataFrame(
obj,
uri,
col_index = NULL,
sparse = TRUE,
allows_dups = sparse,
cell_order = "COL_MAJOR",
tile_order = "COL_MAJOR",
filter = "ZSTD",
capacity = 10000L,
tile_domain = NULL,
tile_extent = NULL,
mode = c("ingest", "schema_only", "append"),
debug = FALSE
)
Arguments
- obj
A data.frame object.
- uri
A character variable with an Array URI.
- col_index
An optional column index, either numeric with a column index, or character with a column name, designating an index column; default is NULL, implying an index column is added when the array is created.
- sparse
A logical switch to select sparse (the default) or dense
- allows_dups
A logical switch to select if duplicate values are allowed or not, default is the same value as ‘sparse’.
- cell_order
A character variable with one of the TileDB cell order values, default is “COL_MAJOR”.
- tile_order
A character variable with one of the TileDB tile order values, default is “COL_MAJOR”.
- filter
A character vector, defaulting to ‘ZSTD’, naming one or more filters to be applied to each attribute.
- capacity
An integer value with the schema capacity, default is 10000.
- tile_domain
An integer vector or list or NULL. If an integer vector of size two it specifies the integer domain of the row dimension; if a list then a named element is used for the dimension of the same name; or if NULL the row dimension of the obj is used.
- tile_extent
An integer value for the tile extent of the row dimensions; if NULL the row dimension of the obj is used. Note that the tile_extent cannot exceed the tile domain.
- mode
A character variable with possible values ‘ingest’ (for schema creation and data ingestion, the default behavior), ‘schema_only’ (to create the array schema without writing to the newly-created array) and ‘append’ (to only append to an already existing array).
- debug
Logical flag to select additional output.
Details
The created (dense or sparse) array will have as many attributes as there
are columns in the data.frame. Each attribute will be a single column.
For a sparse array, one or more columns have to be designated as dimensions.
At present, factor variables are converted to character.
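If explicit control over the factor conversion is preferred, every factor column can be converted up front in base R. The following is a minimal sketch; the helper name factors_to_character is hypothetical and not part of the package.

```r
# Hypothetical helper: convert every factor column to character,
# leaving all other columns untouched.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(col) if (is.factor(col)) as.character(col) else col)
  df
}

# iris ships with base R; its Species column is a factor.
irisdf <- factors_to_character(iris)
print(class(irisdf$Species))
```

The result keeps the same dimensions and column names as the input, so it can be passed on wherever the original data.frame was expected.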
Examples
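ctx <- tiledb_ctx(limitTileDBCores())
if (FALSE) {
  ## write to a temporary location
  uri <- tempfile()
  ## turn factor into character
  irisdf <- within(iris, Species <- as.character(Species))
  ## create the array and ingest the data (mode = "ingest" is the default)
  fromDataFrame(irisdf, uri)
  ## read the array back as a data.frame
  arr <- tiledb_array(uri, as.data.frame=TRUE, sparse=FALSE)
  newdf <- arr[]
  ## compare with the original data
  all.equal(iris, newdf)
}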
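The mode and col_index arguments described above can be combined to create the schema first and write the data in chunks afterwards. The following is a sketch under several assumptions: the tiledb package is installed, the id column is a hypothetical key added for illustration, and passing the full data.frame in the ‘schema_only’ step lets the domain of the index column cover all rows. Behavior may differ between package versions.

```r
# Sketch only: runs if the tiledb package is available, otherwise skipped.
n_read <- NA_integer_
if (requireNamespace("tiledb", quietly = TRUE)) {
  library(tiledb)
  uri <- tempfile()

  # Hypothetical integer key column 'id' used as the index column.
  df <- data.frame(id = 1:150, within(iris, Species <- as.character(Species)))

  # Step 1: create the array schema without writing any data.
  fromDataFrame(df, uri, col_index = "id", mode = "schema_only")

  # Step 2: append the rows to the existing array in two chunks.
  fromDataFrame(df[1:75, ], uri, col_index = "id", mode = "append")
  fromDataFrame(df[76:150, ], uri, col_index = "id", mode = "append")

  # Read everything back as a data.frame and count the rows.
  arr <- tiledb_array(uri, as.data.frame = TRUE)
  n_read <- nrow(arr[])
}
```

Here n_read stays NA when the package is not installed, so the snippet can be sourced without error in either case.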