Title: DBI Connector to Presto
Version: 1.4.7
Copyright: Meta Platforms, Inc. 2015-present.
Description: Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: https://prestodb.io/.
Depends: R (≥ 3.1.0), methods
Imports: DBI (≥ 0.3.0), httr (≥ 0.6), openssl, jsonlite, stringi, stats, utils, purrr, dplyr (≥ 0.7.0), dbplyr (≥ 2.3.3), tibble, bit64, rlang, lifecycle, lubridate, progress, vctrs
Suggests: testthat, hms, knitr, rmarkdown, withr
License: BSD_3_clause + file LICENSE
URL: https://github.com/prestodb/RPresto
BugReports: https://github.com/prestodb/RPresto/issues
Encoding: UTF-8
Collate: 'PrestoDriver.R' 'Presto.R' 'PrestoSession.R' 'PrestoConnection.R' 'PrestoQuery.R' 'PrestoResult.R' 'RPresto-package.R' 'chunk.R' 'create.dummy.tables.R' 'cte.R' 'dbAppendTable.R' 'dbClearResult.R' 'dbConnect.R' 'dbCreateTable.R' 'dbCreateTableAs.R' 'dbDataType.R' 'dbDisconnect.R' 'dbExistsTable.R' 'dbFetch.R' 'dbGetInfo.R' 'dbGetQuery.R' 'dbGetRowCount.R' 'dbGetRowsAffected.R' 'dbGetStatement.R' 'dbHasCompleted.R' 'dbIsValid.R' 'dbListFields.R' 'dbListTables.R' 'dbQuoteIdentifier.R' 'dbQuoteLiteral.R' 'dbReadTable.R' 'dbRemoveTable.R' 'dbRenameTable.R' 'dbSendQuery.R' 'dbUnloadDriver.R' 'dbWriteTable.R' 'dbplyr-db.R' 'dbplyr-sql.R' 'dbplyr-src.R' 'default.R' 'fetch.R' 'presto.field.R' 'presto.field_utilities.R' 'request_headers.R' 'sqlCreateTable.R' 'sqlCreateTableAs.R' 'zzz.R'
RoxygenNote: 7.3.2
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2025-01-08 01:22:46 UTC; jarodm
Author: Onur Ismail Filiz [aut], Sergey Goder [aut], Jarod G.R. Meng [aut, cre], Thomas J. Leeper [ctb], John Myles White [ctb]
Maintainer: Jarod G.R. Meng <jarodm@fb.com>
Repository: CRAN
Date/Publication: 2025-01-08 05:40:17 UTC

RPresto: DBI Connector to Presto

Description

Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: https://prestodb.io/.

Package options

rpresto.max.rows

Max number of rows to be fetched before a warning is given.

rpresto.quiet

Verbose output during processing? The default value, NA, turns on verbose output for interactive queries that run longer than two seconds. Use FALSE for immediate verbose output, TRUE for quiet operation.
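
As an illustration, both options can be set with base R's options() before any query is run. This is a minimal sketch: the row limit of 100000 is an arbitrary value chosen for the example, not a documented default.

```r
# Set both RPresto options for the current R session, keeping the old values.
# The row limit of 1e5 is an arbitrary value chosen for illustration.
old_opts <- options(rpresto.max.rows = 1e5, rpresto.quiet = TRUE)

# Read the options back with getOption(), as the quiet argument default does.
max_rows <- getOption("rpresto.max.rows")
quiet <- getOption("rpresto.quiet")
print(list(max_rows = max_rows, quiet = quiet))

# Restore whatever was set before.
options(old_opts)
```

Saving the return value of options() and restoring it afterwards keeps the change local to the code that needs it.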

Author(s)

Maintainer: Jarod G.R. Meng jarodm@fb.com

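Authors:

  • Onur Ismail Filiz

  • Sergey Goder

Other contributors:

  • Thomas J. Leeper [contributor]

  • John Myles White [contributor]

See Also

Useful links:

  • https://github.com/prestodb/RPresto

  • Report bugs at https://github.com/prestodb/RPresto/issues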


Connect to a Presto database

Description

Connect to a Presto database

Usage

Presto(...)

## S4 method for signature 'PrestoDriver'
dbConnect(
  drv,
  catalog,
  schema,
  user,
  host = "localhost",
  port = 8080,
  source = methods::getPackageName(),
  session.timezone = "",
  output.timezone = "",
  parameters = list(),
  ctes = list(),
  request.config = httr::config(),
  use.trino.headers = FALSE,
  extra.credentials = "",
  bigint = c("integer", "integer64", "numeric", "character"),
  ...
)

## S4 method for signature 'PrestoConnection'
dbDisconnect(conn)

Arguments

...

currently ignored

drv

A driver object generated by Presto()

catalog

The catalog to be used

schema

The schema to be used

user

The current user

host

The Presto host to connect to

port

Port to use for the connection

source

Source to specify for the connection

session.timezone

Time zone of the Presto server. Presto returns timestamps without time zones with respect to this value. The time arithmetic (e.g. adding hours) will also be done in the given time zone. This value is passed to Presto server via the request headers.

output.timezone

The time zone in which TIME WITH TZ and TIMESTAMP values in the output should be represented. Default to the Presto server time zone (use show(<PrestoConnection>) to see it).

parameters

A list() of extra parameters to be passed in the ‘X-Presto-Session’ header

ctes

[Experimental] A list of common table expressions (CTEs) that can be used in the WITH clause. See vignette("common-table-expressions").

request.config

An optional config list, as returned by httr::config(), to be sent with every HTTP request.

use.trino.headers

A boolean to indicate whether Trino request headers should be used. Default to FALSE.

extra.credentials

Extra credentials to be passed in the X-Presto-Extra-Credential or X-Trino-Extra-Credential header (depending on the value of the use.trino.headers argument). Default to an empty string.

bigint

The R type that Presto's 64-bit integer (BIGINT) class should be translated to. The default is "integer", which returns R's integer type, but results in NA for values above/below +/-2147483647. "integer64" returns a bit64::integer64, which allows the full range of 64 bit integers. "numeric" coerces into R's double type but might result in precision loss. Lastly, "character" casts into R's character type.

conn

A PrestoConnection object

Value

Presto(): a PrestoDriver object.

DBI::dbConnect(): a PrestoConnection object.

DBI::dbDisconnect(): a logical() value indicating success.

Examples

## Not run: 
conn <- dbConnect(Presto(),
  catalog = "hive", schema = "default",
  user = "onur", host = "localhost", port = 8080,
  session.timezone = "US/Eastern", bigint = "character"
)
dbListTables(conn, "%_iris")
dbDisconnect(conn)

## End(Not run)
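
The trade-offs behind the bigint argument can be seen with base R alone. The following sketch does not contact Presto; it only shows the R-side limits that each choice runs into.

```r
# R's integer type is 32-bit, so its largest value is 2147483647.
int_max <- .Machine$integer.max

# A BIGINT beyond that range cannot be stored as an R integer and becomes NA.
overflowed <- suppressWarnings(as.integer(int_max + 1))

# Doubles ("numeric") hold 53 bits of integer precision, so 2^53 + 1
# collapses onto 2^53.
lost_precision <- (2^53 + 1) == 2^53

# A character value keeps every digit, at the cost of needing conversion
# before any arithmetic.
as_text <- "9007199254740993"

print(list(
  int_max = int_max,
  overflowed = overflowed,
  lost_precision = lost_precision,
  as_text = as_text
))
```

When values may exceed the 32-bit range and exact arithmetic matters, "integer64" is the option described above as covering the full 64-bit range.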

S4 implementation of DBIConnection for Presto.

Description

S4 implementation of DBIConnection for Presto.

Usage

## S4 method for signature 'PrestoConnection'
show(object)

## S4 method for signature 'PrestoConnection,ANY,data.frame'
dbAppendTable(conn, name, value, ..., chunk.fields = NULL, row.names = NULL)

## S4 method for signature 'PrestoConnection'
dbCreateTable(
  conn,
  name,
  fields,
  with = NULL,
  ...,
  row.names = NULL,
  temporary = FALSE
)

## S4 method for signature 'PrestoConnection'
dbCreateTableAs(conn, name, sql, overwrite = FALSE, with = NULL, ...)

## S4 method for signature 'PrestoConnection,ANY'
dbExistsTable(conn, name, ...)

## S4 method for signature 'PrestoConnection,character'
dbGetQuery(conn, statement, ..., quiet = getOption("rpresto.quiet"))

## S4 method for signature 'PrestoConnection,ANY'
dbListFields(conn, name, ...)

## S4 method for signature 'PrestoConnection,character'
dbListFields(conn, name, ...)

## S4 method for signature 'PrestoConnection,dbplyr_schema'
dbListFields(conn, name, ...)

## S4 method for signature 'PrestoConnection,Id'
dbListFields(conn, name, ...)

## S4 method for signature 'PrestoConnection,SQL'
dbListFields(conn, name, ...)

## S4 method for signature 'PrestoConnection'
dbListTables(conn, pattern, ...)

## S4 method for signature 'PrestoConnection,dbplyr_schema'
dbQuoteIdentifier(conn, x, ...)

## S4 method for signature 'PrestoConnection,dbplyr_table_path'
dbQuoteIdentifier(conn, x, ...)

## S4 method for signature 'PrestoConnection,AsIs'
dbQuoteIdentifier(conn, x, ...)

## S4 method for signature 'PrestoConnection'
dbQuoteLiteral(conn, x, ...)

## S4 method for signature 'PrestoConnection,ANY'
dbReadTable(conn, name, ...)

## S4 method for signature 'PrestoConnection,character'
dbReadTable(conn, name, ...)

## S4 method for signature 'PrestoConnection,dbplyr_schema'
dbReadTable(conn, name, ...)

## S4 method for signature 'PrestoConnection,Id'
dbReadTable(conn, name, ...)

## S4 method for signature 'PrestoConnection,SQL'
dbReadTable(conn, name, ...)

## S4 method for signature 'PrestoConnection,ANY'
dbRemoveTable(conn, name, ..., fail_if_missing = TRUE)

## S4 method for signature 'PrestoConnection'
dbRenameTable(conn, name, new_name, ...)

## S4 method for signature 'PrestoConnection,character'
dbSendQuery(conn, statement, ..., quiet = getOption("rpresto.quiet"))

## S4 method for signature 'PrestoConnection,ANY,data.frame'
dbWriteTable(
  conn,
  name,
  value,
  overwrite = FALSE,
  ...,
  append = FALSE,
  field.types = NULL,
  temporary = FALSE,
  row.names = FALSE,
  with = NULL,
  chunk.fields = NULL,
  use.one.query = FALSE
)

## S4 method for signature 'PrestoConnection'
sqlCreateTable(
  con,
  table,
  fields,
  row.names = NA,
  temporary = FALSE,
  with = NULL,
  ...
)

## S4 method for signature 'PrestoConnection'
sqlCreateTableAs(con, name, sql, with = NULL, ...)

Arguments

conn

a PrestoConnection object, as returned by DBI::dbConnect().

name

The table name, passed on to dbQuoteIdentifier(). Options are:

  • a character string with the unquoted DBMS table name, e.g. "table_name",

  • a call to Id() with components to the fully qualified table name, e.g. Id(schema = "my_schema", table = "table_name")

  • a call to SQL() with the quoted and fully qualified table name given verbatim, e.g. SQL('"my_schema"."table_name"')

value

A data.frame (or coercible to data.frame).

...

Other parameters passed on to methods.

chunk.fields

A character vector of names of the fields that should be used to slice the value data frame into chunks for batch append. This is necessary when the data frame is too big to be uploaded at once in one single INSERT INTO statement. Default to NULL which inserts the entire value data frame.

row.names

Must be NULL.

fields

Either a character vector or a data frame.

A named character vector: Names are column names, values are types. Names are escaped with dbQuoteIdentifier(). Field types are unescaped.

A data frame: field types are generated using dbDataType().

with

An optional WITH clause for the CREATE TABLE statement.

temporary

If TRUE, will generate a temporary table.

quiet

If a progress bar should be shown for long queries (which run for more than 2 seconds). Default to getOption("rpresto.quiet") which, if not set, defaults to NA, which turns on the progress bar for interactive queries.

pattern

optional SQL pattern for filtering table names, e.g. '%test%'

x

A character vector, SQL or Id object to quote as identifier.

fail_if_missing

If FALSE, dbRemoveTable() succeeds if the table doesn't exist.

use.one.query

A boolean indicating whether to use a single CREATE TABLE AS statement rather than the default implementation, which uses separate CREATE TABLE and INSERT INTO statements. Some Presto backends might have different requirements for the two approaches; for example, INSERT INTO might not be allowed to mutate an unpartitioned table created by CREATE TABLE. If set to TRUE, chunk.fields cannot be used.

con

A database connection.

table

The table name, passed on to dbQuoteIdentifier(). Options are:

  • a character string with the unquoted DBMS table name, e.g. "table_name",

  • a call to Id() with components to the fully qualified table name, e.g. Id(schema = "my_schema", table = "table_name")

  • a call to SQL() with the quoted and fully qualified table name given verbatim, e.g. SQL('"my_schema"."table_name"')
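
The three forms accepted for name and table can be compared without a Presto server. This sketch uses DBI::ANSI() as a stand-in connection, which is an assumption made for illustration; a real PrestoConnection may quote identifiers differently.

```r
library(DBI)

# Stand-in for a real connection (assumption: ANSI quoting rules).
con <- ANSI()

# 1. Unquoted table name given as a character string.
plain <- dbQuoteIdentifier(con, "table_name")

# 2. Fully qualified name built from its components.
qualified <- dbQuoteIdentifier(
  con,
  Id(schema = "my_schema", table = "table_name")
)

# 3. Already quoted name, passed through verbatim.
verbatim <- dbQuoteIdentifier(con, SQL('"my_schema"."table_name"'))

print(plain)
print(qualified)
print(verbatim)
```

Forms 2 and 3 end up as the same quoted string here; Id() is usually the safer choice because the quoting is done for you.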


An S4 class to represent a Presto Driver (and methods). It is used purely for dispatch, and dbUnloadDriver is unnecessary.

Description

An S4 class to represent a Presto Driver (and methods). It is used purely for dispatch, and dbUnloadDriver is unnecessary.

Usage

## S4 method for signature 'PrestoDriver'
show(object)

## S4 method for signature 'PrestoDriver'
dbUnloadDriver(drv, ...)

Class to encapsulate a Presto query

Description

This reference class (so that the object can be passed by reference and modified) encapsulates the lifecycle of a Presto query from its inception (by providing a PrestoConnection and a query statement) to the steps it takes to execute (i.e. an initial POST request and subsequent GET requests).

Details

This is similar to the PrestoQuery class defined in the Presto Python client.

Slots

.conn

A PrestoConnection object

.statement

The query statement

.id

The query ID returned after the first POST request

.timestamp

The timestamp of the query execution

.bigint

How BIGINT fields should be converted to an R class

.state

The query state. This changes every time the query advances to the next stage

.next.uri

The URI that specifies the next endpoint to send the GET request

.info.uri

The information URI

.stats

Query stats. This changes every time the query advances to the next stage

.response

HTTP request response. This changes when the query advances

.content

Parsed content from the HTTP request response

.fetched.row.count

How many rows of data have been fetched to R

.post.data.fetched

A boolean flag indicating if data returned from the POST request has been fetched

.quiet

If a progress bar should be shown for long queries (which run for more than 2 seconds). Default to NA, which turns on the progress bar for interactive queries.
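
The POST-then-GET lifecycle described above can be illustrated with a small simulation. The sketch below makes no HTTP requests: the pages list and the fetch_page() and run_query() helpers are hypothetical stand-ins, and the field names (nextUri, data, stats$state) are modelled on the Presto client REST protocol rather than taken from RPresto's internals.

```r
# Hypothetical canned responses, keyed by URI. A real client would receive
# these from an initial POST request and subsequent GET requests.
pages <- list(
  "post" = list(
    id = "q1", nextUri = "page/1",
    stats = list(state = "QUEUED"), data = NULL
  ),
  "page/1" = list(
    id = "q1", nextUri = "page/2",
    stats = list(state = "RUNNING"), data = list(1, 2)
  ),
  "page/2" = list(
    id = "q1", nextUri = NULL,
    stats = list(state = "FINISHED"), data = list(3)
  )
)

# Stand-in for an HTTP call: look the response up by URI.
fetch_page <- function(uri) pages[[uri]]

run_query <- function() {
  page <- fetch_page("post") # stands in for the initial POST
  rows <- list()
  states <- character()
  repeat {
    states <- c(states, page$stats$state)
    rows <- c(rows, page$data) # appending NULL leaves rows unchanged
    if (is.null(page$nextUri)) break # no next URI: nothing left to fetch
    page <- fetch_page(page$nextUri) # stands in for a GET request
  }
  list(rows = unlist(rows), states = states)
}

result <- run_query()
print(result)
```

The loop mirrors the roles of the .next.uri, .state and .fetched.row.count slots: each response updates the state, contributes rows, and supplies the next URI to follow.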


An S4 class to represent a Presto Result

Description

An S4 class to represent a Presto Result

Usage

## S4 method for signature 'PrestoResult'
show(object)

## S4 method for signature 'PrestoResult'
dbClearResult(res, ...)

## S4 method for signature 'PrestoResult,numeric'
dbFetch(res, n)

## S4 method for signature 'PrestoResult,missing'
dbFetch(res)

## S4 method for signature 'PrestoResult'
dbGetRowCount(res, ...)

## S4 method for signature 'PrestoResult'
dbGetRowsAffected(res)

## S4 method for signature 'PrestoResult'
dbGetStatement(res, ...)

## S4 method for signature 'PrestoResult'
dbHasCompleted(res, ...)

## S4 method for signature 'PrestoResult'
dbIsValid(dbObj, ...)

## S4 method for signature 'PrestoResult,missing'
dbListFields(conn, name)

## S4 method for signature 'PrestoResult,integer'
fetch(res, n = -1, ...)

## S4 method for signature 'PrestoResult,numeric'
fetch(res, n = -1, ...)

## S4 method for signature 'PrestoResult,missing'
fetch(res)

Slots

statement

The SQL statement sent to the database

connection

The connection object associated with the result

query

An internal implementation detail for keeping track of what stage a request is in

post.data

Any data extracted from the POST request response

bigint

How bigint type should be handled


Class to encapsulate a Presto session

Description

A session contains temporary attributes and information that are only useful for the session. It's attached to a PrestoConnection for as long as the connection lives. There are a few types of information stored.

  1. Session properties that can be set via query response headers and need to be sent with following HTTP requests.

  2. Common table expressions (CTEs) that can be used to store a subquery and be used in a WITH statement.

Slots

.parameters

List of Presto session parameters to be added to the X-Presto-Session header.

.ctes

List of common table expressions (CTEs), i.e. SELECT statements with names. They can be used in a WITH statement.
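
For readers unfamiliar with CTEs, the sketch below shows in base R how named SELECT statements combine into a WITH statement. The build_with_query() helper and the table names are hypothetical; RPresto's own rendering of stored CTEs may differ.

```r
# Hypothetical helper: prepend named subqueries to a main query as CTEs.
build_with_query <- function(ctes, main_query) {
  if (length(ctes) == 0) {
    return(main_query)
  }
  # One "name AS (subquery)" definition per CTE, in the order given.
  definitions <- paste0(names(ctes), " AS (", unlist(ctes), ")")
  paste0("WITH ", paste(definitions, collapse = ", "), " ", main_query)
}

# Illustrative CTEs; the second one refers to the first by name.
ctes <- list(
  big_orders = "SELECT * FROM orders WHERE total > 100",
  vip = "SELECT customer_id FROM big_orders GROUP BY customer_id"
)

query <- build_with_query(ctes, "SELECT COUNT(*) FROM vip")
cat(query, "\n")
```

Order matters in this sketch: a CTE can only refer to CTEs defined before it in the same WITH statement.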


Add a chunk field to a data frame

Description

This auxiliary function adds a field, if necessary, to a data frame so that each compartment of the data frame that corresponds to a unique combination of the chunk fields has a size below a certain threshold. The resulting data frame can then be safely used in dbAppendTable() because Presto has a size limit on any discrete INSERT INTO statement.

Usage

add_chunk(
  value,
  base_chunk_fields = NULL,
  chunk_size = 1e+06,
  new_chunk_field_name = "aux_chunk_idx"
)

Arguments

value

The original data frame.

base_chunk_fields

A character vector of existing field names that are used to split the data frame before checking the chunk size.

chunk_size

Maximum size (in bytes) of the VALUES statement encoding each unique chunk. Default to 1,000,000 bytes (i.e. 1 MB).

new_chunk_field_name

A string indicating the new chunk field name. Default to "aux_chunk_idx".

Examples

## Not run: 
# returns the original data frame because it's within size
add_chunk(iris)
# add a new aux_chunk_idx field
add_chunk(iris, chunk_size = 2000)
# the new aux_chunk_idx field is added on top of Species
add_chunk(iris, chunk_size = 2000, base_chunk_fields = c("Species"))

## End(Not run)
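
The idea behind chunking can be sketched in base R. The assign_chunks() helper below is hypothetical and is not how add_chunk() is implemented; it approximates each row's size by the length of its text representation (an assumption) and starts a new chunk whenever the running total would exceed the limit.

```r
# Hypothetical helper: assign a chunk index to each row so that the summed
# row sizes within a chunk stay at or below chunk_size.
# A single row larger than chunk_size still gets a chunk of its own.
assign_chunks <- function(row_sizes, chunk_size) {
  chunk <- integer(length(row_sizes))
  current <- 1L
  used <- 0
  for (i in seq_along(row_sizes)) {
    if (used > 0 && used + row_sizes[i] > chunk_size) {
      current <- current + 1L
      used <- 0
    }
    used <- used + row_sizes[i]
    chunk[i] <- current
  }
  chunk
}

# Approximate each row's size by the character count of its pasted values.
row_text <- do.call(paste, c(iris, sep = ","))
row_sizes <- nchar(row_text)
iris_chunked <- cbind(iris, aux_chunk_idx = assign_chunks(row_sizes, 2000))

# Total approximate size per chunk; each should be no more than 2000.
chunk_totals <- tapply(row_sizes, iris_chunked$aux_chunk_idx, sum)
print(chunk_totals)
```

A data frame prepared this way could then be appended chunk by chunk by naming the index column in the chunk.fields argument of dbAppendTable().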

Create a table in database using a statement

Description

Create a table in database using a statement

Usage

dbCreateTableAs(conn, name, sql, overwrite = FALSE, with = NULL, ...)

Arguments

conn

a PrestoConnection object, as returned by DBI::dbConnect().

name

The table name, passed on to DBI::dbQuoteIdentifier(). Options are:

  • a character string with the unquoted DBMS table name, e.g. "table_name",

  • a call to DBI::Id() with components to the fully qualified table name, e.g. Id(schema = "my_schema", table = "table_name")

  • a call to DBI::SQL() with the quoted and fully qualified table name given verbatim, e.g. SQL('"my_schema"."table_name"')

sql

a character string containing SQL statement.

overwrite

A boolean indicating if an existing table should be overwritten. Default to FALSE.

with

An optional WITH clause for the CREATE TABLE statement.

...

Other arguments used by individual methods.


Return the corresponding Presto data type for the given R object

Description

Return the corresponding Presto data type for the given R object

Usage

## S4 method for signature 'PrestoDriver'
dbDataType(dbObj, obj, ...)

Arguments

dbObj

A PrestoDriver object

obj

Any R object

...

Extra optional parameters, not currently used

Details

The default value for unknown classes is ‘VARCHAR’.

Value

A character value corresponding to the Presto type for obj

Examples

drv <- RPresto::Presto()
dbDataType(drv, 1)
dbDataType(drv, NULL)
dbDataType(drv, as.POSIXct("2015-03-01 00:00:00", tz = "UTC"))
dbDataType(drv, Sys.time())
dbDataType(
  drv,
  list(
    c("a" = 1L, "b" = 2L),
    c("a" = 3L, "b" = 4L)
  )
)
dbDataType(
  drv,
  list(
    c(as.Date("2015-03-01"), as.Date("2015-03-02")),
    c(as.Date("2016-03-01"), as.Date("2016-03-02"))
  )
)
dbDataType(drv, iris)

Metadata about database objects

Description

Metadata about database objects

For the PrestoResult object, the implementation returns the additional stats field which can be used to implement things like progress bars. See the examples section.

Usage

## S4 method for signature 'PrestoDriver'
dbGetInfo(dbObj)

## S4 method for signature 'PrestoConnection'
dbGetInfo(dbObj)

## S4 method for signature 'PrestoResult'
dbGetInfo(dbObj)

Arguments

dbObj

A PrestoDriver, PrestoConnection or PrestoResult object

Value

For a PrestoResult object, a list() with the following elements:

statement

The SQL sent to the database

row.count

Number of rows fetched so far

has.completed

Whether all data has been fetched

stats

Current stats on the query

Examples

## Not run: 
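# Arguments are named to match the dbConnect() signature. The catalog value
# is a placeholder, and "datascience" is assumed to be the schema.
conn <- dbConnect(Presto(),
  catalog = "hive", schema = "datascience",
  user = "onur", host = "localhost", port = 7777
)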
result <- dbSendQuery(conn, "SELECT * FROM jonchang_iris")
iris <- data.frame()
progress.bar <- NULL
while (!dbHasCompleted(result)) {
  chunk <- dbFetch(result)
  if (!NROW(iris)) {
    iris <- chunk
  } else if (NROW(chunk)) {
    iris <- rbind(iris, chunk)
  }
  stats <- dbGetInfo(result)[["stats"]]
  if (is.null(progress.bar)) {
    progress.bar <- txtProgressBar(0, stats[["totalSplits"]], style = 3)
  } else {
    setTxtProgressBar(progress.bar, stats[["completedSplits"]])
  }
}
close(progress.bar)

## End(Not run)

Rename a table

Description

Rename a table

Usage

dbRenameTable(conn, name, new_name, ...)

Arguments

conn

A PrestoConnection.

name

Existing table's name.

new_name

New table name.

...

Extra arguments passed to dbExecute.


S3 implementation of db_desc for Presto.

Description

S3 implementation of db_desc for Presto.

S3 implementation of dplyr::db_data_type() for Presto.

S3 implementation of dplyr::db_explain() for Presto.

S3 implementation of dplyr::db_query_rows() for Presto.

S3 implementation of db_collect for Presto.

S3 implementation of collect for Presto.

S3 implementation of compute for Presto.

Usage

## S3 method for class 'PrestoConnection'
db_desc(x)

## S3 method for class 'PrestoConnection'
db_data_type(con, fields, ...)

## S3 method for class 'PrestoConnection'
db_explain(con, sql, ...)

## S3 method for class 'PrestoConnection'
db_query_rows(con, sql, ...)

## S3 method for class 'PrestoConnection'
db_collect(con, sql, n = -1, warn_incomplete = TRUE, ...)

## S3 method for class 'tbl_presto'
collect(x, ..., n = Inf, warn_incomplete = TRUE)

## S3 method for class 'tbl_presto'
compute(x, name, temporary = FALSE, ..., cte = FALSE)

Arguments

x

A lazy data frame backed by a database query.

cte

[Experimental] An experimental feature to save the query to a common table expression. Default to FALSE. See vignette("common-table-expressions").


dbplyr database methods

Description

dbplyr database methods

Usage

## S3 method for class 'PrestoConnection'
db_list_tables(con)

## S3 method for class 'PrestoConnection'
db_has_table(con, table)

## S3 method for class 'PrestoConnection'
db_write_table(
  con,
  table,
  types,
  values,
  temporary = FALSE,
  overwrite = FALSE,
  ...,
  with = NULL
)

## S3 method for class 'PrestoConnection'
db_copy_to(
  con,
  table,
  values,
  overwrite = FALSE,
  types = NULL,
  temporary = TRUE,
  unique_indexes = NULL,
  indexes = NULL,
  analyze = TRUE,
  ...,
  in_transaction = TRUE,
  with = NULL
)

## S3 method for class 'PrestoConnection'
db_compute(
  con,
  table,
  sql,
  temporary = TRUE,
  unique_indexes = list(),
  indexes = list(),
  analyze = TRUE,
  with = NULL,
  ...
)

## S3 method for class 'PrestoConnection'
db_sql_render(con, sql, ..., use_presto_cte = TRUE)

Arguments

con

A PrestoConnection as returned by dbConnect().

table

Table name

types

Column types. If not provided, column types are inferred using dbDataType.

values

A data.frame.

temporary

If a temporary table should be used. Not supported. Only FALSE is accepted.

overwrite

If an existing table should be overwritten.

...

Extra arguments to be passed to individual methods.

with

An optional WITH clause for the CREATE TABLE statement.

unique_indexes, indexes, analyze, in_transaction

Ignored. Included for compatibility with generics.

sql

A SQL statement.

use_presto_cte

[Experimental] A logical value indicating whether to use common table expressions stored in PrestoConnection when possible. Default to TRUE. See vignette("common-table-expressions").


Report the dbplyr edition used by this package

Description

Report the dbplyr edition used by this package

Usage

## S3 method for class 'PrestoConnection'
dbplyr_edition(con)

Arguments

con

A DBIConnection object.


A dummy PrestoConnection

Description

A dummy PrestoConnection

Usage

dummyPrestoConnection()

Examples

dummyPrestoConnection()

Create dummy tables in Presto for testing

Description

create_primitive_arrays_table() creates a dummy table that has ARRAYs of primitive Presto data types.

create_primitive_maps_table() creates a dummy table that has MAPs of primitive Presto data types.

create_primitive_types_table() creates a dummy table that has primitive Presto data types.

create_primitive_rows_table() creates a dummy table that has all primitive data types included in one ROW type column.

create_array_of_rows_table() creates a dummy table that has an ARRAY(ROW) column that has 2 ROW elements, each containing all 17 supported primitive data types.

create_array_of_maps_table() creates a dummy table that has 17 ARRAY(MAP) columns, each of which has an ARRAY of 2 MAP elements.

Usage

create_primitive_arrays_table(
  con,
  table_name = "presto_primitive_arrays",
  time_zone = "America/New_York",
  verbose = TRUE
)

create_primitive_maps_table(
  con,
  table_name = "presto_primitive_maps",
  time_zone = "America/New_York",
  verbose = TRUE
)

create_primitive_types_table(
  con,
  table_name = "presto_primitive_types",
  time_zone = "America/New_York",
  verbose = TRUE
)

create_primitive_rows_table(
  con,
  table_name = "presto_primitive_rows",
  time_zone = "America/New_York",
  verbose = TRUE
)

create_array_of_rows_table(
  con,
  table_name = "presto_array_of_rows",
  time_zone = "America/New_York",
  verbose = TRUE
)

create_array_of_maps_table(
  con,
  table_name = "presto_array_of_maps",
  time_zone = "America/New_York",
  verbose = TRUE
)

Arguments

con

A valid PrestoConnection object.

table_name

The resulting table name.

time_zone

Time zone string for data types that require a time zone. Default to "America/New_York".

verbose

Boolean indicating whether messages should be printed. Default to TRUE.

Details

We construct the arrays-of-primitive-types table by putting two different values of the same type and a NULL value in an array. In this way, the three values of the same type appear together in the source code and therefore are easier to compare. For integer values, we use the theoretical lower bound (i.e., minimum value) and the theoretical upper bound (i.e., maximum value) as the two values. The field names are taken from the Presto data types they represent.

Here is the complete list of primitive type values included in the table:

Index  Column                  Type                      ARRAY values
1      boolean                 BOOLEAN                   [true, false, null]
2      tinyint                 TINYINT                   [-128, 127, null]
3      smallint                SMALLINT                  [-32768, 32767, null]
4      integer                 INTEGER                   [-2147483647, 2147483647, null]
5      bigint                  BIGINT                    [-9007199254740991, 9007199254740991, null]
6      real                    REAL                      [1.0, 2.0, null]
7      double                  DOUBLE                    [1.0, 2.0, null]
8      decimal                 DECIMAL                   [-9007199254740991.5, 9007199254740991.5, null]
9      varchar                 VARCHAR                   ['abc', 'def', null]
10     char                    CHAR                      ['a', 'b', null]
11     varbinary               VARBINARY                 ['abc', 'def', null]
12     date                    DATE                      ['2000-01-01', '2000-01-02', null]
13     time                    TIME                      ['01:02:03.456', '02:03:04.567', null]
14     time_with_tz            TIME WITH TIME ZONE       ['01:02:03.456 <tz>', '02:03:04.567 <tz>', null]
15     timestamp               TIMESTAMP                 ['2000-01-01 01:02:03.456', '2000-01-02 02:03:04.567', null]
16     timestamp_with_tz       TIMESTAMP WITH TIME ZONE  ['2000-01-01 01:02:03.456 <tz>', '2000-01-02 02:03:04.567 <tz>', null]
17     interval_year_to_month  INTERVAL YEAR TO MONTH    ['14' MONTH, '28' MONTH, null]
18     interval_day_to_second  INTERVAL DAY TO SECOND    ['2 4:5:6.500' DAY TO SECOND, '3 7:8:9.600' DAY TO SECOND, null]

We construct the maps-of-primitive-types table by first creating a table with ARRAYs of all primitive data types. We then use the MAP() function to create the MAPs from ARRAYs.

We construct the primitive-types table by first creating a table with ARRAYs of all primitive data types. We then use Presto's UNNEST() function to expand the arrays into three separate rows. Each supported Presto data type has three rows in the table so that the resulting R data frame is distinctly different from a simple named list.

We construct the primitive-rows table by first creating a table with all primitive data types. We then use Presto's CAST(ROW() AS ROW()) function to create the ROW column.

We construct the array-of-rows table by first creating a table with a ROW type column that includes all 17 supported primitive data types. We then use the ARRAY[] function to construct the 2-element ARRAY(ROW) column.

We construct the array-of-maps table by first creating a primitive MAP table and then calling the ARRAY[] function to create the ARRAY(MAP) columns.


A convenient wrapper around Kerberos config

Description

The configs specify authentication protocol and additional settings.

Usage

kerberos_configs(user = "", password = "", service_name = "presto")

Arguments

user

User name to pass to httr::authenticate(). Default to "".

password

Password to pass to httr::authenticate(). Default to "".

service_name

The service name. Default to "presto".

Value

An httr::config() output that can be passed to the request.config argument of dbConnect().
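
One possible way to wire this into a connection is sketched below. The connect_with_kerberos() wrapper and its default catalog, schema and port values are placeholders chosen for illustration. The function is only defined here, not called, because calling it requires a reachable Kerberos-enabled Presto server.

```r
# Hypothetical wrapper: open a connection that authenticates via Kerberos.
# All argument defaults are placeholders, not recommended settings.
connect_with_kerberos <- function(host,
                                  port = 8080,
                                  catalog = "hive",
                                  schema = "default",
                                  user = Sys.getenv("USER")) {
  DBI::dbConnect(
    RPresto::Presto(),
    catalog = catalog,
    schema = schema,
    user = user,
    host = host,
    port = port,
    # kerberos_configs() returns an httr config, which dbConnect() accepts
    # through its request.config argument.
    request.config = RPresto::kerberos_configs()
  )
}
```

Because the RPresto calls sit inside the function body, defining the wrapper does not need the package or a server; both are needed only when it is called, for example as connect_with_kerberos("http://localhost").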


Check if default database is available.

Description

RPresto examples and tests connect to a default database via dbConnect(Presto(), ...). presto_has_default() checks if that database is available, and if not, displays an informative message.

presto_default() works similarly but returns a connection on success and throws a testthat skip condition on failure, making it suitable for use in tests.

Usage

presto_default(...)

presto_has_default(...)

Arguments

...

Additional arguments passed on to DBI::dbConnect()

Examples

if (presto_has_default()) {
  db <- presto_default()
  print(dbListTables(db))
  dbDisconnect(db)
} else {
  message("No database connection.")
}

Compose query to create a simple table using a statement

Description

Compose query to create a simple table using a statement

Usage

sqlCreateTableAs(con, name, sql, with = NULL, ...)

Arguments

con

A database connection.

name

The table name, passed on to DBI::dbQuoteIdentifier(). Options are:

  • a character string with the unquoted DBMS table name, e.g. "table_name",

  • a call to DBI::Id() with components to the fully qualified table name, e.g. Id(schema = "my_schema", table = "table_name")

  • a call to DBI::SQL() with the quoted and fully qualified table name given verbatim, e.g. SQL('"my_schema"."table_name"')

sql

a character string containing SQL statement.

with

An optional WITH clause for the CREATE TABLE statement.

...

Other arguments used by individual methods.
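
The general shape of a CREATE TABLE AS statement can be sketched with plain string handling. The compose_ctas() helper below is hypothetical and performs no identifier quoting. It also assumes that with carries the full clause text, such as "WITH (format = 'ORC')"; the exact format that sqlCreateTableAs() expects for with is not stated here.

```r
# Hypothetical helper: assemble a CREATE TABLE ... AS statement.
# Assumption: `with`, when given, is the complete WITH clause as text.
compose_ctas <- function(name, sql, with = NULL) {
  # c() drops a NULL `with`, so the clause is simply omitted when absent.
  parts <- c("CREATE TABLE", name, with, "AS", sql)
  paste(parts, collapse = " ")
}

# Table names and the ORC format property are illustrative only.
with_clause <- compose_ctas(
  "my_schema.sales_copy",
  "SELECT * FROM my_schema.sales",
  with = "WITH (format = 'ORC')"
)
without_clause <- compose_ctas(
  "my_schema.sales_copy",
  "SELECT * FROM my_schema.sales"
)

cat(with_clause, "\n")
cat(without_clause, "\n")
```

In real use the table name should go through dbQuoteIdentifier(), as described for the name argument, rather than being pasted in directly.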


S3 implementation of sql_query_fields for Presto.

Description

S3 implementation of sql_query_fields for Presto.

S3 implementation of custom escape method for sql_escape_date

S3 implementation of custom escape method for sql_escape_datetime

S3 implementation of sql_translation for Presto.

Usage

## S3 method for class 'PrestoConnection'
sql_query_fields(con, sql, ...)

## S3 method for class 'PrestoConnection'
sql_escape_date(con, x)

## S3 method for class 'PrestoConnection'
sql_escape_datetime(con, x)

## S3 method for class 'PrestoConnection'
sql_translation(con)

dbplyr SQL methods

Description

dbplyr SQL methods

Usage

## S3 method for class 'PrestoConnection'
sql_query_save(con, sql, name, temporary = TRUE, ..., with = NULL)

Arguments

con

A database connection.

sql

a character string containing SQL statement.

name

The table name, passed on to DBI::dbQuoteIdentifier(). Options are:

  • a character string with the unquoted DBMS table name, e.g. "table_name",

  • a call to DBI::Id() with components to the fully qualified table name, e.g. Id(schema = "my_schema", table = "table_name")

  • a call to DBI::SQL() with the quoted and fully qualified table name given verbatim, e.g. SQL('"my_schema"."table_name"')

temporary

If a temporary table should be created. Default to TRUE in the dbplyr::sql_query_save() generic. The default value generates an error in Presto. Use temporary = FALSE to save the query in a permanent table.

...

Other arguments used by individual methods.

with

An optional WITH clause for the CREATE TABLE statement.


dplyr integration to connect to a Presto database.

Description

Allows you to connect to an existing database through a Presto connection.

Usage

src_presto(
  catalog = NULL,
  schema = NULL,
  user = NULL,
  host = NULL,
  port = NULL,
  source = NULL,
  session.timezone = NULL,
  parameters = NULL,
  bigint = c("integer", "integer64", "numeric", "character"),
  con = NULL,
  ...
)

Arguments

catalog

Catalog to use in the connection

schema

Schema to use in the connection

user

User name to use in the connection

host

Host name to connect to the database

port

Port number to use with the host name

source

Source to specify for the connection

session.timezone

Time zone for the connection

parameters

Additional parameters to pass to the connection

bigint

The R type that Presto's 64-bit integer (BIGINT) types should be translated to. The default is "integer", which returns R's integer type, but results in NA for values above/below +/-2147483647. "integer64" returns a bit64::integer64, which allows the full range of 64 bit integers. "numeric" coerces into R's double type but might result in precision loss. Lastly, "character" casts into R's character type.

con

An object that inherits from PrestoConnection, typically generated by DBI::dbConnect. When a valid connection object is supplied, other arguments are ignored.

...

For src_presto, other arguments passed on to the underlying database connector, dbConnect. For tbl.src_presto, it is included for compatibility with the generic, but otherwise ignored.

Examples

## Not run: 
# To connect to a database
my_db <- src_presto(
  catalog = "memory",
  schema = "default",
  user = Sys.getenv("USER"),
  host = "http://localhost",
  port = 8080,
  session.timezone = "Asia/Kathmandu"
)
# Use a PrestoConnection
my_con <- DBI::dbConnect(
  RPresto::Presto(),
  catalog = "memory",
  schema = "default",
  user = Sys.getenv("USER"),
  host = "http://localhost",
  port = 8080,
  session.timezone = "Asia/Kathmandu"
)
my_db2 <- src_presto(con = my_con)

## End(Not run)

dplyr integration to connect to a table in a database.

Description

Use src_presto to connect to an existing database, and tbl to connect to tables within that database. If you're unsure of the arguments to pass, please ask your database administrator for the values of these variables.

Automatically create a Presto remote database source to wrap around the PrestoConnection object via which DBI APIs can be called.

Usage

## S3 method for class 'src_presto'
tbl(src, from, ..., vars = NULL)

## S3 method for class 'PrestoConnection'
tbl(src, from, ...)

## S3 method for class 'src_presto'
copy_to(
  dest,
  df,
  name = deparse(substitute(df)),
  overwrite = FALSE,
  types = NULL,
  temporary = FALSE,
  unique_indexes = NULL,
  indexes = NULL,
  analyze = FALSE,
  ...,
  in_transaction = FALSE,
  with = NULL
)

## S3 method for class 'PrestoConnection'
copy_to(
  dest,
  df,
  name = deparse(substitute(df)),
  overwrite = FALSE,
  types = NULL,
  temporary = FALSE,
  unique_indexes = NULL,
  indexes = NULL,
  analyze = FALSE,
  ...,
  in_transaction = FALSE,
  with = NULL
)

Arguments

src

A PrestoConnection object produced by DBI::dbConnect().

from

Either a string (giving a table name) or a literal dbplyr::sql() string.

...

Passed on to dbplyr::tbl_sql()

vars

Provide column names as a character vector to avoid retrieving them from the database.

dest

remote data source

df

local data frame

name

name for new remote table.

overwrite

If TRUE, will overwrite an existing table with name name. If FALSE, will throw an error if name already exists.

with

An optional WITH clause for the CREATE TABLE statement.

Examples

## Not run: 
# First create a database connection with src_presto, then reference a tbl
# within that database
my_db <- src_presto(
  catalog = "memory",
  schema = "default",
  user = Sys.getenv("USER"),
  host = "http://localhost",
  port = 8080,
  session.timezone = "Asia/Kathmandu"
)
my_tbl <- tbl(my_db, "my_table")

## End(Not run)
## Not run: 
# First create a database connection, then reference a tbl within that
# database
my_con <- DBI::dbConnect(
  RPresto::Presto(),
  catalog = "memory",
  schema = "default",
  user = Sys.getenv("USER"),
  host = "http://localhost",
  port = 8080,
  session.timezone = "Asia/Kathmandu"
)
my_tbl <- tbl(my_con, "my_table")

## End(Not run)
