Title: | Rectangle Nested Lists |
Version: | 0.3.1 |
Description: | A tool to rectangle a nested list, that is to convert it into a tibble. This is done automatically or according to a given specification. A common use case is for nested lists coming from parsing JSON files or the JSON response of REST APIs. It is supported by the 'vctrs' package and therefore offers a wide support of vector types. |
License: | GPL-3 |
URL: | https://github.com/mgirlich/tibblify, https://mgirlich.github.io/tibblify/ |
BugReports: | https://github.com/mgirlich/tibblify/issues |
Depends: | R (≥ 3.6.0) |
Imports: | cli (≥ 3.6.2), lifecycle (≥ 1.0.4), purrr (≥ 1.0.2), rlang (≥ 1.1.3), tibble (≥ 3.2.1), tidyselect (≥ 1.2.0), vctrs (≥ 0.6.5), withr (≥ 2.5.2) |
Suggests: | covr (≥ 3.6.1), jsonlite (≥ 1.8.0), knitr (≥ 1.40), memoise (≥ 2.0.1), rmarkdown (≥ 2.16), spelling (≥ 2.2), testthat (≥ 3.1.4), yaml (≥ 2.3.6) |
LinkingTo: | vctrs (≥ 0.6.5) |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
Language: | en-US |
LazyData: | true |
NeedsCompilation: | yes |
RoxygenNote: | 7.2.3 |
Packaged: | 2024-01-11 07:07:59 UTC; mgirlich |
Author: | Maximilian Girlich [aut, cre, cph], Kirill Müller [ctb] |
Maintainer: | Maximilian Girlich <maximilian.girlich@outlook.com> |
Repository: | CRAN |
Date/Publication: | 2024-01-11 07:30:02 UTC |
tibblify: Rectangle Nested Lists
Description
A tool to rectangle a nested list, that is to convert it into a tibble. This is done automatically or according to a given specification. A common use case is for nested lists coming from parsing JSON files or the JSON response of REST APIs. It is supported by the 'vctrs' package and therefore offers a wide support of vector types.
Author(s)
Maintainer: Maximilian Girlich maximilian.girlich@outlook.com [copyright holder]
Other contributors:
Kirill Müller [contributor]
See Also
Useful links:
- https://github.com/mgirlich/tibblify
- https://mgirlich.github.io/tibblify/
- Report bugs at https://github.com/mgirlich/tibblify/issues
Printing tibblify specifications
Description
Printing tibblify specifications
Usage
## S3 method for class 'tspec'
print(x, width = NULL, ..., names = NULL)
## S3 method for class 'tspec_df'
format(x, width = NULL, ..., names = NULL)
Arguments
x |
Spec to format or print |
width |
Width of text output to generate. |
... |
These dots are for future extensions and must be empty. |
names |
Should names be printed even if they can be deduced from the spec? |
Value
x is returned invisibly.
Examples
spec <- tspec_df(
a = tib_int("a"),
new_name = tib_chr("b"),
row = tib_row(
"row",
x = tib_int("x")
)
)
print(spec, names = FALSE)
print(spec, names = TRUE)
Examine the column specification
Description
Examine the column specification
Usage
get_spec(x)
Arguments
x |
The data frame object to extract from. |
Value
A tibblify specification object.
Examples
df <- tibblify(list(list(x = 1, y = "a"), list(x = 2)))
get_spec(df)
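A spec recovered with get_spec() can be reused on new data of the same shape. The following is a minimal sketch, assuming the guessed spec marks y as optional because it is missing from the second element; new_data is a made-up input used only for illustration.

```r
library(tibblify)

# Rectangle once and let tibblify guess the specification
df <- tibblify(list(list(x = 1, y = "a"), list(x = 2)))
spec <- get_spec(df)

# Hypothetical new input with the same fields (y missing in the second element)
new_data <- list(list(x = 3, y = "b"), list(x = 4))
new_df <- tibblify(new_data, spec)
new_df
```

Reusing the stored spec avoids guessing again, so the column types stay the same across batches.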
GitHub Repositories
Description
A dataset containing some basic information about some GitHub repositories.
Usage
gh_repos
Format
A list of lists.
GitHub Users
Description
A dataset containing some basic information about six GitHub users.
Usage
gh_users
Format
A list of lists.
Game of Thrones POV characters
Description
The data is from the repurrrsive package.
Usage
got_chars
Format
An unnamed list with 30 components, each representing a POV character. Each character's component is a named list of length 18, containing information such as name, aliases, and house allegiances.
Details
Info on the point-of-view (POV) characters from the first five books in the Song of Ice and Fire series by George R. R. Martin. Retrieved from An API Of Ice And Fire.
Source
Examples
got_chars
str(lapply(got_chars, `[`, c("name", "culture")))
Guess the tibblify() Specification
Description
Use guess_tspec() if you don't know the input type.
Use guess_tspec_df() if the input is a data frame or an object list.
Use guess_tspec_object() if the input is an object.
Usage
guess_tspec(
x,
...,
empty_list_unspecified = FALSE,
simplify_list = FALSE,
inform_unspecified = should_inform_unspecified(),
call = rlang::current_call()
)
guess_tspec_df(
x,
...,
empty_list_unspecified = FALSE,
simplify_list = FALSE,
inform_unspecified = should_inform_unspecified(),
call = rlang::current_call(),
arg = rlang::caller_arg(x)
)
guess_tspec_object(
x,
...,
empty_list_unspecified = FALSE,
simplify_list = FALSE,
call = rlang::current_call()
)
Arguments
x |
A nested list. |
... |
These dots are for future extensions and must be empty. |
empty_list_unspecified |
Treat empty lists as unspecified? |
simplify_list |
Should scalar lists be simplified to vectors? |
inform_unspecified |
Inform about fields whose type could not be determined? |
call |
The execution environment of a currently running function, e.g.
|
arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
Value
A specification object that can be used in tibblify().
Examples
guess_tspec(list(x = 1, y = "a"))
guess_tspec(list(list(x = 1), list(x = 2)))
guess_tspec(gh_users)
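The difference between the object and the data frame variants can be seen by guessing both specs and passing them to tibblify(). This is a small sketch with made-up inputs (obj and objs are illustrative names, not package data).

```r
library(tibblify)

# A single object: a named list of fields
obj <- list(id = 1L, name = "a")
spec_obj <- guess_tspec_object(obj)

# An object list: an unnamed list of such objects
objs <- list(obj, list(id = 2L, name = "b"))
spec_df <- guess_tspec_df(objs)

res_obj <- tibblify(obj, spec_obj)  # a list, following the object spec
res_df <- tibblify(objs, spec_df)   # a tibble with one row per object
res_df
```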
Convert a data frame to a tree
Description
Convert a data frame to a tree
Usage
nest_tree(data, id_col, parent_col, children_to)
Arguments
data |
A data frame. |
id_col |
Id column. The values must be unique and non-missing. |
parent_col |
Parent column. Each value must either be missing (for the
root elements) or appear in the |
children_to |
Name of the column in which the children should be put. |
Value
A tree-like data frame.
Examples
df <- tibble::tibble(
id = 1:5,
x = letters[1:5],
parent = c(NA, NA, 1L, 2L, 4L)
)
out <- nest_tree(df, id, parent, "children")
out
out$children
out$children[[2]]$children
Parse an OpenAPI spec
Description
Use parse_openapi_spec() to parse an OpenAPI spec or use parse_openapi_schema() to parse an OpenAPI schema.
Usage
parse_openapi_spec(file)
parse_openapi_schema(file)
Arguments
file |
Either a path to a file, a connection, or literal data (a single string). |
Value
For parse_openapi_spec() a data frame with the columns
- endpoint <character> Name of the endpoint.
- operation <character> The http operation; one of "get", "put", "post", "delete", "options", "head", "patch", or "trace".
- status_code <character> The http status code. May contain wildcards like 2xx for all response codes between 200 and 299.
- media_type <character> The media type.
- spec <list> A list of tibblify specifications.
For parse_openapi_schema() a tibblify spec.
Examples
file <- '{
"$schema": "http://json-schema.org/draft-04/schema",
"title": "Starship",
"description": "A vehicle.",
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The name of this vehicle. The common name, e.g. Sand Crawler."
},
"model": {
"type": "string",
"description": "The model or official name of this vehicle."
},
"url": {
"type": "string",
"format": "uri",
"description": "The hypermedia URL of this resource."
},
"edited": {
"type": "string",
"format": "date-time",
"description": "the ISO 8601 date format of the time this resource was edited."
}
},
"required": [
"name",
"model",
"edited"
]
}'
parse_openapi_schema(file)
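The status_code column returned by parse_openapi_spec() may contain wildcards such as 2xx. One way to match such a pattern against a concrete response code is sketched below in base R; status_matches() is a hypothetical helper and not part of tibblify.

```r
# Hypothetical helper: does a status code pattern such as "2xx" cover `code`?
status_matches <- function(pattern, code) {
  # Turn every "x" into a digit class, e.g. "2xx" -> "^2[0-9][0-9]$"
  regex <- paste0("^", gsub("x", "[0-9]", tolower(pattern), fixed = TRUE), "$")
  grepl(regex, as.character(code))
}

status_matches("2xx", 204)
status_matches("2xx", 404)
status_matches("404", 404)
```

A helper like this could be used to pick the row of the parsed spec whose status_code covers the response actually received.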
Politicians
Description
A dataset containing some basic information about some politicians.
Usage
politicians
Format
A list of lists.
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
Determine whether to inform about unspecified fields in spec
Description
Wrapper around getOption("tibblify.show_unspecified") that implements some fallback logic if the option is unset. The possible return values are listed under Details.
Usage
should_inform_unspecified()
Details
- TRUE if the option is set to TRUE
- FALSE if the option is set to FALSE
- FALSE if the option is unset and we appear to be running tests
- TRUE otherwise
Value
TRUE or FALSE.
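The fallback rules listed under Details can be sketched in base R as follows. This is an illustration only, not the package's actual implementation; in particular, detecting a test run through the TESTTHAT environment variable is an assumption, and sketch_should_inform() is a hypothetical name.

```r
# Hypothetical re-implementation of the documented fallback logic
sketch_should_inform <- function() {
  opt <- getOption("tibblify.show_unspecified")
  if (isTRUE(opt)) return(TRUE)    # option explicitly set to TRUE
  if (isFALSE(opt)) return(FALSE)  # option explicitly set to FALSE
  # Option unset: stay quiet when tests appear to be running.
  # Assumption: testthat sets the TESTTHAT environment variable to "true".
  !identical(Sys.getenv("TESTTHAT"), "true")
}

# Set the option temporarily, then restore the previous state
old <- options(tibblify.show_unspecified = FALSE)
sketch_should_inform()
options(old)
```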
Create a Field Specification
Description
Use these functions to specify how to convert the fields of an object.
Usage
tib_unspecified(key, ..., required = TRUE)
tib_scalar(
key,
ptype,
...,
required = TRUE,
fill = NULL,
ptype_inner = ptype,
transform = NULL
)
tib_lgl(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = logical(),
transform = NULL
)
tib_int(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = integer(),
transform = NULL
)
tib_dbl(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = double(),
transform = NULL
)
tib_chr(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = character(),
transform = NULL
)
tib_date(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = vctrs::new_date(),
transform = NULL
)
tib_chr_date(key, ..., required = TRUE, fill = NULL, format = "%Y-%m-%d")
tib_vector(
key,
ptype,
...,
required = TRUE,
fill = NULL,
ptype_inner = ptype,
transform = NULL,
elt_transform = NULL,
input_form = c("vector", "scalar_list", "object"),
values_to = NULL,
names_to = NULL
)
tib_lgl_vec(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = logical(),
transform = NULL,
elt_transform = NULL,
input_form = c("vector", "scalar_list", "object"),
values_to = NULL,
names_to = NULL
)
tib_int_vec(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = integer(),
transform = NULL,
elt_transform = NULL,
input_form = c("vector", "scalar_list", "object"),
values_to = NULL,
names_to = NULL
)
tib_dbl_vec(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = double(),
transform = NULL,
elt_transform = NULL,
input_form = c("vector", "scalar_list", "object"),
values_to = NULL,
names_to = NULL
)
tib_chr_vec(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = character(),
transform = NULL,
elt_transform = NULL,
input_form = c("vector", "scalar_list", "object"),
values_to = NULL,
names_to = NULL
)
tib_date_vec(
key,
...,
required = TRUE,
fill = NULL,
ptype_inner = vctrs::new_date(),
transform = NULL,
elt_transform = NULL,
input_form = c("vector", "scalar_list", "object"),
values_to = NULL,
names_to = NULL
)
tib_chr_date_vec(
key,
...,
required = TRUE,
fill = NULL,
input_form = c("vector", "scalar_list", "object"),
values_to = NULL,
names_to = NULL,
format = "%Y-%m-%d"
)
tib_variant(
key,
...,
required = TRUE,
fill = NULL,
transform = NULL,
elt_transform = NULL
)
tib_recursive(.key, ..., .children, .children_to = .children, .required = TRUE)
tib_row(.key, ..., .required = TRUE)
tib_df(.key, ..., .required = TRUE, .names_to = NULL)
Arguments
key , .key |
The path to the field in the object. |
... |
These dots are for future extensions and must be empty. |
required , .required |
Throw an error if the field does not exist? |
ptype |
A prototype of the desired output type of the field. |
fill |
Optionally, a value to use if the field does not exist. |
ptype_inner |
A prototype of the field. |
transform |
A function to apply to the whole vector after casting to
|
format |
Optional, a string passed to the |
elt_transform |
A function to apply to each element before casting
to |
input_form |
A string that describes what structure the field has. Can be one of:
|
values_to |
Can be one of the following:
|
names_to |
Can be one of the following:
|
.children |
A string giving the name of field that contains the children. |
.children_to |
A string giving the column name to store the children. |
.names_to |
A string giving the name of the column which will contain
the names of elements of the object list. If |
Details
There are basically five different tib_*() functions:
- tib_scalar(ptype): Cast the field to a length one vector of type ptype.
- tib_vector(ptype): Cast the field to an arbitrary length vector of type ptype.
- tib_variant(): Cast the field to a list.
- tib_row(): Cast the field to a named list.
- tib_df(): Cast the field to a tibble.
There are some special shortcuts of tib_scalar() and tib_vector(), respectively, for the most common prototypes:
- logical(): tib_lgl() and tib_lgl_vec()
- integer(): tib_int() and tib_int_vec()
- double(): tib_dbl() and tib_dbl_vec()
- character(): tib_chr() and tib_chr_vec()
- Date: tib_date() and tib_date_vec()
Further, there is also a special shortcut for dates encoded as character: tib_chr_date() and tib_chr_date_vec(), respectively.
Value
A tibblify field collector.
Examples
tib_int("int")
tib_int("int", required = FALSE, fill = 0)
tib_scalar("date", Sys.Date(), transform = function(x) as.Date(x, format = "%Y-%m-%d"))
tib_df(
"data",
.names_to = "id",
age = tib_int("age"),
name = tib_chr("name")
)
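The effect of required and fill is easiest to see when a field is missing from some elements. The sketch below uses made-up data in which score is absent from the second object; with required = FALSE and fill = 0 the missing value is expected to be replaced by 0 rather than raising an error.

```r
library(tibblify)

# Illustrative input: `score` is missing from the second object
x <- list(
  list(id = 1L, name = "Tyrion", score = 1.5),
  list(id = 2L, name = "Victarion")
)

spec <- tspec_df(
  tib_int("id"),
  tib_chr("name"),
  # Optional field: use 0 when `score` does not exist
  tib_dbl("score", required = FALSE, fill = 0)
)

out <- tibblify(x, spec)
out
```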
Rectangle a nested list
Description
Rectangle a nested list
Usage
tibblify(x, spec = NULL, unspecified = NULL)
Arguments
x |
A nested list. |
spec |
A specification how to convert |
unspecified |
A string that describes what happens if the specification contains unspecified fields. Can be one of
|
Value
Either a tibble or a list, depending on the specification.
See Also
Use untibblify() to undo the result of tibblify().
Examples
# List of Objects -----------------------------------------------------------
x <- list(
list(id = 1, name = "Tyrion Lannister"),
list(id = 2, name = "Victarion Greyjoy")
)
tibblify(x)
# Provide a specification
spec <- tspec_df(
id = tib_int("id"),
name = tib_chr("name")
)
tibblify(x, spec)
# Object --------------------------------------------------------------------
# Provide a specification for a single object
tibblify(x[[1]], tspec_object(spec))
# Recursive Trees -----------------------------------------------------------
x <- list(
list(
id = 1,
name = "a",
children = list(
list(id = 11, name = "aa"),
list(id = 12, name = "ab", children = list(
list(id = 121, name = "aba")
))
))
)
spec <- tspec_recursive(
tib_int("id"),
tib_chr("name"),
.children = "children"
)
out <- tibblify(x, spec)
out
out$children
out$children[[1]]$children[[2]]
Combine multiple specifications
Description
Combine multiple specifications
Usage
tspec_combine(...)
Arguments
... |
Specifications to combine. |
Value
A tibblify specification.
Examples
# union of fields
tspec_combine(
tspec_df(tib_int("a")),
tspec_df(tib_chr("b"))
)
# unspecified + x -> x
tspec_combine(
tspec_df(tib_unspecified("a"), tib_chr("b")),
tspec_df(tib_int("a"), tib_variant("b"))
)
# scalar + vector -> vector
tspec_combine(
tspec_df(tib_chr("a")),
tspec_df(tib_chr_vec("a"))
)
# scalar/vector + variant -> variant
tspec_combine(
tspec_df(tib_chr("a")),
tspec_df(tib_variant("a"))
)
Create a Tibblify Specification
Description
Use tspec_df() to specify how to convert a list of objects to a tibble.
Use tspec_row() or tspec_object() to specify how to convert an object to a one-row tibble or a list, respectively.
Usage
tspec_df(
...,
.input_form = c("rowmajor", "colmajor"),
.names_to = NULL,
vector_allows_empty_list = FALSE
)
tspec_object(
...,
.input_form = c("rowmajor", "colmajor"),
vector_allows_empty_list = FALSE
)
tspec_recursive(
...,
.children,
.children_to = .children,
.input_form = c("rowmajor", "colmajor"),
vector_allows_empty_list = FALSE
)
tspec_row(
...,
.input_form = c("rowmajor", "colmajor"),
vector_allows_empty_list = FALSE
)
Arguments
... |
Column specification created by |
.input_form |
The input form of data frame like lists. Can be one of:
|
.names_to |
A string giving the name of the column which will contain
the names of elements of the object list. If |
vector_allows_empty_list |
Should empty lists for |
.children |
A string giving the name of field that contains the children. |
.children_to |
A string giving the column name to store the children. |
Details
In column major format all fields are required, regardless of the required argument.
Value
A tibblify specification.
Examples
tspec_df(
id = tib_int("id"),
name = tib_chr("name"),
aliases = tib_chr_vec("aliases")
)
# To create multiple columns of the same type use the bang-bang-bang (!!!)
# operator together with `purrr::map()`
tspec_df(
!!!purrr::map(purrr::set_names(c("id", "age")), tib_int),
!!!purrr::map(purrr::set_names(c("name", "title")), tib_chr)
)
# The `tspec_*()` functions can also be nested
spec1 <- tspec_object(
int = tib_int("int"),
chr = tib_chr("chr")
)
spec2 <- tspec_object(
int2 = tib_int("int2"),
chr2 = tib_chr("chr2")
)
tspec_df(spec1, spec2)
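For the column major input form mentioned in Details, the input is a list of columns rather than a list of rows. A minimal sketch, assuming a small made-up input in which every field is present, as the column major format requires:

```r
library(tibblify)

# Column major input: one element per field, each holding all values
x_cols <- list(
  id = 1:2,
  name = c("a", "b")
)

spec_cm <- tspec_df(
  tib_int("id"),
  tib_chr("name"),
  .input_form = "colmajor"
)

out_cm <- tibblify(x_cols, spec_cm)
out_cm
```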
Unnest a recursive data frame
Description
Unnest a recursive data frame
Usage
unnest_tree(
data,
id_col,
child_col,
level_to = "level",
parent_to = "parent",
ancestors_to = NULL
)
Arguments
data |
A data frame. |
id_col |
A column that uniquely identifies each observation. |
child_col |
Column containing the children of an observation. This must
be a list where each element is either |
level_to |
A string ( |
parent_to |
A string ( |
ancestors_to |
A string ( |
Value
A data frame.
Examples
df <- tibble(
id = 1L,
name = "a",
children = list(
tibble(
id = 11:12,
name = c("b", "c"),
children = list(
NULL,
tibble(
id = 121:122,
name = c("d", "e")
)
)
)
)
)
unnest_tree(
df,
id_col = "id",
child_col = "children",
level_to = "level",
parent_to = "parent",
ancestors_to = "ancestors"
)
Unpack a tibblify specification
Description
Unpack a tibblify specification
Usage
unpack_tspec(
spec,
...,
fields = NULL,
recurse = TRUE,
names_sep = NULL,
names_repair = c("unique", "universal", "check_unique", "unique_quiet",
"universal_quiet"),
names_clean = NULL
)
camel_case_to_snake_case(names)
Arguments
spec |
A tibblify specification. |
... |
These dots are for future extensions and must be empty. |
fields |
A string of the fields to unpack. |
recurse |
Should fields be unpacked recursively? |
names_sep |
If |
names_repair |
Used to check that output data frame has valid names. Must be one of the following options:
See |
names_clean |
A function to clean names after repairing. For example
use |
names |
Names to clean |
Value
A tibblify spec.
Examples
spec <- tspec_df(
tib_lgl("a"),
tib_row("x", tib_int("b"), tib_chr("c")),
tib_row("y", tib_row("z", tib_chr("d")))
)
unpack_tspec(spec)
# only unpack `x`
unpack_tspec(spec, fields = "x")
# do not unpack the fields in `y`
unpack_tspec(spec, recurse = FALSE)
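The names_sep and names_clean arguments control how the unpacked fields are named. The following sketch uses hypothetical field names (userInfo, firstName) to show both arguments; the printed result is left for the reader to inspect rather than stated here.

```r
library(tibblify)

# Illustrative spec with camelCase names nested inside a row
spec_cc <- tspec_df(
  tib_lgl("isActive"),
  tib_row("userInfo", tib_chr("firstName"), tib_chr("lastName"))
)

# Combine outer and inner names with a separator when unpacking
sep_spec <- unpack_tspec(spec_cc, names_sep = "_")
sep_spec

# Clean the resulting names with the helper documented above
clean_spec <- unpack_tspec(spec_cc, names_clean = camel_case_to_snake_case)
clean_spec

# The helper can also be applied to plain character vectors
snake <- camel_case_to_snake_case("firstName")
snake
```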
Convert a data frame or object into a nested list
Description
The inverse operation to tibblify(). It converts a data frame or an object into a nested list.
Usage
untibblify(x, spec = NULL)
Arguments
x |
A data frame or an object. |
spec |
Optional. A spec object which was used to create |
Value
A nested list.
Examples
x <- tibble(
a = 1:2,
b = tibble(
x = c("a", "b"),
y = c(1.5, 2.5)
)
)
untibblify(x)
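Since untibblify() is described as the inverse of tibblify(), a round trip illustrates the relationship. This is a sketch on made-up data; it does not claim that every input survives the round trip unchanged.

```r
library(tibblify)

# Illustrative nested list with one object per row
nested <- list(
  list(id = 1L, name = "a"),
  list(id = 2L, name = "b")
)

spec <- tspec_df(tib_int("id"), tib_chr("name"))

# Nested list -> tibble -> nested list
df <- tibblify(nested, spec)
back <- untibblify(df)
str(back)
```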