Title: | Generate Fake Datasets for Prototyping and Teaching |
Version: | 1.0.0 |
Description: | Create fake datasets that can be used for prototyping and teaching. This package provides a set of functions to generate fake data for a variety of data types, such as dates, addresses, and names. It can be used for prototyping (notably in 'shiny') or as a tool to teach data manipulation and data visualization. |
License: | MIT + file LICENSE |
URL: | https://github.com/Thinkr-open/fakir |
BugReports: | https://github.com/Thinkr-open/fakir/issues |
Depends: | R (≥ 2.10) |
Imports: | attempt, charlatan, dplyr, glue, lubridate, magrittr, purrr, stats, tibble, tidyr, withr |
Suggests: | covr, ggplot2, knitr, pkgdown, rmarkdown, sf, testthat |
VignetteBuilder: | knitr |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | no |
Packaged: | 2023-04-12 19:30:30 UTC; colinfay |
Author: | Colin Fay |
Maintainer: | Colin Fay <contact@colinfay.me> |
Repository: | CRAN |
Date/Publication: | 2023-04-13 11:20:02 UTC |
fakir: Generate Fake Datasets for Prototyping and Teaching
Description
Create fake datasets that can be used for prototyping and teaching. This package provides a set of functions to generate fake data for a variety of data types, such as dates, addresses, and names. It can be used for prototyping (notably in 'shiny') or as a tool to teach data manipulation and data visualization.
Author(s)
Maintainer: Colin Fay contact@colinfay.me (ORCID)
Authors:
Sebastien Rochette sebastien@thinkr.fr (ORCID)
Other contributors:
ThinkR [copyright holder]
See Also
Useful links:
https://github.com/Thinkr-open/fakir
Report bugs at https://github.com/Thinkr-open/fakir/issues
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Arguments
lhs |
A value or the magrittr placeholder. |
rhs |
A function call using the magrittr semantics. |
Value
The result of calling 'rhs(lhs)'.
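As a brief illustration of the usage above, the sketch below pipes a vector into a function and compares the result with a direct call. It loads magrittr explicitly so that it is self-contained; the vector values are arbitrary.

```r
library(magrittr)

# lhs %>% rhs evaluates rhs with lhs as its first argument
piped <- c(1, 4, 9) %>% sqrt()

# The same computation written as a direct call
direct <- sqrt(c(1, 4, 9))

# Both forms should give the same result
identical(piped, direct)
```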
Create a fake base of clients
Description
A fake base of clients, such as the customers of a support ticket system
Usage
fake_base_clients(n, local = c("en_US", "fr_FR"), seed = 2811)
Arguments
n |
the number of clients |
local |
the locale of the base. Currently supported: "fr_FR" and "en_US". |
seed |
the random seed, default is 2811 |
Value
A dataframe of fake clients.
Examples
fake_base_clients(n = 10)
fake_base_clients(n = 10, local = "fr_FR")
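The seed argument suggests that output is reproducible for a given seed. The sketch below is one way to check this; the seed values 42 and 1 are arbitrary choices, and the comparison assumes generation depends only on the seed, which the documentation above does not state explicitly.

```r
library(fakir)

# Two calls with the same (arbitrary) seed
clients_a <- fake_base_clients(n = 5, seed = 42)
clients_b <- fake_base_clients(n = 5, seed = 42)

# Should be TRUE if the output depends only on the seed (an assumption)
identical(clients_a, clients_b)

# A different seed gives an independent draw
clients_c <- fake_base_clients(n = 5, seed = 1)
```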
Fake base of products
Description
Fake base of products
Usage
fake_products(n, seed = 2811)
Arguments
n |
Number of Products to generate |
seed |
the random seed, default is 2811 |
Value
A dataframe of fake products.
Examples
fake_products(10)
Transport survey base
Description
Create a fake transport survey
Usage
fake_survey_people(n, seed = 2811, local = c("fr_FR"))
fake_sondage_people(...)
fake_survey_answers(n = 200, x, seed = 2811, split = FALSE, local = c("fr_FR"))
fake_sondage_answers(...)
Arguments
n |
Number of surveyed individuals |
seed |
sets the random seed |
local |
the locale of the base. Currently supported: "fr_FR" and "en_US". |
x |
Optional. Fake client database with an "age" column |
split |
Logical. Whether to split the database into individuals and answers |
Details
id_individu. Unique identifier of each person, following the pattern "ID-AAAA-1111".
sexe. Sex, coded as c("F" = "Female", "M" = "Male", "O" = "Other"). Some values are missing.
age. Age. Some values are missing.
region. Some regions have NA values that may be filled with a left_join on the fra_sf dataset. Some regions are more represented than others.
id_departement. Number identifying the French department.
nom_departement. Name of the department. Some departments have NA values that may be filled using id_departement.
question_date. Date and time at which the questionnaire was answered.
year. Year extracted from question_date.
type. Three types for each individual: travail (work), commerces (shops), loisirs (leisure).
distance_km. Average distance (km) to the target location. Distance is related to age.
transport. Means of transport used to reach the target location. Depends on distance.
time_travel_hours. Average travel time (hours) to the target location. Depends on distance and transport.
Value
A dataframe of fake survey results.
Examples
fake_survey_people(10)
answers <- fake_sondage_answers()
if (FALSE) {
  # Not run: requires the 'ggplot2' package
  library(ggplot2)
  ggplot(answers) +
    aes(age, log(distance_km), colour = type) +
    geom_point() +
    geom_smooth() +
    facet_wrap(~type, scales = "free_y")
}
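For teaching data manipulation, the survey answers can also be summarised without plotting. The following is a minimal sketch, assuming the default (unsplit) output contains the type and distance_km columns used in the plot example; the result names n_answers and mean_distance_km are introduced here for illustration only.

```r
library(fakir)
library(dplyr)

# Default call, as in the example above
answers <- fake_survey_answers()

# Number of answers and mean distance per destination type;
# na.rm = TRUE guards against possible missing distances
distance_by_type <- answers %>%
  group_by(type) %>%
  summarise(
    n_answers = n(),
    mean_distance_km = mean(distance_km, na.rm = TRUE)
  )

distance_by_type
```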
Client ticket base
Description
A fake client base of telecom support tickets
Usage
fake_ticket_client(
vol,
x,
n = 200,
split = FALSE,
seed = 2811,
local = c("en_US", "fr_FR")
)
Arguments
vol |
the number of tickets to return |
x |
Optional. Fake client database |
n |
Number of clients in the client database if x is not provided |
split |
should the base be split in two? |
seed |
sets the random seed |
local |
the locale of the base. Currently supported: "fr_FR" and "en_US". |
Details
Same client can have multiple tickets
Some clients are more sampled than others
Some types are more sampled than others
Some states are more sampled than others
Value
A dataframe of fake tickets.
Examples
x <- fake_ticket_client(1000, split = TRUE)
plot(x$clients$entry_date, x$clients$fidelity_points)
barplot(table(x$tickets$type))
barplot(table(x$tickets$state))
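The uneven sampling described in Details can also be inspected as a table rather than a bar plot. This is a small sketch, assuming that split = TRUE returns a list with a tickets element containing a type column, as the examples above imply; the names ticket_base and tickets_by_type are illustrative.

```r
library(fakir)
library(dplyr)

# Split output: a list holding the clients and the tickets
ticket_base <- fake_ticket_client(vol = 500, split = TRUE)

# Count tickets per type, most frequent first
tickets_by_type <- ticket_base$tickets %>%
  count(type, sort = TRUE)

tickets_by_type
```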
Fake user feedback
Description
Fake user feedback
Usage
fake_user_feedback(
n,
seed = 2811,
from = "2012-01-01 00:00:01",
to = "2020-01-01 00:00:01"
)
Arguments
n |
Number of feedback entries to generate |
seed |
the random seed, default is 2811 |
from , to |
the start and end of the date range to cover |
Value
A dataframe of fake user feedback.
Examples
fake_user_feedback(10)
Create a fake base of web visits
Description
Create a fake base of web visits
Usage
fake_visits(
from = "2017-01-01",
to = "2017-12-31",
local = c("en_US", "fr_FR"),
seed = 2811
)
Arguments
from , to |
the start and end of the date range to cover |
local |
the locale of the base. Currently supported: "fr_FR" and "en_US". |
seed |
sets the random seed |
Value
A dataframe of fake web visits.
Examples
fake_visits()
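The from and to arguments can restrict the period covered. The sketch below requests a single month; the dates are arbitrary and use the same "YYYY-MM-DD" format as the defaults shown in Usage.

```r
library(fakir)

# Fake web visits restricted to January 2017
january_visits <- fake_visits(from = "2017-01-01", to = "2017-01-31")

# Inspect the first rows
head(january_visits)
```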
Map of France
Description
A map of France as an sf object. It can be used as a dataset or to draw maps.
Usage
fra_sf
Format
A data frame with 96 rows, 5 variables and a spatial geometry (MULTIPOLYGON):
- OBJECTID
polygon identifier
- pays
country: France
- region
region name
- departement
department name
- id_dpt
department id
- geometry
polygon geometry
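As one possible use of this dataset for maps, the sketch below draws the departments coloured by region. It assumes the suggested packages 'sf' and 'ggplot2' are installed; the name region_map is illustrative.

```r
library(fakir)
library(sf)
library(ggplot2)

# One polygon per department, filled by the region column
region_map <- ggplot(fra_sf) +
  geom_sf(aes(fill = region)) +
  theme_void()

region_map
```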