Title: | Standardized Git Repository Data |
Version: | 2.3.4 |
Description: | Obtain standardized data from multiple 'Git' services, including 'GitHub' and 'GitLab'. Designed to be 'Git' service-agnostic, this package assists teams with activities spread across various 'Git' platforms by providing a unified way to access repository data. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
URL: | https://r-world-devs.github.io/GitStats/, https://github.com/r-world-devs/GitStats |
BugReports: | https://github.com/r-world-devs/GitStats/issues |
Imports: | cli, dplyr, glue, httr2, lubridate (≥ 1.8.0), magrittr, rlang (≥ 1.1.0), R6, purrr (≥ 1.0.0), stringr |
Suggests: | spelling, knitr, mockery, rmarkdown, testthat (≥ 3.0.0), withr |
Config/testthat/edition: | 3 |
VignetteBuilder: | knitr |
Language: | en-US |
Depends: | R (≥ 4.1.0) |
NeedsCompilation: | no |
Packaged: | 2025-07-07 08:45:00 UTC; banasm |
Author: | Maciej Banas [aut, cre], Kamil Koziej [aut], Karolina Marcinkowska [aut], Matt Secrest [aut] |
Maintainer: | Maciej Banas <banasmaciek@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-07-07 10:30:10 UTC |
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Arguments
lhs |
A value or the magrittr placeholder. |
rhs |
A function call using the magrittr semantics. |
Value
The result of calling rhs(lhs).
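As a brief illustration of the semantics described above, the sketch below pipes a value through two base R functions. It assumes the magrittr package is attached; the variable name total is only illustrative.

```r
library(magrittr)

# lhs %>% rhs passes lhs as the first argument of rhs,
# so this pipeline is equivalent to sum(sqrt(c(1, 4, 9))).
total <- c(1, 4, 9) %>%
  sqrt() %>%
  sum()
```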
Create a GitStats object
Description
Create a GitStats object
Usage
create_gitstats()
Value
A GitStats object.
Examples
my_gitstats <- create_gitstats()
Get data on commits
Description
List all commits from all repositories for an organization or a vector of repositories.
Usage
get_commits(
gitstats,
since = NULL,
until = Sys.Date() + lubridate::days(1),
cache = TRUE,
verbose = is_verbose(gitstats),
progress = verbose
)
Arguments
gitstats |
A GitStats object. |
since |
A starting date. |
until |
An end date. |
cache |
A logical, if set to |
verbose |
A logical, |
progress |
A logical, by default set to |
Value
A table of tibble and gitstats_commits classes.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
repos = c("openpharma/DataFakeR", "openpharma/visR")
) %>%
set_gitlab_host(
token = Sys.getenv("GITLAB_PAT_PUBLIC"),
orgs = "mbtests"
)
get_commits(my_gitstats, since = "2018-01-01")
## End(Not run)
Get commits statistics
Description
Prepare statistics from the pulled commits data.
Usage
get_commits_stats(
commits,
time_aggregation = c("year", "month", "week", "day"),
group_var
)
Arguments
commits |
A |
time_aggregation |
A character, specifying time aggregation of statistics. |
group_var |
Other grouping variable to be passed to |
Details
For this function to work, you first need to pull commits data with GitStats. See the Examples section.
Value
A table of commits_stats class.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
repos = c("r-world-devs/GitStats", "openpharma/visR")
)
get_commits(my_gitstats, since = "2022-01-01") |>
get_commits_stats(
time_aggregation = "year",
group_var = author
)
## End(Not run)
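To make the idea of time aggregation with a grouping variable more concrete, here is a minimal base R sketch on a mock table. It is not GitStats's implementation: the column names committed_date, author, stats_year and commits_n are assumptions chosen for illustration, and the real output of get_commits() may be structured differently.

```r
# Hypothetical commits table; column names are assumptions for illustration.
commits <- data.frame(
  committed_date = as.Date(c("2022-03-01", "2022-07-15", "2023-01-10", "2023-02-20")),
  author = c("alice", "bob", "alice", "alice")
)

# Derive the aggregation period (here: calendar year) from the commit date.
commits$stats_year <- format(commits$committed_date, "%Y")

# Count commits per year and per author (the grouping variable).
commits_by_year <- aggregate(
  commits_n ~ stats_year + author,
  data = transform(commits, commits_n = 1L),
  FUN = sum
)

# Sort by period, then by author, for a stable presentation.
commits_by_year <- commits_by_year[
  order(commits_by_year$stats_year, commits_by_year$author),
]
```

Replacing "%Y" with "%Y-%m" would give a monthly aggregation in the same way.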
Get files
Description
Pulls text files and their content.
Usage
get_files(
gitstats,
pattern = NULL,
depth = Inf,
file_path = NULL,
cache = TRUE,
verbose = is_verbose(gitstats),
progress = verbose
)
Arguments
gitstats |
A GitStats object. |
pattern |
A regular expression. If defined, it pulls content of all
files in a repository matching this pattern reaching to the level of
directories defined by |
depth |
Defines level of directories to retrieve files from. E.g. if set
to |
file_path |
A specific path to file(s) in repositories. May be a
character vector if multiple files are to be pulled. If defined, the
function pulls content of this specific |
cache |
A logical, if set to |
verbose |
A logical, |
progress |
A logical, by default set to |
Details
get_files() may be used in two ways: either with the pattern argument (and optionally depth) or with the file_path argument.
In the first scenario, GitStats first pulls the file structure corresponding to the passed pattern and depth arguments and then pulls the content of all of these files. In the second scenario, GitStats pulls only the content of the files at the specified file_path in each repository.
If the user wants to pull a particular file or files, the file_path approach is the better choice: it is faster because it skips pulling the whole file structure from the repository. For example, to pull the content of README.md and/or NEWS.md files placed in the root directories of the repositories, use the file_path approach, since the paths of the files are already known precisely.
On the other hand, if the user wants to pull a specific type of file (e.g. all .md or .Rmd files in the repository) without knowing their paths, the pattern approach is recommended. It triggers GitStats to find all matching files in the repository down to the given level of directories (depth argument) and then pull their content.
The latter approach is slower than the former but may be more useful depending on the user's goals. Both approaches return data in the same format: a tibble with data on files, namely their path and their content.
Value
A data.frame.
Examples
## Not run:
git_stats <- create_gitstats() |>
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
orgs = c("r-world-devs")
) |>
set_gitlab_host(
token = Sys.getenv("GITLAB_PAT_PUBLIC"),
orgs = "mbtests"
)
rmd_files <- get_files(
gitstats = git_stats,
pattern = "\\.Rmd",
depth = 2L
)
app_files <- get_files(
gitstats = git_stats,
file_path = c("R/app.R", "R/ui.R", "R/server.R")
)
## End(Not run)
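The interplay of pattern and depth described in Details can be pictured with a small base R sketch that filters a vector of file paths. This is only an illustration, not how GitStats works internally: the helper filter_paths() is hypothetical, and it assumes that depth counts directory levels below the repository root, with root-level files at level 0.

```r
# Hypothetical file tree of a repository.
paths <- c(
  "README.md",
  "vignettes/intro.Rmd",
  "inst/doc/deep/notes.Rmd",
  "R/app.R"
)

# Hypothetical helper: keep paths that match `pattern` and lie no deeper
# than `depth` directory levels (assumption: root-level files are level 0).
filter_paths <- function(paths, pattern, depth = Inf) {
  # Directory level = number of "/" separators in the path.
  level <- lengths(regmatches(paths, gregexpr("/", paths, fixed = TRUE)))
  paths[level <= depth & grepl(pattern, paths)]
}

rmd_shallow <- filter_paths(paths, "\\.Rmd$", depth = 1)
rmd_all <- filter_paths(paths, "\\.Rmd$")
```

With file_path, no such search is needed because the paths are given up front, which is why that approach avoids pulling the file structure.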
Get data on issues
Description
List all issues from all repositories for an organization or a vector of repositories.
Usage
get_issues(
gitstats,
since = NULL,
until = Sys.Date() + lubridate::days(1),
state = NULL,
cache = TRUE,
verbose = is_verbose(gitstats),
progress = verbose
)
Arguments
gitstats |
A GitStats object. |
since |
A starting date. |
until |
An end date. |
state |
An optional character, by default |
cache |
A logical, if set to |
verbose |
A logical, |
progress |
A logical, by default set to |
Value
A table of tibble and gitstats_issues classes.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
repos = c("openpharma/DataFakeR", "openpharma/visR")
) %>%
set_gitlab_host(
token = Sys.getenv("GITLAB_PAT_PUBLIC"),
orgs = "mbtests"
)
get_issues(my_gitstats, since = "2018-01-01", state = "open")
## End(Not run)
Get issues statistics
Description
Prepare statistics from the pulled issues data.
Usage
get_issues_stats(
issues,
time_aggregation = c("year", "month", "week", "day"),
group_var
)
Arguments
issues |
A |
time_aggregation |
A character, specifying time aggregation of statistics. |
group_var |
Other grouping variable to be passed to |
Details
For this function to work, you first need to pull issues data with GitStats. See the Examples section.
Value
A table of issues_stats class.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
repos = c("r-world-devs/GitStats", "openpharma/visR")
)
get_issues(my_gitstats, since = "2022-01-01") |>
get_issues_stats(
time_aggregation = "month",
group_var = state
)
## End(Not run)
Get data on organizations
Description
Pulls data on all organizations from a Git host and parses it into table format.
Usage
get_orgs(gitstats, cache = TRUE, verbose = is_verbose(gitstats))
Arguments
gitstats |
A GitStats object. |
cache |
A logical, if set to |
verbose |
A logical, |
Value
A data.frame.
Examples
## Not run:
my_gitstats <- create_gitstats() |>
set_github_host(
orgs = c("r-world-devs", "openpharma"),
token = Sys.getenv("GITHUB_PAT")
) |>
set_gitlab_host(
orgs = "mbtests",
token = Sys.getenv("GITLAB_PAT_PUBLIC")
)
get_orgs(my_gitstats)
## End(Not run)
Get release logs
Description
Pull release logs from repositories.
Usage
get_release_logs(
gitstats,
since = NULL,
until = Sys.Date() + lubridate::days(1),
cache = TRUE,
verbose = is_verbose(gitstats),
progress = verbose
)
Arguments
gitstats |
A GitStats object. |
since |
A starting date. |
until |
An end date. |
cache |
A logical, if set to |
verbose |
A logical, |
progress |
A logical, by default set to |
Value
A data.frame.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
orgs = c("r-world-devs", "openpharma")
)
get_release_logs(my_gitstats, since = "2024-01-01")
## End(Not run)
Get data on repositories
Description
Pulls data on all repositories for an organization or individual user, or on those with a given text in code blobs (with_code parameter) or a given file (with_files parameter), and parses it into table format.
Usage
get_repos(
gitstats,
add_contributors = TRUE,
with_code = NULL,
in_files = NULL,
with_files = NULL,
cache = TRUE,
verbose = is_verbose(gitstats),
progress = verbose
)
Arguments
gitstats |
A GitStats object. |
add_contributors |
A logical parameter to decide whether to add
information about repositories' contributors to the repositories output
(table). If set to |
with_code |
A character vector, if defined, GitStats will pull repositories with specified code phrases in code blobs. |
in_files |
A character vector of file names. Works when |
with_files |
A character vector, if defined, GitStats will pull repositories with specified files. |
cache |
A logical, if set to |
verbose |
A logical, |
progress |
A logical, by default set to |
Value
A data.frame.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
orgs = c("r-world-devs", "openpharma")
) %>%
set_gitlab_host(
token = Sys.getenv("GITLAB_PAT_PUBLIC"),
orgs = "mbtests"
)
get_repos(my_gitstats)
get_repos(my_gitstats, add_contributors = FALSE)
get_repos(my_gitstats, with_code = "Shiny", in_files = "renv.lock")
get_repos(my_gitstats, with_files = "DESCRIPTION")
## End(Not run)
Get data on files trees across repositories
Description
Pulls the file tree (structure) of each repository. File trees are then stored as character vectors in the files_tree column of the output table.
Usage
get_repos_trees(
gitstats,
pattern = NULL,
depth = Inf,
cache = TRUE,
verbose = is_verbose(gitstats),
progress = verbose
)
Arguments
gitstats |
A GitStats object. |
pattern |
A regular expression. If defined, it pulls structure of files
in a repository matching this pattern reaching to the level of directories
defined by |
depth |
Defines level of directories to reach for files structure from.
E.g. if set to |
cache |
A logical, if set to |
verbose |
A logical, |
progress |
A logical, by default set to |
Value
A tibble.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
orgs = c("r-world-devs", "openpharma")
)
get_repos_trees(
gitstats = my_gitstats,
pattern = "\\.md"
)
## End(Not run)
Get repository URLs
Description
Pulls a vector of repository URLs (web or API): either all URLs for an organization or those of repositories with a given text in code blobs (with_code parameter) or a given file (with_files parameter).
Usage
get_repos_urls(
gitstats,
type = "api",
with_code = NULL,
in_files = NULL,
with_files = NULL,
cache = TRUE,
verbose = is_verbose(gitstats),
progress = verbose
)
Arguments
gitstats |
A GitStats object. |
type |
A character, choose if |
with_code |
A character vector, if defined, |
in_files |
A character vector of file names. Works when |
with_files |
A character vector, if defined, GitStats will pull repositories with specified files. |
cache |
A logical, if set to |
verbose |
A logical, |
progress |
A logical, by default set to |
Value
A character vector.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
orgs = c("r-world-devs", "openpharma")
) %>%
set_gitlab_host(
token = Sys.getenv("GITLAB_PAT_PUBLIC"),
orgs = "mbtests"
)
get_repos_urls(my_gitstats, with_files = c("DESCRIPTION", "LICENSE"))
## End(Not run)
Get data from GitStats storage
Description
Retrieves all data, or a particular type of data (see the storage parameter), pulled earlier with GitStats.
Usage
get_storage(gitstats, storage = NULL)
Arguments
gitstats |
A GitStats object. |
storage |
A character, type of the data you want to get from storage:
|
Value
A list of tibbles (if storage is set to NULL) or a tibble (if storage is defined).
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
orgs = c("r-world-devs", "openpharma")
)
get_release_logs(my_gitstats, since = "2024-01-01")
get_repos(my_gitstats)
release_logs <- get_storage(
gitstats = my_gitstats,
storage = "release_logs"
)
## End(Not run)
Get users data
Description
Get users data
Usage
get_users(gitstats, logins, cache = TRUE, verbose = is_verbose(gitstats))
Arguments
gitstats |
A GitStats object. |
logins |
A character vector of logins. |
cache |
A logical, if set to |
verbose |
A logical, |
Value
A data.frame.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
token = Sys.getenv("GITHUB_PAT"),
orgs = c("r-world-devs")
) %>%
set_gitlab_host(
token = Sys.getenv("GITLAB_PAT_PUBLIC"),
orgs = "mbtests"
)
get_users(my_gitstats, c("maciekabanas", "marcinkowskak"))
## End(Not run)
Is verbose mode switched on
Description
Is verbose mode switched on
Usage
is_verbose(gitstats)
Arguments
gitstats |
A GitStats object. |
Set GitHub host
Description
Set GitHub host
Usage
set_github_host(
gitstats,
host = NULL,
token = NULL,
orgs = NULL,
repos = NULL,
verbose = is_verbose(gitstats),
.error = TRUE
)
Arguments
gitstats |
A GitStats object. |
host |
A character, optional, URL name of the host. If not passed, a public host will be used. |
token |
A token. |
orgs |
An optional character vector of organisations. If you pass it,
|
repos |
An optional character vector of repositories full names
(organization and repository name, e.g. "r-world-devs/GitStats"). If you
pass it, |
verbose |
A logical, |
.error |
A logical to control if passing wrong input
( |
Details
If you define neither orgs nor repos, GitStats will be set to scan the whole Git platform (such as an enterprise version of GitHub or GitLab), unless it is a public platform. For a public platform (like GitHub), you need to define orgs or repos, as scanning through all organizations may take a large amount of time.
Value
A GitStats object with added information on host.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_github_host(
orgs = c("r-world-devs", "openpharma", "pharmaverse")
)
## End(Not run)
Set GitLab host
Description
Set GitLab host
Usage
set_gitlab_host(
gitstats,
host = NULL,
token = NULL,
orgs = NULL,
repos = NULL,
verbose = is_verbose(gitstats),
.error = TRUE
)
Arguments
gitstats |
A GitStats object. |
host |
A character, optional, URL name of the host. If not passed, a public host will be used. |
token |
A token. |
orgs |
An optional character vector of organisations. If you pass it,
|
repos |
An optional character vector of repositories full names
(organization and repository name, e.g. "r-world-devs/GitStats"). If you
pass it, |
verbose |
A logical, |
.error |
A logical to control if passing wrong input
( |
Details
If you define neither orgs nor repos, GitStats will be set to scan the whole Git platform (such as an enterprise version of GitHub or GitLab), unless it is a public platform. For a public platform (like GitHub), you need to define orgs or repos, as scanning through all organizations may take a large amount of time.
Value
A GitStats object with added information on host.
Examples
## Not run:
my_gitstats <- create_gitstats() %>%
set_gitlab_host(
token = Sys.getenv("GITLAB_PAT_PUBLIC"),
orgs = "mbtests"
)
## End(Not run)
Show organizations set in GitStats
Description
Retrieves organizations set or pulled by GitStats. Especially helpful when the user is scanning a whole Git platform and wants to get a glimpse of the organizations.
Usage
show_orgs(gitstats)
Arguments
gitstats |
A GitStats object. |
Value
A vector of organizations.
Switch off verbose mode
Description
Stop printing messages and output.
Usage
verbose_off(gitstats)
Arguments
gitstats |
A GitStats object. |
Value
A GitStats object.
Switch on verbose mode
Description
Print all messages and output.
Usage
verbose_on(gitstats)
Arguments
gitstats |
A GitStats object. |
Value
A GitStats object.