| Type: | Package | 
| Title: | Report on Diversity and Inclusion in a Corporate Setting | 
| Version: | 0.3.1 | 
| Maintainer: | Philippe J.S. De Brouwer <philippe@de-brouwer.com> | 
| License: | AGPL (≥ 3) | 
| URL: | http://www.de-brouwer.com/div/ | 
| BugReports: | https://github.com/DrPhilippeDB/div/issues/ | 
| Description: | Facilitate the analysis of teams in a corporate setting: assess the diversity per grade and job, present the results, search for bias (in hiring and/or promoting processes). It also provides methods to simulate the effect of bias, random team-data, etc. White paper: 'Philippe J.S. De Brouwer' (2021) http://www.de-brouwer.com/assets/div/div-white-paper.pdf. Book (chapter 36): 'Philippe J.S. De Brouwer' (2020, ISBN:978-1-119-63272-6) and 'Philippe J.S. De Brouwer' (2020) <doi:10.1002/9781119632757>. | 
| Encoding: | UTF-8 | 
| Collate: | 'headers.R' 'diversity.R' 'div_conf_colour.R' 'div_fake_team.R' 'div_ci_median.R' 'div_paygap.R' 'div_parse_paygap.R' 'div_round_paygap.R' 'div_gauge_plot.R' 'div_plot_paygap_distribution.R' 'div_add_median_label.R' 'print.paygap.R' 'summary.paygap.R' | 
| Depends: | R (≥ 3.4.0), tidyverse | 
| Imports: | rlang, dplyr, tibble, tidyr, stringr, magrittr, ggplot2, gridExtra, plotly, pryr, rpart, kableExtra | 
| Suggests: | flexdashboard, knitr, rmarkdown, grid, lattice | 
| RoxygenNote: | 7.1.1 | 
| NeedsCompilation: | no | 
| Repository: | CRAN | 
| Packaged: | 2021-05-04 19:02:09 UTC; philippe | 
| Author: | Philippe J.S. De Brouwer [aut, cre] | 
| Date/Publication: | 2021-05-06 08:00:02 UTC | 
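Adds a column with new labels (H)igh and (L)ow for a given colName (within a given grade and jobID)
Description
This function adds a column that labels each observation as belonging to the lower or the upper half of the values of colName, where the split is made within each grade and jobID.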
Usage
div_add_median_label(
  d,
  colName = "age",
  value1 = "T",
  value2 = "F",
  newColName = "isYoung"
)
Arguments
d | 
 tibble, a tibble with team data columns as defined in the documentation (at least the column colName (as set by next parameter), 'grade', and 'jobID')  | 
colName | 
 character, the name of the column that contains the values to be split into a lower and an upper half (defaults to 'age')  | 
value1 | 
 character, the label to be used for the first half of observations (the smallest ones)  | 
value2 | 
 character, the label to be used for the second half of observations (the biggest ones)  | 
newColName | 
 character, the name of the new column that will hold the labels value1 and value2  | 
Value
dataframe (the team data d with one additional column, named as set by newColName, that holds value1 for the lower half and value2 for the upper half of the observations within each grade and jobID)
Examples
df <- div_add_median_label(div_fake_team())
colnames(df)
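The labelling idea can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data frame is made up, and the choice that values at or below the group median receive the first label is an assumption.

```r
# Toy data; the column names mirror those used in this documentation.
team <- data.frame(
  grade = c(1, 1, 1, 1, 2, 2, 2, 2),
  jobID = "sales",
  age   = c(25, 30, 41, 52, 33, 35, 48, 60)
)

# Median of age within each grade/jobID group, repeated for every row.
group_median <- ave(team$age, team$grade, team$jobID, FUN = median)

# Assumption: values at or below the group median get the first label ("T"),
# all other values get the second label ("F").
team$isYoung <- ifelse(team$age <= group_median, "T", "F")
team
```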
Function to calculate the confidence interval for the median
Description
Function to calculate the confidence interval for the median
Usage
div_ci_median(x, conf = 0.95)
Arguments
x | 
 numeric, data from which the median is calculated  | 
conf | 
 numeric, the confidence level, expressed as 1 - P(x < x0)  | 
Value
ci (confidence interval object)
Examples
x <- 1:100
div_ci_median(x)
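For readers who want to see how such an interval can be obtained, here is a sketch of one common distribution-free approach based on order statistics and the binomial distribution. It is not claimed to be the method used by div_ci_median(); the function name median_ci_sketch is hypothetical.

```r
# Hypothetical helper: order-statistic confidence interval for the median.
median_ci_sketch <- function(x, conf = 0.95) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  alpha <- 1 - conf
  # Rank of the lower bound: the number of observations below the true
  # median follows Binomial(n, 0.5). For very small n the rank is clamped
  # to 1, so the coverage can fall below the nominal level.
  lo <- max(qbinom(alpha / 2, size = n, prob = 0.5), 1)
  hi <- n - lo + 1   # symmetric rank for the upper bound
  c(lower = x[lo], median = median(x), upper = x[hi])
}

median_ci_sketch(1:100)
```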
Return a colour code given a number of stars for the confidence level of bias
Description
This function returns a colour (R named colour) based on the confidence level
Usage
div_conf_colour(x)
Arguments
x | 
 the string associated to the paygap confidence: NA, '', '.', '*', '**', '***'  | 
Value
string (named colour)
Examples
div_conf_colour("*")
Generate random team data
Description
This function generates a data frame with data for a team (with salaries, gender, FTE, etc.). This is a good start to test the package and to experiment with what level of bias will be visible in the paygap, for example.
Usage
div_fake_team(
  seed = 100,
  N = 200,
  genders = c("F", "M", "O"),
  gender_prob = c(0.4, 0.58, 0.02),
  gender_salaryBias = c(1, 1.1, 1),
  jobIDs = c("sales", "analytics"),
  jobID_prob = c(0.6, 0.4),
  citizenships = c("Polish", "German", "Italian", "Indian", "Other"),
  citizenship_prob = c(0.6, 0.2, 0.1, 0.05, 0.05)
)
Arguments
seed | 
 numeric, the seed to be used in set.seed()  | 
N | 
 numeric, the size of the team to be used (default = 200)  | 
genders | 
 character, a vector of the genders to be used  | 
gender_prob | 
 numeric, relative probabilities of the different genders to occur (must have the same length as 'genders')  | 
gender_salaryBias | 
 numeric, vector with the relative salaries of the different genders (must have the same length as 'genders')  | 
jobIDs | 
 character, a vector with the labels of the job categories in the team (they will appear in each grade)  | 
jobID_prob | 
 numeric, a vector with the relative sizes of the different jobs in the team (must have the same length as 'jobIDs')  | 
citizenships | 
 character, a vector of the citizenships to be generated  | 
citizenship_prob | 
 numeric, relative probabilities of the different citizenships to occur (must have the same length as 'citizenships')  | 
Value
dataframe (employees of the random team)
Examples
library(div)
d <- div_fake_team()
head(d)
diversity(table(d$gender))
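The role of the relative probabilities and of the salary bias can be illustrated with base R. This is a sketch under assumptions, not the generator used by div_fake_team(): the log-normal base salary and its parameters are hypothetical and chosen only for illustration.

```r
set.seed(100)
N                 <- 200
genders           <- c("F", "M", "O")
gender_prob       <- c(0.4, 0.58, 0.02)
gender_salaryBias <- c(1, 1.1, 1)

# sample() treats prob as relative weights, so they need not sum to one.
gender <- sample(genders, size = N, replace = TRUE, prob = gender_prob)

# Hypothetical base salary, independent of gender.
base_salary <- rlnorm(N, meanlog = log(5000), sdlog = 0.2)

# Multiply each salary by the bias factor that belongs to that person's gender.
salary <- base_salary * gender_salaryBias[match(gender, genders)]

fake <- data.frame(gender = gender, salary = salary)
table(fake$gender)
```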
Uses ggplot2 to produce a gauge plot in RAG colour
Description
This function produces one or more gauge plots coloured in red (R), amber (A) or green (G) for a value between 0 and 1.
Usage
div_gauge_plot(df, breaks = c(0, 0.8, 0.95, 1), ncol = NULL, nbrSize = 6)
Arguments
df | 
 tibble, a tibble with columns "value" and "label" (value = the values between 0 and 1; label = the text to show, e.g. paste("group", colnames(t)))  | 
breaks | 
 numeric vector with the lower limit, the border between green and amber, the border between amber and red, and the upper limit  | 
ncol | 
 numeric, the number of columns to produce  | 
nbrSize | 
 numeric, the font size for the label  | 
Value
ggplot object
Examples
d <- div_fake_team()
tbl_gender_div <- table(d$gender, d$grade) %>%
   apply(2, diversity, prior = c(50.2, 49.8)) %>%
   tibble(value = ., label = paste("Grade", names(.)))
div_gauge_plot(tbl_gender_div, ncol = 2, nbrSize = 4)
Prepare the paygap matrix to be published in LaTeX
Description
This function formats the paygap matrix (created by div_paygap()) and prepares it for printing via the function knitr::kable()
Usage
div_parse_paygap(
  pg,
  label = NULL,
  min_nbr_show = NULL,
  max_length_jobID = 12,
  max_length_colnames = 9
)
Arguments
pg | 
 paygap object as created by div::div_paygap(). This is an S3 object with a specific structure  | 
label | 
 character, the label to be used in the caption of the kable object  | 
min_nbr_show | 
 numeric, if provided then only groups that have more than min_nbr_show employees in both categories (selectedValue and others) will be shown  | 
max_length_jobID | 
 numeric, if provided the maximal length of the column jobID (in characters)  | 
max_length_colnames | 
 numeric, if provided the maximal length of the column names (in characters)  | 
Value
knitr::kable object (for LaTeX)
Examples
d  <- div_fake_team()
pg <- div_paygap(d)
div_parse_paygap(pg)
Function to calculate the paygap as a ratio.
Description
This function calculates the paygap as a ratio, per grade and jobID, comparing the group selected by x_ctrl with all others for the variable y.
Usage
div_paygap(d, x = "gender", y = "salary", x_ctrl = "F", ctrl_var = "age")
Arguments
d | 
 tibble, a tibble with columns as defined in the documentation  | 
x | 
 the name of the column that contains the factor object to be used as explaining dimension for the paygap (defaults to 'gender')  | 
y | 
 the name of the column that contains the numeric value to be used to calculate the paygap (could be salary or bonus for example)  | 
x_ctrl | 
 the value in the column defined by x that should be isolated (this versus the others), defaults to 'F'  | 
ctrl_var | 
 a control variable to be added (shows median per group for that variable)  | 
Value
dataframe (with columns grade, jobID, salary_x_ctrl, salary_others, n_x_ctrl, n_others, paygap, confidence), where "confidence" is one of the following: NA = not available (numbers are too low), "" = no bias detectable, "." = there might be some bias, but we're not sure, "*" = bias detected with some degree of confidence, "**" = quite sure there is bias, "***" = trust us, this is biased.
Examples
df <- div_paygap(div_fake_team())
df
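The core of the calculation can be sketched in base R. This is an illustration only: it assumes that the paygap is the ratio of the median of y for the x_ctrl group to the median of y for all others, which the documentation does not state explicitly, and it leaves out the confidence column. The function name paygap_sketch is hypothetical.

```r
# Hypothetical helper: paygap per grade and jobID as a ratio of medians.
paygap_sketch <- function(d, x = "gender", y = "salary", x_ctrl = "F") {
  groups <- split(d, list(d$grade, d$jobID), drop = TRUE)
  rows <- lapply(groups, function(g) {
    ctrl   <- g[[y]][g[[x]] == x_ctrl]   # values for the isolated group
    others <- g[[y]][g[[x]] != x_ctrl]   # values for everybody else
    data.frame(
      grade         = g$grade[1],
      jobID         = g$jobID[1],
      salary_x_ctrl = median(ctrl),
      salary_others = median(others),
      n_x_ctrl      = length(ctrl),
      n_others      = length(others),
      # Assumed direction of the ratio: x_ctrl group over the others.
      paygap        = median(ctrl) / median(others)
    )
  })
  do.call(rbind, rows)
}

toy <- data.frame(
  grade  = 1,
  jobID  = "sales",
  gender = c("F", "F", "M", "M"),
  salary = c(100, 100, 110, 110)
)
res <- paygap_sketch(toy)
res
```

Under this assumed definition, a value below 1 means that the x_ctrl group has the lower median in that grade and job.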
Produce a histogram and normal distribution
Description
Plots a histogram of the paygap observations, a normal distribution with the same mean and standard deviation, and a normal distribution centred on mu_unbiased (1 by default)
Usage
div_plot_paygap_distribution(x, label = "Gender", mu_unbiased = 1)
Arguments
x | 
 numeric vector, column of paygap observations  | 
label | 
 character, prefix for the title  | 
mu_unbiased | 
 numeric, the mean of the unbiased distribution (for paygaps this should be 1)  | 
Value
ggplot2 object
Examples
d <- div_fake_team()
pg <- div_paygap(d)
div_plot_paygap_distribution(pg$data$paygap)
Rounds all numbers in the paygap data-frame
Description
This function rounds all numbers to zero decimals, except the paygap (which is rounded to 2 decimals).
Usage
div_round_paygap(x)
Arguments
x | 
 paygap object (output of div::div_paygap())  | 
Value
the paygap data-frame (tibble only, not the whole paygap object)
Examples
d <- div_fake_team()
pg <- div_paygap(d)
div_round_paygap(pg)
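The rounding rule described above can be sketched on a plain data frame. This is a base R illustration, not the package code; the function name round_sketch and the toy values are hypothetical.

```r
# Hypothetical helper: round numeric columns to 0 decimals, but keep
# 2 decimals for the column named by `keep`.
round_sketch <- function(df, keep = "paygap") {
  is_num <- vapply(df, is.numeric, logical(1))
  for (nm in names(df)[is_num]) {
    digits   <- if (nm == keep) 2 else 0
    df[[nm]] <- round(df[[nm]], digits)
  }
  df
}

toy_pg <- data.frame(
  salary_x_ctrl = 5123.456,
  salary_others = 5355.129,
  paygap        = 0.9567,
  confidence    = "*"
)
rounded <- round_sketch(toy_pg)
rounded
```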
Calculate the diversity index
Description
This function calculates the entropy of a system with discrete states
Usage
diversity(x, prior = NULL)
Arguments
x | 
 numeric vector, observed probabilities of the classes  | 
prior | 
 numeric vector, the prior probabilities of the classes  | 
Value
the entropy or diversity measure
Examples
x <- c(0.4, 0.6)
diversity(x)
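As background, the entropy of a discrete distribution can be computed as follows. This sketch shows the plain Shannon entropy and a version scaled to the range 0 to 1. Whether diversity() uses the same normalisation, and how it uses the prior, is not stated here, so the scaling is an assumption and the function name entropy_sketch is hypothetical.

```r
# Hypothetical helper: Shannon entropy of a vector of counts or probabilities.
entropy_sketch <- function(p) {
  p <- p / sum(p)   # normalise, so that counts are accepted too
  p <- p[p > 0]     # classes with probability 0 contribute nothing
  -sum(p * log(p))
}

x <- c(0.4, 0.6)
h <- entropy_sketch(x)

# Assumption: divide by the maximum entropy log(k) for k classes,
# so that a perfectly balanced system scores 1.
h_scaled <- h / log(length(x))
h_scaled
```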
Print the paygap object in the terminal
Description
Print the paygap object in the terminal
Usage
## S3 method for class 'paygap'
print(x, ...)
Arguments
x | 
 paygap object, as created by the function div_paygap()  | 
... | 
 arguments passed on to the generic print function: print(x$data)  | 
Value
text output
Examples
library(div)
div_fake_team() %>%
  div_paygap    %>%
  print
Summarise the paygap object
Description
Produce a summary of the paygap object
Usage
## S3 method for class 'paygap'
summary(object, ...)
Arguments
object | 
 paygap S3 object, as created by the function div_paygap()  | 
... | 
 passed on to summary()  | 
Value
a summary of the paygap object
Examples
library(div)
d <- div_fake_team()
pg <- div_paygap(d)
summary(pg)
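The print and summary methods rely on R's S3 dispatch. The mechanism can be sketched with a toy class. The class name toygap and its contents are hypothetical and only mimic the idea of an object that stores its table in a data element.

```r
# A toy S3 object: a list with a data frame in `data` and a class attribute.
toy <- structure(
  list(data = data.frame(paygap = c(0.95, 1.02))),
  class = "toygap"
)

# print(toy) dispatches to print.toygap() because of the class attribute.
print.toygap <- function(x, ...) {
  print(x$data, ...)
  invisible(x)   # print methods conventionally return their argument invisibly
}

# summary(toy) dispatches to summary.toygap() in the same way.
summary.toygap <- function(object, ...) {
  summary(object$data$paygap, ...)
}

print(toy)
s <- summary(toy)
s
```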