Type: Package
Title: A Tidy API for Graph Manipulation
Version: 1.3.1
Maintainer: Thomas Lin Pedersen <thomasp85@gmail.com>
Description: A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, as well as provides tidy interfaces to a lot of common graph algorithms.
License: MIT + file LICENSE
URL: https://tidygraph.data-imaginist.com, https://github.com/thomasp85/tidygraph
BugReports: https://github.com/thomasp85/tidygraph/issues
Imports: cli, dplyr (≥ 0.8.5), igraph (≥ 2.0.0), lifecycle, magrittr, pillar, R6, rlang, stats, tibble, tidyr, tools, utils
Suggests: ape, covr, data.tree, graph, influenceR, methods, netrankr, NetSwan, network, seriation, testthat (≥ 3.0.0)
LinkingTo: cpp11
Encoding: UTF-8
RoxygenNote: 7.3.1
Config/testthat/edition: 3
NeedsCompilation: yes
Packaged: 2024-01-30 12:17:51 UTC; thomas
Author: Thomas Lin Pedersen ORCID iD [cre, aut]
Repository: CRAN
Date/Publication: 2024-01-30 13:40:02 UTC

tidygraph: A Tidy API for Graph Manipulation

Description

logo

A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, as well as provides tidy interfaces to a lot of common graph algorithms.

Author(s)

Maintainer: Thomas Lin Pedersen thomasp85@gmail.com (ORCID)

See Also

Useful links:


Determine the context of subsequent manipulations

Description

As a tbl_graph can be considered as a collection of two linked tables it is necessary to specify which table is referenced during manipulations. The activate verb does just that and needs affects all subsequent manipulations until a new table is activated. active is a simple query function to get the currently acitve context. In addition to the use of activate it is also possible to activate nodes or edges as part of the piping using the ⁠%N>%⁠ and ⁠%E>%⁠ pipes respectively. Do note that this approach somewhat obscures what is going on and is thus only recommended for quick, one-line, fixes in interactive use.

Usage

activate(.data, what)

active(x)

lhs %N>% rhs

lhs %E>% rhs

Arguments

.data, x, lhs

A tbl_graph or a grouped_tbl_graph

what

What should get activated? Possible values are nodes or edges.

rhs

A function to pipe into

Value

A tbl_graph

Note

Activate will ungroup a grouped_tbl_graph.

Examples

gr <- create_complete(5) %>%
  activate(nodes) %>%
  mutate(class = sample(c('a', 'b'), 5, TRUE)) %>%
  activate(edges) %>%
  arrange(from)

# The above could be achieved using the special pipes as well
gr <- create_complete(5) %N>%
  mutate(class = sample(c('a', 'b'), 5, TRUE)) %E>%
  arrange(from)
# But as you can see it obscures what part of the graph is being targeted


A data structure for tidy graph manipulation

Description

The tbl_graph class is a thin wrapper around an igraph object that provides methods for manipulating the graph using the tidy API. As it is just a subclass of igraph every igraph method will work as expected. A grouped_tbl_graph is the equivalent of a grouped_df where either the nodes or the edges has been grouped. The grouped_tbl_graph is not constructed directly but by using the group_by() verb. After creation of a tbl_graph the nodes are activated by default. The context can be changed using the activate() verb and affects all subsequent operations. Changing context automatically drops any grouping. The current active context can always be extracted with as_tibble(), which drops the graph structure and just returns a tbl_df or a grouped_df depending on the state of the tbl_graph. The returned context can be overriden by using the active argument in as_tibble().

Usage

## S3 method for class 'data.frame'
as_tbl_graph(x, directed = TRUE, ...)

## S3 method for class 'Node'
as_tbl_graph(x, directed = TRUE, mode = "out", ...)

## S3 method for class 'dendrogram'
as_tbl_graph(x, directed = TRUE, mode = "out", ...)

## S3 method for class 'graphNEL'
as_tbl_graph(x, ...)

## S3 method for class 'graphAM'
as_tbl_graph(x, ...)

## S3 method for class 'graphBAM'
as_tbl_graph(x, ...)

## S3 method for class 'hclust'
as_tbl_graph(x, directed = TRUE, mode = "out", ...)

## S3 method for class 'igraph'
as_tbl_graph(x, ...)

## S3 method for class 'list'
as_tbl_graph(x, directed = TRUE, node_key = "name", ...)

## S3 method for class 'matrix'
as_tbl_graph(x, directed = TRUE, ...)

## S3 method for class 'network'
as_tbl_graph(x, ...)

## S3 method for class 'phylo'
as_tbl_graph(x, directed = NULL, ...)

## S3 method for class 'evonet'
as_tbl_graph(x, directed = TRUE, ...)

tbl_graph(nodes = NULL, edges = NULL, directed = TRUE, node_key = "name")

as_tbl_graph(x, ...)

## Default S3 method:
as_tbl_graph(x, ...)

is.tbl_graph(x)

Arguments

x

An object convertible to a tbl_graph

directed

Should the constructed graph be directed (defaults to TRUE)

...

Arguments passed on to the conversion function

mode

In case directed = TRUE should the edge direction be away from node or towards. Possible values are "out" (default) or "in".

node_key

The name of the column in nodes that character represented to and from columns should be matched against. If NA the first column is always chosen. This setting has no effect if to and from are given as integers.

nodes

A data.frame containing information about the nodes in the graph. If edges$to and/or edges$from are characters then they will be matched to the column named according to node_key in nodes, if it exists. If not, they will be matched to the first column.

edges

A data.frame containing information about the edges in the graph. The terminal nodes of each edge must either be encoded in a to and from column, or in the two first columns, as integers. These integers refer to nodes index.

Details

Constructors are provided for most data structures that resembles networks. If a class provides an igraph::as.igraph() method it is automatically supported.

Value

A tbl_graph object

Functions

Examples

rstat_nodes <- data.frame(name = c("Hadley", "David", "Romain", "Julia"))
rstat_edges <- data.frame(from = c(1, 1, 1, 2, 3, 3, 4, 4, 4),
                            to = c(2, 3, 4, 1, 1, 2, 1, 2, 3))
tbl_graph(nodes = rstat_nodes, edges = rstat_edges)

Add graphs, nodes, or edges to a tbl_graph

Description

These functions are tbl_graph pendants to dplyr::bind_rows() that allows you to grow your tbl_graph by adding rows to either the nodes data, the edges data, or both. As with bind_rows() columns are matched by name and are automatically filled with NA if the column doesn't exist in some instances. In the case of bind_graphs() the graphs are automatically converted to tbl_graph objects prior to binding. The edges in each graph will continue to reference the nodes in the graph where they originated, meaning that their terminal node indexes will be shifted to match the new index of the node in the combined graph. This means the bind_graphs() always result in a disconnected graph. See graph_join() for merging graphs on common nodes.

Usage

bind_graphs(.data, ...)

bind_nodes(.data, ...)

bind_edges(.data, ..., node_key = "name")

Arguments

.data

A tbl_graph, or a list of tbl_graph objects (for bind_graphs()).

...

In case of bind_nodes() and bind_edges() data.frames to add. In the case of bind_graphs() objects that are convertible to tbl_graph using as_tbl_graph().

node_key

The name of the column in nodes that character represented to and from columns should be matched against. If NA the first column is always chosen. This setting has no effect if to and from are given as integers.

Value

A tbl_graph containing the new data

Examples

graph <- create_notable('bull')
new_graph <- create_notable('housex')

# Add nodes
graph %>% bind_nodes(data.frame(new = 1:4))

# Add edges
graph %>% bind_edges(data.frame(from = 1, to = 4:5))

# Add graphs
graph %>% bind_graphs(new_graph)


Calculate node and edge centrality

Description

The centrality of a node measures the importance of node in the network. As the concept of importance is ill-defined and dependent on the network and the questions under consideration, many centrality measures exist. tidygraph provides a consistent set of wrappers for all the centrality measures implemented in igraph for use inside dplyr::mutate() and other relevant verbs. All functions provided by tidygraph have a consistent naming scheme and automatically calls the function on the graph, returning a vector with measures ready to be added to the node data. Further tidygraph provides access to the netrankr engine for centrality calculations and define a number of centrality measures based on that, as well as provide a manual mode for specifying more-or-less any centrality score. These measures all only work on undirected graphs.

Usage

centrality_alpha(
  weights = NULL,
  alpha = 1,
  exo = 1,
  tol = 1e-07,
  loops = FALSE
)

centrality_authority(weights = NULL, scale = TRUE, options = arpack_defaults())

centrality_betweenness(
  weights = NULL,
  directed = TRUE,
  cutoff = -1,
  normalized = FALSE
)

centrality_power(exponent = 1, rescale = FALSE, tol = 1e-07, loops = FALSE)

centrality_closeness(
  weights = NULL,
  mode = "out",
  normalized = FALSE,
  cutoff = NULL
)

centrality_eigen(
  weights = NULL,
  directed = FALSE,
  scale = TRUE,
  options = arpack_defaults()
)

centrality_hub(weights = NULL, scale = TRUE, options = arpack_defaults())

centrality_pagerank(
  weights = NULL,
  directed = TRUE,
  damping = 0.85,
  personalized = NULL
)

centrality_subgraph(loops = FALSE)

centrality_degree(
  weights = NULL,
  mode = "out",
  loops = TRUE,
  normalized = FALSE
)

centrality_edge_betweenness(weights = NULL, directed = TRUE, cutoff = NULL)

centrality_harmonic(
  weights = NULL,
  mode = "out",
  normalized = FALSE,
  cutoff = NULL
)

centrality_manual(relation = "dist_sp", aggregation = "sum", ...)

centrality_closeness_harmonic()

centrality_closeness_residual()

centrality_closeness_generalised(alpha)

centrality_integration()

centrality_communicability()

centrality_communicability_odd()

centrality_communicability_even()

centrality_subgraph_odd()

centrality_subgraph_even()

centrality_katz(alpha = NULL)

centrality_betweenness_network(netflowmode = "raw")

centrality_betweenness_current()

centrality_betweenness_communicability()

centrality_betweenness_rsp_simple(rspxparam = 1)

centrality_betweenness_rsp_net(rspxparam = 1)

centrality_information()

centrality_decay(alpha = 1)

centrality_random_walk()

centrality_expected()

Arguments

weights

The weight of the edges to use for the calculation. Will be evaluated in the context of the edge data.

alpha

Relative importance of endogenous vs exogenous factors (centrality_alpha), the exponent to the power transformation of the distance metric (centrality_closeness_generalised), the base of power transformation (centrality_decay), or the attenuation factor (centrality_katz)

exo

The exogenous factors of the nodes. Either a scalar or a number number for each node. Evaluated in the context of the node data.

tol

Tolerance for near-singularities during matrix inversion

loops

Should loops be included in the calculation

scale

Should the output be scaled between 0 and 1

options

Settings passed on to igraph::arpack()

directed

Should direction of edges be used for the calculations

cutoff

maximum path length to use during calculations

normalized

Should the output be normalized

exponent

The decay rate for the Bonacich power centrality

rescale

Should the output be scaled to sum up to 1

mode

How should edges be followed. Ignored for undirected graphs

damping

The damping factor of the page rank algorithm

personalized

The probability of jumping to a node when abandoning a random walk. Evaluated in the context of the node data.

relation

The indirect relation measure type to be used in netrankr::indirect_relations

aggregation

The aggregation type to use on the indirect relations to be used in netrankr::aggregate_positions

...

Arguments to pass on to netrankr::indirect_relations

netflowmode

The return type of the network flow distance, either 'raw' or 'frac'

rspxparam

inverse temperature parameter

Value

A numeric vector giving the centrality measure of each node.

Functions

Examples

create_notable('bull') %>%
  activate(nodes) %>%
  mutate(importance = centrality_alpha())

# Most centrality measures are for nodes but not all
create_notable('bull') %>%
  activate(edges) %>%
  mutate(importance = centrality_edge_betweenness())

Graph games based on connected components

Description

This set of graph creation algorithms simulate the topology by, in some way, connecting subgraphs. The nature of their algorithm is described in detail at the linked igraph documentation.

Usage

play_blocks(n, size_blocks, p_between, directed = TRUE, loops = FALSE)

play_blocks_hierarchy(n, size_blocks, rho, p_within, p_between)

play_islands(n_islands, size_islands, p_within, m_between)

play_smallworld(
  n_dim,
  dim_size,
  order,
  p_rewire,
  loops = FALSE,
  multiple = FALSE
)

Arguments

n

The number of nodes in the graph.

size_blocks

The number of vertices in each block

p_between, p_within

The probability of edges within and between groups/blocks

directed

Should the resulting graph be directed

loops

Are loop edges allowed

rho

The fraction of vertices per cluster

n_islands

The number of densely connected islands

size_islands

The number of nodes in each island

m_between

The number of edges between groups/islands

n_dim, dim_size

The dimension and size of the starting lattice

order

The neighborhood size to create connections from

p_rewire

The rewiring probability of edges

multiple

Are multiple edges allowed

Value

A tbl_graph object

Functions

See Also

Other graph games: evolution_games, sampling_games, type_games

Examples

plot(play_islands(4, 10, 0.7, 3))


Access graph, nodes, and edges directly inside verbs

Description

These three functions makes it possible to directly access either the node data, the edge data or the graph itself while computing inside verbs. It is e.g. possible to add an attribute from the node data to the edges based on the terminating nodes of the edge, or extract some statistics from the graph itself to use in computations.

Usage

.G()

.N(focused = TRUE)

.E(focused = TRUE)

Arguments

focused

Should only the attributes of the currently focused nodes or edges be returned

Value

Either a tbl_graph (.G()) or a tibble (.N())

Functions

Examples


# Get data from the nodes while computing for the edges
create_notable('bull') %>%
  activate(nodes) %>%
  mutate(centrality = centrality_power()) %>%
  activate(edges) %>%
  mutate(mean_centrality = (.N()$centrality[from] + .N()$centrality[to])/2)

Create different types of well-defined graphs

Description

These functions creates a long list of different types of well-defined graphs, that is, their structure is not based on any randomisation. All of these functions are shallow wrappers around a range of ⁠igraph::make_*⁠ functions but returns tbl_graph rather than igraph objects.

Usage

create_ring(n, directed = FALSE, mutual = FALSE)

create_path(n, directed = FALSE, mutual = FALSE)

create_chordal_ring(n, w)

create_de_bruijn(alphabet_size, label_size)

create_empty(n, directed = FALSE)

create_bipartite(n1, n2, directed = FALSE, mode = "out")

create_citation(n)

create_complete(n)

create_notable(name)

create_kautz(alphabet_size, label_size)

create_lattice(dim, directed = FALSE, mutual = FALSE, circular = FALSE)

create_star(n, directed = FALSE, mutual = FALSE, mode = "out")

create_tree(n, children, directed = TRUE, mode = "out")

Arguments

n, n1, n2

The number of nodes in the graph

directed

Should the graph be directed

mutual

Should mutual edges be created in case of the graph being directed

w

A matrix specifying the additional edges in the chordan ring. See igraph::make_chordal_ring()

alphabet_size

The number of unique letters in the alphabet used for the graph

label_size

The number of characters in each node

mode

In case of a directed, non-mutual, graph should the edges flow 'out' or 'in'

name

The name of a notable graph. See a complete list in igraph::make_graph()

dim

The dimensions of the lattice

circular

Should each dimension in the lattice wrap around

children

The number of children each node has in the tree (if possible)

Value

A tbl_graph

Functions

Examples

# Create a complete graph with 10 nodes
create_complete(10)


Calculate edge ranking

Description

This set of functions tries to calculate a ranking of the edges in a graph so that edges sharing certain topological traits are in proximity in the resulting order.

Usage

edge_rank_eulerian(cyclic = FALSE)

Arguments

cyclic

should the eulerian path start and end at the same node

Value

An integer vector giving the position of each edge in the ranking

Functions

Examples

graph <- create_notable('meredith') %>%
  activate(edges) %>%
  mutate(rank = edge_rank_eulerian())


Querying edge types

Description

These functions lets the user query whether the edges in a graph is of a specific type. All functions return a logical vector giving whether each edge in the graph corresponds to the specific type.

Usage

edge_is_multiple()

edge_is_loop()

edge_is_mutual()

edge_is_from(from)

edge_is_to(to)

edge_is_between(from, to, ignore_dir = !graph_is_directed())

edge_is_incident(nodes)

edge_is_bridge()

edge_is_feedback_arc(weights = NULL, approximate = TRUE)

Arguments

from, to, nodes

A vector giving node indices

ignore_dir

Is both directions of the edge allowed

weights

The weight of the edges to use for the calculation. Will be evaluated in the context of the edge data.

approximate

Should the minimal set be approximated or exact

Value

A logical vector of the same length as the number of edges in the graph

Functions

Examples

create_star(10, directed = TRUE, mutual = TRUE) %>%
  activate(edges) %>%
  sample_frac(0.7) %>%
  mutate(single_edge = !edge_is_mutual())

Graph games based on evolution

Description

This games create graphs through different types of evolutionary mechanisms (not necessarily in a biological sense). The nature of their algorithm is described in detail at the linked igraph documentation.

Usage

play_citation_age(
  n,
  growth = 1,
  bins = n/7100,
  p_pref = (1:(bins + 1))^-3,
  directed = TRUE
)

play_forestfire(
  n,
  p_forward,
  p_backward = p_forward,
  growth = 1,
  directed = TRUE
)

play_growing(n, growth = 1, directed = TRUE, citation = FALSE)

play_barabasi_albert(
  n,
  power,
  growth = 1,
  growth_dist = NULL,
  use_out = FALSE,
  appeal_zero = 1,
  directed = TRUE,
  method = "psumtree"
)

play_barabasi_albert_aging(
  n,
  power,
  power_age,
  growth = 1,
  growth_dist = NULL,
  bins = 300,
  use_out = FALSE,
  appeal_zero = 1,
  appeal_zero_age = 0,
  directed = TRUE,
  coefficient = 1,
  coefficient_age = 1,
  window = NULL
)

Arguments

n

The number of nodes in the graph.

growth

The number of edges added at each iteration

bins

The number of aging bins

p_pref

The probability that an edge will be made to an age bin.

directed

Should the resulting graph be directed

p_forward, p_backward

Forward and backward burning probability

citation

Should a citation graph be created

power

The power of the preferential attachment

growth_dist

The distribution of the number of added edges at each iteration

use_out

Should outbound edges be used for calculating citation probability

appeal_zero

The appeal value for unconnected nodes

method

The algorithm to use for graph creation. Either 'psumtree', 'psumtree-multiple', or 'bag'

power_age

The aging exponent

appeal_zero_age

The appeal value of nodes without age

coefficient

The coefficient of the degree dependent part of attrictiveness

coefficient_age

The coefficient of the age dependent part of attrictiveness

window

The aging window to take into account when calculating the preferential attraction

Value

A tbl_graph object

Functions

See Also

play_traits() and play_citation_type() for an evolutionary algorithm based on different node types

Other graph games: component_games, sampling_games, type_games

Examples

plot(play_forestfire(50, 0.5))


Select specific nodes or edges to compute on

Description

The focus()/unfocus() idiom allow you to temporarily tell tidygraph algorithms to only calculate on a subset of the data, while keeping the full graph intact. The purpose of this is to avoid having to calculate time costly measures etc on all nodes or edges of a graph if only a few is needed. E.g. you might only be interested in the shortest distance from one node to another so rather than calculating this for all nodes you apply a focus on one node and perform the calculation. It should be made clear that not all algorithms will see a performance boost by being applied to a few nodes/edges since their calculation is applied globally and the result for all nodes/edges are provided in unison.

Usage

focus(.data, ...)

## S3 method for class 'tbl_graph'
focus(.data, ...)

## S3 method for class 'morphed_tbl_graph'
focus(.data, ...)

unfocus(.data, ...)

## S3 method for class 'tbl_graph'
unfocus(.data, ...)

## S3 method for class 'focused_tbl_graph'
unfocus(.data, ...)

## S3 method for class 'morphed_tbl_graph'
unfocus(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

<data-masking> Expressions that return a logical value, and are defined in terms of the variables in .data. If multiple expressions are included, they are combined with the & operator. Only rows for which all conditions evaluate to TRUE are kept.

Value

A graph with focus applied

Note

focusing is the lowest prioritised operation on a graph. Applying a morph() or a group_by() operation will unfocus the graph prior to performing the operation. The same is true for the inverse operations (unmorph() and ungroup()). Further, unfocusing will happen any time some graph altering operation is performed, such as the arrange() and slice() operations


Fortify a tbl_graph for ggplot2 plotting

Description

In general tbl_graph objects are intended to be plotted by network visualisation libraries such as ggraph. However, if you do wish to plot either the node or edge data directly with ggplot2 you can simply add the tbl_graph object as either the global or layer data and the currently active data is passed on as a regular data frame.

Usage

fortify.tbl_graph(model, data, ...)

Register a graph context for the duration of the current frame

Description

This function sets the provided graph to be the context for tidygraph algorithms, such as e.g. node_is_center(), for the duration of the current environment. It automatically removes the graph once the environment exits.

Usage

.graph_context

.register_graph_context(graph, free = FALSE, env = parent.frame())

.free_graph_context(env = parent.frame())

Arguments

graph

A tbl_graph object

free

Should the active state of the graph be ignored?

env

The environment where the context should be active

Format

An object of class ContextBuilder (inherits from R6) of length 12.


Join graphs on common nodes

Description

This graph-specific join method makes a full join on the nodes data and updates the edges in the joining graph so they matches the new indexes of the nodes in the resulting graph. Node and edge data is combined using dplyr::bind_rows() semantic, meaning that data is matched by column name and filled with NA if it is missing in either of the graphs.

Usage

graph_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

Arguments

x

A tbl_graph

y

An object convertible to a tbl_graph using as_tbl_graph()

by

A join specification created with join_by(), or a character vector of variables to join by.

If NULL, the default, ⁠*_join()⁠ will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join on different variables between x and y, use a join_by() specification. For example, join_by(a == b) will match x$a to y$b.

To join by multiple variables, use a join_by() specification with multiple expressions. For example, join_by(a == b, c == d) will match x$a to y$b and x$c to y$d. If the column names are the same between x and y, you can shorten this by listing only the variable names, like join_by(a, c).

join_by() can also be used to perform inequality, rolling, and overlap joins. See the documentation at ?join_by for details on these types of joins.

For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("a", "b") joins x$a to y$a and x$b to y$b. If variable names differ between x and y, use a named character vector like by = c("x_a" = "y_a", "x_b" = "y_b").

To perform a cross-join, generating all combinations of x and y, see cross_join().

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

...

Other parameters passed onto methods.

Value

A tbl_graph containing the merged graph

Examples

gr1 <- create_notable('bull') %>%
  activate(nodes) %>%
  mutate(name = letters[1:5])
gr2 <- create_ring(10) %>%
  activate(nodes) %>%
  mutate(name = letters[4:13])

gr1 %>% graph_join(gr2)

Graph measurements

Description

This set of functions provide wrappers to a number of ìgraphs graph statistic algorithms. As for the other wrappers provided, they are intended for use inside the tidygraph framework and it is thus not necessary to supply the graph being computed on as the context is known. All of these functions are guarantied to return scalars making it easy to compute with them.

Usage

graph_adhesion()

graph_assortativity(attr, in_attr = NULL, directed = TRUE)

graph_automorphisms(sh = "fm", colors = NULL)

graph_clique_num()

graph_clique_count(min = NULL, max = NULL, subset = NULL)

graph_component_count(type = "weak")

graph_motif_count(size = 3, cut.prob = rep(0, size))

graph_diameter(weights = NULL, directed = TRUE, unconnected = TRUE)

graph_girth()

graph_radius(mode = "out")

graph_mutual_count()

graph_asym_count()

graph_unconn_count()

graph_size()

graph_order()

graph_reciprocity(ignore_loops = TRUE, ratio = FALSE)

graph_min_cut(capacity = NULL)

graph_mean_dist(directed = TRUE, unconnected = TRUE, weights = NULL)

graph_modularity(group, weights = NULL)

graph_efficiency(weights = NULL, directed = TRUE)

Arguments

attr

The node attribute to measure on

in_attr

An alternative node attribute to use for incomming node. If NULL the attribute given by type will be used

directed

Should a directed graph be treated as directed

sh

The splitting heuristics for the BLISS algorithm. Possible values are: ‘f’: first non-singleton cell, ‘fl’: first largest non-singleton cell, ‘fs’: first smallest non-singleton cell, ‘fm’: first maximally non-trivially connected non-singleton cell, ‘flm’: first largest maximally non-trivially connected non-singleton cell, ‘fsm’: first smallest maximally non-trivially connected non-singleton cell.

colors

The colors of the individual vertices of the graph; only vertices having the same color are allowed to match each other in an automorphism. When omitted, igraph uses the color attribute of the vertices, or, if there is no such vertex attribute, it simply assumes that all vertices have the same color. Pass NULL explicitly if the graph has a color vertex attribute but you do not want to use it.

min, max

The upper and lower bounds of the cliques to be considered.

subset

The indexes of the nodes to start the search from (logical or integer). If provided only the cliques containing these nodes will be counted.

type

The type of component to count, either 'weak' or 'strong'. Ignored for undirected graphs.

size

The size of the motif.

cut.prob

Numeric vector giving the probabilities that the search graph is cut at a certain level. Its length should be the same as the size of the motif (the size argument). By default no cuts are made.

weights

Optional positive weight vector for calculating weighted distances. If the graph has a weight edge attribute, then this is used by default.

unconnected

Logical, what to do if the graph is unconnected. If FALSE, the function will return a number that is one larger the largest possible diameter, which is always the number of vertices. If TRUE, the diameters of the connected components will be calculated and the largest one will be returned.

mode

How should eccentricity be calculated. If "out" only outbound edges are followed. If "in" only inbound are followed. If "all" all edges are followed. Ignored for undirected graphs.

ignore_loops

Logical. Should loops be ignored while calculating the reciprocity

ratio

Should the old "ratio" approach from igraph < v0.6 be used

capacity

The capacity of the edges

group

The node grouping to calculate the modularity on

Value

A scalar, the type depending on the function

Functions

Examples

# Use e.g. to modify computations on nodes and edges
create_notable('meredith') %>%
  activate(nodes) %>%
  mutate(rel_neighbors = centrality_degree()/graph_order())

Querying graph types

Description

This set of functions lets the user query different aspects of the graph itself. They are all concerned with wether the graph implements certain properties and will all return a logical scalar.

Usage

graph_is_simple()

graph_is_directed()

graph_is_bipartite()

graph_is_connected()

graph_is_tree()

graph_is_forest()

graph_is_dag()

graph_is_chordal()

graph_is_complete()

graph_is_isomorphic_to(graph, method = "auto", ...)

graph_is_subgraph_isomorphic_to(graph, method = "auto", ...)

graph_is_eulerian(cyclic = FALSE)

Arguments

graph

The graph to compare structure to

method

The algorithm to use for comparison

...

Arguments passed on to the comparison methods. See igraph::is_isomorphic_to() and igraph::is_subgraph_isomorphic_to()

cyclic

should the eulerian path start and end at the same node

Value

A logical scalar

Functions

Examples

gr <- create_tree(50, 4)

with_graph(gr, graph_is_tree())


Group nodes and edges based on community structure

Description

These functions are wrappers around the various clustering functions provided by igraph. As with the other wrappers they automatically use the graph that is being computed on, and otherwise passes on its arguments to the relevant clustering function. The return value is always a numeric vector of group memberships so that nodes or edges with the same number are part of the same group. Grouping is predominantly made on nodes and currently the only grouping of edges supported is biconnected components.

Usage

group_components(type = "weak")

group_edge_betweenness(weights = NULL, directed = TRUE, n_groups = NULL)

group_fast_greedy(weights = NULL, n_groups = NULL)

group_infomap(weights = NULL, node_weights = NULL, trials = 10)

group_label_prop(weights = NULL, label = NULL, fixed = NULL)

group_leading_eigen(
  weights = NULL,
  steps = -1,
  label = NULL,
  options = arpack_defaults(),
  n_groups = NULL
)

group_louvain(weights = NULL, resolution = 1)

group_leiden(
  weights = NULL,
  resolution = 1,
  objective_function = "CPM",
  beta = 0.01,
  label = NULL,
  n = 2,
  node_weights = NULL
)

group_optimal(weights = NULL)

group_spinglass(weights = NULL, ...)

group_walktrap(weights = NULL, steps = 4, n_groups = NULL)

group_fluid(n_groups = 2)

group_biconnected_component()

group_color()

Arguments

type

The type of component to find. Either 'weak' or 'strong'

weights

The weight of the edges to use for the calculation. Will be evaluated in the context of the edge data.

directed

Should direction of edges be used for the calculations

n_groups

Integer scalar, the desired number of communities. If too low or two high, then an error message is given. The measure is applied to the full graph so the number of groups returned may be lower for focused graphs

node_weights

The weight of the nodes to use for the calculation. Will be evaluated in the context of the node data.

trials

Number of times partition of the network should be attempted

label

The initial groups of the nodes. Will be evaluated in the context of the node data.

fixed

A logical vector determining which nodes should keep their initial groups. Will be evaluated in the context of the node data.

steps

The number of steps in the random walks

options

Settings passed on to igraph::arpack()

resolution

Resolution of the modularity function used internally in the algorithm

objective_function

Either "CPM" (constant potts model) or "modularity". Sets the objective function to use.

beta

Parameter affecting the randomness in the Leiden algorithm. This affects only the refinement step of the algorithm.

n

The number of iterations to run the clustering

...

arguments passed on to igraph::cluster_spinglass()

Value

a numeric vector with the membership for each node in the graph. The enumeration happens in order based on group size progressing from the largest to the smallest group

Functions

Examples

create_notable('tutte') %>%
  activate(nodes) %>%
  mutate(group = group_infomap())


Repeatedly modify a graph by a function

Description

The iterate family of functions allow you to call the same modification function on a graph until some condition is met. This can be used to create simple simulations in a tidygraph friendly API

Usage

iterate_n(.data, n, .f, ...)

iterate_while(.data, cnd, .f, ..., max_n = NULL)

Arguments

.data

A tbl_graph object

n

The number of times to iterate

.f

A function taking in a tbl_graph as the first argument and returning a tbl_graph object

...

Further arguments passed on to .f

cnd

A condition that must evaluate to TRUE or FALSE determining if iteration should continue

max_n

The maximum number of iterations in case cnd never evaluates to FALSE

Value

A tbl_graph object

Examples

# Gradually remove edges from the least connected nodes while avoiding
# isolates
create_notable('zachary') |>
  iterate_n(20, function(gr) {
    gr |>
      activate(nodes) |>
      mutate(deg = centrality_degree(), rank = order(deg)) |>
      activate(edges) |>
      slice(
        -which(edge_is_incident(.N()$rank == sum(.N()$deg == 1) + 1))[1]
      )
  })

# Remove a random edge until the graph is split in two
create_notable('zachary') |>
  iterate_while(graph_component_count() == 1, function(gr) {
    gr |>
      activate(edges) |>
      slice(-sample(graph_size(), 1))
  })


Measures based on the neighborhood of each node

Description

These functions wraps a set of functions that all measures quantities of the local neighborhood of each node. They all return a vector or list matching the node position.

Usage

local_size(order = 1, mode = "all", mindist = 0)

local_members(order = 1, mode = "all", mindist = 0)

local_triangles()

local_ave_degree(weights = NULL)

local_transitivity(weights = NULL)

Arguments

order

Integer giving the order of the neighborhood.

mode

Character constant, it specifies how to use the direction of the edges if a directed graph is analyzed. For ‘out’ only the outgoing edges are followed, so all vertices reachable from the source vertex in at most order steps are counted. For ‘"in"’ all vertices from which the source vertex is reachable in at most order steps are counted. ‘"all"’ ignores the direction of the edges. This argument is ignored for undirected graphs.

mindist

The minimum distance to include the vertex in the result.

weights

An edge weight vector. For local_ave_degree: If this argument is given, the average vertex strength is calculated instead of vertex degree. For local_transitivity: if given weighted transitivity using the approach by A. Barrat will be calculated.

Value

A numeric vector or a list (for local_members) with elements corresponding to the nodes in the graph.

Functions

Examples

# Get all neighbors of each graph
create_notable('chvatal') %>%
  activate(nodes) %>%
  mutate(neighborhood = local_members(mindist = 1))

# These are equivalent
create_notable('chvatal') %>%
  activate(nodes) %>%
  mutate(n_neighbors = local_size(mindist = 1),
         degree = centrality_degree()) %>%
  as_tibble()


Apply a function to nodes in the order of a breath first search

Description

These functions allow you to map over the nodes in a graph, by first performing a breath first search on the graph and then mapping over each node in the order they are visited. The mapping function will have access to the result and search statistics for all the nodes between itself and the root in the search. To map over the nodes in the reverse direction use map_bfs_back().

Usage

map_bfs(root, mode = "out", unreachable = FALSE, .f, ...)

map_bfs_lgl(root, mode = "out", unreachable = FALSE, .f, ...)

map_bfs_chr(root, mode = "out", unreachable = FALSE, .f, ...)

map_bfs_int(root, mode = "out", unreachable = FALSE, .f, ...)

map_bfs_dbl(root, mode = "out", unreachable = FALSE, .f, ...)

Arguments

root

The node to start the search from

mode

How should edges be followed? 'out' only follows outbound edges, 'in' only follows inbound edges, and 'all' follows all edges. This parameter is ignored for undirected graphs.

unreachable

Should the search jump to an unvisited node if the search is completed without visiting all nodes.

.f

A function to map over all nodes. See Details

...

Additional parameters to pass to .f

Details

The function provided to .f will be called with the following arguments in addition to those supplied through ...:

Instead of spelling out all of these in the function it is possible to simply name the ones needed and use ... to catch the rest.

Value

map_bfs() returns a list of the same length as the number of nodes in the graph, in the order matching the node order in the graph (that is, not in the order they are called). ⁠map_bfs_*()⁠ tries to coerce its result into a vector of the classes logical (map_bfs_lgl), character (map_bfs_chr), integer (map_bfs_int), or double (map_bfs_dbl). These functions will throw an error if they are unsuccesful, so they are type safe.

See Also

Other node map functions: map_bfs_back(), map_dfs(), map_dfs_back()

Examples

# Accumulate values along a search
create_tree(40, children = 3, directed = TRUE) %>%
  mutate(value = round(runif(40)*100)) %>%
  mutate(value_acc = map_bfs_dbl(node_is_root(), .f = function(node, path, ...) {
    sum(.N()$value[c(node, path$node)])
  }))

Apply a function to nodes in the reverse order of a breath first search

Description

These functions allow you to map over the nodes in a graph, by first performing a breath first search on the graph and then mapping over each node in the reverse order they are visited. The mapping function will have access to the result and search statistics for all the nodes following itself in the search. To map over the nodes in the original direction use map_bfs().

Usage

map_bfs_back(root, mode = "out", unreachable = FALSE, .f, ...)

map_bfs_back_lgl(root, mode = "out", unreachable = FALSE, .f, ...)

map_bfs_back_chr(root, mode = "out", unreachable = FALSE, .f, ...)

map_bfs_back_int(root, mode = "out", unreachable = FALSE, .f, ...)

map_bfs_back_dbl(root, mode = "out", unreachable = FALSE, .f, ...)

Arguments

root

The node to start the search from

mode

How should edges be followed? 'out' only follows outbound edges, 'in' only follows inbound edges, and 'all' follows all edges. This parameter is ignored for undirected graphs.

unreachable

Should the search jump to an unvisited node if the search is completed without visiting all nodes.

.f

A function to map over all nodes. See Details

...

Additional parameters to pass to .f

Details

The function provided to .f will be called with the following arguments in addition to those supplied through ...:

Instead of spelling out all of these in the function it is possible to simply name the ones needed and use ... to catch the rest.

Value

map_bfs_back() returns a list of the same length as the number of nodes in the graph, in the order matching the node order in the graph (that is, not in the order they are called). ⁠map_bfs_back_*()⁠ tries to coerce its result into a vector of the classes logical (map_bfs_back_lgl), character (map_bfs_back_chr), integer (map_bfs_back_int), or double (map_bfs_back_dbl). These functions will throw an error if they are unsuccesful, so they are type safe.

See Also

Other node map functions: map_bfs(), map_dfs(), map_dfs_back()

Examples

# Collect values from children
create_tree(40, children = 3, directed = TRUE) %>%
  mutate(value = round(runif(40)*100)) %>%
  mutate(child_acc = map_bfs_back_dbl(node_is_root(), .f = function(node, path, ...) {
    if (nrow(path) == 0) .N()$value[node]
    else {
      sum(unlist(path$result[path$parent == node]))
    }
  }))

Apply a function to nodes in the order of a depth first search

Description

These functions allow you to map over the nodes in a graph, by first performing a depth first search on the graph and then mapping over each node in the order they are visited. The mapping function will have access to the result and search statistics for all the nodes between itself and the root in the search. To map over the nodes in the reverse direction use map_dfs_back().

Usage

map_dfs(root, mode = "out", unreachable = FALSE, .f, ...)

map_dfs_lgl(root, mode = "out", unreachable = FALSE, .f, ...)

map_dfs_chr(root, mode = "out", unreachable = FALSE, .f, ...)

map_dfs_int(root, mode = "out", unreachable = FALSE, .f, ...)

map_dfs_dbl(root, mode = "out", unreachable = FALSE, .f, ...)

Arguments

root

The node to start the search from

mode

How should edges be followed? 'out' only follows outbound edges, 'in' only follows inbound edges, and 'all' follows all edges. This parameter is ignored for undirected graphs.

unreachable

Should the search jump to an unvisited node if the search is completed without visiting all nodes.

.f

A function to map over all nodes. See Details

...

Additional parameters to pass to .f

Details

The function provided to .f will be called with the following arguments in addition to those supplied through ...:

Instead of spelling out all of these in the function it is possible to simply name the ones needed and use ... to catch the rest.

Value

map_dfs() returns a list of the same length as the number of nodes in the graph, in the order matching the node order in the graph (that is, not in the order they are called). ⁠map_dfs_*()⁠ tries to coerce its result into a vector of the classes logical (map_dfs_lgl), character (map_dfs_chr), integer (map_dfs_int), or double (map_dfs_dbl). These functions will throw an error if they are unsuccesful, so they are type safe.

See Also

Other node map functions: map_bfs(), map_bfs_back(), map_dfs_back()

Examples

# Add a random integer to the last value along a search
create_tree(40, children = 3, directed = TRUE) %>%
  mutate(child_acc = map_dfs_int(node_is_root(), .f = function(node, path, ...) {
    last_val <- if (nrow(path) == 0) 0L else tail(unlist(path$result), 1)
    last_val + sample(1:10, 1)
  }))

Apply a function to nodes in the reverse order of a depth first search

Description

These functions allow you to map over the nodes in a graph, by first performing a depth first search on the graph and then mapping over each node in the reverse order they are visited. The mapping function will have access to the result and search statistics for all the nodes following itself in the search. To map over the nodes in the original direction use map_dfs().

Usage

map_dfs_back(root, mode = "out", unreachable = FALSE, .f, ...)

map_dfs_back_lgl(root, mode = "out", unreachable = FALSE, .f, ...)

map_dfs_back_chr(root, mode = "out", unreachable = FALSE, .f, ...)

map_dfs_back_int(root, mode = "out", unreachable = FALSE, .f, ...)

map_dfs_back_dbl(root, mode = "out", unreachable = FALSE, .f, ...)

Arguments

root

The node to start the search from

mode

How should edges be followed? 'out' only follows outbound edges, 'in' only follows inbound edges, and 'all' follows all edges. This parameter is ignored for undirected graphs.

unreachable

Should the search jump to an unvisited node if the search is completed without visiting all nodes.

.f

A function to map over all nodes. See Details

...

Additional parameters to pass to .f

Details

The function provided to .f will be called with the following arguments in addition to those supplied through ...:

Instead of spelling out all of these in the function it is possible to simply name the ones needed and use ... to catch the rest.

Value

map_dfs_back() returns a list of the same length as the number of nodes in the graph, in the order matching the node order in the graph (that is, not in the order they are called). ⁠map_dfs_back_*()⁠ tries to coerce its result into a vector of the classes logical (map_dfs_back_lgl), character (map_dfs_back_chr), integer (map_dfs_back_int), or double (map_dfs_back_dbl). These functions will throw an error if they are unsuccesful, so they are type safe.

See Also

Other node map functions: map_bfs(), map_bfs_back(), map_dfs()

Examples

# Collect values from the 2 closest layers of children in a dfs search
create_tree(40, children = 3, directed = TRUE) %>%
  mutate(value = round(runif(40)*100)) %>%
  mutate(child_acc = map_dfs_back(node_is_root(), .f = function(node, path, dist, ...) {
    if (nrow(path) == 0) .N()$value[node]
    else {
      unlist(path$result[path$dist - dist <= 2])
    }
  }))

Map a function over a graph representing the neighborhood of each node

Description

This function extracts the neighborhood of each node as a graph and maps over each of these neighborhood graphs. Conceptually it is similar to igraph::local_scan(), but it borrows the type safe versions available in map_bfs() and map_dfs().

Usage

map_local(order = 1, mode = "all", mindist = 0, .f, ...)

map_local_lgl(order = 1, mode = "all", mindist = 0, .f, ...)

map_local_chr(order = 1, mode = "all", mindist = 0, .f, ...)

map_local_int(order = 1, mode = "all", mindist = 0, .f, ...)

map_local_dbl(order = 1, mode = "all", mindist = 0, .f, ...)

Arguments

order

Integer giving the order of the neighborhood.

mode

Character constant, it specifies how to use the direction of the edges if a directed graph is analyzed. For ‘out’ only the outgoing edges are followed, so all vertices reachable from the source vertex in at most order steps are counted. For ‘"in"’ all vertices from which the source vertex is reachable in at most order steps are counted. ‘"all"’ ignores the direction of the edges. This argument is ignored for undirected graphs.

mindist

The minimum distance to include the vertex in the result.

.f

A function to map over all nodes. See Details

...

Additional parameters to pass to .f

Details

The function provided to .f will be called with the following arguments in addition to those supplied through ...:

The neighborhood graph will contain an extra node attribute called .central_node, which will be TRUE for the node that the neighborhood is expanded from and FALSE for everything else.

Value

map_local() returns a list of the same length as the number of nodes in the graph, in the order matching the node order in the graph. ⁠map_local_*()⁠ tries to coerce its result into a vector of the classes logical (map_local_lgl), character (map_local_chr), integer (map_local_int), or double (map_local_dbl). These functions will throw an error if they are unsuccesful, so they are type safe.

Examples

# Smooth out values over a neighborhood
create_notable('meredith') %>%
  mutate(value = rpois(graph_order(), 5)) %>%
  mutate(value_smooth = map_local_dbl(order = 2, .f = function(neighborhood, ...) {
    mean(as_tibble(neighborhood, active = 'nodes')$value)
  }))

Create a temporary alternative representation of the graph to compute on

Description

The morph/unmorph verbs are used to create temporary representations of the graph, such as e.g. its search tree or a subgraph. A morphed graph will accept any of the standard dplyr verbs, and changes to the data is automatically propagated to the original graph when unmorphing. Tidygraph comes with a range of morphers, but is it also possible to supply your own. See Details for the requirement for custom morphers. The crystallise verb is used to extract the temporary graph representation into a tibble containing one separate graph per row and a name and graph column holding the name of each graph and the graph itself respectively. convert() is a shorthand for performing both morph and crystallise along with extracting a single tbl_graph (defaults to the first). For morphs were you know they only create a single graph, and you want to keep it, this is an easy way.

Usage

morph(.data, .f, ...)

unmorph(.data)

crystallise(.data)

crystallize(.data)

convert(.data, .f, ..., .select = 1, .clean = FALSE)

Arguments

.data

A tbl_graph or a morphed_tbl_graph

.f

A morphing function. See morphers for a list of provided one.

...

Arguments passed on to the morpher

.select

The graph to return during convert(). Either an index or the name as created during crystallise().

.clean

Should references to the node and edge indexes in the original graph be removed when using convert

Details

It is only possible to change and add to node and edge data from a morphed state. Any filtering/removal of nodes and edges will not result in removal from the main graph. However, nodes and edges not present in the morphed state will be unaffected in the main graph when unmorphing (if new columns were added during the morhped state they will be filled with NA).

Morphing an already morhped graph will unmorph prior to applying the new morph.

During a morphed state, the mapping back to the original graph is stored in .tidygraph_node_index and .tidygraph_edge_index columns. These are accesible but protected, meaning that any changes to them with e.g. mutate will be ignored. Furthermore, if the morph results in the merging of nodes and/or edges the original data is stored in a .data column. This is protected as well.

When supplying your own morphers the morphing function should accept a tbl_graph as its first input. The provided graph will already have nodes and edges mapped with a .tidygraph_node_index and .tidygraph_edge_index column. The return value must be a tbl_graph or a list of tbl_graphs and these must contain either a .tidygraph_node_index column or a .tidygraph_edge_index column (or both). Note that it is possible for the morph to have the edges mapped back to the original nodes and vice versa (e.g. as with to_linegraph). In that case the edge data in the morphed graph(s) will contain a .tidygraph_node_index column and/or the node data a .tidygraph_edge_index column. If the morphing results in the collapse of multiple columns or edges the index columns should be converted to list columns mapping the new node/edge back to all the nodes/edges it represents. Furthermore the original node/edge data should be collapsed to a list of tibbles, with the row order matching the order in the index column element.

Value

A morphed_tbl_graph

Examples

create_notable('meredith') %>%
  mutate(group = group_infomap()) %>%
  morph(to_contracted, group) %>%
  mutate(group_centrality = centrality_pagerank()) %>%
  unmorph()

Functions to generate alternate representations of graphs

Description

These functions are meant to be passed into morph() to create a temporary alternate representation of the input graph. They are thus not meant to be called directly. See below for detail of each morpher.

Usage

to_linegraph(graph)

to_subgraph(graph, ..., subset_by = NULL)

to_subcomponent(graph, node)

to_split(graph, ..., split_by = NULL)

to_components(graph, type = "weak", min_order = 1)

to_largest_component(graph, type = "weak")

to_complement(graph, loops = FALSE)

to_local_neighborhood(graph, node, order = 1, mode = "all")

to_dominator_tree(graph, root, mode = "out")

to_minimum_spanning_tree(graph, weights = NULL)

to_random_spanning_tree(graph)

to_shortest_path(graph, from, to, mode = "out", weights = NULL)

to_bfs_tree(graph, root, mode = "out", unreachable = FALSE)

to_dfs_tree(graph, root, mode = "out", unreachable = FALSE)

to_simple(graph, remove_multiples = TRUE, remove_loops = TRUE)

to_contracted(graph, ..., simplify = TRUE)

to_unfolded_tree(graph, root, mode = "out")

to_directed(graph)

to_undirected(graph)

to_hierarchical_clusters(graph, method = "walktrap", weights = NULL, ...)

Arguments

graph

A tbl_graph

...

Arguments to pass on to filter(), group_by(), or the cluster algorithm (see igraph::cluster_walktrap(), igraph::cluster_leading_eigen(), and igraph::cluster_edge_betweenness())

subset_by, split_by

Whether to create subgraphs based on nodes or edges

node

The center of the neighborhood for to_local_neighborhood() and the node to that should be included in the component for to_subcomponent()

type

The type of component to split into. Either 'weak' or 'strong'

min_order

The minimum order (number of vertices) of the component. Components below this will not be created

loops

Should loops be included. Defaults to FALSE

order

The radius of the neighborhood

mode

How should edges be followed? 'out' only follows outbound edges, 'in' only follows inbound edges, and 'all' follows all edges. This parameter is ignored for undirected graphs.

root

The root of the tree

weights

Optional edge weights for the calculations

from, to

The start and end node of the path

unreachable

Should the search jump to a node in a new component when stuck.

remove_multiples

Should edges that run between the same nodes be reduced to one

remove_loops

Should edges that start and end at the same node be removed

simplify

Should edges in the contracted graph be simplified? Defaults to TRUE

method

The clustering method to use. Either 'walktrap', 'leading_eigen', or 'edge_betweenness'

Value

A list of tbl_graphs

Functions

Examples

# Compute only on a subgraph of every even node
create_notable('meredith') %>%
  morph(to_subgraph, seq_len(graph_order()) %% 2 == 0) %>%
  mutate(neighbour_count = centrality_degree()) %>%
  unmorph()

Base implementation of mutate

Description

This implementation of mutate is slightly faster than mutate at the expense of the graph only being updated in the end. This means that graph algorithms will not take changes happening during the mutate call into account.

Usage

mutate_as_tbl(.data, ...)

Arguments

.data

A tbl_graph object

...

columns to mutate

Details

The order of speed increase are rather small and in the ~1 millisecond per mutateed column order, so for regular use this should not be a choice. The operations not supported by mutate_as_tbl are e.g.

gr %>%
  activate(nodes) %>%
  mutate(weights = runif(10), degree = centrality_degree(weights))

as weights will only be made available in the graph at the end of the mutate call.

Value

A tbl_graph object


Querying node measures

Description

These functions are a collection of node measures that do not really fall into the class of centrality measures. For lack of a better place they are collected under the ⁠node_*⁠ umbrella of functions.

Usage

node_eccentricity(mode = "out")

node_constraint(weights = NULL)

node_coreness(mode = "out")

node_diversity(weights)

node_efficiency(weights = NULL, directed = TRUE, mode = "all")

node_bridging_score()

node_effective_network_size()

node_connectivity_impact()

node_closeness_impact()

node_fareness_impact()

Arguments

mode

How edges are treated. In node_coreness() it chooses which kind of coreness measure to calculate. In node_efficiency() it defines how the local neighborhood is created

weights

The weights to use for each node during calculation

directed

Should the graph be treated as a directed graph if it is in fact directed

Value

A numeric vector of the same length as the number of nodes in the graph.

Functions

Examples

# Calculate Burt's Constraint for each node
create_notable('meredith') %>%
  mutate(b_constraint = node_constraint())

Calculate node ranking

Description

This set of functions tries to calculate a ranking of the nodes in a graph so that nodes sharing certain topological traits are in proximity in the resulting order. These functions are of great value when composing matrix layouts and arc diagrams but could concievably be used for other things as well.

Usage

node_rank_hclust(
  method = "average",
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_anneal(
  cool = 0.5,
  tmin = 1e-04,
  swap_to_inversion = 0.5,
  step_multiplier = 100,
  reps = 1,
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_branch_bound(
  weighted_gradient = FALSE,
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_traveller(
  method = "two_opt",
  ...,
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_two(
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_mds(
  method = "cmdscale",
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_leafsort(
  method = "average",
  type = "OLO",
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_visual(
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_spectral(
  normalized = FALSE,
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_spin_out(
  step = 25,
  nstart = 10,
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_spin_in(
  step = 5,
  sigma = seq(20, 1, length.out = 10),
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_quadratic(
  criterion = "2SUM",
  reps = 1,
  step = 2 * graph_order(),
  step_multiplier = 1.1,
  temp_multiplier = 0.5,
  maxsteps = 50,
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_genetic(
  ...,
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_rank_dendser(
  ...,
  dist = "shortest",
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

Arguments

method

The method to use. See Functions section for reference

dist

The algorithm to use for deriving a distance matrix from the graph. One of

  • "shortest" (default): Use the shortest path between all nodes

  • "euclidean": Calculate the L2 norm on the adjacency matrix of the graph

  • "manhattan": Calculate the L1 norm on the adjacency matrix of the graph

  • "maximum": Calculate the supremum norm on the adjacenecy matrix of the graph

  • "canberra": Calculate a weighted manhattan distance on the adjacency matrix of the graph

  • "binary": Calculate distance as the proportion of agreement between nodes based on the adjacency matrix of the graph

or a function that takes a tbl_graph and return a dist object with a size matching the order of the graph.

mode

Which edges should be included in the distance calculation. For distance measures based on the adjacency matrix, 'out' will use the matrix as is, 'in' will use the transpose, and 'all' will take the mean of the two. Defaults to 'out'. Ignored for undirected graphs.

weights

An edge variable to use as weight for the shortest path calculation if dist = 'shortest'

algorithm

The algorithm to use for the shortest path calculation if dist = 'shortest'

cool

cooling rate

tmin

minimum temperature

swap_to_inversion

Proportion of swaps in local neighborhood search

step_multiplier

Multiplication factor for number of iterations per temperature

reps

Number of repeats with random initialisation

weighted_gradient

minimize the weighted gradient measure? Defaults to FALSE

...

Arguments passed on to other algorithms. See Functions section for reference

type

The type of leaf reordering, either 'GW' to use the "GW" method or 'OLO' to use the "OLO" method (both in seriation)

normalized

Should the normalized laplacian of the similarity matrix be used?

step

The number iterations to run per initialisation

nstart

The number of random initialisations to perform

sigma

The variance around the diagonal to use for the weight matrix. Either a single number or a decreasing sequence.

criterion

The criterion to minimize. Either "LS" (Linear Seriation Problem), "2SUM" (2-Sum Problem), "BAR" (Banded Anti-Robinson form), or "Inertia" (Inertia criterion)

temp_multiplier

Temperature multiplication factor between 0 and 1

maxsteps

The upper bound of iterations

Value

An integer vector giving the position of each node in the ranking

Functions

Examples

graph <- create_notable('zachary') %>%
  mutate(rank = node_rank_hclust())


Node properties related to the graph topology

Description

These functions calculate properties that are dependent on the overall topology of the graph.

Usage

node_dominator(root, mode = "out")

node_topo_order(mode = "out")

Arguments

root

The node to start the dominator search from

mode

How should edges be followed. Either 'in' or 'out'

Value

A vector of the same length as the number of nodes in the graph

Functions

Examples

# Sort a graph based on its topological order
create_tree(10, 2) %>%
  arrange(sample(graph_order())) %>%
  mutate(old_ind = seq_len(graph_order())) %>%
  arrange(node_topo_order())

Querying node types

Description

These functions all lets the user query whether each node is of a certain type. All of the functions returns a logical vector indicating whether the node is of the type in question. Do note that the types are not mutually exclusive and that nodes can thus be of multiple types.

Usage

node_is_cut()

node_is_root()

node_is_leaf()

node_is_sink()

node_is_source()

node_is_isolated()

node_is_universal(mode = "out")

node_is_simplical(mode = "out")

node_is_center(mode = "out")

node_is_adjacent(to, mode = "all", include_to = TRUE)

node_is_keyplayer(k, p = 0, tol = 1e-04, maxsec = 120, roundsec = 30)

node_is_connected(nodes, mode = "all", any = FALSE)

Arguments

mode

The way edges should be followed in the case of directed graphs.

to

The nodes to test for adjacency to

include_to

Should the nodes in to be marked as adjacent as well

k

The number of keyplayers to identify

p

The probability to accept a lesser state

tol

Optimisation tolerance, below which the optimisation will stop

maxsec

The total computation budget for the optimization, in seconds

roundsec

Number of seconds in between synchronizing workers' answer

nodes

The set of nodes to test connectivity to. Can be a list to use different sets for different nodes. If a list it will be recycled as necessary.

any

Logical. If TRUE the node only needs to be connected to a single node in the set for it to return TRUE

Value

A logical vector of the same length as the number of nodes in the graph.

Functions

Examples

# Find the root and leafs in a tree
create_tree(40, 2) %>%
  mutate(root = node_is_root(), leaf = node_is_leaf())

Calculate node pair properties

Description

This set of functions can be used for calculations that involve node pairs. If the calculateable measure is not symmetric the function will come in two flavours, differentiated with ⁠_to⁠/⁠_from⁠ suffix. The ⁠*_to()⁠ functions will take the provided node indexes as the target node (recycling if necessary). For the ⁠*_from()⁠ functions the provided nodes are taken as the source. As for the other wrappers provided, they are intended for use inside the tidygraph framework and it is thus not necessary to supply the graph being computed on as the context is known.

Usage

node_adhesion_to(nodes)

node_adhesion_from(nodes)

node_cohesion_to(nodes)

node_cohesion_from(nodes)

node_distance_to(nodes, mode = "out", weights = NULL, algorithm = "automatic")

node_distance_from(
  nodes,
  mode = "out",
  weights = NULL,
  algorithm = "automatic"
)

node_cocitation_with(nodes)

node_bibcoupling_with(nodes)

node_similarity_with(nodes, mode = "out", loops = FALSE, method = "jaccard")

node_max_flow_to(nodes, capacity = NULL)

node_max_flow_from(nodes, capacity = NULL)

Arguments

nodes

The other part of the node pair (the first part is the node defined by the row). Recycled if necessary.

mode

How should edges be followed? If 'all' all edges are considered, if 'in' only inbound edges are considered, and if 'out' only outbound edges are considered

weights

The weights to use for calculation

algorithm

The distance algorithms to use. By default it will try to select the fastest suitable algorithm. Possible values are "automatic", "unweighted", "dijkstra", "bellman-ford", and "johnson"

loops

Should loop edges be considered

method

The similarity measure to calculate. Possible values are: "jaccard", "dice", and "invlogweighted"

capacity

The edge capacity to use

Value

A numeric vector of the same length as the number of nodes in the graph

Functions

Examples

# Calculate the distance to the center node
create_notable('meredith') %>%
  mutate(dist_to_center = node_distance_to(node_is_center()))

Perform a random walk on the graph and return encounter rank

Description

A random walk is a traversal of the graph starting from a node and going a number of steps by picking an edge at random (potentially weighted). random_walk() can be called both when nodes and edges are active and will adapt to return a value fitting to the currently active part. As the walk order cannot be directly encoded in the graph the return value is a list giving a vector of positions along the walk of each node or edge.

Usage

random_walk_rank(n, root = NULL, mode = "out", weights = NULL)

Arguments

n

The number of steps to perform. If the walk gets stuck before reaching this number the walk is terminated

root

The node to start the walk at. If NULL a random node will be used

mode

How edges are followed in the search if the graph is directed. "out" only follows outbound edges, "in" only follows inbound edges, and "all" or "total" follows all edges. This is ignored for undirected graphs.

weights

The weights to use for edges when selecting the next step of the walk. Currently only used when edges are active

Value

A list with an integer vector for each node or edge (depending on what is active) each element encode the time the node/edge is encountered along the walk

Examples

graph <- create_notable("zachary")

# Random walk returning node order
graph |>
  mutate(walk_rank = random_walk_rank(200))

# Rank edges instead
graph |>
  activate(edges) |>
  mutate(walk_rank = random_walk_rank(200))


Objects exported from other packages

Description

These objects are imported from other packages. Follow the links below to see their documentation.

dplyr

anti_join, arrange, contains, distinct, ends_with, everything, filter, full_join, group_by, group_data, group_indices, group_keys, group_size, group_vars, groups, inner_join, left_join, matches, mutate, mutate_all, mutate_at, n, n_groups, num_range, one_of, pull, rename, right_join, sample_frac, sample_n, select, semi_join, slice, slice_head, slice_max, slice_min, slice_sample, slice_tail, starts_with, tbl_vars, top_n, transmute, ungroup

igraph

as.igraph

magrittr

%>%

tibble

as_tibble

tidyr

drop_na, replace_na


Change terminal nodes of edges

Description

The reroute verb lets you change the beginning and end node of edges by specifying the new indexes of the start and/or end node(s). Optionally only a subset of the edges can be rerouted using the subset argument, which should be an expression that are to be evaluated in the context of the edge data and should return an index compliant vector (either logical or integer).

Usage

reroute(.data, from = NULL, to = NULL, subset = NULL)

Arguments

.data

A tbl_graph or morphed_tbl_graph object. grouped_tbl_graph will be ungrouped prior to rerouting

from, to

The new indexes of the terminal nodes. If NULL nothing will be changed

subset

An expression evaluating to an indexing vector in the context of the edge data. If NULL it will use focused edges if available or all edges

Value

An object of the same class as .data

Examples

# Switch direction of edges
create_notable('meredith') %>%
  activate(edges) %>%
  reroute(from = to, to = from)

# Using subset
create_notable('meredith') %>%
  activate(edges) %>%
  reroute(from = 1, subset = to > 10)

Graph games based on direct sampling

Description

This set of graph games creates graphs directly through sampling of different attributes, topologies, etc. The nature of their algorithm is described in detail at the linked igraph documentation.

Usage

play_degree(out_degree, in_degree = NULL, method = "simple")

play_dotprod(position, directed = TRUE)

play_fitness(m, out_fit, in_fit = NULL, loops = FALSE, multiple = FALSE)

play_fitness_power(
  n,
  m,
  out_exp,
  in_exp = -1,
  loops = FALSE,
  multiple = FALSE,
  correct = TRUE
)

play_gnm(n, m, directed = TRUE, loops = FALSE)

play_gnp(n, p, directed = TRUE, loops = FALSE)

play_geometry(n, radius, torus = FALSE)

play_erdos_renyi(n, p, m, directed = TRUE, loops = FALSE)

Arguments

out_degree, in_degree

The degrees of each node in the graph

method

The algorithm to use for the generation. Either 'simple', 'vl', or 'simple.no.multiple'

position

The latent position of each node by column.

directed

Should the resulting graph be directed

m

The number of edges in the graph

out_fit, in_fit

The fitness of each node

loops

Are loop edges allowed

multiple

Are multiple edges allowed

n

The number of nodes in the graph.

out_exp, in_exp

Power law exponent of degree distribution

correct

Use finite size correction

p

The probabilty of an edge occuring

radius

The radius within which vertices are connected

torus

Should the vertices be distributed on a torus instead of a plane

Value

A tbl_graph object

Functions

See Also

Other graph games: component_games, evolution_games, type_games

Examples

plot(play_erdos_renyi(20, 0.3))

Search a graph with depth first and breath first

Description

These functions wraps the igraph::bfs() and igraph::dfs() functions to provide a consistent return value that can be used in dplyr::mutate() calls. Each function returns an integer vector with values matching the order of the nodes in the graph.

Usage

bfs_rank(root, mode = "out", unreachable = FALSE)

bfs_parent(root, mode = "out", unreachable = FALSE)

bfs_before(root, mode = "out", unreachable = FALSE)

bfs_after(root, mode = "out", unreachable = FALSE)

bfs_dist(root, mode = "out", unreachable = FALSE)

dfs_rank(root, mode = "out", unreachable = FALSE)

dfs_rank_out(root, mode = "out", unreachable = FALSE)

dfs_parent(root, mode = "out", unreachable = FALSE)

dfs_dist(root, mode = "out", unreachable = FALSE)

Arguments

root

The node to start the search from

mode

How edges are followed in the search if the graph is directed. "out" only follows outbound edges, "in" only follows inbound edges, and "all" or "total" follows all edges. This is ignored for undirected graphs.

unreachable

Should the search jump to a new component if the search is terminated without all nodes being visited? Default to FALSE (only reach connected nodes).

Value

An integer vector, the nature of which is determined by the function.

Functions

Examples

# Get the depth of each node in a tree
create_tree(10, 2) %>%
  activate(nodes) %>%
  mutate(depth = bfs_dist(root = 1))

# Reorder nodes based on a depth first search from node 3
create_notable('franklin') %>%
  activate(nodes) %>%
  mutate(order = dfs_rank(root = 3)) %>%
  arrange(order)


Graph games based on different node types

Description

This set of games are build around different types of nodes and simulating their interaction. The nature of their algorithm is described in detail at the linked igraph documentation.

Usage

play_preference(
  n,
  n_types,
  p_type = rep(1, n_types),
  p_pref = matrix(1, n_types, n_types),
  fixed = FALSE,
  directed = TRUE,
  loops = FALSE
)

play_preference_asym(
  n,
  n_types,
  p_type = matrix(1, n_types, n_types),
  p_pref = matrix(1, n_types, n_types),
  loops = FALSE
)

play_bipartite(n1, n2, p, m, directed = TRUE, mode = "out")

play_traits(
  n,
  n_types,
  growth = 1,
  p_type = rep(1, n_types),
  p_pref = matrix(1, n_types, n_types),
  callaway = TRUE,
  directed = TRUE
)

play_citation_type(
  n,
  growth,
  types = rep(0, n),
  p_pref = rep(1, length(unique(types))),
  directed = TRUE
)

Arguments

n, n1, n2

The number of nodes in the graph. For bipartite graphs n1 and n2 specifies the number of nodes of each type.

n_types

The number of different node types in the graph

p_type

The probability that a node will be the given type. Either a vector or a matrix, depending on the game

p_pref

The probability that an edge will be made to a type. Either a vector or a matrix, depending on the game

fixed

Should n_types be understood as a fixed number of nodes for each type rather than as a probability

directed

Should the resulting graph be directed

loops

Are loop edges allowed

p

The probabilty of an edge occuring

m

The number of edges in the graph

mode

The flow direction of edges

growth

The number of edges added at each iteration

callaway

Use the callaway version of the trait based game

types

The type of each node in the graph, enumerated from 0

Value

A tbl_graph object

Functions

See Also

Other graph games: component_games, evolution_games, sampling_games

Examples

plot(play_bipartite(20, 30, 0.4))


Evaluate a tidygraph algorithm in the context of a graph

Description

All tidygraph algorithms are meant to be called inside tidygraph verbs such as mutate(), where the graph that is currently being worked on is known and thus not needed as an argument to the function. In the off chance that you want to use an algorithm outside of the tidygraph framework you can use with_graph() to set the graph context temporarily while the algorithm is being evaluated.

Usage

with_graph(graph, expr)

Arguments

graph

The tbl_graph to use as context

expr

The expression to evaluate

Value

The value of expr

Examples

gr <- play_gnp(10, 0.3)

with_graph(gr, centrality_degree())

mirror server hosted at Truenetwork, Russian Federation.