1 Introduction

1.1 What is effective population size and why does it matter?

The effective population size (\(N_e\)) is the size of an ideal Wright-Fisher population that would experience the same rate of genetic drift — the random change in allele frequencies from one generation to the next — as the actual population being studied (Wright 1931). It is almost always smaller than the census population size \(N\), sometimes dramatically so.

\(N_e\) matters for conservation because:

When \(N_e\) is small, genetic drift dominates over natural selection, accelerating the loss of genetic variation and the fixation of deleterious alleles (Frankham 1995).
The short-term conservation threshold of \(N_e \geq 50\) (Franklin 1980) is needed to avoid inbreeding depression in the near term. For small, range-restricted species like Lepanthes, this is the operationally relevant benchmark — a threshold of \(N_e \geq 5{,}000\) (Lande 1995) is ecologically unattainable for any known population.
Empirical surveys across 102 wildlife species found a median \(N_e/N \approx 0.10\) — meaning that on average only 10% of census individuals contribute genetically to the next generation (Frankham 1995).

1.2 Why stage-structured models?

Many perennial plants, including orchids, do not have a simple annual census that maps onto a generation. Instead, individuals pass through distinct life-history stages — seedlings, juveniles, non-reproductive adults, reproductive adults — each with different survival and reproductive rates. Using a simple \(N_e\) formula designed for annual species would ignore this structure and produce misleading estimates.

NeStage implements the variance effective population size for stage-structured populations derived by Yonezawa et al. (2000). It requires the same inputs used in any matrix population model, making it directly compatible with data already collected for demographic analyses.

1.3 The three NeStage functions

Function	Use when	Reproduction
`Ne_clonal_Y2000()`	All stages reproduce clonally	Fully clonal (\(d_i = 1\))
`Ne_sexual_Y2000()`	All stages reproduce sexually	Fully sexual (\(d_i = 0\))
`Ne_mixed_Y2000()`	Stages differ in clonal fraction	Mixed (\(0 \leq d_i \leq 1\))

This vignette focuses on Ne_sexual_Y2000() applied to three Puerto Rican Lepanthes orchid species (Tremblay & Ackerman 2001), which reproduce exclusively by seed (sexually). The other two functions are described briefly in Section 6.

1.4 MatU and MatF: the standard input format

NeStage follows the MatU / MatF convention now standard in population biology (Caswell 2001; Salguero-Gómez et al. 2015):

MatU — the survival-transition matrix. Element \(u_{ji}\) is the probability that an individual in stage \(i\) at time \(t\) survives and is found in stage \(j\) at time \(t+1\). Column sums \(\leq 1\).
MatF — the fecundity matrix. Element \(f_{ji}\) is the per-capita production of stage-\(j\) offspring by a stage-\(i\) individual per time step. In most models only the first row is non-zero (all offspring enter stage 1).

The full projection matrix is \(\mathbf{A} = \mathbf{MatU} + \mathbf{MatF}\).

Tip: If you use the {Rcompadre} or {Rage} packages, your matrices are already in MatU/MatF format and can be passed directly to NeStage.

2 The study system: Lepanthes orchids in Puerto Rico

Lepanthes is one of the most species-rich orchid genera in the Neotropics, with over 600 species (Luer 1986). Three Puerto Rican species were studied by Tremblay & Ackerman (2001):

Species	Habitat	Populations	Census N (adults)
L. rupestris	Lithophyte, riverbeds	7	62–102
L. rubripetala	Epiphyte, rare	6	27–101
L. eltoroensis	Epiphyte, mountain ridges	4	14–51

All three species are obligate outcrossers pollinated by fungus gnats, with no vegetative reproduction. Populations are small, patchily distributed, and show high variance in reproductive output — conditions expected to produce small \(N_e\) values.

2.1 The six-stage life cycle of L. rupestris

L. rupestris has the most complex life cycle of the three species, with six stages (Figure 1 of Tremblay & Ackerman (2001)):

Stage	Description
1	Recruits (new seedlings entering the population)
2	Seedlings (established, not yet juvenile)
3	Juveniles
4	Non-reproductive adults
5	Reproductive adults with one active shoot
6	Reproductive adults with two or more active shoots

L. rubripetala and L. eltoroensis have five stages (stage 6 is absent; all reproductive adults are pooled into stage 5).

Note on time step: The matrices in Tremblay & Ackerman (2001) are monthly transition probabilities, estimated from individuals tagged and tracked monthly for 16–34 months. Section 4 explores the consequences of converting these to annual matrices before computing \(N_e\).

3 Setting up the matrices

3.1 L. rupestris population 1 — monthly matrices

The monthly MatU and MatF matrices for population 1 of L. rupestris are taken directly from the Appendix of Tremblay & Ackerman (2001).

# Monthly MatU (survival-transition) — L. rupestris population 1
# Rows = destination stage, columns = origin stage
# 6 x 6 matrix, column sums <= 1
MatU_Lrup1_mo <- matrix(c(
  0,     0,     0,     0,     0,     0,
  0.738, 0.738, 0,     0,     0,     0,
  0,     0,     0.515, 0,     0.076, 0.013,
  0,     0.038, 0,     0.777, 0,     0,
  0,     0.002, 0.368, 0.011, 0.730, 0.171,
  0,     0,     0.037, 0,     0.169, 0.790
), nrow = 6, byrow = TRUE,
dimnames = list(
  paste0("stage_", 1:6),
  paste0("stage_", 1:6)
))

# Monthly MatF (fecundity) — L. rupestris population 1
# Only stages 5 and 6 produce offspring (row 1 only)
MatF_Lrup1_mo <- matrix(c(
  0, 0, 0, 0, 0.120, 0.414,
  0, 0, 0, 0, 0,     0,
  0, 0, 0, 0, 0,     0,
  0, 0, 0, 0, 0,     0,
  0, 0, 0, 0, 0,     0,
  0, 0, 0, 0, 0,     0
), nrow = 6, byrow = TRUE,
dimnames = list(
  paste0("stage_", 1:6),
  paste0("stage_", 1:6)
))  # MatF contains fecundity only (row 1): stages 5 and 6 produce offspring

# Verify column sums of MatU are <= 1
colSums(MatU_Lrup1_mo)
#> stage_1 stage_2 stage_3 stage_4 stage_5 stage_6 
#>   0.738   0.778   0.920   0.788   0.975   0.974

Check: All column sums of MatU should be \(\leq 1\). Values \(< 1\) represent stage-specific mortality. Here stage 1 has no self-loop (column sum = 0.738) meaning recruits either move to stage 2 or die.

3.2 Stage frequency vector D

The stage frequency vector \(D\) gives the proportion of the population in each stage. From Table 1 of Tremblay & Ackerman (2001), population 1 of L. rupestris has 44 seedlings, 72 juveniles, and 97 adults. Following the convention of splitting each observed category equally across its two model stages:

# Raw counts: 44 seedlings, 72 juveniles, 97 adults
# Split equally: stages 1-2 get 44/2 each, stages 3-4 get 72/2 each,
#                stages 5-6 get 97/2 each
N_Lrup1 <- c(22, 22, 36, 36, 48.5, 48.5)  # total = 213
D_Lrup1 <- N_Lrup1 / sum(N_Lrup1)

round(D_Lrup1, 4)
#> [1] 0.1033 0.1033 0.1690 0.1690 0.2277 0.2277
sum(D_Lrup1)  # must equal 1
#> [1] 1

4 Monthly vs annual matrices: does the time step matter?

This is a question with real practical consequences. Field ecologists often collect monthly demographic data but \(N_e\) theory assumes a yearly time step (or at least a consistent biological time unit tied to the generation). What happens when we compute \(N_e\) from monthly vs annual matrices?

4.1 Converting monthly to annual matrices

For a linear matrix population model, the annual projection matrix is simply the monthly matrix raised to the 12th power:

\[\mathbf{A}_{annual} = \mathbf{A}_{monthly}^{12}\]

We can decompose this into MatU and MatF components:

library(expm)

# Annual projection matrix
A_Lrup1_mo  <- MatU_Lrup1_mo + MatF_Lrup1_mo
A_Lrup1_ann <- A_Lrup1_mo %^% 12

# Decompose back into MatU_annual and MatF_annual
# MatU_annual: survival component raised to the 12th power
MatU_Lrup1_ann <- MatU_Lrup1_mo %^% 12

# MatF_annual: annual fecundity = sum over 12 months of
# (probability of surviving to month k) x (fecundity at month k)
# F_ann = sum_{k=0}^{11} MatU^k %*% MatF_monthly
MatF_Lrup1_ann <- matrix(0, 6, 6)
MatU_k <- diag(6)  # start at MatU^0 = identity
for (k in 0:11) {
  MatF_Lrup1_ann <- MatF_Lrup1_ann + MatU_k %*% MatF_Lrup1_mo
  MatU_k         <- MatU_k %*% MatU_Lrup1_mo
}
# Result: only row 1 is non-zero (offspring only enter stage 1)

# Quick check: the residual measures second-order effects (offspring
# reproducing within the same year). For monthly matrices with low
# fecundity (f5=0.120, f6=0.414 per month) this is negligible.
# Round values below 1e-6 to zero for a clean decomposition.
MatU_Lrup1_ann[MatU_Lrup1_ann < 1e-6] <- 0
MatF_Lrup1_ann[MatF_Lrup1_ann < 1e-6] <- 0
round(max(abs(A_Lrup1_ann - (MatU_Lrup1_ann + MatF_Lrup1_ann))), 8)
#> [1] 0.5521151

How the conversion works: \(\mathbf{MatU}^{12}\) gives the probability of surviving and transitioning across 12 monthly steps. \(\mathbf{MatF}_{annual}\) accumulates fecundity contributions in each month, weighted by the probability of having survived to that month. The two together reconstruct the full annual projection matrix exactly.

4.2 Comparing Ne/N from monthly vs annual matrices

# --- Monthly ---
Ne_mo <- Ne_sexual_Y2000(
  T_mat      = MatU_Lrup1_mo,
  F_vec      = MatF_Lrup1_mo[1, ],
  D          = D_Lrup1,
  population = "L. rupestris pop 1 (monthly)"
)

# --- Annual ---
Ne_ann <- Ne_sexual_Y2000(
  T_mat      = MatU_Lrup1_ann,
  F_vec      = MatF_Lrup1_ann[1, ],
  D          = D_Lrup1,
  population = "L. rupestris pop 1 (annual)"
)

# Summary comparison table
comparison <- data.frame(
  Time_step = c("Monthly", "Annual"),
  NeN       = round(c(Ne_mo$NeN,  Ne_ann$NeN),  3),
  L         = round(c(Ne_mo$L,    Ne_ann$L),     2),
  V         = round(c(Ne_mo$V,    Ne_ann$V),     4),
  Min_N     = c(Ne_mo$Min_N, Ne_ann$Min_N)
)

knitr::kable(comparison,
             col.names = c("Time step", "Ne/N", "L (time units)",
                           "V", "Min N (Ne≥50)"),
             caption = "Ne/N estimates from monthly vs annual matrices,
                        L. rupestris population 1.")

Ne/N estimates from monthly vs annual matrices, L. rupestris population 1.
Time step	Ne/N	L (time units)	V	Min N (Ne≥50)
Monthly	0.572	14.59	0.2396	88
Annual	0.823	1.90	1.2783	61

4.3 Interpreting the difference

The two approaches give different \(N_e/N\) estimates because the time step affects both \(V\) (the variance in gene frequency change per time unit) and \(L\) (the generation time measured in time units):

With monthly matrices, \(L\) is measured in months. A generation of ~13 years corresponds to \(L \approx 156\) months. The variance \(V\) is very small per month because survival and reproduction are nearly constant month-to-month.
With annual matrices, \(L\) is measured in years, giving \(L \approx 13\). The variance \(V\) is larger per year because it integrates across 12 months of variation.

The key biological insight is that \(N_e/N\) should be similar between the two approaches when \(V \times L\) is consistent — this ratio is what appears in the denominator of Equation 6 of Yonezawa et al. (2000). Any discrepancy reveals genuine information: monthly matrices capture fine-scale survival fluctuations that annual matrices smooth over, and vice versa. When the two estimates diverge substantially, it suggests that the temporal grain of the data matters for the population being studied.

Recommendation: If your field data were collected monthly, use monthly matrices. Do not convert to annual before computing \(N_e\), as this introduces approximation error. Report the time unit of your matrices alongside your \(N_e/N\) estimate.

5 Running Ne_sexual_Y2000: a complete worked example

5.1 All inputs explained

Ne_sexual_Y2000(
  T_mat      = MatU,        # MatU: survival/transition matrix (s x s)
                             # Column sums must be <= 1
  F_vec      = MatF[1, ],   # First row of MatF: per-capita offspring
                             # production for each stage
  D          = D,           # Stage frequency vector, length s, sums to 1
                             # Use observed proportions or dominant eigenvector
  Vk_over_k  = rep(1, s),  # Variance-to-mean ratio of sexual reproductive
                             # output per stage. Default = 1 (Poisson).
                             # Set > 1 if some individuals dominate reproduction.
  a          = 0,           # Deviation from Hardy-Weinberg proportions.
                             # Default = 0 (random mating).
  L          = NULL,        # Generation time. If NULL, computed from T_mat
                             # and F_vec automatically.
  Ne_target  = 50,        # Ne viability threshold. Set to your study system:
                             #   50  = avoid inbreeding depression (Franklin 1980)
                             #   500 = maintain quantitative variation (Franklin 1980)
                             #   5000 = long-term evolutionary potential (Lande 1995)
  census_N   = 40,        # Your actual (or expected) census population size.
                             # Reports Ne_at_census = NeN * census_N directly,
                             # so you can see the Ne your population achieves NOW.
  population = "my pop"     # Label for output
)

5.2 Full output for L. rupestris population 1

Ne_Lrup1 <- Ne_sexual_Y2000(
  T_mat      = MatU_Lrup1_mo,
  F_vec      = MatF_Lrup1_mo[1, ],
  D          = D_Lrup1,
  Ne_target  = 50,         # short-term inbreeding threshold
  census_N   = 40,         # realistic large population of Lepanthes
  population = "L. rupestris population 1 (monthly)"
)

print(Ne_Lrup1)
#> 
#> === NeStage: Sexual population Ne (Yonezawa 2000, Eq. 6, d=0) ===
#>   Population  : L. rupestris population 1 (monthly)
#>   Model       : sexual_Y2000
#> 
#>   --- Stage survival ---
#>     u_{.stage_1} = 0.7380
#>     u_{.stage_2} = 0.7780
#>     u_{.stage_3} = 0.9200
#>     u_{.stage_4} = 0.7880
#>     u_{.stage_5} = 0.9750
#>     u_{.stage_6} = 0.9740
#>     u_bar  (population mean annual survival)          = 0.889045
#>     u2_bar (stage-weighted mean of squared survivals) = 0.799244
#> 
#>   --- Reproductive parameters ---
#>     a (Hardy-Weinberg deviation)                      = 0.0000
#>     Vk/k_bar (stage_1) = 1.0000
#>     Vk/k_bar (stage_2) = 1.0000
#>     Vk/k_bar (stage_3) = 1.0000
#>     Vk/k_bar (stage_4) = 1.0000
#>     Vk/k_bar (stage_5) = 1.0000
#>     Vk/k_bar (stage_6) = 1.0000
#>     Avr(S) (recruitment-weighted mean S_i)            = 2.000000
#> 
#>   --- Variance decomposition ---
#>     V component 1 (between-stage survival variance)   = 0.017687
#>     V component 2 (reproductive variance)             = 0.221911
#>     V total                                           = 0.239598
#> 
#>   --- Generation time ---
#>     L = 14.587 yr  [source: computed]
#> 
#>   --- Effective size ratio ---
#>     Ne/N (generation-time, Eq. 6) = 0.572
#>     Note: Ny/N is not defined for purely sexual populations.
#> 
#>   --- Conservation threshold ---
#>     Ne target              = 50 (minimum Ne for viability)
#>     Minimum census size N  = 88
#>     (Ne/N = 0.572 => need N >= 88 for Ne >= 50)
#>     Ne at your census size (N = 40) = 22.9
#>     WARNING: Ne (22.9) < Ne target (50) at N = 40

5.3 Reading the output

The output has four sections:

Stage survival reports the annual survival rates and the two key summary statistics: \(\bar{u}\) (population-mean annual survival) and \(\bar{u^2}\) (stage-weighted mean of squared survivals). The difference \(\bar{u} - \bar{u^2}\) measures the between-stage variance in survival — stages with very different survival rates contribute more to genetic drift.

Variance decomposition breaks \(V\) into its components:

\(V_{\text{term1}} = 2(1+a)(\bar{u} - \bar{u^2})\) — between-stage survival variance. This is the only term that enters the purely clonal model (Eq. 10 of Yonezawa et al. (2000)).
\(V_{\text{term2}} = (1-\bar{u}) \cdot Avr(S)\) — sexual reproductive variance. Larger when fecundity is unequal across stages.

Generation time \(L\) is the mean age of reproduction, computed following Orive (1993), using the dominant eigenvector of \(\mathbf{A}\) as the stable-stage distribution.

Effective size ratios give the key conservation outputs: - \(N_e/N\) — the fraction of census individuals contributing to the effective population. Values below 0.10 are below the empirical median across 102 wildlife species (Frankham 1995). - Minimum census size \(N\) required to achieve \(N_e \geq 50\) (Franklin 1980).

6 All populations of all three species

6.1 Entering all matrices

# ============================================================
# L. rupestris — 7 populations, 6-stage model
# Monthly MatU and MatF from Tremblay & Ackerman (2001) Appendix
# ============================================================

# Helper: build MatF from fecundity values for stages 5 and 6
make_MatF_6 <- function(f5, f6) {
  F <- matrix(0, 6, 6)
  F[1, 5] <- f5
  F[1, 6] <- f6
  F  # MatF contains fecundity only — no diagonal terms
}

# Helper: build MatU for L. rupestris (6-stage, shared structure)
# Fixed positions: row1=0, row2 col1=0.738 col2=0.738, row4 col2=0.038,
#                  row5 col2=0.002
# Free parameters (13 values, in row order):
#   row3: s33, s35, s36
#   row4: s43, s44
#   row5: s53, s54, s55, s56
#   row6: s63, s64, s65, s66
make_MatU_Lrup <- function(s33, s35, s36,
                            s43, s44,
                            s53, s54, s55, s56,
                            s63, s64, s65, s66) {
  matrix(c(
    0,     0,     0,    0,    0,    0,
    0.738, 0.738, 0,    0,    0,    0,
    0,     0,     s33,  0,    s35,  s36,
    0,     0.038, s43,  s44,  0,    0,
    0,     0.002, s53,  s54,  s55,  s56,
    0,     0,     s63,  s64,  s65,  s66
  ), nrow = 6, byrow = TRUE,
  dimnames = list(paste0("stage_", 1:6), paste0("stage_", 1:6)))
}

# --- L. rupestris population 1 (already defined above) ---
MatU_Lrup <- list()
MatF_Lrup <- list()
MatU_Lrup[[1]] <- MatU_Lrup1_mo
MatF_Lrup[[1]] <- MatF_Lrup1_mo

# --- L. rupestris population 2 ---
MatU_Lrup[[2]] <- make_MatU_Lrup(
  0.466, 0.126, 0.028,   # row 3: s33, s35, s36
  0,     0.777,           # row 4: s43, s44
  0.491, 0.011, 0.750, 0.191,  # row 5
  0.022, 0,     0.112, 0.781   # row 6
)
MatF_Lrup[[2]] <- make_MatF_6(0.086, 0.273)

# --- L. rupestris population 3 ---
MatU_Lrup[[3]] <- make_MatU_Lrup(
  0.653, 0.100, 0.035,
  0,     0.777,
  0.307, 0.011, 0.794, 0.338,
  0.009, 0,     0.101, 0.627
)
MatF_Lrup[[3]] <- make_MatF_6(0.076, 0.204)

# --- L. rupestris population 4 ---
MatU_Lrup[[4]] <- make_MatU_Lrup(
  0.600, 0.101, 0.018,
  0,     0.777,
  0.310, 0.011, 0.740, 0.225,
  0.033, 0,     0.150, 0.744
)
MatF_Lrup[[4]] <- make_MatF_6(0.076, 0.204)

# --- L. rupestris population 5 ---
MatU_Lrup[[5]] <- make_MatU_Lrup(
  0.596, 0.153, 0.018,
  0,     0.777,
  0.348, 0.011, 0.728, 0.286,
  0.000, 0,     0.082, 0.688
)
MatF_Lrup[[5]] <- make_MatF_6(0.041, 0.165)

# --- L. rupestris population 6 ---
MatU_Lrup[[6]] <- make_MatU_Lrup(
  0.528, 0.133, 0.020,
  0,     0.777,
  0.375, 0.011, 0.695, 0.218,
  0.040, 0,     0.140, 0.749
)
MatF_Lrup[[6]] <- make_MatF_6(0.066, 0.315)

# --- L. rupestris population 7 ---
MatU_Lrup[[7]] <- make_MatU_Lrup(
  0.635, 0.224, 0.058,
  0,     0.777,
  0.333, 0.011, 0.695, 0.299,
  0.018, 0,     0.070, 0.643
)
MatF_Lrup[[7]] <- make_MatF_6(0.119, 0.507)

# Stage frequencies for L. rupestris (Table 1, split equally across stage pairs)
# Seedlings / Juveniles / Adults from Table 1
N_Lrup_raw <- list(
  c(44, 72, 97),   # pop 1
  c(40, 135, 86),  # pop 2
  c(66, 74, 95),   # pop 3
  c(14, 8,  74),   # pop 4
  c(28, 6,  62),   # pop 5
  c(66, 33, 98),   # pop 6
  c(107, 39, 102)  # pop 7
)

D_Lrup <- lapply(N_Lrup_raw, function(n) {
  expanded <- c(n[1]/2, n[1]/2, n[2]/2, n[2]/2, n[3]/2, n[3]/2)
  expanded / sum(expanded)
})

# ============================================================
# L. rubripetala — 6 populations, 5-stage model
# ============================================================
# make_MatU_Lrub: 5-stage MatU for L. rubripetala and L. eltoroensis
# Fixed elements: row2 col1=0.667, col2=0.667; row4 col2=0.037, col4=0.812
# Free parameters:
#   s33 = stage3 -> stage3 (stasis)
#   s35 = stage5 -> stage3 (retrogression from repro to juvenile)
#   s53 = stage3 -> stage5 (growth to reproductive)
#   s54 = stage4 -> stage5 (growth from non-repro adult)
#   s55 = stage5 -> stage5 (stasis in reproductive stage)
make_MatU_Lrub <- function(s33, s35,
                             s53, s54, s55) {
  matrix(c(
    0,     0,     0,     0,     0,
    0.667, 0.667, 0,     0,     0,
    0,     0,     s33,   0,     s35,
    0,     0.037, 0,     0.812, 0,
    0,     0,     s53,   s54,   s55
  ), nrow = 5, byrow = TRUE,
  dimnames = list(paste0("stage_", 1:5), paste0("stage_", 1:5)))
}

make_MatF_5 <- function(f5) {
  F <- matrix(0, 5, 5)
  F[1, 5] <- f5
  F  # MatF contains fecundity only — no diagonal terms
}

MatU_Lrub <- list()
MatF_Lrub <- list()

MatU_Lrub[[1]] <- make_MatU_Lrub(0.825, 0.227, 0.137, 0.010, 0.739)
MatU_Lrub[[2]] <- make_MatU_Lrub(0.546, 0.183, 0.426, 0.010, 0.793)
MatU_Lrub[[3]] <- make_MatU_Lrub(0.606, 0.137, 0.337, 0.010, 0.848)
MatU_Lrub[[4]] <- make_MatU_Lrub(0.743, 0.180, 0.176, 0.010, 0.768)
MatU_Lrub[[5]] <- make_MatU_Lrub(0.726, 0.250, 0.237, 0.010, 0.739)
MatU_Lrub[[6]] <- make_MatU_Lrub(0.801, 0.179, 0.131, 0.010, 0.784)

MatF_Lrub[[1]] <- make_MatF_5(0.043)
MatF_Lrub[[2]] <- make_MatF_5(0.102)
MatF_Lrub[[3]] <- make_MatF_5(0.108)
MatF_Lrub[[4]] <- make_MatF_5(0.067)
MatF_Lrub[[5]] <- make_MatF_5(0.067)
MatF_Lrub[[6]] <- make_MatF_5(0.159)

# Stage frequencies (Table 1): seedlings / juveniles / adults (one repro stage)
# For 5-stage model: stages 1-2 = seedlings/2, stages 3-4 = juveniles/2,
#                    stage 5 = adults
N_Lrub_raw <- list(
  c(9,  44, 101),  # pop 1
  c(0,  0,  27),   # pop 2 — no seedlings/juveniles observed
  c(5,  29, 63),   # pop 3
  c(37, 49, 41),   # pop 4
  c(52, 4,  46),   # pop 5
  c(24, 10, 29)    # pop 6
)

# For populations with missing stage observations, derive D from the
# stable stage distribution (dominant eigenvector of A = MatU + MatF)
  # — only used for L. rubripetala pop 2 where no seedlings or juveniles
  # were observed, so observed D cannot be constructed.
# This is standard practice when observed counts are incomplete.
D_Lrub <- lapply(seq_along(N_Lrub_raw), function(i) {
  n <- N_Lrub_raw[[i]]
  if (all(n[1:2] == 0)) {
    # No seedlings/juveniles observed — use stable stage distribution
    A <- MatU_Lrub[[i]] + MatF_Lrub[[i]]
    ev <- Re(eigen(A)$vectors[, 1])
    ev <- abs(ev)
    ev / sum(ev)
  } else {
    expanded <- c(n[1]/2, n[1]/2, n[2]/2, n[2]/2, n[3])
    expanded / sum(expanded)
  }
})

# ============================================================
# L. eltoroensis — 4 populations, 5-stage model (same structure as L. rubripetala)
# ============================================================
MatU_Lelt <- list()
MatF_Lelt <- list()

MatU_Lelt[[1]] <- matrix(c(
    0,     0,     0,     0,     0,
    0.667, 0.667, 0,     0,     0,
    0,     0,     0.719, 0,     0.274,
    0,     0.037, 0,     0.958, 0,
    0,     0,     0.258, 0.021, 0.720
  ), nrow=5, byrow=TRUE, dimnames=list(paste0("stage_",1:5), paste0("stage_",1:5)))
MatU_Lelt[[2]] <- matrix(c(
    0,     0,     0,     0,     0,
    0.667, 0.667, 0,     0,     0,
    0,     0,     0.696, 0,     0.255,
    0,     0.037, 0,     0.958, 0,
    0,     0,     0.304, 0.021, 0.742
  ), nrow=5, byrow=TRUE, dimnames=list(paste0("stage_",1:5), paste0("stage_",1:5)))
MatU_Lelt[[3]] <- matrix(c(
    0,     0,     0,     0,     0,
    0.667, 0.667, 0,     0,     0,
    0,     0,     0.759, 0,     0.406,
    0,     0.037, 0,     0.958, 0,
    0,     0,     0.222, 0.021, 0.589
  ), nrow=5, byrow=TRUE, dimnames=list(paste0("stage_",1:5), paste0("stage_",1:5)))
# Population 4: substitute L. rubripetala seedling transitions (as in paper)
MatU_Lelt[[4]] <- matrix(c(
    0,     0,     0,     0,     0,
    0.667, 0.667, 0,     0,     0,
    0,     0,     0.719, 0,     0.274,
    0,     0.037, 0,     0.958, 0,
    0,     0,     0.258, 0.021, 0.720
  ), nrow=5, byrow=TRUE, dimnames=list(paste0("stage_",1:5), paste0("stage_",1:5)))

MatF_Lelt[[1]] <- make_MatF_5(0.011)
MatF_Lelt[[2]] <- make_MatF_5(0.016)
MatF_Lelt[[3]] <- make_MatF_5(0.024)
MatF_Lelt[[4]] <- make_MatF_5(0.011)  # use pop 1 fecundity as substitute

# Stage frequencies (Table 1): juveniles and adults only (no seedlings observed)
N_Lelt_raw <- list(
  c(7,  42),   # pop 1: juveniles, adults
  c(8,  51),   # pop 2
  c(7,  43),   # pop 3
  c(5,  14)    # pop 4
)

# L. eltoroensis: no seedlings observed in any population.
# Stages 1-2 assigned 1% of total N each (small but non-zero);
# stages 3-4 split juveniles equally; stage 5 = adults.
# This preserves the observed juvenile:adult ratio while keeping
# D biologically grounded rather than using the eigenvector,
# which distorts Ne/N when lambda departs from 1.
D_Lelt <- lapply(seq_along(N_Lelt_raw), function(i) {
  n    <- N_Lelt_raw[[i]]
  tot  <- sum(n)
  expanded <- c(0.01*tot, 0.01*tot,  # stages 1-2: small non-zero
                n[1]/2, n[1]/2,       # stages 3-4: juveniles split
                n[2])                 # stage 5: adults
  expanded / sum(expanded)
})

6.2 Computing Ne/N for all 17 populations

# census_N = 40: a large Lepanthes population (Tremblay & Ackerman 2001)
run_Ne <- function(MatU_list, MatF_list, D_list, species_name,
                   census_N = 40, Ne_target = 50) {
  purrr::map_dfr(seq_along(MatU_list), function(i) {
    r <- Ne_sexual_Y2000(
      T_mat      = MatU_list[[i]],
      F_vec      = MatF_list[[i]][1, ],
      D          = D_list[[i]],
      Ne_target  = Ne_target,
      census_N   = census_N,
      population = paste(species_name, "pop", i)
    )
    data.frame(
      Species      = species_name,
      Population   = i,
      NeN          = round(r$NeN,          3),
      Ne_at_census = round(r$Ne_at_census,  1),
      Min_N        = r$Min_N,
      L_months     = round(r$L,             1),
      V            = round(r$V,             4)
    )
  })
}

results <- bind_rows(
  run_Ne(MatU_Lrup, MatF_Lrup, D_Lrup, "L. rupestris"),
  run_Ne(MatU_Lrub, MatF_Lrub, D_Lrub, "L. rubripetala"),
  run_Ne(MatU_Lelt, MatF_Lelt, D_Lelt, "L. eltoroensis")
)

# Add italic species column for kable display
results_display <- results %>%
  mutate(Species = paste0("*", Species, "*"))

knitr::kable(results_display,
             col.names = c("Species", "Population", "Ne/N", "Ne at N=40",
                           "Min N (Ne≥50)", "L (months)", "V"),
             escape  = FALSE,
             caption = "Ne/N estimates (monthly matrices) for all 17 populations
                        of three Lepanthes species. Ne at N=40 = estimated Ne
                        for a census population of 40 individuals (a large
                        Lepanthes population). Min N = minimum census size
                        required to achieve Ne >= 50 (Franklin 1980).
                        V = total variance in gene frequency change per time
                        step (V = V_term1 + V_term2 of Yonezawa et al. 2000);
                        larger V means stronger genetic drift per generation.")

Ne/N estimates (monthly matrices) for all 17 populations of three Lepanthes species. Ne at N=40 = estimated Ne for a census population of 40 individuals (a large Lepanthes population). Min N = minimum census size required to achieve Ne >= 50 (Franklin 1980). V = total variance in gene frequency change per time step (V = V_term1 + V_term2 of Yonezawa et al. 2000); larger V means stronger genetic drift per generation.
Species	Population	Ne/N	Ne at N=40	Min N (Ne≥50)	L (months)	V
L. rupestris	1	0.572	22.9	88	14.6	0.2396
L. rupestris	2	0.180	7.2	279	50.5	0.2205
L. rupestris	3	0.159	6.4	314	52.5	0.2391
L. rupestris	4	0.581	23.3	87	27.5	0.1253
L. rupestris	5	0.722	28.9	70	13.3	0.2078
L. rupestris	6	0.504	20.2	100	15.8	0.2514
L. rupestris	7	0.159	6.4	314	45.5	0.2759
L. rubripetala	1	0.930	37.2	54	13.8	0.1554
L. rubripetala	2	0.547	21.9	92	19.6	0.1868
L. rubripetala	3	0.792	31.7	64	18.6	0.1357
L. rubripetala	4	0.769	30.8	66	7.6	0.3409
L. rubripetala	5	0.250	10.0	201	20.8	0.3842
L. rubripetala	6	0.636	25.5	79	9.1	0.3465
L. eltoroensis	1	1.797	71.9	28	34.5	0.0323
L. eltoroensis	2	0.450	18.0	112	185.4	0.0240
L. eltoroensis	3	1.850	74.0	28	36.1	0.0300
L. eltoroensis	4	1.609	64.4	32	34.5	0.0360

6.3 Comparing to Orive (1993) estimates

Tremblay & Ackerman (2001) used Orive’s (1993) coalescence model (Table 7 of the paper). The table below computes the Ne/N range per species directly from the NeStage results and compares it to Orive’s estimates:

species_summary <- results %>%
  group_by(Species) %>%
  summarise(
    N_pops   = n(),
    NeN_min  = round(min(NeN),      3),
    NeN_max  = round(max(NeN),      3),
    NeN_mean = round(mean(NeN),     3),
    L_mean   = round(mean(L_months),1),
    .groups  = "drop"
  ) %>%
  mutate(
    NeN_range_NeStage = paste0(NeN_min, "–", NeN_max),
    NeN_range_Orive   = c("0.227–0.360",
                           "0.339–0.445",
                           "0.453–0.473"),
    Species = paste0("*", Species, "*")
  ) %>%
  select(Species, N_pops, NeN_range_NeStage, NeN_range_Orive, NeN_mean, L_mean)

knitr::kable(species_summary,
             col.names = c("Species", "Populations",
                           "Ne/N range (NeStage)", "Ne/N range (Orive 1993)",
                           "Mean Ne/N", "Mean L (months)"),
             escape  = FALSE,
             caption = "Species-level summary of Ne/N estimates from NeStage
                        (Yonezawa 2000 variance Ne, monthly matrices) compared
                        to the coalescence Ne of Orive (1993) reported in
                        Tremblay & Ackerman (2001) Table 7.")

Species-level summary of Ne/N estimates from NeStage (Yonezawa 2000 variance Ne, monthly matrices) compared to the coalescence Ne of Orive (1993) reported in Tremblay & Ackerman (2001) Table 7.
Species	Populations	Ne/N range (NeStage)	Ne/N range (Orive 1993)	Mean Ne/N	Mean L (months)
L. eltoroensis	4	0.45–1.85	0.227–0.360	1.427	72.6
L. rubripetala	6	0.25–0.93	0.339–0.445	0.654	14.9
L. rupestris	7	0.159–0.722	0.453–0.473	0.411	31.4

The two approaches measure subtly different quantities:

Orive (1993) computes the inbreeding effective size based on coalescence time — the expected time for two gene copies to share a common ancestor. It captures the effects of both genetic drift and the specific mating structure of the population.
Yonezawa (2000) computes the variance effective size based on the variance in gene frequency change per generation. It focuses on how unequal lifetime reproductive contributions reduce the effective contribution of census individuals to the gene pool.

Both measure genetic drift but from different angles. Agreement between the two is reassuring; divergence indicates that mating structure (Orive) or reproductive variance (Yonezawa) is driving the discrepancy.

7 Summary and conservation implications

# Summary plot: Ne/N by species and population
ggplot(results_display, aes(x = factor(Population), y = NeN,
                     color = Species, group = Species)) +
  geom_line(linewidth = 0.8) +
  geom_point(size = 3) +
  geom_hline(yintercept = 0.10, linetype = "dashed",
             color = "grey40", linewidth = 0.7) +
  annotate("text", x = 0.7, y = 0.105,
           label = "Frankham (1995)\nmedian Ne/N = 0.10",
           hjust = 0, size = 3, color = "grey40") +
  scale_color_manual(
    values = c("#2166ac", "#1b7837", "#d6604d"),
    labels = c(
      expression(italic("L. eltoroensis")),
      expression(italic("L. rubripetala")),
      expression(italic("L. rupestris"))
    )
  ) +
  labs(
    title    = expression(paste(N[e]/N, " across populations of three ",
                                italic("Lepanthes"), " species")),
    subtitle = "Monthly transition matrices | Yonezawa (2000) variance Ne",
    x        = "Population",
    y        = expression(N[e]/N),
    color    = "Species",
    caption  = "Dashed line = empirical median across 102 wildlife species (Frankham 1995)."
  ) +
  theme_classic(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold"),
    legend.text   = element_text(face = "italic"),
    panel.grid.major.y = element_line(color = "grey92", linewidth = 0.4)
  )

# How many populations have Ne < 50 (Wright drift threshold)?
results_display %>%
  mutate(
    Drift_risk = ifelse(Ne_at_census < 50, "HIGH (Ne < 50)", "lower")
  ) %>%
  select(Species, Population, NeN, Ne_at_census, Drift_risk) %>%
  knitr::kable(
    col.names = c("Species", "Pop", "Ne/N", "Estimated Ne", "Drift risk"),
    escape    = FALSE,
    caption   = "Estimated Ne assuming a census size of N = 40, representative
                 of a large Lepanthes population (Tremblay & Ackerman 2001).
                 Ne < 50 indicates populations where genetic drift dominates
                 over natural selection (Wright 1931; Franklin 1980)."
  )

Estimated Ne assuming a census size of N = 40, representative of a large Lepanthes population (Tremblay & Ackerman 2001). Ne < 50 indicates populations where genetic drift dominates over natural selection (Wright 1931; Franklin 1980).
Species	Pop	Ne/N	Estimated Ne	Drift risk
L. rupestris	1	0.572	22.9	HIGH (Ne < 50)
L. rupestris	2	0.180	7.2	HIGH (Ne < 50)
L. rupestris	3	0.159	6.4	HIGH (Ne < 50)
L. rupestris	4	0.581	23.3	HIGH (Ne < 50)
L. rupestris	5	0.722	28.9	HIGH (Ne < 50)
L. rupestris	6	0.504	20.2	HIGH (Ne < 50)
L. rupestris	7	0.159	6.4	HIGH (Ne < 50)
L. rubripetala	1	0.930	37.2	HIGH (Ne < 50)
L. rubripetala	2	0.547	21.9	HIGH (Ne < 50)
L. rubripetala	3	0.792	31.7	HIGH (Ne < 50)
L. rubripetala	4	0.769	30.8	HIGH (Ne < 50)
L. rubripetala	5	0.250	10.0	HIGH (Ne < 50)
L. rubripetala	6	0.636	25.5	HIGH (Ne < 50)
L. eltoroensis	1	1.797	71.9	lower
L. eltoroensis	2	0.450	18.0	HIGH (Ne < 50)
L. eltoroensis	3	1.850	74.0	lower
L. eltoroensis	4	1.609	64.4	lower

7.1 Key conservation messages

All three species have very small effective population sizes. Even with the most optimistic Ne/N estimates, census populations of 50–100 individuals — typical for these species — yield Ne well below 50.
The minimum census size for long-term conservation (\(N_e \geq 5{,}000\)) is far larger than any observed population. This confirms the finding of Tremblay & Ackerman (2001) that these orchid populations face substantial genetic vulnerability.
Ne/N varies substantially among populations and species. L. rupestris shows greater among-population variation than the other two species, likely reflecting habitat heterogeneity along the Luquillo river system.
Ne/N estimates from monthly vs annual matrices may differ. When reporting Ne/N, always state the time step of your matrices. Monthly matrices are preferred for species with fine-scale demographic variation.

8 Brief overview of the other two functions

8.1 Ne_clonal_Y2000

Use this function when all reproduction is clonal (\(d_i = 1\) for all stages). This implements Equations 10–11 of Yonezawa et al. (2000):

\[N_e = \frac{N}{(1 - \bar{u^2}) \cdot L}, \quad N_y = \frac{N}{1 - \bar{u^2}}\]

The clonal model is simpler than the general Equation 6 because sexual reproductive variance does not enter. It is the appropriate model for populations maintained entirely by vegetative reproduction (e.g., many ferns, some grasses, clonal shrubs).

Ne_clonal <- Ne_clonal_Y2000(
  T_mat      = MatU,
  F_vec      = MatF[1, ],
  D          = D,
  show_Ny    = TRUE,       # also compute annual effective size
  population = "fully clonal population"
)

8.2 Ne_mixed_Y2000

Use this function when stages differ in their mix of sexual and clonal reproduction (\(0 \leq d_i \leq 1\)). This implements the full Equation 6 of Yonezawa et al. (2000), including the clonal reproductive variance term \(Avr(A\delta)\).

An important property: under Poisson defaults (\(V_c/\bar{c} = V_k/\bar{k} = 1\), \(a = 0\)), the clonal fraction \(d_i\) has no effect on \(N_e/N\). The clonal fraction only influences \(N_e\) when clonal output is more variable than sexual output (\(V_c/\bar{c} > V_k/\bar{k}\)), which is biologically realistic for many clonal plants.

Ne_mixed <- Ne_mixed_Y2000(
  T_mat      = MatU,
  F_vec      = MatF[1, ],
  D          = D,
  d          = c(0.0, 0.0, 0.5),  # stage 3 is 50% clonal
  Vc_over_c  = c(1, 1, 2),        # stage 3 has super-Poisson clonal variance
  population = "mixed orchid"
)

9 Session information

sessionInfo()
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-apple-darwin20
#> Running under: macOS Monterey 12.7.6
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
#> 
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: America/Puerto_Rico
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] knitr_1.51    purrr_1.2.1   tidyr_1.3.2   dplyr_1.2.0   ggplot2_4.0.2
#> [6] expm_1.0-0    Matrix_1.7-4  NeStage_0.8.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.7.1        cli_3.6.5          rlang_1.1.7        xfun_0.56         
#>  [5] otel_0.2.0         generics_0.1.4     S7_0.2.1           jsonlite_2.0.0    
#>  [9] labeling_0.4.3     glue_1.8.0         htmltools_0.5.9    sass_0.4.10       
#> [13] scales_1.4.0       rmarkdown_2.30     grid_4.5.2         tibble_3.3.1      
#> [17] evaluate_1.0.5     jquerylib_0.1.4    fastmap_1.2.0      yaml_2.3.12       
#> [21] lifecycle_1.0.5    compiler_4.5.2     RColorBrewer_1.1-3 pkgconfig_2.0.3   
#> [25] rstudioapi_0.18.0  farver_2.1.2       lattice_0.22-9     digest_0.6.39     
#> [29] R6_2.6.1           tidyselect_1.2.1   pillar_1.11.1      magrittr_2.0.4    
#> [33] bslib_0.10.0       withr_3.0.2        tools_4.5.2        gtable_0.3.6      
#> [37] cachem_1.1.0

10 References

Caswell H. (2001). Matrix Population Models, 2nd ed. Sinauer Associates.
Frankham R. (1995). Effective population size/adult population size ratios in wildlife: a review. Genetical Research 66: 95–107.
Lande R. (1995). Mutation and conservation. Conservation Biology 9: 782–791.
Luer C.A. (1986). Icones Pleurothallidinarum I. Monograph in Systematic Botany 15.
Orive M.E. (1993). Effective population size in organisms with complex life histories. Theoretical Population Biology 44: 316–340.
Salguero-Gómez R. et al. (2015). The COMPADRE Plant Matrix Database. Journal of Ecology 103: 202–218.
Tremblay R.L. & Ackerman J.D. (2001). Gene flow and effective population size in Lepanthes (Orchidaceae): a case for genetic drift. Biological Journal of the Linnean Society 72: 47–62.
Wright S. (1931). Evolution in Mendelian populations. Genetics 16: 97–159.
Yonezawa K., Ishida T. & Iwata H. (2000). Formulation and estimation of the effective size of stage-structured populations. Evolution 54: 2007–2013.

NeStage Functions: A User Guide

Estimating Effective Population Size in Stage-Structured Populations

Raymond L. Tremblay

2026-03-12