Optimization methods

Examples

Different numerical optimization options will be explained using dataGCE (see “Generalized Maximum Entropy framework”).

library(GCEstim)

Assume that \(z_k^{upper}=50\), \(k\in\left\lbrace 0,\dots,6\right\rbrace\) is a “wide upper bound” for the signal support space and set \(M=5\) and \(J=3\).
The primal version of the optimization problem previously defined can be solved by the optimization methods:

Rsolnp package [3]
- primal.solnp, using Sequential Quadratic Programming (SQP);
NlcOptim package [4]
- primal.solnl, using the augmented Lagrange multiplier method with an SQP interior algorithm.

res.lmgce.50.primal.solnp <-
  GCEstim::lmgce(
    y ~ .,
    data = dataGCE,
    support.signal = c(-50, 50),
    twosteps.n = 0,
    method = "primal.solnp"
  )

res.lmgce.50.primal.solnl <-
  GCEstim::lmgce(
    y ~ .,
    data = dataGCE,
    support.signal = c(-50, 50),
    twosteps.n = 0,
    method = "primal.solnl"
  )

Both methods converged which can be confirmed by the \(0\) returned by

res.lmgce.50.primal.solnp$convergence
#> [1] 0
res.lmgce.50.primal.solnl$convergence
#> [1] 0

The dual version of the optimization problem can be solved by the optimization methods:

optim package
- dual.BFGS, default, using Broyden-Fletcher-Goldfarb-Shanno quasi-Newton method [5–8];
- dual.CG, a conjugate gradients method [9];
- dual.L-BFGS-B, using a box-constrained optimization with limited-memory modification of the BFGS quasi-Newton method [10];
optimx package [11]
- dual.Rcgmin, using conjugate gradient minimization of nonlinear functions with box constraints using Dai/Yuan update [12] (see Rcgmin);
- dual.bobyqa, using a derivative-free optimization by quadratic approximation [13] (see minqa [14]);
- dual.newuoa, using a derivative-free optimization by quadratic approximation [13] (see minqa [14]);
- dual.nlminb, unconstrained and box-constrained optimization using PORT routines [15] (see nlminb);
- dual.nlm, using a Newton-type algorithm [16,17] (see nlm);
lbfgs package [18]
- dual.lbfgs, using the Limited-memory Broyden-Fletcher-Goldfarb-Shanno;
lbfgsb3c package [19]
- dual.lbfgsb3c, using L-BFSC-B implemented in Fortran code and with an Rcpp interface;

res.lmgce.50.dual.BFGS <-
  GCEstim::lmgce(
    y ~ .,
    data = dataGCE,
    support.signal = c(-50, 50),
    twosteps.n = 0,
    method = "dual.BFGS"
  )

res.lmgce.50.dual.CG <-
  GCEstim::lmgce(
    y ~ .,
    data = dataGCE,
    support.signal = c(-50, 50),
    twosteps.n = 0,
    method = "dual.CG"
  )

Any error or warning produced by the optimization procedure results in outputting \(1\). Note that dual.BFGS converged while dual.CG did not. Conjugate gradient methods will generally be more fragile but may be successful in much larger optimization problems.

res.lmgce.50.dual.BFGS$convergence
#> [1] 0
res.lmgce.50.dual.CG$convergence
#> [1] 1

The values of \(\boldsymbol{\hat\lambda}\), \(\hat {\mathbf{p}}\) and \(\hat {\mathbf{w}}\) for dual.BFGS are, respectively,

res.lmgce.50.dual.BFGS$lambda
#>   [1] -0.0054315581 -0.0362851075 -0.0385981482  0.0566161704  0.0138698331
#>   [6] -0.0024651553 -0.0227793090 -0.0023567234 -0.0187668165  0.0205529920
#>  [11]  0.0189754236  0.0350418904  0.0337957319 -0.0092406559 -0.0217101748
#>  [16] -0.0380872670  0.0233102524  0.0106105899 -0.0450483760 -0.0023395159
#>  [21]  0.0198291478  0.0275226658 -0.0629130452 -0.0001487608  0.0125237687
#>  [26] -0.0328692845 -0.0732469503 -0.0191117768  0.0336050468  0.0562760350
#>  [31]  0.0218128341  0.0231869574  0.0753808406  0.0281004090  0.0402667764
#>  [36] -0.0231789955  0.0462642119  0.0148705520 -0.0002027890 -0.0038059321
#>  [41] -0.0544474657  0.0190416427 -0.0256211058  0.0044666854  0.0068425455
#>  [46] -0.0116029419  0.0503567564  0.0075374883 -0.0374695950 -0.0167435100
#>  [51]  0.0026509575  0.0025613560 -0.0260627773  0.0593285252  0.0336042544
#>  [56]  0.0366290162 -0.0255469000  0.0502924605 -0.0155720536 -0.0143055455
#>  [61]  0.0350497334  0.0061215759 -0.0734237406 -0.0677698468 -0.0422326454
#>  [66] -0.0352340691 -0.0018549268 -0.0410253228 -0.0167136632  0.0272622276
#>  [71]  0.0373657555  0.0137718621 -0.0601873982 -0.1033502240  0.0500338667
#>  [76]  0.0342158472 -0.0184214830  0.0803730211 -0.0274860122 -0.0016426092
#>  [81] -0.0037500133  0.0228453331 -0.0113385292 -0.0249442628 -0.0122691137
#>  [86]  0.0286318993 -0.0391985721  0.0292200582 -0.0147311748 -0.0174859191
#>  [91]  0.0601325584  0.0338395555 -0.0208172623 -0.0356295971  0.0043824510
#>  [96] -0.0199980639  0.0574606377  0.0030525401 -0.0212540400 -0.0123546793

res.lmgce.50.dual.BFGS$p
#>                   p_1       p_2       p_3       p_4       p_5
#> (Intercept) 0.1918591 0.1958458 0.1999154 0.2040696 0.2083101
#> X001        0.1986565 0.1993260 0.1999977 0.2006718 0.2013481
#> X002        0.2150038 0.2072256 0.1997287 0.1925031 0.1855389
#> X003        0.1897860 0.1947609 0.1998662 0.2051053 0.2104817
#> X004        0.1706791 0.1842217 0.1988389 0.2146158 0.2316445
#> X005        0.1473696 0.1699513 0.1959933 0.2260257 0.2606601

res.lmgce.50.dual.BFGS$w
#>           p_1       p_2       p_3
#> 1   0.3442535 0.3332154 0.3225311
#> 2   0.4079385 0.3281286 0.2639329
#> 3   0.4127877 0.3274530 0.2597593
#> 4   0.2284563 0.3208720 0.4506717
#> 5   0.3060097 0.3325652 0.3614251
#> 6   0.3382756 0.3333090 0.3284154
#> 7   0.3797836 0.3312674 0.2889489
#> 8   0.3380577 0.3333111 0.3286311
#> 9   0.3714900 0.3319290 0.2965810
#> 10  0.2931728 0.3316500 0.3751771
#> 11  0.2961820 0.3318977 0.3719202
#> 12  0.2661900 0.3284753 0.4053347
#> 13  0.2684620 0.3288112 0.4027267
#> 14  0.3519758 0.3329920 0.3150321
#> 15  0.3775700 0.3314560 0.2909740
#> 16  0.4117161 0.3276056 0.2606783
#> 17  0.2879456 0.3311704 0.3808839
#> 18  0.3123514 0.3328835 0.3547651
#> 19  0.4263360 0.3253617 0.2483024
#> 20  0.3380232 0.3333114 0.3286654
#> 21  0.2945519 0.3317661 0.3736820
#> 22  0.2800414 0.3303239 0.3896347
#> 23  0.4639038 0.3180469 0.2180492
#> 24  0.3336309 0.3333332 0.3330359
#> 25  0.3086226 0.3327068 0.3586706
#> 26  0.4007893 0.3290534 0.2701573
#> 27  0.4855351 0.3128644 0.2016005
#> 28  0.3722015 0.3318771 0.2959214
#> 29  0.2688105 0.3288616 0.4023279
#> 30  0.2290266 0.3210171 0.4499563
#> 31  0.2907793 0.3314382 0.3777825
#> 32  0.2881785 0.3311931 0.3806284
#> 33  0.1983037 0.3117136 0.4899826
#> 34  0.2789653 0.3301971 0.3908376
#> 35  0.2567696 0.3269411 0.4162893
#> 36  0.3806119 0.3311946 0.2881935
#> 37  0.2461734 0.3249338 0.4288927
#> 38  0.3040730 0.3324506 0.3634764
#> 39  0.3337390 0.3333332 0.3329278
#> 40  0.3409735 0.3332754 0.3257511
#> 41  0.4461098 0.3217839 0.2321063
#> 42  0.2960555 0.3318877 0.3720568
#> 43  0.3856799 0.3307230 0.2835971
#> 44  0.3244409 0.3332535 0.3423055
#> 45  0.3197457 0.3331461 0.3471082
#> 46  0.3567894 0.3327955 0.3104151
#> 47  0.2390807 0.3234168 0.4375025
#> 48  0.3183771 0.3331062 0.3485167
#> 49  0.4104210 0.3277876 0.2617915
#> 50  0.3673234 0.3322148 0.3004618
#> 51  0.3280457 0.3333052 0.3386491
#> 52  0.3282239 0.3333071 0.3384690
#> 53  0.3865978 0.3306328 0.2827694
#> 54  0.2239385 0.3196870 0.4563745
#> 55  0.2688120 0.3288618 0.4023262
#> 56  0.2633103 0.3280307 0.4086591
#> 57  0.3855257 0.3307380 0.2837363
#> 58  0.2391913 0.3234415 0.4373672
#> 59  0.3649161 0.3323655 0.3027184
#> 60  0.3623179 0.3325162 0.3051659
#> 61  0.2661757 0.3284732 0.4053511
#> 62  0.3211679 0.3331835 0.3456487
#> 63  0.4859039 0.3127701 0.2013261
#> 64  0.4740870 0.3156935 0.2102196
#> 65  0.4204179 0.3263118 0.2532703
#> 66  0.4057370 0.3284225 0.2658405
#> 67  0.3370500 0.3333196 0.3296304
#> 68  0.4178820 0.3267016 0.2554164
#> 69  0.3672621 0.3322188 0.3005192
#> 70  0.2805272 0.3303802 0.3890926
#> 71  0.2619788 0.3278179 0.4102033
#> 72  0.3061996 0.3325760 0.3612244
#> 73  0.4581794 0.3193017 0.2225189
#> 74  0.5472808 0.2943768 0.1583423
#> 75  0.2396362 0.3235407 0.4368231
#> 76  0.2676950 0.3286993 0.4036057
#> 77  0.3707781 0.3319801 0.2972418
#> 78  0.1907265 0.3089189 0.5003545
#> 79  0.3895584 0.3303318 0.2801098
#> 80  0.3366239 0.3333225 0.3300536
#> 81  0.3408608 0.3332771 0.3258621
#> 82  0.2888241 0.3312555 0.3799204
#> 83  0.3562497 0.3328197 0.3109306
#> 84  0.3842740 0.3308583 0.2848677
#> 85  0.3581501 0.3327320 0.3091179
#> 86  0.2779770 0.3300782 0.3919448
#> 87  0.4140474 0.3272711 0.2586815
#> 88  0.2768852 0.3299441 0.3931706
#> 89  0.3631906 0.3324670 0.3043424
#> 90  0.3688510 0.3321137 0.2990353
#> 91  0.2226094 0.3193264 0.4580642
#> 92  0.2683820 0.3287996 0.4028184
#> 93  0.3757233 0.3316066 0.2926701
#> 94  0.4065653 0.3283129 0.2651218
#> 95  0.3246078 0.3332565 0.3421356
#> 96  0.3740308 0.3317394 0.2942299
#> 97  0.2270441 0.3205083 0.4524476
#> 98  0.3272472 0.3332961 0.3394567
#> 99  0.3766264 0.3315337 0.2918399
#> 100 0.3583249 0.3327236 0.3089514

The following table shows a comparison of efficiency between all the optimization methods available. Note that in the current version, the default optimization parameters of each dependency package are used.

method.opt <-
  c(
    "primal.solnl",
    "primal.solnp",
    "dual.BFGS",
    "dual.CG",
    "dual.L-BFGS-B",
    "dual.Rcgmin",
    "dual.bobyqa",
    "dual.newuoa",
    "dual.nlminb",
    "dual.nlm",
    "dual.lbfgs",
    "dual.lbfgsb3c"
  )

compare_methods <- data.frame(
  method = method.opt,
  time = NA,
  r.squared = NA,
  error.measure = NA,
  error.measure.cv.mean = NA,
  beta.error = NA,
  nep = NA,
  nep.cv.mean = NA,
  convergence = NA)


for (i in 1:length(method.opt)) {
start.time <- Sys.time()
res.method <- 
  lmgce(
    y ~ .,
    data = dataGCE,
    support.signal = c(-50,50),
    twosteps.n = 0,
    method = method.opt[i]
    )

compare_methods$time[i] <- difftime(Sys.time(),
                                    start.time,
                                    units = "secs")
compare_methods$r.squared[i] <- summary(res.method)$r.squared
compare_methods$error.measure[i] <- res.method$error.measure
compare_methods$error.measure.cv.mean[i] <- res.method$error.measure.cv.mean
compare_methods$beta.error[i] <- accmeasure(coef(res.method), coef.dataGCE)
compare_methods$nep[i] <- res.method$nep
compare_methods$nep.cv.mean [i] <- res.method$nep.cv.mean
compare_methods$convergence[i] <- res.method$convergence
}

compare_methods_ordered <- compare_methods[order(compare_methods$time),]

compare_methods_ordered$convergence <- factor(compare_methods_ordered$convergence,
                                              levels = c(0,1),
                                              labels = c(TRUE, FALSE))

optimization method	time (s)	\(RMSE_{\widehat{y}}\)	\(CV \text{-} RMSE_{\widehat{y}}\)	\(RMSE_{\widehat{\beta}}\)	Convergence
dual.lbfgsb3c	0.372	0.411	0.441	1.578	TRUE
dual.BFGS	0.385	0.411	0.441	1.575	TRUE
dual.L-BFGS-B	0.539	0.411	0.440	1.586	FALSE
dual.CG	0.603	0.436	0.697	2.948	FALSE
dual.lbfgs	0.627	0.411	0.441	1.575	TRUE
dual.nlminb	2.263	0.411	0.441	1.575	TRUE
dual.nlm	2.562	0.411	0.441	1.575	TRUE
dual.Rcgmin	2.816	0.411	0.441	1.575	TRUE
primal.solnp	4.747	0.411	0.441	1.575	TRUE
primal.solnl	5.503	0.411	0.441	1.575	TRUE
dual.newuoa	56.082	0.411	0.441	1.579	TRUE
dual.bobyqa	67.649	0.638	0.448	2.915	TRUE

References

Golan A, Judge GG, Miller D. Maximum Entropy Econometrics : Robust Estimation with Limited Data. Wiley; 1996.

Ye Y. Interior Algorithms for Linear, Quadratic, and Linearly Constrained Non-Linear Programming. PhD thesis. Department of ESS, Stanford University; 1987.

Ghalanos A, Theussl S. Rsolnp: General Non-Linear Optimization Using Augmented Lagrange Multiplier Method.; 2015.

Chen X, Yin X. NlcOptim: Solve Nonlinear Optimization with Nonlinear Constraints.; 2019. https://CRAN.R-project.org/package=NlcOptim

Broyden CG. The convergence of a class of double-rank minimization algorithms 1. General considerations. IMA Journal of Applied Mathematics. 1970;6:76-90. doi:10.1093/imamat/6.1.76

Fletcher R. A new approach to variable metric algorithms. The Computer Journal. 1970;13:317-322. doi:10.1093/comjnl/13.3.317

Goldfarb D. A family of variable-metric methods derived by variational means. Mathematics of computation. 1970;24:23-26. doi:10.1090/s0025-5718-1970-0258249-6

Shanno DF. Conditioning of quasi-newton methods for function minimization. Mathematics of computation. 1970;24:647-656. doi:10.1090/s0025-5718-1970-0274029-x

Fletcher R, Reeves CM. Function minimization by conjugate gradients. The Computer Journal. 1964;7:149-154. doi:10.1093/comjnl/7.2.149

10.

Byrd RH, Lu P, Nocedal J, Zhu C. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing. 1995;16:1190-1208. doi:10.1137/0916069

11.

Nash JC, Varadhan R. Unifying optimization algorithms to aid software system users: Optimx for r. Journal of Statistical Software. 2011;43:1-14. doi:10.18637/jss.v043.i09

12.

Dai YH, Yuan Y. An efficient hybrid conjugate gradient method for unconstrained optimization. Annals of Operations Research. 2001;103:33-47. doi:10.1023/A:1012930416777

13.

Powell M. The BOBYQA algorithm for bound constrained optimization without derivatives. Technical Report, Department of Applied Mathematics and Theoretical Physics. Published online January 2009.

14.

Bates D, Mullen KM, Nash JC, Varadhan R. Minqa: Derivative-Free Optimization Algorithms by Quadratic Approximation.; 2024. https://CRAN.R-project.org/package=minqa

15.

Gay DM. Usage summary for selected optimization routines. Computing Science Technical Report. 1990;153.

16.

Dennis JE, Schnabel RB. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall; 1983.

17.

Schnabel RB, Koonatz JE, Weiss BE. A modular system of algorithms for unconstrained minimization. ACM transactions on mathematical software Association for Computing Machinery. 1985;11:419-440. doi:10.1145/6187.6192

18.

Coppola A, Stewart B, Okazaki N. Lbfgs: Limited-Memory BFGS Optimization.; 2022. https://CRAN.R-project.org/package=lbfgs

19.

Fidler ML, Nash JC, Zhu C, Byrd R, Nocedal J, Morales JL. Lbfgsb3c: Limited Memory BFGS Minimizer with Bounds on Parameters with Optim() ’c’ Interface.; 2024. https://CRAN.R-project.org/package=lbfgsb3c

Optimization methods

Jorge Cabral

Optimization

Examples

References

Acknowledgements