---
title: "Getting Started with twinsvm"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with twinsvm}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6,
  fig.height = 4
)
```

`twinsvm` fits twin support vector machines and provides a standard C-SVC baseline for comparison. Binary fits expect a two-level factor: the first level is treated as class B and the second as class A. Multiclass fits use one-vs-one majority voting, with ties resolved in favor of the earliest factor level.

## Generate data and fit a twin SVM

```{r}
library(twinsvm)

set.seed(1)
dat <- gen_moons(100, noise = 0.12)
fit <- tsvm(dat$x, dat$y, kernel = "rbf", gamma = 2, c1 = 0.1, c2 = 0.1)

# Predict once and reuse the result for the preview and the accuracy.
pred <- predict(fit, dat$x)
head(pred)
mean(pred == dat$y)  # training accuracy
```

## Plot the boundary

```{r}
plot(fit)
```

For a linear twin SVM, the two fitted planes are drawn as dashed lines.

```{r}
linear_fit <- tsvm(dat$x, dat$y, kernel = "linear")
plot(linear_fit)
```
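
The dashed lines matter because a twin SVM classifies a point by whichever fitted plane it lies closer to. The chunk below is a minimal base-R sketch of that nearest-plane rule. The weights and offsets (`w1`, `b1`, `w2`, `b2`) are made-up values for illustration and are not read from `linear_fit`; which plane belongs to class A or class B in a real fit is not assumed here.

```{r}
# Hypothetical plane parameters, chosen by hand for illustration only.
w1 <- c(1, -1); b1 <- 0.5
w2 <- c(1, 1);  b2 <- -1

# Perpendicular distance from each row of x to the plane w'x + b = 0.
plane_dist <- function(x, w, b) abs(drop(x %*% w) + b) / sqrt(sum(w^2))

x_demo <- rbind(c(0, 0.5), c(1, 0))
d1 <- plane_dist(x_demo, w1, b1)
d2 <- plane_dist(x_demo, w2, b2)

# Each point is assigned to the plane it is closest to.
nearest <- ifelse(d1 <= d2, "plane 1", "plane 2")
nearest
```

The first point lies exactly on plane 1 and the second exactly on plane 2, so each is assigned to the plane that passes through it.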

## Cross-validation

```{r}
cv <- cv_tsvm(
  dat$x,
  dat$y,
  c1_grid = c(0.1, 1),
  c2_grid = c(0.1, 1),
  gamma_grid = c(1, 2),
  kernel = "rbf",
  k = 3
)
cv$best_params
plot(cv)
```
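
To get a feel for the cost of a search, it helps to count the fits. The sketch below uses base R only and assumes that the three grids are combined as a full Cartesian product and that rows are split into `k` roughly equal folds at random. Whether `cv_tsvm()` does exactly this internally (for example, whether it stratifies folds by class) is not confirmed here.

```{r}
# Every combination of the three grids used above.
param_grid <- expand.grid(c1 = c(0.1, 1), c2 = c(0.1, 1), gamma = c(1, 2))
nrow(param_grid)

# One possible way to assign 100 rows to 3 folds of near-equal size.
set.seed(1)
fold_id <- sample(rep(seq_len(3), length.out = 100))
table(fold_id)

# Number of model fits if each combination is fitted once per fold.
nrow(param_grid) * 3
```

Under these assumptions, doubling the length of each grid multiplies the number of fits by eight, so small grids are a sensible starting point.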

## Multiclass

```{r}
set.seed(4)
x3 <- rbind(
  matrix(rnorm(30, -2, 0.25), ncol = 2),
  cbind(rnorm(15, 2, 0.25), rnorm(15, -2, 0.25)),
  matrix(rnorm(30, 2, 0.25), ncol = 2)
)
y3 <- factor(rep(c("alpha", "beta", "gamma"), each = 15))

multi <- tsvm(x3, y3, kernel = "linear")
head(predict(multi, x3))
head(predict(multi, x3, type = "votes"))
confusion(multi, x3, y3)
```
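
The one-vs-one rule can be sketched in base R without the package. The vote matrix below is made up for illustration and is not the output of `predict(multi, x3, type = "votes")`, whose exact layout is not assumed here. It shows how three classes lead to three pairwise classifiers and how a tie falls to the first factor level.

```{r}
lev <- c("alpha", "beta", "gamma")

# One binary classifier per unordered pair of classes.
class_pairs <- t(combn(lev, 2))
nrow(class_pairs)

# Hypothetical vote counts: one row per observation, one column per class.
vote_demo <- rbind(c(2, 1, 0), c(1, 1, 1), c(0, 1, 2))
colnames(vote_demo) <- lev

# ties.method = "first" picks the earliest column among tied maxima.
winner <- factor(lev[max.col(vote_demo, ties.method = "first")], levels = lev)
winner
```

The second row is a three-way tie, so it is assigned to `alpha`, the first level.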

## Compare with standard SVM

```{r}
timing <- data.frame(
  n = c(40, 80, 120),
  tsvm_seconds = NA_real_,
  svms_seconds = NA_real_
)

for (i in seq_len(nrow(timing))) {
  set.seed(i)
  d <- gen_moons(timing$n[i], noise = 0.12)
  timing$tsvm_seconds[i] <- system.time(tsvm(d$x, d$y, kernel = "rbf", gamma = 2))[["elapsed"]]
  timing$svms_seconds[i] <- system.time(svms(d$x, d$y, kernel = "rbf", gamma = 2))[["elapsed"]]
}
timing
```

The timings are measured on the machine that builds this vignette, so they will differ between runs. The kernel twin-SVM formulations invert an `(n + 1) x (n + 1)` matrix, where `n` is the number of training rows, so they are meant for small to moderate data.
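
The size of that matrix can be checked with a small base-R sketch. It assumes the RBF kernel has the form `exp(-gamma * ||a - b||^2)` and that each plane is fitted from a kernel block augmented with a column of ones, as in the published kernel twin-SVM formulation. The helper `rbf_kernel()` is written here for illustration and is not a `twinsvm` function.

```{r}
# RBF kernel between the rows of a and the rows of b (assumed parameterization).
rbf_kernel <- function(a, b, gamma) {
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  exp(-gamma * pmax(sq, 0))  # clamp tiny negative values from rounding
}

set.seed(1)
n_demo <- 40
x_demo <- matrix(rnorm(2 * n_demo), ncol = 2)
a_demo <- x_demo[1:20, , drop = FALSE]  # rows standing in for one class

# Kernel of one class against all training rows, plus an intercept column.
h_demo <- cbind(rbf_kernel(a_demo, x_demo, gamma = 2), 1)
dim(h_demo)
dim(crossprod(h_demo))
```

The cross-product has `n_demo + 1` rows and columns regardless of how many rows the class contributes, which is why memory grows quadratically with the training set size.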

## Visualization

```{r}
set.seed(3)  # fix the random draw so the figure is reproducible
circles <- gen_circles(100, noise = 0.04)
lift_plot(circles$x, circles$y, gamma = 1)
```

`compare_methods()` fits the three classifiers to a single data set and shows them side by side in one row.

```{r}
set.seed(2)
small <- gen_moons(60, noise = 0.1)
compare_methods(small$x, small$y, gamma = 1, c1 = 0.2, c2 = 0.2, cost = 1)
```

`morph_boundary()` returns a `gganimate` object. Rendering (for example with `gganimate::animate()`) is left to the user so that package examples stay fast.

```{r}
anim <- morph_boundary(dat$x, dat$y, param = "gamma", range = c(0.5, 2), kernel = "rbf", n = 5)
class(anim)
```

## Validation

The standard SVM baseline is tested against `e1071`, which is backed by LIBSVM. There is no existing R twin-SVM package to match against, so twin-SVM tests validate plane-distance behavior, nonlinear kernel improvement, and agreement between the least-squares and original QP formulations. The algorithms follow Jayadeva, Khemchandani, and Chandra (2007) and Kumar and Gopal (2009).
