Fit the leaky hockey stick classifier (LHSC) in input space and in a reproducing kernel Hilbert space. The solution path is computed along a grid of values of the tuning parameter lambda.

Usage

lhsc(x, y, kern, lambda, eps=1e-05, maxit=1e+05)

Arguments

x

A numerical matrix with \(N\) rows and \(p\) columns for predictors.

y

A vector of length \(N\) of binary responses. Each element of y must be either -1 or 1.

kern

A kernel function object, such as those produced by the dots functions in the kernlab package (e.g., vanilladot(), rbfdot()); see dots.

lambda

A user-supplied lambda sequence.

eps

The algorithm stops when \(| \beta^{old} - \beta^{new} |\) is less than eps. Default value is 1e-5.

maxit

The maximum number of iterations allowed. Default is 1e5.

Details

The leaky hockey stick loss is \(V(u)=1-u\) if \(u \le 1\) and \(-\log u\) if \(u > 1\). The value of \(\lambda\), i.e., lambda, is user-specified.
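As a quick illustration, the loss above can be written directly in R (a minimal sketch; lhs_loss is a name chosen here for illustration, not part of the package API):

```r
# Leaky hockey stick loss: V(u) = 1 - u if u <= 1, and -log(u) if u > 1.
# The two branches agree at u = 1 (both give 0), so V is continuous,
# and V decreases to -Inf as u grows, hence "negatively divergent".
lhs_loss <- function(u) {
  ifelse(u <= 1, 1 - u, -log(u))
}

lhs_loss(c(0, 1, exp(1)))  # 1, 0, -1
```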

In the linear case (kern is the inner product and \(N > p\)), lhsc fits a linear LHSC by minimizing the L2-penalized leaky hockey stick loss function, $$L(\beta_0,\beta) := \frac{1}{N}\sum_{i=1}^N V(y_i(\beta_0 + X_i'\beta)) + \lambda \beta' \beta.$$
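The objective above is straightforward to evaluate for a candidate \((\beta_0, \beta)\); a sketch (function and variable names here are illustrative, not part of the package):

```r
# V(u): the leaky hockey stick loss from the Details section
lhs_loss <- function(u) ifelse(u <= 1, 1 - u, -log(u))

# L(beta0, beta): mean loss over the margins y_i (beta0 + X_i' beta)
# plus the L2 penalty lambda * beta' beta
linear_objective <- function(beta0, beta, x, y, lambda) {
  margins <- y * (beta0 + drop(x %*% beta))
  mean(lhs_loss(margins)) + lambda * sum(beta^2)
}
```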

If a linear LHSC is requested when \(N < p\), a kernel LHSC with the linear kernel is solved instead. In this case, the linear coefficients \(\beta\) can be recovered from the kernel coefficients via \(\beta = X'\alpha.\)
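For example, with the linear kernel the fitted kernel coefficients can be mapped back to primal coefficients (a sketch with toy data; all names are illustrative):

```r
# With the linear kernel K = X X', the primal coefficients are beta = X' alpha.
set.seed(1)
x <- matrix(rnorm(12), nrow = 3)   # N = 3 observations, p = 4 predictors
alpha <- c(0.5, -0.2, 0.1)         # kernel coefficients, one per observation
beta <- drop(crossprod(x, alpha))  # t(x) %*% alpha, a vector of length p
```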

In the kernel case, lhsc fits a kernel LHSC by minimizing $$L(\alpha_0,\alpha) := \frac{1}{N}\sum_{i=1}^N V(y_i(\alpha_0 + K_i' \alpha)) + \lambda \alpha' K \alpha,$$ where \(K\) is the kernel matrix and \(K_i\) is its ith row.
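The kernel objective can be evaluated the same way, with the kernel matrix \(K\) replacing the design matrix (a sketch; names are illustrative, not the package's internals):

```r
# V(u): the leaky hockey stick loss from the Details section
lhs_loss <- function(u) ifelse(u <= 1, 1 - u, -log(u))

# L(alpha0, alpha): mean loss over the kernel margins y_i (alpha0 + K_i' alpha)
# plus the RKHS penalty lambda * alpha' K alpha
kernel_objective <- function(alpha0, alpha, K, y, lambda) {
  margins <- y * (alpha0 + drop(K %*% alpha))
  mean(lhs_loss(margins)) + lambda * drop(crossprod(alpha, K %*% alpha))
}
```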

Value

An object with S3 class lhsc.

alpha

A matrix of LHSC coefficients at each lambda value. The dimension is (p+1) x length(lambda) in the linear case and (N+1) x length(lambda) in the kernel case.

lambda

The lambda sequence.

npass

The total number of FISTA iterations for all lambda values.

jerr

Warnings and errors; 0 if none.

info

A list including parameters of the loss function, eps, maxit, kern, and wt if a weight vector was used.

call

The call that produced this object.

Author

Oh-ran Kwon and Hui Zou
Maintainer: Oh-ran Kwon kwon0085@umn.edu

References

Kwon, O. and Zou, H. (2023+) “Leaky Hockey Stick Loss: The First Negatively Divergent Margin-based Loss Function for Classification”

Examples

data(BUPA)
# standardize the predictors
BUPA$X <- scale(BUPA$X, center=TRUE, scale=TRUE)

# a grid of tuning parameters
lambda <- 10^(seq(3, -3, length.out=10))

# fit a linear LHSC
kern <- vanilladot()
LHSC_linear <- lhsc(BUPA$X, BUPA$y, kern,
  lambda=lambda, eps=1e-5, maxit=1e5)

# fit a kernel LHSC using the Gaussian kernel
kern <- rbfdot(sigma=1)
LHSC_Gaussian <- lhsc(BUPA$X, BUPA$y, kern,
  lambda=lambda, eps=1e-5, maxit=1e5)