NonConvexPenalizedRegression.jl

Regularization Paths for SCAD and MCP Penalized Regression Models

This package is a Julia translation of the R package ncvreg. Only the Gaussian family is implemented.

The algorithm is described in Breheny P and Huang J (2011), "Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection", Annals of Applied Statistics, 5: 232–253.
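For orientation, the two penalties named in the title can be written as follows for θ ≥ 0. This is a summary of the standard definitions used in the cited paper, not text taken from this package; the symbols λ (regularization strength) and γ (concavity parameter) are assumed to correspond to the `λ` and `γ` arguments that appear in the index below.

```latex
% SCAD penalty, defined for \gamma > 2
\[
p^{\mathrm{SCAD}}_{\lambda,\gamma}(\theta) =
\begin{cases}
\lambda\theta, & \theta \le \lambda, \\[4pt]
\dfrac{2\gamma\lambda\theta - \theta^{2} - \lambda^{2}}{2(\gamma - 1)}, & \lambda < \theta \le \gamma\lambda, \\[8pt]
\dfrac{\lambda^{2}(\gamma + 1)}{2}, & \theta > \gamma\lambda.
\end{cases}
\]

% MCP, defined for \gamma > 1
\[
p^{\mathrm{MCP}}_{\lambda,\gamma}(\theta) =
\begin{cases}
\lambda\theta - \dfrac{\theta^{2}}{2\gamma}, & \theta \le \gamma\lambda, \\[8pt]
\dfrac{\gamma\lambda^{2}}{2}, & \theta > \gamma\lambda.
\end{cases}
\]
```

Both penalties start out like the lasso (slope λ at θ = 0) and flatten to a constant once θ exceeds γλ, which is why large coefficients are left essentially unshrunk.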

using LinearAlgebra
using Random
using NonConvexPenalizedRegression
using RCall
rng = MersenneTwister(1234);
n, p = 50, 5

X = randn(rng, n, p)              # feature matrix
50×5 Matrix{Float64}:
  0.867347  -1.22672     0.183976  -1.87215    -0.205782
 -0.901744  -0.541716   -1.27635   -0.0801396  -1.22338
 -0.494479  -0.686494    1.03132    0.259716    0.980733
 -0.902914  -0.712932   -0.910805  -0.744522    0.427383
  0.864401  -0.327059    0.754603  -0.191176   -0.492253
  2.21188    0.514836   -1.29475    0.0377058   1.48494
  0.532813   2.41747    -0.308944  -0.490009    1.23969
 -0.271735  -0.307974    1.30668    1.21081     1.06159
  0.502334   1.2453      1.44886   -0.457872    0.176156
 -0.516984  -0.0499502   0.778151  -0.170919    0.714251
  ⋮                                            
 -1.09122   -0.316387   -0.712114   1.33251    -0.743225
 -0.580517   0.265743   -0.971334  -0.717513    0.801171
 -0.315437   1.06561    -0.639229  -1.71903    -1.39614
 -1.36145    1.38501    -0.409189  -0.345262   -1.40416
 -0.114457   0.0799514  -2.10832   -0.934029    0.0949513
  0.165837  -0.833369    0.432183   0.823597   -0.596271
 -0.408438  -0.443247   -0.256065  -0.177997    0.861939
 -1.00978   -1.66323     0.797165  -0.644069    0.127747
 -0.543805  -0.521229    0.103145  -1.37931     0.143105
a0 = collect(1:p)                # ground-truth coefficients
5-element Vector{Int64}:
 1
 2
 3
 4
 5
y = X * a0 + 0.1 * randn(n) # generate response
50-element Vector{Float64}:
  -9.760043646788375
 -12.379064097683992
   7.221044531518727
  -5.783941205861321
  -0.680280004611422
   6.926988976480227
   8.731610922884734
  13.108472501530311
   6.393767824706543
   4.598472343084765
   ⋮
  -2.2230906012288636
  -1.8239672230295005
 -14.099698754038341
  -8.149291537680556
  -9.739243608552623
   0.3065736307913104
   1.5836429362833409
  -4.000441249035337
  -6.269546060599293
XX = hcat(X, randn(rng, n, p))    # append p pure-noise features

@rput XX
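@rput y

R"library(ncvreg)"

# Reference fit in R; stored under a separate name so the Julia fit below does not overwrite it
R"scad_r <- coef(ncvreg(XX, y, lambda=0.2, penalty='SCAD', eps=.0001))"

@rget scad_r
λ = [0.2]

# Same model fitted in Julia
scad = NonConvexPenalizedRegression.coef(SCAD(XX, y, λ))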
11×1 Matrix{Float64}:
 0.009104584535614801
 1.0270973222288753
 1.9888348222298677
 3.0148607295694925
 4.028890696716442
 5.012048910583435
 0.0
 0.0
 0.0
 0.0
 0.0

Index

NonConvexPenalizedRegression.cdfit_gaussian (Method)
cdfit_gaussian(X, y, penalty, λ, eps, max_iter, γ, multiplier, α, dfmax)

Coordinate descent for Gaussian models.

Reference: https://github.com/pbreheny/ncvreg/blob/master/src/gaussian.c
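To make the idea of coordinate descent for these penalties concrete, here is a minimal sketch for a single value of λ. It is not the package's implementation and does not reproduce gaussian.c: the names `soft_threshold`, `mcp_update`, `scad_update`, and `cd_sketch` are hypothetical, the closed-form updates are the univariate solutions for standardized covariates as given in the Breheny and Huang paper, and the sketch ignores the `multiplier`, `α`, and `dfmax` arguments listed in the signature above. The default `γ = 3.7` is a choice made for this example.

```julia
using LinearAlgebra, Statistics

# Soft-thresholding operator: S(z, λ) = sign(z) * max(|z| - λ, 0)
soft_threshold(z, λ) = sign(z) * max(abs(z) - λ, zero(z))

# Univariate MCP solution (requires γ > 1, standardized covariate)
mcp_update(z, λ, γ) = abs(z) <= γ * λ ? soft_threshold(z, λ) / (1 - 1 / γ) : z

# Univariate SCAD solution (requires γ > 2, standardized covariate)
function scad_update(z, λ, γ)
    if abs(z) <= 2λ
        return soft_threshold(z, λ)
    elseif abs(z) <= γ * λ
        return soft_threshold(z, γ * λ / (γ - 1)) / (1 - 1 / (γ - 1))
    else
        return z
    end
end

# Hypothetical helper: cyclic coordinate descent for one λ.
# Returns the intercept followed by the coefficients on the original scale.
function cd_sketch(X, y, λ; update = scad_update, γ = 3.7, max_iter = 1000, tol = 1e-4)
    n, p = size(X)
    μ = mean(X, dims = 1)                         # column means
    Xc = X .- μ
    s = sqrt.(sum(abs2, Xc, dims = 1) ./ n)       # column scales, so that x'x / n == 1
    Xs = Xc ./ s
    ȳ = mean(y)
    β = zeros(p)                                  # coefficients on the standardized scale
    r = y .- ȳ                                    # residuals for β = 0
    for _ in 1:max_iter
        maxchange = 0.0
        for j in 1:p
            xj = view(Xs, :, j)
            z = dot(xj, r) / n + β[j]             # unpenalized univariate solution
            δ = update(z, λ, γ) - β[j]
            if δ != 0
                r .-= δ .* xj                     # keep residuals in sync with β
                β[j] += δ
                maxchange = max(maxchange, abs(δ))
            end
        end
        maxchange < tol && break                  # stop when no coefficient moved much
    end
    b = β ./ vec(s)                               # back to the original scale
    return vcat(ȳ - dot(vec(μ), b), b)
end
```

With the data from the example above, `cd_sketch(XX, y, 0.2)` would return a vector of length 11 with the intercept first; passing `update = mcp_update, γ = 3.0` switches to MCP. Because the convergence test and iteration order are simplified, the result need not agree with the package or with ncvreg to every digit.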
