Title: | Efficient Tuning-Free Conformal Prediction |
---|---|
Description: | An implementation of efficiency first conformal prediction (EFCP) and validity first conformal prediction (VFCP), which provide both validity (a coverage guarantee) and efficiency (a width guarantee). To learn how to use the package, see the vignettes for a quick tutorial. The package is based on the work by Yang, Y. and Kuchibhotla, A. K. (2021) <arXiv:2104.13871>. |
Authors: | Yachong Yang [aut, cre] |
Maintainer: | Yachong Yang <[email protected]> |
License: | GPL (>=3) |
Version: | 1.0 |
Built: | 2024-11-01 11:28:40 UTC |
Source: | https://github.com/elsa-yang98/conformalsmallest |
Blog data
blog
A dataset of dimension 280+1.
blogData_train.csv
Concrete data
concrete
A dataset of dimension 8+1
concrete.csv
Conditional width and coverage for CQR, internal function used inside conf_CQR_conditional
conf_CQR(X1, Y1, X2, Y2, beta, mtry, ntree, alpha = 0.1)
X1 | training matrix to fit the quantile regression forest |
Y1 | training vector |
X2 | training matrix to compute the conformal scores |
Y2 | training vector to compute the conformal scores |
beta | nominal quantile level |
mtry | random forest parameter |
ntree | random forest parameter |
alpha | miscoverage level |
a function for computing conditional width and coverage
Conditional width and coverage for CQR
conf_CQR_conditional(x, y, beta, mtry, ntree, alpha = 0.1)
x | an N*d training matrix |
y | an N*1 training vector |
beta | nominal quantile level |
mtry | random forest parameter |
ntree | random forest parameter |
alpha | miscoverage level |
a function for computing conditional width and coverage
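A minimal sketch of calling conf_CQR_conditional on simulated data; the data-generating setup and the beta/mtry/ntree values are illustrative assumptions, not recommendations:
library(ConformalSmallest)
set.seed(1)
n <- 200
d <- 3
x <- matrix(runif(n * d), n, d)    # N*d training matrix
y <- x[, 1] + rnorm(n)             # N*1 training vector
out <- conf_CQR_conditional(x, y, beta = 0.05, mtry = 2, ntree = 100, alpha = 0.1)
str(out)    # inspect the returned conditional width/coverage object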
Preliminary function for CQR
conf_CQR_prelim(X1, Y1, X2, Y2, beta_grid, mtry, ntree, alpha = 0.1)
X1 | an n1*d matrix for training |
Y1 | an n1*1 vector for training |
X2 | an n2*d matrix for calibration |
Y2 | an n2*1 vector for calibration |
beta_grid | a grid of beta values |
mtry | mtry parameter in random forest |
ntree | number of trees parameter in random forest |
alpha | miscoverage level |
the smallest width and its corresponding beta
EFCP and VFCP for CQR, CQR-m, CQR-r
conf_CQR_reg(x, y, split, beta_grid, mtry_grid, ntree_grid, method = "efficient", alpha = 0.1)
x | an N*d training matrix |
y | an N*1 training vector |
split | a vector of length 1 for EFCP, length 2 for VFCP |
beta_grid | a grid of beta values |
mtry_grid | a grid of mtry values |
ntree_grid | a grid of ntree values |
method | "efficient" for EFCP; "valid" for VFCP |
alpha | miscoverage level |
the selected CQR method
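A minimal sketch of EFCP selection over CQR, CQR-m and CQR-r with conf_CQR_reg; the split fraction and the grids below are illustrative assumptions:
library(ConformalSmallest)
set.seed(1)
n <- 300
d <- 3
x <- matrix(runif(n * d), n, d)
y <- x[, 1] + rnorm(n)
out <- conf_CQR_reg(x, y,
                    split = 0.5,                 # length-1 split for EFCP (assumed to be a fraction)
                    beta_grid = c(0.02, 0.05),   # grid of beta values
                    mtry_grid = c(1, 3),
                    ntree_grid = c(100, 200),
                    method = "efficient",
                    alpha = 0.1)
out    # the selected CQR method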
Conditional width and coverage for EFCP and VFCP selecting among CQR, CQR-m, CQR-r
conf_CQR_reg_conditional(x, y, split, beta_grid, mtry_grid, ntree_grid, method = "efficient", alpha = 0.1)
x | an N*d training matrix |
y | an N*1 training vector |
split | a vector of length 1 for EFCP, length 2 for VFCP |
beta_grid | a grid of beta values |
mtry_grid | a grid of mtry values |
ntree_grid | a grid of ntree values |
method | "efficient" for EFCP; "valid" for VFCP |
alpha | miscoverage level |
the selected CQR method
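A minimal sketch of the VFCP variant of the same call, passing a length-2 split as documented above; the split values are illustrative assumptions:
library(ConformalSmallest)
set.seed(1)
n <- 300
d <- 3
x <- matrix(runif(n * d), n, d)
y <- x[, 1] + rnorm(n)
out <- conf_CQR_reg_conditional(x, y,
                                split = c(0.4, 0.3),         # length-2 split for VFCP (assumed fractions)
                                beta_grid = c(0.02, 0.05),
                                mtry_grid = c(1, 3),
                                ntree_grid = c(100, 200),
                                method = "valid",
                                alpha = 0.1)
str(out)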
Cross-validation conformal prediction for ridge regression
cv.fun(X, Y, X0, lambda = seq(0, 100, length = 100), nfolds = 10, alpha = 0.1)
X | an N*d training matrix |
Y | an N*1 training vector |
X0 | an N0*d matrix of test points |
lambda | a sequence of penalty parameters for ridge regression |
nfolds | number of folds |
alpha | miscoverage level |
upper and lower prediction intervals for X0
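A minimal sketch of cross-validated conformal ridge prediction with cv.fun; the simulated data are illustrative, and the returned interval object is inspected with str() rather than assuming its component names:
library(ConformalSmallest)
set.seed(1)
n <- 50
d <- 5
n0 <- 10
X <- matrix(rnorm(n * d), n, d)       # N*d training matrix
Y <- X %*% rep(1, d) + rnorm(n)       # N*1 training vector
X0 <- matrix(rnorm(n0 * d), n0, d)    # N0*d matrix of test points
out <- cv.fun(X, Y, X0, lambda = seq(0, 100, length = 100), nfolds = 5, alpha = 0.1)
str(out)    # upper and lower prediction intervals for X0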
Efficiency first conformal prediction for Conformal Quantile Regression
efcp_cqr(x, y, split, beta_grid, params_grid, alpha = 0.1)
x | an N*d training matrix |
y | an N*1 training vector |
split | a number between 0 and 1 |
beta_grid | a grid of beta values |
params_grid | a grid of mtry and ntree values |
alpha | miscoverage level |
average prediction width and a function for computing coverage at test points
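A minimal sketch of efcp_cqr; the layout assumed for params_grid (an expand.grid data frame of mtry and ntree combinations) and the split fraction are assumptions:
library(ConformalSmallest)
set.seed(1)
n <- 300
d <- 3
x <- matrix(runif(n * d), n, d)
y <- x[, 1] + rnorm(n)
params <- expand.grid(mtry = c(1, 3), ntree = c(100, 200))    # assumed grid layout of mtry and ntree
out <- efcp_cqr(x, y, split = 0.5, beta_grid = c(0.02, 0.05), params_grid = params, alpha = 0.1)
str(out)    # average width and a coverage function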
Efficiency first conformal prediction for ridge regression
efcp_ridge(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
X | an N*d training matrix |
Y | an N*1 training vector |
X0 | an N0*d matrix of test points |
lambda | a sequence of penalty parameters for ridge regression |
alpha | miscoverage level |
upper and lower prediction intervals for X0.
df <- 3
d <- 5
n <- 50     # number of training samples
n0 <- 10    # number of prediction points
rho <- 0.5
Sigma <- matrix(rho, d, d)
diag(Sigma) <- rep(1, d)
beta <- rep(1:5, d/5)
X0 <- mvtnorm::rmvt(n0, Sigma, df)
X <- mvtnorm::rmvt(n, Sigma, df)    # multivariate t distribution
eps <- rt(n, df) * (1 + sqrt(X[, 1]^2 + X[, 2]^2))
Y <- X %*% beta + eps
out.efcp <- efcp.fun(X, Y, X0)
out.efcp$up
out.efcp$lo
Efficiency first conformal prediction for ridge regression
efcp.fun(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
X | an N*d training matrix |
Y | an N*1 training vector |
X0 | an N0*d matrix of test points |
lambda | a sequence of penalty parameters for ridge regression |
alpha | miscoverage level |
upper and lower prediction intervals for X0.
df <- 3
d <- 5
n <- 50     # number of training samples
n0 <- 10    # number of prediction points
rho <- 0.5
Sigma <- matrix(rho, d, d)
diag(Sigma) <- rep(1, d)
beta <- rep(1:5, d/5)
X0 <- mvtnorm::rmvt(n0, Sigma, df)
X <- mvtnorm::rmvt(n, Sigma, df)    # multivariate t distribution
eps <- rt(n, df) * (1 + sqrt(X[, 1]^2 + X[, 2]^2))
Y <- X %*% beta + eps
out.efcp <- efcp.fun(X, Y, X0)
out.efcp$up
out.efcp$lo
Conformal prediction for linear regression
ginverse.fun(x, y, x0, alpha = 0.1)
x | an N*d training matrix |
y | an N*1 training vector |
x0 | an N0*d matrix of test points |
alpha | miscoverage level |
upper and lower prediction intervals for x0
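A minimal sketch of conformal prediction for linear regression via ginverse.fun on simulated data; the data-generating setup is illustrative:
library(ConformalSmallest)
set.seed(1)
n <- 50
d <- 5
n0 <- 10
x <- matrix(rnorm(n * d), n, d)
y <- x %*% rep(1, d) + rnorm(n)
x0 <- matrix(rnorm(n0 * d), n0, d)
out <- ginverse.fun(x, y, x0, alpha = 0.1)
str(out)    # upper and lower prediction intervals for x0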
Internal function used for ginverse.fun
ginverselm.funs(intercept = TRUE, lambda = 0)
intercept | default is TRUE |
lambda | a vector |
Kernel data
kernel
A dataset of dimension 14+1
sgemm_product.csv
Internal function used for ginverse.fun
my.ginverselm.funs
An object of class list of length 4.
Conformal prediction for linear regression
naive.fun(X, Y, X0, alpha = 0.1)
X | an N*d training matrix |
Y | an N*1 training vector |
X0 | an N0*d matrix of test points |
alpha | miscoverage level |
upper and lower prediction intervals for X0
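A minimal sketch of the naive linear-regression conformal interval via naive.fun, using the same kind of simulated data as above:
library(ConformalSmallest)
set.seed(1)
n <- 50
d <- 5
n0 <- 10
X <- matrix(rnorm(n * d), n, d)
Y <- X %*% rep(1, d) + rnorm(n)
X0 <- matrix(rnorm(n0 * d), n0, d)
out <- naive.fun(X, Y, X0, alpha = 0.1)
str(out)    # upper and lower prediction intervals for X0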
News data
news
A dataset of dimension 59+1
OnlineNewsPopularity.csv
A dataset containing the experiment results used in the vignettes.
pois_n400_reps100
A list with 10 elements: x_test, n, nrep, width_mat, cov_mat, beta_mat, ntree_mat, cqr_method_mat, evaluations, alpha
x_test | test points of x |
n | number of training samples |
nrep | number of replications |
width_mat | a data frame with the first column being the width of the prediction regions |
cov_mat | a data frame with the first column being the coverage of the prediction regions |
beta_mat | a data frame with the first column being the beta for CQR used in the final prediction |
ntree_mat | a data frame with the first column being the number of trees for CQR used in the final prediction |
cqr_method_mat | a data frame with the first column being the CQR method (among CQR, CQR-m, CQR-r) used in the final prediction |
alpha | desired miscoverage level |
For details please see the "Example-tuning_free_CQR" vignette: vignette("Example-tuning_free_CQR", package = "ConformalSmallest")
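A minimal sketch of loading the stored results and inspecting their components (assuming the object is shipped as a lazy-loaded dataset):
library(ConformalSmallest)
data(pois_n400_reps100)
names(pois_n400_reps100)              # x_test, n, nrep, width_mat, cov_mat, ...
str(pois_n400_reps100$width_mat)      # widths of the prediction regions across replications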
Protein data
protein
A dataset of dimension 8+1
CASP.csv
A dataset containing the experiment results used in the vignettes.
ridge_linear_cov100_t3
A list with 7 elements: dim_linear_t3, cov.param_linear_fm_t3, cov.naive_linear_fm_t3, cov.vfcp_linear_fm_t3, cov.star_linear_fm_t3, cov.cv5_linear_fm_t3, cov.efcp_linear_fm_t3
dimensions used in the experiment
a matrix with coverages for the prediction regions produced by the parametric method
a matrix with coverages for the prediction regions produced by the naive linear regression method
a matrix with coverages for the prediction regions produced by VFCP
a matrix with coverages for the prediction regions produced by cross-validation with the errors
a matrix with coverages for the prediction regions produced by cross-validation with 5 splits
a matrix with coverages for the prediction regions produced by EFCP
For details please see the "Example-tuning_free_ridge_regression" vignette: vignette("Example-tuning_free_ridge_regression", package = "ConformalSmallest")
A dataset containing the experiment results used in the vignettes.
ridge_linear_cov100_t5
A list with 7 elements: dim_linear_t5, cov.param_linear_fm_t5, cov.naive_linear_fm_t5, cov.vfcp_linear_fm_t5, cov.star_linear_fm_t5, cov.cv5_linear_fm_t5, cov.efcp_linear_fm_t5
dimensions used in the experiment
a matrix with coverages for the prediction regions produced by the parametric method
a matrix with coverages for the prediction regions produced by the naive linear regression method
a matrix with coverages for the prediction regions produced by VFCP
a matrix with coverages for the prediction regions produced by cross-validation with the errors
a matrix with coverages for the prediction regions produced by cross-validation with 5 splits
a matrix with coverages for the prediction regions produced by EFCP
For details please see the "Example-tuning_free_ridge_regression" vignette: vignette("Example-tuning_free_ridge_regression", package = "ConformalSmallest")
A dataset containing the experiment results used in the vignettes.
ridge_linear_len100_t3
A list with 6 elements: len.param_linear_fm_t3, len.naive_linear_fm_t3, len.vfcp_linear_fm_t3, len.star_linear_fm_t3, len.cv5_linear_fm_t3, len.efcp_linear_fm_t3
a matrix with widths for the prediction regions produced by the parametric method
a matrix with widths for the prediction regions produced by the naive linear regression method
a matrix with widths for the prediction regions produced by VFCP
a matrix with widths for the prediction regions produced by cross-validation with the errors
a matrix with widths for the prediction regions produced by cross-validation with 5 splits
a matrix with widths for the prediction regions produced by EFCP
For details please see the "Example-tuning_free_ridge_regression" vignette: vignette("Example-tuning_free_ridge_regression", package = "ConformalSmallest")
A dataset containing the experiment results used in the vignettes.
ridge_linear_len100_t5
A list with 6 elements: len.param_linear_fm_t5, len.naive_linear_fm_t5, len.vfcp_linear_fm_t5, len.star_linear_fm_t5, len.cv5_linear_fm_t5, len.efcp_linear_fm_t5
a matrix with widths for the prediction regions produced by the parametric method
a matrix with widths for the prediction regions produced by the naive linear regression method
a matrix with widths for the prediction regions produced by VFCP
a matrix with widths for the prediction regions produced by cross-validation with the errors
a matrix with widths for the prediction regions produced by cross-validation with 5 splits
a matrix with widths for the prediction regions produced by EFCP
For details please see the "Example-tuning_free_ridge_regression" vignette: vignette("Example-tuning_free_ridge_regression", package = "ConformalSmallest")
Conformal prediction for ridge regression, with the tuning parameter chosen by minimizing the mean of the residuals
star.fun(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
X | an N*d training matrix |
Y | an N*1 training vector |
X0 | an N0*d matrix of test points |
lambda | a sequence of penalty parameters for ridge regression |
alpha | miscoverage level |
upper and lower prediction intervals for X0
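A minimal sketch of star.fun, which picks the ridge penalty by minimizing the mean of the residuals as described above; the simulated data are illustrative:
library(ConformalSmallest)
set.seed(1)
n <- 50
d <- 5
n0 <- 10
X <- matrix(rnorm(n * d), n, d)
Y <- X %*% rep(1, d) + rnorm(n)
X0 <- matrix(rnorm(n0 * d), n0, d)
out <- star.fun(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
str(out)    # upper and lower prediction intervals for X0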
Superconduct data
superconduct
A dataset of dimension 81+1
train.csv
Validity first conformal prediction for Conformal Quantile Regression
vfcp_cqr(x, y, split, beta_grid, params_grid, alpha = 0.1)
x | an N*d training matrix |
y | an N*1 training vector |
split | a number between 0 and 1 |
beta_grid | a grid of beta values |
params_grid | a grid of mtry and ntree values |
alpha | miscoverage level |
average prediction width and a function for computing coverage at test points
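A minimal sketch of vfcp_cqr, mirroring the efcp_cqr sketch above; the assumed params_grid layout and split fraction are illustrative:
library(ConformalSmallest)
set.seed(1)
n <- 300
d <- 3
x <- matrix(runif(n * d), n, d)
y <- x[, 1] + rnorm(n)
params <- expand.grid(mtry = c(1, 3), ntree = c(100, 200))    # assumed grid layout of mtry and ntree
out <- vfcp_cqr(x, y, split = 0.5, beta_grid = c(0.02, 0.05), params_grid = params, alpha = 0.1)
str(out)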
Validity first conformal prediction for ridge regression
vfcp_ridge(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
X | an N*d training matrix |
Y | an N*1 training vector |
X0 | an N0*d matrix of test points |
lambda | a sequence of penalty parameters for ridge regression |
alpha | miscoverage level |
upper and lower prediction intervals for X0.
df <- 3
d <- 5
n <- 50     # number of training samples
n0 <- 10    # number of prediction points
rho <- 0.5
Sigma <- matrix(rho, d, d)
diag(Sigma) <- rep(1, d)
beta <- rep(1:5, d/5)
X0 <- mvtnorm::rmvt(n0, Sigma, df)
X <- mvtnorm::rmvt(n, Sigma, df)    # multivariate t distribution
eps <- rt(n, df) * (1 + sqrt(X[, 1]^2 + X[, 2]^2))
Y <- X %*% beta + eps
out.vfcp <- vfcp.fun(X, Y, X0)
out.vfcp$up
out.vfcp$lo
Validity first conformal prediction for ridge regression
vfcp.fun(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
X | an N*d training matrix |
Y | an N*1 training vector |
X0 | an N0*d matrix of test points |
lambda | a sequence of penalty parameters for ridge regression |
alpha | miscoverage level |
upper and lower prediction intervals for X0.
df <- 3
d <- 5
n <- 50     # number of training samples
n0 <- 10    # number of prediction points
rho <- 0.5
Sigma <- matrix(rho, d, d)
diag(Sigma) <- rep(1, d)
beta <- rep(1:5, d/5)
X0 <- mvtnorm::rmvt(n0, Sigma, df)
X <- mvtnorm::rmvt(n, Sigma, df)    # multivariate t distribution
eps <- rt(n, df) * (1 + sqrt(X[, 1]^2 + X[, 2]^2))
Y <- X %*% beta + eps
out.vfcp <- vfcp.fun(X, Y, X0)
out.vfcp$up
out.vfcp$lo