Get predictions on test sets in MLR
I'm fitting classification models for binary problems using the mlr package in R. For each model, I perform a cross-validation with embedded feature selection using the "selectFeatures" function and retrieve the mean AUC over the test sets. Next, I would like to retrieve the predictions on the test sets for each fold, but this function does not seem to support that. I already tried plugging the selected predictors into the "resample" function to get them. It works, but the performance metrics are different, which is not suitable for my analysis. I also checked whether this is possible in the caret package, but I did not see a solution at first glance. Any idea how to do it?
Here is my code with synthetic data and my attempt with the "resample" function (again: not suitable in its current version, as the performance metrics are different).
# 1. Find a synthetic dataset for supervised learning (two classes)
###################################################################
# install.packages("mlbench")  # run once if the package is not installed
library(mlbench)
# generate 1000 rows, 21 quantitative candidate predictors and 1 target variable
p <- mlbench.waveform(1000)
# convert list into data frame
dataset <- as.data.frame(p)
# drop third class to get 2 classes, then remove the now-empty factor level
dataset2 <- droplevels(subset(dataset, classes != 3))
# 2. Perform cross validation with embedded feature selection
#############################################################
library(BBmisc)
library(nnet)
library(mlr)
# Choice of algorithm i.e. neural network
mL <- makeLearner("classif.nnet", predict.type = "prob")
# Choice of sampling plan: 10 fold cross validation with stratification of target classes
mRD = makeResampleDesc("CV", iters = 10,stratify = TRUE)
# Choice of feature selection strategy (stepwise family: sequential floating forward search)
# alpha is the minimum improvement in performance required to add a feature
ctrl = makeFeatSelControlSequential(method = "sffs", maxit = NA, alpha = 0.001)
# Choice of seed
set.seed(12)
# Choice of data
mCT <- makeClassifTask(data =dataset2, target = "classes")
# Perform the method
result = selectFeatures(mL,mCT, mRD, control = ctrl, measures = list(mlr::auc,mlr::acc,mlr::brier))
# Retrieve AUC and selected variables
analyzeFeatSelResult(result)
# Result: auc.test.mean=0.9614525 Variables selected: x.10, x.11, x.15, x.17, x.18
# 3. Retrieve predictions on test sets (to later perform DeLong tests on AUCs derived from multiple sets of candidate variables)
################################################################################################################################
# create new dataset with selected predictors
keep <- c("x.10","x.11","x.15","x.17","x.18","classes")
dataset3 <- dataset2[ , names(dataset2) %in% keep]
# Perform the same task with the resample function instead of selectFeatures to get predictions on the test sets
# (resample takes no feature selection control object, so ctrl is not needed here)
mL <- makeLearner("classif.nnet", predict.type = "prob")
mRD = makeResampleDesc("CV", iters = 10,stratify = TRUE)
set.seed(12)
mCT <- makeClassifTask(data =dataset3, target = "classes")
r1r = resample(mL, mCT, mRD, measures = list(mlr::auc,mlr::acc,mlr::brier))
# Result: auc.test.mean=0.9673023
cross-validation feature-selection mlr
edited Nov 9 at 10:06
asked Nov 9 at 8:57
Chris
1 Answer
ctrl is missing in your code.
To get the predictions from your resample object, just use getRRPredictions(r1r). The per-fold test performance values are in r1r$measures.test.
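As a minimal sketch of what this looks like in practice, the block below reruns a plain cross-validation on the same kind of synthetic data and pulls the test-set predictions out of the resample result. The object names (task, lrn, rdesc, r, pred_df, fold1) are illustrative and not taken from the question, and the columns of the prediction table should be checked against your installed mlr version.

```r
library(mlbench)
library(mlr)

set.seed(12)
# Synthetic two-class data, built the same way as in the question
dataset <- as.data.frame(mlbench.waveform(1000))
dataset2 <- droplevels(subset(dataset, classes != 3))

task <- makeClassifTask(data = dataset2, target = "classes")
lrn <- makeLearner("classif.nnet", predict.type = "prob")
rdesc <- makeResampleDesc("CV", iters = 10, stratify = TRUE)

r <- resample(lrn, task, rdesc,
              measures = list(mlr::auc, mlr::acc), show.info = FALSE)

# Pooled test-set predictions: one row per observation,
# with the fold number stored in the 'iter' column
pred_df <- as.data.frame(getRRPredictions(r))
head(pred_df)

# Per-fold performance values (these are metrics, not predictions)
r$measures.test

# Predictions for a single fold, e.g. the first one
fold1 <- subset(pred_df, iter == 1)
```

Because each observation falls into exactly one test fold, the pooled table has as many rows as the input data.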
Indeed, ctrl was missing. I have added it. My question is not about getting predictions from the resample object; I already did that (it was my first attempt). The issue with this attempt is that the resample function gives a different AUC than the one from selectFeatures. I have edited my question to make it clearer. Thanks!
– Chris
Nov 9 at 10:10
You could use "makeFeatSelWrapper" as an alternative. I also get different results, like you...
– PhilippPro
Nov 9 at 12:52
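A rough sketch of the wrapper approach, assuming the same synthetic data as in the question: the feature selection is wrapped around the learner and then evaluated with resample, so the selection is repeated inside each outer training set and the outer test-set predictions are available afterwards. The names base_lrn, fs_lrn, inner, outer and selected are illustrative, and the fold counts are reduced only to keep the run time manageable.

```r
library(mlbench)
library(mlr)

set.seed(12)
dataset <- as.data.frame(mlbench.waveform(1000))
dataset2 <- droplevels(subset(dataset, classes != 3))
task <- makeClassifTask(data = dataset2, target = "classes")

base_lrn <- makeLearner("classif.nnet", predict.type = "prob")
ctrl <- makeFeatSelControlSequential(method = "sffs", maxit = NA, alpha = 0.001)

# Inner loop: features are selected by cross-validation
# within each outer training set
inner <- makeResampleDesc("CV", iters = 3, stratify = TRUE)
fs_lrn <- makeFeatSelWrapper(base_lrn, resampling = inner,
                             measures = list(mlr::auc),
                             control = ctrl, show.info = FALSE)

# Outer loop: evaluates the whole procedure (selection + fit)
outer <- makeResampleDesc("CV", iters = 5, stratify = TRUE)
r <- resample(fs_lrn, task, outer, measures = list(mlr::auc),
              extract = getFeatSelResult, show.info = FALSE)

# Outer test-set predictions, one row per observation
pred_df <- as.data.frame(getRRPredictions(r))

# Features selected in each outer fold (may differ between folds)
selected <- lapply(r$extract, function(res) res$x)
```

Note that the AUC from this nested setup is not expected to match the one reported by selectFeatures: here the test folds play no part in choosing the features, and a different feature set may be chosen in each outer fold.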
Does makeFeatSelWrapper do the whole thing, i.e. CV + feature selection + predicted values?
– Chris
Nov 9 at 15:10
It does indeed seem to be a solution. What is strange, however, is that I end up with a model without variables for logistic regression and with errors for a neural network. I'm going to open a separate question for this.
– Chris
Nov 12 at 12:45
answered Nov 9 at 9:52
PhilippPro