Mixed integer programming R: Least absolute deviation with cost associated with each regressor











up vote
0
down vote

favorite












I have been presented with a problem, regarding the minimization of the absolute error, the problem know as LAD(Least absolute deviation) but, being each regressor the result of expensive test with an associated cost, one should refrain from using regressors that don't explain variance to a high degree. It takes the following equations:



enter image description here



Where N is the total number of observations, E the deviation associated with observation i, S the number of independant variables, lambda a penalty coefficient for the cost, and C the cost associated with performing the test.



So far, I have oriented as usual. To make it lineal, I transformed the absolute value in two errors, e^+ and e^-, where e= y_i-(B_0+sum(B_j*X_ij) and the following constraints:




  • z_j ={0,1}, binary value about whether the regressor enters my model.


  • B_i<=M_zj; B_i>=-M_zj


  • E^+, E^- >=0


A toy subset of data I'm working has the following structure:
For y



 quality
1 5
2 5
3 5
4 6
5 7
6 5


For the regressors



fixed.acidity volatile.acidity citric.acid
1 7.5 0.610 0.26
2 5.6 0.540 0.04
3 7.4 0.965 0.00
4 6.7 0.460 0.24
5 6.1 0.400 0.16
6 9.7 0.690 0.32


And for the cost



fixed.acidity volatile.acidity citric.acid
1 0.26 0.6 0.52


So far, my code looks like this:



# loading the matrixes
y <- read.csv(file="PATH\y.csv", header = TRUE, sep = ",") #dim=100*11
regresores <- read.csv(file="PATH\regressors.csv", header = TRUE, sep = ",")#dim=100*1
cost <- read.csv(file="PATH\cost.csv", header = TRUE, sep = ",")#dim=1*11

for (i in seq(0, 1, by = 0.1)){#so as to have a collection of models with different penalties
obj.fun <- c(1,1,i*coste)
constr <- matrix(
c(y,regresores,-regresores),
c(-y,-regresores,regresores),
sum(regresores),ncol = ,byrow = TRUE)
constr.dir <- c("<=",">=","<=","==")
rhs<-c(regresores,-regresores,1,binary)
sol<- lp("min", obj.fun, constr, constr.tr, rhs)
sol$objval
sol$solution}


I know theres is a LAD function in R, but for consistence sake with my colleagues, as well as a pretty annoying phD tutor, I have to perform this using lpSolve in R. I have just started with R for the project and I don't know exactly why this won't run. Is there something wrong with the syntax or my formulation of the model. Right know, the main problem I have is:




"Error in matrix(c(y, regressors, -regressors), c(-y, -regressors, regressors), : non-numeric matrix extent".




Mainly, I intended for it to create said weighted LAD model and have it return the different values of lambda, from 0 to 1 in a 0.1 step.



Thanks in advance and sorry for any inconvenience, neither English nor R are my native languages.










share|improve this question
























  • What is your question?
    – emilliman5
    Nov 9 at 12:38










  • Sorry, just realized I forgot the most important part. I'll edit it now.
    – Aaron G.
    Nov 9 at 12:54










  • It helps us help you if you provide 1) the smallest amount of code possible to demonstrate the problem, 2) data for that code to run on, 3) an explicit statement of expected results, 4) an explanation of why your code doesn't work (e.g., how are the results different from the desired output or what error messages do you receive).
    – Lyngbakr
    Nov 9 at 13:03












  • 2)the data is kind of massive, it is a matrix of 11*100 samples.3) I wish to have result showing, for each lambda, a model to work with.4) the code has given me some different errors, the main one being non-numeric matrix extend.1) The code I provided, is it not showing? Otherwise I don´t know exactly what it is supposed to be.
    – Aaron G.
    Nov 9 at 13:13












  • The code is showing, I'm just explaining what a complete question looks like. (See here for more details.) If you can't provide all the data, provide a subset or a toy data set that can be used to demonstrate the problem. There are lots of data sets here for this sort of thing. Also, edit your question to include the exact error messages you receive. If you provide a well-structured question, you're more likely to get a good answer.
    – Lyngbakr
    Nov 9 at 15:58















up vote
0
down vote

favorite












I have been presented with a problem, regarding the minimization of the absolute error, the problem know as LAD(Least absolute deviation) but, being each regressor the result of expensive test with an associated cost, one should refrain from using regressors that don't explain variance to a high degree. It takes the following equations:



enter image description here



Where N is the total number of observations, E the deviation associated with observation i, S the number of independant variables, lambda a penalty coefficient for the cost, and C the cost associated with performing the test.



So far, I have oriented as usual. To make it lineal, I transformed the absolute value in two errors, e^+ and e^-, where e= y_i-(B_0+sum(B_j*X_ij) and the following constraints:




  • z_j ={0,1}, binary value about whether the regressor enters my model.


  • B_i<=M_zj; B_i>=-M_zj


  • E^+, E^- >=0


A toy subset of data I'm working has the following structure:
For y



 quality
1 5
2 5
3 5
4 6
5 7
6 5


For the regressors



fixed.acidity volatile.acidity citric.acid
1 7.5 0.610 0.26
2 5.6 0.540 0.04
3 7.4 0.965 0.00
4 6.7 0.460 0.24
5 6.1 0.400 0.16
6 9.7 0.690 0.32


And for the cost



fixed.acidity volatile.acidity citric.acid
1 0.26 0.6 0.52


So far, my code looks like this:



# loading the matrixes
y <- read.csv(file="PATH\y.csv", header = TRUE, sep = ",") #dim=100*11
regresores <- read.csv(file="PATH\regressors.csv", header = TRUE, sep = ",")#dim=100*1
cost <- read.csv(file="PATH\cost.csv", header = TRUE, sep = ",")#dim=1*11

for (i in seq(0, 1, by = 0.1)){#so as to have a collection of models with different penalties
obj.fun <- c(1,1,i*coste)
constr <- matrix(
c(y,regresores,-regresores),
c(-y,-regresores,regresores),
sum(regresores),ncol = ,byrow = TRUE)
constr.dir <- c("<=",">=","<=","==")
rhs<-c(regresores,-regresores,1,binary)
sol<- lp("min", obj.fun, constr, constr.tr, rhs)
sol$objval
sol$solution}


I know theres is a LAD function in R, but for consistence sake with my colleagues, as well as a pretty annoying phD tutor, I have to perform this using lpSolve in R. I have just started with R for the project and I don't know exactly why this won't run. Is there something wrong with the syntax or my formulation of the model. Right know, the main problem I have is:




"Error in matrix(c(y, regressors, -regressors), c(-y, -regressors, regressors), : non-numeric matrix extent".




Mainly, I intended for it to create said weighted LAD model and have it return the different values of lambda, from 0 to 1 in a 0.1 step.



Thanks in advance and sorry for any inconvenience, neither English nor R are my native languages.










share|improve this question
























  • What is your question?
    – emilliman5
    Nov 9 at 12:38










  • Sorry, just realized I forgot the most important part. I'll edit it now.
    – Aaron G.
    Nov 9 at 12:54










  • It helps us help you if you provide 1) the smallest amount of code possible to demonstrate the problem, 2) data for that code to run on, 3) an explicit statement of expected results, 4) an explanation of why your code doesn't work (e.g., how are the results different from the desired output or what error messages do you receive).
    – Lyngbakr
    Nov 9 at 13:03












  • 2)the data is kind of massive, it is a matrix of 11*100 samples.3) I wish to have result showing, for each lambda, a model to work with.4) the code has given me some different errors, the main one being non-numeric matrix extend.1) The code I provided, is it not showing? Otherwise I don´t know exactly what it is supposed to be.
    – Aaron G.
    Nov 9 at 13:13












  • The code is showing, I'm just explaining what a complete question looks like. (See here for more details.) If you can't provide all the data, provide a subset or a toy data set that can be used to demonstrate the problem. There are lots of data sets here for this sort of thing. Also, edit your question to include the exact error messages you receive. If you provide a well-structured question, you're more likely to get a good answer.
    – Lyngbakr
    Nov 9 at 15:58













up vote
0
down vote

favorite









up vote
0
down vote

favorite











I have been presented with a problem, regarding the minimization of the absolute error, the problem know as LAD(Least absolute deviation) but, being each regressor the result of expensive test with an associated cost, one should refrain from using regressors that don't explain variance to a high degree. It takes the following equations:



enter image description here



Where N is the total number of observations, E the deviation associated with observation i, S the number of independant variables, lambda a penalty coefficient for the cost, and C the cost associated with performing the test.



So far, I have oriented as usual. To make it lineal, I transformed the absolute value in two errors, e^+ and e^-, where e= y_i-(B_0+sum(B_j*X_ij) and the following constraints:




  • z_j ={0,1}, binary value about whether the regressor enters my model.


  • B_i<=M_zj; B_i>=-M_zj


  • E^+, E^- >=0


A toy subset of data I'm working has the following structure:
For y



 quality
1 5
2 5
3 5
4 6
5 7
6 5


For the regressors



fixed.acidity volatile.acidity citric.acid
1 7.5 0.610 0.26
2 5.6 0.540 0.04
3 7.4 0.965 0.00
4 6.7 0.460 0.24
5 6.1 0.400 0.16
6 9.7 0.690 0.32


And for the cost



fixed.acidity volatile.acidity citric.acid
1 0.26 0.6 0.52


So far, my code looks like this:



# loading the matrixes
y <- read.csv(file="PATH\y.csv", header = TRUE, sep = ",") #dim=100*11
regresores <- read.csv(file="PATH\regressors.csv", header = TRUE, sep = ",")#dim=100*1
cost <- read.csv(file="PATH\cost.csv", header = TRUE, sep = ",")#dim=1*11

for (i in seq(0, 1, by = 0.1)){#so as to have a collection of models with different penalties
obj.fun <- c(1,1,i*coste)
constr <- matrix(
c(y,regresores,-regresores),
c(-y,-regresores,regresores),
sum(regresores),ncol = ,byrow = TRUE)
constr.dir <- c("<=",">=","<=","==")
rhs<-c(regresores,-regresores,1,binary)
sol<- lp("min", obj.fun, constr, constr.tr, rhs)
sol$objval
sol$solution}


I know theres is a LAD function in R, but for consistence sake with my colleagues, as well as a pretty annoying phD tutor, I have to perform this using lpSolve in R. I have just started with R for the project and I don't know exactly why this won't run. Is there something wrong with the syntax or my formulation of the model. Right know, the main problem I have is:




"Error in matrix(c(y, regressors, -regressors), c(-y, -regressors, regressors), : non-numeric matrix extent".




Mainly, I intended for it to create said weighted LAD model and have it return the different values of lambda, from 0 to 1 in a 0.1 step.



Thanks in advance and sorry for any inconvenience, neither English nor R are my native languages.










share|improve this question















I have been presented with a problem, regarding the minimization of the absolute error, the problem know as LAD(Least absolute deviation) but, being each regressor the result of expensive test with an associated cost, one should refrain from using regressors that don't explain variance to a high degree. It takes the following equations:



enter image description here



Where N is the total number of observations, E the deviation associated with observation i, S the number of independant variables, lambda a penalty coefficient for the cost, and C the cost associated with performing the test.



So far, I have oriented as usual. To make it lineal, I transformed the absolute value in two errors, e^+ and e^-, where e= y_i-(B_0+sum(B_j*X_ij) and the following constraints:




  • z_j ={0,1}, binary value about whether the regressor enters my model.


  • B_i<=M_zj; B_i>=-M_zj


  • E^+, E^- >=0


A toy subset of data I'm working has the following structure:
For y



 quality
1 5
2 5
3 5
4 6
5 7
6 5


For the regressors



fixed.acidity volatile.acidity citric.acid
1 7.5 0.610 0.26
2 5.6 0.540 0.04
3 7.4 0.965 0.00
4 6.7 0.460 0.24
5 6.1 0.400 0.16
6 9.7 0.690 0.32


And for the cost



fixed.acidity volatile.acidity citric.acid
1 0.26 0.6 0.52


So far, my code looks like this:



# loading the matrixes
y <- read.csv(file="PATH\y.csv", header = TRUE, sep = ",") #dim=100*11
regresores <- read.csv(file="PATH\regressors.csv", header = TRUE, sep = ",")#dim=100*1
cost <- read.csv(file="PATH\cost.csv", header = TRUE, sep = ",")#dim=1*11

for (i in seq(0, 1, by = 0.1)){#so as to have a collection of models with different penalties
obj.fun <- c(1,1,i*coste)
constr <- matrix(
c(y,regresores,-regresores),
c(-y,-regresores,regresores),
sum(regresores),ncol = ,byrow = TRUE)
constr.dir <- c("<=",">=","<=","==")
rhs<-c(regresores,-regresores,1,binary)
sol<- lp("min", obj.fun, constr, constr.tr, rhs)
sol$objval
sol$solution}


I know theres is a LAD function in R, but for consistence sake with my colleagues, as well as a pretty annoying phD tutor, I have to perform this using lpSolve in R. I have just started with R for the project and I don't know exactly why this won't run. Is there something wrong with the syntax or my formulation of the model. Right know, the main problem I have is:




"Error in matrix(c(y, regressors, -regressors), c(-y, -regressors, regressors), : non-numeric matrix extent".




Mainly, I intended for it to create said weighted LAD model and have it return the different values of lambda, from 0 to 1 in a 0.1 step.



Thanks in advance and sorry for any inconvenience, neither English nor R are my native languages.







r regression mixed-integer-programming






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 20 at 15:41









emilliman5

3,91321429




3,91321429










asked Nov 9 at 12:34









Aaron G.

11




11












  • What is your question?
    – emilliman5
    Nov 9 at 12:38










  • Sorry, just realized I forgot the most important part. I'll edit it now.
    – Aaron G.
    Nov 9 at 12:54










  • It helps us help you if you provide 1) the smallest amount of code possible to demonstrate the problem, 2) data for that code to run on, 3) an explicit statement of expected results, 4) an explanation of why your code doesn't work (e.g., how are the results different from the desired output or what error messages do you receive).
    – Lyngbakr
    Nov 9 at 13:03












  • 2)the data is kind of massive, it is a matrix of 11*100 samples.3) I wish to have result showing, for each lambda, a model to work with.4) the code has given me some different errors, the main one being non-numeric matrix extend.1) The code I provided, is it not showing? Otherwise I don´t know exactly what it is supposed to be.
    – Aaron G.
    Nov 9 at 13:13












  • The code is showing, I'm just explaining what a complete question looks like. (See here for more details.) If you can't provide all the data, provide a subset or a toy data set that can be used to demonstrate the problem. There are lots of data sets here for this sort of thing. Also, edit your question to include the exact error messages you receive. If you provide a well-structured question, you're more likely to get a good answer.
    – Lyngbakr
    Nov 9 at 15:58


















  • What is your question?
    – emilliman5
    Nov 9 at 12:38










  • Sorry, just realized I forgot the most important part. I'll edit it now.
    – Aaron G.
    Nov 9 at 12:54










  • It helps us help you if you provide 1) the smallest amount of code possible to demonstrate the problem, 2) data for that code to run on, 3) an explicit statement of expected results, 4) an explanation of why your code doesn't work (e.g., how are the results different from the desired output or what error messages do you receive).
    – Lyngbakr
    Nov 9 at 13:03












  • 2)the data is kind of massive, it is a matrix of 11*100 samples.3) I wish to have result showing, for each lambda, a model to work with.4) the code has given me some different errors, the main one being non-numeric matrix extend.1) The code I provided, is it not showing? Otherwise I don´t know exactly what it is supposed to be.
    – Aaron G.
    Nov 9 at 13:13












  • The code is showing, I'm just explaining what a complete question looks like. (See here for more details.) If you can't provide all the data, provide a subset or a toy data set that can be used to demonstrate the problem. There are lots of data sets here for this sort of thing. Also, edit your question to include the exact error messages you receive. If you provide a well-structured question, you're more likely to get a good answer.
    – Lyngbakr
    Nov 9 at 15:58
















What is your question?
– emilliman5
Nov 9 at 12:38




What is your question?
– emilliman5
Nov 9 at 12:38












Sorry, just realized I forgot the most important part. I'll edit it now.
– Aaron G.
Nov 9 at 12:54




Sorry, just realized I forgot the most important part. I'll edit it now.
– Aaron G.
Nov 9 at 12:54












It helps us help you if you provide 1) the smallest amount of code possible to demonstrate the problem, 2) data for that code to run on, 3) an explicit statement of expected results, 4) an explanation of why your code doesn't work (e.g., how are the results different from the desired output or what error messages do you receive).
– Lyngbakr
Nov 9 at 13:03






It helps us help you if you provide 1) the smallest amount of code possible to demonstrate the problem, 2) data for that code to run on, 3) an explicit statement of expected results, 4) an explanation of why your code doesn't work (e.g., how are the results different from the desired output or what error messages do you receive).
– Lyngbakr
Nov 9 at 13:03














2)the data is kind of massive, it is a matrix of 11*100 samples.3) I wish to have result showing, for each lambda, a model to work with.4) the code has given me some different errors, the main one being non-numeric matrix extend.1) The code I provided, is it not showing? Otherwise I don´t know exactly what it is supposed to be.
– Aaron G.
Nov 9 at 13:13






2)the data is kind of massive, it is a matrix of 11*100 samples.3) I wish to have result showing, for each lambda, a model to work with.4) the code has given me some different errors, the main one being non-numeric matrix extend.1) The code I provided, is it not showing? Otherwise I don´t know exactly what it is supposed to be.
– Aaron G.
Nov 9 at 13:13














The code is showing, I'm just explaining what a complete question looks like. (See here for more details.) If you can't provide all the data, provide a subset or a toy data set that can be used to demonstrate the problem. There are lots of data sets here for this sort of thing. Also, edit your question to include the exact error messages you receive. If you provide a well-structured question, you're more likely to get a good answer.
– Lyngbakr
Nov 9 at 15:58




The code is showing, I'm just explaining what a complete question looks like. (See here for more details.) If you can't provide all the data, provide a subset or a toy data set that can be used to demonstrate the problem. There are lots of data sets here for this sort of thing. Also, edit your question to include the exact error messages you receive. If you provide a well-structured question, you're more likely to get a good answer.
– Lyngbakr
Nov 9 at 15:58

















active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53225806%2fmixed-integer-programming-r-least-absolute-deviation-with-cost-associated-with%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown






























active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes
















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53225806%2fmixed-integer-programming-r-least-absolute-deviation-with-cost-associated-with%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Guess what letter conforming each word

Port of Spain

Run scheduled task as local user group (not BUILTIN)