How to create new data frame with custom columns out of another data frame, on filtered data per each column

I've tried to look for a similar question but I couldn't find anything similar.

I have a data frame df with hundreds of rows and several variables. The first variable is level which goes from 1 to 8.

Eg:

df<- data.frame(level = c(1,1,1,2,2,3,4,5,6,7,8), CODE = c("1234", "3452", "1234", "7654","6547","6546", "7683", "6543", "7683", "6543", "7683"), ADD_ALLOW_MEAL = c(NA, "Y", "Y", "N", "N", NA, NA, "Y", "Y", "N", 'N'), ALLOW_MEALLOW = c(NA, 40, 60, NA, NA, NA, NA, 50, 70, NA, NA)) 



> df

   level     CODE ADD_ALLOW_MEAL     ALLOW_MEALLOW

1      1     1234           <NA>                NA

2      1     3452              Y                40

3      1     1234              Y                60

4      2     7654              N                NA

5      2     6547              N                NA

6      3     6546           <NA>                NA

7      4     7683           <NA>                NA

8      5     6543              Y                50

9      6     7683              Y                70

10     7     6543              N                NA

11     8     7683              N                NA

What I need is to create a new data frame which will have only 8 rows (8 levels from df). Normally I'd use simple:

df %>% 

  group_by(level) %>% 

  summarise()

The problem is I need to create several very custom columns on the filtered data, all per level.

Example:

df %>% 

  group_by(level) %>% 

  summarise(

    Meal_Average =  filter(., ADD_ALLOW_MEAL =="Y" & ALLOW_MEALLOW>0) %>% {ifelse(str_detect(.$CODE, "2")=="TRUE", round(mean(.$ALLOW_MEALLOW, na.rm = TRUE),3), NA_real_ )}  

  )

I get an error of:

Column `Meal_Average` must be length 1 (a summary value), not 4

My desired result would be:

 level    Meal_Average 

1      1   50

2      2   NA

3      3   NA

4      4   NA

5      5   NA

6      6   NA

7      7   NA

8      8   NA

Any ideas how can I do that?

Thanks!!

edited Nov 21 '18 at 9:11

Sotos

30.2k51640

asked Nov 20 '18 at 11:29

Marta

134

add a comment |

I've tried to look for a similar question but I couldn't find anything similar.

I have a data frame df with hundreds of rows and several variables. The first variable is level which goes from 1 to 8.

Eg:

df<- data.frame(level = c(1,1,1,2,2,3,4,5,6,7,8), CODE = c("1234", "3452", "1234", "7654","6547","6546", "7683", "6543", "7683", "6543", "7683"), ADD_ALLOW_MEAL = c(NA, "Y", "Y", "N", "N", NA, NA, "Y", "Y", "N", 'N'), ALLOW_MEALLOW = c(NA, 40, 60, NA, NA, NA, NA, 50, 70, NA, NA)) 



> df

   level     CODE ADD_ALLOW_MEAL     ALLOW_MEALLOW

1      1     1234           <NA>                NA

2      1     3452              Y                40

3      1     1234              Y                60

4      2     7654              N                NA

5      2     6547              N                NA

6      3     6546           <NA>                NA

7      4     7683           <NA>                NA

8      5     6543              Y                50

9      6     7683              Y                70

10     7     6543              N                NA

11     8     7683              N                NA

What I need is to create a new data frame which will have only 8 rows (8 levels from df). Normally I'd use simple:

df %>% 

  group_by(level) %>% 

  summarise()

The problem is I need to create several very custom columns on the filtered data, all per level.

Example:

df %>% 

  group_by(level) %>% 

  summarise(

    Meal_Average =  filter(., ADD_ALLOW_MEAL =="Y" & ALLOW_MEALLOW>0) %>% {ifelse(str_detect(.$CODE, "2")=="TRUE", round(mean(.$ALLOW_MEALLOW, na.rm = TRUE),3), NA_real_ )}  

  )

I get an error of:

Column `Meal_Average` must be length 1 (a summary value), not 4

My desired result would be:

 level    Meal_Average 

1      1   50

2      2   NA

3      3   NA

4      4   NA

5      5   NA

6      6   NA

7      7   NA

8      8   NA

Any ideas how can I do that?

Thanks!!

edited Nov 21 '18 at 9:11

Sotos

30.2k51640

asked Nov 20 '18 at 11:29

Marta

134

add a comment |

I've tried to look for a similar question but I couldn't find anything similar.

I have a data frame df with hundreds of rows and several variables. The first variable is level which goes from 1 to 8.

Eg:

df<- data.frame(level = c(1,1,1,2,2,3,4,5,6,7,8), CODE = c("1234", "3452", "1234", "7654","6547","6546", "7683", "6543", "7683", "6543", "7683"), ADD_ALLOW_MEAL = c(NA, "Y", "Y", "N", "N", NA, NA, "Y", "Y", "N", 'N'), ALLOW_MEALLOW = c(NA, 40, 60, NA, NA, NA, NA, 50, 70, NA, NA)) 



> df

   level     CODE ADD_ALLOW_MEAL     ALLOW_MEALLOW

1      1     1234           <NA>                NA

2      1     3452              Y                40

3      1     1234              Y                60

4      2     7654              N                NA

5      2     6547              N                NA

6      3     6546           <NA>                NA

7      4     7683           <NA>                NA

8      5     6543              Y                50

9      6     7683              Y                70

10     7     6543              N                NA

11     8     7683              N                NA

What I need is to create a new data frame which will have only 8 rows (8 levels from df). Normally I'd use simple:

df %>% 

  group_by(level) %>% 

  summarise()

The problem is I need to create several very custom columns on the filtered data, all per level.

Example:

df %>% 

  group_by(level) %>% 

  summarise(

    Meal_Average =  filter(., ADD_ALLOW_MEAL =="Y" & ALLOW_MEALLOW>0) %>% {ifelse(str_detect(.$CODE, "2")=="TRUE", round(mean(.$ALLOW_MEALLOW, na.rm = TRUE),3), NA_real_ )}  

  )

I get an error of:

Column `Meal_Average` must be length 1 (a summary value), not 4

My desired result would be:

 level    Meal_Average 

1      1   50

2      2   NA

3      3   NA

4      4   NA

5      5   NA

6      6   NA

7      7   NA

8      8   NA

Any ideas how can I do that?

Thanks!!

edited Nov 21 '18 at 9:11

Sotos

30.2k51640

asked Nov 20 '18 at 11:29

Marta

134

I've tried to look for a similar question but I couldn't find anything similar.

I have a data frame df with hundreds of rows and several variables. The first variable is level which goes from 1 to 8.

Eg:

df<- data.frame(level = c(1,1,1,2,2,3,4,5,6,7,8), CODE = c("1234", "3452", "1234", "7654","6547","6546", "7683", "6543", "7683", "6543", "7683"), ADD_ALLOW_MEAL = c(NA, "Y", "Y", "N", "N", NA, NA, "Y", "Y", "N", 'N'), ALLOW_MEALLOW = c(NA, 40, 60, NA, NA, NA, NA, 50, 70, NA, NA)) 



> df

   level     CODE ADD_ALLOW_MEAL     ALLOW_MEALLOW

1      1     1234           <NA>                NA

2      1     3452              Y                40

3      1     1234              Y                60

4      2     7654              N                NA

5      2     6547              N                NA

6      3     6546           <NA>                NA

7      4     7683           <NA>                NA

8      5     6543              Y                50

9      6     7683              Y                70

10     7     6543              N                NA

11     8     7683              N                NA

What I need is to create a new data frame which will have only 8 rows (8 levels from df). Normally I'd use simple:

df %>% 

  group_by(level) %>% 

  summarise()

The problem is I need to create several very custom columns on the filtered data, all per level.

Example:

df %>% 

  group_by(level) %>% 

  summarise(

    Meal_Average =  filter(., ADD_ALLOW_MEAL =="Y" & ALLOW_MEALLOW>0) %>% {ifelse(str_detect(.$CODE, "2")=="TRUE", round(mean(.$ALLOW_MEALLOW, na.rm = TRUE),3), NA_real_ )}  

  )

I get an error of:

Column `Meal_Average` must be length 1 (a summary value), not 4

My desired result would be:

 level    Meal_Average 

1      1   50

2      2   NA

3      3   NA

4      4   NA

5      5   NA

6      6   NA

7      7   NA

8      8   NA

Any ideas how can I do that?

Thanks!!

r group-by summarize

edited Nov 21 '18 at 9:11

Sotos

30.2k51640

asked Nov 20 '18 at 11:29

Marta

134

edited Nov 21 '18 at 9:11

Sotos

30.2k51640

asked Nov 20 '18 at 11:29

Marta

134

edited Nov 21 '18 at 9:11

Sotos

30.2k51640

edited Nov 21 '18 at 9:11

Sotos

30.2k51640

edited Nov 21 '18 at 9:11

Sotos

30.2k51640

asked Nov 20 '18 at 11:29

Marta

134

asked Nov 20 '18 at 11:29

Marta

134

asked Nov 20 '18 at 11:29

Marta

134

add a comment |

1 Answer
1

active

oldest

votes

Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.

library(dplyr)



df %>% 

 mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>% 

 group_by(level) %>% 

 summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))

which gives,

# A tibble: 8 x 2

  level new_mean

  <dbl>    <dbl>

1     1       50

2     2        0

3     3        0

4     4        0

5     5        0

6     6        0

7     7        0

8     8        0

NOTE: You can replace 0 with NA as usual

answered Nov 21 '18 at 9:08

Sotos

30.2k51640

Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

– Marta
Nov 21 '18 at 11:55

Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

– Sotos
Nov 21 '18 at 13:27

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53392041%2fhow-to-create-new-data-frame-with-custom-columns-out-of-another-data-frame-on-f%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.

library(dplyr)



df %>% 

 mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>% 

 group_by(level) %>% 

 summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))

which gives,

# A tibble: 8 x 2

  level new_mean

  <dbl>    <dbl>

1     1       50

2     2        0

3     3        0

4     4        0

5     5        0

6     6        0

7     7        0

8     8        0

NOTE: You can replace 0 with NA as usual

answered Nov 21 '18 at 9:08

Sotos

30.2k51640

Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

– Marta
Nov 21 '18 at 11:55

Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

– Sotos
Nov 21 '18 at 13:27

add a comment |

Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.

library(dplyr)



df %>% 

 mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>% 

 group_by(level) %>% 

 summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))

which gives,

# A tibble: 8 x 2

  level new_mean

  <dbl>    <dbl>

1     1       50

2     2        0

3     3        0

4     4        0

5     5        0

6     6        0

7     7        0

8     8        0

NOTE: You can replace 0 with NA as usual

answered Nov 21 '18 at 9:08

Sotos

30.2k51640

Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

– Marta
Nov 21 '18 at 11:55

Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

– Sotos
Nov 21 '18 at 13:27

add a comment |

Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.

library(dplyr)



df %>% 

 mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>% 

 group_by(level) %>% 

 summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))

which gives,

# A tibble: 8 x 2

  level new_mean

  <dbl>    <dbl>

1     1       50

2     2        0

3     3        0

4     4        0

5     5        0

6     6        0

7     7        0

8     8        0

NOTE: You can replace 0 with NA as usual

answered Nov 21 '18 at 9:08

Sotos

30.2k51640

Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.

library(dplyr)



df %>% 

 mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>% 

 group_by(level) %>% 

 summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))

which gives,

# A tibble: 8 x 2

  level new_mean

  <dbl>    <dbl>

1     1       50

2     2        0

3     3        0

4     4        0

5     5        0

6     6        0

7     7        0

8     8        0

NOTE: You can replace 0 with NA as usual

answered Nov 21 '18 at 9:08

Sotos

30.2k51640

answered Nov 21 '18 at 9:08

Sotos

30.2k51640

answered Nov 21 '18 at 9:08

Sotos

30.2k51640

answered Nov 21 '18 at 9:08

Sotos

30.2k51640

Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

– Marta
Nov 21 '18 at 11:55

Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

– Sotos
Nov 21 '18 at 13:27

add a comment |

Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

– Marta
Nov 21 '18 at 11:55

Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

– Sotos
Nov 21 '18 at 13:27

Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

– Marta
Nov 21 '18 at 11:55

Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

– Sotos
Nov 21 '18 at 13:27

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Agfdhyk