How to create new data frame with custom columns out of another data frame, on filtered data per each column












1















I've tried to look for a similar question but I couldn't find anything similar.



I have a data frame df with hundreds of rows and several variables. The first variable is level which goes from 1 to 8.



Eg:



df<- data.frame(level = c(1,1,1,2,2,3,4,5,6,7,8), CODE = c("1234", "3452", "1234", "7654","6547","6546", "7683", "6543", "7683", "6543", "7683"), ADD_ALLOW_MEAL = c(NA, "Y", "Y", "N", "N", NA, NA, "Y", "Y", "N", 'N'), ALLOW_MEALLOW = c(NA, 40, 60, NA, NA, NA, NA, 50, 70, NA, NA)) 

> df
level CODE ADD_ALLOW_MEAL ALLOW_MEALLOW
1 1 1234 <NA> NA
2 1 3452 Y 40
3 1 1234 Y 60
4 2 7654 N NA
5 2 6547 N NA
6 3 6546 <NA> NA
7 4 7683 <NA> NA
8 5 6543 Y 50
9 6 7683 Y 70
10 7 6543 N NA
11 8 7683 N NA


What I need is to create a new data frame which will have only 8 rows (8 levels from df). Normally I'd use simple:



df %>% 
group_by(level) %>%
summarise()


The problem is I need to create several very custom columns on the filtered data, all per level.



Example:



df %>% 
group_by(level) %>%
summarise(
Meal_Average = filter(., ADD_ALLOW_MEAL =="Y" & ALLOW_MEALLOW>0) %>% {ifelse(str_detect(.$CODE, "2")=="TRUE", round(mean(.$ALLOW_MEALLOW, na.rm = TRUE),3), NA_real_ )}
)


I get an error of:




Column `Meal_Average` must be length 1 (a summary value), not 4



My desired result would be:



 level    Meal_Average 
1 1 50
2 2 NA
3 3 NA
4 4 NA
5 5 NA
6 6 NA
7 7 NA
8 8 NA


Any ideas how can I do that?



Thanks!!










share|improve this question





























    1















    I've tried to look for a similar question but I couldn't find anything similar.



    I have a data frame df with hundreds of rows and several variables. The first variable is level which goes from 1 to 8.



    Eg:



    df<- data.frame(level = c(1,1,1,2,2,3,4,5,6,7,8), CODE = c("1234", "3452", "1234", "7654","6547","6546", "7683", "6543", "7683", "6543", "7683"), ADD_ALLOW_MEAL = c(NA, "Y", "Y", "N", "N", NA, NA, "Y", "Y", "N", 'N'), ALLOW_MEALLOW = c(NA, 40, 60, NA, NA, NA, NA, 50, 70, NA, NA)) 

    > df
    level CODE ADD_ALLOW_MEAL ALLOW_MEALLOW
    1 1 1234 <NA> NA
    2 1 3452 Y 40
    3 1 1234 Y 60
    4 2 7654 N NA
    5 2 6547 N NA
    6 3 6546 <NA> NA
    7 4 7683 <NA> NA
    8 5 6543 Y 50
    9 6 7683 Y 70
    10 7 6543 N NA
    11 8 7683 N NA


    What I need is to create a new data frame which will have only 8 rows (8 levels from df). Normally I'd use simple:



    df %>% 
    group_by(level) %>%
    summarise()


    The problem is I need to create several very custom columns on the filtered data, all per level.



    Example:



    df %>% 
    group_by(level) %>%
    summarise(
    Meal_Average = filter(., ADD_ALLOW_MEAL =="Y" & ALLOW_MEALLOW>0) %>% {ifelse(str_detect(.$CODE, "2")=="TRUE", round(mean(.$ALLOW_MEALLOW, na.rm = TRUE),3), NA_real_ )}
    )


    I get an error of:




    Column `Meal_Average` must be length 1 (a summary value), not 4



    My desired result would be:



     level    Meal_Average 
    1 1 50
    2 2 NA
    3 3 NA
    4 4 NA
    5 5 NA
    6 6 NA
    7 7 NA
    8 8 NA


    Any ideas how can I do that?



    Thanks!!










    share|improve this question



























      1












      1








      1








      I've tried to look for a similar question but I couldn't find anything similar.



      I have a data frame df with hundreds of rows and several variables. The first variable is level which goes from 1 to 8.



      Eg:



      df<- data.frame(level = c(1,1,1,2,2,3,4,5,6,7,8), CODE = c("1234", "3452", "1234", "7654","6547","6546", "7683", "6543", "7683", "6543", "7683"), ADD_ALLOW_MEAL = c(NA, "Y", "Y", "N", "N", NA, NA, "Y", "Y", "N", 'N'), ALLOW_MEALLOW = c(NA, 40, 60, NA, NA, NA, NA, 50, 70, NA, NA)) 

      > df
      level CODE ADD_ALLOW_MEAL ALLOW_MEALLOW
      1 1 1234 <NA> NA
      2 1 3452 Y 40
      3 1 1234 Y 60
      4 2 7654 N NA
      5 2 6547 N NA
      6 3 6546 <NA> NA
      7 4 7683 <NA> NA
      8 5 6543 Y 50
      9 6 7683 Y 70
      10 7 6543 N NA
      11 8 7683 N NA


      What I need is to create a new data frame which will have only 8 rows (8 levels from df). Normally I'd use simple:



      df %>% 
      group_by(level) %>%
      summarise()


      The problem is I need to create several very custom columns on the filtered data, all per level.



      Example:



      df %>% 
      group_by(level) %>%
      summarise(
      Meal_Average = filter(., ADD_ALLOW_MEAL =="Y" & ALLOW_MEALLOW>0) %>% {ifelse(str_detect(.$CODE, "2")=="TRUE", round(mean(.$ALLOW_MEALLOW, na.rm = TRUE),3), NA_real_ )}
      )


      I get an error of:




      Column `Meal_Average` must be length 1 (a summary value), not 4



      My desired result would be:



       level    Meal_Average 
      1 1 50
      2 2 NA
      3 3 NA
      4 4 NA
      5 5 NA
      6 6 NA
      7 7 NA
      8 8 NA


      Any ideas how can I do that?



      Thanks!!










      share|improve this question
















      I've tried to look for a similar question but I couldn't find anything similar.



      I have a data frame df with hundreds of rows and several variables. The first variable is level which goes from 1 to 8.



      Eg:



      df<- data.frame(level = c(1,1,1,2,2,3,4,5,6,7,8), CODE = c("1234", "3452", "1234", "7654","6547","6546", "7683", "6543", "7683", "6543", "7683"), ADD_ALLOW_MEAL = c(NA, "Y", "Y", "N", "N", NA, NA, "Y", "Y", "N", 'N'), ALLOW_MEALLOW = c(NA, 40, 60, NA, NA, NA, NA, 50, 70, NA, NA)) 

      > df
      level CODE ADD_ALLOW_MEAL ALLOW_MEALLOW
      1 1 1234 <NA> NA
      2 1 3452 Y 40
      3 1 1234 Y 60
      4 2 7654 N NA
      5 2 6547 N NA
      6 3 6546 <NA> NA
      7 4 7683 <NA> NA
      8 5 6543 Y 50
      9 6 7683 Y 70
      10 7 6543 N NA
      11 8 7683 N NA


      What I need is to create a new data frame which will have only 8 rows (8 levels from df). Normally I'd use simple:



      df %>% 
      group_by(level) %>%
      summarise()


      The problem is I need to create several very custom columns on the filtered data, all per level.



      Example:



      df %>% 
      group_by(level) %>%
      summarise(
      Meal_Average = filter(., ADD_ALLOW_MEAL =="Y" & ALLOW_MEALLOW>0) %>% {ifelse(str_detect(.$CODE, "2")=="TRUE", round(mean(.$ALLOW_MEALLOW, na.rm = TRUE),3), NA_real_ )}
      )


      I get an error of:




      Column `Meal_Average` must be length 1 (a summary value), not 4



      My desired result would be:



       level    Meal_Average 
      1 1 50
      2 2 NA
      3 3 NA
      4 4 NA
      5 5 NA
      6 6 NA
      7 7 NA
      8 8 NA


      Any ideas how can I do that?



      Thanks!!







      r group-by summarize






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 21 '18 at 9:11









      Sotos

      30.2k51640




      30.2k51640










      asked Nov 20 '18 at 11:29









      MartaMarta

      134




      134
























          1 Answer
          1






          active

          oldest

          votes


















          0














          Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.



          library(dplyr)

          df %>%
          mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>%
          group_by(level) %>%
          summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))


          which gives,




          # A tibble: 8 x 2
          level new_mean
          <dbl> <dbl>
          1 1 50
          2 2 0
          3 3 0
          4 4 0
          5 5 0
          6 6 0
          7 7 0
          8 8 0



          NOTE: You can replace 0 with NA as usual






          share|improve this answer
























          • Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

            – Marta
            Nov 21 '18 at 11:55











          • Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

            – Sotos
            Nov 21 '18 at 13:27











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53392041%2fhow-to-create-new-data-frame-with-custom-columns-out-of-another-data-frame-on-f%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          0














          Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.



          library(dplyr)

          df %>%
          mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>%
          group_by(level) %>%
          summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))


          which gives,




          # A tibble: 8 x 2
          level new_mean
          <dbl> <dbl>
          1 1 50
          2 2 0
          3 3 0
          4 4 0
          5 5 0
          6 6 0
          7 7 0
          8 8 0



          NOTE: You can replace 0 with NA as usual






          share|improve this answer
























          • Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

            – Marta
            Nov 21 '18 at 11:55











          • Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

            – Sotos
            Nov 21 '18 at 13:27
















          0














          Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.



          library(dplyr)

          df %>%
          mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>%
          group_by(level) %>%
          summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))


          which gives,




          # A tibble: 8 x 2
          level new_mean
          <dbl> <dbl>
          1 1 50
          2 2 0
          3 3 0
          4 4 0
          5 5 0
          6 6 0
          7 7 0
          8 8 0



          NOTE: You can replace 0 with NA as usual






          share|improve this answer
























          • Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

            – Marta
            Nov 21 '18 at 11:55











          • Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

            – Sotos
            Nov 21 '18 at 13:27














          0












          0








          0







          Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.



          library(dplyr)

          df %>%
          mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>%
          group_by(level) %>%
          summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))


          which gives,




          # A tibble: 8 x 2
          level new_mean
          <dbl> <dbl>
          1 1 50
          2 2 0
          3 3 0
          4 4 0
          5 5 0
          6 6 0
          7 7 0
          8 8 0



          NOTE: You can replace 0 with NA as usual






          share|improve this answer













          Here is an idea using dplyr. What I did here is I just used your conditions to replace the ALLOW_MEALLOW value to 0 (so that it does not impact the mean), i.e.



          library(dplyr)

          df %>%
          mutate(ALLOW_MEALLOW = replace(ALLOW_MEALLOW, ADD_ALLOW_MEAL == 'N' & ALLOW_MEALLOW < 0 | !grepl('2', CODE), 0)) %>%
          group_by(level) %>%
          summarise(new_mean = mean(ALLOW_MEALLOW, na.rm = TRUE))


          which gives,




          # A tibble: 8 x 2
          level new_mean
          <dbl> <dbl>
          1 1 50
          2 2 0
          3 3 0
          4 4 0
          5 5 0
          6 6 0
          7 7 0
          8 8 0



          NOTE: You can replace 0 with NA as usual







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 21 '18 at 9:08









          SotosSotos

          30.2k51640




          30.2k51640













          • Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

            – Marta
            Nov 21 '18 at 11:55











          • Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

            – Sotos
            Nov 21 '18 at 13:27



















          • Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

            – Marta
            Nov 21 '18 at 11:55











          • Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

            – Sotos
            Nov 21 '18 at 13:27

















          Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

          – Marta
          Nov 21 '18 at 11:55





          Thanks for your answer, this is a new way for me to use replace, very useful! Do you know how could I use that solution to create more custom columns? I need plenty of them in one new data frame (that's why I wanted to use one "summarize"). In your solution using mutate it would be several separate data frames, right? Thanks!

          – Marta
          Nov 21 '18 at 11:55













          Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

          – Sotos
          Nov 21 '18 at 13:27





          Hi @Marta. No, mutate does not create data frames. Try iris %>% group_by(Species) %>% summarise(new = mean(Sepal.Width)) Vs iris %>% group_by(Species) %>% mutate(new = mean(Sepal.Width)). If you want to apply the function to more columns there are mutate_at, mutate_if, mutate_all (same with summarise)

          – Sotos
          Nov 21 '18 at 13:27




















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53392041%2fhow-to-create-new-data-frame-with-custom-columns-out-of-another-data-frame-on-f%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Guess what letter conforming each word

          Port of Spain

          Run scheduled task as local user group (not BUILTIN)