R, Dplyr, Combine info by group and row/column specification
I want to create a new column that has combined info from two columns, but one column is on a different row. Below is an example dataframe I want to start with:
df <- data_frame(person = c(rep("Joe",4),rep("Bob",3)),
meal = c(seq(1:4),seq(1:3)),
food = c("Chicken", "Beef", "Soup and meal 2", "Lamb",
"Lamb","Salad and meal 1","Beef"),
dependencies = c(NA,NA,2,NA,NA,1,NA),
solo_meal = c(1,1,0,1,1,0,1))
I want to create a new column that looks like:
data_frame(combined_meal = c("Chicken", "Beef", "Soup and Beef", "Lamb",
"Lamb","Salad and Lamb","Beef"))
If a row has a dependency, I want to replace the "meal N" reference in its "food" with the food of the meal it depends on.
I have a large dataset with several dependencies that I need to combine into one field. I feel like there should be a simple way to do this, but I can't seem to come up with one.
Thanks!
edit:
I want to thank those who have commented so far. The tidyverse option worked best for my needs. I have one edit that I meant to add - when searching through the meals - I may need to add more than one meal together.
df <- data_frame(person = c(rep("Joe",4),rep("Bob",3)),
meal = c(seq(1:4),seq(1:3)),
food = c("Chicken", "Beef", "Soup and meal 2", "Lamb and meal 3",
"Lamb","Salad and meal 1","Beef"),
dependencies = c(NA,NA,2,3,NA,1,NA),
solo_meal = c(1,1,0,1,1,0,1))
which gives:
# A tibble: 7 x 5
  person  meal food             dependencies solo_meal
  <chr>  <int> <chr>                   <dbl>     <dbl>
1 Joe        1 Chicken                    NA         1
2 Joe        2 Beef                       NA         1
3 Joe        3 Soup and meal 2             2         0
4 Joe        4 Lamb and meal 3             3         1
5 Bob        1 Lamb                       NA         1
6 Bob        2 Salad and meal 1            1         0
7 Bob        3 Beef                       NA         1
I want to have a column of combined meals:
# A tibble: 7 x 1
combined_meal
<chr>
1 Chicken
2 Beef
3 Soup and Beef
4 Lamb and Soup and Beef
5 Lamb
6 Salad and Lamb
7 Beef
How do I recursively add the meals? Preferably using the tidyverse.
Thanks again!
r dplyr
Re your edit, why is there no dependency for Joe's meal 4?
– iod
Nov 14 at 23:55
I forgot to update that column in the edit. Should be fixed now.
– JoeShmo
Nov 15 at 16:09
edited Nov 15 at 16:08, asked Nov 13 at 0:49 by JoeShmo
3 Answers
A solution using the tidyverse. The idea is to self join the df table based on person, dependencies and meal, and then apply some further operations.
library(tidyverse)
df2 <- df %>%
  left_join(df %>% select(-dependencies, -solo_meal),
            by = c("person", "dependencies" = "meal")) %>%
  mutate(food.z = str_replace(food.x, "meal [0-9]", "")) %>%
  mutate(combined_meal = ifelse(is.na(food.y), food.z, str_c(food.z, food.y, sep = ""))) %>%
  rename(food = food.x) %>%
  select(names(df), combined_meal)
df2
# # A tibble: 7 x 6
#   person  meal food             dependencies solo_meal combined_meal
#   <chr>  <int> <chr>                   <dbl>     <dbl> <chr>
# 1 Joe        1 Chicken                    NA         1 Chicken
# 2 Joe        2 Beef                       NA         1 Beef
# 3 Joe        3 Soup and meal 2             2         0 Soup and Beef
# 4 Joe        4 Lamb                       NA         1 Lamb
# 5 Bob        1 Lamb                       NA         1 Lamb
# 6 Bob        2 Salad and meal 1            1         0 Salad and Lamb
# 7 Bob        3 Beef                       NA         1 Beef
This works the best of the answers so far, but I'm not sure how to implement it with the updated example I gave. I think I need a second round of joining, but that seems like it will cause more headaches than anything.
– JoeShmo
Nov 13 at 22:15
@JoeShmo I don't know how to solve your updated question. If my answer solved your original question, perhaps you can accept my answer and then post a new question with your updated question. By doing that, more people are likely to see your question and help you.
– www
Nov 14 at 14:32
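For the updated example, where a dependency can point at a meal that has its own dependency, one way to avoid repeated rounds of joining is to follow the chain within each person. What follows is only a sketch and was not proposed in this thread. The helper resolve_meal and its arguments (m, meals, foods, deps) are hypothetical names. It assumes that every dependent food ends in "meal N", that each dependency refers to an existing meal of the same person, and that the dependencies contain no cycles.

```r
library(dplyr)

# Updated example data from the question (solo_meal omitted for brevity)
df <- tibble(
  person = c(rep("Joe", 4), rep("Bob", 3)),
  meal = c(1:4, 1:3),
  food = c("Chicken", "Beef", "Soup and meal 2", "Lamb and meal 3",
           "Lamb", "Salad and meal 1", "Beef"),
  dependencies = c(NA, NA, 2, 3, NA, 1, NA)
)

# Hypothetical helper: expand the food of meal `m` for one person,
# following the dependency chain until a meal has no dependency.
# Assumes the chain is acyclic; a cycle would recurse forever.
resolve_meal <- function(m, meals, foods, deps) {
  i <- match(m, meals)
  if (is.na(deps[i])) {
    return(foods[i])
  }
  # Drop the trailing "meal N" reference, then append the resolved dependency
  prefix <- sub("meal [0-9]+$", "", foods[i])
  paste0(prefix, resolve_meal(deps[i], meals, foods, deps))
}

df2 <- df %>%
  group_by(person) %>%
  mutate(combined_meal = vapply(meal, resolve_meal, character(1),
                                meals = meal, foods = food,
                                deps = dependencies)) %>%
  ungroup()

df2$combined_meal
```

On the updated data this should give "Lamb and Soup and Beef" for Joe's meal 4. The recursion is as deep as the longest dependency chain, so an iterative version may be preferable if chains can get very long.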
This is a base solution. (I find base solutions easier to understand.) You make an index vector of rows to modify and then build a new value from the items to be modified and the ones immediately preceding them (which, from your example, appears to be the assigned task).
idx <- which(grepl("meal", df$food))
# paste0 avoids a double space, since the prefix already ends in "and "
df[idx, "combined_meal"] <-
  paste0(sub("meal.*$", "", df$food[idx]), df$food[idx - 1])
# Then fill in NAs with the original `food` values
df$combined_meal[is.na(df$combined_meal)] <-
  df$food[is.na(df$combined_meal)]
> df
# A tibble: 7 x 6
  person  meal food             dependencies solo_meal combined_meal
  <chr>  <int> <chr>                   <dbl>     <dbl> <chr>
1 Joe        1 Chicken                    NA         1 Chicken
2 Joe        2 Beef                       NA         1 Beef
3 Joe        3 Soup and meal 2             2         0 Soup and Beef
4 Joe        4 Lamb                       NA         1 Lamb
5 Bob        1 Lamb                       NA         1 Lamb
6 Bob        2 Salad and meal 1            1         0 Salad and Lamb
7 Bob        3 Beef                       NA         1 Beef
>
That seems like a big assumption to make, and not part of the OP's description of the problem.
– iod
Nov 13 at 1:52
Agreed it was an assumption but one would need to make some sort of assumption about what the intended replacement would be, since THEY WERE NOT DESCRIBED.
– 42-
Nov 13 at 2:36
I think it's quite obvious that the replacement is the meal with the number that appears under "Dependencies" for the meal to be combined (e.g., for Joe, the dependency on row 3 is 2, which is beef).
– iod
Nov 13 at 2:39
It obviously was not obvious to me.
– 42-
Nov 13 at 2:42
I like the simplicity of the base example, but I need to move between more than 1 line at a time. I'm also not sure if it fits the updated example I gave.
– JoeShmo
Nov 13 at 22:14
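To reach a dependency that sits more than one row away, the lookup can be driven by the dependencies column rather than by the preceding row. The following is a minimal base R sketch under that reading of the question, not the answerer's code. The names key and dep_row are illustrative, and it resolves only one level of dependency, so it covers the original example but not chained dependencies.

```r
# Original example data from the question (solo_meal omitted for brevity)
df <- data.frame(
  person = c(rep("Joe", 4), rep("Bob", 3)),
  meal = c(1:4, 1:3),
  food = c("Chicken", "Beef", "Soup and meal 2", "Lamb",
           "Lamb", "Salad and meal 1", "Beef"),
  dependencies = c(NA, NA, 2, NA, NA, 1, NA),
  stringsAsFactors = FALSE
)

# Rows that depend on another meal
idx <- which(!is.na(df$dependencies))

# Find the referenced row by a person/meal key instead of by position
key <- paste(df$person, df$meal)
dep_row <- match(paste(df$person[idx], df$dependencies[idx]), key)

# Start from the plain food, then overwrite the dependent rows
df$combined_meal <- df$food
df$combined_meal[idx] <- paste0(sub("meal [0-9]+$", "", df$food[idx]),
                                df$food[dep_row])

df$combined_meal
```

Because the match is on person and meal, the referenced meal can be any number of rows away and the rows do not need to be sorted.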
Single line solution (using dplyr):
df %>% group_by(person) %>%
  mutate(combined_meal = ifelse(!is.na(dependencies), paste0(gsub("(.* and ).*", "\\1", food), food[dependencies]), food))
For each person, we create a column combined_meal where, if there are no dependencies, it repeats whatever's in food, and if there is one, it pastes together everything up to and including the word "and" with whatever's in the food column at the row number of the dependency.
(Note this assumes that the number in dependencies is identical to the row number of the data frame if we only get the data frame for that person. That also implies the data frame is sorted by meal. If that assumption is incorrect, you can include the line arrange(meal) after the group_by.)
Result:
# A tibble: 7 x 6
# Groups: person [2]
  person  meal food             dependencies solo_meal combined_meal
  <chr>  <int> <chr>                   <dbl>     <dbl> <chr>
1 Joe        1 Chicken                    NA        1. Chicken
2 Joe        2 Beef                       NA        1. Beef
3 Joe        3 Soup and meal 2            2.        0. Soup and Beef
4 Joe        4 Lamb                       NA        1. Lamb
5 Bob        1 Lamb                       NA        1. Lamb
6 Bob        2 Salad and meal 1           1.        0. Salad and Lamb
7 Bob        3 Beef                       NA        1. Beef
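If the row order within a person cannot be relied on, the positional lookup food[dependencies] can be swapped for a lookup by meal number. The sketch below is a variation under that assumption rather than the code from the answer above. It uses match(dependencies, meal) inside the grouped mutate, and dep_food is a temporary, illustrative column name. The example rows are deliberately out of order.

```r
library(dplyr)

# Original example, deliberately shuffled to show that position is not used
df <- tibble(
  person = c("Joe", "Bob", "Joe", "Bob", "Joe", "Bob", "Joe"),
  meal = c(3, 2, 1, 1, 4, 3, 2),
  food = c("Soup and meal 2", "Salad and meal 1", "Chicken", "Lamb",
           "Lamb", "Beef", "Beef"),
  dependencies = c(2, 1, NA, NA, NA, NA, NA)
)

df3 <- df %>%
  group_by(person) %>%
  mutate(
    # Food of the meal whose number equals this row's dependency (NA if none)
    dep_food = food[match(dependencies, meal)],
    combined_meal = if_else(
      is.na(dependencies),
      food,
      paste0(sub("meal [0-9]+$", "", food), dep_food)
    )
  ) %>%
  ungroup() %>%
  select(-dep_food)

df3
```

Like the one-liner, this resolves a single level of dependency only, but it does not need arrange(meal) and tolerates gaps in the meal numbering.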
answered Nov 13 at 1:30 by www (tidyverse self-join answer)
edited Nov 13 at 1:33, answered Nov 13 at 1:25 by 42- (base R answer)
edited Nov 13 at 2:03
answered Nov 13 at 1:45
iod
Re your edit, why is there no dependency for Joe's meal 4?
– iod
Nov 14 at 23:55
I forgot to update that column in the edit. Should be fixed now.
– JoeShmo
Nov 15 at 16:09