Manipulating data into histogram like bins
I would like to change the format of my data for some specific code that I am working on. Below are the first 50 observations in their current format. Each individual has its own line with the observation number, species, length (mm), weight (kg), and the mesh size of the net it was caught in (inches).
fish_data <- read.table(header = TRUE,
text = "Index Species Length Weight mesh
1 SVCP 450 1.26 4
2 SVCP 584 2.24 3
3 SVCP 586 2.46 3
6 SVCP 590 2.4 3
7 SVCP 590 2.04 3
8 SVCP 594 2.62 3
9 SVCP 595 2.24 3
10 SVCP 595 2.04 3
11 SVCP 596 2.46 3
12 SVCP 603 2.6 3
13 SVCP 603 2.44 3
14 SVCP 604 2.68 3
15 SVCP 604 2.48 3
16 SVCP 606 2.06 3
17 SVCP 609 3.74 5
18 SVCP 609 2.44 3
20 SVCP 611 2.56 3
30 SVCP 618 2.52 3
31 SVCP 620 2.66 3
32 SVCP 620 2.66 3
33 SVCP 621 2.72 3
34 SVCP 625 2.8 3
36 SVCP 625 2.08 3
37 SVCP 626 2.74 3
38 SVCP 627 2.09 3
39 SVCP 627 2.82 3
40 SVCP 628 2.8 3
41 SVCP 630 2.68 3
42 SVCP 630 2.82 3
43 SVCP 637 3 3
45 SVCP 639 2.54 3
47 SVCP 640 3.01 3
49 SVCP 643 3.36 3
50 SVCP 644 6.82 4.25")
I would like to change the format to something like the example below, where the first column is the mesh size of the net and the subsequent columns are the number of observations in each length bin (for example 101-105 mm, 106-110 mm, 111-115 mm, etc.). I will be using 10 mm length bins.
52.5 52 11 1 1 0 0 0 0
54.5 102 91 16 4 4 2 0 3
56.5 295 232 131 61 17 13 3 1
58.5 309 318 362 243 95 26 4 3
60.5 118 173 326 342 199 100 10 11
62.5 79 87 191 239 202 201 39 15
64.5 27 48 111 143 133 185 72 25
66.5 14 17 44 51 52 122 74 41
68.5 8 6 14 23 25 59 65 76
70.5 7 3 8 14 15 16 34 33
72.5 0 3 1 2 5 4 6 15
r histogram bins
Please review how to share your data in a reproducible format
– Conor Neilson
Nov 20 '18 at 23:58
It is not clear how the second data table is related to the first data table? Please explain how rows in the second table are computed?
– TeeKea
Nov 21 '18 at 0:05
They are not related, it is an example of what I need to do. The rows are counts for a specific mesh size and the number of fish in a size bin. For example in the 1st row: the 1st value is the mesh size of the net (52.5 units of measure), the 2nd value (52) is the number of fish in a certain size bin caught in that net.
– fishy_stats
Nov 21 '18 at 0:12
hey fishy welcome to stack. next time you post a question use that read.table() pattern to share data
– Nate
Nov 21 '18 at 0:21
hist(..., plot = FALSE) will put your data into histogram bins. Specify breaks = c(...) for your bin intervals
– Umaomamaomao
Nov 21 '18 at 0:23
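The hist(..., plot = FALSE) suggestion in the last comment could be sketched roughly as below. This is only an illustration under assumptions: it uses a small hand-picked subset of the question's rows, and the names breaks, counts and bin_counts are made up here. The 10 mm bin edges and the right = FALSE choice (bins closed on the left) are also assumptions, not something the commenter specified.

```r
# Small subset of the question's data (lengths in mm, mesh in inches).
fish_data <- data.frame(
  Length = c(450, 584, 586, 590, 609, 609, 644),
  mesh   = c(4, 3, 3, 3, 5, 3, 4.25)
)

# Bin edges every 10 mm covering the whole length range (assumed choice).
# The +1 guarantees a bin above the maximum even if it is a multiple of 10.
breaks <- seq(floor(min(fish_data$Length) / 10) * 10,
              ceiling((max(fish_data$Length) + 1) / 10) * 10,
              by = 10)

# One histogram per mesh size, sharing the same breaks.
# right = FALSE makes each bin [lower, upper).
counts <- sapply(
  split(fish_data$Length, fish_data$mesh),
  function(x) hist(x, breaks = breaks, plot = FALSE, right = FALSE)$counts
)

# Transpose so rows are mesh sizes and columns are length bins,
# labelled by the lower edge of each bin.
bin_counts <- t(counts)
colnames(bin_counts) <- head(breaks, -1)

print(bin_counts)
```

Sharing one breaks vector across all mesh sizes is what keeps the columns aligned, so every row of bin_counts refers to the same set of length bins.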
edited Nov 21 '18 at 0:20
Nate
asked Nov 20 '18 at 23:51
fishy_stats
1 Answer
Here's an approach using dplyr and tidyr from the tidyverse meta-package. First I create a new variable Length_bin to assign the bin, then count how many fish of each mesh size are in each bin, then spread from long format to wide format. The code uses 5 mm bins; replace both 5s with 10 to get 10 mm bins.
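library(tidyverse)
fish_data %>%
  # label each fish with the lower edge of its 5 mm length bin
  mutate(Length_bin = floor(Length / 5) * 5) %>%
  # count fish per mesh size and length bin
  count(mesh, Length_bin) %>%
  # one column per bin; mesh/bin combinations with no fish become 0
  spread(Length_bin, n, fill = 0)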
# A tibble: 4 x 15
# mesh `450` `580` `585` `590` `595` `600` `605` `610` `615` `620` `625` `630` `635` `640`
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 3 0 1 1 3 3 4 2 1 1 3 6 2 2 2
#2 4 1 0 0 0 0 0 0 0 0 0 0 0 0 0
#3 4.25 0 0 0 0 0 0 0 0 0 0 0 0 0 1
#4 5 0 0 0 0 0 0 1 0 0 0 0 0 0 0
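Since the question asks for 10 mm bins and for empty bins to show up as zeros, a base R variant may also be useful. The sketch below is not part of the answer above: it swaps in cut() and table(), uses a small subset of the question's rows, and the names bin_width, edges, bin_table and wide are hypothetical.

```r
# Small subset of the question's data (lengths in mm, mesh in inches).
fish_data <- data.frame(
  Length = c(450, 584, 586, 590, 609, 609, 644),
  mesh   = c(4, 3, 3, 3, 5, 3, 4.25)
)

bin_width <- 10  # mm, as requested in the question

# Bin edges from just below the shortest fish to just above the longest.
lower <- floor(min(fish_data$Length) / bin_width) * bin_width
upper <- (floor(max(fish_data$Length) / bin_width) + 1) * bin_width
edges <- seq(lower, upper, by = bin_width)

# cut() assigns each length to a bin; right = FALSE gives [lower, upper).
# Bins are labelled by their lower edge.
fish_data$Length_bin <- cut(fish_data$Length, breaks = edges, right = FALSE,
                            labels = as.character(head(edges, -1)))

# Cross-tabulate mesh size against length bin. Because Length_bin is a
# factor, bins with no fish are kept and counted as 0.
bin_table <- table(mesh = fish_data$mesh, Length_bin = fish_data$Length_bin)

# Convert to a plain data frame with one row per mesh size.
wide <- as.data.frame.matrix(bin_table)

print(wide)
```

Unlike the floor() approach, which only creates columns for bins that contain at least one fish, this keeps every bin between the smallest and largest length, which is closer to the fixed-width layout shown in the question.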
answered Nov 21 '18 at 1:23
Jon Spring