Split an array into a list of arrays
How can I split a 2D array by a grouping variable and return a list of arrays? The order is also important.
To show the expected outcome, the equivalent in R can be done as:
> (A = matrix(c("a", "b", "a", "c", "b", "d"), nr=3, byrow=TRUE)) # input
     [,1] [,2]
[1,] "a"  "b"
[2,] "a"  "c"
[3,] "b"  "d"
> (split.data.frame(A, A[,1])) # output
$a
     [,1] [,2]
[1,] "a"  "b"
[2,] "a"  "c"

$b
     [,1] [,2]
[1,] "b"  "d"
EDIT: To clarify: I'd like to split the array/matrix, A into a list of multiple arrays based on the unique values in the first column. That is, split A into one array where the first column has an a, and another array where the first column has a b.
I have tried the approach from "Python equivalent of R "split"-function", but this gives three arrays:
import numpy as np
import itertools
A = np.array([["a", "b"], ["a", "c"], ["b", "d"]])
b = A[:,0]
def split(x, f):
    return list(itertools.compress(x, f)), list(itertools.compress(x, (not i for i in f)))
split(A, b)
([array(['a', 'b'], dtype='<U1'),
  array(['a', 'c'], dtype='<U1'),
  array(['b', 'd'], dtype='<U1')],
 [])
I also tried numpy.split, using np.split(A, b), but it needs integers. I thought I might be able to use "How to convert strings into integers in Python?" to convert the letters to integers, but even if I pass integers, it doesn't split as expected:
c = np.transpose(np.array([1,1,2]))
np.split(A, c) # returns 4 arrays
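For reference, np.split treats its second argument as the row indices at which to cut, not as group labels, which is why [1, 1, 2] produces four pieces. A minimal sketch of that behaviour, using the same A as above (the names pieces, shapes and groups are only illustrative):

```python
import numpy as np

A = np.array([["a", "b"], ["a", "c"], ["b", "d"]])

# np.split cuts before each listed row index, so [1, 1, 2] means
# A[:1], A[1:1] (empty), A[1:2] and A[2:], i.e. four pieces.
pieces = np.split(A, [1, 1, 2])
shapes = [p.shape for p in pieces]

# For this particular, already grouped example, a single cut before
# row 2 reproduces the two groups from the R output.
groups = np.split(A, [2])
```

So np.split can still be used, but only once the labels have been turned into cut positions and rows with equal labels are adjacent.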
Can this be done? Thanks.
EDIT: Please note that this is a small example; the number of groups may be greater than two, and they may not be ordered.
python arrays numpy
edited Nov 14 '18 at 18:57
asked Nov 14 '18 at 18:42
user2957945
Not sure I understand your expected output @user2957945
– RafaelC
Nov 14 '18 at 18:45
okay, thanks @RafaelC -- I'll clarify
– user2957945
Nov 14 '18 at 18:46
2 Answers
You can use pandas:
import pandas as pd
import numpy as np
a = np.array([["a", "b"], ["a", "c"], ["b", "d"]])
listofdfs = {}
for n,g in pd.DataFrame(a).groupby(0):
    listofdfs[n] = g
listofdfs['a'].values
Output:
array([['a', 'b'],
       ['a', 'c']], dtype=object)
And,
listofdfs['b'].values
Output:
array([['b', 'd']], dtype=object)
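As a side note (not part of the original answer): if the groups should come out in order of first appearance rather than sorted by key, groupby accepts sort=False. A small sketch, assuming a reasonably recent pandas; the input data and the name groups are made up for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative input where "b" appears before "a".
a = np.array([["b", "d"], ["a", "b"], ["a", "c"]])

# sort=False keeps the groups in order of first appearance;
# to_numpy() converts each sub-frame back to a plain array.
groups = {key: g.to_numpy() for key, g in pd.DataFrame(a).groupby(0, sort=False)}
```

Rows inside each group keep their original relative order either way.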
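Or, you could use itertools.groupby. Note that it only groups consecutive rows, so sort by the first column beforehand if rows with equal keys are not adjacent: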
import numpy as np
from itertools import groupby
l = [np.stack(list(g)) for k, g in groupby(a, lambda x: x[0])]
l[0]
Output:
array([['a', 'b'],
       ['a', 'c']], dtype='<U1')
And,
l[1]
Output:
array([['b', 'd']], dtype='<U1')
edited Nov 14 '18 at 19:24
answered Nov 14 '18 at 19:08
Scott Boston
Great, thanks Scott, that looks good. I'd considered coercing to a dataframe but I thought there may be array tools -- but this is good.
– user2957945
Nov 14 '18 at 19:14
@user2957945 I did add an update.
– Scott Boston
Nov 14 '18 at 19:27
Brilliant, thank you very much. I'm traipsing through stackoverflow.com/questions/773/…, so your edit gives me the output for my understanding to work towards.
– user2957945
Nov 14 '18 at 19:30
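Regarding the "array tools" mentioned in the comments above, a NumPy-only version is possible. The following is only a sketch: split_by_first_column is a hypothetical helper name and the input data is made up. It sorts the rows stably by the key column and then cuts at the first row of each group, so unordered input and more than two groups should both work:

```python
import numpy as np

def split_by_first_column(arr):
    """Hypothetical helper: split the rows of a 2D array by the value in column 0."""
    keys = arr[:, 0]
    # A stable sort keeps the original row order inside each group.
    order = np.argsort(keys, kind="stable")
    sorted_keys = keys[order]
    # Position of the first row of each group in the sorted array.
    uniq, starts = np.unique(sorted_keys, return_index=True)
    chunks = np.split(arr[order], starts[1:])
    return dict(zip(uniq.tolist(), chunks))

# Illustrative input with interleaved groups.
a = np.array([["a", "b"], ["b", "d"], ["a", "c"], ["b", "e"]])
groups = split_by_first_column(a)
```

The groups come back keyed by label in sorted key order, much like the $a, $b list from R's split.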
If I understand your question, you can do simple slicing, as in:
import numpy as np
a = np.array([["a", "b"], ["a", "c"], ["b", "d"]])
x, y = a[:2, :], a[2, :]
x
array([['a', 'b'],
       ['a', 'c']], dtype='<U1')
y
array(['b', 'd'], dtype='<U1')
answered Nov 14 '18 at 18:49
G. Anderson
Hi G.Anderson, thank you for your answer. This would fail for a = np.array([["a", "b"], ["b", "d"], ["a", "c"], ["b", "d"]]), or if there were more groups. Apologies, maybe my example was too minimal.
– user2957945
Nov 14 '18 at 18:55
I see. I answered before you edited about grouping the splits based on value. Perhaps this answer might help?
– G. Anderson
Nov 14 '18 at 19:07
Thanks. That looks promising-- I'm just trying to tweak it to my example.
– user2957945
Nov 14 '18 at 19:13
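For the four-row array from the comment above, fixed slices no longer work, but selecting rows with a boolean mask per key might. A minimal sketch (the names keys and groups are illustrative, and this is not necessarily what the linked answer does):

```python
import numpy as np

a = np.array([["a", "b"], ["b", "d"], ["a", "c"], ["b", "d"]])

keys = a[:, 0]
# One boolean mask per distinct key; the selected rows keep
# their original relative order.
groups = [a[keys == k] for k in np.unique(keys)]
```

This scans the key column once per group, which is fine for a handful of groups but may get slow when there are many.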