Make a numpy array monotonic without a Python loop
I have a 1D array of values which is supposed to be monotonic (let's say decreasing), but there are random regions where the value increases with index.
I need an array in which each such region is replaced with the value directly preceding it, so that the resulting array is sorted.
So if the given array is:
a = np.array([10.0, 9.5, 8.0, 7.2, 7.8, 8.0, 7.0, 5.0, 3.0, 2.5, 3.0, 2.0])
I want the result to be
b = np.array([10.0, 9.5, 8.0, 7.2, 7.2, 7.2, 7.0, 5.0, 3.0, 2.5, 2.5, 2.0])
Here's a graphical representation:
I know how to achieve it with a Python loop, but is there a way to do this with NumPy machinery?
Python code for clarity:
b = np.array(a)
for i in range(1, b.size):
    if b[i] > b[i-1]:
        b[i] = b[i-1]
python arrays numpy vectorization
Why the concern for "without a loop"? Whether you write an explicit loop, or the loop is done in an imported function from some module/package, it's still there. There aren't very many ways to do something to a series of values that don't involve a loop, unless you want to completely unroll the entire loop into a linear sequence of operations, which is ugly for several different reasons - portability, flexibility, code size, etc...
– twalberg
Feb 17 '15 at 15:00
@twalberg I think it's common to try to avoid Python loops when using NumPy, because the performance generally improves if the iteration is done inside functions implemented in C. It also often happens that the code is shorter and cleaner.
– Lev Levitsky
Feb 17 '15 at 15:04
That's a valid point when working with large data sets. However, in this example (and without any indication that the "real" problem is orders of magnitude larger), I think the overhead of converting the Python list into a data structure that the C loop can work on, and then converting it back into the appropriate Python data structure, probably voids any potential gain from not just using a Python loop to iterate over a dozen entries... Better to validate that the loop is a problem before just blindly trying to eliminate it...
– twalberg
Feb 17 '15 at 15:10
@twalberg Fair enough; I should have mentioned that the real data is indeed thousands of elements in size and already in the form of a NumPy array. This question, however, also has an educational purpose.
– Lev Levitsky
Feb 17 '15 at 15:16
asked Feb 17 '15 at 14:30 by Lev Levitsky
edited Jan 28 '17 at 10:56 by Alex Riley
1 Answer
You can use np.minimum.accumulate to collect the minimum values as you move through the array:
>>> np.minimum.accumulate(a)
array([ 10. ,   9.5,   8. ,   7.2,   7.2,   7.2,   7. ,   5. ,   3. ,
         2.5,   2.5,   2. ])
At each element in the array, this function returns the minimum value seen so far.
If you wanted an array to be monotonic increasing, you could use np.maximum.accumulate.
Many other universal functions in NumPy have an accumulate method to simulate looping through an array, applying the function to each element and collecting the returned values into an array of the same size.
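As a quick sanity check of the above, here is a minimal sketch that compares np.minimum.accumulate with the explicit loop from the question on the same sample array. The helper name monotonic_loop is made up for this illustration, and reversing the array for the np.maximum.accumulate case is just a convenient way to get mostly increasing input.

```python
import numpy as np

a = np.array([10.0, 9.5, 8.0, 7.2, 7.8, 8.0, 7.0, 5.0, 3.0, 2.5, 3.0, 2.0])


def monotonic_loop(values):
    """Reference implementation: the explicit loop from the question."""
    out = np.array(values)  # copy, so the input array is left untouched
    for i in range(1, out.size):
        if out[i] > out[i - 1]:
            out[i] = out[i - 1]
    return out


# Running minimum: the result never increases with index.
decreasing = np.minimum.accumulate(a)

# Running maximum on the reversed array: the result never decreases with index.
increasing = np.maximum.accumulate(a[::-1])

# The vectorized result should agree with the loop element for element.
assert np.array_equal(decreasing, monotonic_loop(a))
assert np.all(np.diff(decreasing) <= 0)
assert np.all(np.diff(increasing) >= 0)
```

Both accumulate calls return a new array and leave a unchanged, which matches the copy made by np.array(a) in the loop version.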
Wow. I didn't know about .accumulate! This is very close to Python 2 reduce (or Python 3 functools.reduce), am I correct?
– Lev Levitsky
Feb 17 '15 at 14:45
It's very similar - accumulate stores the result of the operation on each element, returning an array of the same length, whereas the ufunc's reduce method just shows the final result (collapsing the array).
– Alex Riley
Feb 17 '15 at 14:48
Oh yes, you're right. So reduce would basically return the last element of what accumulate returns.
– Lev Levitsky
Feb 17 '15 at 14:49
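The relationship discussed in these comments can be sketched as follows. This is only an illustration on the sample array from the question; the pure-Python counterparts use functools.reduce and itertools.accumulate from the standard library, and the variable names are chosen for this example.

```python
from functools import reduce
from itertools import accumulate

import numpy as np

a = np.array([10.0, 9.5, 8.0, 7.2, 7.8, 8.0, 7.0, 5.0, 3.0, 2.5, 3.0, 2.0])

running_min = np.minimum.accumulate(a)  # one value per input element
overall_min = np.minimum.reduce(a)      # a single value, the array collapsed

# reduce gives the last element of what accumulate returns.
assert overall_min == running_min[-1]

# Pure-Python counterparts, applied to a plain list of floats.
py_running = list(accumulate(a.tolist(), min))
py_overall = reduce(min, a.tolist())

assert py_running == running_min.tolist()
assert py_overall == overall_min
```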
answered Feb 17 '15 at 14:39 by Alex Riley
edited Feb 26 '15 at 23:05