Most efficient way to convert values of column in Pandas DataFrame
I have a pd.DataFrame that looks like DF_test in the code below.
I want to apply a cutoff to the values to turn them into binary digits; my cutoff in this case is 0.85. I want the resulting DataFrame to look like DF_want in the code below.
The script I wrote to do this is easy to understand but for large datasets it is inefficient. I'm sure Pandas has some way of taking care of these types of transformations.
Does anyone know of an efficient way to convert a column of floats to a column of integers using a threshold?
My extremely naive way of doing such a thing:
    import numpy as np
    import pandas as pd

    DF_test = pd.DataFrame(np.array([list("abcde"), list("pqrst"), [0.12, 0.23, 0.93, 0.86, 0.33]]).T, columns=["c1", "c2", "value"])
    DF_want = pd.DataFrame(np.array([list("abcde"), list("pqrst"), [0, 0, 1, 1, 0]]).T, columns=["c1", "c2", "value"])
    threshold = 0.85

    # Empty dataframe to append rows
    # (.ix and DataFrame.append were removed in pandas 1.0 and 2.0 respectively)
    DF_naive = pd.DataFrame()
    for i in range(DF_test.shape[0]):
        # Get first 2 columns
        first2cols = list(DF_test.ix[i][:-1])
        # Check if value is greater than threshold
        binary_value = [int((bool(float(DF_test.ix[i][-1]) > threshold)))]
        # Create series object
        SR_row = pd.Series(first2cols + binary_value, name=i)
        # Add to empty dataframe container
        DF_naive = DF_naive.append(SR_row)
    # Relabel columns
    DF_naive.columns = DF_test.columns
    DF_naive.head()
    # the sample DF_want
python pandas int dataframe
asked Feb 25 '16 at 22:22 by O.rka
Can't you just do df['value'] = np.where(df['value'] > 0.85, 1, 0)? This will convert and set the entire column.
– EdChum, Feb 25 '16 at 22:28
2 Answers
You can use np.where to set your desired value based on a boolean condition:

    In [18]:
    DF_test['value'] = np.where(DF_test['value'] > threshold, 1, 0)
    DF_test

    Out[18]:
      c1 c2  value
    0  a  p      0
    1  b  q      0
    2  c  r      1
    3  d  s      1
    4  e  t      0

Note that because your data is a heterogeneous np array, the 'value' column contains strings rather than floats:

    In [58]:
    DF_test.iloc[0]['value']

    Out[58]:
    '0.12'

So you'll need to convert the dtype to float first: DF_test['value'] = DF_test['value'].astype(float)
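The two steps above (convert the dtype, then threshold) can be put together as a minimal sketch. It assumes a frame named df whose 'value' column holds numeric strings, built directly from a dict here instead of the question's np.array so the example does not depend on how NumPy coerces mixed lists; the name df is an illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

threshold = 0.85

# Hypothetical stand-in for DF_test: the 'value' column holds strings,
# which is the situation described in the note above.
df = pd.DataFrame({
    "c1": list("abcde"),
    "c2": list("pqrst"),
    "value": ["0.12", "0.23", "0.93", "0.86", "0.33"],
})

# Step 1: convert the strings to floats so the comparison is numeric.
df["value"] = df["value"].astype(float)

# Step 2: vectorized threshold, 1 where value > threshold, else 0.
df["value"] = np.where(df["value"] > threshold, 1, 0)

print(df)
```

If some entries might not parse as numbers, pd.to_numeric(df["value"], errors="coerce") is an alternative to astype(float) that turns unparseable entries into NaN instead of raising.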
You can compare the timings:

    In [16]:
    %timeit np.where(DF_test['value'] > threshold, 1, 0)
    1000 loops, best of 3: 297 µs per loop

    In [17]:
    %%timeit
    DF_naive = pd.DataFrame()
    for i in range(DF_test.shape[0]):
        # Get first 2 columns
        first2cols = list(DF_test.ix[i][:-1])
        # Check if value is greater than threshold
        binary_value = [int((bool(float(DF_test.ix[i][-1]) > threshold)))]
        # Create series object
        SR_row = pd.Series(first2cols + binary_value, name=i)
        # Add to empty dataframe container
        DF_naive = DF_naive.append(SR_row)
    10 loops, best of 3: 39.3 ms per loop

The np.where version is over 100x faster. Admittedly, your code is doing a lot of unnecessary work, but you get the point.
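Outside IPython, a comparison along these lines could be sketched with the standard timeit module. This is only an illustration under stated assumptions: the frame size, the random data, and the helper names vectorized and python_loop are made up for the example, and the loop is a plain element-by-element comparison, not the row-appending code from the question. Timings will vary by machine, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

threshold = 0.85
rng = np.random.default_rng(0)  # seeded so repeated runs use the same data
df = pd.DataFrame({"value": rng.random(10_000)})


def vectorized(frame):
    """Threshold the whole column in one NumPy call."""
    return np.where(frame["value"] > threshold, 1, 0)


def python_loop(frame):
    """Threshold one element at a time in plain Python."""
    return np.array([int(v > threshold) for v in frame["value"]])


# Check that both approaches agree before comparing their speed.
same_result = np.array_equal(vectorized(df), python_loop(df))

t_vec = timeit.timeit(lambda: vectorized(df), number=20)
t_loop = timeit.timeit(lambda: python_loop(df), number=20)
print(f"np.where: {t_vec:.4f}s, Python loop: {t_loop:.4f}s")
```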
edited Feb 25 '16 at 23:18
answered Feb 25 '16 at 22:32 by EdChum
When I run this, the entire column value is then filled up with 1s. np.where(DF_test['value'] > 0.85) returns (array([0, 1, 2, 3, 4]),) and DF_test['value'] > 0.85 returns True everywhere. Any idea why that happens? I copy-pasted DF_test from above.
– Cleb, Feb 25 '16 at 23:08
You may need to convert the DF_test['value'] dtype first: DF_test['value'] = DF_test['value'].astype(float). Otherwise I haven't a clue.
– EdChum, Feb 25 '16 at 23:11
That's it, thanks.
– Cleb, Feb 25 '16 at 23:12
@Cleb the OP created a heterogeneous np.array as the data for the df; this made all the values in the 'value' column into strings, hence the need to convert the dtype.
– EdChum, Feb 25 '16 at 23:15
Ok, you might want to add this to your answer. +1 from my side.
– Cleb, Feb 25 '16 at 23:15
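One way to sidestep the string issue discussed in these comments is to build the frame column by column, so each column keeps its own dtype. The sketch below assumes the same sample data as the question; passing a dict to pd.DataFrame is a swapped-in construction, not what the original poster used.

```python
import pandas as pd

# Each dict entry becomes its own column, so the floats stay floats
# and are never forced into a shared string array.
df = pd.DataFrame({
    "c1": list("abcde"),
    "c2": list("pqrst"),
    "value": [0.12, 0.23, 0.93, 0.86, 0.33],
})

value_is_float = pd.api.types.is_float_dtype(df["value"])

# The comparison is now numeric, so no astype(float) step is needed.
above = (df["value"] > 0.85).tolist()
print(value_is_float, above)
```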
Since bool is a subclass of int, i.e. True == 1 and False == 0, you can convert a Boolean series to its integer form:

    DF_test['value'] = (DF_test['value'] > threshold).astype(int)

Generally, including most uses in computation or indexing, the int conversion is not necessary and you may wish to forgo it altogether.
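As a small illustration of that last point, the sketch below uses the Boolean Series directly for counting and for row selection, and converts it to integers only where 0/1 values are explicitly wanted. The names mask, as_int, n_above and rows_above are made up for the example, and the frame assumes 'value' already holds floats.

```python
import pandas as pd

threshold = 0.85
df = pd.DataFrame({
    "c1": list("abcde"),
    "c2": list("pqrst"),
    "value": [0.12, 0.23, 0.93, 0.86, 0.33],
})

mask = df["value"] > threshold   # Boolean Series
as_int = mask.astype(int)        # the same information as 0/1 integers

n_above = int(mask.sum())        # True counts as 1 when summed
rows_above = df[mask]            # Boolean indexing needs no conversion

print(as_int.tolist(), n_above)
print(rows_above)
```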
answered Nov 18 '18 at 23:44 by jpp