Use column combinations to find data mismatch in rows pandas
What's the best way to get all cell values based on a combination of column values?
Sample dataframe One:
   Stock                         Name  Price
0    AMD       Advanced Micro Devices    100
1     GE     General Electric Company    200
2    BAC  Bank of America Corporation    300
3   AAPL                   Apple Inc.    500
4   MSFT        Microsoft Corporation   1000
5  GOOGL                Alphabet Inc.   2000
Sample dataframe Two:
   Stock                           Name  Price
0    AMD         Advanced Micro Devices    100
1     GE       General Electric Company    200
2    BAC  Branch of America Corporation    300
3   AAPL                     Apple Inc.    500
4   MSFT          Microsoft Corporation   1000
5  GOOGL                  Alphabet Inc.   2000
For example: I want to use (Stock and Name) as key columns and then compare the datasets. The goal is to print the mismatch entries between the two datasets with the Stock+Name columns used as a combination key.
I'm using Pandas/Python3.7
Sample Output:
BAC Bank of America Corporation 300 --- BAC Branch of America Corporation 300
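For anyone who wants to reproduce the question, the two sample frames can be rebuilt roughly as follows. This is a minimal sketch; the variable names df1 and df2 are assumed here to match the ones used in the answers.

```python
import pandas as pd

stocks = ["AMD", "GE", "BAC", "AAPL", "MSFT", "GOOGL"]
names = [
    "Advanced Micro Devices",
    "General Electric Company",
    "Bank of America Corporation",
    "Apple Inc.",
    "Microsoft Corporation",
    "Alphabet Inc.",
]
prices = [100, 200, 300, 500, 1000, 2000]

# First sample frame from the question.
df1 = pd.DataFrame({"Stock": stocks, "Name": names, "Price": prices})

# Second sample frame: identical except for the Name in row 2 (BAC).
df2 = df1.copy()
df2.loc[2, "Name"] = "Branch of America Corporation"
```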
python python-3.x pandas dataframe merge
Are the Stock values constant between both DataFrames, or can the Stock value also be mismatched?
– pygo
Nov 21 '18 at 4:31
Stock names are consistent between the dataframes - but other columns associated with it can be different - which I want to identify.
– Greedy Coder
Nov 21 '18 at 14:24
@Greedy Coder, then my answer fits your requirement and finds the mismatch you want.
– pygo
Nov 21 '18 at 14:26
Greedy Coder, you can upvote and accept the answer that fits your requirement; that is how the question gets moved out of the unanswered queue.
– pygo
Nov 21 '18 at 14:38
edited Nov 20 '18 at 21:50 by coldspeed
asked Nov 20 '18 at 21:42 by Greedy Coder
3 Answers
Perhaps an inner join using merge + query?
df1.merge(df2, on='Stock').query('Name_x != Name_y')
   Stock                       Name_x  Price_x                         Name_y  Price_y
2    BAC  Bank of America Corporation      300  Branch of America Corporation      300
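To get output shaped like the sample in the question, the merged rows can be formatted by hand. The following is only a sketch: it rebuilds a few of the sample rows, assumes each Stock appears once per frame, and the helper name format_mismatches is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Stock": ["GE", "BAC", "AAPL"],
    "Name": ["General Electric Company", "Bank of America Corporation", "Apple Inc."],
    "Price": [200, 300, 500],
})
df2 = pd.DataFrame({
    "Stock": ["GE", "BAC", "AAPL"],
    "Name": ["General Electric Company", "Branch of America Corporation", "Apple Inc."],
    "Price": [200, 300, 500],
})


def format_mismatches(left, right):
    """Return one 'left row --- right row' string per Stock whose Name differs."""
    # Inner join on Stock; overlapping columns get the _x / _y suffixes.
    merged = left.merge(right, on="Stock")
    diff = merged.query("Name_x != Name_y")
    return [
        f"{r.Stock} {r.Name_x} {r.Price_x} --- {r.Stock} {r.Name_y} {r.Price_y}"
        for r in diff.itertuples(index=False)
    ]


lines = format_mismatches(df1, df2)
print("\n".join(lines))
```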
Or, as a slightly different solution, you can use map to get the stock symbols whose names differ:
m = df1.Stock.map(df2.set_index('Stock').Name).ne(df1.Name)
symbols = df1.loc[m, 'Stock']
print(symbols)
2 BAC
Name: Stock, dtype: object
And then access each DataFrame row by stock symbol:
df1[df1.Stock.isin(symbols)]
   Stock                         Name  Price
2    BAC  Bank of America Corporation    300
df2[df2.Stock.isin(symbols)]
   Stock                           Name  Price
2    BAC  Branch of America Corporation    300
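Since the question asks for Stock+Name as a combined key, another option is an outer merge on both columns with indicator=True, which flags key combinations present in only one frame. This is a sketch on a reduced version of the sample data and is not part of the answer itself.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Stock": ["AMD", "BAC"],
    "Name": ["Advanced Micro Devices", "Bank of America Corporation"],
    "Price": [100, 300],
})
df2 = pd.DataFrame({
    "Stock": ["AMD", "BAC"],
    "Name": ["Advanced Micro Devices", "Branch of America Corporation"],
    "Price": [100, 300],
})

# Outer-join on the (Stock, Name) pair; the _merge column records whether
# each pair came from the left frame only, the right frame only, or both.
merged = df1.merge(df2, on=["Stock", "Name"], how="outer", indicator=True)

# Keep only the pairs that exist in exactly one of the two frames.
only_one_side = merged[merged["_merge"] != "both"]
print(only_one_side[["Stock", "Name", "_merge"]])
```

Each mismatched symbol shows up twice here, once per frame, so the result can be grouped by Stock if a side-by-side view is needed.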
Could the Name be the same while the Stock differs?
– Wen-Ben
Nov 20 '18 at 21:46
@W-B Good question, it may be possible but looking at this stock data, I don't know whether it will be an issue... will wait for OP :)
– coldspeed
Nov 20 '18 at 21:48
That's incorrect.
– coldspeed
Nov 21 '18 at 6:40
The Name field can be different/have errors/typos - that's what I want to catch.
– Greedy Coder
Nov 21 '18 at 14:25
If they are in two dataframes, combining them unconditionally is pretty straightforward with pd.concat. Once they are joined, here's one way to get the mismatch:
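import pandas as pd

# Stand-in for the already-joined frame: the *_x columns come from one source, the *_y columns from the other
df1 = pd.DataFrame({
    "Ticker_y": list("qwerty"),
    "Name_y": list("asdfgh"),
    "Ticker_x": list("qw3r7y"),
    "Name_x": list("as6f8h")
})
# Keep the rows where both the ticker and the name differ between the two sides
mismatch = df1[(df1["Ticker_y"] != df1["Ticker_x"]) & (df1["Name_y"] != df1["Name_x"])]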
The last line just says "the df only where these conditions are met."
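The pd.concat step that the answer mentions but does not show could look something like this. It is a sketch that assumes both frames share the same row order and index, which pd.concat(axis=1) relies on for alignment; the _x/_y suffixes are added here by hand to mirror merge and are not produced by concat itself. The condition is also adapted to the question (same symbol, different name).

```python
import pandas as pd

df1 = pd.DataFrame({
    "Stock": ["AMD", "BAC"],
    "Name": ["Advanced Micro Devices", "Bank of America Corporation"],
})
df2 = pd.DataFrame({
    "Stock": ["AMD", "BAC"],
    "Name": ["Advanced Micro Devices", "Branch of America Corporation"],
})

# Place the frames side by side, aligned on the row index.
wide = pd.concat([df1.add_suffix("_x"), df2.add_suffix("_y")], axis=1)

# Same symbol on both sides, but a different name.
mismatch = wide[(wide["Stock_x"] == wide["Stock_y"]) & (wide["Name_x"] != wide["Name_y"])]
print(mismatch)
```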
We can use isin, which tests whether each element of a Series is contained in a given sequence of values.
First DataFrame
>>> df1
   Stock                         Name  Price
0    AMD       Advanced Micro Devices    100
1     GE     General Electric Company    200
2    BAC  Bank of America Corporation    300
3   AAPL                   Apple Inc.    500
4   MSFT        Microsoft Corporation   1000
5  GOOGL                Alphabet Inc.   2000
Second DataFrame
>>> df2
   Stock                           Name  Price
0    AMD         Advanced Micro Devices    100
1     GE       General Electric Company    200
2    BAC  Branch of America Corporation    300
3   AAPL                     Apple Inc.    500
4   MSFT          Microsoft Corporation   1000
5  GOOGL                  Alphabet Inc.   2000
Here you go:
>>> df2[~df2.Name.isin(df1.Name.values)]
   Stock                           Name  Price
2    BAC  Branch of America Corporation    300
OR
>>> df1[~df1.Name.isin(df2.Name.values)]
   Stock                         Name  Price
2    BAC  Bank of America Corporation    300
I don't think this is right. The idea is to find all rows with the same symbol but different name. This is only going to find those names that are not common across both DataFrames (not the same thing)
– coldspeed
Nov 21 '18 at 6:39
This will work if the Stock values are constant between both DataFrames; I have asked the OP to clarify that, as they look the same in the post.
– pygo
Nov 21 '18 at 6:42
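The difference coldspeed describes shows up when two symbols have their names swapped: every name still occurs somewhere in the other frame, so isin reports nothing, while a per-symbol comparison catches both rows. A sketch with made-up swapped data, not taken from the question:

```python
import pandas as pd

df1 = pd.DataFrame({
    "Stock": ["AMD", "GE"],
    "Name": ["Advanced Micro Devices", "General Electric Company"],
})
# Hypothetical corruption: the two names are swapped between symbols.
df2 = pd.DataFrame({
    "Stock": ["AMD", "GE"],
    "Name": ["General Electric Company", "Advanced Micro Devices"],
})

# isin only asks "does this name appear anywhere in df1?", so nothing is flagged.
by_isin = df2[~df2["Name"].isin(df1["Name"])]

# Comparing the names per symbol flags both rows.
by_symbol = df1.merge(df2, on="Stock").query("Name_x != Name_y")

print(len(by_isin), len(by_symbol))
```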
answered Nov 20 '18 at 21:45 by coldspeed (merge + query answer), edited Nov 20 '18 at 21:48
answered Nov 20 '18 at 21:52 by Charles Landau (concat answer)
answered Nov 21 '18 at 4:23 by pygo (isin answer)