Use column combinations to find data mismatch in rows pandas

What's the best way to get all cell values based on a combination of column values?

Sample dataframe One:

   Stock                         Name  Price
0    AMD       Advanced Micro Devices    100
1     GE     General Electric Company    200
2    BAC  Bank of America Corporation    300
3   AAPL                   Apple Inc.    500
4   MSFT        Microsoft Corporation   1000
5  GOOGL                Alphabet Inc.   2000

Sample dataframe Two:

   Stock                           Name  Price
0    AMD         Advanced Micro Devices    100
1     GE       General Electric Company    200
2    BAC  Branch of America Corporation    300
3   AAPL                     Apple Inc.    500
4   MSFT          Microsoft Corporation   1000
5  GOOGL                  Alphabet Inc.   2000

For example, I want to use Stock and Name as key columns and then compare the datasets. The goal is to print the mismatched entries between the two datasets, with the Stock+Name columns used as a combination key.

I'm using pandas with Python 3.7.

Sample Output:

BAC Bank of America Corporation 300 --- BAC Branch of America Corporation 300

edited Nov 20 '18 at 21:50 by coldspeed

asked Nov 20 '18 at 21:42 by Greedy Coder

Are the Stock names constant between both DataFrames, or can the Stock name also be mismatched?

– pygo
Nov 21 '18 at 4:31

Stock names are consistent between the DataFrames, but other columns associated with them can differ, which is what I want to identify.

– Greedy Coder
Nov 21 '18 at 14:24

@Greedy Coder, then my answer fits your requirement and gets the mismatch you want.

– pygo
Nov 21 '18 at 14:26

Greedy Coder, you can upvote and accept the answer that fits your requirement; that is how the question moves out of the unanswered queue.

– pygo
Nov 21 '18 at 14:38


python python-3.x pandas dataframe merge


3 Answers

Perhaps an inner join on Stock using merge + query?

df1.merge(df2, on='Stock').query('Name_x != Name_y')

  Stock                       Name_x  Price_x                         Name_y  Price_y
2   BAC  Bank of America Corporation      300  Branch of America Corporation      300
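For anyone who wants to run this end to end, here is a minimal sketch of the merge + query idea. It rebuilds the question's sample data and prints each mismatch in the "left --- right" shape of the sample output. The `lines` list and its format string are only illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Stock": ["AMD", "GE", "BAC", "AAPL", "MSFT", "GOOGL"],
    "Name": ["Advanced Micro Devices", "General Electric Company",
             "Bank of America Corporation", "Apple Inc.",
             "Microsoft Corporation", "Alphabet Inc."],
    "Price": [100, 200, 300, 500, 1000, 2000],
})
df2 = df1.copy()
df2.loc[df2["Stock"] == "BAC", "Name"] = "Branch of America Corporation"

# merge defaults to an inner join; overlapping columns get _x/_y suffixes.
merged = df1.merge(df2, on="Stock")
mismatch = merged.query("Name_x != Name_y")

# Illustrative formatting: one line per mismatching Stock.
lines = [
    f"{r.Stock} {r.Name_x} {r.Price_x} --- {r.Stock} {r.Name_y} {r.Price_y}"
    for r in mismatch.itertuples(index=False)
]
print("\n".join(lines))
```

Note that this only compares rows whose Stock appears in both frames; a Stock missing from one side is silently dropped by the inner join.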

Or, a slightly different solution with map, which you can use to get the mismatched stock symbols:

m = df1.Stock.map(df2.set_index('Stock').Name).ne(df1.Name)
symbols = df1.loc[m, 'Stock']

print(symbols)
2    BAC
Name: Stock, dtype: object

And then access each DataFrame row by stock symbol:

df1[df1.Stock.isin(symbols)]
  Stock                         Name  Price
2   BAC  Bank of America Corporation    300

df2[df2.Stock.isin(symbols)]
  Stock                           Name  Price
2   BAC  Branch of America Corporation    300
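Both approaches above key on Stock alone. If Stock+Name is really meant as a combined key, as the question words it, one possible reading is an outer merge on both columns with indicator=True, so that any key pair present in only one frame is flagged. This is a sketch on a trimmed copy of the sample data; the `_1`/`_2` suffixes and variable names are arbitrary choices.

```python
import pandas as pd

# Trimmed version of the question's sample data.
df1 = pd.DataFrame({
    "Stock": ["AMD", "BAC", "AAPL"],
    "Name": ["Advanced Micro Devices", "Bank of America Corporation", "Apple Inc."],
    "Price": [100, 300, 500],
})
df2 = pd.DataFrame({
    "Stock": ["AMD", "BAC", "AAPL"],
    "Name": ["Advanced Micro Devices", "Branch of America Corporation", "Apple Inc."],
    "Price": [100, 300, 500],
})

key = ["Stock", "Name"]

# Outer join on the composite key; the _merge column records whether each
# (Stock, Name) pair was found in the left frame, the right frame, or both.
both = df1.merge(df2, on=key, how="outer", suffixes=("_1", "_2"), indicator=True)
unmatched = both[both["_merge"] != "both"]

only_left = unmatched.loc[unmatched["_merge"] == "left_only", "Name"].tolist()
only_right = unmatched.loc[unmatched["_merge"] == "right_only", "Name"].tolist()
print(unmatched)
```

With this reading, the BAC row shows up twice: once as left_only (the df1 spelling) and once as right_only (the df2 spelling).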

edited Nov 20 '18 at 21:48

answered Nov 20 '18 at 21:45 by coldspeed

Could the Name be the same while the Stock differs?

– Wen-Ben
Nov 20 '18 at 21:46

@W-B Good question. It may be possible, but looking at this stock data I don't know whether it will be an issue... will wait for OP :)

– coldspeed
Nov 20 '18 at 21:48

That's incorrect.

– coldspeed
Nov 21 '18 at 6:40

The Name field can differ or contain errors/typos; that's what I want to catch.

– Greedy Coder
Nov 21 '18 at 14:25


If the data is in two DataFrames, joining them without a condition is pretty straightforward with pd.concat. Once they are joined, here's one way to get the mismatch:

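import pandas as pd

# Toy data standing in for the already-joined frame:
# the _y columns come from one source, the _x columns from the other.
df1 = pd.DataFrame({
    "Ticker_y": list("qwerty"),
    "Name_y": list("asdfgh"),
    "Ticker_x": list("qw3r7y"),
    "Name_x": list("as6f8h")
})

# Boolean mask: rows where the ticker AND the name both differ.
mismatch = df1[(df1["Ticker_y"] != df1["Ticker_x"]) & (df1["Name_y"] != df1["Name_x"])]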

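The last line keeps only the rows of the df where both conditions hold, i.e. where the ticker and the name both differ. To flag rows where either column differs, use | instead of &.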
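As a sketch of how this could look on the question's data: the two frames are placed side by side with pd.concat(axis=1) and then filtered with a boolean mask. This assumes both frames have the same row order and index, which the sample data satisfies but real data may not. The `_x`/`_y` suffixes are added by hand here with add_suffix.

```python
import pandas as pd

# Two rows of the question's sample data are enough to show the idea.
df1 = pd.DataFrame({
    "Stock": ["AMD", "BAC"],
    "Name": ["Advanced Micro Devices", "Bank of America Corporation"],
})
df2 = pd.DataFrame({
    "Stock": ["AMD", "BAC"],
    "Name": ["Advanced Micro Devices", "Branch of America Corporation"],
})

# Side-by-side join; rows are aligned on the index, not on Stock.
wide = pd.concat([df1.add_suffix("_x"), df2.add_suffix("_y")], axis=1)

stock_differs = wide["Stock_x"] != wide["Stock_y"]
name_differs = wide["Name_x"] != wide["Name_y"]

# | flags a row when either column differs; & requires both to differ.
mismatch = wide[stock_differs | name_differs]
strict = wide[stock_differs & name_differs]
print(mismatch)
```

Here `mismatch` contains the BAC row, while `strict` is empty because the Stock values agree.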

answered Nov 20 '18 at 21:52 by Charles Landau


We can use isin, which tests whether each element of a column is contained in a given sequence of values.

First DataFrame

>>> df1
   Stock                         Name  Price
0    AMD       Advanced Micro Devices    100
1     GE     General Electric Company    200
2    BAC  Bank of America Corporation    300
3   AAPL                   Apple Inc.    500
4   MSFT        Microsoft Corporation   1000
5  GOOGL                Alphabet Inc.   2000

Second DataFrame

>>> df2
   Stock                           Name  Price
0    AMD         Advanced Micro Devices    100
1     GE       General Electric Company    200
2    BAC  Branch of America Corporation    300
3   AAPL                     Apple Inc.    500
4   MSFT          Microsoft Corporation   1000
5  GOOGL                  Alphabet Inc.   2000

Here you go:

>>> df2[~df2.Name.isin(df1.Name.values)]
  Stock                           Name  Price
2   BAC  Branch of America Corporation    300

>>> df1[~df1.Name.isin(df2.Name.values)]
  Stock                         Name  Price
2   BAC  Bank of America Corporation    300

answered Nov 21 '18 at 4:23 by pygo

I don't think this is right. The idea is to find all rows with the same symbol but a different name. This only finds names that are not common to both DataFrames, which is not the same thing.

– coldspeed
Nov 21 '18 at 6:39

This will work if the Stock names are constant between both DataFrames; I have asked the OP to clarify, as they look the same in the post.

– pygo
Nov 21 '18 at 6:42
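To make the difference raised in the comments concrete, here is a small sketch on hypothetical data (not from the question) where two names are swapped between tickers in the second frame. A Name-only isin check sees nothing wrong, because every name still exists somewhere in the other frame, while a per-Stock comparison flags both rows.

```python
import pandas as pd

# Hypothetical data: the names are swapped between AMD and GE in df2.
df1 = pd.DataFrame({
    "Stock": ["AMD", "GE"],
    "Name": ["Advanced Micro Devices", "General Electric Company"],
})
df2 = pd.DataFrame({
    "Stock": ["AMD", "GE"],
    "Name": ["General Electric Company", "Advanced Micro Devices"],
})

# Name-only membership test: every df2 name occurs in df1, so nothing is flagged.
by_isin = df2[~df2["Name"].isin(df1["Name"])]

# Per-Stock comparison: each Stock now carries a different Name.
by_stock = df1.merge(df2, on="Stock").query("Name_x != Name_y")

print(len(by_isin), len(by_stock))
```

On the question's sample data both approaches agree; they only diverge when a wrong name happens to be a valid name for some other row.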

