Use column combinations to find data mismatch in rows pandas



























What's the best way to get all cell values based on a combination of column values?



Sample dataframe One:



   Stock                         Name  Price
0    AMD       Advanced Micro Devices    100
1     GE     General Electric Company    200
2    BAC  Bank of America Corporation    300
3   AAPL                   Apple Inc.    500
4   MSFT        Microsoft Corporation   1000
5  GOOGL                Alphabet Inc.   2000


Sample dataframe Two:



   Stock                           Name  Price
0    AMD         Advanced Micro Devices    100
1     GE       General Electric Company    200
2    BAC  Branch of America Corporation    300
3   AAPL                     Apple Inc.    500
4   MSFT          Microsoft Corporation   1000
5  GOOGL                  Alphabet Inc.   2000


For example: I want to use (Stock and Name) as key columns and then compare the datasets. The goal is to print the mismatch entries between the two datasets with the Stock+Name columns used as a combination key.



I'm using Pandas/Python3.7



Sample Output:




BAC Bank of America Corporation 300 --- BAC Branch of America Corporation 300
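
For anyone who wants to reproduce this, the two sample frames can be rebuilt roughly as follows. This is only a convenience sketch: the variable names df1 and df2 are assumptions (chosen to match the names the answers use), and the values are copied from the tables above.

```python
import pandas as pd

# Values copied from the sample tables; df1/df2 are assumed names.
df1 = pd.DataFrame({
    "Stock": ["AMD", "GE", "BAC", "AAPL", "MSFT", "GOOGL"],
    "Name": [
        "Advanced Micro Devices",
        "General Electric Company",
        "Bank of America Corporation",
        "Apple Inc.",
        "Microsoft Corporation",
        "Alphabet Inc.",
    ],
    "Price": [100, 200, 300, 500, 1000, 2000],
})

# The second frame is identical except for the Name in the BAC row.
df2 = df1.copy()
df2.loc[df2["Stock"] == "BAC", "Name"] = "Branch of America Corporation"
```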



















python python-3.x pandas dataframe merge














edited Nov 20 '18 at 21:50 by coldspeed










asked Nov 20 '18 at 21:42 by Greedy Coder













  • Are the stock names constant between both DataFrames, or can the Stock name also be mismatched?
    – pygo
    Nov 21 '18 at 4:31

  • Stock names are consistent between the dataframes, but other columns associated with them can be different, which is what I want to identify.
    – Greedy Coder
    Nov 21 '18 at 14:24

  • @Greedy Coder, then my answer fits your requirement and gets the match you want.
    – pygo
    Nov 21 '18 at 14:26

  • Greedy Coder, you can upvote and accept the answer that fits your requirement; that is how the question gets moved out of the unanswered queue.
    – pygo
    Nov 21 '18 at 14:38































3 Answers
































Perhaps an inner join on Stock, using merge + query?



df1.merge(df2, on='Stock').query('Name_x != Name_y')

   Stock                       Name_x  Price_x                         Name_y  Price_y
2    BAC  Bank of America Corporation      300  Branch of America Corporation      300
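
If the intent is to treat Stock and Name together as the key, one possible variation is an outer merge on both columns with indicator=True, keeping only the keys found on one side. This is a sketch rather than part of the original answer; the small frames below are cut-down stand-ins for the question's data.

```python
import pandas as pd

# Cut-down stand-ins for the question's two frames.
df1 = pd.DataFrame({
    "Stock": ["AMD", "BAC", "AAPL"],
    "Name": ["Advanced Micro Devices", "Bank of America Corporation", "Apple Inc."],
    "Price": [100, 300, 500],
})
df2 = df1.copy()
df2.loc[df2["Stock"] == "BAC", "Name"] = "Branch of America Corporation"

# Outer merge on the combined key; the "_merge" column records
# whether each (Stock, Name) pair came from the left, right, or both.
merged = df1.merge(df2, on=["Stock", "Name"], how="outer", indicator=True)

# Pairs that exist in only one frame are the mismatches.
mismatches = merged[merged["_merge"] != "both"]
print(mismatches)
```

Each mismatched stock shows up twice here, once as left_only and once as right_only, so both versions of the row can be read off together.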




Or, a slightly different solution: use map to get the stock symbols whose names differ:



m = df1.Stock.map(df2.set_index('Stock').Name).ne(df1.Name)
symbols = df1.loc[m, 'Stock']

print(symbols)
2    BAC
Name: Stock, dtype: object


And then access each DataFrame row by stock symbol:



df1[df1.Stock.isin(symbols)]
   Stock                         Name  Price
2    BAC  Bank of America Corporation    300

df2[df2.Stock.isin(symbols)]
   Stock                           Name  Price
2    BAC  Branch of America Corporation    300
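
Building on the symbol lookup above, a possible generalisation is to compare every non-key column, not just Name, and print each mismatch in the "left --- right" layout the question asks for. This is a sketch under the assumption that Stock is unique within each frame; the frames are cut-down stand-ins for the question's data, and the names a, b, common, differs and lines are illustrative.

```python
import pandas as pd

# Cut-down stand-ins for the question's two frames.
df1 = pd.DataFrame({
    "Stock": ["AMD", "BAC", "AAPL"],
    "Name": ["Advanced Micro Devices", "Bank of America Corporation", "Apple Inc."],
    "Price": [100, 300, 500],
})
df2 = df1.copy()
df2.loc[df2["Stock"] == "BAC", "Name"] = "Branch of America Corporation"

# Align both frames on the Stock symbol (assumed unique in each frame).
a = df1.set_index("Stock")
b = df2.set_index("Stock")
common = a.index.intersection(b.index)

# True for symbols where at least one non-key column differs.
differs = (a.loc[common] != b.loc[common]).any(axis=1)

# Format each mismatching symbol as "left row --- right row".
lines = []
for stock in differs[differs].index:
    left = " ".join(str(v) for v in a.loc[stock])
    right = " ".join(str(v) for v in b.loc[stock])
    lines.append(f"{stock} {left} --- {stock} {right}")

print("\n".join(lines))
```

Symbols that appear in only one frame are ignored by the intersection step, so they would need a separate check.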













answered Nov 20 '18 at 21:45 by coldspeed (edited Nov 20 '18 at 21:48)








  • Could the Name be the same while the Stock differs?
    – Wen-Ben
    Nov 20 '18 at 21:46

  • @W-B Good question, it may be possible, but looking at this stock data, I don't know whether it will be an issue... will wait for OP :)
    – coldspeed
    Nov 20 '18 at 21:48

  • That's incorrect.
    – coldspeed
    Nov 21 '18 at 6:40

  • The Name field can be different/have errors/typos - that's what I want to catch.
    – Greedy Coder
    Nov 21 '18 at 14:25














If the data is in two DataFrames, combining them unconditionally is pretty straightforward with pd.concat. Once they are joined, here's one way to get the mismatch:

import pandas as pd

# Stand-in for the already-joined data: the _y and _x columns hold the two versions
df1 = pd.DataFrame({
    "Ticker_y": list("qwerty"),
    "Name_y": list("asdfgh"),
    "Ticker_x": list("qw3r7y"),
    "Name_x": list("as6f8h")
})

# & keeps rows where BOTH columns differ; use | to keep rows where EITHER differs
mismatch = df1[(df1["Ticker_y"] != df1["Ticker_x"]) & (df1["Name_y"] != df1["Name_x"])]

The last line just says "the df only where these conditions are met."
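
One way the joining step might look, assuming the two frames list the same stocks in the same row order (pd.concat with axis=1 aligns on the row index, not on the ticker). The _x/_y suffixes mirror the column names used above; the sample data and the name joined are stand-ins, not part of the original answer.

```python
import pandas as pd

# Hypothetical inputs with identical row order.
df1 = pd.DataFrame({
    "Ticker": ["AMD", "BAC"],
    "Name": ["Advanced Micro Devices", "Bank of America Corporation"],
})
df2 = pd.DataFrame({
    "Ticker": ["AMD", "BAC"],
    "Name": ["Advanced Micro Devices", "Branch of America Corporation"],
})

# Place the frames side by side; rows are matched by index label.
joined = pd.concat([df1.add_suffix("_x"), df2.add_suffix("_y")], axis=1)

# | flags a row when either column differs between the two sources.
mismatch = joined[
    (joined["Ticker_x"] != joined["Ticker_y"])
    | (joined["Name_x"] != joined["Name_y"])
]
print(mismatch)
```

If the row order cannot be trusted, merging on the ticker column is the safer way to line the rows up.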

















answered Nov 20 '18 at 21:52 by Charles Landau























We can use isin, which tests whether each element of a Series is contained in a given sequence of values. Negating the result with ~ keeps the rows whose Name does not appear in the other DataFrame.

First DataFrame

>>> df1
   Stock                         Name  Price
0    AMD       Advanced Micro Devices    100
1     GE     General Electric Company    200
2    BAC  Bank of America Corporation    300
3   AAPL                   Apple Inc.    500
4   MSFT        Microsoft Corporation   1000
5  GOOGL                Alphabet Inc.   2000

Second DataFrame

>>> df2
   Stock                           Name  Price
0    AMD         Advanced Micro Devices    100
1     GE       General Electric Company    200
2    BAC  Branch of America Corporation    300
3   AAPL                     Apple Inc.    500
4   MSFT          Microsoft Corporation   1000
5  GOOGL                  Alphabet Inc.   2000

Here you go:

>>> df2[~df2.Name.isin(df1.Name.values)]
   Stock                           Name  Price
2    BAC  Branch of America Corporation    300

OR

>>> df1[~df1.Name.isin(df2.Name.values)]
   Stock                         Name  Price
2    BAC  Bank of America Corporation    300
















answered Nov 21 '18 at 4:23 by pygo













  • I don't think this is right. The idea is to find all rows with the same symbol but a different name. This is only going to find those names that are not common across both DataFrames (not the same thing).
    – coldspeed
    Nov 21 '18 at 6:39

  • This will work if the Stock names are constant between both DataFrames; I have asked the OP to clarify that, as they look the same in the post.
    – pygo
    Nov 21 '18 at 6:42
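
The distinction raised in these comments can be illustrated with a small sketch. The data below is hypothetical (a GE row that wrongly carries AMD's name): a check on Name alone misses the error because that name exists elsewhere in the first frame, whereas a membership test on (Stock, Name) pairs catches it. The variable names are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Stock": ["AMD", "GE"],
    "Name": ["Advanced Micro Devices", "General Electric Company"],
})
# Hypothetical bad row: GE carries AMD's name.
df2 = pd.DataFrame({
    "Stock": ["AMD", "GE"],
    "Name": ["Advanced Micro Devices", "Advanced Micro Devices"],
})

# Name-only check: finds nothing, because the name exists somewhere in df1.
name_only = df2[~df2["Name"].isin(df1["Name"])]

# Pair check: build (Stock, Name) keys and test membership of the whole pair.
keys1 = df1.set_index(["Stock", "Name"]).index
keys2 = df2.set_index(["Stock", "Name"]).index
pair_based = df2[~keys2.isin(keys1)]

print(len(name_only))
print(pair_based)
```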




















