Compare dataframe columns with conditions























I have 2 dataframes as below:



df1:



ID   col1   col2
1    A1     B1
2    A2     B2
3    A3     B3
4    A4     B4
5    A5     B5
6    A6     B6


df2:



col1   col2
A1     B1
A2     O5
H3     B3
A4     B4
A5     66
A6     C6


Expected result: I would like to generate a result df based on this condition: each value in col1 and col2 of df1 should exist among the values of the same column in df2.



Expected Result df:



ID   col1   col2   Error
1    A1     B1     No mismatch with df2
2    A2     B2     col2 mismatch with df2
3    A3     B3     col1 mismatch with df2
4    A4     B4     No mismatch with df2
5    A5     B5     col2 mismatch with df2
6    A6     B6     col2 mismatch with df2
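To make the example reproducible, the two frames can be rebuilt roughly as follows. This is only a sketch of the sample data shown above: the values are copied from the tables, and storing every col1/col2 value as a string (including "66") is an assumption.

```python
import pandas as pd

# Sample data copied from the question; col1/col2 values kept as strings (assumption).
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

print(df1)
print(df2)
```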

































  • You do not have list in df2
    – W-B
    Nov 22 at 1:25










  • list is a column in df1 and its value list1 and list2 are just dropdownlist names ; the accepted values are given in columns list1,list2 in df2. So, the data from column "value" of df1 based on its list value should be checked with df2 list1 & list2 values.
    – Osceria
    Nov 22 at 9:54










  • Edited the Question
    – Osceria
    Nov 22 at 11:46





















python pandas dataframe














asked Nov 21 at 23:54 by Osceria, edited Nov 22 at 11:45
























2 Answers



accepted










Create a helper DataFrame with a dictionary comprehension, using isin to flag the df1 values that are missing from the same column of df2:



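import numpy as np
import pandas as pd

# True where the df1 value does not occur in the same column of df2
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ['col1','col2']})
print (m)
    col1   col2
0  False  False
1  False   True
2   True  False
3  False  False
4  False   True
5  False   True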


Then use numpy.where with the mask m.any(axis=1), which tests for at least one True per row, and dot (matrix multiplication of the mask with the column names) to collect the names of the mismatched columns:



df1['Error'] = np.where(m.any(axis=1),
                        m.dot(m.columns + ', ').str.rstrip(', ') + ' mismatch with df2',
                        'No mismatch with df2')
print (df1)
   ID col1 col2                   Error
0   1   A1   B1    No mismatch with df2
1   2   A2   B2  col2 mismatch with df2
2   3   A3   B3  col1 mismatch with df2
3   4   A4   B4    No mismatch with df2
4   5   A5   B5  col2 mismatch with df2
5   6   A6   B6  col2 mismatch with df2
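The dot step can look opaque at first. As a rough illustration (not part of the original answer), the sketch below rebuilds the same Error column with an explicit row-wise join, which may make it clearer what the matrix-multiplication trick computes. The sample frames are rebuilt inline so the snippet runs on its own, and the helper name `describe` is made up for this example.

```python
import pandas as pd

# Sample data from the question, rebuilt so the snippet is self-contained.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

# True where a df1 value is absent from the same-named df2 column.
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in df2.columns})


def describe(row):
    # Hypothetical helper: join the names of the flagged columns in one row.
    bad = [col for col in m.columns if row[col]]
    if not bad:
        return "No mismatch with df2"
    return ", ".join(bad) + " mismatch with df2"


df1["Error"] = m.apply(describe, axis=1)
print(df1)
```

The vectorised dot version is likely to be faster on large frames; the apply version trades some speed for readability.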


























  • m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ['col1','col2']}) - col1 and col2 are hard-coded. When I try to pass the column names directly from the dataframe using df.cols, it says the below error "ValueError: Must pass DataFrame with boolean values only" - Any help with this?
    – Osceria
    Nov 22 at 14:37












  • code should work if I pass all the columns from the dataframe like this lovcols = df2.columns m = pd.DataFrame({c: ~dfCSDataset[c].isin(dfLOVRules[c]) for c in [lovcols]}
    – Osceria
    Nov 22 at 14:41










  • @Osceria - yes, you are right. You can also pass columns to dict comprehension like m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in df2.columns})
    – jezrael
    Nov 22 at 14:43






  • yeah, it works in this way too
    – Osceria
    Nov 22 at 15:07
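The exchange above turns on how the column names are fed to the comprehension. Here is a small sketch of the difference, assuming the sample frames from the question: iterating over `df2.columns` yields one label at a time, whereas `[df2.columns]` is a one-element list holding the whole Index. `Index.intersection` is shown as one possible way to restrict the check to columns both frames share; the variable name `common` is illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt so the snippet is self-contained.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

# Iterating the Index yields the individual column labels.
labels = list(df2.columns)
# Wrapping the Index in a list gives a single element: the Index itself.
wrapped = [df2.columns]
print(len(labels), len(wrapped))

# Restrict the comparison to columns present in both frames (skips ID here).
common = df1.columns.intersection(df2.columns)
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in common})
print(m)
```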































Something like this should do the trick but there may be an easier way.



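import pandas as pd

# Compare only the columns that df2 has; df1 also carries an ID column.
diff = pd.concat([df1[col] == df2[col] for col in df2.columns], axis=1)

def m(row):
    # Collect the names of the columns whose values differ in this row.
    mismatches = []
    for col in diff.columns:
        if not row[col]:
            mismatches.append(col)
    if mismatches == []:
        return 'No mismatch'
    return 'Mismatches: ' + ', '.join(mismatches)

df1['Error'] = diff.apply(m, axis=1)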
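Note that this answer compares the frames row by row, while the accepted answer tests whether a value occurs anywhere in the column. On the sample data both give the same flags, but they can differ. A small made-up example (the frames `a` and `b` are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical frames: same values in col1, but in a different row order.
a = pd.DataFrame({"col1": ["A1", "A2"]})
b = pd.DataFrame({"col1": ["A2", "A1"]})

# Positional comparison: row i of a against row i of b.
row_wise = a["col1"] == b["col1"]
# Membership test: is each value of a present anywhere in b's column?
membership = a["col1"].isin(b["col1"])

print(row_wise.tolist())
print(membership.tolist())
```

Which of the two is right depends on whether df2 is meant as a row-aligned counterpart of df1 or as a list of accepted values per column.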


























  • When I try this, I get the error "ValueError: Can only compare identically-labeled Series objects"
    – Osceria
    Nov 22 at 10:23










  • Edited the Question
    – Osceria
    Nov 22 at 11:46










  • @Osceria do you get the same error with the following reproducible datasets: df1 = pd.DataFrame({'col1': ["A1", "A2", "A3", "A4", "A5", "A6"], 'col2': ["B1", "B2", "B3", "B4", "B5", "B6"]}) df2 = pd.DataFrame({'col1': ["A1", "A2", "H3", "A4", "A5", "A6"], 'col2': ["B1", "O5", "B3", "B4", "66", "C6"]})
    – leoburgy
    Nov 22 at 12:01










  • It's because your df1 and df2 had different columns, right? I noticed you edited the question now, does it work with those dataframes?
    – lieblos
    Nov 22 at 12:47












  • If I run what I answered with the dataframes above, it seems like it works.
    – lieblos
    Nov 22 at 12:49
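The ValueError mentioned in these comments is what pandas raises when two Series whose index labels differ are compared with `==`. A minimal sketch with made-up Series `s1` and `s2`; comparing the underlying arrays via `to_numpy()` is one possible workaround when the rows are meant to be matched by position and the lengths agree.

```python
import pandas as pd

# Hypothetical Series with the same length but different index labels.
s1 = pd.Series(["A1", "A2", "A3"], index=[0, 1, 2])
s2 = pd.Series(["A1", "A2", "H3"], index=[10, 11, 12])

try:
    s1 == s2
    raised = False
except ValueError as exc:
    # pandas refuses to compare Series that are not identically labeled.
    raised = True
    print(exc)

# Compare by position instead, ignoring the index labels.
by_position = s1.to_numpy() == s2.to_numpy()
print(by_position.tolist())
```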





















Accepted answer (isin approach): answered Nov 22 at 12:08 by jezrael






















Second answer (row-wise comparison): answered Nov 22 at 0:20 by lieblos











