Compare Df1 Column 1 To All Columns In Df2 Returning The Index Of Df2
I'm new to pandas so likely overlooking something but I've been searching and haven't found anything helpful yet. What I'm trying to do is this. I have 2 dataframes. df1 has only
Solution 1:
I think you can use isin to test whether the values of df2, reshaped into a Series by stack, appear in the single column of df1, converted to a Series by squeeze. Finally, reshape the result back with unstack:
df3 = df2.stack().isin(df1.squeeze()).unstack()
print(df3)
                1      2      3      4      5      6      7
8302813476  False  False  False  False  False  False  False
8302813477  False  False  False  False  False  False  False
8302813478  False  False   True  False  False  False  False

Then use any to find the rows that contain at least one True:
a = df3.any(axis=1)
print(a)
8302813476    False
8302813477    False
8302813478     True
dtype: bool

Finally, use boolean indexing:
print (a[a].index)
Int64Index([8302813478], dtype='int64')
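The three steps above can be put together in one small script. This is only a sketch: the contents of df1 and df2 are made up for illustration (the question's real data is not shown), and the column name col is taken from the snippets in this answer.

```python
import pandas as pd

# Hypothetical data: df1 holds the values to look for in a single column,
# df2 holds several columns and the index we want to recover.
df1 = pd.DataFrame({"col": [10, 99]})
df2 = pd.DataFrame(
    {1: [1, 4, 7], 2: [2, 5, 8], 3: [3, 6, 10]},
    index=[8302813476, 8302813477, 8302813478],
)

# Step 1: flatten df2, test membership against df1's column, restore the shape.
mask = df2.stack().isin(df1["col"]).unstack()

# Step 2: mark the rows that contain at least one match.
row_has_match = mask.any(axis=1)

# Step 3: boolean indexing keeps only the index labels of matching rows.
matching_index = row_has_match[row_has_match].index.tolist()
print(matching_index)
```

With this sample data only the last row of df2 contains a value from df1 (the 10 in column 3), so only its index label is kept.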
Another solution is to use df1['col'].unique() instead of squeeze (thank you, Ted Petrou):
df3 = df2.stack().isin(df1['col'].unique()).unstack()
print(df3)
                1      2      3      4      5      6      7
8302813476  False  False  False  False  False  False  False
8302813477  False  False  False  False  False  False  False
8302813478  False  False   True  False  False  False  False

---
I prefer squeeze, but simply selecting the column of df1 gives the same output:
df3 = df2.stack().isin(df1['col']).unstack()
print(df3)
                1      2      3      4      5      6      7
8302813476  False  False  False  False  False  False  False
8302813477  False  False  False  False  False  False  False
8302813478  False  False   True  False  False  False  False

Solution 2:
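An interesting NumPy alternative:
import numpy as np
import pandas as pd

l1 = df1.values.ravel()
l2 = df2.values.ravel()
# Compare every value of df1 with every value of df2, then restore df2's layout
pd.DataFrame(
    np.equal.outer(l1, l2).any(0).reshape(df2.values.shape),
    df2.index, df2.columns
)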
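To see what the outer comparison does, it helps to look at the intermediate shapes. The arrays below are hypothetical stand-ins for the flattened df1 and df2; the sketch only illustrates how np.equal.outer, any(0) and reshape fit together.

```python
import numpy as np

l1 = np.array([10, 99])              # hypothetical lookup values (flattened df1)
l2 = np.array([1, 2, 3, 4, 5, 10])   # hypothetical flattened 2x3 frame (df2)

# One row per value in l1, one column per value in l2.
outer = np.equal.outer(l1, l2)

# Collapse over l1: True where a value of l2 equals any value of l1.
hits = outer.any(0)

# Restore the original 2x3 layout of the frame.
grid = hits.reshape(2, 3)
print(outer.shape)
print(grid)
```

Note that the intermediate array has len(l1) * len(l2) elements, so this approach can use a lot of memory when both frames are large.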
Or using a set and a list comprehension:
l1 = set(df1.values.ravel().tolist())
l2 = df2.values.ravel().tolist()
# Test each value of df2 for membership in the set built from df1
pd.DataFrame(
    np.array([d in l1 for d in l2]).reshape(df2.values.shape),
    df2.index, df2.columns
)
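A related option, not part of the answers above, is np.isin, which performs the membership test directly on the 2-D array and keeps its shape. The sketch below uses made-up data for df1 and df2 and then reduces the boolean frame to the matching index labels of df2, which is what the question title asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only.
df1 = pd.DataFrame({"col": [10, 99]})
df2 = pd.DataFrame(
    {1: [1, 4, 7], 2: [2, 5, 8], 3: [3, 6, 10]},
    index=[8302813476, 8302813477, 8302813478],
)

# np.isin returns a boolean array with the same shape as its first argument.
mask = pd.DataFrame(
    np.isin(df2.to_numpy(), df1["col"].to_numpy()),
    index=df2.index,
    columns=df2.columns,
)

# Keep the index labels of rows with at least one match.
matching_index = df2.index[mask.any(axis=1).to_numpy()].tolist()
print(matching_index)
```

Unlike the outer comparison, this does not build an intermediate array of size len(df1) * df2.size.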
