
Compare Df1 Column 1 To All Columns In Df2 Returning The Index Of Df2

I'm new to pandas so likely overlooking something but I've been searching and haven't found anything helpful yet. What I'm trying to do is this. I have 2 dataframes. df1 has only

Solution 1:

I think you can use isin to test for matches: stack turns df2 into a single Series, squeeze turns the one-column df1 into a Series, and unstack reshapes the boolean result back to the shape of df2:

df3 = df2.stack().isin(df1.squeeze()).unstack()
print (df3)
                1      2      3      4      5      6      7
8302813476  False  False  False  False  False  False  False
8302813477  False  False  False  False  False  False  False
8302813478  False  False   True  False  False  False  False

Then use any to find the rows that contain at least one True:

a = df3.any(axis=1)
print (a)
8302813476    False
8302813477    False
8302813478     True
dtype: bool

Finally, use boolean indexing to get the matching index values:

print (a[a].index)
Int64Index([8302813478], dtype='int64')
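
The steps above can be put together as one runnable sketch. The frames here are hypothetical stand-ins (the question's real data is not shown in full), and the names mask, row_has_match and matching_index are chosen only for illustration. Note that recent pandas versions print the result as Index rather than Int64Index.

```python
import pandas as pd

# Hypothetical data: df1 has a single column, df2 has several.
df1 = pd.DataFrame({"col": [11, 22, 33]})
df2 = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 33, 9]],
    index=[8302813476, 8302813477, 8302813478],
    columns=[1, 2, 3],
)

# Flatten df2 to one long Series, test membership, restore the 2-D shape.
mask = df2.stack().isin(df1.squeeze()).unstack()

# Keep the index labels of the rows with at least one match.
row_has_match = mask.any(axis=1)
matching_index = row_has_match[row_has_match].index

print(matching_index.tolist())  # [8302813478]
```

Only the last row of this sample df2 contains a value from df1 (33), so only its index label is returned.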

Another solution is to use df1['col'].unique() instead of squeeze (thank you, Ted Petrou):

df3 = df2.stack().isin(df1['col'].unique()).unstack()
print (df3)
                1      2      3      4      5      6      7
8302813476  False  False  False  False  False  False  False
8302813477  False  False  False  False  False  False  False
8302813478  False  False   True  False  False  False  False

---

I prefer squeeze, but simply selecting the column of df1 gives the same output:

df3 = df2.stack().isin(df1['col']).unstack()
print (df3)
                1      2      3      4      5      6      7
8302813476  False  False  False  False  False  False  False
8302813477  False  False  False  False  False  False  False
8302813478  False  False   True  False  False  False  False
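
As a quick check that the three variants agree, here is a small sketch on hypothetical data (the frames and variable names are assumptions, not the asker's data). It also illustrates one caveat worth knowing: squeeze() on a frame with exactly one row and one column returns a scalar, which isin does not accept, whereas squeeze(axis="columns") still returns a Series.

```python
import pandas as pd

# Hypothetical data, including a repeated value in df1.
df1 = pd.DataFrame({"col": [33, 33, 50]})
df2 = pd.DataFrame(
    [[1, 2], [33, 4]],
    index=[8302813476, 8302813478],
    columns=[1, 2],
)

stacked = df2.stack()
via_squeeze = stacked.isin(df1.squeeze()).unstack()
via_unique = stacked.isin(df1["col"].unique()).unstack()
via_column = stacked.isin(df1["col"]).unstack()

# All three produce the same boolean frame.
same = via_squeeze.equals(via_unique) and via_unique.equals(via_column)
print(same)  # True

# Caveat: a 1x1 frame squeezes to a scalar unless the axis is given.
one_cell = pd.DataFrame({"col": [33]})
as_scalar = one_cell.squeeze()               # a scalar, not a Series
as_series = one_cell.squeeze(axis="columns")  # still a Series
```

Selecting the column by name avoids the scalar case entirely, which may be a reason to prefer it when df1 can be very short.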

Solution 2:

As an interesting numpy alternative:

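import numpy as np
import pandas as pd

# flatten both frames to 1-D arrays
l1 = df1.values.ravel()
l2 = df2.values.ravel()

# compare every df1 value with every df2 value, then reduce over the df1 axis
pd.DataFrame(
    np.equal.outer(l1, l2).any(0).reshape(df2.values.shape),
    df2.index, df2.columns
)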

or using a set and a list comprehension:

l1 = set(df1.values.ravel().tolist())
l2 = df2.values.ravel().tolist()

pd.DataFrame(
    np.array([d in l1 for d in l2]).reshape(df2.values.shape),
    df2.index, df2.columns
)
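
To see that both approaches in this answer give the same mask as the isin approach from Solution 1, here is a minimal sketch on hypothetical data (the frames and the names outer_mask, set_mask and isin_mask are assumptions for illustration). One thing to keep in mind is that np.equal.outer builds an intermediate array of len(l1) x len(l2) booleans, so the set-based version may be the safer choice for large frames.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the question's frames.
df1 = pd.DataFrame({"col": [11, 22, 33]})
df2 = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 33, 9]],
    index=[8302813476, 8302813477, 8302813478],
    columns=[1, 2, 3],
)

l1 = df1.to_numpy().ravel()
l2 = df2.to_numpy().ravel()

# Pairwise comparison: rows follow l1, columns follow l2.
outer_mask = pd.DataFrame(
    np.equal.outer(l1, l2).any(axis=0).reshape(df2.shape),
    index=df2.index, columns=df2.columns,
)

# Set lookup: one hash probe per df2 value.
lookup = set(l1.tolist())
set_mask = pd.DataFrame(
    np.array([d in lookup for d in l2.tolist()]).reshape(df2.shape),
    index=df2.index, columns=df2.columns,
)

# Reference result from the stack / isin / unstack approach.
isin_mask = df2.stack().isin(df1["col"]).unstack()

print(np.array_equal(outer_mask.to_numpy(), isin_mask.to_numpy()))  # True
print(np.array_equal(set_mask.to_numpy(), isin_mask.to_numpy()))    # True
```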

