How To Label Rows By Unique Pairs Of Other Rows In Pandas 0.19.2

I have a dataframe df like this, but much larger:

  ID_0 ID_1  location
0    a    b         1
1    a    c         1
2    a    b         0
3    d    c         0
4    a    c         0
5    a    c         1
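To try the answers on this page, the sample frame can be rebuilt by hand. This is only a sketch of the six rows shown in the question; the real data is described as much larger, and the column dtypes here (strings and integers) are an assumption.

```python
import pandas as pd

# Rebuild the six sample rows from the question.
# dtypes are assumed: ID_0/ID_1 as strings, location as integers.
df = pd.DataFrame({
    "ID_0": ["a", "a", "a", "d", "a", "a"],
    "ID_1": ["b", "c", "b", "c", "c", "c"],
    "location": [1, 1, 0, 0, 0, 1],
})

print(df)
```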

Solution 1:

GroupBy.ngroup does exactly this, but it is new in pandas 0.20.2, so it is not available in 0.19.2:

df['group_ID'] = df.groupby(['ID_0', 'ID_1']).ngroup()
print (df)
  ID_0 ID_1  location  group_ID
0    a    b         1         0
1    a    c         1         1
2    a    b         0         0
3    d    c         0         2
4    a    c         0         1
5    a    c         1         1

In older versions such as 0.19.2, read the group codes from the grouper instead:

df['group_ID'] = df.groupby(['ID_0', 'ID_1']).grouper.group_info[0]
print (df)
  ID_0 ID_1  location  group_ID
0    a    b         1         0
1    a    c         1         1
2    a    b         0         0
3    d    c         0         2
4    a    c         0         1
5    a    c         1         1
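If the same code has to run on both old and new pandas, the two approaches can be combined in one function. The following is a rough sketch, not part of the original answer: `label_pairs` is a hypothetical helper name, and the fallback branch relies on the `grouper.group_info` attribute, which may be deprecated or missing in recent pandas releases.

```python
import pandas as pd

df = pd.DataFrame({
    "ID_0": ["a", "a", "a", "d", "a", "a"],
    "ID_1": ["b", "c", "b", "c", "c", "c"],
    "location": [1, 1, 0, 0, 0, 1],
})

def label_pairs(frame, keys):
    """Return one integer label per row, shared by rows with equal key values.

    Hypothetical helper: uses GroupBy.ngroup when the installed pandas
    provides it (0.20.2 and later), otherwise the grouper-based approach.
    """
    grouped = frame.groupby(keys)
    if hasattr(grouped, "ngroup"):
        return grouped.ngroup()
    # Older pandas (e.g. 0.19.2): take the group codes from the grouper.
    codes = grouped.grouper.group_info[0]
    return pd.Series(codes, index=frame.index)

df["group_ID"] = label_pairs(df, ["ID_0", "ID_1"])
print(df)
```

Because groupby sorts the keys by default, labels follow the sorted order of the pairs, not the order of first appearance. For this sample the two orders happen to coincide.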

Solution 2:

This should do the trick without GroupBy.ngroup, which is only available in newer pandas versions (0.20.2 and later):

df['group_ID'] = df.groupby(['ID_0', 'ID_1']).grouper.group_info[0]

  ID_0 ID_1  location  group_ID
0    a    b         1         0
1    a    c         1         1
2    a    b         0         0
3    d    c         0         2
4    a    c         0         1
5    a    c         1         1

Find more information at this SO post: Python Pandas: How can I group by and assign an id to all the items in a group?
