Shuffle Rows By A Column In Pandas

October 07, 2024

I have the following example of dataframe.

   c1 c2
0   1  a
1   2  b
2   3  c
3   4  d
4   5  e

Given a template c1 = [3, 2, 5, 4, 1], I want to cha

Solution 1:

If the values are unique both in the list and in the c1 column, use reindex:

df = df.set_index('c1').reindex(c1).reset_index()
print (df)
   c1 c2
0   3  c
1   2  b
2   5  e
3   4  d
4   1  a
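
As a self-contained sketch of the reindex approach, the snippet below rebuilds the example dataframe and checks the uniqueness precondition before reordering. The names `template` and `reordered` are illustrative and not from the original answer.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({"c1": [1, 2, 3, 4, 5], "c2": list("abcde")})
template = [3, 2, 5, 4, 1]  # illustrative name for the wanted order

# reindex raises on an index with duplicate labels, so check c1 first
if not df["c1"].is_unique:
    raise ValueError("c1 contains duplicates; reindex cannot be used")

# Move c1 to the index, pull rows in template order, restore the column
reordered = df.set_index("c1").reindex(template).reset_index()
print(reordered)
```

Note that a template value missing from df.c1 does not raise; reindex produces a row with NaN in c2 for it.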

A general solution that also works with duplicates in the list and in the column:

c1 = [3, 2, 5, 4, 1, 3, 2, 3]

#create df from list 
list_df = pd.DataFrame({'c1':c1})
print (list_df)
   c1
0   3
1   2
2   5
3   4
4   1
5   3
6   2
7   3

#helper column for counting duplicate values
df['g'] = df.groupby('c1').cumcount()
list_df['g'] = list_df.groupby('c1').cumcount()

#merge together on the shared columns c1 and g, then remove the g helper column
df = list_df.merge(df).drop('g', axis=1)
print (df)
   c1 c2
0   3  c
1   2  b
2   5  e
3   4  d
4   1  a
5   3  c
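
To make the role of the helper column concrete, here is a minimal sketch on hypothetical data in which c1 itself contains a duplicate. The data and the names `template`, `order`, and `result` are assumptions for illustration only. The counter g pairs the n-th occurrence of a value in the list with the n-th occurrence of that value in the dataframe.

```python
import pandas as pd

# Hypothetical data: the value 3 appears twice in c1
df = pd.DataFrame({"c1": [1, 2, 3, 3, 4], "c2": list("abcde")})
template = [3, 2, 3, 1]  # wanted order, also with a repeated 3

order = pd.DataFrame({"c1": template})

# g numbers the occurrences of each c1 value: 0 for the first, 1 for the second, ...
df_g = df.assign(g=df.groupby("c1").cumcount())
order["g"] = order.groupby("c1").cumcount()

# The inner merge on (c1, g) keeps the order of the left frame's keys
result = order.merge(df_g, on=["c1", "g"]).drop(columns="g")
print(result)
```

Because the merge is an inner join, an occurrence in the list with no matching occurrence in the dataframe is dropped from the result.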

Solution 2:

`merge`

You can create a dataframe with the column in the wanted order and then merge. One advantage of this approach is that it gracefully handles duplicates in either df.c1 or the list c1. If duplicates are not wanted, they must be handled before reordering.

d1 = pd.DataFrame({'c1': c1})

d1.merge(df)

   c1 c2
0   3  c
1   2  b
2   5  e
3   4  d
4   1  a
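
One possible way to guard against unwanted duplicates is sketched below. It assumes the example dataframe from the question and a template with a repeated value; the names `template` and `out` are illustrative. The `validate="many_to_one"` argument makes merge raise if df.c1 is not unique, and `how="left"` keeps every entry of the list in its original position.

```python
import pandas as pd

df = pd.DataFrame({"c1": [1, 2, 3, 4, 5], "c2": list("abcde")})
template = [3, 2, 5, 4, 1, 3]  # the list may repeat values

d1 = pd.DataFrame({"c1": template})

# how="left" preserves the order of d1; validate raises MergeError
# if a c1 value occurs more than once in df (the right side)
out = d1.merge(df, on="c1", how="left", validate="many_to_one")
print(out)
```

With a left join, a list value that is absent from df.c1 is kept with NaN in c2 instead of being dropped.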

`searchsorted`

This is less robust, but it works if df.c1 is:

already sorted
a one-to-one mapping (each value appears exactly once)

df.iloc[df.c1.searchsorted(c1)]

   c1 c2
2   3  c
1   2  b
4   5  e
3   4  d
0   1  a
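
The preconditions can be checked explicitly before relying on searchsorted. The sketch below assumes the example dataframe from the question; the names `template`, `positions`, and `result` are illustrative. The membership check is there because searchsorted returns insertion positions, so a list value missing from df.c1 would silently select a neighbouring row or point past the end.

```python
import pandas as pd

df = pd.DataFrame({"c1": [1, 2, 3, 4, 5], "c2": list("abcde")})
template = [3, 2, 5, 4, 1]

# Preconditions: c1 sorted ascending, no repeats, and every wanted value present
assert df["c1"].is_monotonic_increasing
assert df["c1"].is_unique
assert set(template) <= set(df["c1"])

# Position of each template value within the sorted c1 column
positions = df["c1"].searchsorted(template)

# Select rows by position; the original index labels are kept
result = df.iloc[positions]
print(result)
```

Unlike the merge and reindex variants, the result keeps the original index labels; append `.reset_index(drop=True)` if a fresh 0..n-1 index is wanted.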
