
Shuffle Rows By A Column In Pandas

I have the following example of dataframe.

   c1 c2
0   1  a
1   2  b
2   3  c
3   4  d
4   5  e

Given a template c1 = [3, 2, 5, 4, 1], I want to cha

Solution 1:

If the values are unique both in the list and in the c1 column, use reindex:

df = df.set_index('c1').reindex(c1).reset_index()
print (df)
   c1 c2
0   3  c
1   2  b
2   5  e
3   4  d
4   1  a
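
As a self-contained sketch of the reindex approach, the snippet below rebuilds the example frame from the question and reorders it. The names template and out are used here only for illustration. It assumes every template value is present in c1, since reindex would otherwise produce a row of NaN for the missing label.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({'c1': [1, 2, 3, 4, 5], 'c2': list('abcde')})

# Desired order of c1 (called template here for illustration)
template = [3, 2, 5, 4, 1]

# Move c1 into the index, reorder the rows by label, then restore c1 as a column
out = df.set_index('c1').reindex(template).reset_index()
print(out)
```

Assigning to a new name keeps the original df intact, which helps when trying several orderings in one session.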

General solution that also works with duplicates in the list and in the column:

c1 = [3, 2, 5, 4, 1, 3, 2, 3]

#create df from list 
list_df = pd.DataFrame({'c1':c1})
print (list_df)
   c1
0   3
1   2
2   5
3   4
4   1
5   3
6   2
7   3

#helper column for counting duplicate values
df['g'] = df.groupby('c1').cumcount()
list_df['g'] = list_df.groupby('c1').cumcount()

#merge together on c1 and g, then remove the helper g column
df = list_df.merge(df).drop('g', axis=1)
print (df)
   c1 c2
0   3  c
1   2  b
2   5  e
3   4  d
4   1  a
5   3  c
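
To make the role of the helper column g concrete, here is a minimal sketch on a hypothetical frame in which c1 = 3 occurs twice (this data is an assumption, not the frame from the question). The names template and out are illustrative only.

```python
import pandas as pd

# Hypothetical frame: the value 3 appears twice in c1
df = pd.DataFrame({'c1': [1, 2, 3, 3], 'c2': ['a', 'b', 'c', 'd']})
template = [3, 2, 3, 1]

list_df = pd.DataFrame({'c1': template})

# Number each repeat of a value: first occurrence 0, second 1, and so on
df['g'] = df.groupby('c1').cumcount()
list_df['g'] = list_df.groupby('c1').cumcount()

# Inner merge on both c1 and g pairs the k-th repeat in the template
# with the k-th repeat in the frame, keeping the template's order
out = list_df.merge(df, on=['c1', 'g']).drop(columns='g')
print(out)
```

Because the merge is an inner join, a value repeated more often in the template than in the frame is matched only as many times as it occurs in the frame; the extra template rows are dropped.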

Solution 2:

merge

You can create a dataframe with the column in the desired order and then merge. One advantage of this approach is that it gracefully handles duplicates in either df.c1 or the list c1: every matching pair of rows appears in the result. If duplicates are not wanted, they must be removed before reordering.

d1 = pd.DataFrame({'c1': c1})

d1.merge(df)

   c1 c2
0   3  c
1   2  b
2   5  e
3   4  d
4   1  a
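
The effect of duplicates on the merge can be sketched as follows, using a hypothetical frame in which c1 = 2 occurs twice (an assumption made for illustration). Dropping duplicates first is one possible way to handle them, not the only one.

```python
import pandas as pd

# Hypothetical frame: the value 2 appears twice in c1
df = pd.DataFrame({'c1': [1, 2, 2, 3], 'c2': ['a', 'b', 'x', 'c']})
template = [3, 2, 1]

d1 = pd.DataFrame({'c1': template})

# Plain merge: both rows with c1 == 2 are kept, so the result has 4 rows
with_dups = d1.merge(df)

# If only one row per value is wanted, drop duplicates before merging
# (keeps the first occurrence of each c1 value)
no_dups = d1.merge(df.drop_duplicates(subset='c1'))

print(with_dups)
print(no_dups)
```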

searchsorted

This is less robust, but it works if df.c1:

  • is already sorted
  • maps one-to-one onto the values in the list c1

df.iloc[df.c1.searchsorted(c1)]

   c1 c2
2   3  c
1   2  b
4   5  e
3   4  d
0   1  a
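
Since searchsorted does not complain when its preconditions are violated, a cautious version can check them first. The sketch below uses the example frame from the question; the explicit checks and the names template, pos and out are additions for illustration.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({'c1': [1, 2, 3, 4, 5], 'c2': list('abcde')})
template = [3, 2, 5, 4, 1]

# searchsorted only returns meaningful positions under these assumptions
assert df['c1'].is_monotonic_increasing   # c1 is already sorted
assert df['c1'].is_unique                 # one row per value
assert set(template) <= set(df['c1'])     # every template value exists in c1

# Position of each template value within the sorted column
pos = df['c1'].searchsorted(template)

# Select rows by position; the original index labels are kept
out = df.iloc[pos]
print(out)
```

Unlike the merge and reindex approaches, the result keeps the original index labels; append .reset_index(drop=True) if a fresh 0..n-1 index is preferred.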
