Shuffle Rows By A Column In Pandas
I have the following example dataframe:
c1 c2
0 1 a
1 2 b
2 3 c
3 4 d
4 5 e
Given a template c1 = [3, 2, 5, 4, 1], I want to cha
Solution 1:
If the values are unique both in the list and in the c1 column, use reindex:
df = df.set_index('c1').reindex(c1).reset_index()
print (df)
c1 c2
0 3 c
1 2 b
2 5 e
3 4 d
4 1 a
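As a self-contained sketch of the reindex approach, the snippet below rebuilds the example frame from the question and adds a uniqueness check before reordering. The names template, reordered and with_missing are illustrative, and the value 9 is a made-up template entry used only to show how a value missing from c1 behaves.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({'c1': [1, 2, 3, 4, 5], 'c2': list('abcde')})
template = [3, 2, 5, 4, 1]

# reindex needs unique index labels, so check before reordering
assert df['c1'].is_unique

reordered = df.set_index('c1').reindex(template).reset_index()
print(reordered)

# A template value that is absent from df['c1'] (9 here, hypothetical)
# yields a row whose other columns are NaN
with_missing = df.set_index('c1').reindex([3, 9]).reset_index()
print(with_missing)
```

If c1 contains duplicates, reindex is expected to raise an error, which is why the merge-based approaches are the safer choice in that case.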
A general solution that also works with duplicates in the list and in the column:
c1 = [3, 2, 5, 4, 1, 3, 2, 3]
#create df from list
list_df = pd.DataFrame({'c1':c1})
print (list_df)
c1
0 3
1 2
2 5
3 4
4 1
5 3
6 2
7 3
#helper column counting repeated values within each group
df['g'] = df.groupby('c1').cumcount()
list_df['g'] = list_df.groupby('c1').cumcount()
#merge on the shared columns (c1 and g), then remove the helper column g
df = list_df.merge(df).drop('g', axis=1)
print (df)
c1 c2
0 3 c
1 2 b
2 5 e
3 4 d
4 1 a
5 3 c
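The cumcount pairing can be seen more clearly when the frame itself contains a repeated key. The sketch below uses a hypothetical frame in which 3 appears twice in c1 (with c2 values 'c' and 'x'); the frame, the template and the name result are assumptions made for illustration, not data from the question.

```python
import pandas as pd

# Hypothetical frame with a repeated key: 3 appears twice
df = pd.DataFrame({'c1': [1, 2, 3, 3, 4, 5],
                   'c2': ['a', 'b', 'c', 'x', 'd', 'e']})
template = [3, 2, 5, 4, 1, 3]

list_df = pd.DataFrame({'c1': template})

# Number each occurrence of a value: first 3 -> 0, second 3 -> 1, ...
df['g'] = df.groupby('c1').cumcount()
list_df['g'] = list_df.groupby('c1').cumcount()

# Merge on both c1 and g so the n-th occurrence in the template
# is paired with the n-th occurrence in df
result = list_df.merge(df, on=['c1', 'g']).drop(columns='g')
print(result)
```

Because this is an inner merge, a template occurrence with no matching occurrence in df (for example a third 3 here) is dropped from the result.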
Solution 2:
merge
You can create a dataframe with the column specified in the wanted order, then merge.
One advantage of this approach is that it gracefully handles duplicates in either df.c1 or the list c1. If duplicates are not wanted, they must be handled before reordering.
d1 = pd.DataFrame({'c1': c1})
d1.merge(df)
c1 c2
0 3 c
1 2 b
2 5 e
3 4 d
4 1 a
searchsorted
This is less robust but will work if df.c1 is:
- already sorted
- one-to-one mapping
df.iloc[df.c1.searchsorted(c1)]
c1 c2
2 3 c
1 2 b
4 5 e
3 4 d
0 1 a
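Since searchsorted returns insertion points rather than exact matches, it can help to verify the preconditions explicitly. The following is a minimal sketch on the example frame from the question; the checks and the names template, positions and result are additions for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 3, 4, 5], 'c2': list('abcde')})
template = [3, 2, 5, 4, 1]

# Preconditions: c1 is sorted ascending, has no duplicates,
# and every template value actually occurs in c1
assert df['c1'].is_monotonic_increasing and df['c1'].is_unique
assert pd.Series(template).isin(df['c1']).all()

# Positions of each template value within the sorted c1 column
positions = df['c1'].searchsorted(template)

# Positional selection keeps the original index labels
result = df.iloc[positions]
print(result)
```

Without these checks, a template value missing from c1 would either select a neighbouring row or point past the end of the frame.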