Get Column Names For Max Values Over A Certain Row In A Pandas DataFrame
In the DataFrame
import pandas as pd
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]}, index=['row1', 'row2', 'row3'])
print (df)
col1 col2 col3
row1 1
Solution 1:
If there are no duplicates, you can use idxmax, but it returns only the first column holding the max value:
print (df.idxmax(1))
row1 col2
row2 col1
row3 col1
dtype: object
def get_column_name_for_max_values_of(row):
    # .ix was removed from pandas; .loc does the label-based lookup
    return df.idxmax(axis=1).loc[row]
print (get_column_name_for_max_values_of('row2'))
col1
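To make the tie behaviour concrete, here is a minimal sketch that rebuilds the DataFrame from the question and looks at row2, where col1 and col2 share the maximum. The variable name first_max is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]},
    index=['row1', 'row2', 'row3'],
)

# idxmax(axis=1) yields one column label per row:
# the first column that holds the row maximum.
first_max = df.idxmax(axis=1)

# row2 is [2, 2, 1]: col1 and col2 tie, but only col1 is reported.
print(first_max.loc['row2'])
```

So idxmax is enough only when each row has a single maximum.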
But with duplicates, use boolean indexing:
print (df.loc['row2'] == df.loc['row2'].max())
col1 True
col2 True
col3 False
Name: row2, dtype: bool
print (df.loc[:, df.loc['row2'] == df.loc['row2'].max()])
col1 col2
row1 1 3
row2 2 2
row3 3 1
print (df.loc[:, df.loc['row2'] == df.loc['row2'].max()].columns)
Index(['col1', 'col2'], dtype='object')
And the function is:
def get_column_name_for_max_values_of(row):
    # boolean mask over the columns: True where the value equals the row maximum
    return df.loc[:, df.loc[row] == df.loc[row].max()].columns.tolist()
print (get_column_name_for_max_values_of('row2'))
['col1', 'col2']
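The same idea can be written without relying on a global df. The following is a sketch only; the function name columns_with_row_max and its frame parameter are hypothetical, not part of the original answer.

```python
import pandas as pd

def columns_with_row_max(frame, row):
    """Return every column label whose value in `row` equals that row's maximum."""
    values = frame.loc[row]          # the row as a Series indexed by column label
    mask = values == values.max()    # True for each column that ties for the maximum
    return values[mask].index.tolist()

df = pd.DataFrame(
    {'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]},
    index=['row1', 'row2', 'row3'],
)

print(columns_with_row_max(df, 'row2'))
```

Passing the DataFrame in explicitly makes the function reusable with any frame and easier to test.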
Solution 2:
You could also use apply and create a function such as:
import numpy as np

def returncolname(row, colnames):
    # np.argmax returns the position of the first maximum in the row
    return colnames[np.argmax(row.values)]

df.apply(lambda x: returncolname(x, df.columns), axis=1)
Out[62]:
row1 col2
row2 col1
row3 col1
dtype: object
and you can use df.max(axis=1) to extract the row maxima:
df.max(axis=1)
Out[69]:
row1 3
row2 2
row3 3
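As a possible extension, the row maxima from df.max(axis=1) can be compared back against the frame to list every tied column for all rows at once. This is a sketch under the assumption that all columns are numeric; the names row_max, is_max and max_columns are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]},
    index=['row1', 'row2', 'row3'],
)

# One maximum per row.
row_max = df.max(axis=1)

# Compare each row with its own maximum; axis=0 aligns row_max on the row index.
is_max = df.eq(row_max, axis=0)

# For each row, collect the column labels where the mask is True.
max_columns = {
    row: df.columns[mask.to_numpy()].tolist()
    for row, mask in is_max.iterrows()
}

print(max_columns)
```

Unlike idxmax or np.argmax, this keeps both col1 and col2 for row2.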