
Get Unique Values And Their Occurrence Out Of One Dataframe Into A New Dataframe Using Pandas Dataframe

I want to turn my dataframe with non-distinct values underneath each column header into a dataframe with distinct values underneath each column header with next to it their occurre

Solution 1:

The difficult part is keeping the values of the columns aligned within each row. To do this, build a new dataframe from each column's unique values, then use pd.concat to attach one count column per original column, obtained by mapping the unique values through that column's value_counts.

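import pandas as pd

# One row of unique values per column of df; transpose so each becomes a column again.
new_df = (pd.DataFrame([df[c].unique() for c in df], index=df.columns).T
            .dropna(how='all'))

# Map every unique value to its count in the original column and append it as <col>_Count.
df_final = pd.concat([new_df, *[new_df[c].map(df[c].value_counts()).rename(f'{c}_Count') for c in df]], axis=1).reset_index(drop=True)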

Out[1580]:
     A      B    C   D  A_Count  B_Count  C_Count  D_Count
0    0    CEN   T2  56      2.0      4.0      4.0        1
1    2  DECEN   T1  45      1.0      1.0      3.0        1
2    3  ONBEK  NaN  84      2.0      1.0      NaN        1
3  NaN    NaN  NaN  59      NaN      NaN      NaN        1
4  NaN    NaN  NaN  87      NaN      NaN      NaN        1
5  NaN    NaN  NaN  98      NaN      NaN      NaN        1
6  NaN    NaN  NaN  23      NaN      NaN      NaN        1
7  NaN    NaN  NaN  65      NaN      NaN      NaN        1
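
To try this approach end to end, here is a minimal sketch on a small made-up frame. The sample data and the variable names are assumptions for illustration only; they are not the data from the question.

```python
import pandas as pd

# Hypothetical sample data (not the asker's frame): the columns have
# different numbers of distinct values, and column A has a missing cell.
df = pd.DataFrame({
    "A": ["x", "y", "x", None],
    "B": ["p", "p", "p", "q"],
    "D": [1, 2, 3, 4],
})

# Unique values of every column, one per row, then transposed so that each
# original column is a column again; shorter lists are padded with missing values.
new_df = (pd.DataFrame([df[c].unique() for c in df], index=df.columns).T
            .dropna(how="all"))

# For each column, look up how often each unique value occurs in the original
# data. Missing values are not counted by value_counts, so they map to NaN.
count_cols = [new_df[c].map(df[c].value_counts()).rename(f"{c}_Count") for c in df]

df_final = pd.concat([new_df, *count_cols], axis=1).reset_index(drop=True)
print(df_final)
```

Because the count columns are produced by mapping the values that are already sitting in new_df, every count stays on the same row as the value it belongs to.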

If you only need to keep each column aligned with its own count (A with A_Count, B with B_Count, ...), it is simpler: call value_counts on each column, use rename_axis and reset_index to name the resulting value and count columns, then concatenate the pieces and reorder the columns.

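# Target column order: the original columns first, then their *_Count columns.
new_cols = df.columns.tolist() + (df.columns + '_Count').tolist()
# Count each column separately, then place the (value, count) pairs side by side.
new_df = pd.concat([df[col].value_counts(sort=False).rename_axis(col).reset_index(name=f'{col}_Count') for col in df], axis=1).reindex(new_cols, axis=1)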

Out[1501]:
     A      B    C     D  A_Count  B_Count  C_Count  D_Count
0  0.0  ONBEK   T2  56.0      2.0      1.0      4.0        1
1  2.0    CEN   T1  45.0      1.0      4.0      3.0        1
2  3.0  DECEN  NaN  84.0      2.0      1.0      NaN        1
3  NaN    NaN  NaN  59.0      NaN      NaN      NaN        1
4  NaN    NaN  NaN  87.0      NaN      NaN      NaN        1
5  NaN    NaN  NaN  98.0      NaN      NaN      NaN        1
6  NaN    NaN  NaN  23.0      NaN      NaN      NaN        1
7  NaN    NaN  NaN  65.0      NaN      NaN      NaN        1
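
The pairwise variant can be sketched as a self-contained script in the same way. The sample frame and the names pairs and result are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical sample data (not the asker's frame).
df = pd.DataFrame({
    "A": ["x", "y", "x", None],
    "B": ["p", "p", "p", "q"],
    "D": [1, 2, 3, 4],
})

# One small two-column frame per original column: the value and its count.
# rename_axis names the value column, reset_index(name=...) names the count column.
pairs = [
    df[col].value_counts(sort=False).rename_axis(col).reset_index(name=f"{col}_Count")
    for col in df
]

# Put the value columns first and the count columns after them.
new_cols = df.columns.tolist() + (df.columns + "_Count").tolist()
result = pd.concat(pairs, axis=1).reindex(new_cols, axis=1)
print(result)
```

Each pair is built independently, so a value and its count always share a row, but values of different columns on the same row are unrelated. Note also that value_counts skips missing values by default, so a NaN that occurs in the data does not show up as a distinct value here; any NaN in the result is padding for the shorter columns.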
