
How To Do Group By And Take Count Of Unique And Count Of Some Value As Aggregate On Same Column In Python Pandas?

My question is related to my previous question, but it is different, so I am asking a new one. In the previous question, see the answer by @jezrael. df = pd.DataFrame({'col1':[1,1,1],

Solution 1:

Do some preprocessing first: add the result of col4 == 3 as its own column, then aggregate everything in a single agg call.

df.assign(result_col=df.col4.eq(3).astype(int)).groupby(
    ['col1', 'col2']
).agg(dict(col3='size', col4='nunique', result_col='sum'))

           col3  result_col  col4
col1 col2                        
1    4        2           2     1
     6        1           0     1
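Because the DataFrame in the question is cut off, here is a minimal self-contained sketch of the same approach. The sample values are an assumption, chosen only so that the groups match the counts in the output above; the variable name result is also just illustrative.

```python
import pandas as pd

# Hypothetical sample data: the frame in the question is truncated, so these
# values are assumed (two rows in group (1, 4), one row in group (1, 6)).
df = pd.DataFrame({
    'col1': [1, 1, 1],
    'col2': [4, 4, 6],
    'col3': [7, 7, 9],
    'col4': [3, 3, 5],
})

# Flag rows where col4 equals 3 as 0/1, then aggregate per group:
# row count, number of flagged rows, and number of distinct col4 values.
result = (
    df.assign(result_col=df['col4'].eq(3).astype(int))
      .groupby(['col1', 'col2'])
      .agg({'col3': 'size', 'result_col': 'sum', 'col4': 'nunique'})
)
print(result)
```

Summing the 0/1 flag column gives the number of rows where col4 equals 3, so no lambda is needed inside agg.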

Old answers:

g = df.groupby(['col1', 'col2'])
g.agg({'col3':'size','col4': 'nunique'}).assign(
    result_col=g.col4.apply(lambda x: x.eq(3).sum()))

           col3  col4  result_col
col1 col2                        
1    4        2     1           2
     6        1     1           0

Slightly rearranged:

g = df.groupby(['col1', 'col2'])
final_df = g.agg({'col3':'size','col4': 'nunique'})
final_df.insert(1, 'result_col', g.col4.apply(lambda x: x.eq(3).sum()))
final_df

           col3  result_col  col4
col1 col2                        
1    4        2           2     1
     6        1           0     1

Solution 2:

I think you need agg with a dict that maps col4 to a list of functions.

If you need to count the values equal to 3, the simplest way is to sum the True values of x == 3:

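# Wrap the chain in parentheses so it can span several lines
df1 = (df.groupby(['col1','col2'])
         .agg({'col3':'size','col4': ['nunique', lambda x: (x == 3).sum()]}))
# Give the lambda's column a readable name
df1 = df1.rename(columns={'<lambda>':'count_3'})
# Flatten the two-level column index into names like col4_nunique
df1.columns = ['{}_{}'.format(x[0], x[1]) for x in df1.columns]
print(df1)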
           col4_nunique  col4_count_3  col3_size
col1 col2                                       
1    4                1             2          2
     6                1             0          1
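On pandas 0.25 or later, named aggregation may be a shorter way to get the same flat column names without the rename and flatten steps. This is an alternative technique, not part of the answer above, and the sample data is again an assumption because the frame in the question is truncated.

```python
import pandas as pd

# Hypothetical sample data (assumed; the question's frame is truncated).
df = pd.DataFrame({
    'col1': [1, 1, 1],
    'col2': [4, 4, 6],
    'col3': [7, 7, 9],
    'col4': [3, 3, 5],
})

# Named aggregation: each keyword is an output column name, and each value is
# a (source column, aggregation function) pair.
df1 = df.groupby(['col1', 'col2']).agg(
    col4_nunique=('col4', 'nunique'),
    col4_count_3=('col4', lambda x: (x == 3).sum()),
    col3_size=('col3', 'size'),
)
print(df1)
```

The output column names come directly from the keyword names, so the '<lambda>' label never appears in the result.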
