
Python Cumsum Increment Every Time New Value Is Encountered

Coming from R, the code would be

x <- data.frame(vals = c(100,100,100,100,100,100,200,200,200,200,200,200,200,300,300,300,300,300))
x$state <- cumsum(c(1, diff(x$vals)

Solution 1:

Using diff and cumsum, as in your R example:

df['state'] = (df['vals'].diff() != 0).cumsum()

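This works because True has the integer value 1, so the cumulative sum goes up by one at every row where the value differs from the previous row. The diff of the first row is NaN, which also compares as not equal to 0, so the counter starts at 1.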
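To make the mechanics visible, here is a minimal sketch on a shortened version of the data from the question. The frame name df and the shorter list of values are assumptions for illustration only. The second variant, which compares each value with the shifted column, is not part of the original answer; it is included as an option that should also work for non-numeric columns, where diff is not available.

```python
import pandas as pd

# Shortened, made-up version of the question's data (assumption for illustration)
df = pd.DataFrame({"vals": [100, 100, 100, 200, 200, 300, 300, 300]})

# diff() is NaN on the first row and non-zero wherever the value changes
changed = df["vals"].diff() != 0

# True counts as 1, so the running total increments at every change
df["state"] = changed.cumsum()

# Alternative that avoids arithmetic: compare each value with the previous one
df["state_alt"] = df["vals"].ne(df["vals"].shift()).cumsum()

print(df)
```

Both columns should come out as 1, 1, 1, 2, 2, 3, 3, 3 for this input.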

Bonus question

df_grouped = df_all.groupby("filename")
df_all["state"] = (df_grouped['Fit'].diff() != 0).cumsum()

I think you misunderstand what groupby does. All groupby does is create groups based on a criterion (filename in this instance). You then need to add another operation to say what should happen with each group. Common operations are mean and sum, or more advanced ones such as apply and transform. The pandas documentation on groupby covers these in more detail.
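As a rough illustration of the difference between these operations, the sketch below uses a tiny made-up frame (the values are assumptions, only the column names filename and Fit come from the question). An aggregation such as mean returns one row per group, while transform returns a result aligned with the original rows.

```python
import pandas as pd

# Made-up data; only the column names are taken from the question
df_all = pd.DataFrame({
    "filename": ["a", "a", "b"],
    "Fit": [1.0, 3.0, 5.0],
})

grouped = df_all.groupby("filename")

# Aggregation: one value per group, indexed by filename
means = grouped["Fit"].mean()

# Transform: one value per original row, so it can be assigned back as a column
df_all["centered"] = df_all["Fit"] - grouped["Fit"].transform("mean")

print(means)
print(df_all)
```

Because transform keeps the original index and length, it is usually the one to reach for when the result has to go back into the same DataFrame.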

If you can explain in more detail what you want to achieve with the groupby, I can help you find the correct method. If you want to perform the above operation per filename, you probably need something like this:

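def get_state(group):
    # group is the 'Fit' Series of a single filename
    return (group.diff() != 0).cumsum()

# transform returns a result aligned with the original rows
df_all['state'] = df_all.groupby('filename')['Fit'].transform(get_state)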
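A small worked example may help to check that this does what you expect. The data below is made up (only the names df_all, filename and Fit come from the question), and the comparison column state_global is my own addition: it applies the cumsum from the bonus question across the whole frame, which, as far as I can tell, keeps counting from one file into the next instead of restarting.

```python
import pandas as pd

# Made-up data with two files; column names follow the question
df_all = pd.DataFrame({
    "filename": ["a", "a", "a", "a", "b", "b", "b"],
    "Fit": [1, 1, 2, 2, 2, 3, 3],
})

def get_state(group):
    # Increment whenever the value differs from the previous row in this group
    return (group.diff() != 0).cumsum()

# Per-file counter: restarts at 1 for every filename
df_all["state"] = df_all.groupby("filename")["Fit"].transform(get_state)

# For comparison: per-group diff, but one cumsum over the whole frame,
# so the counter does not restart when the filename changes
df_all["state_global"] = (df_all.groupby("filename")["Fit"].diff() != 0).cumsum()

print(df_all)
```

Here state restarts at 1 for file b, while state_global carries on from the last value reached in file a.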
