Python Cumsum Increment Every Time New Value Is Encountered
Coming from R, the code would be:
x <- data.frame(vals = c(100,100,100,100,100,100,200,200,200,200,200,200,200,300,300,300,300,300))
x$state <- cumsum(c(1, diff(x$vals)
Solution 1:
Using diff and cumsum, as in your R example:
df['state'] = (df['vals'].diff()!= 0).cumsum()
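This uses the fact that True has integer value 1, so every change in value adds 1 to the running total. The first row's diff is NaN, and NaN != 0 evaluates to True, so the count starts at 1.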
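As a minimal sketch of the one-liner in action, the snippet below rebuilds the vals column from the question and prints the intermediate boolean mask. The name changed is only illustrative, and the shift/ne variant at the end is an alternative not taken from the answer; it may be handy when the column is not numeric.

```python
import pandas as pd

# Same values as the R example: six 100s, seven 200s, five 300s
df = pd.DataFrame({"vals": [100] * 6 + [200] * 7 + [300] * 5})

# True wherever the value differs from the previous row.
# The first row's diff is NaN, and NaN != 0 is True.
changed = df["vals"].diff() != 0

# Summing the booleans gives a counter that steps up at each change
df["state"] = changed.cumsum()

# Alternative (an assumption, not from the answer): compare with the
# shifted column, which also works for strings and other non-numeric data
alt_state = df["vals"].ne(df["vals"].shift()).cumsum()

print(df.drop_duplicates("vals"))
```

Each distinct run of values ends up with its own state number, starting from 1.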
Bonus question
df_grouped = df_all.groupby("filename")
df_all["state"] = (df_grouped['Fit'].diff() != 0).cumsum()
I think you misunderstand what groupby does. All groupby does is create groups based on a criterion (filename in this instance). You then need to add another operation to say what should happen with each group.
Common operations are mean and sum, or more advanced ones such as apply and transform.
You can find more information here or here
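To illustrate the difference between an aggregation and transform, here is a small sketch. The data is made up for the example; only the column names filename and Fit come from the question.

```python
import pandas as pd

# Hypothetical data; only the column names match the question
df_all = pd.DataFrame({
    "filename": ["a", "a", "a", "b", "b"],
    "Fit": [1, 1, 2, 2, 2],
})

grouped = df_all.groupby("filename")["Fit"]

# Aggregation: one value per group, indexed by filename
per_group_mean = grouped.mean()

# transform: one value per original row, aligned with df_all's index
per_row_mean = grouped.transform("mean")

print(per_group_mean)
print(per_row_mean)
```

Because transform keeps the original index, its result can be assigned straight back to a column of df_all.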
If you can explain in more detail what you want to achieve with the groupby, I can help you find the correct method. If you want to perform the above operation per filename, you probably need something like this:
def get_state(group):
    # Within one file, count how many times the value has changed so far
    return (group.diff() != 0).cumsum()
df_all['state'] = df_all.groupby('filename')['Fit'].transform(get_state)
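A quick way to see what this does is to run it next to the attempt from the bonus question. This is a sketch on invented data (two files, three rows each); the column names state_global and state_per_file are just labels for the comparison.

```python
import pandas as pd

# Hypothetical data: two files with three rows each
df_all = pd.DataFrame({
    "filename": ["a", "a", "a", "b", "b", "b"],
    "Fit": [1, 1, 2, 2, 2, 3],
})

def get_state(group):
    # Within one file, count how many times the value has changed so far
    return (group.diff() != 0).cumsum()

# Attempt from the question: diff is taken per file, but cumsum runs
# over the whole frame, so the counter keeps growing across files
df_grouped = df_all.groupby("filename")
df_all["state_global"] = (df_grouped["Fit"].diff() != 0).cumsum()

# Per-file version: both diff and cumsum restart for each filename
df_all["state_per_file"] = df_all.groupby("filename")["Fit"].transform(get_state)

print(df_all)
```

With transform, the state restarts at 1 for every filename, whereas the global cumsum carries on counting from the previous file.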