
Counting Changes Of Value In Each Column In A Data Frame In Pandas

Is there any neat way to count the number of changes of value in each column in a data frame in pandas? I don't want to have to loop myself over each column, e.g.: import pandas as

Solution 1:

If the values are numeric you could take the differences between adjacent rows and test if the difference is non-zero. Then take a sum down each column to count the number of changes in value:

In [48]: (frame.diff(axis=0) != 0).sum(axis=0)
Out[48]: 
time    3
X1      3
X2      2
dtype: int64
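As a self-contained sketch of the numeric approach: the frame from the question is not shown here, so the data below is a hypothetical stand-in, and `raw_counts` and `change_counts` are illustrative names. Note that `diff` leaves NaN in the first row, and `NaN != 0` is True, so the raw count includes the first row of every column.

```python
import pandas as pd

# Hypothetical sample data (an assumption; not the frame from the question).
frame = pd.DataFrame({
    "time": [1234567000, 1234567005, 1234567009],
    "X1": [96.32, 96.01, 96.05],
    "X2": [23.88, 23.96, 23.96],
})

# The first row of diff() is NaN, and NaN != 0 is True,
# so this count includes the first row of each column.
raw_counts = (frame.diff(axis=0) != 0).sum(axis=0)

# Skip the first row of the diff to count only changes between adjacent rows.
change_counts = (frame.diff(axis=0).iloc[1:] != 0).sum(axis=0)

print(raw_counts)
print(change_counts)
```

With this sample data, `raw_counts` is 3, 3, 2 and `change_counts` is 2, 2, 1 for `time`, `X1`, and `X2` respectively.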

If the values are not necessarily numeric, a more general way is to compare the frame against a copy of itself shifted down by one row. This is similar to the code you posted, except that the operation is done on the entire DataFrame at once instead of column by column:

In [50]: (frame != frame.shift(axis=0)).sum(axis=0)
Out[50]: 
time    3
X1      3
X2      2
dtype: int64
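The shifted comparison can be sketched on a frame that mixes string and numeric columns. The column names and values here are hypothetical, chosen only to show that the comparison does not require numeric data. This sketch drops the first row of the comparison, since that row has no predecessor to differ from.

```python
import pandas as pd

# Hypothetical mixed-type data (an assumption for illustration).
frame = pd.DataFrame({
    "state": ["on", "on", "off", "off", "on"],
    "level": [1.0, 1.0, 2.0, 2.0, 2.0],
})

# Compare each row with the row above it, for all columns at once.
changed = frame != frame.shift(axis=0)

# Row 0 has nothing above it, so leave it out of the count.
change_counts = changed.iloc[1:].sum(axis=0)

print(change_counts)
```

Here `state` changes twice (on to off, then off to on) and `level` changes once. One thing to keep in mind: `NaN != NaN` is True, so consecutive missing values in the data are also counted as changes by this comparison.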

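The numeric version is faster; the shifted version is more robust, since it also works on non-numeric columns. Both versions count the first row as a change, because it has no previous row and is compared against NaN. Subtract 1 from each count if you only want changes between consecutive rows.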
