Update Pandas.DataFrame Within A Group After .groupby()

I have the following pandas.DataFrame: time offset ts op 0.000000 2015-1

Solution 1:

I'd use xs (cross-section) to do this:

In [11]: df1.xs("Compress", level="op")
Out[11]:
                                     time
offset   ts
0.000000 2015-10-27 18:31:40.318  253.649
4.960683 2015-10-27 18:36:37.959  280.747

In [12]: df1.xs("BuildIndex", level="op")
Out[12]:
                                     time
offset   ts
0.000000 2015-10-27 18:31:40.318  282.604
4.960683 2015-10-27 18:36:37.959  312.249

In [13]: df1.xs("BuildIndex", level="op") - df1.xs("Compress", level="op")
Out[13]:
                                    time
offset   ts
0.000000 2015-10-27 18:31:40.318  28.955
4.960683 2015-10-27 18:36:37.959  31.502

The subtraction aligns on the index labels that remain after taking the cross-section (in this case offset and ts), so there is no need to group.
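For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the values printed in the outputs above; the original df1 was not posted in full, so the construction (and the row order) is an assumption.

```python
import pandas as pd

# Assumed reconstruction of df1 from the values shown in Out[11] and Out[12].
rows = [
    (0.000000, pd.Timestamp("2015-10-27 18:31:40.318"), "Compress", 253.649),
    (0.000000, pd.Timestamp("2015-10-27 18:31:40.318"), "BuildIndex", 282.604),
    (4.960683, pd.Timestamp("2015-10-27 18:36:37.959"), "Compress", 280.747),
    (4.960683, pd.Timestamp("2015-10-27 18:36:37.959"), "BuildIndex", 312.249),
]
index = pd.MultiIndex.from_tuples(
    [row[:3] for row in rows], names=["offset", "ts", "op"]
)
df1 = pd.DataFrame({"time": [row[3] for row in rows]}, index=index)

# xs() drops the "op" level, leaving both pieces indexed by (offset, ts).
build = df1.xs("BuildIndex", level="op")
compress = df1.xs("Compress", level="op")

# The subtraction pairs rows by their (offset, ts) labels, not by position.
diff = build - compress
print(diff)
```

Because the pairing is by label, the result would be the same even if the two cross-sections came back in a different row order.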


Solution 2:

Thanks a lot! .xs() solves the problem. Here's how I use it:

diff = df.xs("BuildIndex", level="op") - df.xs("Compress", level="op")
diff['op'] = 'BuildIndex'
diff = diff.reset_index().groupby(['offset', 'ts', 'op']).agg(lambda x: x)
df.update(diff)

The code looks quite ugly, though. Can someone suggest a more elegant solution?
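One possible tidy-up, offered as a sketch rather than a definitive answer: instead of the reset_index / groupby / agg round trip, the op column can be appended to the existing index with set_index(..., append=True), which gives diff the same three index levels as df so that update() can match rows. The sample frame below is an assumption rebuilt from the values in Solution 1.

```python
import pandas as pd

# Assumed sample data, rebuilt from the outputs shown in Solution 1.
rows = [
    (0.000000, pd.Timestamp("2015-10-27 18:31:40.318"), "Compress", 253.649),
    (0.000000, pd.Timestamp("2015-10-27 18:31:40.318"), "BuildIndex", 282.604),
    (4.960683, pd.Timestamp("2015-10-27 18:36:37.959"), "Compress", 280.747),
    (4.960683, pd.Timestamp("2015-10-27 18:36:37.959"), "BuildIndex", 312.249),
]
index = pd.MultiIndex.from_tuples(
    [row[:3] for row in rows], names=["offset", "ts", "op"]
)
df = pd.DataFrame({"time": [row[3] for row in rows]}, index=index)

# Difference per (offset, ts), as in the snippet above.
diff = df.xs("BuildIndex", level="op") - df.xs("Compress", level="op")

# Put "op" back as the innermost index level so the labels match df's index.
diff["op"] = "BuildIndex"
diff = diff.set_index("op", append=True)

# update() overwrites only the rows whose labels appear in diff, in place.
df.update(diff)
print(df)
```

After this, the BuildIndex rows hold the differences and the Compress rows are left as they were.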


Solution 3:

Most elegant solution found! Just three lines of code:

df = df.unstack("op")
df['time', 'BuildIndex'] -= df['time', 'Compress']
df = df.stack()

(Here's the Discussion)
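A runnable version of the unstack / stack approach, again on sample data assumed from the outputs in Solution 1. The intermediate name wide is only there for readability and is not from the original post.

```python
import pandas as pd

# Assumed sample data, rebuilt from the outputs shown in Solution 1.
rows = [
    (0.000000, pd.Timestamp("2015-10-27 18:31:40.318"), "Compress", 253.649),
    (0.000000, pd.Timestamp("2015-10-27 18:31:40.318"), "BuildIndex", 282.604),
    (4.960683, pd.Timestamp("2015-10-27 18:36:37.959"), "Compress", 280.747),
    (4.960683, pd.Timestamp("2015-10-27 18:36:37.959"), "BuildIndex", 312.249),
]
index = pd.MultiIndex.from_tuples(
    [row[:3] for row in rows], names=["offset", "ts", "op"]
)
df = pd.DataFrame({"time": [row[3] for row in rows]}, index=index)

# Move "op" from the row index into the columns: one column per operation.
wide = df.unstack("op")

# Columns are now a two-level MultiIndex such as ("time", "BuildIndex").
wide[("time", "BuildIndex")] -= wide[("time", "Compress")]

# Move "op" back into the row index.
result = wide.stack()
print(result)
```

Note that the rows may come back in a different order than in the original frame (unstack sorts the op labels), so compare by label rather than by position.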