
How To Delete Row Based On Row Above? Python Pandas

I have a dataset which looks like this:

df = pd.DataFrame({'a': [1, 1, 1, 2, 3, 3, 4], 'b': [1, np.nan, np.nan, 2, 3, np.nan, 4]})

I'm looking to delete all rows which have np.nan in

Solution 1:

You want to find all the rows whose previous row contains a np.nan. shift moves every row down by one, so each row lines up with the values from the row above it:

df.shift().isnull()

       a      b
0   True   True
1  False  False
2  False   True
3  False   True
4  False  False
5  False  False
6  False   True

Then you want to know whether anything in that shifted row was NaN, so reduce this to a single boolean mask:

df.shift().isnull().any(axis=1)

0     True
1    False
2     True
3     True
4    False
5    False
6     True
dtype: bool

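Then just drop the rows where the mask is True. df.drop expects index labels rather than a boolean mask, so filter with the inverted mask instead:

df[~df.shift().isnull().any(axis=1)]

   a   b
1  1 NaN
4  3   3
5  3 NaN

Row 0 is dropped as well, because shift fills the first row with NaN.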
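To check the shift-based mask end to end, here is a minimal self-contained sketch. It rebuilds the example frame and filters with boolean indexing rather than drop. The names prev_has_nan, kept, and kept_with_first are illustrative, and resetting the first mask entry is an assumption that row 0, which has no row above it, should be kept.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 3, 3, 4],
                   'b': [1, np.nan, np.nan, 2, 3, np.nan, 4]})

# True where the row above holds at least one NaN.
# Row 0 is also True, because shift() fills the first row with NaN.
prev_has_nan = df.shift().isnull().any(axis=1)

# Keep only the rows that are not flagged.
kept = df[~prev_has_nan]

# Assumption: row 0 has no row above it, so it should be kept.
mask = prev_has_nan.copy()
mask.iloc[0] = False
kept_with_first = df[~mask]

print(kept.index.tolist())             # [1, 4, 5]
print(kept_with_first.index.tolist())  # [0, 1, 4, 5]
```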

Solution 2:

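Yes, you can create a mask which will remove unwanted rows by combining df.notnull and shift. Note that shift(-1) compares each row with the row below it and leaves a missing value in the last position, so pass fill_value (pandas 0.24 or later) to keep the mask boolean:

notnull = df.notnull().all(axis=1)
df = df[notnull.shift(-1, fill_value=True)]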
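A short sketch of why the shifted mask needs care, assuming pandas 0.24 or later for the fill_value argument. The variable names are illustrative, and fill_value=True is an assumption that the last row, which has no row below it, should be kept.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 3, 3, 4],
                   'b': [1, np.nan, np.nan, 2, 3, np.nan, 4]})

# True for rows without any NaN.
notnull = df.notnull().all(axis=1)

# A plain shift(-1) leaves a missing value in the last position,
# so the result is no longer a clean boolean mask.
last_is_missing = notnull.shift(-1).isna().iloc[-1]

# fill_value keeps the mask free of missing values.
next_row_complete = notnull.shift(-1, fill_value=True)
result = df[next_row_complete]

print(result.index.tolist())  # [2, 3, 5, 6]
```

Here each row is kept when the row below it is complete, which is the opposite direction from a plain shift().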

Solution 3:

Test whether each value is non-null with notnull:

In [11]: df.notnull()
Out[11]:
      a      b
0  True   True
1  True  False
2  True  False
3  True   True
4  True   True
5  True  False
6  True   True

In [12]: df.notnull().all(1)
Out[12]:
0     True
1    False
2    False
3     True
4     True
5    False
6     True
dtype: bool

In [13]: df[df.notnull().all(1)]
Out[13]:
   a  b
0  1  1
3  2  2
4  3  3
6  4  4
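For reference, filtering on notnull().all(axis=1) selects the same rows as DataFrame.dropna with its default settings. A minimal sketch on the example frame, with illustrative variable names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 3, 3, 4],
                   'b': [1, np.nan, np.nan, 2, 3, np.nan, 4]})

# Rows in which every column holds a value.
complete_rows = df[df.notnull().all(axis=1)]

# dropna() with default arguments removes rows containing any NaN.
same_as_dropna = complete_rows.equals(df.dropna())

print(complete_rows.index.tolist())  # [0, 3, 4, 6]
print(same_as_dropna)                # True
```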

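You can shift the mask down by one to get whether the row above had a NaN. The shift leaves a NaN in the first position, which astype(bool) turns into True, so the first row is kept: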

In [14]: df.notnull().all(1).shift().astype(bool)
Out[14]:
0     True
1     True
2    False
3    False
4     True
5     True
6    False
dtype: bool

In [15]: df[df.notnull().all(1).shift().astype(bool)]
Out[15]:
   a   b
0  1   1
1  1 NaN
4  3   3
5  3 NaN

Note: You can shift upwards with shift(-1).
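Both directions can be wrapped in one small function. The sketch below is only an illustration: drop_rows_adjacent_to_nan is a hypothetical helper name, and fill_value=False is an assumption that edge rows, which have no neighbour in the shifted direction, should be kept.

```python
import numpy as np
import pandas as pd


def drop_rows_adjacent_to_nan(frame, periods=1):
    """Hypothetical helper: drop rows whose neighbour contains a NaN.

    periods=1 checks the row above, periods=-1 checks the row below.
    """
    has_nan = frame.isnull().any(axis=1)
    # Edge rows have no neighbour; fill_value=False keeps them (an assumption).
    neighbour_has_nan = has_nan.shift(periods, fill_value=False)
    return frame[~neighbour_has_nan]


df = pd.DataFrame({'a': [1, 1, 1, 2, 3, 3, 4],
                   'b': [1, np.nan, np.nan, 2, 3, np.nan, 4]})

above = drop_rows_adjacent_to_nan(df, periods=1)
below = drop_rows_adjacent_to_nan(df, periods=-1)

print(above.index.tolist())  # [0, 1, 4, 5]
print(below.index.tolist())  # [2, 3, 5, 6]
```

With periods=1 the result matches Out[15] above.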

