Validating Dataframe Column Data
I have the pseudocode below, which I need to write using pandas: if group_min_size && group_max_size if group_min_size == 0 && group_max_size > 0 if
Solution 1:
Work through the conditions step by step. Begin by creating your boolean Series:
min_equal_0 = df['group_min_size'] == 0
min_above_0 = df['group_min_size'] > 0
min_above_equal_2 = df['group_min_size'] >= 2
min_below_2 = df['group_min_size'] < 2
max_equal_0 = df['group_max_size'] == 0
max_above_0 = df['group_max_size'] > 0
max_above_equal_2 = df['group_max_size'] >= 2
max_below_2 = df['group_max_size'] < 2
Now we can look at creating our masks according to the pseudo-code:
first_mask = ~(min_equal_0 & max_above_0 & (max_below_2 | max_above_equal_2))
second_mask = ~(max_equal_0 & min_above_0 & (min_below_2 | min_above_equal_2))
If we combine the two:
>> first_mask & second_mask
0    False
1     True
2    False
3    False
4     True
5     True
6     True
7     True
8     True
dtype: bool
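The steps so far can be tried end to end with a minimal sketch like the one below. The DataFrame contents are hypothetical (the data from the question is not shown), and the name combined is introduced only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the DataFrame from the question is not shown.
df = pd.DataFrame({
    "group_min_size": [0, 1, 0, 3, 2],
    "group_max_size": [5, 4, 1, 0, 2],
})

# Element-wise comparisons, each producing a boolean Series.
min_equal_0 = df["group_min_size"] == 0
min_above_0 = df["group_min_size"] > 0
min_above_equal_2 = df["group_min_size"] >= 2
min_below_2 = df["group_min_size"] < 2
max_equal_0 = df["group_max_size"] == 0
max_above_0 = df["group_max_size"] > 0
max_above_equal_2 = df["group_max_size"] >= 2
max_below_2 = df["group_max_size"] < 2

# Same masks as in the answer: a row fails when one bound is 0
# while the other bound is positive.
first_mask = ~(min_equal_0 & max_above_0 & (max_below_2 | max_above_equal_2))
second_mask = ~(max_equal_0 & min_above_0 & (min_below_2 | min_above_equal_2))

combined = first_mask & second_mask
print(combined.tolist())  # [False, True, False, False, True]
```

Note that for any non-null number, (max_below_2 | max_above_equal_2) is always True, so in practice that term only distinguishes NaN from real values.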
If you want to treat NaN as False, just add not-null checks for both columns:
min_is_not_null = df['group_min_size'].notnull()
max_is_not_null = df['group_max_size'].notnull()
>> min_is_not_null & max_is_not_null & first_mask & second_mask
0    False
1     True
2    False
3    False
4    False
5     True
6     True
7     True
8     True
dtype: bool
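The not-null checks matter because comparisons against NaN evaluate to False, so the negated masks come out True for rows with missing values. As a rough sketch, the whole check could be wrapped in a helper and used to filter rows; the function name valid_group_sizes and the sample data are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd


def valid_group_sizes(df: pd.DataFrame) -> pd.Series:
    """Return True for rows that pass both checks; rows with NaN count as invalid."""
    gmin = df["group_min_size"]
    gmax = df["group_max_size"]
    not_null = gmin.notnull() & gmax.notnull()
    first_mask = ~((gmin == 0) & (gmax > 0) & ((gmax < 2) | (gmax >= 2)))
    second_mask = ~((gmax == 0) & (gmin > 0) & ((gmin < 2) | (gmin >= 2)))
    return not_null & first_mask & second_mask


# Hypothetical sample data with missing values in each column.
df = pd.DataFrame({
    "group_min_size": [0, 1, np.nan, 3, 2],
    "group_max_size": [5, 4, 2, 0, np.nan],
})

valid = valid_group_sizes(df)
print(valid.tolist())  # [False, True, False, False, False]
print(df[valid])       # boolean indexing keeps only the valid rows
```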