Python Pandas drop

Time:06-03

I'm building a script with Python and pandas, and I'm trying to delete rows from a DataFrame. I want to delete the rows that have empty values in both of two specific columns. If one of those two columns is filled in but not the other, the row is kept. I wrote the code below, and it works. But I'm a beginner, and I'm sure it can be simplified. I'm sure I don't need the "for" loop in my function; I think there is a built-in method for this. I read the documentation online but found nothing. I tried my best, but I need help. Also, for various reasons, I don't want to use NumPy.

Here is my code:

import pandas as pnd


def drop_empty_line(df):
    a = df[(df["B"].isna()) & (df["C"].isna())].index
    for i in a:
        df = df.drop([i])
    return df
    
    
def main():
    df = pnd.DataFrame({
            "A": [5, 0, 4, 6, 5], 
            "B": [pnd.NA, 4, pnd.NA, pnd.NA, 5], 
            "C": [pnd.NA, pnd.NA, 9, pnd.NA, 8], 
            "D": [5, 3, 8, 5, 2], 
            "E": [pnd.NA, 4, 2, 0, 3]
            })
    
    print(drop_empty_line(df))
    
    
if __name__ == '__main__':
    main()

Thank you for your help.

CodePudding user response:

You indeed don't need a loop. You don't even need a custom function; pandas already provides dropna:

df = df.dropna(subset=['B', 'C'], how='all')
# or in place:
# df.dropna(subset=['B', 'C'], how='all', inplace=True)

Output:

   A     B     C  D  E
1  0     4  <NA>  3  4
2  4  <NA>     9  8  2
4  5     5     8  2  3
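
To see why how='all' is the right choice here, the following is a minimal sketch that reuses the DataFrame from the question and compares it with how='any' and with an equivalent boolean mask. The variable names (both_missing_dropped, any_missing_dropped, masked) are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [5, 0, 4, 6, 5],
    "B": [pd.NA, 4, pd.NA, pd.NA, 5],
    "C": [pd.NA, pd.NA, 9, pd.NA, 8],
    "D": [5, 3, 8, 5, 2],
    "E": [pd.NA, 4, 2, 0, 3],
})

# how="all": drop a row only when every column listed in `subset` is missing.
both_missing_dropped = df.dropna(subset=["B", "C"], how="all")

# how="any": drop a row as soon as one column listed in `subset` is missing.
any_missing_dropped = df.dropna(subset=["B", "C"], how="any")

# Equivalent boolean mask for the how="all" case, still without a loop:
# keep the rows where B and C are NOT both missing.
mask = df["B"].isna() & df["C"].isna()
masked = df[~mask]

print(both_missing_dropped.index.tolist())  # [1, 2, 4]
print(any_missing_dropped.index.tolist())   # [4]
print(masked.index.tolist())                # [1, 2, 4]
```

With how='any', rows 1 and 2 would also be removed, which is not what the question asks for. The boolean mask gives the same rows as how='all' and can be handy if the condition later becomes more complex than a missing-value check.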