Delete rows in CSV based on specific value

I want to delete specific rows in my CSV with Python. The CSV has multiple rows and columns.

import numpy as np

np.df2 = pd.read_csv('C:/Users/.../Data.csv', delimiter='\t')

np.df_2=np.df2[['Colum.X', 'Colum.Y']]

Python should open Data.csv and then delete every complete row where the value of Colum.X > 5 or the value of Colum.Y > 20.

CodePudding user response：

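You can do this with pandas alone; NumPy isn't needed. I assume the columns in your CSV are actually named 'Colum.X' and 'Colum.Y'. Deleting the rows where Colum.X > 5 or Colum.Y > 20 is the same as keeping the rows where Colum.X <= 5 and Colum.Y <= 20.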

import pandas as pd

df = pd.read_csv('C:/Users/.../Data.csv', delimiter='\t')
df = df.loc[df['Colum.X'] <= 5] # Keep only the rows where Colum.X <= 5
df = df.loc[df['Colum.Y'] <= 20] # Keep only the rows where Colum.Y <= 20
df.to_csv('C:/Users/.../Data.csv', index=False) # Write back as comma-separated; pass sep='\t' to keep tabs
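
The request can also be written literally as "delete the rows where either condition holds". Below is a minimal sketch of that on in-memory data. The sample values, the extra column Other and the helper name drop_large_rows are made up for illustration; only the column names 'Colum.X' and 'Colum.Y' come from the question.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample standing in for Data.csv.
SAMPLE = "Colum.X\tColum.Y\tOther\n1\t10\ta\n6\t10\tb\n2\t25\tc\n5\t20\td\n"


def drop_large_rows(df, x_max=5, y_max=20):
    """Return df without the rows where Colum.X > x_max or Colum.Y > y_max."""
    to_delete = (df["Colum.X"] > x_max) | (df["Colum.Y"] > y_max)
    return df[~to_delete]  # ~ inverts the mask: keep everything not marked


df = pd.read_csv(io.StringIO(SAMPLE), sep="\t")
kept = drop_large_rows(df)
print(kept)  # only the rows labelled a and d remain
```

One difference from the <= filters: a comparison with a missing value (NaN) is always False, so this version keeps rows where Colum.X or Colum.Y is empty, while the <= version drops them.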

CodePudding user response：

Not entirely sure what you're doing with np.df2, but the following will work:

import pandas as pd

df = pd.read_csv('C:/Users/.../Data.csv', delimiter='\t')

df2 = df[(df['X'] <= 5) & (df['Y'] <= 20)]

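This assumes the columns are named X and Y; otherwise use your own column names (e.g. 'Colum.X' and 'Colum.Y'). If the file has no header row, you can pass names=[...] (one name per column) to the read_csv call, depending on what your CSV data looks like.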

You can then overwrite the original file with:

df2.to_csv('C:/Users/.../Data.csv', sep='\t', index=False) # keep the tab delimiter and leave out the index column
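
Two to_csv defaults are worth knowing when writing the file back: the separator is a comma, and the row index is written as an extra first column. The sketch below compares the header lines of both variants. It uses a temporary file and made-up data in place of the real Data.csv and df2.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data standing in for the filtered frame df2.
df2 = pd.DataFrame({"X": [1, 5], "Y": [10, 20]}, index=[0, 3])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Data.csv")  # temporary stand-in path

    # Default call: comma-separated, index written as an extra first column.
    df2.to_csv(path)
    with open(path) as f:
        default_header = f.readline().rstrip("\n")

    # Tab-separated and without the index, matching the input format.
    df2.to_csv(path, sep="\t", index=False)
    with open(path) as f:
        tab_header = f.readline().rstrip("\n")
    round_trip = pd.read_csv(path, sep="\t")

print(repr(default_header))
print(repr(tab_header))
```

With the defaults the header line starts with an empty field for the unnamed index, so reading the file again would give an additional 'Unnamed: 0' column. With sep='\t' and index=False the file can be read back with the same read_csv call used above.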