Home > Back-end >  I'm just learning python and would like some help please with removing data from a CSV Please
I'm just learning python and would like some help please with removing data from a CSV Please

Time:02-08

The CSV File I'm trying to change has 2 columns

Part Number.    Qty
h36fsde (1)      6
gr3jgr  (1)      2
qmdd12  (2)      9

It is basically a list of part numbers in the first column followed by quantities in the second column. I need to remove the (number in brackets) in column one as above (1) (1) (2) because this relates to page numbers and it causes a problem when I try to import it. Idealy my list should be-

Part Number.    Qty
h36fsde          6
gr3jgr           2
qmdd12           9

So far the code I have allows me to choose the file I wish to change with Tkinter, then Im trying to replace "(1)" with " " and then create a new directory if it doesn't exist called quotations and write the csv to that directory. So far it loads the file okay and writes a file new5.csv to the quotations folder, however it does not replace the "(1)" with a " " so please can someone let me know what I am doing wrong? I either need to simply remove the bracketed numbers all together or replace them with a " "

Please see my code so far below. I will try to tidy it up with def functions, however I'd like to get it working first and then see how I can improve it and make it more modular.

import tkinter as tk
from tkinter.filedialog import askopenfilename
import pandas as pd
import os  
from pathlib import Path  

root = tk.Tk()
root.withdraw() #Prevents the Tkinter window to come up
csvpath = askopenfilename()
root.destroy()

df = pd.read_csv(csvpath)
df.replace('(1)', ' ', inplace=True)

filepath = Path('quotations/new5.csv')  
filepath.parent.mkdir(parents=True, exist_ok=True)  
df.to_csv(filepath)  

Once I have it working I will need to set it to remove (numbers in brackets) from (1) to (20). Plus is there an easy way for me to write the edited file straight back over the top of the file that was opened rather than keep saving the edited file as "New5.csv" and then renaming it in explorer?

Any help is greatly appreciated, Thanks in advance.

CodePudding user response:

df.replace finds and replaces whole cells, not text within them, unless the string to find is given as a regular expression:

df.replace(r" \(1\)", "", regex=True, inplace=True)

Alternatively, use Series.str.replace instead.

df["Part Number."] = df["Part Number."].str.replace('(1)', '')

To replace all the bracketed numbers in one go, use a more general regex:

df["Part Number."] = df["Part Number."].str.replace(r" \(\d\d?\)", '', regex=True)
  •  Tags:  
  • Related