Remove part of the contents in a CSV file using Python

I have a CSV file that has a description column. The string in that column ends with 'CAD' followed by some numbers.

Here is what the description looks like.

|Description|
|Case Number and Notes: CAD ID: 21-012886|
|Case Number and Notes: 21-12882 , check a safety hazardCAD ID: 21-012882|
|Case Number and Notes: 21-12877 , Follow up|

What I want to do is delete the characters in those cells that come before 'CAD'.

Is this possible, and if so, how is it possible?

CodePudding user response:

Going off the sparse information in your question, I would solve it like this:

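import csv

# Parenthesized context managers are officially supported from Python 3.10;
# on older versions, put both open() calls on a single `with` line.
with (
    open('input.csv') as input_file,
    open('output.csv', 'w', newline='') as output_file
):
    reader = csv.DictReader(input_file)
    data = []
    for row in reader:
        # Keys are case-sensitive and must match the header row exactly
        # ('Description' in the sample data).
        # Keep 'CAD' plus everything after its last occurrence.
        row['Description'] = 'CAD' + row['Description'].split('CAD')[-1]
        data.append(row)

    # Reuse the input header; unlike data[0].keys(), this also works
    # when the file has a header but no data rows.
    writer = csv.DictWriter(output_file, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(data)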

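The newline='' argument is what the csv module documentation recommends when opening the file on every platform, so keep it even if you aren't using Windows. The csv writer emits its own '\r\n' line endings; without newline='', Windows translates the '\n' a second time ('\r\r\n') and you get blank lines between rows in the output file.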
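If you want to check the line endings yourself, here is a minimal sketch that writes a couple of rows to a temporary file opened with newline='' and reads the raw bytes back. The helper name written_bytes and the sample rows are made up for illustration; they are not part of the script above.

```python
import csv
import os
import tempfile


def written_bytes(rows):
    """Write rows with csv.writer (file opened with newline='') and return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        # newline='' stops Python from translating the '\r\n' the csv writer emits.
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


raw = written_bytes([["Description"], ["CAD ID: 21-012886"]])
print(raw)  # each row should end in exactly one '\r\n', on any platform
```

Dropping newline='' from the open() call and running this on Windows should show the doubled '\r\r\n' endings instead.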

EDIT:

With the new information you provided you might want the loop to be something closer to:

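for row in reader:
    # Keys are case-sensitive; 'Description' matches the sample header.
    if 'CAD' in row['Description']:
        row['Description'] = 'CAD' + row['Description'].split('CAD')[-1]
    else:
        # No CAD ID in this row: blank out the description.
        row['Description'] = ''
    data.append(row)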

That way, rows that don't have a CAD ID will have their description cleared entirely.
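If it helps, the per-cell rule from the loop above can be pulled out into a small function and tried on the sample descriptions from the question without touching any files. This is only a sketch: keep_from_cad and its marker parameter are names I made up, and it uses rfind so that, like split('CAD')[-1], it keeps the text from the last 'CAD' onward.

```python
def keep_from_cad(description, marker="CAD"):
    """Return the text from the last occurrence of marker onward, or '' if it is absent."""
    index = description.rfind(marker)
    return description[index:] if index != -1 else ""


# The three sample descriptions from the question.
samples = [
    "Case Number and Notes: CAD ID: 21-012886",
    "Case Number and Notes: 21-12882 , check a safety hazardCAD ID: 21-012882",
    "Case Number and Notes: 21-12877 , Follow up",
]

cleaned = [keep_from_cad(text) for text in samples]
for text in cleaned:
    print(repr(text))
# 'CAD ID: 21-012886'
# 'CAD ID: 21-012882'
# ''
```

Inside the loop you would then write row['Description'] = keep_from_cad(row['Description']) in place of the if/else.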