Dropping Columns with if statements, then adding exception

Time:07-06

I have code that reads a CSV of rules, collects the column names that matter to me into a list called found, and then drops every DataFrame column that is not in that list. It works, but I now want to keep one extra column, "MATNR", even though it is not in found. What can I add to the drop statement so that it still drops all the other unwanted columns but leaves "MATNR" in place?

import re

# Import Data Quality Rules (useful attributes)
# The pattern captures the run of uppercase letters that follows a dot
rexp = re.compile(r'\.([A-Z]+)')
found = []

with open('DataRules.csv') as f:
    for line in f:
        found.extend(rexp.findall(line))

# Get rid of columns that are not mentioned in rules (except MATNR)
df.drop(columns=([col for col in df if col not in found]), inplace=True)

# Get rid of duplicated rows
df = df.drop_duplicates()

Answer:

Add a second condition to the list comprehension so that "MATNR" is never selected for dropping:

import re

# Import Data Quality Rules (useful attributes)
# The pattern captures the run of uppercase letters that follows a dot
rexp = re.compile(r'\.([A-Z]+)')
found = []

with open('DataRules.csv') as f:
    for line in f:
        found.extend(rexp.findall(line))

# Get rid of columns that are not mentioned in rules (except MATNR)
df.drop(columns=([col for col in df if col not in found and col != 'MATNR']), inplace=True)

# Get rid of duplicated rows
df = df.drop_duplicates()
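
As a variation on the same idea, one could build the set of columns to keep and select those, rather than listing the columns to drop. The sketch below is self-contained and only illustrative: the rule text and the sample DataFrame (including the column names MTART, BRGEW and ERNAM) are made up, and an in-memory string stands in for DataRules.csv.

```python
import io
import re

import pandas as pd

# Hypothetical stand-in for DataRules.csv: attribute names follow a dot.
rules_text = "MARA.MTART must not be empty\nMARA.BRGEW > 0\n"

# Collect every run of uppercase letters that follows a dot.
rexp = re.compile(r"\.([A-Z]+)")
found = []
for line in io.StringIO(rules_text):
    found.extend(rexp.findall(line))

# Hypothetical sample data; ERNAM is not mentioned in the rules.
df = pd.DataFrame({
    "MATNR": [1, 1, 2],
    "MTART": ["A", "A", "B"],
    "BRGEW": [5.0, 5.0, 7.5],
    "ERNAM": ["x", "y", "z"],
})

# Columns to keep: everything named in the rules, plus MATNR as an exception.
keep = set(found) | {"MATNR"}

# Select the kept columns in their original order, then drop duplicate rows.
df = df.loc[:, [col for col in df.columns if col in keep]].drop_duplicates()
```

Using a set for the membership test also avoids scanning the found list once per column, which may matter if the rules file mentions many attributes.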