Home > front end >  How do I create an if statement that checks for sheetnames that startwith a certain string in Python
How do I create an if statement that checks for sheetnames that startwith a certain string in Python

Time:05-05

My final objective is to create a column called 'Status' that indicates if active or cancelled based on the name of a sheet. I need it to check if the sheetname start with the word 'Full Member List'. If so then Active, else the Status column should be Cancelled. How do I do this below? I only need help with the one line in this code where I have commented #need help with below line. I get an invalid syntax error for that line

My attempt-

import pandas as pd
import os
from openpyxl import load_workbook

cols_to_drop =  ['PSI ID','PSIvet Region','PSIvet region num','Fax','County','Ship state']              
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State','Billing state':'State'} 

for file in os.listdir("C:/Users/hh/Desktop/autotranscribe/python/Matching"):
    if file.startswith("PSI"):
        dfs = pd.read_excel(file, sheet_name=None,skiprows=5)
        output = dict()
        for ws, df in dfs.items():
            if any(ws.startswith(x) for x in ["New Members", "PVCC"]):
                continue  
                temp = df
                #need help with below line
                temp['Status'] = "Active" if any(ws.startwith(x) for x in == "Full Member List" else "Cancelled" )   
            #drop unneeded columns
            temp = df.drop(cols_to_drop, errors="ignore", axis=1)
            #rename columns
            temp = temp.rename(columns=column_name_update_map)
            #drop empty columns
            temp = temp.dropna(how="all", axis=1)
            temp['Partner'] = "PSI"
            output[ws] = temp
        writer = pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx')
        for ws, df in output.items():
            df.to_excel(writer, index=None, sheet_name=ws)
        writer.save()
        writer.close()

CodePudding user response:

I think you need to:

  • fix the indenting in your code on the line reading temp = df
  • fix the typo "startwith" to be startswith
  • consider adding logic to ignore files containing (updated headers)
  • change the line you are asking about to this:
                temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"

An updated version of your code looks like this:

import pandas as pd
import os
from openpyxl import load_workbook

cols_to_drop =  ['PSI ID','PSIvet Region','PSIvet region num','Fax','County','Ship state']              
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State','Billing state':'State'} 

for file in os.listdir("."):
    if file.startswith("PSI") and "(updated headers)" not in file:
        dfs = pd.read_excel(file, sheet_name=None,skiprows=5)
        output = dict()
        for ws, df in dfs.items():
            if any(ws.startswith(x) for x in ["New Members", "PVCC"]):
                continue  
            temp = df
            temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"   
            #drop unneeded columns
            temp = df.drop(cols_to_drop, errors="ignore", axis=1)
            #rename columns
            temp = temp.rename(columns=column_name_update_map)
            #drop empty columns
            temp = temp.dropna(how="all", axis=1)
            temp['Partner'] = "PSI"
            output[ws] = temp
        writer = pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx')
        for ws, df in output.items():
            df.to_excel(writer, index=None, sheet_name=ws)
        writer.save()
        writer.close()

To test this, I created an xlsx file named PSI 001.xlsx with a sheet named Full Member List 001 containing the following anchored at cell A1:

will skip
will skip
will skip
will skip
will skip
foo
1
2
3

The output was stored to a file named PSI 001 (updated headers).xlsx with a sheet named as above with the following contents anchored at cell A1:

foo Status  Partner
1   Active  PSI
2   Active  PSI
3   Active  PSI

CodePudding user response:

In case you want to check for full string "Full Member List" in start of your sheetname.

temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"

To check, if either of words "Full", "Member", "List" appears in the sheetname:

for x in "Full Member List".split(" "):
    if ws.startswith(x):
        temp["Status"] = "Active"
        break

if temp["Status"] != "Active":
    temp["Status"] = "Cancelled"
  • Related