If/Else statement with Pandas Dataframe based on first character

Time:03-04

I have a lengthy for loop that appends to a list and modifies specific columns in my pandas dataframe. What I'm struggling with is checking whether a value in a column starts with the letter W and, if it does, doing some appending inside the loop. I can see the rows whose Suffix starts with W by doing:

 df.loc[df['Suffix'].str[:1]=='W']

What I want is something along the lines of: if w or W is the first letter of the Suffix value, commonized.append(df['Base'][i] + '-' + df['Suffix'][i])

Below is a snippet of the for loop. Any help is appreciated.

commonized = []
for i in range(len(df)):
    if pd.isnull(df['Prefix'][i]) and pd.isnull(df['Suffix'][i]):
        commonized.append(df['Base'][i])

EDIT: Here's a better example (hopefully)

So let's say Suffix is WU, Base is 123 and Prefix is ABC. (I also have cases where Suffix does not start with a W.)

commonized = []
for i in range(len(df)):
    if df['Suffix'][i] == 'W': # This is the part where I need to check if Suffix starts with W
        commonized.append(df['Prefix'][i] + '-' + df['Base'][i] + '-' + df['Suffix'][i])
    else:
        'NULL'

I want to scan the entire dataframe, and if there is a Suffix that starts with the letter W, the record in the Suffix column would then change from WU to ABC-123-WU. Hope this makes sense.

CodePudding user response:

Since I can't see a sample df, here is something that might help you.

df['Suffix'].apply(lambda x: True if x[0].lower()=='w' else False)

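This returns a boolean Series: True wherever the first character of the suffix is W or w, and False otherwise. Note that x[0] raises an error on missing values (NaN) and on empty strings, so it assumes every Suffix is a non-empty string. Hope this helps you design your solution. Do post a sample df if you need more help.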
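Since the question mentions null checks, a null-safe variant of the same test may be useful. The following is a minimal sketch using the pandas string accessor rather than apply; the sample values are made up for illustration and are not from the asker's data.

```python
import pandas as pd

# Hypothetical sample data; None stands in for a missing suffix.
df = pd.DataFrame({"Suffix": ["WU", "wx", "AB", None, ""]})

# Lowercase first, then test the prefix.
# na=False makes missing values evaluate to False instead of NaN.
starts_with_w = df["Suffix"].str.lower().str.startswith("w", na=False)
print(starts_with_w.tolist())
```

The resulting mask can be used directly for filtering, for example df[starts_with_w].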

CodePudding user response:

I'm not one hundred percent sure what you're going for here, but I think this probably does what you're looking for:

import pandas as pd

data = {
    'Suffix': ['wow', 'Woah', 'hello', 'howa'],
    'Prefix': ['zzzz', 'adfas', 'asfdsaf', 'prefix'],
    'Base': ['Wozza', 'bozza', 'wow', 'hello']
}

df_words = pd.DataFrame.from_dict(data)
print(df_words)
commonized: list[str] = []
for (prefix, suffix, base) in zip(df_words['Prefix'], df_words['Suffix'], df_words['Base']):
    if suffix[0] == 'W' or suffix[0] == 'w':
        commonized.append(f"{base}-{suffix}")
print(commonized)

Output:

  Suffix   Prefix   Base
0    wow     zzzz  Wozza
1   Woah    adfas  bozza
2  hello  asfdsaf    wow
3   howa   prefix  hello
['Wozza-wow', 'bozza-Woah']

Note that this drops the null checks from your original loop. If they matter, they need to be reintroduced, because suffix[0] raises a TypeError when the suffix is NaN.
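One way the null checks might be reintroduced is sketched below. The sample data is hypothetical, and how each missing-value case should be handled (keep only the base when the suffix is missing, skip a missing prefix when joining) is an assumption, not something stated in the question.

```python
import pandas as pd

# Hypothetical data; None marks a missing Prefix or Suffix.
df = pd.DataFrame({
    "Prefix": ["ABC", None, "DEF", None],
    "Base": ["123", "456", "789", "000"],
    "Suffix": ["WU", None, "XY", "wz"],
})

commonized = []
for prefix, base, suffix in zip(df["Prefix"], df["Base"], df["Suffix"]):
    if pd.isnull(suffix):
        # Assumption: with no suffix, keep only the base
        commonized.append(base)
    elif suffix[:1].lower() == "w":
        # Suffix starts with W or w: join whichever parts are present
        parts = [p for p in (prefix, base, suffix) if not pd.isnull(p)]
        commonized.append("-".join(parts))
    else:
        # Any other suffix is left unchanged
        commonized.append(suffix)
print(commonized)
```

Slicing with suffix[:1] instead of suffix[0] also avoids an IndexError on empty strings.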

EDIT:

Based on your edit, you could do something like:

data = {
    'Suffix': ['wow', 'Woah', 'hello', 'howa'],
    'Prefix': ['zzzz', 'adfas', 'asfdsaf', 'prefix'],
    'Base': ['Wozza', 'bozza', 'wow', 'hello']
}

df_words = pd.DataFrame.from_dict(data)
print(df_words)
commonized: list[str] = []
for (prefix, suffix, base) in zip(df_words['Prefix'], df_words['Suffix'], df_words['Base']):
    if suffix[0] == 'W' or suffix[0] == 'w':
        commonized.append(f"{prefix}-{base}-{suffix}")
    else:
        commonized.append(suffix)
df_words['Suffix'] = commonized
print(df_words)

Output:

  Suffix   Prefix   Base
0    wow     zzzz  Wozza
1   Woah    adfas  bozza
2  hello  asfdsaf    wow
3   howa   prefix  hello
             Suffix   Prefix   Base
0    zzzz-Wozza-wow     zzzz  Wozza
1  adfas-bozza-Woah    adfas  bozza
2             hello  asfdsaf    wow
3              howa   prefix  hello

Alternatively, all in one spot:

for row, (prefix, suffix, base) in enumerate(zip(df_words['Prefix'], df_words['Suffix'], df_words['Base'])):
    if suffix[0] == 'W' or suffix[0] == 'w':
        df_words.at[row, 'Suffix'] = f"{prefix}-{base}-{suffix}"
print(df_words)

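(The above produces the same output when run on the original df_words. It relies on the default 0..n-1 index, because .at looks rows up by label while enumerate counts positions.)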
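For larger dataframes, the same update can probably be done without an explicit loop. This is a sketch of a vectorized alternative using a boolean mask and .loc; the astype(str) call is an assumption to cover a numeric Base column such as the 123 mentioned in the question.

```python
import pandas as pd

df_words = pd.DataFrame({
    "Suffix": ["wow", "Woah", "hello", "howa"],
    "Prefix": ["zzzz", "adfas", "asfdsaf", "prefix"],
    "Base": ["Wozza", "bozza", "wow", "hello"],
})

# True where Suffix starts with W or w; missing values count as False
mask = df_words["Suffix"].str.lower().str.startswith("w", na=False)

# Build Prefix-Base-Suffix for every row, then assign only where the mask is True
combined = (
    df_words["Prefix"] + "-" + df_words["Base"].astype(str) + "-" + df_words["Suffix"]
)
df_words.loc[mask, "Suffix"] = combined[mask]
print(df_words)
```

Unlike the enumerate approach, this does not depend on the index being 0..n-1, since the mask and the assignment are aligned by index label.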
