I have a lengthy for loop that appends to a list and modifies specific columns in my pandas dataframe. What I'm struggling with is checking whether the value in a column starts with the letter W and, if it does, doing some appending in the for loop. I can see the rows where the Suffix column starts with W by doing:
df.loc[df['Suffix'].str[:1]=='W']
What I want to be able to do is something along the lines of: if w or W is the first letter of the Suffix column, commonized.append(df['Base'][i] + '-' + df['Suffix'][i])
Below is a snippet of the for loop. Any help is appreciated.
commonized = []
for i in range(len(df)):
    if pd.isnull(df['Prefix'][i]) and pd.isnull(df['Suffix'][i]):
        commonized.append(df['Base'][i])
EDIT: Here's a better example (hopefully).
So let's say Suffix is WU, Base is 123 and Prefix is ABC. (I also have cases where Suffix does not start with a W.)
commonized = []
for i in range(len(df)):
    if df['Suffix'][i] == 'W':  # This is the part where I need to check if Suffix starts with W
        commonized.append(df['Prefix'][i] + '-' + df['Base'][i] + '-' + df['Suffix'][i])
    else:
        'NULL'
I want to scan the entire dataframe, and if there is a Suffix that starts with the letter W, the record in the Suffix column would then change from WU to ABC-123-WU. Hope this makes sense.
CodePudding user response:
Since I can't see a sample df, here is something that might help you.
df['Suffix'].apply(lambda x: x[0].lower() == 'w')
This returns a Series that is True wherever the first letter of the suffix is W or w, and False otherwise. Hope this helps you design your solution. Do post a sample df if you need more help.
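If the Suffix column can contain nulls (the null checks in the question suggest it can), indexing with x[0] will raise on those rows. A possible vectorized sketch uses the pandas string accessor and treats missing values as non-matches through na=False. The sample frame below is made up for illustration, since the real df was not posted:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df was not posted in the question.
df = pd.DataFrame({
    'Prefix': ['ABC', 'DEF', None, 'GHI'],
    'Base': ['123', '456', '789', '012'],
    'Suffix': ['WU', 'xy', np.nan, 'wz'],
})

# True where Suffix starts with 'w' or 'W'; null suffixes count as False.
starts_with_w = df['Suffix'].str.lower().str.startswith('w', na=False)
print(starts_with_w.tolist())
```

The resulting boolean Series can be used directly as a filter, e.g. df[starts_with_w].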
CodePudding user response:
I'm not one hundred percent sure what you're going for here, but I think this probably does what you're looking for:
import pandas as pd

data = {
    'Suffix': ['wow', 'Woah', 'hello', 'howa'],
    'Prefix': ['zzzz', 'adfas', 'asfdsaf', 'prefix'],
    'Base': ['Wozza', 'bozza', 'wow', 'hello']
}
df_words = pd.DataFrame.from_dict(data)
print(df_words)
commonized: list[str] = []
# Walk the three columns in parallel instead of indexing by position
for (prefix, suffix, base) in zip(df_words['Prefix'], df_words['Suffix'], df_words['Base']):
    if suffix[0] == 'W' or suffix[0] == 'w':
        commonized.append(f"{base}-{suffix}")
print(commonized)
Output:
   Suffix   Prefix   Base
0     wow     zzzz  Wozza
1    Woah    adfas  bozza
2   hello  asfdsaf    wow
3    howa   prefix  hello
['Wozza-wow', 'bozza-Woah']
Note that this drops the null checks from your original loop. If Prefix or Suffix can be null in your data, those checks need to be reintroduced, because suffix[0] raises an error on a null value.
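One way the null check could be brought back into the loop, as a rough sketch: test with pd.isnull before touching the suffix. The sample data is hypothetical, and skipping null rows is only an assumption about what you want to happen to them:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing suffix.
df_words = pd.DataFrame({
    'Suffix': ['wow', np.nan, 'hello', 'Woah'],
    'Prefix': ['zzzz', 'adfas', 'asfdsaf', 'prefix'],
    'Base': ['Wozza', 'bozza', 'wow', 'hello'],
})

commonized = []
for suffix, base in zip(df_words['Suffix'], df_words['Base']):
    if pd.isnull(suffix):
        # Assumption: rows without a suffix are simply skipped.
        continue
    # Slicing with [:1] also tolerates an empty string, unlike [0].
    if suffix[:1].lower() == 'w':
        commonized.append(f"{base}-{suffix}")
print(commonized)
```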
EDIT:
Based on the edit to the question, you could do something like:
data = {
    'Suffix': ['wow', 'Woah', 'hello', 'howa'],
    'Prefix': ['zzzz', 'adfas', 'asfdsaf', 'prefix'],
    'Base': ['Wozza', 'bozza', 'wow', 'hello']
}
df_words = pd.DataFrame.from_dict(data)
print(df_words)
commonized: list[str] = []
for (prefix, suffix, base) in zip(df_words['Prefix'], df_words['Suffix'], df_words['Base']):
    if suffix[0] == 'W' or suffix[0] == 'w':
        commonized.append(f"{prefix}-{base}-{suffix}")
    else:
        commonized.append(suffix)
df_words['Suffix'] = commonized
print(df_words)
Output:
   Suffix   Prefix   Base
0     wow     zzzz  Wozza
1    Woah    adfas  bozza
2   hello  asfdsaf    wow
3    howa   prefix  hello
             Suffix   Prefix   Base
0    zzzz-Wozza-wow     zzzz  Wozza
1  adfas-bozza-Woah    adfas  bozza
2             hello  asfdsaf    wow
3              howa   prefix  hello
Alternatively, all in one spot:
# Note: enumerate() positions only match the .at labels when the index is the default 0..n-1
for row, (prefix, suffix, base) in enumerate(zip(df_words['Prefix'], df_words['Suffix'], df_words['Base'])):
    if suffix[0] == 'W' or suffix[0] == 'w':
        df_words.at[row, 'Suffix'] = f"{prefix}-{base}-{suffix}"
print(df_words)
(The above produces the same output.)
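If you would rather avoid the Python-level loop, the same update can probably be written with a boolean mask and .loc. This is only a sketch: it assumes all three columns hold strings (a numeric Base such as 123 would need .astype(str) first) and it treats a null Suffix as a non-match:

```python
import pandas as pd

df_words = pd.DataFrame({
    'Suffix': ['wow', 'Woah', 'hello', 'howa'],
    'Prefix': ['zzzz', 'adfas', 'asfdsaf', 'prefix'],
    'Base': ['Wozza', 'bozza', 'wow', 'hello'],
})

# Rows whose suffix starts with 'w' or 'W'; nulls are treated as False.
mask = df_words['Suffix'].str.lower().str.startswith('w', na=False)

# Build Prefix-Base-Suffix for every row, then write back only the masked rows.
combined = df_words['Prefix'] + '-' + df_words['Base'] + '-' + df_words['Suffix']
df_words.loc[mask, 'Suffix'] = combined[mask]
print(df_words)
```

Because .loc aligns on index labels, this version does not depend on the frame having the default 0..n-1 index.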