I'm trying to do some string manipulations with Pandas and I would deeply appreciate your help! Here's my problem: I loaded a list of words from a csv file into a pandas dataframe called df, so that it looks as follows (here, I created the df manually):
import pandas as pd

data = {'Keyword': ['Apple', 'Banana', 'Peach', 'Strawberry', 'Blueberry'], 'Kategory': ['A', 'A', 'A', 'B', 'B']}
df = pd.DataFrame(data)
Now I would like to do some string manipulation based on the conditions shown below. The output of the string manipulation should be saved to a new column.
# new column to store the results
output = []
# set up the conditions
for Keyword in df:
    if df[Kategory] == 'A':
        output.append(Keyword + 'first choice')
        print(Keyword + 'first choice')
    else:
        output.append(Keyword + 'second choice')
        print(Keyword + 'second choice')
Thank you very much for your help!!
CodePudding user response:
You can try np.where
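import numpy as np

# where Kategory equals 'A', append ' first choice'; otherwise append ' second choice'
df['col'] = np.where(df['Kategory'].eq('A'),
                     df['Keyword'].add(' first choice'),
                     df['Keyword'].add(' second choice'))
print(df)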
      Keyword  Kategory                       col
0       Apple         A        Apple first choice
1      Banana         A       Banana first choice
2       Peach         A        Peach first choice
3  Strawberry         B  Strawberry second choice
4   Blueberry         B   Blueberry second choice
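As a variation on the same idea, here is a minimal sketch that picks only the suffix with np.where and then concatenates it to the keyword. The variable name `suffix` is an assumption made for illustration; the column names come from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Keyword': ['Apple', 'Banana', 'Peach', 'Strawberry', 'Blueberry'],
    'Kategory': ['A', 'A', 'A', 'B', 'B'],
})

# Choose the suffix for each row first (hypothetical helper variable),
# wrapping the NumPy result in a Series aligned with df's index.
suffix = pd.Series(
    np.where(df['Kategory'].eq('A'), ' first choice', ' second choice'),
    index=df.index,
)

# Then concatenate keyword and suffix element-wise.
df['col'] = df['Keyword'] + suffix
print(df)
```

This should give the same column as the one-liner above, while keeping the condition and the string concatenation as separate steps.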
CodePudding user response:
import pandas as pd

data = {'Keyword': ['Apple', 'Banana', 'Peach', 'Strawberry', 'Blueberry'], 'Kategory': ['A', 'A', 'A', 'B', 'B']}
df = pd.DataFrame(data)

output = []
# iterrows() yields an (index, row) pair for every row of the dataframe
for idx, rows in df.iterrows():
    if rows['Kategory'] == 'A':
        output.append(rows['Keyword'] + " " + 'first choice')
    else:
        output.append(rows['Keyword'] + " " + 'second choice')

# store the collected strings in a new column
df['output'] = output
print(df)
      Keyword  Kategory                    output
0       Apple         A        Apple first choice
1      Banana         A       Banana first choice
2       Peach         A        Peach first choice
3  Strawberry         B  Strawberry second choice
4   Blueberry         B   Blueberry second choice
I have tried to stay close to your approach, although you can also use np.where. To iterate over a dataframe row by row, use iterrows(), which gives you the index and the row on each pass.
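To make the (index, row) pairs concrete, here is a small sketch on a shortened two-row version of the question's data. The name `pairs` is only an illustration.

```python
import pandas as pd

df = pd.DataFrame({'Keyword': ['Apple', 'Strawberry'], 'Kategory': ['A', 'B']})

# Each iteration yields the row's index label and the row itself as a Series,
# so individual values are looked up by column name.
pairs = [(idx, row['Keyword'], row['Kategory']) for idx, row in df.iterrows()]
print(pairs)
```

Row-by-row iteration is usually slower than a vectorized call such as np.where, so it is mostly worth using when the per-row logic is hard to express column-wise.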
CodePudding user response:
I imagine your error would be something like NameError: name 'Kategory' is not defined. This is because the variable Kategory doesn't actually exist. When accessing dataframe columns, just like keys in dictionaries, you must pass the name as a string, not as a variable.
Like this:
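# new column to store the results
output = []
# set up the conditions; loop over the two columns together,
# because "for Keyword in df" would only yield the column names
for keyword, kategory in zip(df["Keyword"], df["Kategory"]):
    if kategory == 'A':
        output.append(keyword + ' first choice')
        print(keyword + ' first choice')
    else:
        output.append(keyword + ' second choice')
        print(keyword + ' second choice')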
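For reference, a short sketch of two things that trip up a plain `for ... in df` loop with an `if` on a whole column. It uses a shortened two-row version of the question's data; the names `columns_seen`, `mask` and `raised` are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Keyword': ['Apple', 'Strawberry'], 'Kategory': ['A', 'B']})

# Iterating over a DataFrame yields its column names, not its rows.
columns_seen = [name for name in df]
print(columns_seen)

# Comparing a whole column to a value gives one boolean per row.
mask = df['Kategory'] == 'A'
print(mask.tolist())

# A multi-element boolean Series has no single truth value,
# so using it directly in an `if` raises ValueError.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True
print(raised)
```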
Good luck.