I am rather new to Python. I am trying to create a conditional column in a pandas dataframe. My dataframe looks like this:
PayByPhone Location ID Location Name
59 Warner Road
59 Warner Road
69 Warner Road
59 Warner Road
59 Warner Road
69 Warner Road
59 Warner Road
59 Warner Road
59 Warner Road
59 Warner Road
59 Warner Road
59 Warner Road
It is part of a much larger dataset with various other location names. In this example I am trying to make 'PayByPhone Location ID' 59 when the Location Name is 'Warner Road'. I am a novice in Python but I have made an attempt:
import pandas as pd
path=r"C:\Users\H\Desktop\File.xlsx"
df1=pd.read_excel(path, sheet_name = 0)
if df[PayByPhone Location ID] != 59 and df[Location Name] = 'Warner Road'
df[PayByPhone Location ID] = 59
Unfortunately I am getting an 'invalid syntax' error.
CodePudding user response:
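There are several syntax errors here. First, when you reference a column of a dataframe, the column name must be a quoted string. The if line is also missing its trailing colon, and the dataframe is loaded as df1 but then referred to as df. A column reference should look like this: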
df['PayByPhone Location ID']
Additionally, to test whether something is equal to something else you must use a double equals sign (==). A single = assigns a value instead, for example:
# This sets the Name variable to 'John'
Name = 'John'
# This checks whether Name is equal to 'John'
Name == 'John'
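On a dataframe column, == compares every row at once and returns a Series of True/False values rather than a single boolean. A minimal sketch of that behaviour, assuming a small made-up Series (the names and values are hypothetical, not taken from the Excel file):

```python
import pandas as pd

# Hypothetical sample values standing in for the 'Location Name' column.
names = pd.Series(["Warner Road", "Other Street", "Warner Road"])

# == is applied row by row and yields one boolean per row.
is_warner = names == "Warner Road"
print(is_warner.tolist())

# A plain `if` needs a single True/False, so pandas raises ValueError here.
try:
    if is_warner:
        pass
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

This is why a row-by-row condition cannot be used directly in an ordinary if statement.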
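To go one step further, however: even with all of these problems corrected you will still get an error (ValueError: The truth value of a Series is ambiguous). Comparing a column gives one True/False per row, while an if statement needs a single True or False, and Python's and cannot combine two columns row by row. To get the result you are expecting, combine the conditions with & (each wrapped in parentheses) and use NumPy's np.where():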
import numpy as np
df['PayByPhone Location ID'] = np.where((df['PayByPhone Location ID'] != 59) & (df['Location Name'] == 'Warner Road'), 59, df['PayByPhone Location ID'])
The above example checks both of the conditions from your post for every row. Rows that meet both are set to 59; every other row keeps its current value.
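For reference, here is a small self-contained sketch of the same idea. The sample dataframe is made up for illustration (including the 'Other Street' row with ID 12) and stands in for the Excel sheet. It also shows a boolean mask with .loc, which is a common alternative to np.where rather than part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet in the question.
df = pd.DataFrame({
    "PayByPhone Location ID": [59, 59, 69, 59, 69, 12],
    "Location Name": ["Warner Road"] * 5 + ["Other Street"],
})

# Boolean mask: True for 'Warner Road' rows whose ID is not already 59.
mask = (df["PayByPhone Location ID"] != 59) & (df["Location Name"] == "Warner Road")

# Option 1: np.where returns 59 where the mask is True, else the existing ID.
fixed = np.where(mask, 59, df["PayByPhone Location ID"])

# Option 2: assign in place through .loc using the same mask.
df.loc[mask, "PayByPhone Location ID"] = 59

print(df)
```

Both options leave rows for other locations untouched; .loc modifies the dataframe in place, while np.where builds a new array that you assign back to the column.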