I have a DataFrame with four columns: "date", "time_gap", "count" and "average_speed".
I'd like to set values in the "count" column for rows that meet conditions on the "date" and "time_gap" columns.
So, for example, if I'm running this query:
random_row = df.query("date == '2018-12-07' & time_gap == 86")
It's returning this as output:
date time_gap count average_speed
282 2018-12-07 86 0 0
Let's say I want to change the value in the "count" column to 12. How could I do it?
I've tried this:
random_row = df.query("date == '2018-12-07' & time_gap == 86")["count"].replace(0, 12)
Which returns this:
282 12
Name: count, dtype: int64
But when I have a look at the df:
df.iloc[282]
I still have my row where the "count" is equal to 0:
date 2018-12-07 00:00:00
time_gap 86
count 0
average_speed 0
Name: 282, dtype: object
How can I do it?
CodePudding user response:
You can do it with loc, if you don't want to use NumPy:
df.loc[ (df.date.eq('07/12/2018')) & (df.time_gap.eq(86)), 'count' ] = 12
prints:
date time_gap count average_speed
0 07/12/2018 86 12 0
Yes, you can keep the query-style expression, but in order to do that you have to use eval, which takes the same expression you would pass to query and evaluates it to a boolean mask:
qr = "date == '07/12/2018' & time_gap == 86"
df.loc[df.eval(qr), 'count'] = 12
prints:
date time_gap count average_speed
0 07/12/2018 86 12 0
You can see practical applications of eval here.
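As a rough, self-contained sketch of the eval approach above: the three-row DataFrame below is made-up sample data (string dates, as in this answer's output), and the names qr and mask are only illustrative. It shows that eval returns a boolean Series, which loc can then use to pick the rows to write to.

```python
import pandas as pd

# Hypothetical sample data; dates are plain strings, as in the answer above.
df = pd.DataFrame({
    "date": ["07/12/2018", "07/12/2018", "08/12/2018"],
    "time_gap": [85, 86, 86],
    "count": [0, 0, 0],
    "average_speed": [0, 0, 0],
})

# The same kind of expression you would pass to df.query.
qr = "(date == '07/12/2018') & (time_gap == 86)"

# eval returns a boolean Series (one True/False per row), not a filtered copy.
mask = df.eval(qr)

# Writing through .loc with that mask updates df itself.
df.loc[mask, "count"] = 12
print(df)
```

Only the row where both conditions hold is changed; the other rows keep their original count.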
CodePudding user response:
Use np.where:
import numpy as np
df["count"] = np.where((df["date"] == '2018-12-07') & (df["time_gap"] == 86), 12, df["count"])
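For reference, here is a minimal runnable sketch of the np.where approach, assuming a small made-up DataFrame with a datetime "date" column (the variable cond is just an illustrative name). np.where builds a whole new column: 12 where the condition is True, and the existing count everywhere else.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with non-zero counts to show they are preserved.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-12-07", "2018-12-07", "2018-12-08"]),
    "time_gap": [85, 86, 86],
    "count": [3, 0, 5],
})

# Condition that is True only for the row to update.
cond = (df["date"] == pd.Timestamp("2018-12-07")) & (df["time_gap"] == 86)

# Second argument: value where cond is True; third: value where it is False.
df["count"] = np.where(cond, 12, df["count"])
print(df)
```

Note that the whole column is reassigned here, so this is mostly a matter of taste compared with assigning through loc.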
CodePudding user response:
Use df.loc to identify the row to update, then assign the desired value to the column:
df.loc[(df['date'] == '2018-12-07' ) & (df['time_gap'] == 86) , 'count'] = 12
df
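A self-contained version of the loc assignment above might look like the sketch below. The sample rows are invented for illustration, and the "date" column is assumed to be datetime, as the question's output suggests.

```python
import pandas as pd

# Hypothetical sample data modeled on the question's four columns.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-12-07", "2018-12-07", "2018-12-08"]),
    "time_gap": [85, 86, 86],
    "count": [0, 0, 0],
    "average_speed": [0, 0, 0],
})

# Boolean mask: True only where both conditions hold.
mask = (df["date"] == pd.Timestamp("2018-12-07")) & (df["time_gap"] == 86)

# Row selector and column label go in a single .loc call,
# so the assignment is applied to df itself.
df.loc[mask, "count"] = 12
print(df)
```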
CodePudding user response:
Try
df["count"] [(df["date"] == '2018-12-07') & (df["time_gap"] == 86)] = 12
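It was basically what you were doing, but using the square brackets to filter instead of using a query. Note that this is chained indexing: pandas may raise a SettingWithCopyWarning for it, and with copy-on-write enabled the assignment does not reach df, so a single df.loc call with the same mask is the safer form.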
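To illustrate why the attempt in the question left df untouched, here is a small sketch on made-up data: query returns a new DataFrame and replace returns a new Series, so nothing is written back unless you assign the result to df yourself. The name replaced is only illustrative.

```python
import pandas as pd

# Hypothetical two-row sample; dates kept as strings for simplicity.
df = pd.DataFrame({
    "date": ["2018-12-07", "2018-12-08"],
    "time_gap": [86, 86],
    "count": [0, 0],
})

# query(...) builds a new DataFrame; replace(...) builds a new Series.
replaced = df.query("date == '2018-12-07' and time_gap == 86")["count"].replace(0, 12)

# The new Series holds 12, but df still holds the original zeros.
before = df["count"].tolist()

# Writing the result back explicitly, aligned on the row labels.
df.loc[replaced.index, "count"] = replaced
after = df["count"].tolist()

print(before, after)
```

The write-back step reuses the index of the query result, so only the matching row label is updated.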