I have to write different values into a specific column depending on a condition. Here is a visual example of the problem:
In column D, I have to add a value XX if the time is between 10:00 and 10:30.
I've tried it this way:
def function (df, name=['column'], name_app1=['column1']):
    for row in df[name]:
        if [(df[name].dt.hour >= 10) & (df[name].dt.hour <= 11)]:
            df[name_app1].append('0.400')
but I always get an error:
TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid
CodePudding user response:
df[name_app1].append('0.400')
On this line you are trying to append a single string to an entire column, which is what raises the error.
Assign the value row by row instead:
def function(df, name='column', name_app1='column1'):
    for i, row in df.iterrows():
        # row[name] is a single Timestamp, so use .hour rather than .dt.hour
        if 0 <= row[name].hour <= 3:
            df.loc[i, name_app1] = '0.400'
Let me know if you need further help
CodePudding user response:
Seems to me like you essentially want to do something like:
mask = (
    (pd.Timestamp('2021-05-16') <= df['column'])
    & (df['column'] <= pd.Timestamp('2021-05-30'))
    & (df['column'].dt.hour <= 3)
)
df.loc[mask, 'column1'] = '0.400'
(Since hours are by definition >= 0, you don't need the first hour condition.)
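To see what the mask selects, here is a minimal sketch on a made-up frame. The sample timestamps and the empty starting values in column1 are assumptions for illustration only, not data from the question.

```python
import pandas as pd

# Hypothetical sample data, only to illustrate which rows the mask picks
df = pd.DataFrame({
    "column": pd.to_datetime([
        "2021-05-15 02:00",  # before the date range
        "2021-05-20 01:30",  # in the date range, hour <= 3
        "2021-05-20 12:00",  # in the date range, but hour > 3
        "2021-05-29 03:45",  # in the date range, hour <= 3
    ]),
    "column1": ["", "", "", ""],
})

# One boolean per row; all three conditions must hold
mask = (
    (pd.Timestamp("2021-05-16") <= df["column"])
    & (df["column"] <= pd.Timestamp("2021-05-30"))
    & (df["column"].dt.hour <= 3)
)

# Only the rows where mask is True are overwritten
df.loc[mask, "column1"] = "0.400"
print(df)
```

Only the second and fourth rows should end up with '0.400'; the others keep their original value.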
There are a couple of rather odd things in your code:
- name=['column'], name_app1=['column1']: This makes name a list and therefore df[name] a dataframe and not a series - so df[name].dt.hour isn't supposed to work? Do you mean name='column', name_app1='column1' instead?
- [(df[name].dt.hour >= 00) & (df[name].dt.hour <= 3)] is a non-empty list, therefore its truthiness is True, and therefore if [(df[name].dt.hour >= 00) & (df[name].dt.hour <= 3)] is always met?
- You check the conditions on the whole dataframe in each step of the loop - that's not very efficient (and probably not intended)? You don't need to loop, use native Pandas methods (as proposed).
- .append() concatenates dataframes/series to a dataframe - not a string?
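The first two points are easy to check on a toy frame. This is just a sketch; the two sample timestamps and the variable names are made up for the demonstration.

```python
import pandas as pd

# Hypothetical two-row frame for the demonstration
df = pd.DataFrame(
    {"column": pd.to_datetime(["2021-05-20 02:15", "2021-05-20 12:00"])}
)

# A list of names selects a DataFrame, a single string selects a Series
kind_from_list = type(df[["column"]]).__name__
kind_from_str = type(df["column"]).__name__
print(kind_from_list, kind_from_str)

# Element-wise condition: a boolean Series with one entry per row
condition = df["column"].dt.hour <= 3

# Wrapping it in [...] gives a one-element list, and any non-empty list is truthy
always_true = bool([condition])
print(always_true)

# Without the brackets pandas refuses to reduce the Series to a single bool
try:
    bool(condition)
    ambiguous = False
except ValueError:
    ambiguous = True
print(ambiguous)
```

So the if branch in the question runs for every row, regardless of the actual hours in the column.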
So your function should (as far as I can tell) look more like:
import pandas as pd

def function(df, name='column', name_app1='column1'):
    # Boolean mask: True for rows inside the date range with hour <= 3
    mask = (
        (pd.Timestamp('2021-05-16') <= df[name])
        & (df[name] <= pd.Timestamp('2021-05-30'))
        & (df[name].dt.hour <= 3)
    )
    df.loc[mask, name_app1] = '0.400'
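If what you actually need is the 10:00 to 10:30 window from the question, comparing only the hour is too coarse. One possible sketch, assuming the column holds datetimes and the window is inclusive on both ends, compares seconds since midnight. The function name mark_time_window, the value parameter, and the sample data are hypothetical.

```python
import pandas as pd

def mark_time_window(df, name="column", name_app1="column1", value="XX"):
    # Seconds since midnight for every timestamp in the column
    secs = (
        df[name].dt.hour * 3600
        + df[name].dt.minute * 60
        + df[name].dt.second
    )
    # True for times from 10:00:00 to 10:30:00, both ends included (assumption)
    mask = secs.between(10 * 3600, 10 * 3600 + 30 * 60)
    df.loc[mask, name_app1] = value
    return df

# Hypothetical sample data
df = pd.DataFrame({
    "column": pd.to_datetime([
        "2021-05-20 09:59",
        "2021-05-20 10:00",
        "2021-05-20 10:15",
        "2021-05-20 10:31",
    ]),
    "column1": ["", "", "", ""],
})

mark_time_window(df)
print(df)
```

Here only the 10:00 and 10:15 rows should receive 'XX'. Whether the end of the window should be inclusive is a choice you would have to make for your own data.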