How to write a value in a column under a condition?

Time:04-21

I have to write different values into a specific column depending on a condition. Here is a visual example of the problem:

In column D, I have to add the value XX if the time falls between 10:00 and 10:30.

I tried it this way:

def function (df, name=['column'], name_app1=['column1']): 
    for row in df[name]:
        if [(df[name].dt.hour >= 10) & (df[name].dt.hour <= 11)]:
          df[name_app1].append('0.400') 

but I always get the error TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid

CodePudding user response:

df[name_app1].append('0.400') 

In this line you are trying to append a single string to a whole column, which is what causes the error.

Use this instead:

def function(df, name='column', name_app1='column1'):
    # Walk the rows one at a time and test the hour of each timestamp.
    for i, row in df.iterrows():
        if 0 <= row[name].hour <= 3:
            # Write the value into this row only.
            df.loc[i, name_app1] = '0.400'

Let me know if you need further help

CodePudding user response:

Seems to me like you want to essentially do something like:

mask = (
    (pd.Timestamp('2021-05-16') <= df['column'])
    & (df['column'] <= pd.Timestamp('2021-05-30'))
    & (df['column'].dt.hour <= 3)
)
df.loc[mask, 'column1'] = '0.400'

(Since hours are by definition >= 0, you don't need the first condition.)

There are a couple of rather odd things in your code:

  • name=['column'], name_app1=['column1']: This makes name a list, so df[name] is a DataFrame rather than a Series, and df[name].dt.hour will not work. Did you mean name='column', name_app1='column1' instead?
  • [(df[name].dt.hour >= 00) & (df[name].dt.hour <= 3)] is a non-empty list, so its truthiness is always True and the if condition is always met.
  • You check the conditions on the whole dataframe in each step of the loop, which is inefficient and probably not intended. You don't need a loop at all: use native Pandas methods (as proposed).
  • .append() concatenates DataFrames/Series to a DataFrame; it does not accept a string.
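
The first two points are easy to check for yourself. A small sketch, using made-up sample data (the timestamps and the `cond` name are only for illustration):

```python
import pandas as pd

# Made-up sample data, only for illustration.
df = pd.DataFrame({'column': pd.to_datetime(['2021-05-16 01:00', '2021-05-16 12:00'])})

# A list of column names selects a DataFrame; a single name selects a Series.
print(type(df[['column']]).__name__)  # DataFrame
print(type(df['column']).__name__)    # Series

# Wrapping the comparison in [...] gives a one-element list, which is truthy
# even when every value inside the Series is False.
cond = df['column'].dt.hour > 23
print(cond.any())    # False
print(bool([cond]))  # True
```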

So your function should (as far as I can tell) look more like:

def function(df, name='column', name_app1='column1'):
    mask = (
        (pd.Timestamp('2021-05-16') <= df[name])
        & (df[name] <= pd.Timestamp('2021-05-30'))
        & (df[name].dt.hour <= 3)
    )
    df.loc[mask, name_app1] = '0.400'
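
If what you actually need is the 10:00 to 10:30 window from the question, the hour alone is too coarse. A minimal sketch of one way to do it, assuming the column holds datetime values; the helper name mark_time_window, its parameters, and the sample data are made up for illustration:

```python
import pandas as pd

def mark_time_window(df, name='column', name_app1='column1',
                     start='10:00', end='10:30', value='0.400'):
    # Convert the time of day to seconds since midnight so the comparison
    # is purely numeric (both ends of the window are inclusive).
    secs = df[name].dt.hour * 3600 + df[name].dt.minute * 60 + df[name].dt.second
    lo = int(pd.Timedelta(start + ':00').total_seconds())
    hi = int(pd.Timedelta(end + ':00').total_seconds())
    mask = secs.between(lo, hi)
    df.loc[mask, name_app1] = value
    return df

# Made-up sample data; 'column1' starts out empty.
df = pd.DataFrame({
    'column': pd.to_datetime([
        '2021-05-16 09:59:00', '2021-05-16 10:00:00', '2021-05-16 10:15:00',
        '2021-05-16 10:30:00', '2021-05-16 10:31:00',
    ]),
    'column1': [None] * 5,
})
mark_time_window(df)
print(df)
```

Only the three rows from 10:00:00 to 10:30:00 get the value; the other rows are left untouched.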