import pandas as pd
from datetime import timedelta
import numpy as np
df = pd.DataFrame({
'open_local_data':['2022-08-24 15:00:00','2022-08-24 18:00:00'],
'result':['WINNER','']
})
df['open_local_data'] = pd.to_datetime(df['open_local_data'])
df['clock_now'] = np.where(
    df['result'] != '',
    df['open_local_data'] + timedelta(minutes=150),
    ''
)
print(df[['open_local_data','clock_now']])
I need to evaluate a condition first and only then decide whether to change a column's value. What should I do when I get this error?
df['clock_now'] = np.where(
File "<__array_function__ internals>", line 180, in where
TypeError: The DType <class 'numpy.dtype[datetime64]'> could not be promoted by <class 'numpy.dtype[str_]'>. This means that no common DType exists for the given inputs. For example they cannot be stored in a single array unless the dtype is `object`. The full list of DTypes is: (<class 'numpy.dtype[datetime64]'>, <class 'numpy.dtype[str_]'>)
CodePudding user response:
You can call .astype(str) on the result of the addition so that NumPy is happy, but then you end up with strings. Instead, you can use the pandas where method:
df["clock_now"] = df["result"].where(df["result"].eq(""),
other=df["open_local_data"].add(pd.Timedelta("150min")))
- keep the "result" values as they are where they equal the empty string
- and put open_local_data + 150 minutes everywhere else
to get
>>> df
      open_local_data  result            clock_now
0 2022-08-24 15:00:00  WINNER  2022-08-24 17:30:00
1 2022-08-24 18:00:00
where df.at[0, "clock_now"] is actually a Timestamp, not a string.
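If the empty string is not strictly required in the rows that stay unfilled, a variation worth considering is to let pandas fill them with NaT, so the whole column keeps a datetime dtype instead of becoming object. This is only a sketch under that assumption; the name shifted is introduced here for illustration and is not part of the question.

```python
import pandas as pd

df = pd.DataFrame({
    "open_local_data": ["2022-08-24 15:00:00", "2022-08-24 18:00:00"],
    "result": ["WINNER", ""],
})
df["open_local_data"] = pd.to_datetime(df["open_local_data"])

# Compute the shifted time for every row first (hypothetical helper name).
shifted = df["open_local_data"] + pd.Timedelta(minutes=150)

# Keep the shifted value only where "result" is non-empty;
# the other rows become NaT, so the column stays datetime-typed.
df["clock_now"] = shifted.where(df["result"].ne(""))

print(df.dtypes)
print(df)
```

With a datetime column, later steps such as the .dt accessor or comparisons against other timestamps should keep working, which is not the case for a column that mixes Timestamps and strings.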
CodePudding user response:
Try this:
import pandas as pd
from datetime import timedelta
import numpy as np
df = pd.DataFrame({
'open_local_data':['2022-08-24 15:00:00','2022-08-24 18:00:00'],
'result':['WINNER','']
})
df['open_local_data'] = pd.to_datetime(df['open_local_data'])
df['clock_now'] = df.apply(lambda row: row.open_local_data + timedelta(minutes=150) if row.result != '' else np.nan, axis=1)
df