I'm having some problems iteratively filling a pandas DataFrame with two different types of values. As a simple example, please consider the following initialization:
IN:
df = pd.DataFrame(data=np.nan,
index=range(5),
columns=['date', 'price'])
df
OUT:
date price
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
When I try to fill one row of the DataFrame, it won't adjust the value in the date
column. Example:
IN:
df.iloc[0]['date'] = '2022-05-06'
df.iloc[0]['price'] = 100
df
OUT:
date price
0 NaN 100.0
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
I'm suspecting it has something to do with the fact that the default np.nan
value cannot be replaced by a str
type value, but I'm not sure how to solve it. Please note that changing the date
column's type to str
does not seem to make a difference.
CodePudding user response:
This doesn't work because df.iloc[0]
creates a temporary Series, which is what you update, not the original DataFrame.
If you need to mix positional and label indexing you can use:
df.loc[df.index[0], 'date'] = '2022-05-06'
df.loc[df.index[0], 'price'] = 100
output:
date price
0 2022-05-06 100.0
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
CodePudding user response:
Using loc()
as shown below may work better:
import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.nan,
index=range(5),
columns=['date', 'price'])
print(df)
df.loc[0, 'date'] = '2022-05-06'
df.loc[0, 'price'] = 100
print(df)
Output:
date price
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
date price
0 2022-05-06 100.0
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN