I'm trying to build a dataframe using for loop, below start works perfectly:
import pandas as pd
df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])
for i in range(2):
this_column = df.columns[i]
df[this_column] = [i, i 1]
df
And I got the correct one:
Then I tried to make my implemetation as below:
import pandas as pd
df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])
SLS = [58, 100]
row = 0
for _, slc in enumerate(SLS):
for single_date in daterange(start_date, end_date):
df[row] = [single_date.strftime("%Y-%m-%d"), slc]
row = row 1
print(type(row), type(df))
df
But the result I got was a horizontal dataframe, not a vertical one
Even the data in the main hedears got posted as NAN
?
I tried using enforced header type declaration, but gave same result:
import pandas as pd
import numpy as np
#Create empty DataFrame with specific column names & types
# Using NumPy
dtypes = np.dtype(
[
('DATE1',np.datetime64),
('SLS_CNTR_ID', int),
]
)
df = pd.DataFrame(np.empty(0, dtype=dtypes))
#df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])
print(df)
SLS = [58, 100]
row = 0
for _, slc in enumerate(SLS):
for single_date in daterange(start_date, end_date):
df[row] = [single_date.strftime("%Y-%m-%d"), slc]
row = row 1
print(type(row), type(df))
df
CodePudding user response:
Use df.loc[row]
instead of df[row]
to set the rows.
Though I'd rather implement this using a merge instead of the loops:
(pd.DataFrame({"DATE1": pd.date_range("2020-01-01", "2020-02-01")})
.merge(pd.Series(SLS, name="SLS_CNTR_ID"), how="cross"))
Or leverage itertools
to obtain the cross-product:
import itertools
dates = pd.date_range("2020-01-01", "2020-02-01")
SLS = [58, 100]
pd.DataFrame(itertools.product(SLS, dates), columns=["SLS_CNTR_ID", "DATE1"])
SLS_CNTR_ID DATE1
0 58 2020-01-01
1 58 2020-01-02
2 58 2020-01-03
3 58 2020-01-04
4 58 2020-01-05
.. ... ...
59 100 2020-01-28
60 100 2020-01-29
61 100 2020-01-30
62 100 2020-01-31
63 100 2020-02-01
[64 rows x 2 columns]