Group the data by one column and build one row per group from the values in the data frame

I have a data frame with multiple ids in the column id. For each day, there are 5 time steps (6:00, 6:15, 6:30, 6:45, 7:00). However, some days do not have all 5, and I want to fill the missing values with NaN. See the following example:

import pandas as pd
df = pd.DataFrame()
df['id'] =   [1, 1, 1, 1, 1, 2, 2, 2,3, 3]
df['val'] = [11, 10, 12, 3, 4, 5, 125, 45,31, -2]
df['date'] = ['2019-03-31 06:00:00','2019-03-31 06:15:00', '2019-03-31 06:30:00', '2019-03-31 06:45:00', '2019-03-31 07:00:00', '2019-03-31 06:00:00', '2019-03-31 06:30:00',
              '2019-03-31 06:45:00', '2019-03-31 06:00:00', '2019-03-31 06:15:00']
df

For example, for id=1 we have 5 time steps.

For id=2, we have 3 time steps.

For id=3, we have 2 time steps.

So I want to put the values for each id into a single row, and keep only the day (not the time) of the timestamp in that row.

Here is the desired output for my data frame:

Can you help me with this? Thank you so much.

CodePudding user response：

One way using pandas.DataFrame.pivot:

df["dates"] = pd.to_datetime(df["date"]).dt.date
new_df = df.pivot(index=["id", "dates"], columns="date", values="val")
new_df.columns = [f"val{i + 1}" for i in range(new_df.shape[1])]
new_df.reset_index()

Output:

   id       dates  val1  val2   val3  val4  val5
0   1  2019-03-31  11.0  10.0   12.0   3.0   4.0
1   2  2019-03-31   5.0   NaN  125.0  45.0   NaN
2   3  2019-03-31  31.0  -2.0    NaN   NaN   NaN
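
Note that the pivot above creates one column per distinct timestamp string, so the val1..val5 labels line up with the five time steps only because the sample covers a single day in which every step occurs at least once. A possible variant, sketched below, pivots on the time of day and then reindexes the columns against a fixed list of slots. The names slots, tidy, wide, day and slot are made up for this illustration, and the code assumes that every day uses the same five 15-minute steps from 06:00 to 07:00 and that pandas is version 1.1 or later (needed for a list passed as index to pivot).

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3],
    "val": [11, 10, 12, 3, 4, 5, 125, 45, 31, -2],
    "date": [
        "2019-03-31 06:00:00", "2019-03-31 06:15:00", "2019-03-31 06:30:00",
        "2019-03-31 06:45:00", "2019-03-31 07:00:00", "2019-03-31 06:00:00",
        "2019-03-31 06:30:00", "2019-03-31 06:45:00", "2019-03-31 06:00:00",
        "2019-03-31 06:15:00",
    ],
})

# Assumption: the expected time steps are the same for every day.
slots = ["06:00", "06:15", "06:30", "06:45", "07:00"]

# Split each timestamp into a calendar day and a time-of-day label.
ts = pd.to_datetime(df["date"])
tidy = df.assign(day=ts.dt.date, slot=ts.dt.strftime("%H:%M"))

# One row per (id, day), one column per time of day; reindex guarantees
# all five slots exist, filling slots that never occur with NaN.
wide = (
    tidy.pivot(index=["id", "day"], columns="slot", values="val")
        .reindex(columns=slots)
)
wide.columns = [f"val{i + 1}" for i in range(len(slots))]
wide = wide.reset_index()
print(wide)
```

With this layout, data spanning several days still yields exactly five value columns, and a slot that is missing from the whole data set still appears as an all-NaN column.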