I am new to Python, so apologies if I do not explain this well or can only provide a partial attempt.
I have a dataframe like the one below: a key, some dates (spread across rows), and many other columns whose values repeat for each key.
Key | Date 1 | Date 2 | Date 3 | Column X | Column Y |
---|---|---|---|---|---|
Key 1 | 2022-01-01 | | | X11111111 | Y11111111 |
Key 1 | | 2022-01-02 | | X11111111 | Y11111111 |
Key 1 | | | 2022-01-03 | X11111111 | Y11111111 |
Key 2 | 2022-12-01 | | | X22222222 | Y22222222 |
Key 2 | | 2022-12-02 | | X22222222 | Y22222222 |
Key 2 | | | 2022-12-03 | X22222222 | Y22222222 |
And I want to aggregate them as below, so that the dates are collapsed into one row per key while the other columns keep their values:
Key | Date 1 | Date 2 | Date 3 | Column X | Column Y |
---|---|---|---|---|---|
Key 1 | 2022-01-01 | 2022-01-02 | 2022-01-03 | X11111111 | Y11111111 |
Key 2 | 2022-12-01 | 2022-12-02 | 2022-12-03 | X22222222 | Y22222222 |
What would be the most efficient way of doing this? Thank you.
I have tried a normal pivot and aggregation, but they did not work the way I wanted.
CodePudding user response:
Assuming the empty cells are NaN, use groupby.first, which takes the first non-NaN value per column within each group:
out = df.groupby('Key', as_index=False).first()
NB. If the empty cells are empty strings, convert them to NaN first: df.replace('', float('nan')).groupby('Key', as_index=False).first().
Output:
Key Date 1 Date 2 Date 3 Column X Column Y
0 Key 1 2022-01-01 2022-01-02 2022-01-03 X11111111 Y11111111
1 Key 2 2022-12-01 2022-12-02 2022-12-03 X22222222 Y22222222
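A minimal runnable sketch of this answer, assuming the question's example frame with NaN in the blank cells (the frame below is reconstructed from the tables above):

```python
import pandas as pd
import numpy as np

# Rebuild the example: each row fills only one of the three date columns,
# while Column X / Column Y repeat within each key.
df = pd.DataFrame({
    'Key': ['Key 1'] * 3 + ['Key 2'] * 3,
    'Date 1': ['2022-01-01', np.nan, np.nan, '2022-12-01', np.nan, np.nan],
    'Date 2': [np.nan, '2022-01-02', np.nan, np.nan, '2022-12-02', np.nan],
    'Date 3': [np.nan, np.nan, '2022-01-03', np.nan, np.nan, '2022-12-03'],
    'Column X': ['X11111111'] * 3 + ['X22222222'] * 3,
    'Column Y': ['Y11111111'] * 3 + ['Y22222222'] * 3,
})

# first() picks the first non-NaN value per column within each group,
# which collapses the three sparse rows into one dense row per key.
out = df.groupby('Key', as_index=False).first()
print(out)
```

Note that `first()` skips NaN by default; if the blanks were empty strings instead, they would be treated as ordinary values and the result would differ, hence the `replace('', float('nan'))` step mentioned above.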
CodePudding user response:
Another possible solution:
(df.groupby('Key', group_keys=True)
 .apply(lambda g: g.ffill().bfill())  # fill the gaps within each group
 .drop_duplicates()                   # the filled rows are now identical
 .reset_index(drop=True))
Output:
Key Date 1 Date 2 Date 3 Column X Column Y
0 Key 1 2022-01-01 2022-01-02 2022-01-03 X11111111 Y11111111
1 Key 2 2022-12-01 2022-12-02 2022-12-03 X22222222 Y22222222
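A runnable sketch of this second approach, again assuming NaN blanks (the frame is reconstructed from the question; `group_keys=False` is used here so the result keeps a flat index without a separate reset of the group level):

```python
import pandas as pd
import numpy as np

# Reconstructed example frame with NaN in the blank date cells
df = pd.DataFrame({
    'Key': ['Key 1'] * 3 + ['Key 2'] * 3,
    'Date 1': ['2022-01-01', np.nan, np.nan, '2022-12-01', np.nan, np.nan],
    'Date 2': [np.nan, '2022-01-02', np.nan, np.nan, '2022-12-02', np.nan],
    'Date 3': [np.nan, np.nan, '2022-01-03', np.nan, np.nan, '2022-12-03'],
    'Column X': ['X11111111'] * 3 + ['X22222222'] * 3,
    'Column Y': ['Y11111111'] * 3 + ['Y22222222'] * 3,
})

# Per group: forward-fill then back-fill so every row carries the complete
# set of values, then keep a single copy of the now-identical rows.
out = (df.groupby('Key', group_keys=False)
         .apply(lambda g: g.ffill().bfill())
         .drop_duplicates()
         .reset_index(drop=True))
print(out)
```

The ffill/bfill pair matters: forward-fill alone would leave NaN in the earlier rows of each group, so only the combination makes all rows within a group identical before `drop_duplicates`.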