How do I change this:
Date | URL | Description | Category |
---|---|---|---|
2022-06-17 14:24:52 | /XYBkLO | public | A |
2022-06-17 14:24:52 | /XYBkLO | public | B |
2022-06-17 14:24:52 | /XYBkLO | public | C |
2022-06-17 14:25:05 | /ZWrTVu | public | A |
2022-06-17 14:25:05 | /ZWrTVu | public | B |
2022-06-17 14:25:05 | /ZWrTVu | public | C |
To this:
Date | URL | Description | Category | Date | URL | Description | Category | Date | URL | Description | Category |
---|---|---|---|---|---|---|---|---|---|---|---|
2022-06-17 14:24:52 | /XYBkLO | public | A | 2022-06-17 14:24:52 | /XYBkLO | public | B | 2022-06-17 14:24:52 | /XYBkLO | public | C |
2022-06-17 14:25:05 | /ZWrTVu | public | A | 2022-06-17 14:25:05 | /ZWrTVu | public | B | 2022-06-17 14:25:05 | /ZWrTVu | public | C |
I would like to keep everything with the same URL on one row, but I don't know how I could implement this using pandas. Is there perhaps another way or another library I should use? Could really use some help
CodePudding user response:
You can try:
from itertools import cycle, count, islice
from collections import defaultdict
def fn(x):
d = defaultdict(lambda: count(1))
names = cycle(x.columns)
vals = x.values.ravel()
return pd.DataFrame(
[vals],
columns=[f"{n}.{next(d[n])}" for n in islice(names, len(vals))],
)
x = df.groupby("URL").apply(fn).reset_index(drop=True)
print(x)
Prints:
Date.1 URL.1 Description.1 Category.1 Date.2 URL.2 Description.2 Category.2 Date.3 URL.3 Description.3 Category.3
0 2022-06-17 14:24:52 /XYBkLO public A 2022-06-17 14:24:52 /XYBkLO public B 2022-06-17 14:24:52 /XYBkLO public C
1 2022-06-17 14:25:05 /ZWrTVu public A 2022-06-17 14:25:05 /ZWrTVu public B 2022-06-17 14:25:05 /ZWrTVu public C
CodePudding user response:
You can look into pandas transpose function. transpose
Not 100% sure it will fit your use case but it seems like a good place to start.
CodePudding user response:
Here is another way:
(pd.concat(
[df.set_index('URL',drop=False) for _,df in df.groupby('Category')]
,axis=1)
.reset_index(drop=True))
Output:
Date URL Description Category Date URL Description Category Date URL Description Category
0 6/17/2022 14:24 /XYBkLO public A 6/17/2022 14:24 /XYBkLO public B 6/17/2022 14:24 /XYBkLO public C
1 6/17/2022 14:25 /ZWrTVu public A 6/17/2022 14:25 /ZWrTVu public B 6/17/2022 14:25 /ZWrTVu public C