Pandas horizontal pivot-CodePudding

I have a table with diagnoses icd_code and patients subject_id as seen in the screenshot below.

One patient can have several diagnoses. I would like to arrange the table such that subject_id is the index and include a list of all diagnoses of that patient horizontally in separate columns. Is there a way I can do that with pandas? Also, how can I count how many diagnoses each patient has?

Thanks and best regards

CodePudding user response：

Try with groupby and pivot:

df["idx"] = df.groupby("subject_id").cumcount() 1
output = df.pivot("subject_id", "idx", "icd_code")
output["Count"] = output.count(axis=1)

>>> output
idx             1      2     3      4      5  Count
subject_id                                         
11442057    65971  64231  V270  64511    NaN      4
15734973     2825  V0251  V270  64891  66481      5