How to convert the values in rows with the same id into one list? (pandas in Python)

I loaded the CSV file like this:

import pandas as pd

# Open the first dataset
train = pd.read_csv("order_products__train.csv", index_col="order_id")

The data looks like:

         product_id
order_id                 
1              1
1              2
1              3
1              4
2              1
2              2
2              3
2              4
2              5
2              6

What I want is a data frame that looks like this:

order_id       product_id
1              1,2,3,4
2              1,2,3,4,5,6

because in the end I want to generate a nested list like this:

[[1,2,3,4],[1,2,3,4,5,6]]

Could anyone help?

CodePudding user response:

You can use the .groupby() function to do that:

train = train.groupby(['order_id'])['product_id'].apply(list)

That gives you the expected output:

order_id
1          [1, 2, 3, 4]
2    [1, 2, 3, 4, 5, 6]

Finally, you can convert this to a DataFrame or directly to a list to get what you want:

train = train.to_frame()  # To pd.DataFrame
# Or
train = train.to_list()  # To nested lists [[1,2,3,4],[1,2,3,4,5,6]]
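
For reference, here is a minimal, self-contained sketch of the same approach. It assumes the sample rows from the question rather than the real CSV file, and the names grouped, nested and joined are only illustrative. It also shows one possible way to get the comma-separated layout from the question by joining the ids as strings.

```python
import pandas as pd

# Hypothetical stand-in for order_products__train.csv, built from the rows in the question.
train = pd.DataFrame(
    {
        "order_id": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
        "product_id": [1, 2, 3, 4, 1, 2, 3, 4, 5, 6],
    }
).set_index("order_id")

# One list of product ids per order; the result is a Series indexed by order_id.
grouped = train.groupby("order_id")["product_id"].apply(list)

# Nested Python list, one inner list per order.
nested = grouped.to_list()

# DataFrame holding comma-separated strings, similar to the layout asked for.
joined = (
    train.groupby("order_id")["product_id"]
    .apply(lambda s: ",".join(map(str, s)))
    .to_frame()
)

print(nested)
print(joined)
```

Grouping by "order_id" works here even though it is the index, because pandas accepts index level names as group keys.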

CodePudding user response:

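There are probably better ways, but you can simply loop over the unique order ids. Since order_id was set as the index when reading the file, filter on train.index rather than on a column:

list_product = []
# order_id is the index (index_col="order_id"), so it is not available as train["order_id"]
for i in train.index.unique():
  tmp = train[train.index == i]
  list_product.append(tmp["product_id"].to_list())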
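
The loop above filters the whole frame once per order id. A possible variation, sketched here on the sample rows from the question (not the real CSV), is to iterate over the groups directly so the data is split only once. Passing sort=False is a choice made for this sketch to keep orders in first-appearance order, as .unique() does.

```python
import pandas as pd

# Hypothetical stand-in for order_products__train.csv, built from the rows in the question.
train = pd.DataFrame(
    {
        "order_id": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
        "product_id": [1, 2, 3, 4, 1, 2, 3, 4, 5, 6],
    }
).set_index("order_id")

# Iterating over a grouped column yields (order_id, Series of product ids) pairs.
list_product = [
    products.to_list()
    for _, products in train.groupby("order_id", sort=False)["product_id"]
]

print(list_product)
```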