I am using the following code to group a pandas DataFrame and collect the grouped values into a list:
df.groupby(['Column1','Column2'])['Column3'].apply(list)
This returns all values from Column3 in each group's list.
How can I get each list to include only the first N values from Column3?
CodePudding user response:
Use slicing inside a lambda function:
df.groupby(['Column1','Column2'])['Column3'].apply(lambda x: list(x)[:N])
Or slice the Series with iloc before converting it to a list:
df.groupby(['Column1','Column2'])['Column3'].apply(lambda x: list(x.iloc[:N]))
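As a quick check, here is a minimal sketch of both variants on a small made-up DataFrame. The sample data and the value N = 2 are assumptions for illustration only; the column names follow the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Column1": ["a", "a", "a", "b", "b"],
    "Column2": ["x", "x", "x", "y", "y"],
    "Column3": [1, 2, 3, 4, 5],
})
N = 2  # assumed number of values to keep per group

grouped = df.groupby(["Column1", "Column2"])["Column3"]

# Variant 1: build the full list, then slice the list.
by_list_slice = grouped.apply(lambda x: list(x)[:N])

# Variant 2: slice the Series positionally, then build the list.
by_iloc = grouped.apply(lambda x: list(x.iloc[:N]))

print(by_list_slice.tolist())  # [[1, 2], [4, 5]]
print(by_iloc.tolist())        # [[1, 2], [4, 5]]
```

Both give the same result here. The iloc variant avoids building the full list for large groups.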
EDIT: To keep only the first N unique values per group:
df.groupby(['Column1','Column2'])['Column3'].apply(lambda x: list(x.unique()[:N]))
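A small sketch of the unique-values variant, assuming made-up data with duplicates and N = 2 (both are illustrative, not from the question). Series.unique() keeps values in order of first appearance, so the slice takes the first N distinct values.

```python
import pandas as pd

# Hypothetical sample data containing repeated values within each group.
df = pd.DataFrame({
    "Column1": ["a", "a", "a", "a", "b", "b"],
    "Column2": ["x", "x", "x", "x", "y", "y"],
    "Column3": [1, 1, 2, 3, 4, 4],
})
N = 2  # assumed number of unique values to keep per group

first_unique = (
    df.groupby(["Column1", "Column2"])["Column3"]
      .apply(lambda x: list(x.unique()[:N]))  # dedupe first, then take N
)

print(first_unique.tolist())
```

With this data, group (a, x) yields 1 and 2, and group (b, y) yields only 4 because it has a single distinct value.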
CodePudding user response:
You could use the str accessor to get the first N elements from the lists:
df.groupby(['Column1','Column2'])['Column3'].agg(list).str[:N]
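A minimal sketch of this approach on made-up data, with N = 2 assumed for illustration. The str accessor slices each element of the Series, and that works on lists as well as on strings.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Column1": ["a", "a", "a", "b", "b"],
    "Column2": ["x", "x", "x", "y", "y"],
    "Column3": [1, 2, 3, 4, 5],
})
N = 2  # assumed number of values to keep per group

# Collect every value per group into a list, then truncate each list.
full_lists = df.groupby(["Column1", "Column2"])["Column3"].agg(list)
truncated = full_lists.str[:N]

print(truncated.tolist())  # [[1, 2], [4, 5]]
```

Note that this builds the complete list for every group before truncating, so slicing inside the lambda may use less memory when groups are large.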