Get a list of values from a pandas dataframe

I have a DataFrame like this

>>> df.head()
Date Region   Manager   SalesMan    Item         Units  Unit_price  Sale_amt
 0   East     Martha    Alexander   Television   ...      ...         ...
 1   Central  Hermann   Shelli      Home Theater ...      ...         ...
 2   Central  Hermann   Luis        Television   ...      ...         ...
 3   Central  Timothy   David       CellPhone    ...      ...         ...
 4   West     Timothy   Stephen     Television   ...      ...         ...

Here are the unique Managers and SalesMen

df['Manager'].unique()
array(['Martha', 'Hermann', 'Timothy', 'Douglas'], dtype=object)


df['SalesMan'].unique()
array(['Alexander', 'Shelli', 'Luis', 'David', 'Stephen', 'Steven',
       'Michael', 'Sigal', 'Diana', 'Karen', 'John'], dtype=object)

I want a DataFrame that contains the unique managers and, for each one, the list of unique salesmen under that manager. For example, for the DataFrame above, I want output like:

Manager     list_of_salesmen
Martha      [Alexander]
Hermann     [Shelli, Luis]
Timothy     [David, Stephen]

I thought of using groupby but got stuck there. How do I go about solving this problem?

CodePudding user response：

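You could use groupby.agg on Manager and pass list for the SalesMan column. Note that list keeps repeated names if a salesman appears in more than one row for the same manager: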

>>> df.groupby('Manager').agg({'SalesMan':list})

                SalesMan
Manager                  
Hermann    [Shelli, Luis]
Martha        [Alexander]
Timothy  [David, Stephen]
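
Since the question asks for unique salesmen, here is a minimal sketch of a de-duplicating variant. It assumes the five sample rows from the question (the elided numeric columns are left out), adds one hypothetical repeated row to show the effect, and uses the column name list_of_salesmen from the desired output. sort=False keeps the managers in order of first appearance.

```python
import pandas as pd

# Sample rows copied from the question; the elided numeric columns are omitted.
df = pd.DataFrame({
    "Region": ["East", "Central", "Central", "Central", "West"],
    "Manager": ["Martha", "Hermann", "Hermann", "Timothy", "Timothy"],
    "SalesMan": ["Alexander", "Shelli", "Luis", "David", "Stephen"],
})

# Hypothetical extra row: repeat (Hermann, Shelli) so there is a duplicate to drop.
df = pd.concat([df, df.iloc[[1]]], ignore_index=True)

result = (
    df.groupby("Manager", sort=False)["SalesMan"]
      .unique()                                # one array of unique names per manager
      .apply(list)                             # convert each array to a plain list
      .reset_index(name="list_of_salesmen")    # Manager becomes a regular column
)
print(result)
```

With agg({'SalesMan': list}) instead, the repeated row would put Shelli in Hermann's list twice.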

CodePudding user response：

This can be done by building a dict that holds the data for the new DataFrame and converting it with pandas.DataFrame.from_dict():

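import pandas as pd

# One entry per unique manager, paired with that manager's unique salesmen
managers = df['Manager'].unique()
d = {'Manager': list(managers), 'SalesMan': []}

for manager in managers:
    salesmen = df.loc[df['Manager'] == manager, 'SalesMan'].unique()
    d['SalesMan'].append(list(salesmen))

df2 = pd.DataFrame.from_dict(d)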
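
As a variation on the same idea, the dict can also be filled by iterating over a groupby object, which yields one (manager, sub-DataFrame) pair per group and avoids filtering the whole frame once per manager. This is only a sketch on the sample rows from the question, not a claim about which approach is faster on real data.

```python
import pandas as pd

# Sample rows copied from the question; other columns are omitted.
df = pd.DataFrame({
    "Manager": ["Martha", "Hermann", "Hermann", "Timothy", "Timothy"],
    "SalesMan": ["Alexander", "Shelli", "Luis", "David", "Stephen"],
})

d = {"Manager": [], "SalesMan": []}

# Grouping by a single column name yields (key, group) pairs with a scalar key.
for manager, group in df.groupby("Manager", sort=False):
    d["Manager"].append(manager)
    d["SalesMan"].append(list(group["SalesMan"].unique()))

df2 = pd.DataFrame.from_dict(d)
print(df2)
```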