Create different dataframes from a main dataframe and order by frequency

I have a dataframe that looks like this:

a     b      name    .....
1     1      abc
2     2      xyz
3     3      abc
4     4      dfg

Now I need to create multiple dataframes based on the frequency of the names, e.g. df_abc should hold all the rows where name is "abc", and so on. I tried using a for loop, but I'm new to Python and could not get it to work. Thanks!

df_abc:

a    b     name
1    1     abc
3    3     abc

CodePudding user response：

You can use .groupby and wrap the result in list, which yields a list of (name, sub-dataframe) tuples. A dict comprehension over those tuples then lets you access each dataframe by name, e.g. my_dict["abc"].

import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [1, 2, 3, 4], "name": ["abc", "xyz", "abc", "dfg"]}
)
# Iterating over a groupby yields (name, sub-dataframe) tuples
my_dict = {name: sub_df for name, sub_df in list(df.groupby("name"))}

for name, sub_df in my_dict.items():
    print(f"df_{name}:\n{sub_df}\n")
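
The title also asks for ordering by frequency, which the snippet above does not address: groupby sorts the group keys alphabetically by default. A minimal sketch, assuming "frequency" means the number of rows per name, could use value_counts to order the dictionary from most to least frequent. The names counts, grouped and dfs_by_freq are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [1, 2, 3, 4], "name": ["abc", "xyz", "abc", "dfg"]}
)

# value_counts() returns the names sorted by row count, most frequent first
counts = df["name"].value_counts()

grouped = df.groupby("name")
# dicts preserve insertion order, so iterating counts.index fixes the order
dfs_by_freq = {name: grouped.get_group(name) for name in counts.index}

for name, sub_df in dfs_by_freq.items():
    print(f"{name} ({counts[name]} rows):\n{sub_df}\n")
```

Here "abc" comes first because it appears twice; the relative order of names with the same count is not something this sketch controls.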

CodePudding user response：

You could create a dictionary of dataframes, one for each unique value in the 'name' column, and then look up each dataframe by its key as you would with any dictionary.

Here is an example:

import pandas as pd
from io import StringIO 

d = """
a     b      name
1     1      abc
2     2      xyz
3     3      abc
4     4      dfg
"""

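# The columns are separated by runs of spaces, so split on any whitespace
df = pd.read_csv(StringIO(d), sep=r"\s+")

dfs = {}
# Loop over the distinct names only, so each subset is built once
for item in df['name'].unique():
    dfs[item] = df.loc[df['name'] == item]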

>>> dfs.keys()
dict_keys(['abc', 'xyz', 'dfg'])

>>> dfs['abc']
   a  b name
0  1  1  abc
2  3  3  abc

>>> dfs['xyz']
   a  b name
1  2  2  xyz

>>> dfs['dfg']
   a  b name
3  4  4  dfg
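
Note that each sub-dataframe above keeps the row labels of the original dataframe (0 and 2 for 'abc'). If a fresh 0-based index is preferred, one possible variation is to call reset_index(drop=True) on each piece. This is only a sketch; the name dfs_reset is illustrative, and the dataframe is built inline so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [1, 2, 3, 4], "name": ["abc", "xyz", "abc", "dfg"]}
)

# One dataframe per distinct name, each renumbered from 0
dfs_reset = {
    name: df.loc[df["name"] == name].reset_index(drop=True)
    for name in df["name"].unique()
}

print(dfs_reset["abc"])
```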