I have a dataframe which looks like this:
a b name .....
1 1 abc
2 2 xyz
3 3 abc
4 4 dfg
Now I need to create multiple dataframes, one per distinct value in the name column. For example, df_abc should contain all the rows where name is "abc", and so on. I tried using a for loop, but I'm new to Python and couldn't get it to work. Thanks!
df_abc
a b name
1 1 abc
3 3 abc
CodePudding user response:
You can use .groupby and list, which yields a list of (name, dataframe) tuples. With a dict comprehension you can then access each dataframe by its name, e.g. my_dict["abc"].
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [1, 2, 3, 4], "name": ["abc", "xyz", "abc", "dfg"]}
)
# Map each distinct name to the sub-dataframe holding its rows
my_dict = {name: group for name, group in list(df.groupby("name"))}
for val, df_val in my_dict.items():
    print(f"df:{df_val}\n")
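If only one of the groups is needed, building the whole dictionary may be unnecessary. Below is a minimal sketch, assuming the same sample data as above; the names grouped, df_abc and frames are just illustrative. It uses GroupBy.get_group to fetch a single name and shows that iterating the GroupBy object directly also yields (name, dataframe) pairs.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [1, 2, 3, 4], "name": ["abc", "xyz", "abc", "dfg"]}
)

grouped = df.groupby("name")

# Fetch a single group without building a dictionary first.
df_abc = grouped.get_group("abc")

# Iterating the GroupBy object yields (name, dataframe) pairs,
# so a dict comprehension collects every per-name frame.
frames = {name: group for name, group in grouped}

print(df_abc)
print(sorted(frames))
```

Both routes return the same rows for "abc"; the dictionary is mainly useful when you want to loop over every name.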
CodePudding user response:
You could create a dictionary of dataframes, with one entry per unique value in the 'name' column, each holding the rows filtered for that value. You can then reference each dataframe as you would any dictionary entry.
Here is an example:
import pandas as pd
from io import StringIO
d = """
a b name
1 1 abc
2 2 xyz
3 3 abc
4 4 dfg
"""
df = pd.read_csv(StringIO(d), sep=" ")[['a', 'b', 'name']]
dfs = {}
# Build one filtered dataframe per distinct name
for item in df['name'].unique():
    dfs[item] = df.loc[df['name'] == item]
>>> dfs.keys()
dict_keys(['abc', 'xyz', 'dfg'])
>>> dfs['abc']
   a  b name
0  1  1  abc
2  3  3  abc
>>> dfs['xyz']
   a  b name
1  2  2  xyz
>>> dfs['dfg']
   a  b name
3  4  4  dfg
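Note that the filtered dataframes keep the row labels of the original dataframe (0 and 2 for 'abc' in the output above). If you would rather have each one numbered from 0, something like the following sketch should work. It assumes the same sample data, and the names abc_rows and abc_renumbered are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [1, 2, 3, 4], "name": ["abc", "xyz", "abc", "dfg"]}
)

# Boolean-mask filtering keeps the original row labels (0 and 2 here).
abc_rows = df.loc[df["name"] == "abc"]

# reset_index(drop=True) discards the old labels and renumbers from 0.
abc_renumbered = abc_rows.reset_index(drop=True)

print(abc_rows.index.tolist())
print(abc_renumbered.index.tolist())
```

Keeping the original labels can be handy if you later want to match rows back to the source dataframe, so resetting is a matter of preference.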