Currently I'm trying to count different values in a dataframe. Example: the *.csv I import looks like:
| market | availability |
|---|---|
| dach | available |
| dach | available |
| nl | offline |
| fr | available |
| nl | offline |
| fr | in_call |
| dach | available |
| fr | in_call |
So I need to count the occurrences of each availability status, per market. Currently I have:
import pandas as pd

def dach():
    dfa = pd.read_csv("./_data/users/availabilities.csv", encoding='ISO-8859-1')
    df_dach = dfa.groupby(dfa.market).get_group("DACH")
    dach_available = len(df_dach[df_dach['availability'] == "available"])
    dach_in_call = len(df_dach[df_dach['availability'] == "in_call"])
    dach_offline = len(df_dach[df_dach['availability'] == "offline"])
    dach_do_not_disturb = len(df_dach[df_dach['availability'] == "do_not_disturb"])
    dach_after_call_work = len(df_dach[df_dach['availability'] == "after_call_work"])
But doing it this way for 10 different markets is hideous to look at in my files. Is there a way to have one function that takes an input parameter and returns the info for any market?
So that I get the individual counts of the 5 statuses, per market.
CodePudding user response:
I think you should just be able to do (note that `groupby().value_counts()` on a DataFrame requires pandas 1.4 or newer):
dfa = pd.read_csv("./_data/users/availabilities.csv", encoding='ISO-8859-1')
dfa.groupby('market').value_counts()
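If you also want a wide table with explicit zeros, `pd.crosstab` does the pivot in one step. A minimal sketch, using a small assumed frame with the question's columns in place of the real CSV:

```python
import pandas as pd

# Stand-in for the CSV from the question (assumed values)
dfa = pd.DataFrame({
    "market": ["dach", "dach", "nl", "fr", "nl", "fr", "dach", "fr"],
    "availability": ["available", "available", "offline", "available",
                     "offline", "in_call", "available", "in_call"],
})

# One row per market, one column per status; missing pairs filled with 0
counts = pd.crosstab(dfa["market"], dfa["availability"])
print(counts)
```

Each count is then `counts.loc[market, status]`; e.g. `counts.loc["nl", "available"]` is 0 even though no such row exists in the data.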
CodePudding user response:
What about using a Counter?
from collections import Counter
for market, group in df.groupby('market'):
    print(market)
    print(Counter(group.availability))
    print()
dach
Counter({'available': 3})
fr
Counter({'in_call': 2, 'available': 1})
nl
Counter({'offline': 2})
for market, group in df.groupby('market'):
    print(market)
    print(group.availability.value_counts())
    print()
dach
available 3
Name: availability, dtype: int64
fr
in_call 2
available 1
Name: availability, dtype: int64
nl
offline 2
Name: availability, dtype: int64
For a function, I would do something like:
from collections import Counter
def market_status(market, df):
    d = Counter({
        "available": 0,
        "in_call": 0,
        "offline": 0,
        "do_not_disturb": 0,
        "after_call_work": 0,
    })
    d.update(Counter(df.loc[df.market == market].availability))
    return d
print(market_status('fr', df))
Counter({'in_call': 2, 'available': 1, 'offline': 0, 'do_not_disturb': 0, 'after_call_work': 0})
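The same zero-filled result can also come from pandas alone: `value_counts` followed by `reindex` over the full status list. A sketch under the same assumptions (status names taken from the question, a small stand-in frame in place of the CSV):

```python
import pandas as pd

STATUSES = ["available", "in_call", "offline", "do_not_disturb", "after_call_work"]

# Stand-in for the CSV from the question (assumed values)
df = pd.DataFrame({
    "market": ["dach", "dach", "nl", "fr", "nl", "fr", "dach", "fr"],
    "availability": ["available", "available", "offline", "available",
                     "offline", "in_call", "available", "in_call"],
})

def market_status(market, df):
    # Count statuses for one market, then fill absent statuses with 0
    return (df.loc[df.market == market, "availability"]
              .value_counts()
              .reindex(STATUSES, fill_value=0))

print(market_status("fr", df))
```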
CodePudding user response:
This will give you the grouping I think you are looking for:
dfa = pd.read_csv("groupbycolumn.csv", encoding='ISO-8859-1')
dfa.groupby(["Market", "Availability"])["Availability"].count()
Output:
Market Availability
A Available 1
dont dist 2
in_call 1
B Available 1
dont dist 1
in_call 3
C Available 1
dont dist 2
H Available 1
in_call 3
J in_call 1
Name: Availability, dtype: int64
This comes from a sample csv file I created to give an example of how the output will look.
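If a wide table is preferred over the multi-index Series above, `.unstack(fill_value=0)` pivots the inner level into columns. A sketch with a small assumed frame standing in for the sample csv:

```python
import pandas as pd

# Small stand-in for the sample csv (assumed values)
dfa = pd.DataFrame({
    "Market": ["A", "A", "B", "B"],
    "Availability": ["Available", "in_call", "in_call", "in_call"],
})

counts = dfa.groupby(["Market", "Availability"])["Availability"].count()
# Pivot the Availability level into columns; absent pairs become 0
table = counts.unstack(fill_value=0)
print(table)
```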