I have this dataframe
Country | Value
England 1
England 1
England 2
USA 3
USA 3
USA 2
Germany 1
Thailand 2
My desired Dataframe
Country | Value Cat 1 cat 2 cat 3
England 1 2 1 0
USA 3 0 1 2
Germany 1 1 0 0
Thailand 2 0 1 0
CodePudding user response:
One simple solution is to count the number of repetitions using a dictionary. You can iterate through the rows of a dataframe using the "iterrows" method.
from collections import defaultdict
# This dictionary will be used to count the number of ocurrences of a given string.
d = defaultdict(int)
for index,row in df.iterrows():
country = row['Country']
value = row['Value']
if value ==1:
d[country] = 1
# We define the column with the following function
df['Cat 1'] = df['Country'].apply(lambda text: d[text])
CodePudding user response:
You can unstack
the inner level of value_counts
out = (df.value_counts(['Country', 'Value'])
.unstack().fillna(0)
.add_prefix('Cat ')
.rename_axis(None, axis=1)
.reset_index())
out['Value'] = df.groupby(['Country'])['Value'].first().values
print(out)
Country Cat 1 Cat 2 Cat 3 Value
0 England 2.0 1.0 0.0 1
1 Germany 1.0 0.0 0.0 1
2 Thailand 0.0 1.0 0.0 2
3 USA 0.0 1.0 2.0 3