I am trying to create multiple horizontal bar plots for a dataset. The data contains finish times from a running race.
The dataframe has the following columns: Name, Age Group, Finish Time, Finish Place, Hometown, Times Ran The Race. Sample data below.
Name | Age Group | Finish Time | Finish Place | Hometown | Times Ran The Race |
---|---|---|---|---|---|
John | 30-39 | 15.5 | 1 | New York City | 2 |
Mike | 30-39 | 17.2 | 2 | Denver | 1 |
Travis | 40-49 | 20.4 | 1 | Louisville | 3 |
James | 40-49 | 22.1 | 2 | New York City | 1 |
I would like to create a bar plot similar to what is shown below. There would be 1 bar chart per age group, fastest runner on bottom of chart, runner name with city and number of times ran the race below their name.
Do I need a for loop or would a simple groupby work? The number and sizing of each age group can be dynamic based off the race so it is not a constant, but would be dependent on the dataframe that is used for each race.
CodePudding user response:
I used a loop. For each age group I extract the rows into a temporary data frame, collect each runner's name, hometown, and number of times they ran the race, and join those into multi-line label strings stored in a new list. Then I draw a horizontal bar chart and replace the tick labels on the y-axis with those strings.
for ag in df['Age Group'].unique():
    label_all = []
    # Temporary data frame holding only this age group
    tmp = df[df['Age Group'] == ag]
    labels = [[x,y,z] for x,y,z in zip(tmp.Name.values, tmp.Hometown.values, tmp['Times Ran The Race'].values)]
    for k in range(len(labels)):
        label_all.append(labels[k])
    # Join each runner's fields into one multi-line label string
    l_all = []
    for l in label_all:
        lbl = l[0] + '\n' + l[1] + '\n' + str(l[2]) + ' Time'
        l_all.append(lbl)
    # One horizontal bar chart per age group; the first row is drawn at the bottom
    ax = tmp[['Name', 'Finish Time']].plot(kind='barh', legend=False)
    ax.set_title(ag + ' Age Group')
    ax.set_yticklabels([l_all[x] for x in range(len(l_all))])
    ax.grid(axis='x')
    # Hide the frame around the plot
    for i in ['top','bottom','left','right']:
        ax.spines[i].set_visible(False)
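On the question of loop versus groupby: the two combine naturally, since iterating over `df.groupby(...)` yields one sub-frame per age group. The following is only a sketch of that idea, not a replacement for the code above. The function name `plot_age_groups`, the figure-height rule, and the `Agg` backend line are assumptions added for illustration; the sample frame is copied from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["John", "Mike", "Travis", "James"],
    "Age Group": ["30-39", "30-39", "40-49", "40-49"],
    "Finish Time": [15.5, 17.2, 20.4, 22.1],
    "Finish Place": [1, 2, 1, 2],
    "Hometown": ["New York City", "Denver", "Louisville", "New York City"],
    "Times Ran The Race": [2, 1, 3, 1],
})

def plot_age_groups(df):
    """Draw one horizontal bar chart per age group; return {age group: Axes}."""
    axes = {}
    # sort=False keeps the age groups in order of first appearance.
    for ag, grp in df.groupby("Age Group", sort=False):
        # barh draws the first row at the bottom, so put the fastest runner first.
        grp = grp.sort_values("Finish Time")
        labels = (grp["Name"] + "\n" + grp["Hometown"] + "\n"
                  + grp["Times Ran The Race"].astype(str) + " Time")
        # Hypothetical sizing rule: roughly one inch of height per runner.
        fig, ax = plt.subplots(figsize=(6, 1 + len(grp)))
        ax.barh(labels, grp["Finish Time"])
        ax.set_title(f"{ag} Age Group")
        ax.set_xlabel("Time (Minutes)")
        axes[ag] = ax
    return axes

axes = plot_age_groups(df)
```

Because the figure height is derived from `len(grp)`, the number of age groups and the number of runners in each can vary from race to race without changing the code.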
CodePudding user response:
Here's a fairly compact solution. The only tricky part is the ordinal number, if you really want to have that. I copied the lambda solution from "Ordinal numbers replacement".
Give this a try and please mark the answer with Up-button if you like it.
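import matplotlib.pyplot as plt

# Ordinal suffix helper (1 -> "1st", 2 -> "2nd", 11 -> "11th").
# Integer division (//) is needed in Python 3 so the tens-digit check works for 11, 12, 13.
ordinal = lambda n: "{}{}".format(n,"tsnrhtdd"[(n//10%10!=1)*(n%10<4)*n%10::4])

for i, a in enumerate(df['Age Group'].unique()):
    plt.figure(i)  # one figure per age group
    dfa = df.loc[df['Age Group'] == a].copy()
    # Multi-line label: name, hometown, and how many times the runner has run the race
    dfa['Info'] = dfa.Name + '\n' + dfa.Hometown + '\n' + \
        [ordinal(row) for row in dfa['Times Ran The Race']] + ' Time'
    plt.barh(dfa.Info, dfa['Finish Time'])
    plt.title(f'{a} Age Group')
    plt.xlabel("Time (Minutes)")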
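The one-line ordinal lambda is hard to read, so here is a sketch of the same rule written out as a plain function. It is an illustrative rewrite, not the code from the linked answer: numbers whose tens digit is 1 (11, 12, 13, 111, ...) always take "th", and otherwise the last digit picks "st", "nd", "rd", or "th".

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    n = int(n)  # also accepts NumPy integers coming out of a DataFrame column
    if n // 10 % 10 == 1:
        # Tens digit is 1: 10 through 19, 110 through 119, and so on.
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

# Example with the "Times Ran The Race" values from the sample data.
labels = [ordinal(t) + " Time" for t in [2, 1, 3, 1]]
```

It can be dropped in wherever the lambda is used, since it takes a single integer and returns a string.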