How to remove some labels from a pie chart-CodePudding

I have a dataframe that looks like this, but larger:

title_of_the_novel                author          publishing_year   mentioned_cities   
0   Beasts and creatures        Bruno Ivory             1850           London 
0   Monsters                    Renata Mcniar           1866           New York 
0   At risk                     Charles Dobi            1870           New York   
0   Manuela and Ricardo         Lucas Zacci             1889           Rio de Janeiro
0   War against the machine     Angelina Trotter        1854           Paris


df_1880_1890 = pd.DataFrame({'title_of_the_novel': [Beasts and creatures, Monsters],
                   'author': [Bruno Ivory, Renata Mcniar]},
                   'publishing_year': ['1850','1866'] 
                   'mentioned_cities': ['London','New York']

I have successfully plotted it on a pie chart using the following code:

1880s_data = result[df_1880_1890].groupby(['mentioned_cities']).sum().plot(
    kind='pie', y='publishing_year', autopct='%1.1f%%', radius=12, ylabel='', shadow=True)

1880s_data.legend().remove()

1880s_data_image = 1880s_data.get_figure()
1880s_data_image.savefig("1880s_pie_chart.pdf", bbox_inches='tight')

However, as my dataframe has many values, some of the labels on the pie chart represent only 0,5% or 1%. My objective is to remove all percentages below 4% from this pie chart. Can someone help me, please?

The picture of the output:

to fully remove wedges from a pie chart based on their percentage, we can add 2 lines to our previous code to iterate over the wedges at the same time when we iterate over the text labels and percentage labels. In our filtering condition we simply make the wedge itself invisible and remove its label so its not added to the legend.

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

totals_df = df.groupby(['mentioned_cities']).sum()
wedges, texts, autotexts = ax.pie(totals_df['publishing_year'], labels=totals_df.index, autopct='%1.1f%%')

threshold = 20
for wedge, label, pct_label in zip(wedges, texts, autotexts):
    pct_value = pct_label.get_text().rstrip('%')
    if float(pct_value) < threshold:
        label.set_text('')       # remove text label
        pct_label.set_text('')   # remove percentage label
        wedge.set_visible(False) # remove wedge from pie
        wedge.set_label('')      # ensure wedge label does not go into legend

ax.legend(bbox_to_anchor=(1.2, 1))

To fix the layout of the pie, this turns back into a little bit of a data problem. We need to group all of the below threshold cities together and then remove them from the pie post-hoc.

totals = df.groupby(['mentioned_cities'])['publishing_year'].sum()
proportions = totals_df / totals_df.sum()

threshold = 0.2
below_thresh_mask = proportions < threshold
plot_data = proportions[~below_thresh_mask]
plot_data.loc[''] = proportions[below_thresh_mask].sum()

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    plot_data, labels=plot_data.index, autopct='%1.1f%%'
)

for w, alab in zip(wedges, autotexts):
    if w.get_label() == '':
        w.set_visible(False)
        alab.set_visible(False)

ax.legend(bbox_to_anchor=(1.2, 1))

Though it may be better to simply group those cities into an "other" category.

totals = df.groupby(['mentioned_cities'])['publishing_year'].sum()
proportions = totals_df / totals_df.sum()

threshold = 0.2
below_thresh_mask = proportions < threshold
plot_data = proportions[~below_thresh_mask]
plot_data.loc['other'] = proportions[below_thresh_mask].sum()

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    plot_data, labels=plot_data.index, autopct='%1.1f%%'
)

ax.legend(bbox_to_anchor=(1.2, 1))