I have a column of values named exceeded_amount
and I would like to plot it as a histogram. The way I do it, the plot is not legible.
How could I group the values into bins with different ranges, for example 1-100, 101-500, 501-20000?
Alternatively, please advise on the best way to visualize the exceeded_amount
column.
tr['exceeded_amount'].plot()
UPDATE as per JohanC
However, I'd like to have specific range amounts shown on the axes rather than 10^2, 10^3...etc.
ax = sns.boxenplot(x=tr['exceeded_amount'])
ax.set_xscale('log')
[19.0, 193.0, 4928.0, 1956.0, 171.0, 163.7, 231.0, 5.0, 878.5, 190.46, 89.0, 4.0, 35.0, 393.0, 171.0, 546.0, 99.98, 93.36, 0.82, 419.14, 181.0, 42.27, 2807.0, 116.0, 1199.0, 16.0, 128.0, 412.0, 100.0, 1070.4, 461.0, 377.0, 266.0, 930.0, 625.99, 237.5, 157.67, 58.0, 870.88, 329.5, 1418.0, 391.0, 329.0, 182.81, 329.5, 98.0, 211.0, 1.0, 557.0, 1284.04, 131.0, 113.33, 64.0, 46.66, 598.48, 149.0, 561.0, 14.83, 209.0, 454.7, 273.33, 21.0, 724.0, 2226.0, 209.0, 23.0, 853.56, 89.0, 63.25, 28.0, 41.0, 303.5, 103.82, 162.01, 1763.0, 8.0, 2359.0, 1171.0, 194.68, 1031.0, 362.0, 333.0, 312.0, 854.65, 630.0, 833.0, 691.0, 227.0, 139.47, 277.56, 1642.0, 27.0, 166.0, 931.0, 968.7, 27.33, 338.0, 201.0, 77.0, 7547.04, 0.49, 568.0, 307.07, 203.0, 167.56, 1138.78, 111.0, 51.0, 423.0, 504.62, 353.97, 51.0, 416.0, 68.05, 16.0, 7.39, 631.0, 551.0, 596.0, 89.63, 777.0, 207.0, 167.56, 246.0, 503.99, 22.0, 65.79, 21.0, 747.0, 5058.26, 1673.0, 275.92, 108.66, 99.5, 893.0, 67.0, 49.0, 663.0, 72.6, 1824.66, 127.0, 239.71, 1306.0, 815.62, 100.88, 253.0, 636.0, 600.5, 321.0, 111.0, 545.3, 312.0, 17.0, 343.61, 5933.0, 310.0, 356.0, 284.0, 139.0, 877.0, 48.95, 715.0, 126.33, 1275.0, 149.0, 8.99, 71.0, 241.0, 116.0, 225.0, 882.07, 81.0, 121.0, 53.93, 496.0, 2636.85, 71.0, 81.0, 8222.0, 52.33, 114.0, 437.0, 95.0, 28967.0, 142.0, 1.0, 1271.2, 683.76, 184.0, 220.4, 182.0, 618.0, 119.67, 661.85, 71.0, 22.37, 570.4, 388.88, 113.0, 290.0, 137.03, 3879.0, 619.0, 720.45, 961.5, 11.0, 101.0, 14.0, 1189.0, 1038.0, 246.0, 422.0, 153.4, 6999.4, 288.4, 707.28, 22681.0, 698.0, 305.0, 1097.0, 91.0, 147.0, 4793.26, 26.0, 309.0, 37.66, 59.2, 422.0, 417.13, 344.99, 29.0, 437.0, 545.0, 695.0, 39.66, 380.0, 709.1, 291.0, 1596.0, 920753.0, 115.68, 145.19, 81.0, 764.0, 751.63, 766.93, 2141.0, 327.0, 1358.3, 381.0, 115.0, 116.0, 571.0, 84.0, 697.0, 33.0, 1589.0, 123.05, 11.5, 1297.0, 71.0, 427.99, 63.0, 153.99, 197.99, 168.99, 1271.2, 30.0, 671.0, 582.33, 445.08, 378.0, 114.5, 512.0, 739.5, 411.0, 58.0, 1263.0, 436.69, 
26.53, 14467.99, 1.0, 1659.82, 50.0, 103.07, 364.0, 191.2, 761.0, 225.0, 645.0, 129.0, 185.0, 22.44, 292.06, 342.4, 3347.0, 76.0, 217.5, 870.99, 54.0, 1218.0, 210.51, 111.0, 252.0, 1597.4, 123.08, 556.0, 148.0, 131.0, 356.0, 178.12, 99341.0, 422.0, 163.0, 551.0, 1992.0, 176.0, 366.0, 263.0, 156.0, 213.0, 177.0, 1095.38, 83.0, 375.32, 750.0, 203.66, 554.0, 201.72, 225.0, 267.0, 637.95, 89.0, 76.0, 189.48, 1072.21, 13.0, 284.0, 86.0, 336.99, 33.53, 117.66, 100.99, 854.0, 2985.95, 157.99, 5.01, 322.0, 51.0, 408.0, 1331.0, 312.0, 281.0, 296.18, 287.0, 197.0, 557.08, 141.0, 556.0, 16.8, 1511.36, 27.35, 225.0, 841.0, 380.0, 1211.1, 1068.11, 529.31, 4372.0, 46.0, 181.0, 225.0, 135.0, 1655.66, 3865.0, 172.0, 286.0, 143.0, 1391.0, 65.0, 76.0, 1316.0, 2419.0, 893.0, 165.0, 196.0, 15.99, 537.27, 38.0, 51.0, 380.0, 265.0, 341.0, 276.38, 135.0, 716.0, 4915.0, 59.0, 130.0, 557.08, 3178.0, 1043.8, 473.0, 1938.99, 486.0, 2272.0, 61.0, 141.27, 312.0, 252.0, 79.0, 441.0, 21.0, 71.18, 44.0, 113.0, 2294.0, 1259.0, 120.08, 881.0, 280.39, 6.0, 18.0, 42.0, 209.0, 462.0, 152.0, 301.0, 244.0, 1110.0, 149.0, 877.0, 711.0, 1978.0, 184.95, 666.0, 322.0, 205.0, 309.0, 476.0, 3178.0, 1328.0, 428.0, 183.51, 63.0, 684.0, 254.42, 354.0, 116.0, 135.0, 144.67, 31.0, 136.0, 361.0, 272.09, 737.0, 3347.0, 363.74, 506.0, 209.99, 4827.0, 545.0, 412.0, 1636.0, 96.0, 238.0, 422.0, 109.0, 44.0, 287.0, 327.99, 349.19, 28.99, 279.0, 181.0, 629.0, 137.75, 71.0, 2357.0, 493.0, 340.0, 177.16, 71.2, 4819.74, 22.0, 71.0, 73.73, 343.0, 121.0, 2272.0, 201.56, 1831.0, 158.98, 493.0, 576.8, 260.97, 847.0, 73.0, 5.0, 251.0, 207.0, 174.0, 82.86, 131.0, 1053.0, 353.0, 101.0, 854.0, 259.77, 12.37, 385.0, 9.27, 286.0, 85.0, 98.14, 21.0, 31.0, 71.0, 178.0, 63.0, 517.38, 118.0, 2350.0, 143.0, 88.0, 61.0, 297.0, 64.15, 20.56, 117.0, 189.0, 177.0, 630.0, 2997.0, 9961.0, 236.0, 240.0, 459.99, 3.0, 608.0, 341.0, 11.0, 1052.0, 42.0, 341.0, 21.0, 395.0, 575.0, 635.99, 539.83, 30.0, 570.0, 75.0, 503.99, 3774.0, 446.0, 87.0, 113.66, 
217.5, 489.0, 41.0, 626.99, 461.0, 514.88, 813.99, 43.62, 1663.0, 96.0, 276.06, 73.75, 302.0, 68.0, 651.0, 25.0, 34.0]
CodePudding user response:
You can explicitly set your own bin edges, and convert the x-axis to log scale:
from matplotlib import pyplot as plt
from matplotlib.ticker import ScalarFormatter
import seaborn as sns
import numpy as np
values = [19.0, 193.0, 4928.0, 1956.0, 171.0, 163.7, 231.0, 5.0, 878.5, 190.46, 89.0, 4.0, 35.0, 393.0, 171.0, 546.0, 99.98, 93.36, 0.82, 419.14, 181.0, 42.27, 2807.0, 116.0, 1199.0, 16.0, 128.0, 412.0, 100.0, 1070.4, 461.0, 377.0, 266.0, 930.0, 625.99, 237.5, 157.67, 58.0, 870.88, 329.5, 1418.0, 391.0, 329.0, 182.81, 329.5, 98.0, 211.0, 1.0, 557.0, 1284.04, 131.0, 113.33, 64.0, 46.66, 598.48, 149.0, 561.0, 14.83, 209.0, 454.7, 273.33, 21.0, 724.0, 2226.0, 209.0, 23.0, 853.56, 89.0, 63.25, 28.0, 41.0, 303.5, 103.82, 162.01, 1763.0, 8.0, 2359.0, 1171.0, 194.68, 1031.0, 362.0, 333.0, 312.0, 854.65, 630.0, 833.0, 691.0, 227.0, 139.47, 277.56, 1642.0, 27.0, 166.0, 931.0, 968.7, 27.33, 338.0, 201.0, 77.0, 7547.04, 0.49, 568.0, 307.07, 203.0, 167.56, 1138.78, 111.0, 51.0, 423.0, 504.62, 353.97, 51.0, 416.0, 68.05, 16.0, 7.39, 631.0, 551.0, 596.0, 89.63, 777.0, 207.0, 167.56, 246.0, 503.99, 22.0, 65.79, 21.0, 747.0, 5058.26, 1673.0, 275.92, 108.66, 99.5, 893.0, 67.0, 49.0, 663.0, 72.6, 1824.66, 127.0, 239.71, 1306.0, 815.62, 100.88, 253.0, 636.0, 600.5, 321.0, 111.0, 545.3, 312.0, 17.0, 343.61, 5933.0, 310.0, 356.0, 284.0, 139.0, 877.0, 48.95, 715.0, 126.33, 1275.0, 149.0, 8.99, 71.0, 241.0, 116.0, 225.0, 882.07, 81.0, 121.0, 53.93, 496.0, 2636.85, 71.0, 81.0, 8222.0, 52.33, 114.0, 437.0, 95.0, 28967.0, 142.0, 1.0, 1271.2, 683.76, 184.0, 220.4, 182.0, 618.0, 119.67, 661.85, 71.0, 22.37, 570.4, 388.88, 113.0, 290.0, 137.03, 3879.0, 619.0, 720.45, 961.5, 11.0, 101.0, 14.0, 1189.0, 1038.0, 246.0, 422.0, 153.4, 6999.4, 288.4, 707.28, 22681.0, 698.0, 305.0, 1097.0, 91.0, 147.0, 4793.26, 26.0, 309.0, 37.66, 59.2, 422.0, 417.13, 344.99, 29.0, 437.0, 545.0, 695.0, 39.66, 380.0, 709.1, 291.0, 1596.0, 920753.0, 115.68, 145.19, 81.0, 764.0, 751.63, 766.93, 2141.0, 327.0, 1358.3, 381.0, 115.0, 116.0, 571.0, 84.0, 697.0, 33.0, 1589.0, 123.05, 11.5, 1297.0, 71.0, 427.99, 63.0, 153.99, 197.99, 168.99, 1271.2, 30.0, 671.0, 582.33, 445.08, 378.0, 114.5, 512.0, 739.5, 411.0, 58.0, 1263.0, 
436.69, 26.53, 14467.99, 1.0, 1659.82, 50.0, 103.07, 364.0, 191.2, 761.0, 225.0, 645.0, 129.0, 185.0, 22.44, 292.06, 342.4, 3347.0, 76.0, 217.5, 870.99, 54.0, 1218.0, 210.51, 111.0, 252.0, 1597.4, 123.08, 556.0, 148.0, 131.0, 356.0, 178.12, 99341.0, 422.0, 163.0, 551.0, 1992.0, 176.0, 366.0, 263.0, 156.0, 213.0, 177.0, 1095.38, 83.0, 375.32, 750.0, 203.66, 554.0, 201.72, 225.0, 267.0, 637.95, 89.0, 76.0, 189.48, 1072.21, 13.0, 284.0, 86.0, 336.99, 33.53, 117.66, 100.99, 854.0, 2985.95, 157.99, 5.01, 322.0, 51.0, 408.0, 1331.0, 312.0, 281.0, 296.18, 287.0, 197.0, 557.08, 141.0, 556.0, 16.8, 1511.36, 27.35, 225.0, 841.0, 380.0, 1211.1, 1068.11, 529.31, 4372.0, 46.0, 181.0, 225.0, 135.0, 1655.66, 3865.0, 172.0, 286.0, 143.0, 1391.0, 65.0, 76.0, 1316.0, 2419.0, 893.0, 165.0, 196.0, 15.99, 537.27, 38.0, 51.0, 380.0, 265.0, 341.0, 276.38, 135.0, 716.0, 4915.0, 59.0, 130.0, 557.08, 3178.0, 1043.8, 473.0, 1938.99, 486.0, 2272.0, 61.0, 141.27, 312.0, 252.0, 79.0, 441.0, 21.0, 71.18, 44.0, 113.0, 2294.0, 1259.0, 120.08, 881.0, 280.39, 6.0, 18.0, 42.0, 209.0, 462.0, 152.0, 301.0, 244.0, 1110.0, 149.0, 877.0, 711.0, 1978.0, 184.95, 666.0, 322.0, 205.0, 309.0, 476.0, 3178.0, 1328.0, 428.0, 183.51, 63.0, 684.0, 254.42, 354.0, 116.0, 135.0, 144.67, 31.0, 136.0, 361.0, 272.09, 737.0, 3347.0, 363.74, 506.0, 209.99, 4827.0, 545.0, 412.0, 1636.0, 96.0, 238.0, 422.0, 109.0, 44.0, 287.0, 327.99, 349.19, 28.99, 279.0, 181.0, 629.0, 137.75, 71.0, 2357.0, 493.0, 340.0, 177.16, 71.2, 4819.74, 22.0, 71.0, 73.73, 343.0, 121.0, 2272.0, 201.56, 1831.0, 158.98, 493.0, 576.8, 260.97, 847.0, 73.0, 5.0, 251.0, 207.0, 174.0, 82.86, 131.0, 1053.0, 353.0, 101.0, 854.0, 259.77, 12.37, 385.0, 9.27, 286.0, 85.0, 98.14, 21.0, 31.0, 71.0, 178.0, 63.0, 517.38, 118.0, 2350.0, 143.0, 88.0, 61.0, 297.0, 64.15, 20.56, 117.0, 189.0, 177.0, 630.0, 2997.0, 9961.0, 236.0, 240.0, 459.99, 3.0, 608.0, 341.0, 11.0, 1052.0, 42.0, 341.0, 21.0, 395.0, 575.0, 635.99, 539.83, 30.0, 570.0, 75.0, 503.99, 3774.0, 446.0, 87.0, 
113.66, 217.5, 489.0, 41.0, 626.99, 461.0, 514.88, 813.99, 43.62, 1663.0, 96.0, 276.06, 73.75, 302.0, 68.0, 651.0, 25.0, 34.0]
bins = 10.0 ** np.arange(-1, 7)
plt.figure(figsize=(12, 5))
ax = sns.histplot(x=values, bins=bins, edgecolor='k', linewidth=2)
ax.set_xscale('log')
ax.set_xticks(bins)
ax.xaxis.set_major_formatter(ScalarFormatter())
ax.ticklabel_format(axis='x', useOffset=False, style='plain')
plt.tight_layout()
plt.show()
Here is a version with bin edges at powers of 10 multiplied by 1, 2, or 5 (0.1, 0.2, 0.5, 1, 2, 5, ...):
bins = np.outer(10.0 ** np.arange(-1, 7), [1, 2, 5]).ravel()[:-2]
plt.figure(figsize=(12, 5))
ax = sns.histplot(x=values, bins=bins, edgecolor='k', linewidth=2)
ax.set_xscale('log')
ax.set_xticks(bins)
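def format_tick(x, pos):
    # Plain labels instead of 10^n: one decimal below 1, thousands as "K" from 10000 up.
    if x < 1:
        return f'{x:.1f}'
    if x < 10000:
        return f'{x:.0f}'
    return f'{x / 1000:.0f}K'
ax.xaxis.set_major_formatter(format_tick)  # passing a function requires matplotlib >= 3.3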
ax.margins(x=0.01)
sns.despine()
plt.tight_layout()
plt.show()
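To see which edges the np.outer expression above produces, here is a minimal sketch that wraps it in a helper. The name one_two_five_edges is hypothetical (it is not part of NumPy or of the code above); the logic is the same expression, just parameterized by the lowest and highest exponent.

```python
import numpy as np

def one_two_five_edges(min_exp, max_exp):
    """Edges at 1, 2 and 5 times each power of ten, from 10**min_exp
    up to and including 10**max_exp (hypothetical helper)."""
    decades = 10.0 ** np.arange(min_exp, max_exp + 1)
    # Each row of the outer product is one decade scaled by 1, 2 and 5;
    # ravel() flattens the rows into one increasing sequence.
    edges = np.outer(decades, [1, 2, 5]).ravel()
    # Drop the last two entries (2x and 5x the top decade) so the
    # sequence ends exactly at 10**max_exp.
    return edges[:-2]

edges = one_two_five_edges(-1, 6)
print(edges[:6])   # first six edges, starting at 0.1
print(len(edges))  # number of edges; there is one fewer bin than edges
```

With min_exp=-1 and max_exp=6 this covers 0.1 up to 1,000,000, which is enough for the largest value in the sample data (920753.0).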
The same information can be shown as a bar plot, using np.histogram to compute the count per bin and skipping the empty bins:
bins = np.outer(10.0 ** np.arange(-1, 7), [1, 2, 5]).ravel()[:-2]
plt.figure(figsize=(16, 5))
heights, _ = np.histogram(values, bins=bins)
labels = [
f'{x0:.1f}-{x1:.1f}' if x0 < 1 else f'{x0:.0f}-{x1:.0f}' if x0 < 10000 else f'{x0 / 1000:.0f}-{x1 / 1000:.0f}K'
for x0, x1 in zip(bins[:-1], bins[1:])]
ax = sns.barplot(x=[lbl for lbl, h in zip(labels, heights) if h > 0], y=heights[heights > 0])
ax.margins(x=0.01)
sns.despine()
plt.tight_layout()
plt.show()
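If you want exactly the ranges from the question instead of log-spaced edges, one option is pandas.cut. The following is only a sketch under some assumptions: the edges and labels are my reading of "1-100, 101-500, 501-20000", the two extra bins at the ends are added so that no value is dropped, and the short amounts series stands in for tr['exceeded_amount'].

```python
import numpy as np
import pandas as pd

# Small illustrative sample; in practice this would be tr['exceeded_amount'].
amounts = pd.Series([0.82, 19.0, 100.0, 193.0, 500.0, 4928.0, 20000.0, 920753.0])

# Assumed edges and labels based on the ranges in the question, plus
# catch-all bins below 1 and above 20000.
edges = [0, 1, 100, 500, 20000, np.inf]
labels = ['up to 1', '1-100', '100-500', '500-20000', 'over 20000']

# right=True means each bin excludes its left edge and includes its right
# edge, so 100.0 is counted in '1-100' and 500.0 in '100-500'.
# include_lowest=True makes the first bin also include 0.
binned = pd.cut(amounts, bins=edges, labels=labels, right=True, include_lowest=True)

# sort_index() orders the counts by bin rather than by frequency.
counts = binned.value_counts().sort_index()
print(counts)
```

The counts could then be drawn with, for example, sns.barplot(x=labels, y=counts.to_numpy()). Because the bars are categorical, no log axis is needed, and the tick labels are the range names themselves.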