Custom Range in numpy histogram

Time:12-19

I am trying to output histogram data using numpy:

NUMBER_OF_PRICE_BRACKETS = 8
HISTOGRAM_EDGE_RANGE = (0, 1_000_000)

hist, bin_edges = numpy.histogram(price_list, bins=NUMBER_OF_PRICE_BRACKETS, range=HISTOGRAM_EDGE_RANGE)

I get the following output from the code above:

hist: [0, 6, 6, 0, 0, 0, 0, 0],
bin_edges: [0.0, 125000.0, 250000.0, 375000.0, 500000.0, 625000.0, 750000.0, 875000.0, 1000000.0]

The edges are calculated automatically. Is there an option to force specific edges, as in the example output below?

hist: [0, 6, 6, 0, 0, 0, 0, 0]
bin_edges: [0.0, 100000.0, 150000.0, 300000.0, 450000.0, 600000.0, 750000.0, 900000.0, 1000000.0]

Perhaps with the range option, something like:

range=(0, 1_000_000, 150)

CodePudding user response:

When bins is an integer, numpy.histogram creates that many equal-width bins, spanning range if it is given and otherwise the minimum and maximum of the data. However, you can also pass a list or numpy array of bin edges as the bins argument, e.g.,

hist, bin_edges = np.histogram(price_list, bins=np.linspace(0, 1000000, 10) )

results in

bin_edges = [0.,  111111.11111111,  222222.22222222, 333333.33333333,  444444.44444444,  555555.55555556, 666666.66666667,  777777.77777778,  888888.88888889, 1000000. ]

Note that I did not use the range parameter there; the explicit edges already fix the lowest and highest edge.

CodePudding user response:

You have two options. When bins is an integer, histogram always splits the range into equally spaced bins, as if the edges came from

np.linspace(*HISTOGRAM_EDGE_RANGE, NUMBER_OF_PRICE_BRACKETS + 1)
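
To make the equivalence above concrete, here is a minimal sketch that compares the edges returned by histogram with hand-built linspace edges. The sample price_list is made up for illustration and is not the asker's data.

```python
import numpy as np

NUMBER_OF_PRICE_BRACKETS = 8
HISTOGRAM_EDGE_RANGE = (0, 1_000_000)

# Hypothetical sample prices, invented for this example only.
price_list = [130_000, 180_000, 240_000, 260_000, 300_000, 370_000]

hist, bin_edges = np.histogram(
    price_list, bins=NUMBER_OF_PRICE_BRACKETS, range=HISTOGRAM_EDGE_RANGE
)

# The same edges built by hand: 8 bins need 9 edges, hence the "+ 1".
manual_edges = np.linspace(*HISTOGRAM_EDGE_RANGE, NUMBER_OF_PRICE_BRACKETS + 1)

edges_match = np.allclose(bin_edges, manual_edges)
bin_widths = np.diff(bin_edges)  # width of each bin

print(edges_match)
print(bin_widths)
```

Because the width is always (upper - lower) / bins, an integer bins value cannot produce 150,000-wide bins over (0, 1_000_000).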

Option 1: Supply the uneven bins manually:

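import numpy as np

# 8 edges define 7 bins; the last bin (900_000 to 1_000_000) is narrower than the rest.
HISTOGRAM_EDGES = np.array([
    0, 150_000, 300_000, 450_000, 600_000,
    750_000, 900_000, 1_000_000])
hist, bin_edges = np.histogram(price_list, bins=HISTOGRAM_EDGES)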
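
If typing the edges by hand is inconvenient, one possible way to get the "start, stop, step" behaviour the question hints at is to generate them with np.arange and append the upper limit. This is only a sketch: STEP, UPPER_LIMIT and the sample price_list are names and values assumed for illustration.

```python
import numpy as np

STEP = 150_000           # assumed bin width
UPPER_LIMIT = 1_000_000  # assumed top of the price range

# np.arange stops before UPPER_LIMIT, so append it as the final edge.
edges = np.append(np.arange(0, UPPER_LIMIT, STEP), UPPER_LIMIT)

# Hypothetical sample prices, invented for this example only.
price_list = [130_000, 180_000, 240_000, 260_000, 300_000, 370_000]

hist, bin_edges = np.histogram(price_list, bins=edges)

print(bin_edges)
print(hist)
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. Prices outside the first and last edge are not counted.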

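Option 2: Adjust your range and bin count so that the range splits evenly into bins of the width you want. With 150,000-wide bins, a range of (0, 1_050_000) holds 7 bins; 8 bins would need (0, 1_200_000):

NUMBER_OF_PRICE_BRACKETS = 7  # 1_050_000 / 7 = 150_000 per bin
HISTOGRAM_EDGE_RANGE = (0, 1_050_000)

hist, bin_edges = np.histogram(price_list, bins=NUMBER_OF_PRICE_BRACKETS, range=HISTOGRAM_EDGE_RANGE)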
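
The arithmetic behind Option 2 can be checked with a short sketch that derives the range from a target bin width. The helper even_range, the constant BIN_WIDTH and the sample price_list are hypothetical and exist only for this example.

```python
import numpy as np

BIN_WIDTH = 150_000  # assumed target width of every bin


def even_range(number_of_bins, bin_width=BIN_WIDTH):
    """Return a (low, high) range that divides into equal bins of bin_width."""
    return (0, number_of_bins * bin_width)


# Hypothetical sample prices, invented for this example only.
price_list = [130_000, 180_000, 240_000, 260_000, 300_000, 370_000]

for number_of_bins in (7, 8):
    hist, bin_edges = np.histogram(
        price_list, bins=number_of_bins, range=even_range(number_of_bins)
    )
    # Print the bin count, the range it needs, and the resulting bin widths.
    print(number_of_bins, even_range(number_of_bins), np.diff(bin_edges))
```

The trade-off is that the upper edge moves past 1,000,000, so the top bin covers prices the original range excluded.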