I have two datasets:
import numpy as np
import matplotlib.pyplot as plt
data1 = np.array([11.59877469, 12.52525819, 11.30371594, 9.09666023, 9.69178239,
15.50122531, 16.73578474, 11.59361365, 9.33284219, 10.24323511,
12.39877469, 14.22419388, 11.18491025, 9.38917885, 9.56598452,
13.40122531, 13.58126828, 11.66021816, 9.90587666, 10.09661371,
12.69877469, 13.36333954, 11.40602325, 9.04926015, 9.58116652,
14.30122531, 13.41946488, 11.50330911, 9.70846369, 11.57496747,
13.09877469, 15.59128195, 12.38077981, 9.24187201, 9.4187133 ,
14.50122531, 14.00030896, 10.91137571, 10.09368741, 10.59655608,
12.19877469, 12.66493842, 11.151608 , 10.04291564, 10.75044566,
13.90122531, 13.80879138, 12.01213962, 9.77266764, 10.6047969 ,
11.29877469, 11.91469995, 10.49547065, 9.54128679, 10.70684913,
15.80122531, 15.69100383, 11.00648136, 10.26949539, 11.05984311,
13.29877469, 14.36322429, 10.39796445, 9.15126197, 10.19866594,
13.90122531, 13.22035631, 11.47720851, 10.43120555, 11.41316408,
13.09877469, 12.39251915, 11.18971136, 9.9574122 , 10.25744447,
14.20122531, 13.27219366, 11.60801697, 9.72976318, 12.07620948,
13.69877469, 16.08363333, 12.54008941, 9.38917885, 10.24985347,
14.20122531, 13.84544674, 11.29659942, 10.29799654, 10.78807365,
12.39877469, 13.80385357, 11.79124697, 9.85270434, 10.47867621,
15.30122531, 12.68375859, 11.88853282, 10.36460104, 11.52377995,
12.09877469, 12.36280497, 11.03280231, 8.76603826, 11.57464465,
15.70122531, 16.69782975, 11.86483278, 9.75586377, 11.00865558,
14.39877469, 14.31138694, 10.65958136, 10.03571398, 9.75056092,
13.30122531, 13.52878112, 12.46156728, 10.58571404, 12.04584549,
12.69877469, 13.2230095 , 11.41802602, 9.82180264, 9.27339153,
14.50122531, 13.09520827, 10.91617682, 9.59895472, 12.33344676,
13.29877469, 16.05391916, 12.32137696, 9.41767999, 9.83146261,
14.00122531, 14.13174836, 11.67702203, 9.43724456, 11.17869979,
12.19877469, 13.48024677, 10.98060112, 10.22832584, 10.64047961,
14.90122531, 13.34485454, 11.90053559, 10.23379258, 11.59774047,
13.19877469, 12.95059023, 10.88549547, 9.31057157, 11.71497469,
15.40122531, 16.38875513, 12.68748139, 10.10058357, 12.50913252,
14.09877469, 14.37710664, 11.30162088, 10.18782192, 10.10388188,
14.30122531, 13.33032237, 11.20389433, 10.76632313, 12.05343649,
13.09877469, 12.8918117 , 11.89595483, 9.50078288, 10.53745474,
14.70122531, 13.50665794, 11.46071013, 9.62505531, 12.11286483,
14.19877469, 16.84775416, 13.15122724, 10.29252979, 10.48626721,
14.60122531, 14.15322172, 11.65332199, 10.40960057, 12.80367496,
13.99877469, 13.09027045, 10.75948812, 9.74799647, 11.09682545,
14.70122531, 14.32261612, 12.15495084, 11.38675768, 12.20135753])
data2 = np.array([ 8.29102276, 8.81896403, 8.34976878, 6.42352427, 7.60717448,
11.70897724, 12.33303488, 8.46800077, 5.99812922, 7.01110385,
9.19102276, 10.2088251 , 7.81963878, 5.4700672 , 5.9739585 ,
8.70897724, 8.96487767, 7.99967417, 6.66146823, 6.83476828,
7.69102276, 8.39992335, 8.16675914, 6.29511636, 6.39364899,
10.30897724, 8.45040312, 8.19228603, 6.36654906, 8.26193451,
8.59102276, 11.18658667, 8.97980554, 5.79348751, 6.00367267,
9.60897724, 8.72976357, 7.91417073, 6.37615127, 6.71721123,
7.49102276, 8.6868748 , 7.18480092, 6.0359 , 5.92277097,
9.60897724, 8.545837 , 8.47760298, 5.90062301, 6.21032767,
7.39102276, 7.81907928, 7.17999982, 5.40106214, 6.25461858,
10.80897724, 11.00889351, 7.81186342, 5.67230835, 5.96003158,
8.19102276, 8.73112115, 6.52866357, 5.52706949, 4.4671902 ,
8.10897724, 8.19316585, 6.86320746, 6.08603321, 6.16608132,
6.79102276, 7.77483293, 7.52712017, 5.8504898 , 6.12882072,
8.50897724, 8.6267387 , 6.7990035 , 6.30474566, 7.22539441,
8.19102276, 10.96600474, 7.90754278, 5.22765471, 4.40776186,
8.50897724, 8.12679632, 6.83230576, 5.4844976 , 5.91578523,
6.69102276, 8.20840578, 7.03269298, 5.21565194, 5.05562527,
7.30897724, 7.55289343, 7.27693232, 5.92672361, 5.96003158,
6.99102276, 6.87038206, 6.68557263, 4.45240618, 5.20354631,
9.70897724, 10.90586864, 7.84036457, 5.95522475, 6.16543151,
6.89102276, 8.53201258, 5.92952852, 5.11574518, 4.783206 ,
7.70897724, 6.74452625, 6.08106276, 5.64620776, 5.95244058,
6.99102276, 7.40697979, 6.33365116, 4.83522934, 4.65111677,
8.50897724, 7.28871498, 6.29977521, 5.09927389, 6.87966444,
6.99102276, 10.09820922, 7.30360661, 5.13735016, 4.28326363,
8.40897724, 7.5750166 , 6.05496217, 5.00656879, 4.58470269,
5.59102276, 6.14985776, 6.17674211, 4.13858807, 4.790797 ,
6.60897724, 7.14838493, 6.35917805, 5.61290551, 5.48221238,
6.79102276, 6.94369276, 6.13863875, 4.45480673, 5.46078358,
8.90897724, 9.97235342, 7.59315097, 4.81395693, 4.91590048,
6.39102276, 7.53212783, 5.79631951, 4.70922198, 4.02602635,
7.50897724, 7.39109003, 6.52568932, 5.16827895, 5.08529488,
6.09102276, 7.06884081, 7.31560938, 4.59041629, 3.87181395,
7.40897724, 6.81783696, 6.42578256, 4.81875804, 5.69585312,
6.69102276, 10.16457875, 6.46205907, 4.9708389 , 3.51849298,
6.50897724, 7.2665918 , 5.5437311 , 4.37173093, 5.07011288,
5.19102276, 7.60543854, 6.30755057, 4.07198357, 3.76878908,
7.30897724, 6.88420648, 6.77290291, 5.13017559, 6.2691062 ])
and I'd like to plot two histograms:
x1,y1,_ = plt.hist(data1, alpha=0.4, bins=20)
x2,y2,_ = plt.hist(data2, alpha=0.4, bins=20)
This produces two partially overlapping histograms.
How can I calculate the area of the intersecting region?
I tried matching bin centers to locate the crossing point, but did not succeed. I also tried taking the minimum of the two bins with the same index, but that doesn't work for negative data.
CodePudding user response:
Make sure you use the same range and number of bins in both histograms:
>>> rng = min(data1.min(), data2.min()), max(data1.max(), data2.max())
>>> n1, bins1, _ = plt.hist(data1, alpha=0.4, bins=30, range=rng)
>>> n2, bins2, _ = plt.hist(data2, alpha=0.4, bins=30, range=rng)
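Then the intersection can be easily calculated. Note that the sum of per-bin minima is the overlap measured in counts; multiply it by the bin width to get a geometric area: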
>>> intersection = np.minimum(n1, n2)
>>> area = intersection.sum()
>>> area
20.0
>>> plt.bar(bins1[:-1], intersection, width=bins1[1] - bins1[0], align='edge')
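If you want the overlap as a geometric area rather than a count, one possible sketch is below. It uses np.histogram instead of plt.hist so it runs without a figure. The arrays sample_a and sample_b are synthetic stand-ins that I made up for illustration; they are not the arrays from the question, so swap in data1 and data2.

```python
import numpy as np

# Synthetic stand-ins for data1 and data2 (assumed values, not the question's arrays)
gen = np.random.default_rng(0)
sample_a = gen.normal(12.0, 2.0, size=200)
sample_b = gen.normal(7.0, 1.8, size=200)

# Shared range and bin count, so bin i covers the same interval in both histograms
lo = min(sample_a.min(), sample_b.min())
hi = max(sample_a.max(), sample_b.max())
counts_a, edges = np.histogram(sample_a, bins=30, range=(lo, hi))
counts_b, _ = np.histogram(sample_b, bins=30, range=(lo, hi))

# Per-bin height of the region covered by both histograms
overlap_counts = np.minimum(counts_a, counts_b)

# Geometric area: height times width, summed over bins (all widths are equal here)
bin_widths = np.diff(edges)
overlap_area = float((overlap_counts * bin_widths).sum())

print("overlap in counts:", overlap_counts.sum())
print("overlap as area:  ", overlap_area)
```

Because the bins are shared, this also behaves the same way for negative data: only the bin edges shift, the per-bin comparison is unchanged.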
CodePudding user response:
You can use:
bin_edges = np.histogram_bin_edges(np.concatenate([data1, data2]), bins=40)
hist1, _ = np.histogram(data1, bins=bin_edges)
hist2, _ = np.histogram(data2, bins=bin_edges)
np.minimum(hist1, hist2).sum()
Output:
20
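If the two datasets differ in size, or you want a number that does not depend on the bin width, a variant of the same idea is to normalize both histograms with density=True and integrate the per-bin minimum. The result then lies between 0 (no overlap) and 1 (identical histograms). This is only a sketch; the helper name histogram_overlap is my own and not part of NumPy.

```python
import numpy as np

def histogram_overlap(a, b, bins=40):
    """Overlapping area of two density-normalized histograms on shared bins.

    Hypothetical helper for illustration; returns a value in [0, 1].
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Common bin edges computed from the pooled data, as in the answer above
    edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=bins)
    # density=True scales each histogram so that its total area is 1
    dens_a, _ = np.histogram(a, bins=edges, density=True)
    dens_b, _ = np.histogram(b, bins=edges, density=True)
    # Area under the lower of the two curves in every bin
    return float(np.sum(np.minimum(dens_a, dens_b) * np.diff(edges)))

# Example with made-up samples (assumed values, including negative ones)
gen = np.random.default_rng(1)
left = gen.normal(-1.0, 1.0, size=300)
right = gen.normal(1.0, 1.0, size=500)
print(histogram_overlap(left, right))
```

Call it as histogram_overlap(data1, data2) with the arrays from the question.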