Finding peak in a large data set with Python

I want to know if there is a way to eliminate points that are not close to the peak. For example, if I have a data set with 10 million points and the peak is around 5 million, how could I get rid of the points that are nowhere near the peak, so I can narrow down where the index of the peak lies?

CodePudding user response：

First you need to define which range of values counts as close to the peak. Assuming you pick a threshold, you can keep only the elements whose distance from the peak is at most that threshold by using a NumPy boolean mask. For example:

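import numpy as np

data_size = 10_000_000
max_possible_peak = 5_000_000
# Random sample data with values in [0, max_possible_peak)
data = np.random.rand(data_size) * max_possible_peak
threshold = 100
# data.max() runs inside NumPy and is much faster than the built-in max() on a large array
peak = data.max()
# Boolean mask: keep only the values within `threshold` of the peak
data_near = data[data >= peak - threshold]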
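If the positions of the surviving points matter as well as their values, the same threshold idea can return indices. This is only a minimal sketch: the helper name `indices_near_peak` and the small sample array are made up for illustration.

```python
import numpy as np

def indices_near_peak(data, threshold):
    """Return (indices, values) of the points within `threshold` of the maximum value.

    Hypothetical helper, not part of NumPy.
    """
    peak = data.max()
    # flatnonzero gives the positions where the mask is True
    idx = np.flatnonzero(data >= peak - threshold)
    return idx, data[idx]

# Small made-up example
data = np.array([1.0, 3.0, 9.5, 10.0, 9.8, 2.0])
idx, near = indices_near_peak(data, threshold=1.0)
print(idx)
print(near)
```

The returned indices refer to the original array, so they can be used to narrow down the region in which the peak lies.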

CodePudding user response：

A native loop solution:

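    # Assuming the goal is to delete the small values and the peak is already known.
    import array

    ar = array.array("i", range(100000))
    print(len(ar))

    cutoff = 90000 // 2  # values below this are treated as far from the peak
    kept = array.array("i")
    for value in ar:
        # Append to a new array; deleting from `ar` while stepping through it skips elements
        if value >= cutoff:
            kept.append(value)
    ar = kept
    print(len(ar))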

    # Alternatively, perhaps you want to remove the indices that are far from the peak.
    # If I have misunderstood, please update your question.

The updated solution is hosted at https://onecompiler.com/python/3xvt9d25h
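For the other reading mentioned above, removing the indices far from the peak, one possible sketch is to locate the peak with `np.argmax` and keep a window around it. The helper name `window_around_peak` and the `half_width` parameter are assumptions for illustration and do not come from the linked solution.

```python
import numpy as np

def window_around_peak(data, half_width):
    """Return (start, window): the slice of `data` centred on the index of the maximum.

    Hypothetical helper; `half_width` is the number of points kept on each side.
    """
    peak_idx = int(np.argmax(data))
    # Clamp the window to the bounds of the array
    start = max(peak_idx - half_width, 0)
    stop = min(peak_idx + half_width + 1, len(data))
    return start, data[start:stop]

# Small made-up example
data = np.array([0, 1, 2, 5, 9, 4, 1, 0])
start, window = window_around_peak(data, half_width=2)
print(start, window)
```

Adding `start` to a position inside `window` gives back the index in the original array. Slicing returns a view, so this avoids copying a 10-million-point array.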