Finding a peak in a large data set with Python

Time:03-12

I want to know if there is a way to eliminate points that are not close to the peak. For example, if I have a data set with 10 million points and the peak is around 5 million, how could I get rid of the points that are nowhere near the peak, so I can narrow down where my index point resides?

CodePudding user response:

First you need to define what range of values counts as close to the peak. Assuming you pick a threshold, you can keep only the elements whose distance from the peak is less than that threshold by indexing the NumPy array with a boolean condition. For example:

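import numpy as np

data_size = 10_000_000
max_possible_peak = 5_000_000
# Random sample data with values in [0, max_possible_peak)
data = np.random.rand(data_size) * max_possible_peak
threshold = 100
peak = data.max()  # NumPy's max is much faster than the built-in max() on large arrays
# Keep only the values within `threshold` of the peak
data_near = data[data > peak - threshold]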
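Since the question is about narrowing down the index, here is a minimal sketch that applies the same condition but returns positions instead of values. The function name `indices_near_peak` and the sample array are made up for illustration, and it assumes the peak is simply the largest value in the array.

```python
import numpy as np

def indices_near_peak(data, threshold):
    """Return (peak_index, indices of values within `threshold` of the maximum)."""
    data = np.asarray(data)
    peak_index = int(np.argmax(data))           # position of the largest value
    mask = data > data[peak_index] - threshold  # same condition as in the answer above
    return peak_index, np.flatnonzero(mask)     # positions where the mask is True

# Small hand-made example (not real data)
sample = np.array([1.0, 2.0, 50.0, 98.0, 100.0, 97.0, 40.0, 3.0])
peak_index, near = indices_near_peak(sample, threshold=5)
print(peak_index, near.tolist())  # 4 [3, 4, 5]
```

The returned indices refer to the original array, so they can be used directly to slice it or to look up matching x values.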

CodePudding user response:

A native loop solution:

# I assume the goal is to delete the small values, and that the peak is already known.
import array

ar = array.array("i", range(100000))
print(len(ar))

cutoff = 90000 // 2  # values below this count as far from the peak
i = 0
while i < len(ar):
    if ar[i] < cutoff:
        del ar[i]  # the next element shifts into position i, so do not advance
    else:
        i += 1
print(len(ar))

# Or perhaps you want to remove the
# faraway indices. If I am wrong, update your question.
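If the intent is to drop faraway indices rather than small values, one possible sketch is to keep only a window of positions around the peak. The helper `window_around_peak` and its `half_width` parameter are hypothetical names, and the peak is assumed to be the largest element.

```python
import array

def window_around_peak(values, half_width):
    """Return (start, segment): the slice of `values` within `half_width`
    positions of the largest element, and the offset where that slice starts."""
    # Index of the largest element (first one if there are ties)
    peak_index = max(range(len(values)), key=values.__getitem__)
    # Clamp the window to the bounds of the sequence
    start = max(0, peak_index - half_width)
    stop = min(len(values), peak_index + half_width + 1)
    return start, values[start:stop]

# Small hand-made example (not real data)
ar = array.array("i", [0, 1, 2, 3, 10, 3, 2, 1, 0])
start, segment = window_around_peak(ar, half_width=2)
print(start, list(segment))  # 2 [2, 3, 10, 3, 2]
```

Slicing copies only the window and leaves the original sequence untouched, so a position `k` in `segment` corresponds to position `start + k` in the full data.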

The updated solution is hosted at https://onecompiler.com/python/3xvt9d25h
