Home > Software engineering >  Optimization: Finding the best Simple Moving Average takes too much time
Optimization: Finding the best Simple Moving Average takes too much time

Time:07-12

I've created a simple Spring-Application with a MySQL-DB.

In the DB there are 20 years of stock data (5694 lines): example stock data

The Goal is to find the Best Moving Average (N) for that 20 years of stock data. Inputs are the closing prices of every trading day. The calculated average depends on the N. So p.e. if N=3 the average of a reference day t, is given by ((t-1) (t-2) (t-3)/N). Output is the Best Moving Average (N) and the Result you made with all the buying & selling transactions of the Best N.

I did not find a proper algorithm in the Internet, so I implemented the following:

For every N (249-times) the program does the following steps:

  1. SQL-Query: calculates averages & return list
@Repository
public interface StockRepository extends CrudRepository<Stock, Integer> {
    /*
     * This sql query calculate the moving average of the value n
     */
    @Query(value = "SELECT a.date, a.close, Round( ( SELECT SUM(b.close) / COUNT(b.close) FROM stock AS b WHERE DATEDIFF(a.date, b.date) BETWEEN 0 AND ?1 ), 2 ) AS 'avg' FROM stock AS a ORDER BY a.date", nativeQuery = true)
    List<AverageDTO> calculateAverage(int n); 
  1. Simulate buyings & sellings – > calculate result

  2. Compare result with bestResult

  3. Next N

@RestController
public class ApiController {
    @Autowired
    private StockRepository stockRepository;


    @CrossOrigin(origins = "*")
    @GetMapping("/getBestValue")
    /*
     * This function tries all possible values in the interval [min,max], calculate
     * the moving avg and simulate the gains for each value to choose the best one
     */
    public ResultDTO getBestValue(@PathParam("min") int min, @PathParam("max") int max) {
        Double best = 0.0;
        int value = 0;

        for (int i = min; i <= max; i  ) {
            Double result = simulate(stockRepository.calculateAverage(i));
            if (result > best) {
                value = i;
                best = result;
            }
        }
        return new ResultDTO(value, best);
    }

    /*
     * This function get as input the close and moving average of a stock and
     * simulate the buying/selling process
     */
    public Double simulate(List<AverageDTO> list) {
        Double result = 0.0;
        Double lastPrice = list.get(0).getClose();
        for (int i = 1; i < list.size(); i  ) {
            if (list.get(i - 1).getClose() < list.get(i - 1).getAvg()
                    && list.get(i).getClose() > list.get(i).getAvg()) {
                // buy
                lastPrice = list.get(i).getClose();
            } else if (list.get(i - 1).getClose() > list.get(i - 1).getAvg()
                    && list.get(i).getClose() < list.get(i).getAvg()) {
                // sell
                result  = (list.get(i).getClose() - lastPrice);
                lastPrice = list.get(i).getClose();
            }
        }
        return result;
    }
}

When I put Min=2 and Max=250 it takes 45 minutes to finish. Since, I'm a beginner in Java & Spring I do not know how I can optimize it. I'm happy for every input.

CodePudding user response:

This problem is equivalent with finding the best moving N sum. Simply then divide by N. Having such a slice, then the next slice subtracts the first value and adds a new value to the end. This could lead to an algorithm for finding local growths with a[i N] - a[i] >= 0.

However in this case a simple sequential ordered query with

double[] slice = new double[N];
double sum = 0.0;

suffices. (A skipping algorithm on a database is probably too complicated.)

Simply walk through the table keeping the slice as window, keeping N values and keys, and maintaining the maximum upto now.

Use the primitive type double instead of the object wrapper Double.

If the database transport is a serious factor, a stored procedure would do. Keeping a massive table as really many entities just for a running maximum is unfortunate.

It would be better to have a condensed table or better field with the sum of N values.

  • Related