POI single vs multithreaded performance

Time:05-22

I have a pretty simple use case - read an Excel spreadsheet with POI, populate some values, run cell updates and retrieve calculation outputs.

What puzzles me is the performance. If I run single-threaded, the Excel file takes a while to load and parse on the first request, but the processing time decreases for each subsequent request. If I run exactly the same task on multiple threads and then join them, every thread is roughly 10 times slower. Where am I messing up?

Here is the test output:

Running Multi Threaded
Thread Thread-8 finished run 7092
Thread Thread-9 finished run 7092
Thread Thread-7 finished run 7092
Thread Thread-10 finished run 7092
Thread Thread-4 finished run 7092
Thread Thread-5 finished run 7092
Thread Thread-3 finished run 7107
Thread Thread-2 finished run 7107
Thread Thread-6 finished run 7107
Thread Thread-1 finished run 7108
Finished in 7113

Running Single Threaded
Thread Thread-11 finished run 591
Thread Thread-12 finished run 192
Thread Thread-13 finished run 173
Thread Thread-14 finished run 149
Thread Thread-15 finished run 126
Thread Thread-16 finished run 133
Thread Thread-17 finished run 159
Thread Thread-18 finished run 124
Thread Thread-19 finished run 131
Thread Thread-20 finished run 121
Finished in 1907
Process finished with exit code 0

Here is the test:

package com.test;

import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.junit.jupiter.api.Test;

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PerfTest {

    @Test
    public void runSingleThreaded() throws Exception {
        System.out.println("Running Single Threaded");

        long start = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            new PoiThread().run();
        }
        System.out.println("Finished in " + (System.currentTimeMillis() - start));
    }

    @Test
    public void runMultiThreaded() throws Throwable {
        System.out.println("Running Multi Threaded");

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            threads.add(new PoiThread());
        }

        long start = System.currentTimeMillis();
        threads.forEach(t -> t.start());
        threads.forEach(t -> {
            try {
                t.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        });
        System.out.println("Finished in " + (System.currentTimeMillis() - start));
    }

    public static class PoiThread extends Thread {

        @Override
        public void run() {
            long runStart = System.currentTimeMillis();
            // try-with-resources closes the stream and the workbook once the run is done
            try (BufferedInputStream in = new BufferedInputStream(new FileInputStream("src/main/resources/customer.xlsx"));
                 XSSFWorkbook workbook = new XSSFWorkbook(in)) {
                Helper helper = new Helper(workbook, "test", workbook.getCreationHelper().createFormulaEvaluator());
                String ref = helper.getFieldValueCellReference("Inputs", "customer");
                helper.evaluateAllCells();
                Map<String, Double> premiums = helper.getAllNumericFields("Premium Outputs");
                System.out.println(String.format("Thread %s finished run %s", this.getName(), (System.currentTimeMillis() - runStart)));
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

    }

}
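
One thing that may be worth ruling out before blaming POI is the benchmark itself. In the single-threaded output the first run takes 591 and later runs settle around 120-190, which would be consistent with one-time costs (class loading, JIT warm-up) being paid once; in the multi-threaded test all ten threads start cold at the same moment and may also outnumber the available cores. The sketch below is only a harness under those assumptions, not a diagnosis: `PerfHarness`, `simulateWork`, `timeSequential` and `timePooled` are hypothetical names, and `simulateWork` is a CPU-bound stand-in for the load/evaluate/read steps, not POI code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class PerfHarness {

    // Hypothetical stand-in for "load workbook, evaluate cells, read outputs".
    static long simulateWork(int size) {
        long sum = 0;
        for (int i = 1; i <= size; i++) {
            sum += ((long) i * i) % 7;
        }
        return sum;
    }

    // Runs the task `runs` times on the calling thread; returns elapsed milliseconds.
    static long timeSequential(int runs, Supplier<Long> task) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.get();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Runs the task `runs` times on a fixed pool of `threads` workers; returns elapsed milliseconds.
    static long timePooled(int runs, int threads, Supplier<Long> task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < runs; i++) {
                tasks.add(task::get);
            }
            long start = System.nanoTime();
            // invokeAll blocks until every task has completed
            for (Future<Long> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces any exception thrown inside a task
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Supplier<Long> task = () -> simulateWork(2_000_000);
        task.get(); // warm-up run, deliberately not timed
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Sequential: " + timeSequential(10, task) + " ms");
        System.out.println("Pooled (" + cores + " threads): " + timePooled(10, cores, task) + " ms");
    }
}
```

To try this against the real workload, the `Supplier` body would be replaced with the contents of `PoiThread.run()`. If the pooled timing is still much worse after a warm-up run and with the thread count capped at the core count, the slowdown is more likely to come from contention inside the workload than from the way the test is set up.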

CodePudding user response:

Huge thank you to all who've taken a look
