Using Java 1.8, org.apache.poi 5.1.0 and org.apache.poi.ooxml 5.1.0. I have an excel file that consist of 54 rows. I read through this file in blocks of 5 lines. If I get to line 47 it skips that line and gives me the first line of the new block while it should give me the first empty line above the now block.
Using the debugger I can see it go from line 46 to line 48 while I would expect line 47. Add a breakpoint at line 51 (See the comment in the java code for the location). And you can see how currentRow attribute 'r' skips from line 46 to 48.
I do not know why this happens but it is ruining my day and renders my program useless.
Below you can find my files. I brought it down to the bare minimum while still making the error reproducible.
My build.gradle file
plugins {
id 'java'
id 'application'
}
group 'nl.karnhuis'
sourceCompatibility = 1.8
application {
mainClass = 'nl.karnhuis.test.Testfile'
}
repositories {
mavenCentral()
maven {
url "https://mvnrepository.com/artifact"
}
}
dependencies {
implementation 'org.apache.poi:poi:5.1.0'
implementation 'org.apache.poi:poi-ooxml:5.1.0'
}
my gradle.settings file
rootProject.name = 'testfile'
My java code
package nl.karnhuis.test;
import java.io.*;
import java.util.*;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;
public class Testfile {
public void run() {
File inputFile = new File("schema.xlsx");
handleFile(inputFile);
}
private void handleFile(File inputFile) {
try {
// Create Workbook instance holding reference to .xlsx file
XSSFWorkbook workbook = new XSSFWorkbook(inputFile);
// Get first/desired sheet from the workbook
Sheet datatypeSheet = workbook.getSheetAt(0);
Iterator<Row> iterator = datatypeSheet.iterator();
Row currentRow = null;
// Go over all rows
while (iterator.hasNext()) {
if (checkForLastLine(currentRow)) {
break;
}
currentRow = iterator.next();
// First two rows can be skipped.
if ((currentRow.getRowNum()) < 2) {
continue;
}
currentRow = iterator.next();
// do something important
currentRow = iterator.next();
// do something important
currentRow = iterator.next();
// do something important
// The next row is empty, so it can be skipped.
currentRow = iterator.next();
System.out.println(currentRow.getRowNum()); //Add breakpoint here
}
} catch (IOException | InvalidFormatException e) {
e.printStackTrace();
}
}
private boolean checkForLastLine(Row currentRow) {
if (currentRow == null) {
return false;
} else {
for (Cell currentCell : currentRow) {
// Reached end of file? Get out of da loop!
return currentCell.getColumnIndex() == 0
&& (currentCell.getStringCellValue().trim().startsWith("primaire")
|| currentCell.getStringCellValue().trim().startsWith("secondaire"));
}
}
return false;
}
public static void main(String[] args) {
Testfile mc = new Testfile();
mc.run();
}
}
The excel file can be downloaded from https://www.karnhuis.nl/schema.xlsx
CodePudding user response:
It seems that the empty rows in the Excel weren't created the same way. Try writing something in first cell of row 47 and running again. The row will be listed correctly in your class. Even after deleting the content and having an empty row again, it will work.
Apache POI has the notion of logical rows (that have or previously had content) and won't return lines that were always empty. If you don't have control over how the Excel files are generated, don't use counting rows. You could for example look for text in first column and then count 4 lines.