I retrieve some content from an Excel file, specifically some IDs. Officially, the delimiter between IDs is a ',' and I need to ignore the line if it contains other delimiters or anything else that isn't correct, such as spaces.
Example :
Nominal case : value = "8000,7000,7500,840000,870"
Wrong case : value = "8000;7000;7500,840000 870"
OR
value = "8000 7000 84000 870"
I thought at first of doing something like this:
while (rows.hasNext()) {
    TableRow row = rows.next();
    // the second parameter of getCellValueAsList is the delimiter
    definitiveMediaToDeleteList = row.getCellValueAsList("A", ",");
    if (definitiveMediaToDeleteList.contains(";") || definitiveMediaToDeleteList.contains("") || definitiveMediaToDeleteList.contains("")) {
        REPORT.warn("Incorrect delimiters row {}", row);
        continue;
    }
}
But I think this is the wrong way to deal with the problem; besides, I will never cover all the wrong cases I might encounter in what I retrieve with row.getCellValueAsList("A", ",").
How can I use a regex, or how else can I deal with this?
EDIT: Here is some more information on what is and isn't allowed:
I should have IDs, each separated by a ",", with no spaces and no other delimiters such as ";" or "/" or anything else. A cell can of course also contain exactly one ID.
User response:
You can try out a regex with some input strings like this:
import java.util.regex.Pattern;

public class so73895507 {
    // one or more digit groups, separated by single commas, and nothing else
    static Pattern pattern = Pattern.compile("^(?:\\d+,)*\\d+$");

    public static void main(String[] args) {
        checkString("8000,7000,7500,840000,870"); // nominal many
        checkString("8000"); // nominal single
        checkString("8000;7000;7500,840000 870"); // wrong 1
        checkString("8000 7000 84000 870"); // wrong 2
        checkString("8000,"); // wrong 3
    }

    static void checkString(String str) {
        boolean check = pattern.matcher(str).find();
        System.out.println(String.format("%-32s -> %s", str, check));
    }
}
Output:
8000,7000,7500,840000,870        -> true
8000                             -> true
8000;7000;7500,840000 870        -> false
8000 7000 84000 870              -> false
8000,                            -> false
The discussion between @erik258 and @Carapace raises good points; maybe ^(\d+,)+\d+$
or ^(?:\d+,)+\d+$
is better suited to your use case. However, both of them would reject a single ID in a cell. But we can only guess what your input data may look like...
Edit: Updated answer to reflect new info (single values should be accepted).
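To plug the check into a row loop, one option is to validate the raw cell text first and split on "," only when it passes. The following is a minimal sketch, assuming the cell content is available as a plain String (I don't know the API behind row in the question). The names IdListValidator and parseIds are made up for illustration. It uses matches() instead of find(), so the whole string has to match and the ^ and $ anchors are not needed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
public class IdListValidator {
    // One or more digit groups separated by single commas; nothing else is allowed.
    private static final Pattern ID_LIST = Pattern.compile("(?:\\d+,)*\\d+");

    // Returns the IDs if the cell is well-formed, otherwise an empty Optional.
    static Optional<List<String>> parseIds(String cell) {
        if (cell == null || !ID_LIST.matcher(cell).matches()) {
            return Optional.empty();
        }
        // Safe to split now: the only non-digit characters are commas.
        return Optional.of(Arrays.asList(cell.split(",")));
    }

    public static void main(String[] args) {
        System.out.println(parseIds("8000,7000,7500,840000,870")); // accepted
        System.out.println(parseIds("8000"));                      // accepted, single ID
        System.out.println(parseIds("8000;7000;7500,840000 870")); // rejected
        System.out.println(parseIds("8000,"));                     // rejected
    }
}
```

In the loop you would then log the warning and continue whenever the Optional is empty. One small difference to be aware of: with find() and $, Java also accepts a string that ends in a single line terminator (for example "8000\n"), whereas matches() rejects it.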