FlatFileItemReader does not pick up lines starting #

Time:03-31

I'm trying to read an ATM-EJ file through Spring Batch using a FlatFileItemReader. It reads everything perfectly, except that it misses any line starting with #. Below is my item reader:

    @Bean
    @Scope(value = "step", proxyMode = ScopedProxyMode.TARGET_CLASS)
    public FlatFileItemReader<FieldSet> flatFileItemReader(@Value("#{jobParameters}") Map<String, JobParameter> jobParameters) {
        return new FlatFileItemReaderBuilder<FieldSet>()
                .name("flatFileItemReader")
                .resource(new PathResource(ProducerUtil.getJobParameterByName(jobParameters, ProducerConstants.MULTILINE_JOB_PARAM_NAME, "inputFilePath")))
                //.lineTokenizer(ejFileTokenizer(null))
                .lineTokenizer(initiateNewTokenizer())
                .fieldSetMapper(new PassThroughFieldSetMapper())
                .linesToSkip(Integer.parseInt(ProducerUtil.getJobParameterByName(jobParameters, ProducerConstants.MULTILINE_JOB_PARAM_NAME, "linesToSkip")))
                .encoding("Cp1252")
                //.encoding("UTF-8")
                .build();
    }

Line Tokenizer

    // Pass-through tokenizer: returns the whole line as a single token.
    private LineTokenizer initiateNewTokenizer() {
        return new AbstractLineTokenizer() {
            @Override
            protected List<String> doTokenize(String line) {
                return Arrays.asList(line);
            }
        };
    }

Sample Input

     *TRANSACTION START*
[020t CARD INSERTED
[020tCARD: ****************9847
DATE 29-12-20    TIME 00:04:34
 00:04:36 ATR RECEIVED T=0
[020t 00:04:53 PIN ENTERED
[020t 00:04:59 OPCODE = A C  C B
 00:04:59 GENAC 1 : ARQC
EXTERNAL AUTHENTICATE: NO ARPC
 00:05:02 GENAC 2 : AAC
 00:05:09 ATR RECEIVED T=0
[020t 00:05:11 OPCODE = A C  C B
 00:05:11 GENAC 1 : ARQC
 00:05:14 GENAC 2 : TC
[020t 00:05:20 NOTES STACKED
[020t 00:05:25 CARD TAKEN
[020t 00:05:28 NOTES PRESENTED 0,1,0,0

#29/12/20  00:06  ATM0001
000607934460  1351          29/12/20
XXXXXXXXXXXXXXXXX
                CUR100.00 CashWithdrawal  000
[020t 00:05:29 NOTES TAKEN
[000p[040q(1     *1351*1*E*000010000,M-00,R-10100
[020t 00:05:36 TRANSACTION END

This is the line that is not being read:

 #29/12/20  00:06  ATM0001

I'm not sure what the issue is. Could this be the encoding, or something to do with the tokenizer? While debugging, I saw that the method below never receives the line starting with #:

 @Override
            protected List<String> doTokenize(String line) {

Answer:

"#" is the default comment prefix for FlatFileItemReader, so the reader skips any line starting with it before the tokenizer is ever called. That is why you are not receiving that line.

The FlatFileItemReader source code contains this:

public static final String[] DEFAULT_COMMENT_PREFIXES = new String[] { "#" };

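So if you want to specify different comment prefixes, use this in your builder (one caveat: prefixes are matched with String.startsWith, and every line starts with the empty string, so if this ends up skipping all lines in your version, use a prefix that never occurs in the file, or call setComments(new String[0]) on the built reader):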

.comments("")

In your code:

return new FlatFileItemReaderBuilder<FieldSet>()
                .name("flatFileItemReader")
                .resource(new PathResource(ProducerUtil.getJobParameterByName(jobParameters, ProducerConstants.MULTILINE_JOB_PARAM_NAME, "inputFilePath")))
                //.lineTokenizer(ejFileTokenizer(null))
                .lineTokenizer(initiateNewTokenizer())
                .fieldSetMapper(new PassThroughFieldSetMapper())
                .linesToSkip(Integer.parseInt(ProducerUtil.getJobParameterByName(jobParameters, ProducerConstants.MULTILINE_JOB_PARAM_NAME, "linesToSkip")))
                .encoding("Cp1252")
                //.encoding("UTF-8")
                .comments("") // intended to replace the default "#" comment prefix
                .build();
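
To see why the choice of prefix matters, here is a minimal stand-alone sketch of prefix-based comment skipping. It assumes the reader tests each configured prefix with String.startsWith, as the linked source suggests. CommentPrefixDemo, isComment and readLines are hypothetical names used only for illustration; this is not Spring Batch code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-alone illustration of prefix-based comment skipping; not Spring Batch code. */
public class CommentPrefixDemo {

    /** Returns true when the line starts with any of the given prefixes. */
    static boolean isComment(String line, String... prefixes) {
        for (String prefix : prefixes) {
            if (line.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Keeps only the lines that are not treated as comments. */
    static List<String> readLines(List<String> lines, String... prefixes) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!isComment(line, prefixes)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "[020t 00:05:28 NOTES PRESENTED 0,1,0,0",
                "#29/12/20  00:06  ATM0001",
                "000607934460  1351          29/12/20");

        // Prefix "#": the "#29/12/20" line is dropped, 2 lines kept.
        System.out.println(readLines(input, "#").size());
        // Empty prefix: every string starts with "", so 0 lines kept.
        System.out.println(readLines(input, "").size());
        // No prefixes at all: nothing is treated as a comment, 3 lines kept.
        System.out.println(readLines(input).size());
    }
}
```

With the real reader, the third case should correspond to calling setComments(new String[0]) on the built FlatFileItemReader. How the builder combines comments(...) with the default prefix may differ between Spring Batch versions, so it is worth checking against the version you use.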

Reference: https://github.com/spring-projects/spring-batch/blob/c4b001b732c8a4127e6a2a99e2fd00fff510f629/spring-batch-infrastructure/src/main/java/org/springframework/batch/item/file/FlatFileItemReader.java
