Apache Camel - extract TGZ file which contain multiple CSV's

Time:07-28

I'm new to Apache Camel and learning its basics. I'm using the YAML DSL, and I have a TGZ file that contains 2 small CSV files.

I am trying to decompress the file using gzipDeflater, but when I print the body after the extraction, it includes some extra data about each CSV (filename, my username, some numbers). That prevents me from parsing the CSV by its known columns.

Since the extracted body includes lines that were not in the original CSV, I get an exception whenever one of those lines is processed.

Is there a way for me to "ignore" those lines, or is there another Apache Camel feature that will let me access only the content of those CSVs?

Thanks!

CodePudding user response:

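You probably have a gzipped tar file, which is not the same thing as a single gzip-compressed file. Gunzipping it only removes the compression and leaves a tar archive, and the tar's per-file headers (file name, owner, sizes) are the extra data you are seeing.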

Try this (convert to YAML if you'd like):

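// TarSplitter is org.apache.camel.dataformat.tarfile.TarSplitter (camel-tarfile)
from("file:filedir")
    // gzipDeflater() is the Camel 3+ name; on Camel 2.x the same call is gzip()
    .unmarshal().gzipDeflater()
    // one exchange per file inside the tar
    .split(new TarSplitter())
        // process/unmarshal CSV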
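
To see why gunzipping alone leaves those extra lines, here is a minimal JDK-only sketch (not Camel code) that gunzips a TGZ and then walks the tar layer by hand. The class name TgzPeek, the helper names, and the in-memory sample archive are all made up for illustration. The sample fills in only the name and size header fields, so it is not a fully valid tar, and the reader assumes regular files with short names. It needs Java 11+ for readNBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
public class TgzPeek {

    // Read a NUL-terminated ASCII field from a 512-byte tar header.
    static String field(byte[] header, int offset, int length) {
        int end = offset;
        while (end < offset + length && header[end] != 0) {
            end++;
        }
        return new String(header, offset, end - offset, StandardCharsets.US_ASCII).trim();
    }

    // Gunzip, then walk the tar stream: a 512-byte header, then the file
    // data padded to a multiple of 512 bytes, repeated for each entry.
    static Map<String, String> readTgz(byte[] tgz) {
        Map<String, String> files = new LinkedHashMap<>();
        try (InputStream tar = new GZIPInputStream(new ByteArrayInputStream(tgz))) {
            byte[] header = new byte[512];
            // A header block starting with a zero byte marks the end of the archive.
            while (tar.readNBytes(header, 0, 512) == 512 && header[0] != 0) {
                String name = field(header, 0, 100);                    // file name
                int size = Integer.parseInt(field(header, 124, 12), 8); // size in octal
                byte[] data = tar.readNBytes(size);
                tar.readNBytes((512 - size % 512) % 512);               // skip padding
                files.put(name, new String(data, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    // Append one simplified tar entry (only name and size are set in the header).
    static void addEntry(ByteArrayOutputStream tar, String name, String content) {
        byte[] data = content.getBytes(StandardCharsets.UTF_8);
        byte[] header = new byte[512];
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        byte[] sizeBytes = String.format("%011o", data.length).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(nameBytes, 0, header, 0, nameBytes.length);
        System.arraycopy(sizeBytes, 0, header, 124, sizeBytes.length);
        tar.write(header, 0, header.length);
        tar.write(data, 0, data.length);
        int pad = (512 - data.length % 512) % 512;
        tar.write(new byte[pad], 0, pad);
    }

    // Build a tiny gzipped tar in memory with two made-up CSV files.
    static byte[] sampleTgz() {
        ByteArrayOutputStream tar = new ByteArrayOutputStream();
        addEntry(tar, "a.csv", "id,name\n1,foo\n");
        addEntry(tar, "b.csv", "id,name\n2,bar\n");
        tar.write(new byte[1024], 0, 1024); // two zero blocks end the archive
        ByteArrayOutputStream tgz = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(tgz)) {
            gz.write(tar.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tgz.toByteArray();
    }

    public static void main(String[] args) {
        readTgz(sampleTgz()).forEach((name, csv) ->
                System.out.println(name + " -> " + csv.trim()));
    }
}
```

In a real route the tar layer should be left to TarSplitter, which handles the header fields and entry types this sketch skips. The point here is only that the CSV bytes sit between tar headers once the gzip layer is removed.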