Manipulate CSV file using groovy and java

Time:07-07

Newbie here with a question. I have the following .csv file as an example:

10;06.07.2022;This is test;
08;01.07.2020;This is test;
15;06.07.2021;This is test;
09;06.07.2021;This is test;

So it's multiple rows with the same layout. I want to delete each row that has a date earlier than 06.07.2022. In theory, only the first row should remain in the .csv file and the others should be deleted.

I want to be able to declare the date as a variable. I already did the following to try to understand:

private String dateii = 'test.csv';   // Filename Input
private String dateio = '';           // Filename Output

void openInputfile() {
    File outputfile = new File(dateio);
    outputfile.write('');

    File inputfile = new File(dateii);
    if (!inputfile.exists()) {
        println("No File")
    }

    List data = inputfile.readLines();
    for (String zeile in data) {
        if (zeile.startsWith('BEREICH')) {
            Header = zeile;
        } else {
            List buffer = zeile.split(";");
            if (zeile.size() > 0) {   // Remove Empty rows

            }
        }
    }
}

EDIT:

So my questions are the following:

  1. How can I delete a complete row?
  2. How can I specify which rows to delete using the date?

Thank you!

CodePudding user response:

This seems to be a schoolwork assignment, so I’ll be brief and omit code.

First, define your target date as a LocalDate object.

LocalDate target = LocalDate.of( 2022 , 7 , 6 ) ;  // 06.07.2022 in the day.month.year format of your file

Your existing code has a list of fields for each row. Your goal is to filter by comparing the second field, the date.

  1. Extract the second field, the date input.
  2. Parse that date input as a LocalDate. Use a DateTimeFormatter object for parsing, with a formatting pattern you define.
  3. Compare dates using LocalDate#isBefore.
  4. If your criterion is met, write that line to the output file. If not met, move on to the next line, omitting this current line from your output file.
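
For readers who want to see the four steps in one place, here is a minimal Java sketch. The class name RowDateFilter, the method keepFrom, and the dd.MM.yyyy pattern are assumptions made for illustration (the pattern is inferred from the sample rows), and the rows are held in memory rather than read from a file.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowDateFilter {
    // Assumed pattern: day.month.year, as in 06.07.2022.
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd.MM.yyyy");

    // Keeps only the rows whose second field is on or after the target date.
    static List<String> keepFrom(List<String> rows, LocalDate target) {
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            if (row.trim().isEmpty()) {
                continue; // ignore empty rows
            }
            String[] fields = row.split(";");                        // step 1: extract the fields
            LocalDate rowDate = LocalDate.parse(fields[1], FORMAT);  // step 2: parse the date
            if (!rowDate.isBefore(target)) {                         // step 3: compare
                kept.add(row);                                       // step 4: keep the row
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "10;06.07.2022;This is test;",
                "08;01.07.2020;This is test;",
                "15;06.07.2021;This is test;",
                "09;06.07.2021;This is test;");
        LocalDate target = LocalDate.of(2022, 7, 6);
        System.out.println(keepFrom(rows, target)); // prints [10;06.07.2022;This is test;]
    }
}
```

The comparison is written as "not before" so that a row dated exactly on the target date is kept, which matches the expected result in the question.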

All aspects of this have been covered many many times on Stack Overflow. Search to learn more.

Tip: Java NIO.2. Java has old ways of handling files, and new ways. Use the new ways.
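
As a rough illustration of the NIO.2 style, the sketch below reads and writes through java.nio.file.Files and Path instead of java.io.File. The class name NioCsvFilter, the method filterFile, and the temporary file names are hypothetical, and the whole file is assumed to be small enough to hold in memory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NioCsvFilter {
    // Assumed pattern: day.month.year, as in 06.07.2022.
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd.MM.yyyy");

    // Reads the input file, keeps rows dated on or after the target,
    // and writes the surviving rows to the output file.
    static void filterFile(Path input, Path output, LocalDate target) throws IOException {
        List<String> kept = Files.readAllLines(input).stream()
                .filter(line -> !line.trim().isEmpty())
                .filter(line -> !LocalDate.parse(line.split(";")[1], FORMAT).isBefore(target))
                .collect(Collectors.toList());
        Files.write(output, kept);
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real input and output paths.
        Path input = Files.createTempFile("test", ".csv");
        Path output = Files.createTempFile("filtered", ".csv");
        Files.write(input, Arrays.asList(
                "10;06.07.2022;This is test;",
                "08;01.07.2020;This is test;"));
        filterFile(input, output, LocalDate.of(2022, 7, 6));
        System.out.println(Files.readAllLines(output)); // prints [10;06.07.2022;This is test;]
    }
}
```

Writing to a separate output file, rather than editing the input in place, means the original data survives if something goes wrong halfway through.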

You asked:

How can I delete a complete row?

Skip it. Don’t write that line to the output file.

You asked:

How can I specify which rows to delete using the date?

Parse the text of the date field as a LocalDate. Compare to your target date using a LocalDate method such as isAfter or isBefore.


By the way, better naming will make your code easier to read and debug. data is vague. buffer is inaccurate. Header should start with a lowercase letter, header.


In real work we would not parse the CSV file directly. We would use any of the several good CSV parsing libraries available in the Java ecosystem. Example: Apache Commons CSV.

CodePudding user response:

In Groovy, the code could look like this if the input file is this simple; otherwise it is better to use a CSV library.

import java.time.LocalDate

//could be replaced with src = new File(source_file_name)
def src = '''
10;30.07.2022;This is test;
08;01.07.2020;This is test;
15;06.07.2021;This is test;
09;06.07.2021;This is test;
'''

//could be replaced with dst = new File(target_file_name)
def dst = System.out
def now = LocalDate.now()

dst.withWriter{writer->
    src.eachLine{line->
        def row = line.split(/;/)
        if( row.size()==3 ){
            if( LocalDate.parse(row[1], 'dd.MM.yyyy').toEpochDay() > now.toEpochDay() ) {
                writer << line << '\n'
            }
        }
    }
}
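
The script above skips any line that does not split into exactly three fields. A Java sketch of the same guard might also treat an unparseable date as a reason to skip the row. The names SafeRowDate, dateOf and isAfter are hypothetical, and the target date is passed in as a parameter rather than taken from LocalDate.now(), which is an assumption based on the question's wish to declare the date as a variable.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

class SafeRowDate {
    // Assumed pattern: day.month.year, as in 30.07.2022.
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd.MM.yyyy");

    // Returns the date in the second field, or empty if the line does not
    // have exactly three fields or the date text cannot be parsed.
    static Optional<LocalDate> dateOf(String line) {
        String[] fields = line.split(";");
        if (fields.length != 3) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(fields[1], FORMAT));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // True only for well-formed rows dated strictly after the target.
    static boolean isAfter(String line, LocalDate target) {
        return dateOf(line).map(date -> date.isAfter(target)).orElse(false);
    }

    public static void main(String[] args) {
        LocalDate target = LocalDate.of(2022, 7, 7);
        System.out.println(isAfter("10;30.07.2022;This is test;", target)); // prints true
        System.out.println(isAfter("08;01.07.2020;This is test;", target)); // prints false
        System.out.println(isAfter("xx;not-a-date;text;", target));         // prints false
    }
}
```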

CodePudding user response:

Given the input file projects/test.csv containing these lines:

10;06.07.2022;This is test;
08;01.07.2020;This is test;
15;06.07.2021;This is test;
09;06.07.2021;This is test;

The following Groovy script:

import java.time.format.DateTimeFormatter
import java.time.LocalDate

final FMT = DateTimeFormatter.ofPattern('dd.MM.yyyy')

def targetDate = LocalDate.parse('06.07.2022', FMT)

def test = new File('projects/test.csv')
def filtered = new File('projects/filtered.csv')
def filteredOutput = filtered.newPrintWriter()

test.filterLine(filteredOutput) { line ->
   def parts = line.split(';')
   def lineDate = LocalDate.parse(parts[1], FMT)
   lineDate >= targetDate
}

Produces the following output in file projects/filtered.csv:

10;06.07.2022;This is test;

Is this what you were looking for?

It takes advantage of idiomatic Groovy code shortcuts and modern java.time date manipulation classes.
