Parsing LocalDateTime with day-of-year and differing number of subsecond digits

Time:11-06

I have data using year and day-of-year (1-365/366) plus a time-of-day, such as 2018-338T14:02:57.47583, rather than year-month-day.

I am trying to write a function where I input a timestamp, it runs it through a bunch of regexes, and returns a pattern that I can use to parse that timestamp via LocalDateTime.parse. I have a function that does this except with one issue.

public static String getInputFormat(String rawFirst) {
            
    String first = rawFirst.replaceAll("[^-.:/ a-zA-Z0-9]", "");
    
    ImmutableMap<String, String> regexFormatMap = new ImmutableMap.Builder<String, String>()
            .put("(-)?[0-9]{4}(Z?)", "yyyy") // lowercase: "YYYY" would mean week-based-year
            .put("(-)?[0-9]{4}-((00[1-9])|(0[1-9][0-9])|([1-2][0-9][0-9])|(3(([0-5][0-9])|(6[0-6]))))(T)(([0-1][0-9])|(2[0-3])):[0-5][0-9](Z?)", "yyyy-DDD'T'HH:mm")
            .put("(-)?[0-9]{4}-((00[1-9])|(0[1-9][0-9])|([1-2][0-9][0-9])|(3(([0-5][0-9])|(6[0-6]))))(T)(([0-1][0-9])|(2[0-3])):[0-5][0-9]:(([0-5][0-9])|60)(\\.([0-9]{1,6}))?(Z?)", "yyyy-DDD'T'HH:mm:ss.SSSSS")
            .put("(-)?[0-9]{4}-((00[1-9])|(0[1-9][0-9])|([1-2][0-9][0-9])|(3(([0-5][0-9])|(6[0-6]))))(T)(([0-1][0-9])|(2[0-4]))(Z?)", "yyyy-DDD'T'HH")
            .put("(-)?[0-9]{4}-((00[1-9])|(0[1-9][0-9])|([1-2][0-9][0-9])|(3(([0-5][0-9])|(6[0-6]))))(T)24((:00)|(:00:00))?(Z?)", "yyyy-DDD'T'HH:mm:ss")
            .put("(-)?[0-9]{4}-((00[1-9])|(0[1-9][0-9])|([1-2][0-9][0-9])|(3(([0-5][0-9])|(6[0-6]))))(T)24:00:00(\\.([0]{1,6}))(Z?)", "yyyy-DDD'T'HH:mm:ss")
            .put("(-)?[0-9]{4}-((00[1-9])|(0[1-9][0-9])|([1-2][0-9][0-9])|(3(([0-5][0-9])|(6[0-6]))))(Z?)", "yyyy-DDD")
            .build();
            
    for (String regex : regexFormatMap.keySet()) {
        System.out.println(first.matches(regex) + ": " + first + " fits " + regex);
        if (first.matches(regex)) {
            System.out.println("Returning pattern " + regexFormatMap.get(regex));
            return regexFormatMap.get(regex);
        }
    }
            
    System.out.println("did not match pattern, returning default pattern used with test data, eventually this should just fail.");
    return "yyyy-DDD'T'HH:mm:ss.SSSSS";
}

I can't figure out how to handle an arbitrary number of subsecond digits, e.g.,

  • "2018-338T14:02:57.47583"
  • "2018-338T14:02:57.475835"
  • "2018-338T14:02:57.4758352"
  • "2018-338T14:02:57.47583529"

etc.

I want to do this in as general a way as possible, so ideally I wouldn't be checking for each possibility.

One solution would be to have the output format string have nine subsecond digits, and then pad the input string, but the problem is it's getting quite clunky to check whether I should pad it and by how much. I want this to handle a wide variety of strings, and to be expandable just by adding more entries to the regex map instead of adding complexity and special cases elsewhere.

Maybe I can't get everything I want here, but I'd love a solution if you can think of one. Thanks!

CodePudding user response:

Instead of using a complex and error-prone RegEx solution for this requirement, you can use u-D'T'H:m:s[.[SSSSSSSSS][SSSSSSSS][SSSSSSS][SSSSSS][SSSSS][SSSS][SSS][SS][S]] as the pattern with DateTimeFormatter which allows the optional patterns to be specified within square brackets.

Demo:

import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        DateTimeFormatter dtf = DateTimeFormatter.ofPattern(
                "u-D'T'H:m:s[.[SSSSSSSSS][SSSSSSSS][SSSSSSS][SSSSSS][SSSSS][SSSS][SSS][SS][S]]", Locale.ENGLISH);
        
        // Test
        Stream.of(
                "2018-338T14:02:57.47583", 
                "2018-338T14:02:57.475835", 
                "2018-338T14:02:57.4758352",
                "2018-338T14:02:57.47583529"
        ).forEach(s -> System.out.println(LocalDateTime.parse(s, dtf)));
    }
}

Output:

2018-12-04T14:02:57.475830
2018-12-04T14:02:57.475835
2018-12-04T14:02:57.475835200
2018-12-04T14:02:57.475835290

Learn more about the modern Date-Time API* from Trail: Date Time. Check this answer and this answer to learn how to use java.time API with JDBC.


* If you are working on an Android project and your Android API level is not yet compatible with Java 8, check Java 8 APIs available through desugaring. Note that Android 8.0 Oreo already provides support for java.time.

CodePudding user response:

You could use a new DateTimeFormatterBuilder() in order to build a flexible DateTimeFormatter which parses Strings with variable fractions of second, using the method appendFraction(TemporalField, int, int, boolean).

For instance like this, where I required at least 5 and at most 8 digits of fraction of second (which might not be the best idea, but it handles all your example Strings):

DateTimeFormatter ldtVarFracSecFormatter =
        new DateTimeFormatterBuilder()
                .appendPattern("uuuu-DDD'T'HH:mm:ss")
                .appendFraction(ChronoField.NANO_OF_SECOND, 5, 8, true)
                .toFormatter(Locale.ENGLISH);

Here's an example using the Strings from your question:

public static void main(String[] args) {
    String a = "2018-338T14:02:57.47583";
    String b = "2018-338T14:02:57.475835"; 
    String c = "2018-338T14:02:57.4758352"; 
    String d = "2018-338T14:02:57.47583529";
    
    DateTimeFormatter ldtVarFracSecFormatter = 
            new DateTimeFormatterBuilder()
                    .appendPattern("uuuu-DDD'T'HH:mm:ss")
                    .appendFraction(ChronoField.NANO_OF_SECOND, 5, 8, true)
                    .toFormatter(Locale.ENGLISH);
    
    LocalDateTime aLdt = LocalDateTime.parse(a, ldtVarFracSecFormatter);
    LocalDateTime bLdt = LocalDateTime.parse(b, ldtVarFracSecFormatter);
    LocalDateTime cLdt = LocalDateTime.parse(c, ldtVarFracSecFormatter);
    LocalDateTime dLdt = LocalDateTime.parse(d, ldtVarFracSecFormatter);

    System.out.println(aLdt);
    System.out.println(bLdt);
    System.out.println(cLdt);           
    System.out.println(dLdt);
}

The output is (implicitly using LocalDateTime.toString()):

2018-12-04T14:02:57.475830
2018-12-04T14:02:57.475835
2018-12-04T14:02:57.475835200
2018-12-04T14:02:57.475835290
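
If the 5-to-8 range is too narrow for your data, the same builder call can be widened. The following is only a sketch under the assumption that anything from no fraction at all up to full nanosecond precision should be accepted; the class name FlexibleFraction and the helper parse are made up for illustration. With a minimum width of 0, the decimal point itself becomes optional when parsing.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

public class FlexibleFraction {

    // Assumption: 0 to 9 fractional digits are acceptable (hypothetical widening of the 5..8 range).
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("uuuu-DDD'T'HH:mm:ss")
            // minWidth 0, maxWidth 9, decimalPoint true: ".d" through ".ddddddddd", or nothing
            .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
            .toFormatter(Locale.ENGLISH);

    // Hypothetical helper so the formatter can be tried on several inputs
    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("2018-338T14:02:57"));            // 2018-12-04T14:02:57
        System.out.println(parse("2018-338T14:02:57.4"));          // 2018-12-04T14:02:57.400
        System.out.println(parse("2018-338T14:02:57.475835291"));  // 2018-12-04T14:02:57.475835291
    }
}
```

The trade-off is that a wider range is more forgiving, so it will no longer reject timestamps with fewer than 5 fractional digits if those would indicate bad data in your case.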

CodePudding user response:

I’m not sure what your real goal is. If the point is parsing those strings, don’t use any pattern. Build your formatter from formatters that are already built-in:

private static final DateTimeFormatter PARSER = new DateTimeFormatterBuilder()
        .append(DateTimeFormatter.ISO_ORDINAL_DATE)
        .appendLiteral('T')
        .append(DateTimeFormatter.ISO_LOCAL_TIME)
        .toFormatter(Locale.ROOT);

This parses all of your example strings:

    String[] timestampStrings = {
            "2018-338T14:02:57.47583",
            "2018-338T14:02:57.475835",
            "2018-338T14:02:57.4758352",
            "2018-338T14:02:57.47583529"
    };
    
    for (String tss : timestampStrings) {
        LocalDateTime ldt = LocalDateTime.parse(tss, PARSER);
        System.out.format("%-26s -> %s%n", tss, ldt);
    }

Output:

2018-338T14:02:57.47583    -> 2018-12-04T14:02:57.475830
2018-338T14:02:57.475835   -> 2018-12-04T14:02:57.475835
2018-338T14:02:57.4758352  -> 2018-12-04T14:02:57.475835200
2018-338T14:02:57.47583529 -> 2018-12-04T14:02:57.475835290

Don’t be confused by the fact that LocalDateTime.toString() prints the fraction of second in groups of three decimals (if there is any fraction at all), padding with trailing zeroes as necessary. The times printed do agree with your strings.
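
To see the grouping in isolation, here is a small sketch that builds the same second with different nano-of-second values and prints them. The class name ToStringGroups and the helper show are only for illustration.

```java
import java.time.LocalDateTime;

public class ToStringGroups {

    // Hypothetical helper: same date and time each run, only the nano-of-second varies
    static String show(int nanoOfSecond) {
        return LocalDateTime.of(2018, 12, 4, 14, 2, 57, nanoOfSecond).toString();
    }

    public static void main(String[] args) {
        System.out.println(show(0));            // 2018-12-04T14:02:57           (no fraction)
        System.out.println(show(475_000_000));  // 2018-12-04T14:02:57.475       (3 decimals)
        System.out.println(show(475_830_000));  // 2018-12-04T14:02:57.475830    (6 decimals)
        System.out.println(show(475_835_290));  // 2018-12-04T14:02:57.475835290 (9 decimals)
    }
}
```

The stored value is the same in every case; only the printed width differs. If you need a fixed number of decimals in output, use a DateTimeFormatter for formatting rather than relying on toString().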

The ISO 8601 format for an ordinal date, like 2018-338 for the 338th day of year 2018, is not often used, but java.time includes a formatter for it. The commonly used DateTimeFormatter.ISO_LOCAL_TIME accepts from 0 through 9 digits of decimals on the seconds, a range which is more than wide enough to accommodate all of your timestamp strings.
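
As a rough check of that range, the sketch below rebuilds the same parser and probes the edges: no seconds, no fraction, 9 decimals, and 10 decimals. The class name OrdinalParserRange and the helper parses are made up for this illustration. ISO_LOCAL_TIME also treats the seconds as optional, which is why the first string is accepted.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.Locale;

public class OrdinalParserRange {

    // Same construction as PARSER above, repeated so the example is self-contained
    static final DateTimeFormatter PARSER = new DateTimeFormatterBuilder()
            .append(DateTimeFormatter.ISO_ORDINAL_DATE)
            .appendLiteral('T')
            .append(DateTimeFormatter.ISO_LOCAL_TIME)
            .toFormatter(Locale.ROOT);

    // Hypothetical helper: true if the text parses into a LocalDateTime
    static boolean parses(String text) {
        try {
            LocalDateTime.parse(text, PARSER);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("2018-338T14:02"));                // true  (seconds omitted)
        System.out.println(parses("2018-338T14:02:57"));             // true  (0 decimals)
        System.out.println(parses("2018-338T14:02:57.475835291"));   // true  (9 decimals)
        System.out.println(parses("2018-338T14:02:57.4758352912"));  // false (10 decimals)
    }
}
```

So a single formatter covers several of the shapes that the regex map in the question handles with separate entries.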

If for some reason that I have not yet understood you do want a pattern, the answer by Arvind Kumar Avinash gives you one.

Obviously like everyone else here I am using and recommending java.time, the modern Java date and time API.

Link

Oracle tutorial: Date Time explaining how to use java.time.