Extract a valid date from a string that has extra alphanumeric characters

Time:10-08

Rules for parsing the date from the string

  1. Get the year from the four digits that follow the string "YEAR", for example "YEAR2001tV"
  2. Get the month from the digits that follow the string "month", for example "Umonth04e6G"
  3. Get the day from the digits that follow the string "day", for example "5day016849"
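  function parseBirthday( remarkString ) {
    // Match "year" + 4 digits, or "month" / "day" + 1-2 digits, ignoring case.
    // Digits are matched right after each keyword, wherever it appears in the string.
    var regex = /year(?<year>\d{4})|month(?<month>\d{1,2})|day(?<day>\d{1,2})/gi;

    let m;
    var parts = {};
    while ((m = regex.exec(remarkString)) !== null) {
      // This is necessary to avoid infinite loops with zero-width matches
      if (m.index === regex.lastIndex) {
          regex.lastIndex++;
      }

      // Each match fills exactly one named group; keep it, zero-padded to two digits
      for (const [name, value] of Object.entries(m.groups)) {
        if (value !== undefined) parts[name] = value.padStart(2, '0');
      }
    }

    return `${parts.year}-${parts.month}-${parts.day}`;
  }

  console.log(parseBirthday('YEAR2001tV934Umonth04e6GdNS6Am5day016849'))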

Here we have sample input

YEAR2001tV934Umonth04e6GdNS6Am5day016849

Sample output

 2001-04-01

CodePudding user response:

You can use this regex, with the case-insensitive i flag, to match your data:

^(?=.*year(\d{4}))(?=.*month(\d\d))(?=.*day(\d\d)).*$

It matches:

  • ^ beginning of string
  • (?=.*year(\d{4})) lookahead for year followed by 4 digits (captured in group 1)
  • (?=.*month(\d\d)) lookahead for month followed by 2 digits (captured in group 2)
  • (?=.*day(\d\d)) lookahead for day followed by 2 digits (captured in group 3)
  • .* any number of characters
  • $ end of string

You don't mention whether year, month and day occur in that order in the string, so I've used lookaheads in this regex to allow for them being in any order.
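To illustrate the order independence, here is a minimal sketch that runs the same regex against the sample input and against a reordered variant. The helper name `extractParts` and the reordered string are made up for this illustration.

```javascript
// Hypothetical helper: returns [year, month, day] as strings,
// or null when any of the three parts is missing.
function extractParts(text) {
  const regex = /^(?=.*year(\d{4}))(?=.*month(\d\d))(?=.*day(\d\d)).*$/i;
  const m = text.match(regex);
  return m ? [m[1], m[2], m[3]] : null;
}

// Original order: year, month, day
console.log(extractParts('YEAR2001tV934Umonth04e6GdNS6Am5day016849')); // [ '2001', '04', '01' ]

// Reordered (made-up) input: day, month, year
console.log(extractParts('5day016849Umonth04e6GYEAR2001tV')); // [ '2001', '04', '01' ]
```

Because each `.*` is greedy, a keyword that appears more than once would be captured at its last occurrence.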

You can then replace the input string with

$1-$2-$3

const input = 'YEAR2001tV934Umonth04e6GdNS6Am5day016849'

const regex = /^(?=.*year(\d{4}))(?=.*month(\d\d))(?=.*day(\d\d)).*$/i

const date = input.replace(regex, '$1-$2-$3')

console.log(date)
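One thing to watch for: when the regex does not match, `replace` returns the input string unchanged, so a missing part goes unnoticed. A possible alternative, sketched here with a made-up helper name `toDateString`, is to use `exec` and return `null` on failure.

```javascript
const regex = /^(?=.*year(\d{4}))(?=.*month(\d\d))(?=.*day(\d\d)).*$/i;

// Hypothetical helper: returns 'YYYY-MM-DD', or null when year, month or day is missing.
function toDateString(text) {
  const m = regex.exec(text);
  return m === null ? null : `${m[1]}-${m[2]}-${m[3]}`;
}

console.log(toDateString('YEAR2001tV934Umonth04e6GdNS6Am5day016849')); // 2001-04-01
console.log(toDateString('YEAR2001tV934Umonth04e6G'));                 // null (no "day" part)

// For comparison: replace() hands back the input untouched when nothing matches
console.log('YEAR2001tV934Umonth04e6G'.replace(regex, '$1-$2-$3'));    // YEAR2001tV934Umonth04e6G
```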

If you have data where you might have only one digit for the month or day, you can use a replacer function to add leading zeros to those values:

const input = 'YEAR2011tV934Umonth4e6GdNS6Am5day9x6849'

const regex = /^(?=.*year(\d{4}))(?=.*month(\d{1,2}))(?=.*day(\d{1,2})).*$/i

const replacer = (_, y, m, d) => `${y}-${m.padStart(2, '0')}-${d.padStart(2, '0')}`

const date = input.replace(regex, replacer)

console.log(date)
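The regex only checks the shape of the digits, so a string such as 'month13' or 'day31' in February would still produce output. If the result must be a real calendar date, one option is a round-trip check through `Date.UTC`. This is a sketch, and `isRealDate` is a hypothetical helper name, not part of the answer above.

```javascript
// Hypothetical helper: true when 'YYYY-MM-DD' names a real calendar date.
// Date.UTC rolls out-of-range values over (e.g. Feb 30 becomes Mar 2),
// so the components only survive the round trip when the date exists.
// Note: Date.UTC maps years 0-99 to 1900-1999, so such years are rejected here.
function isRealDate(isoDate) {
  const [y, m, d] = isoDate.split('-').map(Number);
  const dt = new Date(Date.UTC(y, m - 1, d));
  return dt.getUTCFullYear() === y &&
         dt.getUTCMonth() === m - 1 &&
         dt.getUTCDate() === d;
}

console.log(isRealDate('2011-04-09')); // true
console.log(isRealDate('2011-02-30')); // false
console.log(isRealDate('2011-13-01')); // false
```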
