Javascript loop through JSON, combine dates/times by order of appearance

Time:11-19

I think I may be close to a solution based on the answers to this question, but I am currently a bit stuck on how to go about addressing the layout difference of the JSON I am working with versus the example given in that question. This may be an easier problem than I'm making it out to be; any resources that might point me in the right direction would be helpful.

The JSON I'm working with comes in as follows:

{
  "info": [
    "2021-10-04\nPage visit 09:57:33\n - URL: https://www.google.com/\nPage visit 09:57:50\n - URL: https://www.google.com/blah-blah-blah/\nPage visit 09:56:03\n - URL: https://google.com/random-text/blah/\n\n2021-11-04\nPage visit 13:46:03\n - URL: https://www.google.com/blah/blah-blah/\n\n"
  ]
}

My goal is to loop through this string, extract each URL, date, and timestamp, join the date and timestamp values into a single "yyyy-mm-dd hh:mm:ss" date-time value (e.g. "2021-10-04 08:57:23"), then push that combined date-time value and the associated URL to a two-column array.

I am able to pull the URL and date values using regex, but as a dynamic number of URLs/timestamps are listed separately from their relevant dates, I'm struggling to conceive of a way to associate the timestamps with the correct dates.

//Extract URLs, dates, times
const urlMatches = text.match(/\bhttps?:\/\/\S+/gi);
const dateMatches = text.match(/(\d{1,4}([.\-/])\d{1,2}([.\-/])\d{1,4})/g);
const timeMatches = text.match(/\d{1,2}\D\d{1,2}\D(\d{4}|\d{2})/g);

CodePudding user response:

First, .split() the long string by "\n". Then you could use .reduce() on the split array to manipulate it in any way you want, like matching, converting to a date, and so on.
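
A minimal sketch of that idea, assuming each line is either a date, a "Page visit hh:mm:ss" line, or a " - URL: ..." line, as in the question's sample. The helper name parseVisits and the shortened sample string are illustrative and not from the original posts; the reducer simply remembers the most recent date and time until a URL line arrives.

```javascript
// Shortened sample in the same layout as the question's JSON string.
const sample =
  "2021-10-04\nPage visit 09:57:33\n - URL: https://www.google.com/\n\n" +
  "2021-11-04\nPage visit 13:46:03\n - URL: https://www.google.com/blah/\n\n";

// parseVisits is a hypothetical helper name used only for this sketch.
function parseVisits(text) {
  const state = text.split("\n").reduce(
    (acc, line) => {
      const dateMatch = line.match(/^\d{4}-\d{2}-\d{2}$/);
      const timeMatch = line.match(/^Page visit (\d{2}:\d{2}:\d{2})$/);
      const urlMatch = line.match(/^ - URL: (\S+)$/);
      if (dateMatch) {
        acc.date = dateMatch[0]; // remember the date for the lines that follow
      } else if (timeMatch) {
        acc.time = timeMatch[1]; // remember the time until its URL line arrives
      } else if (urlMatch && acc.date && acc.time) {
        acc.rows.push([`${acc.date} ${acc.time}`, urlMatch[1]]);
        acc.time = null; // each time belongs to exactly one URL
      }
      return acc;
    },
    { date: null, time: null, rows: [] }
  );
  return state.rows;
}

const rows = parseVisits(sample);
console.log(rows);
```

Blank lines match none of the three patterns, so they are skipped without any extra filtering.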

CodePudding user response:

You can use map, reduce, etc., but I don't know if they're any more readable than plain old forEach. Replace the line breaks with a placeholder character (~ in the code below). You don't have to, but it makes the patterns easier to write. Then use a regex like p1 to split the text into date groups. Notice that a double line break is easy to check for. Use matchAll and spread to get the matches into an array.

Iterate that array. Because of the first capturing group, the date is in the first element. Store the date in a variable, then use a second regex to extract out the time/url pairs (from the second capturing group), then iterate that into buffer, combining it with the stored date value.

const data = {
  "info": [
    "2021-10-04\nPage visit 09:57:33\n - URL: https://www.google.com/\nPage visit 09:57:50\n - URL: https://www.google.com/blah-blah-blah/\nPage visit 09:56:03\n - URL: https://google.com/random-text/blah/\n\n2021-11-04\nPage visit 13:46:03\n - URL: https://www.google.com/blah/blah-blah/\n\n"
  ]
};

const text = data.info[0];
const p1 = /(\d{4}-\d{2}-\d{2})~(.*?)~~/g;
let vals = [...text.replaceAll("\n", "~").matchAll(p1)];

const p2 = /page visit\s+([^~]*)~\s+-\s+url:\s+([^~]*)/ig;
let buffer = [];

vals.forEach(v=>{
  let t = v[1];
  let parsed = [...v[2].matchAll(p2)];
  parsed.forEach(p=>{
    buffer.push({dt: t, time: p[1], url: p[2]});  
  });
});

console.log(buffer);
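
The buffer above keeps the date and time in separate fields, while the question asks for a combined date-time value next to the URL in a two-column array. One possible way to get there is a final map over the buffer. The sampleBuffer and table names below are illustrative stand-ins so that the sketch runs on its own.

```javascript
// Stand-in for the buffer built above, using values from the sample data.
const sampleBuffer = [
  { dt: "2021-10-04", time: "09:57:33", url: "https://www.google.com/" },
  { dt: "2021-11-04", time: "13:46:03", url: "https://www.google.com/blah/blah-blah/" },
];

// Join date and time with a space and pair the result with the URL.
// map preserves the input order, so rows stay in order of appearance.
const table = sampleBuffer.map(({ dt, time, url }) => [`${dt} ${time}`, url]);

console.log(table);
```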

As long as the date groups are separated by a double line break and the date itself is followed by a line break, that first pattern can be simpler, something like: /([^~]*)~(.*?)~~/g. Notice that the ~~ terminator only works if the last date group is also followed by a double line break, which seems to be the case here. If it isn't, the terminator has to accept the end of the string as well, for example (?:~~|$) in place of the bare ~~. Simply making it optional with (?:~~)? is not enough, because the lazy (.*?) would then match nothing.
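
A small sketch of a terminator that tolerates a missing trailing double line break, assuming the same ~ placeholder as above. It uses (?:~~|$), which accepts either the double break or the end of the string. The variable names and the shortened input are illustrative.

```javascript
// Assumed variant of the input: the last date group has NO trailing blank line.
const raw =
  "2021-10-04\nPage visit 09:57:33\n - URL: https://www.google.com/\n\n" +
  "2021-11-04\nPage visit 13:46:03\n - URL: https://www.google.com/blah/";

const flat = raw.replaceAll("\n", "~");

// Each group ends at a double break (~~) or at the end of the string.
const groupPattern = /(\d{4}-\d{2}-\d{2})~(.*?)(?:~~|$)/g;

const groups = [...flat.matchAll(groupPattern)].map(m => ({
  date: m[1],
  body: m[2], // the time/url lines for this date, still joined by ~
}));

console.log(groups);
```

Without the $ alternative, the second group in this input would not be matched at all, since no ~~ follows it.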

CodePudding user response:

You can use a combination of .split(), .map() and .reduce() methods as follows:

const jObj = jData.info.map(line =>
    //First split @ \n\n assuming that's where the date might change
    line.split(/[\n]{2,}/).filter(l => l)
    //Then split each new element @ \n, - ,URL: .... to get date followed by time,url pairs
    .map(ln => ln.split(/(?:\n| - |URL: |Page visit )+/).filter(v => v))
    //Manipulate the resulting data to produce timestamp,url pairs
    .map(arr =>
        arr.slice(1).reduce((acc, cur, i, ar) => i % 2 === 0 ? [...acc, ar.slice(i, i + 2)] : acc, [])
        .map(([time, url]) => ({
            timestamp: `${arr[0]} ${time}`,
            url
        }))
    )
    //flatten the array of arrays to get an array of objects
    .flat()
);
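
The least obvious step in the chain is the reduce that groups the flat token list into [time, url] pairs. Here it is in isolation, as a sketch that assumes the tokens for one date group have already been split out, with the date first. The tokens array is hand-written from the sample data for illustration.

```javascript
// Tokens for one date group after splitting: the date, then alternating time and URL.
const tokens = [
  "2021-10-04",
  "09:57:33", "https://www.google.com/",
  "09:57:50", "https://www.google.com/blah-blah-blah/",
];

const [date, ...rest] = tokens;

// At every even index, take that element and the next one as a pair.
const pairs = rest.reduce(
  (acc, cur, i, ar) => (i % 2 === 0 ? [...acc, ar.slice(i, i + 2)] : acc),
  []
);

// Prefix each time with the group's date to form the timestamp.
const result = pairs.map(([time, url]) => ({ timestamp: `${date} ${time}`, url }));

console.log(result);
```

This pairing relies on every time being followed by exactly one URL. A group with a missing URL line would shift every later pair by one.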

DEMO


const jData = {
  "info": [
    "2021-10-04\nPage visit 09:57:33\n - URL: https://www.google.com/\nPage visit 09:57:50\n - URL: https://www.google.com/blah-blah-blah/\nPage visit 09:56:03\n - URL: https://google.com/random-text/blah/\n\n2021-11-04\nPage visit 13:46:03\n - URL: https://www.google.com/blah/blah-blah/\n\n"
  ]
}

const jObj = jData.info.map(line => 
    line.split(/[\n]{2,}/).filter(l => l)
    .map(ln => ln.split(/(?:\n| - |URL: |Page visit )+/).filter(v => v))
    .map(arr => 
        arr.slice(1).reduce((acc,cur,i,ar) => i%2 === 0 ? [...acc,ar.slice(i,i+2)] : acc,[])
        .map(([time, url]) => ({timestamp:`${arr[0]} ${time}`,url}))
    )
    .flat()
);

console.log( jObj );