How to parse key-value based data from a text-format by a provided list of property names and write

Time:06-11

I'm trying to write a JSON file that contains the matched words from a text like this:

servicepoint ‏ 200135644 watchid ‏ 7038842

Each servicepoint and watchid should be inserted into the object's table only once, using this code:

function readfile() {
  Tesseract.recognize('form.png', 'ara', {

    logger: m => console.log(m)

  }).then(({ data: { text } }) => {
    console.log(text); /* this line here */
    var obj = {
      table: []
    };
    const info = ['servicepoint', 'watchid'];
    for (k = 0; k < info.length; k++) {
      var result = text.match(new RegExp(info[k] + '\\s+(\\w+)'))[1];
      obj.table.push({
        servicepoint: /* Here i want to insert the number after servicepoint*/ ,
        watchid: /*also i want to insert the number after watchid to the Object table*/
      });
    }
    var json = JSON.stringify(obj); /* convert the object table to a JSON string */
    var fs = require('fs'); /* and then write a JSON file that contains the data */
    fs.writeFile('myjsonfile.json', json, 'utf8', callback);
  })
};

CodePudding user response:

From one of my above comments ...

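One could even drop the unicode flag and translate the above pattern back into /servicepoint\W+(\w+)/, which is pretty close to the OP's original code. The OP just needs to trade the \\s+ for a \\W+, apparently because the sample text contains an invisible non-whitespace character between each key and its number, which \\s+ does not match.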
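
To see why the swap matters, the following sketch compares both patterns against a sample string. It assumes that the invisible character between each key and its number is a right-to-left mark (U+200F), which is a guess about the OP's OCR output and is written here as an explicit escape.

```javascript
// Assumption: the OCR output holds a right-to-left mark (U+200F)
// between each key and its number, written here as an explicit escape.
const sample = 'servicepoint \u200F 200135644 watchid \u200F 7038842';

// \s+ stops at U+200F (it is not whitespace), so (\w+) has nothing to capture.
const withWhitespace = /servicepoint\s+(\w+)/.exec(sample);

// \W+ consumes every non-word character, including spaces and U+200F.
const withNonWord = /servicepoint\W+(\w+)/.exec(sample);

console.log(withWhitespace);   // null
console.log(withNonWord?.[1]); // 200135644
```

Since \W is simply the complement of \w, it also tolerates any other punctuation or control characters the OCR step might place between a key and its value.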

In addition to the proposed pattern change, I would also split the OP's chained thenables into clearly separated tasks: reading the file, parsing/extracting the data, and converting to JSON and writing it.

I also want to point out that the OP's desired table format cannot be an array; it has to be a plain key-value structure (an object). While iterating the property names, one can either aggregate a single object (one entry at a time) or push single-property items into an array (one item at a time). The OP tries to do both at once: creating a multi-entry object and pushing it on every iteration.
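
The two possible shapes can be sketched as follows. The values object is purely illustrative (its numbers are taken from the OP's sample text) and stands in for whatever the parsing step returns.

```javascript
const propertyNames = ['servicepoint', 'watchid'];

// Illustrative stand-in for already parsed values.
const values = { servicepoint: '200135644', watchid: '7038842' };

// Option A: aggregate one object, adding one entry per property name.
const asObject = propertyNames.reduce(
  (table, key) => Object.assign(table, { [key]: values[key] }),
  {}
);

// Option B: collect one single-property item per property name.
const asArray = propertyNames.map(key => ({ [key]: values[key] }));

console.log(JSON.stringify({ table: asObject }));
// {"table":{"servicepoint":"200135644","watchid":"7038842"}}
console.log(JSON.stringify({ table: asArray }));
// {"table":[{"servicepoint":"200135644"},{"watchid":"7038842"}]}
```

Option A matches what the OP describes, with each key stored exactly once.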

The code below shows the approach. The implementation follows the OP's original code. It just uses async-await syntax and mocks/fakes the file reading and writing processes.

async function readFile(fileName) {
  console.log({ fileName });

  // return await Tesseract.recognize(fileName, 'ara', {
  // 
  //   logger: m => console.log(m)
  // });

  // fake it ...
  return (await new Promise(resolve =>
    setTimeout(
      resolve,
      1000,
      { data: { text: 'servicepoint ‏ 200135644 watchid ‏ 7038842'} }
    )
  ));
}
async function parseDataFromTextAndPropertyNames(text, propertyNames) {
  console.log({ text, propertyNames });

  return propertyNames
    .reduce((table, key) =>
      Object.assign(table, {

        [ key ]: RegExp(`${ key }\\W+(\\w+)`)
          .exec(text)?.[1] ?? ''

      }), {});
}
async function writeParsedTextDataAsJSON(fileName, table) {
  console.log({ table });

  // const fs = require('fs');
  // fs.writeFile(fileName, JSON.stringify({ table }), 'utf8', callback);

  // fake it ...
  return (await new Promise(resolve =>
    setTimeout(() => {

      console.log({ fileName, json: JSON.stringify({ table }) });
      resolve({ success: true });

    }, 1000)
  ));
}

console.log('... running ...');

(async () => {
  const { data: { text } } = await readFile('form.png');

  const data =
    await parseDataFromTextAndPropertyNames(text, ['servicepoint', 'watchid']);

  const result = await writeParsedTextDataAsJSON('myjsonfile.json', data);

  console.log({ result });
})();
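
Outside of a demo, the mocked writer could be replaced by real file output roughly as follows. This is only a sketch assuming Node.js and its built-in promise-based fs.promises.writeFile; the helper name writeTableAsJSON and the temp-directory target are illustrative choices, not part of the OP's code.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: serializes { table } and writes it with the
// promise-based fs API, so it can be awaited like the mocked version.
async function writeTableAsJSON(fileName, table) {
  const json = JSON.stringify({ table });
  await fs.promises.writeFile(fileName, json, 'utf8');
  return { success: true };
}

// Usage: write into the OS temp directory to avoid touching the project.
const target = path.join(os.tmpdir(), 'myjsonfile.json');
const done = writeTableAsJSON(target, {
  servicepoint: '200135644',
  watchid: '7038842'
}).then(result => {
  console.log(result);
  return result;
});
```

Because the helper returns a promise, it needs no callback argument, unlike the callback-based fs.writeFile call in the OP's code.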

CodePudding user response:

You can use String.match to get the required variable values for your two keys: "servicepoint" and "watchid".

I would suggest using this match pattern to get your two data points.

Then you'll need to build an object from the matches and stringify it, giving you something like: {"servicepoint": 1323, "watchid": 234}

I assume you have many of these rows, so you'll want to add each JSON key-value to an array. You can then JSON.stringify(dataArray) to generate the valid JSON text to write to a file.
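
A minimal sketch of that idea, assuming each row of the recognized text holds one servicepoint/watchid pair. The helper name parseRow, the sample rows and the \W+(\w+) pattern are illustrative assumptions, since the pattern this answer originally suggested is not shown here.

```javascript
const propertyNames = ['servicepoint', 'watchid'];

// Hypothetical helper: extracts one object per row of text.
function parseRow(row) {
  const entry = {};
  for (const key of propertyNames) {
    // String.match returns null when the key is missing from the row.
    const match = row.match(new RegExp(key + '\\W+(\\w+)'));
    entry[key] = match ? match[1] : '';
  }
  return entry;
}

// Illustrative input: two rows, one pair per row.
const text = 'servicepoint 1323 watchid 234\nservicepoint 1324 watchid 235';

const dataArray = text
  .split('\n')
  .filter(row => row.trim() !== '')
  .map(parseRow);

console.log(JSON.stringify(dataArray));
// [{"servicepoint":"1323","watchid":"234"},{"servicepoint":"1324","watchid":"235"}]
```

Note that the captured values are strings; wrap them in Number(...) if numeric JSON values are wanted.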
