Regex search for text then extract row

Time:11-10

I have a JSONL file loaded as a text string. It is a very big file, so converting it to standard JSON is not practical:

{"id":"gid:\/\/shopify\/ProductVariant\/32620848382088","__parentId":"gid:\/\/shopify\/Product\/4632300847240"}
{"id":"gid:\/\/shopify\/Product\/4632300912776"}
{"namespace":"daily_deals","key":"status","value":"inactive","__parentId":"gid:\/\/shopify\/Product\/4632300912776"}
{"namespace":"daily_deals","key":"endtime","value":"1604966400000","__parentId":"gid:\/\/shopify\/Product\/4632300912776"}
{"id":"gid:\/\/shopify\/ProductVariant\/32620848447624","__parentId":"gid:\/\/shopify\/Product\/4632300912776"}
{"id":"gid:\/\/shopify\/Product\/4632301011080"}
{"namespace":"daily_deals","key":"status","value":"inactive","__parentId":"gid:\/\/shopify\/Product\/4632301011080"}
{"namespace":"daily_deals","key":"endtime","value":"1604966400000","__parentId":"gid:\/\/shopify\/Product\/4632301011080"}
{"id":"gid:\/\/shopify\/ProductVariant\/32620848808072","__parentId":"gid:\/\/shopify\/Product\/4632301011080"}
{"id":"gid:\/\/shopify\/ProductVariant\/39402297720968","__parentId":"gid:\/\/shopify\/Product\/4632301011080"}
{"id":"gid:\/\/shopify\/Product\/4673135444104"}

I want to solve the problem on the frontend, so I need to use JavaScript. How can I use a regex to select only the rows that contain both "gid://shopify/Product/4632301011080" and "namespace":"daily_deals"? I need the whole row, from { to }, whenever it contains that text.

Is regex the best solution, or is there some other technique? Please suggest one. The JSONL text file is about 10 MB on average, so I don't think it will affect browser memory much.

UPDATE: All the rows I want to search start with {"namespace": and I want to ignore the other ones for performance reasons.

CodePudding user response:

/gid:\/\/shopify\/Product\/\d+/

CodePudding user response:

You could use this regex:

/^{"namespace":"daily_deals".*?"gid:\/\/shopify\/Product\/4632301011080".*/gm

let content = `{"id":"gid:\/\/shopify\/ProductVariant\/32620848382088","__parentId":"gid:\/\/shopify\/Product\/4632300847240"}
{"id":"gid:\/\/shopify\/Product\/4632300912776"}
{"namespace":"daily_deals","key":"status","value":"inactive","__parentId":"gid:\/\/shopify\/Product\/4632300912776"}
{"namespace":"daily_deals","key":"endtime","value":"1604966400000","__parentId":"gid:\/\/shopify\/Product\/4632300912776"}
{"id":"gid:\/\/shopify\/ProductVariant\/32620848447624","__parentId":"gid:\/\/shopify\/Product\/4632300912776"}
{"id":"gid:\/\/shopify\/Product\/4632301011080"}
{"namespace":"daily_deals","key":"status","value":"inactive","__parentId":"gid:\/\/shopify\/Product\/4632301011080"}
{"namespace":"daily_deals","key":"endtime","value":"1604966400000","__parentId":"gid:\/\/shopify\/Product\/4632301011080"}
{"id":"gid:\/\/shopify\/ProductVariant\/32620848808072","__parentId":"gid:\/\/shopify\/Product\/4632301011080"}
{"id":"gid:\/\/shopify\/ProductVariant\/39402297720968","__parentId":"gid:\/\/shopify\/Product\/4632301011080"}
{"id":"gid:\/\/shopify\/Product\/4673135444104"}`;

// The m flag makes ^ match at the start of every line; the g flag returns all matching rows.
let result = content.match(/^{"namespace":"daily_deals".*?"gid:\/\/shopify\/Product\/4632301011080".*/gm);

console.log(result);
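
One caveat about the snippet above: inside a JavaScript template literal, \/ collapses to a plain /, whereas text read from the actual file would, assuming it looks like the sample in the question, still contain the backslashes. Below is a minimal sketch that tolerates both forms and builds the pattern from a product ID. The names findDealRows, productId and namespace are hypothetical and not part of the original answer:

```javascript
// Hypothetical helper: returns every line that starts with
// {"namespace":"<namespace>" and mentions the given product ID.
function findDealRows(text, productId, namespace = "daily_deals") {
  // Escape regex metacharacters in caller-supplied values.
  const esc = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // In the pattern, \\?/ matches "/" as well as the JSON-escaped "\/".
  const source = String.raw`^\{"namespace":"${esc(namespace)}".*"gid:\\?/\\?/shopify\\?/Product\\?/${esc(productId)}".*$`;
  // g: collect all rows; m: let ^ and $ match at every line boundary.
  const pattern = new RegExp(source, "gm");
  return text.match(pattern) || [];
}
```

Calling findDealRows(content, "4632301011080") should return the matching rows as strings, or an empty array when nothing matches.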

CodePudding user response:

Try this. Assuming the "namespace":"daily_deals" part always comes before "gid://shopify/Product/4632301011080", this regex will work (use the m flag so that ^ matches at the start of each line):

^{"namespace":"daily_deals".*"gid:\/\/shopify\/Product\/4632301011080".*

See live demo.
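
On the question of whether something other than regex could work: one possible alternative is to split the text into lines, skip every line that does not start with {"namespace": (as the UPDATE describes), and run JSON.parse only on the remaining ones. This is a sketch under the assumption that each row is a complete JSON object on its own line. The names selectDealRows, productId and namespace are hypothetical:

```javascript
// Hypothetical helper: parse only candidate lines, then compare fields.
function selectDealRows(text, productId, namespace = "daily_deals") {
  const wantedParent = "gid://shopify/Product/" + productId;
  const rows = [];
  for (const line of text.split(/\r?\n/)) {
    // Cheap prefix check first, so most lines are never parsed.
    if (!line.startsWith('{"namespace":')) continue;
    // JSON.parse also turns the JSON escape "\/" into a plain "/".
    const row = JSON.parse(line);
    if (row.namespace === namespace && row.__parentId === wantedParent) {
      rows.push(row);
    }
  }
  return rows;
}
```

Unlike the regex versions, this returns parsed objects rather than raw strings, and it does not depend on the order of the keys after the "namespace" prefix.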
