I have an issue where I need to identify this pattern:
"series":[{
"name":"Some variable thing",
"data":[,,,]}
The line breaks are part of what I need to match, "Some variable thing" will be of arbitrary length, and the number of commas in the "data" brackets will vary.
I could figure this out with a few hours of effort maybe, but maybe somebody here can give me a kick start.
Edit
Here is a truncated sample of the HTML file that is produced:
<!DOCTYPE html>
<html><body><div class="main-section"></div><script>var options = {
"chart":{
"id":"ChartID-qn5g8ay9",
"height":"350px",
"type":"bar",
"stacked":true},
"title":{
"text":"Fictional Books Sales"},
"legend":{
"show":true,
"position":"top"},
"plotOptions":{
"bar":{
"horizontal":true}},
"dataLabels":{
"enabled":true,
"offsetX":-6,
"style":{
"fontSize":"12px"}},
"series":[{
"name":"Tank Picture",
"data":[,,,,,,]},{
"name":"Bucket Slope",
"data":[53,32,33,52,13,43,32]}],
"xaxis":{
"categories":["2008","2009","2010","2011","2012","2013","2014"]}}
var chart = new ApexCharts(document.querySelector('#ChartID-qn5g8ay9'),
options
);
chart.render();
Note that there will be a variable number of charts contained in the file, but I am only concerned with replacing the first instance of "data":[,,,,,,]},{ that follows each instance of "series":[{ with "data":[0,,,,,,]},{, where the number of array members (commas) is variable.
CodePudding user response:
Your input mixes several data formats (HTML, JavaScript, JSON), which makes extracting and selectively modifying the embedded JSON data a challenge, so a regex solution is indeed probably simplest.
Use the regex-based -replace operator:
# Insert 0 before the first comma of the first "data" array after each "series":[{
(Get-Content -Raw file.htm) -replace `
  '(?<="series"\s*:\s*\[\{\s*"name"\s*:\s*"[^"]+",\s*"data"\s*:\s*\[),',
  '0,'
Get-Content -Raw reads the input file as a whole, as a single, multi-line string, which enables matching across lines.
A positive lookbehind assertion ((?<=...)) is used to match the text preceding the , of interest; the latter is then replaced with 0,
For robustness, \s* is inserted to match varying amounts of whitespace, if any, in places where it has no syntactic meaning in the JSON string, to guard against incidental formatting variations.
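To check the behavior without touching a real file, here is a minimal sketch that runs the same pattern against an in-memory sample. The variable names $sample, $pattern and $result and the second, made-up chart are assumptions for illustration only; the commented-out last line, which writes the result back with Set-Content, is likewise an assumption about what you want to do with the output.

```powershell
# Hypothetical sample: two charts' "series" blocks, shaped like the question's input.
$sample = @'
"series":[{
"name":"Tank Picture",
"data":[,,,,,,]},{
"name":"Bucket Slope",
"data":[53,32,33,52,13,43,32]}],
"series":[{
"name":"Second Chart",
"data":[,,]},{
"name":"Another Series",
"data":[,,]}],
'@

# Lookbehind: "series":[{ then a "name" entry, then "data":[ directly before the comma.
$pattern = '(?<="series"\s*:\s*\[\{\s*"name"\s*:\s*"[^"]+",\s*"data"\s*:\s*\[),'

# Only a comma directly after the opening bracket of the first "data" array
# in each "series" block satisfies the lookbehind.
$result = $sample -replace $pattern, '0,'
$result

# To patch the file in place (assumes file.htm as in the answer; -NoNewline needs PowerShell 5+):
# (Get-Content -Raw file.htm) -replace $pattern, '0,' | Set-Content -NoNewline file.htm
```

With this sample, only the "Tank Picture" and "Second Chart" arrays should gain a leading 0; the later arrays in each series are left alone because they are not directly preceded by "series":[{ and a "name" entry. If the file contains non-ASCII characters, you may need to pass a suitable -Encoding value to both Get-Content and Set-Content.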