awk - Replace every Xth occurrence of anything between two strings in different files, using a line range from another file

I would like to replace every Xth occurrence of whatever lies between two strings, group_tree( and \t, in different files, using a range of lines taken from another file.

I have been working through the earlier question "Replace each 2 nth occurs from a string in separate files using line range from another file" and have reread the explanations given by the answers' authors.

But I have not succeeded. I still have difficulty with the part that uses a[int(n++/2)%2+4] or a[int(j++/2)%2+0]. I know, for example, that the 0 makes the first line of 1.txt (index 0) the first one used to replace text in 0.txt, but I am not sure what exactly the other elements do.

For example, in the line below that starts with f==4, if I make the replacement start at the last line of 1.txt (index 4, i.e. a[int(n++/2)%2+4]), the output file 4.txt does not have the expected content. It contains:

"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree()\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree()\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree()\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree()\t"

rather than:

"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"

Another difficulty is that I can only replace pairs of consecutive rows of 0.txt, 0-1.txt or 0-2.txt. I cannot, for example, have the first 3 consecutive rows of 0.txt replaced using the first line of 1.txt. I tried to do this by putting the number 3 in the part a[int(j++/2)%3+0] on the fourth line of my code below:

awk \
'FNR==1 {++f}
f==1 {a[i++]=$0}
f==2 {if (sub(/group_tree[[:space:]]*\(.*\)\\t",$/,"group_tree("a[int(j++/2)%3+0]")\\t\"")) {j++}; print > "2.txt"}
f==3 {if (sub(/group_tree[[:space:]]*\(.*\)\\t",$/,"group_tree("a[int(k++/2)%2+3]")\\t\"")) {k++}; print > "3.txt"}
f==4 {if (sub(/group_tree[[:space:]]*\(.*\)\\t",$/,"group_tree("a[int(n++/2)%2+4]")\\t\"")) {n++}; print > "4.txt"}
' \
    1.txt 0.txt 0-1.txt 0-2.txt

but this produced 2.txt below:

"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"
"car_snif = house.group_tree((car, shape)(milk, market,))\t"
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"
"car_snif = house.group_tree((car, shape)(milk, market,))\t"
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"

rather than:

"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"
"car_snif = house.group_tree((car, shape)(milk, market,))\t"
"car_snif = house.group_tree((car, shape)(milk, market,))\t"

The source files 0.txt, 0-1.txt and 0-2.txt all have the same content:

"car_snif = house.group_tree((food,hhhh))\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"car_snif = house.group_tree((foodgggggtoise,))\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",

The content of 1.txt is:

(food, apple,)(bag, tortoise,)
(sky, cat,)(sun, sea,)
(car, shape)(milk, market,)
(man, shirt)(hair, life)
(dog, big)(bal, pink)
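
To make the index arithmetic concrete, here is a minimal trace of which 1.txt index an expression of the form int(n/per)%mod+off selects for each of the 8 lines of 0.txt. It is only a sketch: the helper name trace and its parameters (step, per, mod, off) are hypothetical, and it assumes the counter may be incremented twice per line (once inside the index expression and once more in the {j++} or {n++} block), which is what the code above appears to do.

```sh
# Hypothetical helper: print the index chosen for each of 8 lines.
# $1 = increments per line, $2 = divisor, $3 = modulus, $4 = offset
trace() {
    awk -v step="$1" -v per="$2" -v mod="$3" -v off="$4" 'BEGIN {
        n = 0
        for (c = 0; c < 8; c++) {          # 8 lines, as in 0.txt
            printf "%s%d", (c ? " " : ""), int(n / per) % mod + off
            n += step                      # the counter moves by "step" per line
        }
        print ""
    }'
}

trace 2 2 2 4   # 4 5 4 5 4 5 4 5  (index 5 is past the last line of 1.txt)
trace 2 2 3 0   # 0 1 2 0 1 2 0 1  (the 2.txt actually produced)
trace 1 3 3 0   # 0 0 0 1 1 1 2 2  (three rows per 1.txt line)
trace 1 2 1 4   # 4 4 4 4 4 4 4 4  (always the last line)
```

Under these assumptions, the empty group_tree() lines would come from index 5, which does not exist in a five-line 1.txt, and groups of three would need a divisor of 3 with a single increment per line, not a modulus of 3.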

EDIT UPDATE:

Desired output:

2.txt:

"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t"
"car_snif = house.group_tree((car, shape)(milk, market,))\t"
"car_snif = house.group_tree((car, shape)(milk, market,))\t"

3.txt:

"car_snif = house.group_tree((man, shirt)(hair, life))\t"
"car_snif = house.group_tree((man, shirt)(hair, life))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((man, shirt)(hair, life))\t"
"car_snif = house.group_tree((man, shirt)(hair, life))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"

4.txt:

"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"
"car_snif = house.group_tree((dog, big)(bal, pink))\t"

CodePudding user response：

Consider the expression a[int((k++)%(x3*2)/x2)+x1]. Let N be the total number of lines in 1.txt. x1 is the index of the first line of 1.txt that I want to use in the substitution; because the array is filled starting at index 0, x1 is at least 0 and at most N-1.

x2 is the number of consecutive occurrences that should be replaced by the same line of 1.txt. In other words, x2 is how many times in a row a given line of 1.txt is used for replacements in 0.txt, 0-1.txt or 0-2.txt.

Now, say I want to use the interval made up of the line at index x1 plus the Z lines after it (Z less than or equal to N - (x1 + 1)), and that Z + 1 = U = (x3*2)/x2. In (x3*2)/x2, x2 is fixed, because it is the number of times I chose to reuse the same 1.txt line, so it comes down to choosing x3*2 such that (x3*2)/x2 = U.

At this point x3 can be found algebraically. It also appears in the derived sequence %(x3*2)/2, as in this explanation by @markp-fuso: https://stackoverflow.com/a/69867762/10824251 (which uses %6/2).
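
As a quick check of the formula, the following sketch lists the 1.txt indices that int(k%(x3*2)/x2)+x1 yields for the first few substitutions. The helper name indices and its argument order are my own, chosen only for illustration, and k is simply counted up from 0, as the k++ in the expression would do.

```sh
# Hypothetical helper: list the indices picked by int(k % (x3*2) / x2) + x1.
# $1 = x1, $2 = x2, $3 = x3, $4 = number of substitutions to list
indices() {
    awk -v x1="$1" -v x2="$2" -v x3="$3" -v count="$4" 'BEGIN {
        for (k = 0; k < count; k++)
            printf "%s%d", (k ? " " : ""), int(k % (x3 * 2) / x2) + x1
        print ""
    }'
}

indices 1 2 3 8   # 1 1 2 2 3 3 1 1
indices 2 2 3 8   # 2 2 3 3 4 4 2 2
```

With x3=3 and x2=2 this is the %6/2 case: U = 3 distinct lines, each used twice, starting at index x1.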

As long as U never exceeds N - x1 (the constraint on Z above), there will never be empty text between group_tree( and \t in the 2.txt, 3.txt and 4.txt output files, so I consider the question solved, IMHO.
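
That bound can be checked before running the real substitution. The sketch below is only an illustration: max_index and fits are hypothetical helper names, and it assumes 1.txt is stored at indices 0 to N-1, so the highest index the expression can reach, int((x3*2-1)/x2)+x1, must stay below N.

```sh
# Hypothetical helper: highest 1.txt index that int(k % (x3*2) / x2) + x1 can reach.
# $1 = x1, $2 = x2, $3 = x3
max_index() {
    awk -v x1="$1" -v x2="$2" -v x3="$3" 'BEGIN { print int((x3 * 2 - 1) / x2) + x1 }'
}

# Succeeds when every index stays inside a 1.txt of $4 lines (indices 0..N-1).
fits() {
    [ "$(max_index "$1" "$2" "$3")" -lt "$4" ]
}

fits 2 2 3 5 && echo "x1=2 fits"       # highest index is 4, the last line
fits 3 2 3 5 || echo "x1=3 overruns"   # highest index is 5, past the end
```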

If my answer is badly wrong, I am willing to correct or delete it. For now I will wait a while and not mark the question as solved: although my answer works for me, I am not a specialist, and I am waiting for a more rigorous answer from one.

awk \
'FNR==1 {++f}
f==1 {a[i++]=$0}
f==2 {if (sub(/group_tree[[:space:]]*\(.*\)\\t",$/,"group_tree("a[int((h++)%(3*2)/2)+1]")\\t\"")) ; print > "2.txt"}
f==3 {if (sub(/group_tree[[:space:]]*\(.*\)\\t",$/,"group_tree("a[int((j++)%(1*1)/1)+4]")\\t\"")) ; print > "3.txt"}
f==4 {if (sub(/group_tree[[:space:]]*\(.*\)\\t",$/,"group_tree("a[int((k++)%(3*2)/2)+2]")\\t\"")) ; print > "4.txt"}' \
    1.txt 0.txt 0-1.txt 0-2.txt

Note: I used three different counter variables here (h, j and k), as @markp-fuso also pointed out in the comments. This is necessary so that the increments (h++, j++, k++) do not overlap between files. Also note that U and x1 are interdependent: the value of one affects the other in the resulting output files (here 2.txt, 3.txt and 4.txt), and unsuitable values of U leave empty text between the two searched strings in the output.
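
To illustrate the overlap, this sketch compares the offsets int(k%6/2) seen by the second input file when it has its own counter and when it reuses the counter of the first file. The helper name second_file_offsets is hypothetical, and both files are assumed to have 8 lines, as in the question.

```sh
# Hypothetical helper: offsets int(k % 6 / 2) for the second of two 8-line files.
# $1 = "separate" (fresh counter) or "shared" (counter continues from file one)
second_file_offsets() {
    awk -v mode="$1" 'BEGIN {
        lines = 8
        k = (mode == "shared") ? lines : 0   # a shared counter already counted file one
        for (c = 0; c < lines; c++) {
            printf "%s%d", (c ? " " : ""), int(k % 6 / 2)
            k++
        }
        print ""
    }'
}

second_file_offsets separate   # 0 0 1 1 2 2 0 0
second_file_offsets shared     # 1 1 2 2 0 0 1 1
```

With a shared counter the second file starts in the middle of the cycle, so its first replacement is no longer the line at index x1.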

Explanation:

Line 4 of the code: use the 1.txt lines with indices 1 to 3 in turn, each one for 2 consecutive occurrences of whatever lies between the two searched strings.

Line 5 of the code: here the interval consists only of the line at index 4 (the last line of 1.txt), and that line is used for every occurrence of whatever lies between the two searched strings.

Line 6 of the code: use the 1.txt lines with indices 2 to 4 in turn, each one for 2 consecutive occurrences of whatever lies between the two searched strings.
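
For anyone who wants to try this without touching real files, here is a self-contained sketch that rebuilds the sample inputs in a scratch directory and runs only the f==2 rule. The function name run_demo is hypothetical, the scratch directory comes from mktemp -d, and 0.txt is simplified to eight identical lines (in the question only the text inside group_tree(...) differs). The empty if around sub() is dropped because it has no effect.

```sh
# Sketch: run the f==2 rule on sample data and print the resulting 2.txt.
run_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1

    cat > 1.txt <<'EOF'
(food, apple,)(bag, tortoise,)
(sky, cat,)(sun, sea,)
(car, shape)(milk, market,)
(man, shirt)(hair, life)
(dog, big)(bal, pink)
EOF

    # Eight identical input lines, each ending in a literal \t",
    i=0
    while [ "$i" -lt 8 ]; do
        printf '%s\n' '"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",'
        i=$((i + 1))
    done > 0.txt

    awk '
    FNR==1 {++f}                 # f counts the input files
    f==1 {a[i++]=$0}             # store 1.txt at indices 0..4
    f==2 {sub(/group_tree[[:space:]]*\(.*\)\\t",$/, "group_tree(" a[int((h++)%(3*2)/2)+1] ")\\t\""); print > "2.txt"}
    ' 1.txt 0.txt

    cat 2.txt
    cd / && rm -rf "$dir"
)

run_demo
```

The eight output lines use indices 1 1 2 2 3 3 1 1 of 1.txt, that is (sky, ...) twice, (car, ...) twice, (man, ...) twice, then (sky, ...) twice again.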