extracting the block of data separated by symbols

Time:06-03

I have a file that contains data as follows:

10.00      4.85   2.80  
16.00      6.25   3.61  
22.00      6.40   3.70  
25.00      6.80   3.79
>
100.00      4.85   2.80  
160.00      6.25   3.61  
220.00      6.40   3.70  
250.00      6.80   3.79
>
100.00      4.85   2.88
160.00      6.25   3.68  
220.00      6.40   3.78  
250.00      6.80   3.78
>

I want to read/print the portion of the data separated by > symbols in a for loop.

I tried the code

while read block
do
    ccc=sed 's/.*[>] * //' $block
    echo $ccc
done < filex

Can anybody suggest a solution? Thanks in advance.

CodePudding user response:

You can do that easily with awk, incrementing the value i each time you encounter a greater-than symbol at the start of the line, and writing other lines to a file named blk- followed by the value of i:

awk 'BEGIN{i=0} /^>/{i++; next} {print > ("blk-" i)}' YOURFILE
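
To get from the split files to the loop the question asks about, something like the following could work. It is only a sketch: the temporary directory, the file name input.txt, the dir variable passed to awk and the shortened sample data are all assumptions made for illustration.

```shell
# Hypothetical sample data; workdir and input.txt are illustrative names.
workdir=$(mktemp -d)
printf '%s\n' '10.00 4.85 2.80' '16.00 6.25 3.61' '>' \
              '100.00 4.85 2.80' '160.00 6.25 3.61' '>' > "$workdir/input.txt"

# Split: every line starting with > closes the current block.
awk -v dir="$workdir" 'BEGIN{i=0} /^>/{i++; next} {print > (dir "/blk-" i)}' \
    "$workdir/input.txt"

# Visit each block file in turn and report how many lines it holds.
summary=$(
    for f in "$workdir"/blk-*; do
        lines=$(wc -l < "$f")
        printf '%s has %d lines\n' "${f##*/}" "$lines"
    done
)
echo "$summary"
rm -rf "$workdir"
```

Note that the glob blk-* sorts names as text, so with ten or more blocks blk-10 would come before blk-2.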

CodePudding user response:

I do not understand exactly what you are trying to do. I see multiple portions of data separated by >, so I don't know what you meant by "the portion of the data separated by > symbols".

But let's go through your script and see why it isn't working. Hope you get some inspiration from that.

while read block
do
    ...
done < filex

will loop over your file. The first iteration, block will contain

10.00      4.85   2.80  

The second iteration, block will contain

16.00      6.25   3.61  

et cetera.
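
A small sketch makes the line-by-line behaviour visible. The sample file and the counter i are assumptions for illustration; IFS= and -r are added here so that read keeps each line exactly as it is in the file, whereas a plain read block would trim leading and trailing whitespace.

```shell
# Hypothetical three-line sample in a temporary file.
sample=$(mktemp)
printf '%s\n' '10.00      4.85   2.80' '16.00      6.25   3.61' '>' > "$sample"

i=0
while IFS= read -r block; do
    i=$((i + 1))
    # One iteration per line: block never holds more than a single line.
    printf 'iteration %d: [%s]\n' "$i" "$block"
done < "$sample"
rm -f "$sample"
```

The > line is just another iteration; nothing in the loop treats it as a separator.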

Then, you write:

    ccc=sed 's/.*[>] * //' $block

In bash, this means: execute the program s/.*[>] * // with the environment variable ccc set to sed and the contents of block as arguments (three arguments, 10.00, 4.85 and 2.80, in the first iteration). That is not what you want. You will get the error message bash: s/.*[>] * //: No such file or directory because bash cannot find that file.
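
The name=value command form can be seen in isolation with a harmless command. In this sketch the variable name greeting and the use of sh -c as the child program are assumptions chosen for illustration.

```shell
# "name=value command" sets name only in the environment of that one command.
unset greeting
out=$(greeting=hello sh -c 'printf "%s" "$greeting"')

echo "child saw: $out"                     # child saw: hello
echo "parent sees: ${greeting:-<unset>}"   # parent sees: <unset>
```

The assignment is visible to the child process but never reaches the calling shell, which is why ccc stays empty in the original script.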

To get the output of a command into a variable, use

ccc=$(sed .....)

sed either takes STDIN as input or, if specified, the files on the command line. So,

sed 's/.*[>] * //' $block

will make sed look (first iteration of the loop again) for the files 10.00, 4.85 and 2.80. sed will not use the line in $block as input. If you want sed to use $block as input, you should provide it as STDIN:

echo "$block" | sed '...'

or

ccc=$(echo "$block" | sed '...')
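
Filled in with the expression from the question, this might look as follows. The input line 'foo >   bar' and the variable name unchanged are made up for illustration, and printf is used instead of echo only because it prints its argument without interpreting it.

```shell
# Hypothetical input line containing a > so the substitution has something to match.
block='foo >   bar'
ccc=$(printf '%s\n' "$block" | sed 's/.*[>] * //')
echo "$ccc"    # bar

# A plain data line has no >, so sed passes it through unchanged.
block='10.00      4.85   2.80'
unchanged=$(printf '%s\n' "$block" | sed 's/.*[>] * //')
echo "$unchanged"
```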

And finally some comments on the pattern.

sed applies 's/.*[>] * //' to each line of the input separately, not to some arbitrary block of lines. The pattern means:

/.*[>] * /
  | | | | 
  | | | \- and a single space
  | | \--- zero or more spaces
  | \----- the character > The [ and ] are normally used to give a list of
  |        possible characters.
  \------- zero or more characters

So, in any line containing zero or more characters followed by a >, zero or more spaces and one more space, that part of the line is deleted. Even though I do not understand exactly what you want, that probably is not it.
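
If the goal is to handle each group of lines between > markers as one unit, one possible approach is to collect lines in the shell until a > line shows up. This is a sketch under that assumption; read_blocks and process_block are hypothetical names, and process_block only prints the block where real processing would go.

```shell
nl='
'   # a single newline character, used to join the lines of a block

process_block() {   # hypothetical helper: $1 = block number, $2 = block text
    printf 'block %d:\n%s\n' "$1" "$2"
}

read_blocks() {     # $1 = file to read
    n=0
    block=''
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '>'*)   # separator: hand over the collected block, start a new one
                process_block "$n" "$block"
                n=$((n + 1))
                block=''
                ;;
            *)      # data line: append to the current block
                block="${block:+$block$nl}$line"
                ;;
        esac
    done < "$1"
    # A final block without a closing > is still processed.
    if [ -n "$block" ]; then
        process_block "$n" "$block"
    fi
}

# Usage with a small hypothetical sample file.
sample=$(mktemp)
printf '%s\n' '10.00 4.85 2.80' '16.00 6.25 3.61' '>' '100.00 4.85 2.80' '>' > "$sample"
result=$(read_blocks "$sample")
echo "$result"
rm -f "$sample"
```

Collecting the block in a variable keeps everything in one loop; for large files, the awk split into separate files is likely the better fit.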
