On a Linux server, I have CSV files containing thousands of rows in the format below:
0,20221208195546466,9,200,Above as:2|RAN34f2fb:HAER:0|RAND8365b2bca763:FON:0|RANDa7a5f964900b:ION:0|
I need output from all the files in the format below: the 2nd field (i.e. 20221208195546466) and, from the 5th field, the value after "Above as:" and before the first | (i.e. 2 in the example above).
output :
20221208195546466 , 2
Can anyone help me with a Linux command?
Edit :
My attempt:
I tried the following, but it only gives the value from the 5th field. How can I add field 2 as well?
cat *.csv | cut -d, -f5 | cut -d'|' -f1 | cut -d':' -f2
CodePudding user response:
You can use awk for this:
awk -F',' '{gsub(/Above as:/,""); gsub(/\|.*/, ""); print($2, $5)}'
You will probably need to adapt the regexps a bit.
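One way the regexps could be adapted is sketched below. It assumes the label before the value can be any text ending at the first ":" in field 5, and it edits a copy of field 5 only, so the rest of the record is left alone. The function name extract_fields is made up for illustration.

```shell
# Hypothetical helper: reads CSV records from the named files (or stdin)
# and prints field 2 plus the value between the first ":" and the first "|"
# of field 5, separated by a space.
extract_fields() {
    awk -F',' '{
        val = $5
        sub(/^[^:]*:/, "", val)   # drop the label up to and including the first ":"
        sub(/\|.*/, "", val)      # drop everything from the first "|" onward
        print $2, val
    }' "$@"
}
```

It can be called as extract_fields *.csv, or fed a single record on stdin.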
CodePudding user response:
GNU awk allows multiple field separators to be set, allowing you to delimit each record at , , | , and : at the same time. Thus, the following will fish out the required fields from file.csv:
awk -F'[,|:]' '{print $2", "$6}' file.csv
Tested on the single record example:
echo "0,20221208195546466,9,200,Above as:2|RAN34f2fb:HAER:0|RAND8365b2bca763:FON:0|RANDa7a5f964900b:ION:0|" | awk -F'[,|:]' '{print $2", "$6}'
output:
20221208195546466, 2
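To see why the value lands in field 6 with this separator set, the fields of a record can be listed with their index numbers. This is a small sketch; the function name show_fields is made up for illustration. Note that the index would shift if any of the first four fields contained a : or |.

```shell
# Hypothetical helper: prints every field of each stdin record, one per line,
# prefixed with its index, using "," "|" and ":" as field separators.
show_fields() {
    awk -F'[,|:]' '{ for (i = 1; i <= NF; i++) print i ": " $i }'
}
```

Piping the single record example through show_fields lists "Above as" as field 5 and 2 as field 6.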
CodePudding user response:
Assumptions:
- starting string of the 5th comma-delimited field can vary from line to line (ie, not known beforehand)
- the item of interest in the 5th comma-delimited field occurs between the first : and the first |
Sample data:
$ cat test.csv
0,20221208195546466,9,200,Above as:2|RAN34f2fb:HAER:0|RAND8365b2bca763:FON:0|RANDa7a5f964900b:ION:0|
1,20230124123456789,10,1730,Total ts:7|stuff:HAER:0|morestuff:FON:0|yetmorestuff:ION:0|
One awk approach:
awk '
BEGIN { FS=OFS="," }          # define input/output field delimiter as ","
      { split($5,a,"[:|]")    # split 5th field on dual delimiters ":" and "|", store results in array a[]
        print $2,a[2]         # print desired items to stdout
      }
' test.csv
This generates:
20221208195546466,2
20230124123456789,7
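Since the question asks for all files and for " , " between the two values, the same split() idea can be wrapped as below. This is a sketch under the same assumptions as above; the function name extract_all and the output file name result.txt are made up for illustration.

```shell
# Hypothetical helper: processes every file given as an argument and prints
# "field2 , value" per record, where value sits between the first ":" and
# the first "|" of field 5.
extract_all() {
    awk '
    BEGIN { FS = ","; OFS = " , " }   # read comma-delimited input, write " , " between output items
          { split($5, a, "[:|]")      # a[1] is the label, a[2] the value of interest
            print $2, a[2]
          }
    ' "$@"
}
```

Usage might look like extract_all *.csv > result.txt. If result.txt is given a .csv extension, it should be written to a different directory so that a later run does not read it as input.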