I have written a shell script that generates 1 million records with 32 columns each.
After generating them, I need to remove certain characters from particular columns. I have tried a lot (using sed/awk/cut) but had no luck.
The requirements are:
Remove the dashes, colons and spaces from the 4th column (a date and time), then remove the first 2 characters of the result.
E.g. if the 4th column is 2021-08-27 18:13:48, the output should be 210827181348
Remove the dashes, colons and spaces from the 13th column (also a date and time), then remove the first 2 characters of the result and keep only the next 6 characters.
E.g. if the 13th column is 2021-08-27 18:13:48, the output should be 210827
The current output file, as created by the shell script:
2a81929d-9c82-43a9-a9aa-a31d949021a2,34161FA8203288AC044417A0,200,2021-08-27 18:13:48,481629102590229166400P,607529,720424,11,11,C,INR,F,2021-08-27 18:13:48,1,,11,C,INR,,,11,,,,,,,,DEBIT,29611,E280110520007119502E0993,5
cee6441f-8a47-457e-b342-ccb88e1306d2,34161FA8203288AC04315040,200,2021-08-27 18:15:38,351629102590180770866P,607529,720424,11,11,C,INR,F,2021-08-27 18:15:38,1,,11,C,INR,,,11,,,,,,,,DEBIT,86697,E280110520007119502E0993,5
72e32512-c9d3-4b89-9d76-6ec0c1a092fe,34161FA8203288AC04DEE7A0,200,2021-08-27 18:16:28,391629102590109095820P,607529,720424,11,11,C,INR,F,2021-08-27 18:16:28,1,,11,C,INR,,,11,,,,,,,,DEBIT,48977,E280110520007119502E0993,5
But I want the whole file to be in this format:
2a81929d-9c82-43a9-a9aa-a31d949021a2,34161FA8203288AC044417A0,200,210827181348,481629102590229166400P,607529,720424,11,11,C,INR,F,210827,1,,11,C,INR,,,11,,,,,,,,DEBIT,29611,E280110520007119502E0993,5
CodePudding user response:
You can do it using awk like this:
awk 'BEGIN { FS=OFS="," } { gsub(/[- :]/,"",$4); $4=substr($4,3); gsub(/[- :]/,"",$13); $13=substr($13,3,6); print }' file
It strips the dashes, colons and spaces from fields 4 and 13, drops the first two characters of field 4, keeps only characters 3 to 8 of field 13 (the YYMMDD part), and prints every line with the remaining fields unchanged.
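To check the approach against the first sample row from the question, here is a minimal sketch. The function name fix_dates is hypothetical and only wraps the awk program for readability. Note that it also drops the first two characters of field 4 with substr, as the requirement asks.

```shell
#!/bin/sh
# fix_dates: hypothetical helper name. Reads CSV on stdin, writes fixed CSV to stdout.
fix_dates() {
  awk 'BEGIN { FS = OFS = "," }
  {
    gsub(/[- :]/, "", $4)      # 2021-08-27 18:13:48 -> 20210827181348
    $4 = substr($4, 3)         # drop the first two characters -> 210827181348
    gsub(/[- :]/, "", $13)     # same cleanup for field 13
    $13 = substr($13, 3, 6)    # keep characters 3 to 8 only -> 210827
    print
  }'
}

# First sample row from the question.
sample='2a81929d-9c82-43a9-a9aa-a31d949021a2,34161FA8203288AC044417A0,200,2021-08-27 18:13:48,481629102590229166400P,607529,720424,11,11,C,INR,F,2021-08-27 18:13:48,1,,11,C,INR,,,11,,,,,,,,DEBIT,29611,E280110520007119502E0993,5'

printf '%s\n' "$sample" | fix_dates
```

For the real file, something like fix_dates < file > file.new (file.new is a placeholder name) writes the result to a separate file, so the original can be replaced only after the output has been checked.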