How to remove unwanted characters in a file (using a shell script)

Time:10-26

I have a file which looks like this (file.txt)

AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL","component  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL","component  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL","component  MMP-FileService  [email protected]     CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR",               MMP-FileService  [email protected]  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER",             MMP-FileService  [email protected]      CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR",               MMP-FileService  [email protected]      CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER",             MMP-FileService  [email protected]   BUG
AYODwmuBknsgv2StRqkr  MINOR",               MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR",               MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR",               MMP-component    [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR",               MMP-component    [email protected]   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR",               MMP-component    [email protected]   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR",               MMP-component    [email protected]   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR",               MMP-component    [email protected]   CODE_SMELL

I need to remove the unwanted characters ","component and ", from the 2nd column.

The expected output is:

AYOnanl3knsgv2StRr44  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL  MMP-FileService  [email protected]    CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR     MMP-FileService  [email protected]  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER   MMP-FileService  [email protected]    CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR     MMP-FileService  [email protected]    CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER   MMP-FileService  [email protected]    BUG
AYODwmuBknsgv2StRqkr  MINOR     MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR     MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR     MMP-component  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR     MMP-component  [email protected]   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR     MMP-component  [email protected]   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR     MMP-component  [email protected]   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR     MMP-component  [email protected]   CODE_SMELL

This is what I tried

cat file.txt | tr -d '",' | sed 's/component//'

The output I got:

AYOnanl3knsgv2StRr44  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL  MMP-FileService  [email protected]     CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR               MMP-FileService  [email protected]  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER             MMP-FileService  [email protected]      CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR               MMP-FileService  [email protected]      CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER             MMP-FileService  [email protected]   BUG
AYODwmuBknsgv2StRqkr  MINOR               MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR               MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR               MMP-    [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR               MMP-    [email protected]   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR               MMP-    [email protected]   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR               MMP-    [email protected]   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR               MMP-    [email protected]   CODE_SMELL

My shell command is applied to the other columns as well (in this case it also changed the 3rd column), and that is the problem I am having. Is there any way to apply the command to the 2nd column only?

Can someone help me figure this out? Thanks in advance!

Note: I am not allowed to use jq or scripting languages such as JavaScript, Python, etc.

CodePudding user response:

It can be done in a single sub:

awk '{sub(/"[^[:blank:]]*$/, "", $2)} 1' file | column -t

AYOnanl3knsgv2StRr44  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL  MMP-FileService  [email protected]    CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR     MMP-FileService  [email protected]   CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER   MMP-FileService  [email protected]    CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR     MMP-FileService  [email protected]    CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER   MMP-FileService  [email protected]    BUG
AYODwmuBknsgv2StRqkr  MINOR     MMP-FileService  [email protected]    CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR     MMP-FileService  [email protected]    CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR     MMP-component    [email protected]    CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR     MMP-component    [email protected]    CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR     MMP-component    [email protected]    CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR     MMP-component    [email protected]    CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR     MMP-component    [email protected]    CODE_SMELL

Here:

  • "[^[:blank:]]*$: matches the text from the first " to the end of the 2nd field ($2), which we replace with an empty string.
  • column -t is only there to produce tabular output; you can drop it if you don't need aligned columns.
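
To try this on a small excerpt without the full file, here is a minimal sketch. The file name sample.txt is hypothetical, and user@example.com is a placeholder for the email column. It also shows a side effect worth knowing: assigning to $2 makes awk rebuild the record, so the fields come out separated by single spaces.

```shell
# sample.txt is a hypothetical two-line excerpt; user@example.com is a placeholder
printf '%s\n' \
  'AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  user@example.com  CODE_SMELL' \
  'AYODwmuBknsgv2StRqks  MINOR",               MMP-component    user@example.com  CODE_SMELL' \
  > sample.txt

# Only field 2 is edited, so "MMP-component" in field 3 is left alone.
# Modifying $2 makes awk rebuild the record with single spaces (the default OFS).
result=$(awk '{sub(/"[^[:blank:]]*$/, "", $2)} 1' sample.txt)
printf '%s\n' "$result"
```

If the original column alignment matters, the output can be piped through column -t as in the command above.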

CodePudding user response:

Consider this approach, if you are sure the characters you wish to remove appear in the second column only (borrowed and adapted from here https://unix.stackexchange.com/questions/492500/awk-replace-one-character-only-in-a-certain-column)

awk '{{gsub("\",(\"component)?","", $2)}} 1' file.txt
  • gsub("\",(\"component)?","", $2): for each input line, replace every match of ",("component)? in the 2nd field with an empty string. The regular expression says: find ", then optionally the part in parentheses, "component. ? is the operator for optional
  • 1 is an awk idiom to print contents of $0 (which contains the input record)
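
Because the edit to $2 makes awk rebuild the record with OFS, the output separator can be chosen freely. The following is a sketch only: the sample line is hypothetical, user@example.com is a placeholder for the email column, and the tab separator is an illustrative choice rather than something the question asks for. It uses a regex literal in place of the quoted string form.

```shell
# Hypothetical one-line sample; user@example.com is a placeholder for the email column
line='AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  user@example.com  CODE_SMELL'

# The regex literal /",("component)?/ matches the same text as the string form.
# -v OFS='\t' makes the rebuilt record tab-separated instead of space-separated.
cleaned=$(printf '%s\n' "$line" | awk -v OFS='\t' '{gsub(/",("component)?/, "", $2)} 1')
printf '%s\n' "$cleaned"
```

A regex literal avoids the backslash-escaped quotes that the string form needs.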

CodePudding user response:

1st solution: Assuming that your string ","component always appears in the same place, please try the following awk code. It also preserves the spacing between the fields, as per the shown samples only.

awk '
match($0,/","*[^[:space:]]*/){
  print substr($0,1,RSTART-1) sprintf("%-"(RLENGTH) "s",OFS) substr($0,RSTART+RLENGTH)
  next
}
1
'  Input_file
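
The padding idea can be checked in isolation: the matched text is replaced by a run of spaces of the same length, so every later column keeps its position. This is a minimal sketch with a hypothetical one-line sample (user@example.com is a placeholder for the email column) and a slightly simplified regex.

```shell
# Hypothetical one-line sample; user@example.com is a placeholder for the email column
line='AYOnanmeknsgv2StRr48  MAJOR",               MMP-FileService  user@example.com  CODE_SMELL'

# Replace the matched junk with the same number of spaces so later columns stay put.
padded=$(printf '%s\n' "$line" | awk '
match($0, /"[^[:space:]]*/) {
  # sprintf("%-Ns", " ") left-justifies one space in a field of width N, giving N spaces
  print substr($0, 1, RSTART-1) sprintf("%-" RLENGTH "s", " ") substr($0, RSTART+RLENGTH)
  next
}
1')
printf '%s\n' "$padded"
```

The output line has the same length as the input line, with the quote and comma turned into spaces.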

2nd solution: With GNU awk, using the match function with a regex that assumes your string appears in the 2nd field only, as per the shown samples. This also takes care of the spaces in Input_file and preserves them in the output.

awk '
match($0,/^([^[:space:]]+[[:space:]]+)([^"]*)(","*[^[:space:]]*)(.*$)/,arr){
  print arr[1] arr[2] sprintf("%-"length(arr[3]) "s",OFS) arr[4]
  next
}
1
'  Input_file

CodePudding user response:

You might use GNU sed for this task in the following way. Let the content of file.txt be

AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL","component  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL","component  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL","component  MMP-FileService  [email protected]     CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR",               MMP-FileService  [email protected]  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER",             MMP-FileService  [email protected]      CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR",               MMP-FileService  [email protected]      CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER",             MMP-FileService  [email protected]   BUG
AYODwmuBknsgv2StRqkr  MINOR",               MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR",               MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR",               MMP-component    [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR",               MMP-component    [email protected]   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR",               MMP-component    [email protected]   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR",               MMP-component    [email protected]   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR",               MMP-component    [email protected]   CODE_SMELL

then

sed 's/"[^[:space:]]*//' file.txt

gives output

AYOnanl3knsgv2StRr44  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL  MMP-FileService  [email protected]  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL  MMP-FileService  [email protected]     CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR               MMP-FileService  [email protected]  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER             MMP-FileService  [email protected]      CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR               MMP-FileService  [email protected]      CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER             MMP-FileService  [email protected]   BUG
AYODwmuBknsgv2StRqkr  MINOR               MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR               MMP-FileService  [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR               MMP-component    [email protected]   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR               MMP-component    [email protected]   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR               MMP-component    [email protected]   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR               MMP-component    [email protected]   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR               MMP-component    [email protected]   CODE_SMELL

Explanation: replace the first " and all non-whitespace characters that follow it with an empty string, i.e. delete them. Assumption: " appears only in the 2nd column and you do not need to keep the columns aligned.
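
If that assumption might not hold, the substitution can be anchored to the 2nd column. This is a sketch rather than part of the tested answer: it assumes a sed that accepts -E for extended regular expressions, and the sample lines and file name sample_sed.txt are hypothetical.

```shell
# Hypothetical lines; the second has quotes only in column 3, which must stay untouched
printf '%s\n' \
  'AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  CODE_SMELL' \
  'AYODwmuBknsgv2StRqks  MINOR  MMP-"component"  CODE_SMELL' \
  > sample_sed.txt

# Group 1 = first field, the whitespace after it, and the quote-free start of field 2.
# The first quote in field 2 and everything up to the next whitespace are dropped.
out=$(sed -E 's/^([^[:space:]]+[[:space:]]+[^"[:space:]]*)"[^[:space:]]*/\1/' sample_sed.txt)
printf '%s\n' "$out"
```

The first line loses ","component from its 2nd column, while the second line passes through unchanged because its 2nd column contains no quote.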

(tested in GNU sed 4.2.2)
