How to remove unwanted characters in a file using a shell script

Time:10-12

I have a file which looks like this (file.txt)

AYOsVS7Wknsgv2StRsEK JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS7yknsgv2StRsEL JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8aknsgv2StRsEM JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8hknsgv2StRsEN JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8mknsgv2StRsEO JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
RondomText JBC-ChangeService-Test N/A Coverage(89.1%)     JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]

I need to remove the unwanted brackets from the last column, so that [JamesPeter] becomes JamesPeter.

Expected output:

AYOsVS7Wknsgv2StRsEK JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS7yknsgv2StRsEL JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8aknsgv2StRsEM JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8hknsgv2StRsEN JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8mknsgv2StRsEO JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
RondomText JBC-ChangeService-Test N/A Coverage(89.1%)     JBC 97 2022-10-11-10:23 [email protected] development JamesPeter

This is what I tried

sed 's/[//; s/]//' file.txt

and this is the output I got:

AYOsVS7Wknsgv2StRsEK JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS7yknsgv2StRsEL JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8aknsgv2StRsEM JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8hknsgv2StRsEN JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8mknsgv2StRsEO JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
RondomTextJBC-ChangeService-Test N/A Coverage(89.1%)     JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]

Can someone help me figure this out? Thanks in advance!

Note: I am not allowed to use jq or a general-purpose scripting language (JavaScript, Python, etc.).

CodePudding user response:

If [ and ] appear only in the last column, you can delete every [ and ] with tr, which to my understanding does not count as a general-purpose scripting language. Let the content of file.txt be

AYOsVS7Wknsgv2StRsEK JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS7yknsgv2StRsEL JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8aknsgv2StRsEM JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8hknsgv2StRsEN JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
AYOsVS8mknsgv2StRsEO JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]
RondomText JBC-ChangeService-Test N/A Coverage(89.1%)     JBC 97 2022-10-11-10:23 [email protected] development [JamesPeter]

then

tr -d '[]' < file.txt

gives output

AYOsVS7Wknsgv2StRsEK JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS7yknsgv2StRsEL JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8aknsgv2StRsEM JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8hknsgv2StRsEN JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8mknsgv2StRsEO JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
RondomText JBC-ChangeService-Test N/A Coverage(89.1%)     JBC 97 2022-10-11-10:23 [email protected] development JamesPeter

Explanation: I feed file.txt to the standard input of tr and instruct it to delete (-d) the following characters: [ and ].
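
As a quick check of the tr approach, here is a minimal sketch on a made-up two-line sample (the sample text and the variable names are assumptions for illustration, not part of the question). It also shows the caveat mentioned above: tr deletes the brackets wherever they occur, not only in the last column.

```shell
# Hypothetical sample: the first line has brackets only in the last column,
# the second line also has them in the first column.
sample='abc 1 [JamesPeter]
[abc] 2 [JamesPeter]'

# tr -d deletes every occurrence of each listed character from its stdin.
out=$(printf '%s\n' "$sample" | tr -d '[]')

printf '%s\n' "$out"
```

Both lines come out without any brackets, including the ones around abc in the first column of the second line.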

However, if [ and ] might appear in columns other than the last one, you might use GNU AWK in the following way. Let the content of file.txt be

[HelloWorld] 1 [Name]
Hello        2 [Name]
World        3 [Name]

then

awk 'BEGIN{FS="[[:space:]]"}{gsub(/\[|\]/,"",$NF);print}' file.txt

gives output

[HelloWorld] 1 Name
Hello        2 Name
World        3 Name

Explanation: I inform GNU AWK that the field separator (FS) is a single whitespace character; then, for each line, I globally (hence gsub) replace a literal [ or (|) a literal ] with the empty string, i.e. delete it, in the last field ($NF).

(tested in gawk 4.2.1)
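
One thing to keep in mind with the command above: assigning to $NF makes awk rebuild the line with a single space between fields, so a tab separator would come out as a space. If the separators must be kept exactly as they are, a possible variant (a sketch only, not the method above) is to leave the fields alone and edit the end of the record directly. It assumes the bracketed value is the last thing on the line, optionally followed by blanks; the sample data and variable names are made up.

```shell
# Hypothetical sample: brackets in the first column, a tab and a double space as separators.
sample=$(printf '[HelloWorld] 1 [Name]\nHello\t2  [Name]')

out=$(printf '%s\n' "$sample" | awk '
{
    # Look for a trailing [ ... ] group, optionally followed by blanks.
    if (match($0, /\[[^]]*\][ \t]*$/)) {
        head = substr($0, 1, RSTART - 1)      # everything before the group
        tail = substr($0, RSTART, RLENGTH)    # the group itself
        gsub(/\[|\]/, "", tail)               # drop the brackets in the group only
        $0 = head tail
    }
    print
}')

printf '%s\n' "$out"
```

Because only $0 is assigned and no individual field is changed, the tab and the double space in the second line are printed unchanged, and [HelloWorld] in the first column keeps its brackets.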

CodePudding user response:

This sed should work to remove [ and ] from anywhere in the input:

sed -E 's/[][]//g' file

AYOsVS7Wknsgv2StRsEK JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS7yknsgv2StRsEL JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8aknsgv2StRsEM JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8hknsgv2StRsEN JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
AYOsVS8mknsgv2StRsEO JBC-ChangeService-Test    CODE_SMELL JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
RondomText JBC-ChangeService-Test N/A Coverage(89.1%)     JBC 97 2022-10-11-10:23 [email protected] development JamesPeter
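
The pattern [][] is a bracket expression that matches either ] or [; putting ] first makes it an ordinary member of the set. As a small sketch on a made-up line (the sample and variable names are assumptions), the same result can also be reached with two separate substitutions, as long as the opening bracket is escaped:

```shell
# Hypothetical sample line with brackets in two places.
line='abc [x] development [JamesPeter]'

# One bracket expression that matches either bracket character.
out1=$(printf '%s\n' "$line" | sed 's/[][]//g')

# Two substitutions; an unescaped [ would start a bracket expression,
# so it is written as \[ here. A lone ] is already literal.
out2=$(printf '%s\n' "$line" | sed 's/\[//g; s/]//g')

printf '%s\n%s\n' "$out1" "$out2"
```

Both commands print abc x development JamesPeter.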

However, if you want to remove [ and ] from the last field only, then use:

sed -E 's/\[([^]]*)\]$/\1/' file
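
To illustrate the last-field-only behaviour, here is a minimal sketch on a made-up line that also has brackets in its first column (the sample and the variable names are assumptions). Note that the $ anchor means the closing ] has to be the very last character of the line, so trailing whitespace would prevent the match.

```shell
# Hypothetical sample line: brackets in the first and in the last column.
line='[HelloWorld] 1 [Name]'

# \[ and \] match the literal brackets, ([^]]*) captures the text between them,
# and $ anchors the match to the end of the line.
out=$(printf '%s\n' "$line" | sed -E 's/\[([^]]*)\]$/\1/')

printf '%s\n' "$out"
```

This prints [HelloWorld] 1 Name, leaving the brackets in the first column untouched.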

CodePudding user response:

Why not use tr for this?

Prompt> cat file.txt | tr -d "[" | tr -d "]" >result.txt

The -d switch deletes every occurrence of the given character from the input.
