Is it possible to replace the value of a cell in a CSV file using grep, sed, or both?

Time:11-24

I have written the following command

#!/bin/bash
awk -v value="$newvalue" -v row="$rownum" -v col=1 'BEGIN{FS=OFS=","} NR==row {$col=value}1' "${file}.csv" > temp.csv && mv temp.csv "${file}.csv"

Sample Input of file.csv

Header,1
Field1,Field2,Field3
1,ABC,4567
2,XYZ,7890

Assuming newvalue=3, rownum=4, and col=1, the command above replaces the first field of line 4 and produces the output below.

Required Output

Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890

So if I know the row and column, is it possible to replace that value using grep, sed, or both?

Edit 1: Field3 always has a unique value in each row (in case that info helps).

CodePudding user response:

Assuming your CSV file is as simple as what you show (no commas in quoted fields), and your newvalue does not contain characters that sed would interpret in a special way (e.g. newlines, ampersands, slashes or backslashes), the following should work with just sed (tested with GNU sed):

sed -Ei "$rownum s/[^,]*/$newvalue/$col" file.csv

$rownum is used as the address (here the line number) where to apply the following command. s is the sed substitute command. [^,]* is the regular expression to search for and replace: the longest possible string not containing a comma. $newvalue is the replacement string. $col is the occurrence to replace.
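
The restriction on special characters can be relaxed by escaping the replacement text before it reaches sed. The following is only a sketch under a few assumptions: the value contains no newlines, the s command uses / as its delimiter, and the helper name escape_sed_replacement and the demo file are made up for illustration.

```shell
# Hypothetical helper (not part of the answer above): backslash-escape the
# characters that are special in the replacement part of s/.../.../,
# namely the backslash, the ampersand, and the / delimiter.
escape_sed_replacement() {
    printf '%s' "$1" | sed 's:[\\&/]:\\&:g'
}

# Demo data matching the sample file from the question.
demo=$(mktemp)
printf '%s\n' 'Header,1' 'Field1,Field2,Field3' '1,ABC,4567' '2,XYZ,7890' > "$demo"

newvalue='R&D/3'
rownum=4
col=1

# Print the edited text to stdout (no -i), so the demo file stays untouched.
sed "${rownum}s/[^,]*/$(escape_sed_replacement "$newvalue")/${col}" "$demo"
rm -f "$demo"
```

With this input the last line should come out as R&D/3,XYZ,7890, with the ampersand and slash taken literally.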

CodePudding user response:

With sed, how about:

#!/bin/bash

newvalue=3
rownum=4
col=1

sed -i -E "${rownum} s/(([^,]+,){$((col-1))})[^,]+/\\1${newvalue}/" file.csv

Result of file.csv

Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890
  • ${rownum} matches the line number.
  • (([^,]+,){n}) matches n repetitions of the group "one or more non-comma characters followed by a comma". With n set to col - 1, this group captures everything before the target column. The [^,]+ that follows matches the target field, which is replaced by \1 (the captured prefix) followed by the new value.
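
The same idea can be wrapped in a small function. This is a sketch, not the answerer's code: the name set_cell is hypothetical, GNU sed is assumed (for -i and -E), and the added ^ anchor is my own choice so that a row with empty leading fields is left alone instead of having the wrong column changed.

```shell
# Hypothetical wrapper: set_cell FILE ROW COL VALUE
# Assumes GNU sed, no quoted commas, and a VALUE free of sed-special characters.
set_cell() {
    # Reject a row or column that is not a positive integer.
    for n in "$2" "$3"; do
        case $n in ''|*[!0-9]*|0*) return 1 ;; esac
    done
    # Group 1 captures the (COL-1) leading "field," chunks; the next field is replaced.
    sed -i -E "$2 s/^(([^,]+,){$(( $3 - 1 ))})[^,]+/\\1$4/" "$1"
}

demo=$(mktemp)
printf '%s\n' 'Header,1' 'Field1,Field2,Field3' '1,ABC,4567' '2,XYZ,7890' > "$demo"
set_cell "$demo" 4 1 3      # row 4, column 1 becomes 3
set_cell "$demo" 3 2 DEF    # row 3, column 2 becomes DEF
cat "$demo"
rm -f "$demo"
```

Validating the row and column first keeps arbitrary text out of the sed script, which matters because both are interpolated into it unquoted.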

CodePudding user response:

Let's try a few sed commands on CSV data.

Let us consider a sample CSV file with the following content:

$ cat file

Solaris,25,11
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,12,5
  1. To remove the 1st field or column:
$ sed 's/[^,]*,//' file

25,11
31,2
21,3
45,4
12,5

This regular expression matches a sequence of non-comma characters ([^,]*) together with the comma that follows it and deletes both, which removes the 1st field.

  2. To print only the last field, that is, to remove all fields except the last one:
$ sed 's/.*,//' file

11
2
3
4
5

This regex removes everything up to and including the last comma (.*,), which deletes all fields except the last one.

  3. To print only the 1st field:
$ sed 's/,.*//' file

Solaris
Ubuntu
Fedora
LinuxMint
RedHat

This regex (,.*) removes everything from the 1st comma to the end of the line, which deletes all fields except the first one.

  4. To delete the 2nd field:
$ sed 's/,[^,]*,/,/' file

Solaris,11
Ubuntu,2
Fedora,3
LinuxMint,4
RedHat,5

The regex (,[^,]*,) matches a comma, a sequence of non-comma characters, and the following comma, which is the 2nd column with its delimiters. Replacing the match with a single comma deletes the 2nd column.

Note: Deleting fields in the middle gets tougher in sed, since every preceding field has to be matched explicitly.

  5. To print only the 2nd field:
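
One way to soften that difficulty, offered as a sketch rather than as part of the original answer, is the numeric flag of the s command: each ",field" pair is one match of ,[^,]*, so deleting the (n-1)th match removes field n. The function name delete_field is hypothetical, and the trick only applies for n >= 2 and for data without quoted commas.

```shell
# Hypothetical helper: delete field N (N >= 2) from comma-separated lines on stdin.
# ",field" pairs are counted from the left, so match number N-1 is field N.
delete_field() {
    sed "s/,[^,]*//$(( $1 - 1 ))"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | delete_field 3
```

For the two sample lines this should print Solaris,25 and Ubuntu,31.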
$ sed 's/[^,]*,\([^,]*\).*/\1/' file

25
31
21
45
12

The regex matches the first field, the second field, and the rest of the line, but groups only the 2nd field. The whole line is then replaced with the 2nd field (\1), so only the 2nd field is displayed.

  6. To print only lines in which the last column is a single-digit number:
$ sed -n '/.*,[0-9]$/p' file

Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,12,5

The regex (,[0-9]$) checks for a single digit in the last field and the p command prints the line which matches this condition.
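
Since the question also asks about grep: a pure line filter like this one is the kind of job grep handles directly, because nothing is being edited. A minimal sketch, with a made-up function name:

```shell
# Hypothetical helper: keep only the lines whose last field is a single digit.
# Reads the named files, or stdin when no file is given.
last_field_single_digit() {
    grep ',[0-9]$' "$@"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' 'Fedora,21,3' | last_field_single_digit
```

grep can select lines but cannot change them, so any replacement still has to go through sed or awk.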

  7. To number all lines in the file:
$ sed = file | sed 'N;s/\n/ /'

1 Solaris,25,11
2 Ubuntu,31,2
3 Fedora,21,3
4 LinuxMint,45,4
5 RedHat,12,5

This simulates the cat -n command (awk does it easily with the special variable NR). The '=' command of sed prints the number of every line on a line of its own, followed by the line itself. That output is piped to a second sed command, which joins every pair of lines with a space.

  8. To replace the last field with 99 if the 1st field is 'Ubuntu':
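
For comparison, the awk version mentioned above might look like the following sketch (the function name number_lines is only for illustration):

```shell
# Hypothetical helper: prefix every line with its line number, like the sed pipeline.
# NR is awk's running record (line) counter; $0 is the whole line.
number_lines() {
    awk '{ print NR, $0 }' "$@"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | number_lines
```

Unlike cat -n, neither this nor the sed pipeline pads the number; both separate it from the line with a single space.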
$ sed 's/\(Ubuntu\)\(,.*,\).*/\1\299/' file

Solaris,25,11
Ubuntu,31,99
Fedora,21,3
LinuxMint,45,4
RedHat,12,5

This regex matches 'Ubuntu', then everything up to and including the last comma, then the last field; the first two parts are captured as groups. The replacement writes back the 1st and 2nd groups followed by the new value 99.

  9. To delete the 2nd field if the 1st field is 'RedHat':
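
The pattern in this example is not anchored, so it would also fire on a line where 'Ubuntu' appears in a later field. A possible variant, sketched here under the assumption that the key and the value contain no regex or sed special characters, uses an anchored address to pick the lines and a separate substitution for the last field. The function name is made up.

```shell
# Hypothetical helper: set_last_field_where_first_is KEY VALUE
# /^KEY,/ selects lines whose 1st field is exactly KEY; [^,]*$ is the last field.
set_last_field_where_first_is() {
    sed "/^$1,/ s/[^,]*\$/$2/"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' 'Mint,Ubuntu,7,8' |
    set_last_field_where_first_is Ubuntu 99
```

Only the second line should change (to Ubuntu,31,99); the third keeps its last field because its first field is 'Mint'.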
$ sed 's/\(RedHat,\)[^,]*\(.*\)/\1\2/' file

Solaris,25,11
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,,5

The 1st field 'RedHat' (with its comma) and the remainder of the line after the 2nd field are grouped, and the replacement keeps only these two groups, which empties the 2nd field. The delimiter stays, so the line still has three fields.

  10. To insert a new column at the end (last column):
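
If the goal is to drop the column entirely rather than leave it empty, the comma in front of the 2nd field can be consumed as well. A sketch, assuming the key contains no regex special characters; the function name is hypothetical:

```shell
# Hypothetical helper: drop_second_field_where_first_is KEY
# Matches KEY at line start plus ",field2" and keeps only KEY (group 1).
drop_second_field_where_first_is() {
    sed "s/^\($1\),[^,]*/\1/"
}

printf '%s\n' 'Ubuntu,31,2' 'RedHat,12,5' | drop_second_field_where_first_is RedHat
```

Here the RedHat line should become RedHat,5, with two fields instead of three.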
$ sed 's/.*/&,A/' file

Solaris,25,11,A
Ubuntu,31,2,A
Fedora,21,3,A
LinuxMint,45,4,A
RedHat,12,5,A

The regex (.*) matches the entire line, which is replaced with the line itself (&) followed by the new field.

  11. To insert a new column at the beginning (1st column):
$ sed 's/.*/A,&/' file

A,Solaris,25,11
A,Ubuntu,31,2
A,Fedora,21,3
A,LinuxMint,45,4
A,RedHat,12,5

Same as the previous example, except that the new column precedes the matched line.
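
Inserting a column in the middle can be sketched with the numeric flag of the s command: replacing the k-th comma with ",A," places the new field right after field k. The function name insert_after_field is made up, and the value is assumed to contain no sed special characters.

```shell
# Hypothetical helper: insert_after_field K VALUE
# Replaces the K-th comma on each line with ",VALUE,".
insert_after_field() {
    sed "s/,/,$2,/$1"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | insert_after_field 1 A
```

With k=1 the sample lines should become Solaris,A,25,11 and Ubuntu,A,31,2.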

I hope this will help. Let me know if you need to use Awk or any other command. Thank you
