I have a CSV file (comma-separated) which contains:
file1a.extension.extension,file1b.extension.extension
file2a.extension.extension,file2b.extension.extension
The problem is that these files are named like file.extension.extension.
I'm trying to feed both columns to parallel while removing all extensions.
I tried some variations of:
cat /home/filepairs.csv | sed 's/\..*//' | parallel --colsep ',' echo column 1 = {1}.extension.extension column 2 = {2}
Which I expected to output
column 1 = file1a.extension.extension column 2 = file1b
column 1 = file2a.extension.extension column 2 = file2b
But it outputs:
column 1 = file1a.extension.extension column 2 =
column 1 = file2a.extension.extension column 2 =
The sed command is working, but it is feeding only column 1 to parallel.
CodePudding user response:
As currently written, the sed command prints only one name per line:
$ sed 's/\..*//' filepairs.csv
file1a
file2a
Where:
\. matches the first literal period (.)
.* matches the rest of the line (i.e., everything after the first literal period to the end of the line)
// says to remove everything from the first literal period to the end of the line
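To see why {2} comes out empty, it may help to count the comma-separated fields that survive that substitution. A minimal sketch, assuming a single sample line copied from the question (the variable names are only for illustration):

```shell
# Sample line taken from the question's CSV.
line='file1a.extension.extension,file1b.extension.extension'

# Apply the original substitution: delete from the first period to end of line.
stripped=$(printf '%s\n' "$line" | sed 's/\..*//')

# Count how many comma-separated fields are left.
fields=$(printf '%s\n' "$stripped" | awk -F, '{ print NF }')

echo "after sed: $stripped (comma-separated fields: $fields)"
```

Only one field is left, so there is nothing for a second column placeholder to pick up.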
I'm guessing what you really want is two names per line ... one sed idea:
$ sed 's/\.[^,]*//g' filepairs.csv
file1a,file1b
file2a,file2b
Where:
\. matches a literal period (.)
[^,]* matches everything up to a comma (or end of line)
//g says to remove the literal period and everything after it (up to a comma or end of line); the g says to do it repeatedly (in this case the replacement occurs twice per line)
NOTE: I don't have parallel on my system, so I'm unable to test that portion of OP's code.
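As a way to check the column splitting without parallel, here is a rough sketch that feeds the sed output to a plain shell loop instead. It assumes a throwaway copy of the sample data in a temp file (the names csv, col1, col2 and result are made up for this example), and the loop only stands in for the parallel step; it is not a replacement for it:

```shell
# Hypothetical sample data matching the layout shown in the question.
csv=$(mktemp)
printf '%s\n' \
  'file1a.extension.extension,file1b.extension.extension' \
  'file2a.extension.extension,file2b.extension.extension' > "$csv"

# Strip each ".something" run up to the next comma or end of line,
# then split the two remaining names on the comma.
result=$(sed 's/\.[^,]*//g' "$csv" |
  while IFS=, read -r col1 col2; do
    echo "column 1 = ${col1}.extension.extension column 2 = ${col2}"
  done)

echo "$result"
rm -f "$csv"
```

Both columns survive the substitution here, so the same sed output piped into the OP's parallel --colsep ',' command should have a second field to work with.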