I have a csv generated from a SQL output and I am trying to replace partial strings of the csv with a generic string. I have tried FART, (always makes me laugh), FINDSTR and POWERSHELL but Idont think my skills are enough and Google searching is quite hard because of the caveats I stipulate.
The txt file is like this (sample data).
course_id,user_id,role,status
2122-DAC00002,123456,sometxt,active
2122-DAC00002,13456,sometxt,active
2122-DAC00010/1,987654,sometxt,active
2122-DAC00010,55669988,sometxt,active
2122-DAC00010/2,112233,sometxt,active
2122-DAC00010,852349,sometxt,active
The headers can be ignored, the first part is the part I need changing en-masse so search for 2122-*
until the first ,
(the 2122-*
may be slightly different character lengths but will always stop at the ,
delimiter, and then replace all the first iterations of 2122-*
with 2122-GCE
.
So the final output would be :
course_id,user_id,role,status
2122-GCE,123456,sometxt,active
2122-GCE,13456,sometxt,active
2122-GCE,987654,sometxt,active
2122-GCE,55669988,sometxt,active
2122-GCE,112233,sometxt,active
2122-GCE,852349,sometxt,active
I need to automate this, so within a .bat
file, or a .ps1
would be good.
Hope this makes sense?
[EDIT /]
Apologies, missed my code attempts off.
My findstr
attempt:
findstr /V /i /R '2122-.*' '2122-GCE' "E:\path to file\file1.csv" > "E:\path to file\output3.csv"
findstr
output:
course_id,user_id,role,status
2122-GCENAC00025,123456,sometxt,active
2122-GCENAC00025,568974,sometxt,active
2122-GCENAC00025,223366,sometxt,active
2122-GCENAC00025,987654,sometxt,active
As you can see above, its prefixed and not replaced.
My FART
attempt:
E:\path to\fart "E:\path to file\file1.csv" 2122-N* 2122-GCE
E:\path to\fart "E:\path to file\output3.csv" 2122-D? 2122-GCE
My PS1
attempt was in an ISE and I closed without saving.
edit, I had a ps window still open:
((Get-Content -path E:\path to file\file1.csv -Raw) -replace '2122-*','2122-GCE') | Set-Content -Path E:\path to file\file2.csv
Some iterations of the replace command: -replace '[^2122]*'
type file1.csv | ForEach-Object { $_ -replace "2122-*", "2122-GCE" } | Set-Content file2.csv
CodePudding user response:
It looks like the first data value always exists with a non-empty value and not starting with ;
and must be always replaced by the same value and the second data column contains on all data rows always a value and for that reason there is never ,,
after first data value in a data row.
There can be used the following commented batch file under these conditions:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceFile=E:\path to file\file1.csv"
set "OutputFile=E:\path to file\file2.csv"
if not exist "%SourceFile%" goto :EOF
rem Read just the header row from source CSV file and
rem create the output CSV file with just this line.
for /F "usebackq delims=" %%I in ("%SourceFile%") do (
>"%OutputFile%" echo(%%I
goto DataRows
)
rem Process just all data rows in source file by skipping the header row
rem with splitting each line into two substrings using the comma as string
rem delimiter with first substring assigned to loop variable I and not used
rem further and everything after first series of commas assigned to the
rem next but one loop variable J according to the ASCII table. The command
rem ECHO is used to output the first fixed data value and a comma and the
rem other data values of each data row. This output is appended to just
rem created output CSV file.
:DataRows
(for /F "usebackq skip=1 tokens=1* delims=," %%I in ("%SourceFile%") do echo 2122-GCE,%%J)>>"%OutputFile%"
endlocal
For understanding the used commands and how they work, open a command prompt window, execute there the following commands, and read entirely all help pages displayed for each command very carefully.
echo /?
endlocal /?
for /?
goto /?
if /?
rem /?
set /?
setlocal /?