I need to split a big *.csv file into several smaller ones. It currently has 661497 rows, and each output file should contain at most 40000. I tried a solution I found on GitHub, but without success:
FILENAME=/home/cnf/domains/cnf.com.pl/public_html/sklep/dropshipping-pliki/products-files/my_file.csv
HDR=$(head -1 ${FILENAME})
split -l 40000 ${FILENAME} xyz
n=1
for f in xyz*
do
if [[ ${n} -ne 1 ]]; then
echo ${HDR} > part-${n}-${FILENAME}.csv
fi
cat ${f} >> part-${n}-${FILENAME}.csv
rm ${f}
((n++))
done
The error I get:
/home/cnf/domains/cnf.com.pl/public_html/sklep/dropshipping-pliki/download.sh: line 23: part-1-/home/cnf/domains/cnf.com.pl/public_html/sklep/dropshipping-pliki/products-files/my_file.csv.csv: No such file or directory
Thanks for the help!
CodePudding user response:
Keep in mind that FILENAME contains both a directory and a file name, so later in the script, when you build the new filename, you get something like:
part-1-/home/cnf/domains/cnf.com.pl/public_html/sklep/dropshipping-pliki/products-files/tyre_8.csv.csv
One quick-n-easy fix would be to split the directory and filename into 2 separate variables, e.g.:
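srcdir='/home/cnf/domains/cnf.com.pl/public_html/sklep/dropshipping-pliki/products-files'
filename='tyre_8.csv'
# save the header row so it can be repeated in every part after the first
hdr=$(head -n 1 "${srcdir}/${filename}")
# split into 40000-line chunks named xyzaa, xyzab, ... in the current directory
split -l 40000 "${srcdir}/${filename}" xyz
n=1
for f in xyz*
do
    # the first chunk already starts with the header; write it first for the others
    if [[ ${n} -ne 1 ]]; then
        echo "${hdr}" > "${srcdir}/part-${n}-${filename}"
    fi
    cat "${f}" >> "${srcdir}/part-${n}-${filename}"
    rm "${f}"
    ((n++))
done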
NOTES:
- consider using lowercase variables (using uppercase variables raises the possibility of problems if there's an OS variable of the same name)
- wrap variable references in double quotes in case string contains spaces
- don't need to add a .csv extension to the new filename since it's already part of $filename
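If you would rather keep passing a single full path, the directory and file name can be derived with dirname and basename instead of being typed twice. Below is a minimal sketch of that idea, not a drop-in replacement: the function name split_csv, its two arguments, and the chunk- prefix are made up for illustration, and it assumes bash plus the usual split, head, and mktemp utilities. It also puts the temporary chunks in a scratch directory so a stray xyz* file in the current directory cannot be picked up by the loop.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are not from the original post):
#   split_csv /path/to/file.csv [max_lines_per_chunk]
# Writes part-1-file.csv, part-2-file.csv, ... next to the source file.
split_csv() {
    local src=$1 max_lines=${2:-40000}
    local srcdir filename hdr tmpdir n f out

    srcdir=$(dirname "$src")        # directory part of the path
    filename=$(basename "$src")     # file name part, already ends in .csv
    hdr=$(head -n 1 "$src")         # header row, repeated in parts 2..N
    tmpdir=$(mktemp -d) || return 1 # scratch directory for the raw chunks

    split -l "$max_lines" "$src" "$tmpdir/chunk-"

    n=1
    for f in "$tmpdir"/chunk-*; do
        [ -e "$f" ] || continue     # glob matched nothing (empty input)
        out="$srcdir/part-$n-$filename"
        if [ "$n" -eq 1 ]; then
            : > "$out"              # first chunk already starts with the header
        else
            printf '%s\n' "$hdr" > "$out"
        fi
        cat "$f" >> "$out"
        rm "$f"
        n=$((n + 1))
    done
    rmdir "$tmpdir"
}
```

Calling it as split_csv "/some/dir/my_file.csv" 40000 should leave part-1-my_file.csv, part-2-my_file.csv, and so on in /some/dir. A file with 661497 lines would give 17 parts. Note that, as in the script above, every part after the first holds 40000 data lines plus the header line, so lower the number by one if 40000 is a hard limit on total lines.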