How do I convert around 40 million space-separated Debian package names into one name per line?

Time:01-19

Our S3 bucket (AWS) stores Debian packages in different folders, and each folder contains files of different sizes. When I fetch the package names from the bucket, they come back separated by spaces, and I need to convert that space-separated list into one package per line. The number of spaces between names is not the same on every input line. After the conversion, the packages from all the different folders should be stored in a single file.

input:

package1.deb  package2.deb     pacakge3.deb        pacakge4.deb package5.deb 

output:

package1.deb
package2.deb
pacakge3.deb
pacakge4.deb
package5.deb

This function runs in the background for the different folders of the S3 bucket. I have tried many commands, such as truncate, awk, xargs -n 1, and sed. Please provide an answer.

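function convertSpaceToNewLine() {
    # Each argument is a file holding space-separated package names.
    for file in "$@"; do
        # Squeeze every run of whitespace into one newline, then keep only .deb names.
        tr -s '[:space:]' '\n' < "$file" | grep '\.deb$' >> folder/newfile
    done
}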

Thanks

CodePudding user response:

I think using sed to replace each run of spaces with a newline would be sufficient (this relies on GNU sed accepting \n in the replacement).

$ echo 'package1.deb    package2.deb    pacakge3.deb        pacakge4.deb   package5.deb' | sed -E 's/ +/\n/g'

package1.deb
package2.deb
pacakge3.deb
pacakge4.deb
package5.deb
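
If the spacing between names is uneven, tr with the squeeze option is another way to avoid empty lines in the result. This is only a minimal sketch: the sample line and the helper name split_packages are made up for illustration and are not part of the original question.

```shell
# Hypothetical sample line with uneven spacing and a trailing space.
line='package1.deb  package2.deb     package3.deb '

# Translate every whitespace character to a newline; -s then squeezes
# each run of newlines into one, so no empty lines are produced here.
split_packages() {
    tr -s '[:space:]' '\n'
}

printf '%s\n' "$line" | split_packages
```

Because tr reads standard input, the same helper can be fed from a file with `split_packages < listing.txt` (listing.txt being a placeholder name).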

CodePudding user response:

A simple approach for outputting each space-delimited string separately would be:

awk -v OFS='\n' '{$1=$1}1' file.txt
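
Since the question asks for the packages of all folders to end up in one file, the same awk idea can be run once over several listing files instead of once per file. The following is a sketch under assumptions: merge_listings, the temporary directory, and the folder1.txt/folder2.txt names are hypothetical, and the NF guard is an addition that skips blank lines.

```shell
# Hypothetical helper: first argument is the destination file,
# the remaining arguments are listing files to merge.
merge_listings() {
    dest=$1
    shift
    # One awk process reads all files. $1=$1 rebuilds each record using
    # OFS (a newline), and the NF condition skips lines with no fields.
    awk -v OFS='\n' 'NF {$1=$1; print}' "$@" > "$dest"
}

# Example with two throwaway listing files.
tmp=$(mktemp -d)
printf 'package1.deb  package2.deb \n' > "$tmp/folder1.txt"
printf 'package3.deb     package4.deb\n' > "$tmp/folder2.txt"
merge_listings "$tmp/all.txt" "$tmp/folder1.txt" "$tmp/folder2.txt"
cat "$tmp/all.txt"
```

Running a single awk over many files avoids starting a new pipeline for every input file, which may matter at the scale mentioned in the question.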