Split file based on the number of X symbol on each line

Time:07-09

I'm not sure if this is possible, but I'm wondering whether a file can be split into multiple files depending on how many times a specified character appears on each line.

Let's use a colon (:) as an example

File.txt contains the following data (example):

Stack:Overflow   
Stack:Overflow:Flow    
Stack:Over:Flow:Com

Entire line containing 1 colon, goes to 1.txt
Entire line containing 2 colons, goes to 2.txt
Entire line containing 3 colons, goes to 3.txt

(And of course) there wouldn't be a limit to the number of colons, and the format may not always match the example pattern.

Sorry if this is a vague question; this is my first time posting on StackOverflow in a long while.


Another side question: Inserting a specific character between 2 different regexes.

Data:

[email protected]

I'm trying to insert a delimiter, which will be ":", between 2 different regexes.
Regex #1 being: [A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}
Regex #2 being: [0-9]{1,4}\.[0-9]{1,4}\.[0-9]{1,4}\.[0-9]{1,4}

So the desired output would be:

[email protected]:192.168.0.1
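
One possible way to approach the side question is to capture each regex as a group in a sed substitution and re-emit the two groups with ":" between them. This is only a sketch under assumptions: the input line is hypothetical (an email address immediately followed by an IPv4-like address), the variable names are made up for illustration, and Regex #1 is assumed to carry `+` quantifiers after its bracket expressions, which may have been lost in how the question is displayed.

```shell
# Hypothetical input: an email address immediately followed by an IPv4-like address.
input='user@example.com192.168.0.1'

# Regex #1 (with the assumed + quantifiers) and Regex #2 from the question.
email_re='[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}'
ip_re='[0-9]{1,4}\.[0-9]{1,4}\.[0-9]{1,4}\.[0-9]{1,4}'

# Capture each pattern as a group and re-emit both with ":" in between.
output=$(printf '%s\n' "$input" | sed -E "s/($email_re)($ip_re)/\1:\2/")

echo "$output"
```

For this input the result should be user@example.com:192.168.0.1. The split relies on the last label of the domain consisting of letters only, so the first digit after it marks the start of the address.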

CodePudding user response:

With GNU AWK this approach will provide your expected outcome:

awk -F":" '{print > ((NF - 1)".txt")}' file.txt

NB: if the lines have a large number of different delimiter counts (hundreds to thousands), you may also run into trouble from having too many files open at once. I believe ulimit -n will tell you how many files you can have open at one time; on my system it's 256.
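
If the open-file limit does become a problem, one common workaround is to close each output file right after writing to it and to append with >> instead of >, so that only one output file is open at any moment. The following is a minimal sketch rather than part of the original answer: the scratch directory and the extra A:B sample line are assumptions for illustration, and it presumes no N.txt files exist beforehand, because >> appends to them.

```shell
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input from the question, plus one extra single-colon line (A:B).
printf '%s\n' 'Stack:Overflow' 'Stack:Overflow:Flow' 'Stack:Over:Flow:Com' 'A:B' > file.txt

# Append each line to <colon count>.txt, then close that file immediately.
# >> is required here: reopening with > after close() would truncate the file.
awk -F: '{ out = (NF - 1) ".txt"; print >> out; close(out) }' file.txt

# Both single-colon lines should now be in 1.txt.
cat 1.txt
```

The trade-off is speed: opening and closing a file for every line is slower than keeping the files open, so this is mainly worth it when the number of distinct counts is large.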

CodePudding user response:

awk can do this quite easily:

awk -F : '{print > (NF-1 ".txt")}' File.txt
  • With : as the field separator, the number of fields (NF) minus one equals the number of separators on the line, which we can use as the file name.
  • You can replace the colon with any other single character except a space, which must be written as [ ]. A lone space is handled specially by awk: it splits on runs of whitespace and ignores leading and trailing whitespace.
  • The field separator is normally a regular expression, but a single character (other than a space) is treated literally.
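
The difference between the separator forms described above can be seen in a small sketch. The sample strings and variable names are made up for illustration; the awk options are the ones discussed in the bullets.

```shell
line='a b  c'   # one space, then two consecutive spaces

# -F' ' is the special case: a run of whitespace counts as one separator.
default_count=$(printf '%s\n' "$line" | awk -F' ' '{ print NF - 1 }')

# -F'[ ]' is a regex matching exactly one space, so every space is counted.
literal_count=$(printf '%s\n' "$line" | awk -F'[ ]' '{ print NF - 1 }')

# A single non-space character such as | is taken literally, not as a regex.
pipe_count=$(printf '%s\n' 'a|b|c' | awk -F'|' '{ print NF - 1 }')

echo "$default_count $literal_count $pipe_count"
```

This should print 2 3 2: the default splitting sees two separators in the sample line, while [ ] counts all three spaces.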