Ruby split file by blank lines accounting for \r\n and \n\n

Originally I had File.foreach(fname, "\n\n") in my code. This worked fine with my own test files, but now that I'm using real data I'm running into files that potentially use \r\n instead of \n\n.

I would like to split a file into chunks of data using the blank line as the delimiter.

Alternatively, I also tried File.readlines(fname). However, this only splits the file by line, and I can't then sub-split it further, even though the blank lines are now empty elements because I used .chomp.

Is there a way to split the file according to new lines as the delimiter that accounts for both \r\n and \n\n?

Thanks

CodePudding user response：

You could write the following.

str =<<~_
Little Miss Muffet
sat on her

tuffet


eating her curds
and whey
_

str.split(/(?:^\r?\n)+/)
  #=> ["Little Miss Muffet\nsat on her\n",
  #    "tuffet\n",
  #    "eating her curds\nand whey\n"]

The regular expression reads, "match one or more (+) contiguous empty lines having line terminators of \r\n or \n".
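
To check that the same pattern copes with both kinds of line ending, here is a minimal sketch. The helper name split_on_blank_lines and the sample strings are made up for illustration and are not part of the answer above.

```ruby
# Hypothetical helper (name is an assumption, not from the answer):
# split text into chunks separated by one or more blank lines,
# whether lines end in "\n" or "\r\n".
def split_on_blank_lines(text)
  text.split(/(?:^\r?\n)+/)
end

lf   = "a\nb\n\nc\n\n\nd\n"               # Unix line endings
crlf = "a\r\nb\r\n\r\nc\r\n\r\n\r\nd\r\n" # Windows line endings

p split_on_blank_lines(lf)    #=> ["a\nb\n", "c\n", "d\n"]
p split_on_blank_lines(crlf)  #=> ["a\r\nb\r\n", "c\r\n", "d\r\n"]
```

For a file, something like split_on_blank_lines(File.read(fname)) should behave the same way, assuming the file is small enough to read into memory in one go. Each chunk keeps its own line terminators, so you may still want to chomp the individual lines afterwards.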

CodePudding user response：

You can write your regex to account for either \r or \n characters:

string.split(/[\r\n]\n/)

The brackets [] indicate any character within them can match, so that would mean the regex matches either \r or \n for the first character.
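
One caveat, shown as a small sketch with made-up sample strings: because [\r\n]\n also matches a single \r\n line ending, a file with Windows line endings gets split at every line rather than only at blank lines. A pattern that requires two consecutive terminators, such as /\r?\n\r?\n/ (my suggestion, not part of the answer above), avoids that.

```ruby
lf   = "a\nb\n\nc\n"          # Unix line endings, one blank line
crlf = "a\r\nb\r\n\r\nc\r\n"  # Windows line endings, one blank line

# With LF endings the character-class pattern splits on the blank line only.
p lf.split(/[\r\n]\n/)        #=> ["a\nb", "c\n"]

# With CRLF endings every "\r\n" matches, so each line becomes its own element.
p crlf.split(/[\r\n]\n/)      #=> ["a", "b", "", "c"]

# Requiring two terminators in a row splits on the blank line in both cases.
p lf.split(/\r?\n\r?\n/)      #=> ["a\nb", "c\n"]
p crlf.split(/\r?\n\r?\n/)    #=> ["a\r\nb", "c\r\n"]
```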