Originally I had File.foreach(fname, "\n\n") in my code. This worked fine with my own test files, but now that I'm using real data I'm running into files that may use \r\n instead of \n\n.
I would like to split a file into chunks of data using the blank line as the delimiter.
Alternatively, I tried File.readlines(fname), but that only splits the file into individual lines, and I can't see how to sub-split the result further, even though the blank lines are now empty elements because I used .chomp.
Is there a way to split the file according to new lines as the delimiter that accounts for both \r\n and \n\n?
Thanks
CodePudding user response:
You could write the following.
str =<<~_
Little Miss Muffet
sat on her

tuffet

eating her curds
and whey
_
str.split(/(?:^\r?\n)+/)
#=> ["Little Miss Muffet\nsat on her\n",
# "tuffet\n",
# "eating her curds\nand whey\n"]
The regular expression reads, "match one or more (+) contiguous empty lines having line terminators of \r\n or \n".
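To check the pattern against both kinds of line endings, here is a minimal sketch. The helper name split_blocks and the sample strings are made up for illustration. With a real file you would presumably pass File.read(fname) instead of a literal string.

```ruby
# Hypothetical helper (not part of the answer above): split text into blocks
# wherever one or more blank lines occur, for \n or \r\n line endings.
BLANK_LINES = /(?:^\r?\n)+/

def split_blocks(text)
  # Split at runs of blank lines, then strip the line terminator
  # (\n or \r\n) from each line of each block.
  text.split(BLANK_LINES).map { |block| block.lines.map(&:chomp) }
end

unix    = "a\nb\n\nc\n"                # LF endings, one blank line
windows = "a\r\nb\r\n\r\n\r\nc\r\n"    # CRLF endings, two blank lines

p split_blocks(unix)     #=> [["a", "b"], ["c"]]
p split_blocks(windows)  #=> [["a", "b"], ["c"]]
```

Returning each block as an array of chomped lines is only one option. Dropping the inner map keeps each block as a single string with its original terminators.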
CodePudding user response:
You can write your regex to account for either \r or \n characters:
string.split(/[\r\n]\n/)
The brackets [] indicate that any character within them can match, so the regex matches either \r or \n for the first character, followed by \n.
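One thing to watch, as far as I can tell: /[\r\n]\n/ matches exactly two characters, so in a file where every line ends in \r\n it also matches each ordinary line ending, not only the blank lines. A possible variant, sketched below, requires two or more consecutive terminators instead. The constant name and the sample strings are assumptions for illustration.

```ruby
# Assumed alternative (not the answerer's regex): two or more consecutive
# line terminators, each either \r\n or \n, mark a block boundary.
BLOCK_SEPARATOR = /(?:\r?\n){2,}/

crlf = "a\r\nb\r\n\r\nc\r\n"
lf   = "a\nb\n\nc\n"

# The two-character pattern splits at every CRLF line ending.
p crlf.split(/[\r\n]\n/)                    #=> ["a", "b", "", "c"]

# The variant splits only at the blank line; chomp drops the final terminator.
p crlf.split(BLOCK_SEPARATOR).map(&:chomp)  #=> ["a\r\nb", "c"]
p lf.split(BLOCK_SEPARATOR).map(&:chomp)    #=> ["a\nb", "c"]
```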