String split by newline

Time:12-16

I want to split a string on newlines, and I want the split function to behave as it does in most other languages.

Input:

split('\nhello\nworld\n')

Expected output:

{
    "",
    "hello",
    "world",
    ""
}

I have tried the following, which returns the result without the first and last empty strings:

function split(text)
    local lines = {}

    for str in string.gmatch(text, "([^\n]+)") do
        table.insert(lines, str)
    end

    return lines
end

CodePudding user response:

From the manual:

a single character class followed by '*', which matches 0 or more repetitions of characters in the class. These repetition items will always match the longest possible sequence;

a single character class followed by '+', which matches 1 or more repetitions of characters in the class. These repetition items will always match the longest possible sequence;

You also want to match empty substrings, so you need to use * instead of +:

function split(text)
    local lines = {}

    for str in string.gmatch(text, "([^\n]*)") do
        table.insert(lines, str)
    end

    return lines
end
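
How gmatch handles a pattern that can match the empty string has differed between Lua implementations, so the result of the function above may depend on the interpreter. As a rough sketch of one way around that (the name split_lines is made up for illustration, not part of any library), you can append a newline to the input and require every match to consume one, so that no match is ever empty as a whole:

```lua
-- Hypothetical helper: split `text` on "\n", keeping empty fields
-- at the start and at the end.
function split_lines(text)
    local lines = {}

    -- The appended "\n" terminates the last field, so each match consumes
    -- exactly one newline and the match as a whole is never empty.
    for line in (text .. "\n"):gmatch("([^\n]*)\n") do
        lines[#lines + 1] = line
    end

    return lines
end

local result = split_lines("\nhello\nworld\n")
print(#result)                          -- 4
print(table.concat(result, ", "))       -- , hello, world,
```

The capture is needed here, because the whole match also contains the trailing newline.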

CodePudding user response:

The naive approach of using string.gmatch(text, "([^\n]*)") does not work on LuaJIT, where it emits spurious extra empty strings. Note also that you don't need the capture around the pattern; when a pattern has no captures, the whole match is returned.

Here is my spliterator function based on string.find:

function spliterator(str, delim, plain)
    assert(delim ~= "")
    local last_delim_end = 0

    -- Iterator of possibly empty substrings between two matches of the delimiter
    -- To exclude empty strings, filter the iterator or use `:gmatch"[...]+"` instead
    return function()
        if not last_delim_end then
            return
        end

        local delim_start, delim_end = str:find(delim, last_delim_end + 1, plain)
        local substr
        if delim_start then
            substr = str:sub(last_delim_end + 1, delim_start - 1)
        else
            substr = str:sub(last_delim_end + 1)
        end
        end
        last_delim_end = delim_end
        return substr
    end
end

You can trivially use this to build a table:

function split(str, delim, plain)
    local t = {}
    for delimited in spliterator(str, delim, plain) do
        table.insert(t, delimited)
    end
    return t
end

Usage:

print(table.concat(split("\nhello\nworld\n", "\n"), ", "))
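
Because delim is handed straight to string.find, it is treated as a Lua pattern unless plain is true. That makes no difference for "\n", but it does for delimiters containing magic characters such as ".". A small illustration using only string.find (the sample string is arbitrary):

```lua
local s = "a.b.c"

-- Pattern mode: "." matches any character, so the match is at position 1.
print(s:find("."))           -- 1  1

-- Plain mode (fourth argument true): looks for a literal dot.
print(s:find(".", 1, true))  -- 2  2

-- Pattern mode with the dot escaped gives the same result as plain mode.
print(s:find("%."))          -- 2  2
```

So to split on a literal dot with the functions above, you would pass plain = true or escape the delimiter as "%.".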

Note that usually you'd use io.lines / file:lines to iterate over the lines of a file.
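
A minimal sketch of that, assuming a readable file at some path (read_lines is a hypothetical helper name, and the temporary file only exists to make the snippet runnable):

```lua
-- Hypothetical helper: collect the lines of the file at `path` into a table.
function read_lines(path)
    local lines = {}
    -- io.lines opens the file, yields each line without its trailing
    -- newline, and closes the file when the loop finishes.
    for line in io.lines(path) do
        lines[#lines + 1] = line
    end
    return lines
end

-- Demo: write the sample text to a temporary file and read it back.
local path = os.tmpname()
local f = assert(io.open(path, "w"))
f:write("\nhello\nworld\n")
f:close()

local lines = read_lines(path)
os.remove(path)
print(#lines)
```

Keep in mind that io.lines treats the newline as a line terminator rather than a separator, so a final newline does not produce a trailing empty string the way split does.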
