Behaviour of String.split for empty strings

Time:06-09

Consider this pattern of assertions:

@Test
fun test() {
    val list = listOf("one", "two", "three")
    assertEquals(list.subList(0, 3), list.subList(0, 3).joinToString(" ").split(" "))
    assertEquals(list.subList(0, 2), list.subList(0, 2).joinToString(" ").split(" "))
    assertEquals(list.subList(0, 1), list.subList(0, 1).joinToString(" ").split(" "))
    // This one fails: the right-hand side is [""], not [].
    assertEquals(list.subList(0, 0), list.subList(0, 0).joinToString(" ").split(" "))
}

Each assertion uses a sublist one element shorter than the one before. All of them pass except, somewhat inconsistently, the last. That is because "".split(" ") does not return the empty list I expected; it returns a one-element list containing the empty string.

Is there another way of calling split, or another function, that behaves the way I was expecting as described above?

CodePudding user response:

You could check if the list returned by split has a size of 1 and the sole element is an empty string, and then return an empty list:

val parts = "".split(" ").let { if (it.size == 1 && it[0].isEmpty()) emptyList() else it }

CodePudding user response:

It's because your first examples are all lists of strings to begin with (I don't mean List<String>, just what they contain: they've got some strings in 'em), so joining them into a string and splitting that back out into a list of strings gives you the same thing you started with.

Whereas with the empty list, which has no strings in it, calling joinToString creates one: the empty string. And split works on a string and always returns a list of them, which is what you'd expect, right? "hello".split(" ") shouldn't return an empty list just because there's no space to split on, so neither should "".split(" "). When the separator isn't found, you just get a one-element list holding the original string you called it on.
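To make that concrete, here is a minimal sketch of the two steps. The helper name joinThenSplit is made up for illustration and is not part of the standard library:

```kotlin
// Hypothetical helper: the same join-then-split round trip used in the question.
fun joinThenSplit(items: List<String>): List<String> =
    items.joinToString(" ").split(" ")

fun main() {
    // Joining an empty list still produces a string: the empty one.
    println(emptyList<String>().joinToString(" ").isEmpty())  // true

    // No separator found, so split returns the whole input as the only element.
    println("hello".split(" "))                 // [hello]
    println("".split(" ") == listOf(""))        // true

    // Hence the round trip turns an empty list into a one-element list.
    println(joinThenSplit(emptyList()).size)    // 1
}
```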

So the whole joinToString -> split pipeline produces a list with at least one string in it, which is fine when you're starting with a non-empty list of strings, but not if you're starting with an empty list. It's not inconsistent exactly, the logic just falls down for that edge case. If this is for some kind of processing function (like processing arbitrary List<String>s that might be empty), I'd consider handling that edge case explicitly:

fun List<String>.doThing() =
    if (isEmpty()) emptyList() // or just 'this' if that works for you
    else joinToString(" ").split(" ")

or whatever you're up to!
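As a rough check, this sketch runs the four sublists from the question through that guard. The explicit return type is my addition, and the last line shows why returning 'this' is only equivalent when no element contains the separator:

```kotlin
// Same guard as above, with an explicit return type added.
fun List<String>.doThing(): List<String> =
    if (isEmpty()) emptyList()
    else joinToString(" ").split(" ")

fun main() {
    val list = listOf("one", "two", "three")
    // Sublists of size 3, 2, 1 and 0 all survive the round trip now.
    for (n in 3 downTo 0) {
        val sub = list.subList(0, n)
        check(sub.doThing() == sub) { "round trip failed for $sub" }
    }
    // Not an identity in general: an element containing a space is split apart.
    println(listOf("a b").doThing())  // [a, b]
}
```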

CodePudding user response:

As a slight variation on lukas.j's answer, I think it's a little more readable using takeIf():

"".split(" ").takeIf { it.size > 1 || it[0].isNotEmpty() } ?: emptyList()

And of course you can wrap that in a function, e.g.:

fun String.mySplit(separator: String) =
    split(separator)
        .takeIf { it.size > 1 || it[0].isNotEmpty() }
        ?: emptyList()

(That example is an extension function, so you can call it in a similar way to split(), e.g. "".mySplit(" "))
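A short usage sketch, assuming the mySplit extension above (repeated here, with an explicit return type, so the snippet stands alone):

```kotlin
fun String.mySplit(separator: String): List<String> =
    split(separator)
        .takeIf { it.size > 1 || it[0].isNotEmpty() }
        ?: emptyList()

fun main() {
    println("".mySplit(" "))          // []
    println("one two".mySplit(" "))   // [one, two]

    // The empty string is ambiguous: both of these lists join to "",
    // so both come back as the empty list.
    println(emptyList<String>().joinToString(" ").mySplit(" "))  // []
    println(listOf("").joinToString(" ").mySplit(" "))           // []
}
```

Note the last case: a list holding a single empty string cannot be told apart from an empty list once it has been joined, so no variant of split can round-trip both.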
