How can I split a string that contains only one hyperlink into multiple segments in Kotlin?

Time:06-26

I have a string that contains exactly one hyperlink, and I would like to split it into multiple segments in Kotlin.

How can I do this? Thanks!

The following are some sample inputs and the expected results.

Visit our <a href="http://www.google.net">website</a> for the latest news     ->  a[0]="Visit our"  a[1]="http://www.google.net"  a[2]= "website"  a[3]="for the latest news"

Visit our <a href="http://www.google.net">website</a>                         ->  a[0]="Visit our"  a[1]="http://www.google.net"  a[2]= "website"  a[3]=""

<a href="http://www.google.net">website</a>  for the latest news              ->  a[0]=""           a[1]="http://www.google.net"  a[2]= "website"  a[3]="for the latest news"

Visit our <a href="#">website</a> for the latest news                         ->  a[0]="Visit our"  a[1]="#"                      a[2]= "website"  a[3]="for the latest news"

CodePudding user response:

You can use the following Regex:

val regex = "(.*)<a href=\"(.*)\">(.*)</a>(.*)".toRegex()

Example:

fun split(s: String): List<String> {
    val result = regex.find(s) ?: return emptyList()
    return result.groups.toList().slice(1..4).mapNotNull { it?.value?.trim() }
}

listOf(
    "Visit our <a href=\"http://www.google.net\">website</a> for the latest news",
    "Visit our <a href=\"http://www.google.net\">website</a>",
    "<a href=\"http://www.google.net\">website</a>  for the latest news",
    "Visit our <a href=\"#\">website</a> for the latest news",
).forEach {
    println(split(it))
}

Output:

[Visit our, http://www.google.net, website, for the latest news]
[Visit our, http://www.google.net, website, ]
[, http://www.google.net, website, for the latest news]
[Visit our, #, website, for the latest news]
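
The greedy (.*) groups above are fine here because the input contains exactly one link. As a hedged variant, the sketch below also tolerates extra whitespace inside the tag and keeps the URL group inside the quotes. The names linkRegex and splitLink are invented for this illustration and are not part of the answer above.

```kotlin
// Hypothetical variant of the pattern above: \s+ and \s* tolerate extra
// whitespace inside the tag, and [^"]* keeps the URL group inside the quotes.
val linkRegex = Regex("""(.*?)<a\s+href\s*=\s*"([^"]*)"\s*>(.*?)</a>(.*)""")

// Returns [before, url, text, after], or an empty list if no link is found.
fun splitLink(s: String): List<String> {
    val match = linkRegex.find(s) ?: return emptyList()
    // groupValues[0] is the whole match; entries 1..4 are the four segments.
    return match.groupValues.drop(1).map { it.trim() }
}

fun main() {
    println(splitLink("Visit our <a  href = \"#\" >website</a> for the latest news"))
    // prints [Visit our, #, website, for the latest news]
}
```

Like the original pattern, this assumes single-line input, since . does not match line breaks by default.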

CodePudding user response:

This might not be the best solution, but if the string is always guaranteed to have the form you describe, you could use the built-in substring...() functions (substringBefore, substringAfter, substringAfterLast):

fun split(str: String): List<String> {
  return listOf(
    str.substringBefore('<').trim(),
    str.substringAfter("<a href=\"").substringBefore("\">").trim(),
    str.substringAfter("\">").substringBefore("</a>").trim(),
    str.substringAfterLast('>').trim()
  )
}

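listOf(
  "Visit our <a href=\"http://www.google.net\">website</a> for the latest news",
  "Visit our <a href=\"http://www.google.net\">website</a>",
  "<a href=\"http://www.google.net\">website</a>  for the latest news",
  "Visit our <a href=\"#\">website</a> for the latest news",
).forEach { println(split(it)) }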
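
One caveat with the substring approach: by default, substringAfter and substringBefore return the whole input when the delimiter is missing, so a string that does not match the expected form produces wrong segments silently rather than an error. A minimal sketch of a guarded version follows; splitOrNull is a hypothetical name, and returning null is just one possible way to signal the mismatch.

```kotlin
// Hypothetical guard around the substring approach: returns null instead of
// silently wrong segments when the exact delimiters are missing.
fun splitOrNull(str: String): List<String>? {
    val prefix = "<a href=\""
    // substringAfter/substringBefore fall back to the whole string when the
    // delimiter is absent, so check for the delimiters explicitly first.
    if (prefix !in str || "\">" !in str || "</a>" !in str) return null
    return listOf(
        str.substringBefore('<').trim(),
        str.substringAfter(prefix).substringBefore("\">").trim(),
        str.substringAfter("\">").substringBefore("</a>").trim(),
        str.substringAfterLast('>').trim()
    )
}

fun main() {
    println(splitOrNull("Visit our <a href=\"#\">website</a> for the latest news"))
    // prints [Visit our, #, website, for the latest news]
    println(splitOrNull("Visit our <a  href=\"#\">website</a>")) // prints null
}
```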

The function above will not work if there is extra whitespace inside the a tag (for example, multiple spaces between a and href). This version takes that into account:

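fun split(str: String): List<String> {
  return str
    .split('<', '>')
    .map { it.trim() }
    .filterNot { it == "/a" }  // drop the closing tag
    .mapIndexed { index, part ->
      // Segment 1 is the opening tag; strip markup only there, so that
      // plain text starting with "a", "href" or a quote is left intact.
      if (index == 1) {
        part.removePrefix("a").trim()
          .removePrefix("href").trim()
          .removePrefix("=").trim()
          .removePrefix("\"").trim()
          .removeSuffix("\"").trim()
      } else part
    }
}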