I want to get every way of splitting a space-separated input string into a pair of substrings.
For example:
For the input string xxx yy zzzz, I want to get two matches: "xxx", "yy zzzz" and "xxx yy", "zzzz"
For the input string a b c d, I want to get these matches: "a", "b c d"; "a b", "c d"; "a b c", "d"
Can I do it with regex?
For some reason, with the regex (.*)\s+(.*) I get only one match. (I'm using the C# regex engine, by the way.)
CodePudding user response:
If you really want to use regex, you could count the number of spaces in your string and run that many different patterns. In each pattern, replace <space_break_number> below with the number of the space at which the string should break.
((?:[^\s]+\s){<space_break_number>})(.*)
For a b c d, which has 3 spaces, you'd need to run three patterns:
((?:[^\s]+\s){1})(.*)
out: ["a ", "b c d"]
((?:[^\s]+\s){2})(.*)
out: ["a b ", "c d"]
((?:[^\s]+\s){3})(.*)
out: ["a b c ", "d"]
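The pattern-per-space idea above can be sketched as a loop in C#. This is only an illustration under a few assumptions: the class and method names are made up for this example, words are separated by single spaces, the word part is written as [^\s]+ so that words longer than one character are handled, and the trailing space of the first group is trimmed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CountedPatternSplitter
{
    // Hypothetical helper: runs one counted pattern per space in the input.
    public static List<string[]> SplitWithCountedPatterns(string text)
    {
        var result = new List<string[]>();
        int spaces = text.Count(c => c == ' ');
        for (int n = 1; n <= spaces; n++)
        {
            // n takes the place of <space_break_number> in the pattern.
            var pattern = @"((?:[^\s]+\s){" + n + @"})(.*)";
            var match = Regex.Match(text, pattern);
            if (match.Success)
            {
                // Group 1 ends with the space it broke on, so trim it.
                result.Add(new[]
                {
                    match.Groups[1].Value.TrimEnd(),
                    match.Groups[2].Value
                });
            }
        }
        return result;
    }

    public static void Main()
    {
        foreach (var pair in SplitWithCountedPatterns("a b c d"))
            Console.WriteLine(string.Join(", ", pair));
    }
}
```

For a b c d this prints a, b c d then a b, c d then a b c, d. Building a new pattern for every space is more work than a plain split, so treat it as a demonstration of the idea rather than a recommendation.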
CodePudding user response:
This is not a regex task as a regex cannot match multiple times at the same location.
You can split the string on whitespace with a plain String.Split(), and then build your pairs:
public static List<string[]> SplitIntoPairsByWhitespace(string text)
{
    // Split() with no arguments splits on any whitespace character.
    var chunks = text.Split();
    var result = new List<string[]>();
    // i is the index of the last chunk that goes into the first substring.
    for (var i = 0; i < chunks.Length - 1; i++)
    {
        var first = new List<string>();
        for (var j = 0; j <= i; j++)
            first.Add(chunks[j]);
        var second = new List<string>();
        for (var j = i + 1; j < chunks.Length; j++)
            second.Add(chunks[j]);
        result.Add(new[] { string.Join(" ", first), string.Join(" ", second) });
    }
    return result;
}
See a C# demo:
using System;
using System.Collections.Generic;
using System.Linq;

public class Test
{
    public static void Main()
    {
        Console.WriteLine("Testing with 'xxx yy zzzz'");
        foreach (var pair in SplitIntoPairsByWhitespace("xxx yy zzzz"))
        {
            Console.WriteLine(string.Join(", ", pair));
        }
        Console.WriteLine("Testing with 'a b c d'");
        foreach (var pair in SplitIntoPairsByWhitespace("a b c d"))
        {
            Console.WriteLine(string.Join(", ", pair));
        }
    }

    public static List<string[]> SplitIntoPairsByWhitespace(string text)
    {
        // Split() with no arguments splits on any whitespace character.
        var chunks = text.Split();
        var result = new List<string[]>();
        // i is the index of the last chunk that goes into the first substring.
        for (var i = 0; i < chunks.Length - 1; i++)
        {
            var first = new List<string>();
            for (var j = 0; j <= i; j++)
                first.Add(chunks[j]);
            var second = new List<string>();
            for (var j = i + 1; j < chunks.Length; j++)
                second.Add(chunks[j]);
            result.Add(new[] { string.Join(" ", first), string.Join(" ", second) });
        }
        return result;
    }
}
Output:
Testing with 'xxx yy zzzz'
xxx, yy zzzz
xxx yy, zzzz
Testing with 'a b c d'
a, b c d
a b, c d
a b c, d
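To make the opening claim of this answer concrete, here is a small check of what the pattern from the question returns. It is only a sketch: the class and method names are made up for this example, and the pattern is assumed to be (.*)\s+(.*). The first (.*) is greedy, so it takes as much as it can and the single match already covers the whole string, which leaves nothing for a second match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyMatchDemo
{
    // Hypothetical helper: applies the pattern from the question as-is.
    public static MatchCollection FindPairs(string text)
    {
        return Regex.Matches(text, @"(.*)\s+(.*)");
    }

    public static void Main()
    {
        var matches = FindPairs("a b c d");
        // Only one match: the greedy first group takes "a b c".
        Console.WriteLine(matches.Count);
        foreach (Match m in matches)
            Console.WriteLine($"\"{m.Groups[1].Value}\", \"{m.Groups[2].Value}\"");
    }
}
```

This prints 1 followed by "a b c", "d", which is the single match the question describes.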
CodePudding user response:
You can find every whitespace run on which you want to split (regular expressions work for this step) and then split the string at each of them. It's not a pure regex solution, however:
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
...
// Each whitespace run that has non-whitespace text on both sides is a split point.
static IEnumerable<(string left, string right)> Pairs(string source) => Regex
    .Matches(source, @"(?<=\S+)\s+(?=\S+)")
    .Cast<Match>()
    .Select(match => (source.Substring(0, match.Index),
                      source.Substring(match.Index + match.Length)));
Demo:
string[] tests = new string[] {
    "xxx yy zzzz",
    "a b c d",
};
string report = string.Join(Environment.NewLine, tests
    .Select(test => $"{test,15} => {string.Join("; ", Pairs(test).Select(p => $"({p.left}, {p.right})"))}"));
Console.Write(report);
Outcome:
    xxx yy zzzz => (xxx, yy zzzz); (xxx yy, zzzz)
        a b c d => (a, b c d); (a b, c d); (a b c, d)