I have a string and need to extract all the substrings between two different words ('alpha' and 'beta'). I must return a JSON object with two fields.
I tried it this way, but it doesn't work correctly:
string content = "string working on";
var listSubString = new List<string>();
int index = 0;
do
{
    index = content.LastIndexOf("alpha");
    if (index != -1)
    {
        var length = content.IndexOf("beta");
        string substring = content.Substring(index, length);
        content = content.Replace(substring, string.Empty);
        listSubString.Add(substring.Replace("alpha", string.Empty).Replace("beta", string.Empty));
    }
} while (index != -1);
Content = content;
ListSubString = listSubString;
Given a string like "hello alpha I don't want this part 1 beta world alpha i don't want this part 2 beta have a nice day",
I'd like to receive JSON like {"Content": "hello world have a nice day", "ListSubString": ["I don't want this part 1", "i don't want this part 2"]}
Thanks for the help.
CodePudding user response:
I got the output you want; I hope it serves your purpose.
Link: https://dotnetfiddle.net/b8T8qy
These are the lines I changed in your code:
string substring= content.Substring(index, length-1);
listSubString.Insert(0, substring.Replace("alpha", string.Empty).Replace("beta", string.Empty));
Finally, I assembled the result into a JSON string:
string json = string.Join("\",\"", listSubString);
string otp = "{\"Content\" : \"" + content + "\",\"ListSubString\": [\"" + json + "\"]}";
Output :
{
"Content": "hello world have a nice day",
"ListSubString": [
" I don't want this part 1 ",
" i don't want this part 2 "
]
}
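Note that `length-1` lines up with the sample string, where "alpha" starts at index 6 and the first "beta" at index 37; other inputs may need a length computed from both marker positions. A minimal sketch of that idea, assuming the markers never nest and every "alpha" has a later "beta" (the class name `MarkerExtractor` and the method `Extract` are hypothetical, not part of the question or the fiddle):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MarkerExtractor
{
    // Hypothetical helper: returns the text outside the markers and the
    // list of texts found between each start/end marker pair.
    public static (string Content, List<string> Parts) Extract(string text, string start, string end)
    {
        var outside = new StringBuilder();
        var parts = new List<string>();
        int pos = 0; // first character not yet processed

        while (true)
        {
            int s = text.IndexOf(start, pos, StringComparison.Ordinal);
            if (s < 0) break; // no more start markers

            int e = text.IndexOf(end, s + start.Length, StringComparison.Ordinal);
            if (e < 0) break; // unmatched start marker: keep the rest as-is

            // Keep everything before the start marker.
            outside.Append(text, pos, s - pos);
            // The length is the distance between the two markers, not an absolute index.
            parts.Add(text.Substring(s + start.Length, e - s - start.Length));
            pos = e + end.Length;
        }

        // Keep whatever follows the last end marker.
        outside.Append(text, pos, text.Length - pos);
        return (outside.ToString(), parts);
    }
}
```

Calling `MarkerExtractor.Extract(content, "alpha", "beta")` leaves the spaces around the markers in place, so the parts keep their leading and trailing space and the remaining content has two spaces where a block was removed; trimming is left to the caller.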
CodePudding user response:
Regular expressions allow you to accomplish this without indexes and loops.
Once you have identified a pattern that describes the substrings you want to extract, e.g. "alpha.*?beta", rebuilding the content without those substrings is just a matter of concatenating the fragments split by the regular expression:
Content = string.Join(string.Empty, new Regex("alpha.*?beta").Split(text));
As for the substrings themselves, you can capture them in the pattern and extract them from the matches returned by the regular expression:
ListSubString = new Regex("alpha(.*?)beta")
    .Matches(text)
    .Select(match => match.Groups[1])
    .SelectMany(group => group.Captures.OfType<Capture>())
    .Select(capture => capture.Value)
    .ToList();
You can have a look at this answer for some clarification on the Match > Group > Capture hierarchy.
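For completeness, here is a self-contained sketch of the regular-expression approach that also produces the JSON. It swaps `Split` for `Regex.Replace` so the whitespace around each removed block collapses to a single space, and it uses `System.Text.Json` for serialization. The class name `RegexExtractor` and its methods are hypothetical, and the availability of `System.Text.Json` (.NET Core 3.0 or later, or the NuGet package) is an assumption about the environment:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.RegularExpressions;

public static class RegexExtractor
{
    // Hypothetical helper: splits the text into the part outside the
    // alpha...beta blocks and the list of captured block contents.
    public static (string Content, List<string> ListSubString) Extract(string text)
    {
        // Lazy quantifier: each match stops at the first "beta" after an "alpha".
        var parts = Regex.Matches(text, "alpha(.*?)beta")
            .Cast<Match>() // needed on older frameworks, harmless on newer ones
            .Select(m => m.Groups[1].Value.Trim())
            .ToList();

        // Replace each block, plus the whitespace around it, with one space.
        string content = Regex.Replace(text, @"\s*alpha.*?beta\s*", " ").Trim();

        return (content, parts);
    }

    // Serializes the result; the serializer handles quoting and escaping.
    public static string ToJson(string text)
    {
        var (content, parts) = Extract(text);
        return JsonSerializer.Serialize(new { Content = content, ListSubString = parts });
    }
}
```

With the sample string from the question, `Extract` yields the content "hello world have a nice day" and the two trimmed parts. Serializing through a library rather than concatenating strings keeps the JSON valid when the text itself contains quotes or backslashes; the default encoder may write some characters, such as the apostrophe, as escape sequences.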
CodePudding user response:
Thanks to everyone who responded. In the end, I decided to do it this way:
var splitedContent = content.Split(new string[] { "alpha", "beta" }, StringSplitOptions.None);
Content = string.Join(" ", splitedContent.Where((_, index) => index % 2 == 0));
Css = splitedContent.Where((_, index) => index % 2 != 0).ToList<string>();
Regex is probably the better and more performant solution, but I don't fully understand how it works yet, so for now this is my solution.
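As a runnable illustration of this split approach, here is a small sketch that also trims each piece, since the pieces keep the spaces that surrounded the markers. It assumes the markers strictly alternate (alpha, beta, alpha, beta, ...); a stray "beta" before the first "alpha" would shift the even/odd positions. The class name `SplitExtractor` and the method `Extract` are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitExtractor
{
    // Hypothetical helper: even-indexed pieces lie outside the markers,
    // odd-indexed pieces lie between an "alpha" and the following "beta".
    public static (string Content, List<string> Parts) Extract(string content)
    {
        var pieces = content.Split(new[] { "alpha", "beta" }, StringSplitOptions.None);

        // Trim and drop empty pieces so that joining with " " does not
        // produce runs of spaces.
        var kept = pieces
            .Where((_, i) => i % 2 == 0)
            .Select(p => p.Trim())
            .Where(p => p.Length > 0);

        var removed = pieces
            .Where((_, i) => i % 2 != 0)
            .Select(p => p.Trim())
            .ToList();

        return (string.Join(" ", kept), removed);
    }
}
```

For the sample string in the question this returns "hello world have a nice day" as the content, plus the two unwanted parts without surrounding spaces.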