I have two different strings that I need to split to get the output shown below. I have tried several solutions, but none of them worked, and I am stuck.
Input
"12.2 - Chemicals, products and including,14.0 - Plastic products ,17.2 - Metal Products ,19.1 - and optical equipment (excluding and other electronic components; semiconductors; bare printed circuit boards; opti, watches)"
Output
"12.2, 14.0, 17.2, 19.1"
The other case.
Input
"DM 0405 - trtststodfm, fhreuu ,RD 3756 - yeyerffydydyd"
Output
"DM 0405, RD 3756"
I can't work out what logic I need to extract these values.
CodePudding user response:
You can solve both tasks with the help of regular expressions and Linq. The only difference is in the patterns:
using System;
using System.Linq;
using System.Text.RegularExpressions;
...
string input = "12.2 - Chemicals, products and including,14.0 - Plastic products ,17.2 - Metal Products ,19.1 - and optical equipment ...";
// One or two digits, a dot, then one digit
string pattern = @"[0-9]{1,2}\.[0-9]";
// Collect the text of every match
string[] result = Regex
    .Matches(input, pattern)
    .Cast<Match>()
    .Select(match => match.Value)
    .ToArray();
Console.Write(string.Join(", ", result));
Note that only the pattern differs in the second case:
using System;
using System.Linq;
using System.Text.RegularExpressions;
...
string input = "DM 0405 - trtststodfm, fhreuu ,RD 3756 - yeyerffydydyd";
// Two capital letters, a whitespace character, then four digits
string pattern = @"[A-Z]{2}\s[0-9]{4}";
// Collect the text of every match
string[] result = Regex
    .Matches(input, pattern)
    .Cast<Match>()
    .Select(match => match.Value)
    .ToArray();
Console.Write(string.Join(", ", result));
Patterns explained:
[A-Z]{2}\s[0-9]{4}
[A-Z]{2} - 2 capital English letters in A..Z range
\s - white space
[0-9]{4} - 4 digits in 0..9 range
[0-9]{1,2}\.[0-9]
[0-9]{1,2} - from 1 up to 2 digits in 0..9 range
\. - dot .
[0-9] - digit in 0..9 range
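Since only the pattern changes between the two cases, the shared part can be pulled into a small helper. This is a minimal sketch rather than part of the original answer: the class name IdExtractor and the method Extract are hypothetical names chosen for illustration, and the two patterns are the ones explained above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class IdExtractor
{
    // Hypothetical helper: returns every match of 'pattern' in 'input', joined with ", ".
    public static string Extract(string input, string pattern)
    {
        var values = Regex.Matches(input, pattern)
            .Cast<Match>()
            .Select(match => match.Value);
        return string.Join(", ", values);
    }

    public static void Main()
    {
        string first = "12.2 - Chemicals, products and including,14.0 - Plastic products ,17.2 - Metal Products ,19.1 - and optical equipment (excluding and other electronic components; semiconductors; bare printed circuit boards; opti, watches)";
        string second = "DM 0405 - trtststodfm, fhreuu ,RD 3756 - yeyerffydydyd";

        Console.WriteLine(Extract(first, @"[0-9]{1,2}\.[0-9]"));   // 12.2, 14.0, 17.2, 19.1
        Console.WriteLine(Extract(second, @"[A-Z]{2}\s[0-9]{4}")); // DM 0405, RD 3756
    }
}
```

If neither pattern matches anything, Extract returns an empty string, so callers may want to check for that case.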
CodePudding user response:
If the input always has the form {id}-{item},{id}-{item}, I would split the string on the ',' character. After that, you can quickly search the resulting collection with Linq and Regex.
However, you need to know which formats the item ID can have. If it always looks like
12.2, 14.0, 17.2, 19.1
and does not change, then a simple Regex like the one below suffices, and you can use it in your Linq query.
new Regex(@"(\d*\.\d*)")
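The split-then-match idea could look roughly like the sketch below. It rests on a few assumptions that are not in the answer itself: the names SplitThenMatch and ExtractIds are hypothetical, the ID is expected at the start of each comma-separated piece, and the pattern uses \d+ instead of \d* because \d* also matches zero digits and would therefore accept a bare dot.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitThenMatch
{
    // Assumption: an ID is digits, a dot, digits, located at the start of a piece
    // (optionally preceded by whitespace).
    private static readonly Regex IdAtStart = new Regex(@"^\s*(\d+\.\d+)");

    public static string ExtractIds(string input)
    {
        var ids = input.Split(',')
            .Select(piece => IdAtStart.Match(piece))   // try each piece
            .Where(match => match.Success)             // drop pieces without an ID
            .Select(match => match.Groups[1].Value);   // keep only the captured ID
        return string.Join(", ", ids);
    }

    public static void Main()
    {
        string input = "12.2 - Chemicals, products and including,14.0 - Plastic products ,17.2 - Metal Products ,19.1 - and optical equipment (excluding and other electronic components; semiconductors; bare printed circuit boards; opti, watches)";
        Console.WriteLine(ExtractIds(input)); // 12.2, 14.0, 17.2, 19.1
    }
}
```

Pieces such as " products and including" contain a comma-separated fragment of a description rather than an ID, so they simply fail the match and are filtered out.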
CodePudding user response:
You could use this regex: ((\w{2} \d{4})|\d+\.\d+)(?=( - ))
Here's a fiddle demonstrating it: https://dotnetfiddle.net/0pTqFQ
using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        string input1 = "12.2 - Chemicals, products and including,14.0 - Plastic products ,17.2 - Metal Products ,19.1 - and optical equipment (excluding and other electronic components; semiconductors; bare printed circuit boards; opti, watches)";
        string input2 = "DM 0405 - trtststodfm, fhreuu ,RD 3756 - yeyerffydydyd";
        // Either "two word characters, space, four digits" or "digits.digits",
        // but only when followed by " - " (checked with a lookahead, not consumed)
        var regex1 = new Regex(@"((\w{2} \d{4})|\d+\.\d+)(?=( - ))");
        var matches1 = regex1.Matches(input1);
        var matches2 = regex1.Matches(input2);
        PrintMatches(matches1);
        PrintMatches(matches2);
    }
    private static void PrintMatches(MatchCollection matches)
    {
        // Each match prints as its matched text
        foreach (var match in matches)
        {
            Console.WriteLine(match);
        }
    }
}
CodePudding user response:
You can use string.Split and string.Join with some Linq power:
var data = new[]{"12.2 - Chemicals, products and including,14.0 - Plastic products ,17.2 - Metal Products ,19.1 - and optical equipment (excluding and other electronic components; semiconductors; bare printed circuit boards; opti, watches)", "DM 0405 - trtststodfm, fhreuu ,RD 3756 - yeyerffydydyd"};
foreach(string s in data)
{
    // Split into pieces, keep the text before the first '-' of each piece,
    // and keep only the pieces that contain at least one digit
    var query = s.Split(',')
        .Select(t => t.Split('-').First().Trim())
        .Where(t => t.Any(char.IsDigit));
    string result = string.Join(", ", query);
    Console.WriteLine(result);
}
Output:
12.2, 14.0, 17.2, 19.1
DM 0405, RD 3756