I have the following problem: I am trying to extract user names from a block of text. I normally use .NET for this and tried it with LINQ and Regex, but I can't get a working solution.
The pattern for the user name is 'jane.doe' (without the quotation marks). Right now I have the following code:
Imports System.Text.RegularExpressions

Public Class Form1
    Dim arrStrSplittet As String()
    ' One or more lowercase letters, a dot, one or more lowercase letters
    Dim strRegEx As String = "[a-z]+[.]{1}[a-z]+"
    Dim regExKriterium As Regex = New Regex(strRegEx)

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        arrStrSplittet = stringSplitten(TextBox1.Text)
        TextBox2.Text = testFiltern(arrStrSplittet)
    End Sub

    Function testFiltern(text As String()) As String
        ' IsMatch only filters: each x is still the complete line
        Dim query = From x In text Where (regExKriterium.IsMatch(x)) Select x
        Dim strBuild As New System.Text.StringBuilder()
        For Each y As String In query
            strBuild.AppendLine(y)
        Next
        MsgBox(strBuild.ToString())
        Return strBuild.ToString()
    End Function

    Public Function stringSplitten(text As String) As String()
        Dim arrX = Split(text, vbNewLine)
        Return arrX
    End Function
End Class
I tested it with the following input:
Type Status Name
Benutzer Mustermann, Max (Server-name\max.mustermann)
Benutzer Normalverbraucher, Otto (Server-name\otto.normalverbraucher)
Benutzer Doe, Jane (Server-name\jane.doe)
Benutzer Svensson, Kalle (Server-name\kalle.svensson)
Benutzer Borg, Joe (Server-name\joe.borg)
With the code above I get the following output:
Benutzer Mustermann, Max (Server-name\max.mustermann)
Benutzer Normalverbraucher, Otto (Server-name\otto.normalverbraucher)
Benutzer Doe, Jane (Server-name\jane.doe)
Benutzer Svensson, Kalle (Server-name\kalle.svensson)
Benutzer Borg, Joe (Server-name\joe.borg)
The output should be:
max.mustermann
otto.normalverbraucher
jane.doe
kalle.svensson
joe.borg
Is it even possible to change the object x inside the LINQ query? Does anyone have another idea how to solve this? Currently I have a working (but pretty ugly) solution using InStr.
I hope someone can help me. Thanks in advance! Misao
CodePudding user response:
This is the closest I can get using .NET, LINQ, and Regex with the given input format while keeping the code readable:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var textBook = @"Type Status Name
Benutzer Mustermann, Max (Server-name\max.mustermann)
Benutzer Normalverbraucher, Otto (Server-name\otto.normalverbraucher)
Benutzer Doe, Jane (Server-name\jane.doe)
Benutzer Svensson, Kalle (Server-name\kalle.svensson)
Benutzer Borg, Joe (Server-name\joe.borg)";

// Split the text into lines
string[] lines = textBook.Split(
    new string[] { Environment.NewLine },
    StringSplitOptions.None
);

// Keep only the lines that contain something shaped like a user name
var removedWordWithoutServerName = lines.Where(x => Regex.IsMatch(x, "[a-z]+[.]{1}[a-z]+")).ToList();
var userNames = new List<string>();

// Split each remaining line at "Server-name\"
foreach (var serverNameAndUserName in removedWordWithoutServerName.Select(serverName => serverName.Split(@"Server-name\")))
{
    // Add only the parts that match the regex and strip the trailing ")"
    userNames.AddRange(from s in serverNameAndUserName
                       where Regex.IsMatch(s, "[a-z]+[.]{1}[a-z]+")
                       select s.Replace(")", ""));
}

var returnUsernames = userNames;
Or a more compact LINQ version, at the cost of readability:
var textBook = @"Type Status Name
Benutzer Mustermann, Max (Server-name\max.mustermann)
Benutzer Normalverbraucher, Otto (Server-name\otto.normalverbraucher)
Benutzer Doe, Jane (Server-name\jane.doe)
Benutzer Svensson, Kalle (Server-name\kalle.svensson)
Benutzer Borg, Joe (Server-name\joe.borg)";

var userNames = new List<string>();

textBook.Split(new string[] { Environment.NewLine }, StringSplitOptions.None)
    .Where(x => Regex.IsMatch(x, "[a-z]+[.]{1}[a-z]+"))
    .Select(x => x.Split(@"Server-name\")).ToList()
    .ForEach(a =>
    {
        userNames.AddRange(from s in a
                           where Regex.IsMatch(s, "[a-z]+[.]{1}[a-z]+")
                           select s.Replace(")", ""));
    });

var returnNames = userNames;
CodePudding user response:
The solutions you have so far are complete overkill. Regex.Match returns captures, which you specify in the pattern with parentheses ():
Dim users = lines
    .Select(Function(l) Regex.Match(l, "\(.+?\\(.+?)\)"))
    .Where(Function(m) m.Success)
    .Select(Function(m) m.Groups(1).Captures(0).Value)
    .ToList()
The regex breaks down as follows:
\(    an escaped opening parenthesis
.+?   any characters, lazy match (the minimum possible)
\\    an escaped backslash
(     begins a capture
.+?   any characters, lazy match
)     ends the capture
\)    an escaped closing parenthesis
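
To see the capture group working end to end, here is a minimal console sketch. It also shows why the original query returned whole lines: IsMatch only filters, whereas Match exposes the captured substring. The module name UserNameDemo and the helper ExtractUserNames are made up for this illustration and do not appear in the thread; the sample lines are taken from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module UserNameDemo
    ' Hypothetical helper (not from the thread): returns the text captured
    ' between the backslash and the closing parenthesis for every line
    ' that contains a "(server\user)" part.
    Function ExtractUserNames(lines As IEnumerable(Of String)) As List(Of String)
        ' VB string literals have no backslash escapes, so the pattern is written as-is.
        Dim pattern As New Regex("\(.+?\\(.+?)\)")
        Dim users As New List(Of String)()
        For Each line As String In lines
            Dim m As Match = pattern.Match(line)
            ' Lines without a match (such as the header row) are skipped.
            If m.Success Then
                ' Group 0 is the whole match; group 1 is the first capture.
                users.Add(m.Groups(1).Value)
            End If
        Next
        Return users
    End Function

    Sub Main()
        Dim lines As String() = {
            "Type Status Name",
            "Benutzer Mustermann, Max (Server-name\max.mustermann)",
            "Benutzer Doe, Jane (Server-name\jane.doe)"
        }
        For Each user As String In ExtractUserNames(lines)
            Console.WriteLine(user)
        Next
    End Sub
End Module
```

With these three sample lines the program prints max.mustermann and jane.doe, one per line; the header row produces no match and is dropped. In the form from the question, the same helper could presumably be called with the result of stringSplitten(TextBox1.Text) and joined with String.Join(vbNewLine, ...) before being assigned to TextBox2.Text.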