c# get non-alphanumeric characters in a string as a string

Time:04-18

The string looks like this:

Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.

I want to extract the non-alphanumeric characters from this string, like this:

,?_,.

But how? I tried this:

var r = new Regex("[^a-zA-Z0-9]");
var m = r.Match(textBox1.Text);
var a = m.Value;

But it returns only a single non-alphanumeric character.
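
For context, Regex.Match stops at the first match it finds, which is why only one character comes back. A minimal sketch of walking through every match with NextMatch, assuming the sample sentence stands in for textBox1.Text (the variable names text, first and all are illustrative, not from the question):

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

var text = "Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.";
var r = new Regex("[^a-zA-Z0-9]");

// Match returns only the first match in the input.
Match first = r.Match(text);
Console.WriteLine(first.Value); // ,

// NextMatch resumes the search where the previous match ended.
var all = new StringBuilder();
for (Match m = first; m.Success; m = m.NextMatch())
{
    all.Append(m.Value);
}

// This pattern also matches spaces, so the result contains them as well.
Console.WriteLine(all.ToString());
```

Because the pattern does not exclude whitespace, the collected string holds the spaces between words in addition to the punctuation.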

CodePudding user response:

Try this:

private static string TakeOutTheTrash(string Source, string Trash)
{
    // Remove everything that matches the Trash pattern.
    return new Regex(Trash).Replace(Source, string.Empty);
}

private static string Output(string Source, string Trash)
{
    return TakeOutTheTrash(Source, Trash);
}

// Characters to strip: letters, digits and whitespace.
var InvertedTrash = @"[a-zA-Z0-9\s]";
var str = Output(textBox1.Text, InvertedTrash);

// ,?_,.

CodePudding user response:

You can try LINQ as an alternative to regular expressions. All we have to do is filter out (with the help of Where) letters, digits and, probably, whitespace, and then Concat the remaining characters into a string:

  using System.Linq;

  ...

  var str = string.Concat(textBox1
    .Text
    .Where(c => !char.IsLetterOrDigit(c) && !char.IsWhiteSpace(c)));

If you insist on regular expressions, you have to combine all matches, e.g.

  using System.Linq;
  using System.Text.RegularExpressions;

  ...

  // Whitespace (\s) is excluded from the match
  var str = string.Concat(Regex
    .Matches(textBox1.Text, @"[^A-Za-z0-9\s]")
    .Cast<Match>()
    .Select(match => match.Value));
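
The two approaches above are not quite equivalent: char.IsLetterOrDigit is Unicode-aware, while the class [A-Za-z0-9] covers ASCII only. A small sketch of the difference, assuming an illustrative input with accented letters (the sample text and variable names are made up for this example); the third variant uses the Unicode categories \p{L} and \p{Nd} to bring the regex in line with the LINQ version:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// "Café, naïve? déjà_vu." written with escapes to avoid source-encoding issues.
var text = "Caf\u00E9, na\u00EFve? d\u00E9j\u00E0_vu.";

// Unicode-aware: accented letters count as letters and are filtered out.
var viaLinq = string.Concat(
    text.Where(c => !char.IsLetterOrDigit(c) && !char.IsWhiteSpace(c)));

// ASCII-only class: accented letters are treated as non-alphanumeric and kept.
var viaAsciiRegex = string.Concat(
    Regex.Matches(text, @"[^A-Za-z0-9\s]")
        .Cast<Match>()
        .Select(m => m.Value));

// Unicode categories: any letter (\p{L}) or decimal digit (\p{Nd}) is excluded.
var viaUnicodeRegex = string.Concat(
    Regex.Matches(text, @"[^\p{L}\p{Nd}\s]")
        .Cast<Match>()
        .Select(m => m.Value));

Console.WriteLine(viaLinq);         // ,?_.
Console.WriteLine(viaAsciiRegex);   // accented letters appear alongside the punctuation
Console.WriteLine(viaUnicodeRegex); // ,?_.
```

Which behaviour is wanted depends on whether accented and other non-ASCII letters should count as alphanumeric for the input at hand.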

CodePudding user response:

Maybe this =>

string input = "Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.";
    
Regex rgx = new Regex(@"[a-zA-Z0-9\s]");
string output = rgx.Replace(input, "");
    
Console.WriteLine(output);

CodePudding user response:

Note that your regex also matches whitespace. If this is not intentional, you can use [^a-zA-Z0-9\s] instead.

You can get the collection of all matches with r.Matches(textBox1.Text); (See https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.matches)

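If you want to use regex for this, you can try:

var text = "Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.";
var regex = new Regex(@"[^a-zA-Z0-9\s]");
// Requires using System.Linq; Cast<Match>() is needed on .NET Framework, where MatchCollection is not generic.
var result = string.Concat(regex.Matches(text).Cast<Match>().Select(match => match.Value));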

CodePudding user response:

You might also make the match a bit more specific and match punctuation characters:

var s = "Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.";
var r = new Regex(@"\p{P}");
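// Cast<Match>() (using System.Linq) keeps this working on .NET Framework as well.
Console.WriteLine(String.Join("", r.Matches(s).Cast<Match>().Select(m => m.Value)));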

Output

,?_,.
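
One caveat with \p{P}: it covers the Unicode punctuation categories only, so symbols such as +, = and $ (categories Sm and Sc) are not matched. A rough sketch of the difference, assuming a made-up input string and a hypothetical helper named Collect that joins all matches:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var s = "a+b=$5, ok!";

// Hypothetical helper: concatenates every match of pattern in input.
string Collect(string input, string pattern) =>
    string.Concat(Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value));

// Punctuation only: the math and currency symbols are skipped.
var punctuationOnly = Collect(s, @"\p{P}");

// Punctuation plus symbols (\p{S}).
var punctuationAndSymbols = Collect(s, @"[\p{P}\p{S}]");

Console.WriteLine(punctuationOnly);       // ,!
Console.WriteLine(punctuationAndSymbols); // +=$,!
```

If symbols should be included in the result, combining both categories as in the second pattern is one option.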