I'm trying to remove "ال" from every Arabic word in a string that starts with "ال".
I'm trying to do this with the code below, but it only deletes "ال" from the first word:
Input: الغيث الغيث الغيث
Output: غيث الغيث الغيث
What I need: غيث غيث غيث
string[] prefixes = { "ال", "اَلْ", "الْ", "اَل" };
foreach (string prefix in prefixes)
{
    if (text.StartsWith(prefix))
    {
        text = text.Substring(prefix.Length);
        break;
    }
}
CodePudding user response:
If you want to work with words rather than just Replace
every occurrence, you can use a regular expression that matches the prefix only at the start of a word, e.g.
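using System.Linq;
using System.Text.RegularExpressions;
...
string input = "الغيث الغيث الغيث";
string[] prefixes = { "ال", "اَلْ", "الْ", "اَل" };
// \b - word boundary - we are looking for prefixes only.
// Longer prefixes go first: alternation takes the first alternative that matches,
// so a leading "ال" would otherwise match "الْغيث" and leave the sukun behind.
string pattern = @"\b(" + string.Join("|", prefixes.OrderByDescending(p => p.Length).Select(Regex.Escape)) + ")";
string output = Regex.Replace(input, pattern, "");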
Let's have a look:
Console.Write(string.Join(Environment.NewLine, input, output));
Output:
الغيث الغيث الغيث
غيث غيث غيث
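For comparison, here is a minimal sketch without regular expressions. The loop in the question tests only the start of the whole string, which is why only the first word changes; the sketch applies the same check to each word instead. The names ArticleStripper, StripWord and StripAll are made up for illustration, and it assumes that words are separated by single spaces.

```csharp
using System;
using System.Linq;

public static class ArticleStripper
{
    // Same prefixes as in the question, tried longest first so that a short
    // prefix cannot match first and leave a stray diacritic behind.
    private static readonly string[] Prefixes =
        new[] { "ال", "اَلْ", "الْ", "اَل" }
            .OrderByDescending(p => p.Length)
            .ToArray();

    // Removes the first matching prefix from a single word.
    private static string StripWord(string word)
    {
        foreach (string prefix in Prefixes)
        {
            // Ordinal: compare char by char, with no culture-specific rules.
            if (word.StartsWith(prefix, StringComparison.Ordinal))
            {
                return word.Substring(prefix.Length);
            }
        }
        return word;
    }

    // Assumption: words are separated by single spaces.
    public static string StripAll(string text) =>
        string.Join(" ", text.Split(' ').Select(StripWord));

    public static void Main()
    {
        Console.WriteLine(StripAll("الغيث الغيث الغيث")); // غيث غيث غيث
    }
}
```

The ordinal comparison is a deliberate choice here: a culture-sensitive StartsWith may treat some characters as ignorable, in which case prefix.Length would no longer be the number of chars to cut.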
CodePudding user response:
Try this regex:
\b\u0627(?:\u0644\u0652?|\u064e\u0644\u0652?)
See regex demo.
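To make the pattern easier to read, it can be spread over several lines with a comment per code point. This is only a sketch of the same pattern using RegexOptions.IgnorePatternWhitespace; the class name ArticlePattern and the field name Prefix are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ArticlePattern
{
    // With IgnorePatternWhitespace, unescaped whitespace in the pattern is
    // ignored and everything from # to the end of the line is a comment.
    public static readonly Regex Prefix = new Regex(@"
        \b                          # word boundary, so only the start of a word matches
        \u0627                      # ALEF
        (?:
            \u0644 \u0652?          # LAM, optionally followed by SUKUN
          |
            \u064e \u0644 \u0652?   # FATHA, LAM, optionally followed by SUKUN
        )",
        RegexOptions.IgnorePatternWhitespace);

    public static void Main()
    {
        // First word starts with ALEF LAM, second with ALEF FATHA LAM SUKUN.
        string input = "\u0627\u0644\u063A\u064A\u062B \u0627\u064E\u0644\u0652\u063A\u064A\u062B";
        // Prints both words with the prefix removed.
        Console.WriteLine(Prefix.Replace(input, ""));
    }
}
```

Because the optional sukun is part of each alternative, the four spellings listed in the question are all covered without having to order the alternatives by length.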
And this is the C# code that does what you want:
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string input = @"الغيث الغيث الغيث الغيث
اَلغيث اَلغيث اَلغيث اَلغيث
اَلْغيث اَلْغيث اَلْغيث اَلْغيث
الْغيث الْغيث الْغيث الْغيث
";
        string pattern = @"\b\u0627(?:\u0644\u0652?|\u064e\u0644\u0652?)";
        string replacement = "";
        string result = Regex.Replace(input, pattern, replacement);

        Console.WriteLine("Original String: {0}", input);
        Console.WriteLine("\n\n-----------------\n\n");
        Console.WriteLine("Replacement String: {0}", result);
    }
}
See C# code demo.