I'm trying to figure out a regular expression to match and replace Yo/yo with Йо/йо or ЬО/ьо based on the rules below.
- Replace Yo/yo (capital or lowercase) with Йо/йо if it is at the beginning of the word or is preceded by one of the letters а, ъ, о, у, е, и.
- If the condition above is not met, replace Yo/yo with ЬО/ьо.
I believe the following regex would work, but there are two issues:
- How do I make it work for both capital (Yo/Йо/Ьо) and lowercase (yo/йо/ьо) letters?
- If the conditions for Йо/йо are not met, how do I replace with ЬО/ьо instead?
(?<=а|ъ|о|у|е|и)йо
Regex.Replace(text, "(?<=а|ъ|о|у|е|и)йо", " "); // ??
Test cases
[Theory]
[InlineData("Асансyoр", "Асансьор")]
[InlineData("Актyoр", "Актьор")]
[InlineData("Шофyoр", "Шофьор")]
[InlineData("Пощалyoн", "Пощальон")]
[InlineData("Trenyor", "Треньор")]
[InlineData("Булyoн", "Бульон")]
[InlineData("Бокyoр", "Бокьор")]
[InlineData("Сервитyoр", "Сервитьор")]
[InlineData("Раyoн", "Район")]
[InlineData("Маyoнеза", "Майонеза")]
[InlineData("Маyoр", "Майор")]
[InlineData("Yoрдан", "Йордан")]
[InlineData("Yoвка", "Йовка")]
public void ShouldReturnReplacedWord_WhenGivenWord(string word, string expected)
CodePudding user response:
This works for me with xUnit in LINQPad 7:
#load "xunit"
void Main()
{
    RunTests(); // Call RunTests() or press Alt+Shift+T to initiate testing.
}
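// Note: inside a character class, `\b` matches a backspace (U+0008), not a word boundary,
// so the start-of-word case is detected in YoMatchEvaluator instead.
static readonly Regex _yoRegex = new Regex( @"(?<pre>[\bayаъоуеи]?)(?<yo>[Yy]o)(?<post>\w+)?", RegexOptions.Compiled );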
static String ReplaceYo( String input )
{
    if( _yoRegex.IsMatch( input ) )
    {
        String replaced = _yoRegex.Replace( input, YoMatchEvaluator );
        return replaced;
    }
    else
    {
        throw new InvalidOperationException( "Input did not match regex." );
        // return input;
    }
}
static String YoMatchEvaluator( Match match )
{
    String pre  = match.Groups["pre" ].Value;
    String yo   = match.Groups["yo"  ].Value;
    String post = match.Groups["post"].Value;

    // True only when the match starts at index 0, i.e. the word begins the input string.
    Boolean isBeginningOfWord = ( pre.Length == 0 ) && ( match.Index == 0 );
    // `pre` is non-empty only when it captured one of the letters in the character class.
    Boolean isPrecededByVowel = ( pre.Length == 1 );
    Boolean isEndOfWord       = ( post.Length == 0 );

    if( isBeginningOfWord || isPrecededByVowel )
    {
        if( yo == "Yo" )
        {
            return pre + "Йо" + post;
        }
        else if( yo == "yo" )
        {
            return pre + "йо" + post;
        }
        else
        {
            throw new InvalidOperationException( "Unexpected \"Yo\" match: \"{0}\"".FmtInv( yo ) );
        }
    }
    else
    {
        if( yo == "Yo" )
        {
            return pre + "ЬО" + post;
        }
        else if( yo == "yo" )
        {
            return pre + "ьо" + post;
        }
        else
        {
            throw new InvalidOperationException( "Unexpected \"Yo\" match: \"{0}\"".FmtInv( yo ) );
        }
    }
}
static class MyExtensions
{
    public static String FmtInv( this String format, params Object?[]? args ) => String.Format( CultureInfo.InvariantCulture, format, args: args );
}
#region private::Tests
[Theory]
[InlineData( 1, "Асансyoр", "Асансьор")]
[InlineData( 2, "Актyoр", "Актьор")]
[InlineData( 3, "Шофyoр", "Шофьор")]
[InlineData( 4, "Пощалyoн", "Пощальон")]
//[InlineData( 5, "Trenyor", "Треньор")]
[InlineData( 5, "Trenyor", "Trenьоr")]
[InlineData( 6, "Булyoн", "Бульон")]
[InlineData( 7, "Бокyoр", "Бокьор")]
[InlineData( 8, "Сервитyoр", "Сервитьор")]
[InlineData( 9, "Раyoн", "Район")]
[InlineData( 10, "Маyoнеза", "Майонеза")]
[InlineData( 11, "Маyoр", "Майор")]
[InlineData( 12, "Yoрдан", "Йордан")]
[InlineData( 13, "Yoвка", "Йовка")]
[InlineData( 14, "Светлyo", "Светльо")]
public void ShouldReturnReplacedWord_WhenGivenWord( Int32 testCase, String word, String expected)
{
    String actual = ReplaceYo( word );
    Assert.Equal( expected: expected, actual: actual );
}
#endregion
Test results:
Input | Expected | Actual | Result |
---|---|---|---|
"Асансyoр" | "Асансьор" | "Асансьор" | Pass |
"Актyoр" | "Актьор" | "Актьор" | Pass |
"Шофyoр" | "Шофьор" | "Шофьор" | Pass |
"Пощалyoн" | "Пощальон" | "Пощальон" | Pass |
"Trenyor" | "Trenьоr" | "Trenьоr" | Pass |
"Булyoн" | "Бульон" | "Бульон" | Pass |
"Бокyoр" | "Бокьор" | "Бокьор" | Pass |
"Сервитyoр" | "Сервитьор" | "Сервитьор" | Pass |
"Раyoн" | "Район" | "Район" | Pass |
"Маyoнеза" | "Майонеза" | "Майонеза" | Pass |
"Маyoр" | "Майор" | "Майор" | Pass |
"Yoрдан" | "Йордан" | "Йордан" | Pass |
"Yoвка" | "Йовка" | "Йовка" | Pass |
"Светлyo" | "Светльо" | "Светльо" | Pass |
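If you would rather stay closer to the lookbehind from the question, here is a rough sketch of how that could look. It is not the code tested above, just one possible variant under a few assumptions: the names `YoReplacer`, `glide` and `soft` are made up for illustration, the lookbehind also accepts the uppercase forms of а, ъ, о, у, е, и, and "beginning of the word" is approximated with `\b` (which treats digits and underscores as word characters). The first alternative only matches when the Йо/йо condition holds, so anything that falls through to the second alternative gets ЬО/ьо, and the case is taken from the matched `Y`/`y`.

```csharp
using System;
using System.Text.RegularExpressions;

static class YoReplacer
{
    // "glide": Yo/yo at a word boundary or right after one of the listed letters.
    // "soft":  any other Yo/yo (tried only when the first alternative fails).
    static readonly Regex YoRegex = new Regex(
        @"(?<glide>(?:\b|(?<=[аъоуеиАЪОУЕИ]))[Yy]o)|(?<soft>[Yy]o)",
        RegexOptions.Compiled );

    public static string ReplaceYo( string input )
    {
        return YoRegex.Replace( input, m =>
        {
            bool isGlide   = m.Groups["glide"].Success;
            bool isCapital = m.Value[0] == 'Y';

            if( isGlide )
            {
                return isCapital ? "Йо" : "йо";
            }

            // Capital form "ЬО" follows the rule as stated in the question.
            return isCapital ? "ЬО" : "ьо";
        } );
    }

    static void Main()
    {
        // Unlike a check on match.Index == 0, this also handles words in the middle of a sentence.
        Console.WriteLine( ReplaceYo( "Yoрдан и Маyoр" ) );
        Console.WriteLine( ReplaceYo( "Асансyoр" ) );
    }
}
```

Because the pattern only consumes the two characters of `Yo`/`yo`, several occurrences in one input are each evaluated on their own, and input without any match is returned unchanged rather than throwing.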