I am trying to convert a string into an SEO-friendly URL. I have written the code below and set the table column collation to utf8_general_ci. It works for English but not for Bengali: for a Bengali string it outputs just a single hyphen (-).
function seo_url( $string, $separator = '-' )
{
    // Matches HTML entities for accented letters, capturing the base letter(s)
    $accents_regex = '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i';
    $special_cases = array( '&' => 'and', "'" => '');
    $string = mb_strtolower( trim( $string ), 'UTF-8' );
    $string = str_replace( array_keys($special_cases), array_values( $special_cases), $string );
    // Turn accented letters into entities, then keep only the base letter
    $string = preg_replace( $accents_regex, '$1', htmlentities( $string, ENT_QUOTES, 'UTF-8' ) );
    // Replace everything except a-z and 0-9 with the separator
    $string = preg_replace("/[^a-z0-9]/u", "$separator", $string);
    // Collapse runs of the separator into a single one
    $string = preg_replace("/[$separator]+/u", "$separator", $string);
    return $string;
}
Is there a solution that handles Unicode text such as Bengali in the same way?
CodePudding user response:
To accept Bengali characters (or those of any other language), you have to change the regex on this line:
$string = preg_replace("/[^a-z0-9]/u", "$separator", $string);
Currently, it means "replace any character that is not a lowercase ASCII letter or a digit with the separator". Swap it for a regex that means "replace any character that is not a letter, combining mark, or number in any language":
$string = preg_replace("/[^\p{L}\p{M}\p{N}]/u", "$separator", $string);
With this change, the function should work for Bengali text. More information is in this related answer: https://stackoverflow.com/a/6005511/15282066
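For illustration, here is a minimal sketch of how the whole function might look with a Unicode-aware character class. The name seo_url_unicode is hypothetical, and this version makes a few assumptions that go beyond the original code: it drops the htmlentities accent step (accented letters already match \p{L}, so they are kept as-is), it also keeps digits via \p{N}, and it trims separators from both ends. It relies on the mbstring extension and on PCRE being built with Unicode property support.

```php
<?php
// Hypothetical variant of seo_url(); a sketch, not a drop-in replacement.
function seo_url_unicode(string $string, string $separator = '-'): string
{
    $string = mb_strtolower(trim($string), 'UTF-8');

    // Same special cases as in the question: "&" becomes "and", apostrophes are dropped.
    $string = str_replace(array('&', "'"), array('and', ''), $string);

    // Replace each run of characters that are not letters (\p{L}),
    // combining marks (\p{M}), or numbers (\p{N}) with a single separator.
    $string = preg_replace('/[^\p{L}\p{M}\p{N}]+/u', $separator, $string);

    // Remove separators left at either end (assumes a single-character separator).
    return trim($string, $separator);
}

echo seo_url_unicode('Hello World & PHP'), "\n";
echo seo_url_unicode('আমার সোনার বাংলা'), "\n";
```

\p{M} matters for Bengali because vowel signs such as া are combining marks rather than letters; without it they would be replaced by the separator and the words would be broken apart.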