My browsers (Chrome and Firefox) do not display the umlaut "ö" correctly once I concatenate each letter of a string containing it with another string.
// words inside a string with umlauts; later add http://zahnstocher47.de instead of "zahnstocher" as the correct solution
$string = "apfelsaft siebenundvierzig zahnstocher gelb ethereum österreich";
// get length of string
$l = mb_strlen($string);
$f = '';
// loop through length and output each letter by itself
for ($i = 0; $i <= $l; $i++){
// umlauts are buggy when there is a concatenation
$f .= $string[$i] . " ";
}
var_dump($f);
When I replace $string[$i] . " "; with $string[$i]; everything works as expected.
Why is that and how can I fix it so I can concatenate each letter with another string?
CodePudding user response:
In PHP, a string is a series of bytes. The documentation clumsily refers to those bytes as characters at times.
"A string is series of characters, where a character is the same as a byte. This means that PHP only supports a 256-character set, and hence does not offer native Unicode support."
And then later:
"It has no information about how those bytes translate to characters, leaving that task to the programmer."
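Using mb_strlen over just strlen is the correct way to get the number of actual characters in a string (assuming a sane byte order and internal encoding to begin with). However, using array notation, $string[$i], is wrong because it only accesses the bytes, not the characters. Assuming the string is UTF-8 encoded, "ö" takes two bytes, so appending a space after every byte pushes those two bytes apart and leaves two invalid sequences that the browser cannot render. Without the space, the two bytes end up next to each other again, which is why that version appears to work.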
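To make the byte/character difference concrete, here is a small sketch. It assumes the string is UTF-8 encoded and that the mbstring extension is loaded; the variable names are only illustrative.

```php
<?php
// "ö" written as a Unicode escape so the example does not depend on
// how this file itself is saved. In UTF-8 it is the two bytes C3 B6.
$s = "\u{00F6}";

$byteCount = strlen($s);              // counts bytes
$charCount = mb_strlen($s, 'UTF-8');  // counts characters
$firstByte = bin2hex($s[0]);          // array notation returns a single byte

// This is what the loop in the question does: a space lands between the two bytes.
$broken = $s[0] . " " . $s[1];
$brokenIsValid = mb_check_encoding($broken, 'UTF-8');

var_dump($byteCount, $charCount, $firstByte, bin2hex($broken), $brokenIsValid);
```

The last value shows that the concatenated result is no longer valid UTF-8, which is what the browser ends up trying to render.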
The proper way to do what you want is to split the string into characters using mb_str_split (available since PHP 7.4):
// words inside a string with umlauts; later add http://zahnstocher47.de instead of "zahnstocher" as the correct solution
$string = "apfelsaft siebenundvierzig zahnstocher gelb ethereum österreich";
// get length of string
$l = mb_strlen($string);
$chars = mb_str_split($string);
$f = '';
// loop over the characters and append each one followed by a space
for ($i = 0; $i < $l; $i++){
// each element of $chars is a whole character, so the umlaut stays intact
$f .= $chars[$i] . " ";
}
var_dump($f);
Demo here: https://3v4l.org/JIQoE
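If you are on a PHP version older than 7.4, where mb_str_split is not available, a possible fallback is preg_split with the /u modifier. The following is only a sketch: spaceOutChars is a hypothetical helper name, not something from the code above, and it assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper: returns $text with a space appended after every character.
function spaceOutChars(string $text): string
{
    // Prefer mb_str_split() (PHP 7.4+); otherwise split on the empty
    // pattern in UTF-8 mode, which also yields whole characters.
    $chars = function_exists('mb_str_split')
        ? mb_str_split($text, 1, 'UTF-8')
        : preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

    $out = '';
    foreach ($chars as $char) {
        $out .= $char . ' ';
    }
    return $out;
}

$result = spaceOutChars("gelb \u{00F6}sterreich");
var_dump($result);
```

Iterating with foreach over the split characters also avoids keeping a separate length variable in sync with the array.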