Browser does not display umlaut correctly when concatenating

Time:11-13

My browsers (Chrome and Firefox) do not display the umlaut "ö" correctly once I concatenate each character of a string containing it with another string.

// words inside string with umlaute, later add http://zahnstocher47.de instead of "zahnstocher" as the correct solution
$string = "apfelsaft siebenundvierzig zahnstocher gelb ethereum österreich";

// get length of string
$l = mb_strlen($string);

$f = '';
// loop through length and output each letter by itself
for ($i = 0; $i <= $l; $i++) {
    // umlaute buggy when there is a concatenation
    $f .= $string[$i] . " ";
}

var_dump($f);

When I replace $string[$i] . " "; with $string[$i]; everything works as expected.

Why is that and how can I fix it so I can concatenate each letter with another string?

User response:

In PHP, a string is a series of bytes. The documentation clumsily refers to those bytes as characters at times:

"A string is a series of characters, where a character is the same as a byte. This means that PHP only supports a 256-character set, and hence does not offer native Unicode support."

And then later:

"It has no information about how those bytes translate to characters, leaving that task to the programmer."

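Using mb_strlen instead of strlen is the correct way to get the number of actual characters in a string (assuming the string is valid in the encoding mb_strlen works with, typically UTF-8). Array notation, however, is wrong here: $string[$i] accesses a single byte, not a character. In UTF-8, "ö" is stored as two bytes. When you append the bytes without a separator, they end up next to each other again and still form a valid "ö". Once you put a space between them, each byte on its own is an invalid sequence, and the browser cannot display it.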
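The byte-versus-character difference can be made visible with a small sketch. It assumes UTF-8 and the mbstring extension; the variable names are only illustrative, and "\u{00F6}" is used instead of a literal "ö" so the result does not depend on how the file is saved.

```php
<?php
// "\u{00F6}" is "ö"; PHP encodes this escape as UTF-8 (bytes 0xC3 0xB6).
$word = "\u{00F6}";

$byteCount = strlen($word);             // counts bytes
$charCount = mb_strlen($word, 'UTF-8'); // counts characters
$firstByte = bin2hex($word[0]);         // array notation returns one byte
$secondByte = bin2hex($word[1]);

// Adjacent bytes reassemble into "ö"; a space in between breaks the sequence.
$joined = $word[0] . $word[1];
$spaced = $word[0] . " " . $word[1];

var_dump($byteCount, $charCount, $firstByte, $secondByte);
var_dump(mb_check_encoding($joined, 'UTF-8')); // valid UTF-8
var_dump(mb_check_encoding($spaced, 'UTF-8')); // no longer valid UTF-8
```

This is why the output looks fine without the added space and breaks with it.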

The proper way to do what you want is to split the string into characters using mb_str_split:

// words inside string with umlaute, later add http://zahnstocher47.de instead of "zahnstocher" as the correct solution
$string = "apfelsaft siebenundvierzig zahnstocher gelb ethereum österreich";

// get length of string
$l = mb_strlen($string);
$chars = mb_str_split($string);

$f = '';
// loop through the characters and output each letter by itself
// ($i < $l, because $chars is indexed from 0 to $l - 1)
for ($i = 0; $i < $l; $i++) {
    // each element of $chars is a whole character, so concatenation is safe
    $f .= $chars[$i] . " ";
}

var_dump($f);

Demo here: https://3v4l.org/JIQoE
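A more compact variant, as a sketch only: mb_str_split requires PHP 7.4 or later, so this version falls back to preg_split with the /u modifier on older versions. The helper name spaceOutChars is hypothetical, and the code assumes the input is valid UTF-8. Unlike the loop above, implode does not leave a trailing space.

```php
<?php
// Hypothetical helper: put a space between the characters of a UTF-8 string.
function spaceOutChars(string $string): string
{
    // mb_str_split() exists from PHP 7.4; preg_split('//u') is a fallback
    // that splits between UTF-8 characters rather than between bytes.
    $chars = function_exists('mb_str_split')
        ? mb_str_split($string, 1, 'UTF-8')
        : preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

    return implode(' ', $chars);
}

$string = "apfelsaft siebenundvierzig zahnstocher gelb ethereum österreich";
var_dump(spaceOutChars($string));
```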
