What is the best way to see if a string contains mostly capital letters?
The string may also contain symbols, spaces, and numbers, so I would still want it to return true in those cases.
For example, I can check whether a string is ALL-CAPS with something like this:
if (strtoupper($str) == $str) { /* its true */ }
But what if we need to determine whether a string is 80% or more ALL-CAPS?
THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn! -> true
The 15 Small Brown Foxes JUMP Into the Burning Barn! -> false
I can loop through all the characters, checking them individually, but that seems a bit wasteful IMHO.
Is there a better way?
CodePudding user response:
$countUppercase = strlen(preg_replace('/[^A-Z]+/', '', $str));
// or: mb_strlen(...)
... and then divide by strlen($str)
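To make the 80% check concrete, here is a minimal sketch of that idea. The function name isMostlyUpper and the default threshold are my own assumptions, not something from the question. One deliberate change: it divides by the number of letters rather than by strlen($str), because the question wants spaces, digits, and symbols not to count against the string (dividing the first example by its full length gives roughly 69%, which would miss the 80% mark).

```php
<?php
// Hypothetical helper; the name and the 0.8 default are assumptions for illustration.
function isMostlyUpper(string $str, float $threshold = 0.8): bool
{
    // Count ASCII upper- and lowercase letters by stripping everything else.
    $upper = strlen(preg_replace('/[^A-Z]+/', '', $str));
    $lower = strlen(preg_replace('/[^a-z]+/', '', $str));
    $letters = $upper + $lower;

    // A string without letters has no case; this sketch treats it as "not mostly caps".
    if ($letters === 0) {
        return false;
    }

    return ($upper / $letters) >= $threshold;
}

var_dump(isMostlyUpper('THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn!')); // bool(true)
var_dump(isMostlyUpper('The 15 Small Brown Foxes JUMP Into the Burning Barn!')); // bool(false)
```

This only looks at ASCII letters. For multibyte text you would probably want the u modifier with \p{Lu} and \p{Ll}, plus mb_strlen, instead.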
CodePudding user response:
A simple for loop should give the best performance:
$numUpper = 0;
for ($i = 0; $i < strlen($str); $i++) {
    if (ctype_upper($str[$i])) {
        $numUpper++;
    }
}
return $numUpper;
CodePudding user response:
I wanted to play around with alternative approaches and came up with something like this. Honestly, I'd probably just write a loop: it is clearer than something like this, and you can keep numbers and other characters from being counted as "lower case".
Still, with a little massaging of your input you might be able to do something along these lines, simply comparing the differences between two strings.
<?php
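var_dump(isMostlyUpperCase('THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn!'));
var_dump(isMostlyUpperCase('The 15 Small Brown Foxes JUMP Into the Burning Barn!'));
function isMostlyUpperCase($strIn) {
    // Lowercase copy of the input, split into single characters.
    $strAsLower = str_split(strtolower($strIn));
    $str = str_split($strIn);
    // Positions that differ from the lowercase copy are the uppercase letters.
    $diff = array_diff_assoc($str, $strAsLower);
    // True when uppercase letters outnumber all other characters.
    return (count($diff) > strlen($strIn) - count($diff));
}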
I've added functional examples based on the other two answers:
<?php
$str1 = 'THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn!';
$str2 = 'The 15 Small Brown Foxes JUMP Into the Burning Barn!';
function pregIsMostlyUpper($str)
{
    $countUppercase = strlen(preg_replace('/[^A-Z]+/', '', $str));
    $countLowercase = strlen(preg_replace('/[^a-z]+/', '', $str));
    return (bool)($countUppercase > $countLowercase);
}
function loopIsMostlyUpper($str)
{
    $numUpper = 0;
    $numLower = 0;
    for ($i = 0; $i < strlen($str); $i++)
    {
        // Only letters are counted; digits, spaces and symbols are skipped.
        if (ctype_alpha($str[$i]))
        {
            if (ctype_upper($str[$i]))
            {
                $numUpper++;
            } else {
                $numLower++;
            }
        }
    }
    return (bool)($numUpper > $numLower);
}
var_dump(loopIsMostlyUpper($str1)); //true
var_dump(loopIsMostlyUpper($str2)); //false
var_dump(pregIsMostlyUpper($str1)); //true
var_dump(pregIsMostlyUpper($str2)); //false
I'll let you time them if you want! Try and beat a sed / awk solution.
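If you do want to time them, something along these lines could serve as a rough harness. Treat it as a sketch: timeIt, the two closures, and the sample input are all made up for illustration, and hrtime() needs PHP 7.3 or later. The numbers will vary by machine and PHP version, so I am not quoting any results.

```php
<?php
// Hypothetical helper: total seconds spent calling $fn($input) $iterations times.
function timeIt(callable $fn, string $input, int $iterations = 10000): float
{
    $start = hrtime(true); // monotonic clock in nanoseconds (PHP 7.3+)
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return (hrtime(true) - $start) / 1e9;
}

// Regex-based uppercase count, as in the preg_replace answer.
$regexCount = function (string $s): int {
    return strlen(preg_replace('/[^A-Z]+/', '', $s));
};

// Loop-based uppercase count, as in the for-loop answer.
$loopCount = function (string $s): int {
    $n = 0;
    $len = strlen($s); // hoisted so strlen() is not re-evaluated on every pass
    for ($i = 0; $i < $len; $i++) {
        if (ctype_upper($s[$i])) {
            $n++;
        }
    }
    return $n;
};

// Assumed sample input: the question's first example repeated 20 times.
$sample = str_repeat('THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn! ', 20);

printf("regex: %.4f s\n", timeIt($regexCount, $sample));
printf("loop:  %.4f s\n", timeIt($loopCount, $sample));
```

Both closures count the same thing, so any difference you see is down to the technique rather than the result.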