I have a PHP array like this:
array('A','B','C','F','a','b','c','d','e','f','h','i','j','k','l','o','q','!','?','0','2','3','4','9')
How can I convert this to:
array('A-C','F','a-f','h-l','o','q','!','?','0','2-4','9')
There will only be standard ASCII characters - nothing fancy. Every character can only appear once.
CodePudding user response:
This is tricky because you have to keep track of whether the char you are up to is in a "range" (e.g. A-Z), and if so whether it is the next sequential char in that range or whether it is a new character in the same range (which would result in a new range being started).
Here is code that produces the desired result, and also prints the tests being done at every step to determine whether to output a single character (e.g. '?') or a range of characters (e.g. 'a-f').
// Starting array:
$a = array('A','B','C','F','a','b','c','d','e','f','h','i','j','k','l','o','q','!','?','0','2','3','4','9');
// Determine range that $ch is in and return an array with the start and end chars of that range, or an array of empty strings if no range.
function r( $ch ) {
if (
$ch >= 'A' && $ch <= 'Z'
) {
return ['A', 'Z'];
}
if (
$ch >= 'a' && $ch <= 'z'
) {
return ['a', 'z'];
}
if (
$ch >= '0' && $ch <= '9'
) {
return ['0', '9'];
}
return ['', ''];
}
$b = []; // Destination array
// Determine starting range (if any):
$current_range = $ch_range = r($a[0]);
if ( strlen( $current_range[0] ) ) {
$range_start = $a[0];
echo "Current range is " . $current_range[0] . '-' . $current_range[1] . " (starting at $range_start)\n";
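} else {
$range_start = ''; // No range is open yet
$b[] = $a[0]; // A first character outside any range is added as-is
echo "Not currently in a range, so adding [" . $a[0] . "]\n";
}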
echo "Starting loop\n";
for ( $i=1; $i < sizeof($a); $i++ ) {
$ch = $a[$i];
echo "Got $ch\n";
$ch_range = r($ch); // Range that current char is in
// Are we currently in a range?
if ( strlen( $current_range[0] ) ) {
// Are we still in a range?
if ( strlen( $ch_range[0] ) ) { // Yes
// Are we currently in the same range?
if ( $ch_range[0] === $current_range[0] ) {
// Are we at the next sequential char?
if (
ord($a[$i-1]) + 1 == ord($ch) // We are at the next char in the range
) {
echo "We are at the next character in the range " . $current_range[0] . '-' . $current_range[1] . " (starting at $range_start) so doing nothing\n";
} else {
echo "We are at a non-sequential character in the current range so adding [" . $range_start . '-' . $a[$i-1] . "] and starting a new range " . $ch_range[0] . '-' . $ch_range[1] . " (starting at $ch)\n";
$b[] = ($range_start == $a[$i-1]) ? $range_start : ( $range_start . '-' . $a[$i-1] );
$current_range = $ch_range;
$range_start = $ch;
}
} else {
echo "Detected switch from range " . $current_range[0] . '-' . $current_range[1] . " to " . $ch_range[0] . '-' . $ch_range[1] . " so adding [" . $range_start . '-' . $a[$i-1] . "]\n";
$b[] = ($range_start == $a[$i-1]) ? $range_start : ( $range_start . '-' . $a[$i-1] );
$current_range = $ch_range;
$range_start = $ch;
}
} else {
// We were in a range, but no longer
echo "We were in range " . $current_range[0] . '-' . $current_range[1] . " (starting with $range_start) but no longer, so adding range [$range_start-" . $a[$i-1] . "]\n";
$b[] = ($range_start == $a[$i-1]) ? $range_start : ( $range_start . '-' . $a[$i-1] );
$current_range = $ch_range;
$range_start = $ch;
// If not starting a range, add this ch:
if ( ! strlen( $ch_range[0] ) ) {
echo "Not starting a new range with '$ch' so adding [$ch]\n";
$b[] = $ch;
$range_start = '';
}
}
} else { // We are not in a range
// Are we starting a new range?
if ( strlen( $ch_range[0] ) ) {
echo "We are not in a range but starting a new one [" . $ch_range[0] . '-' . $ch_range[1] . "]\n";
$current_range = $ch_range;
$range_start = $ch;
} else {
echo "We are not in a range and not starting one, so adding $ch\n";
$b[] = $ch;
$current_range = $ch_range; // ['', ''], so $current_range[0] stays defined for the next iteration
$range_start = '';
}
}
}
// Did we finish in the middle of a range?
if ( strlen( $range_start ) ) {
echo "Finished in the middle of a range so adding [$range_start-" . $a[$i-1] . "]\n";
$b[] = ($range_start == $a[$i-1]) ? $range_start : ( $range_start . '-' . $a[$i-1] );
}
print_r($b);
// Intended output: array('A-C','F','a-f','h-l','o','q','!','?','0','2-4','9')
There is one line of code which is repeated four times, and which could probably be refactored into a separate function:
$b[] = ($range_start == $a[$i-1]) ? $range_start : ( $range_start . '-' . $a[$i-1] );
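As a rough sketch of that refactoring, the repeated line could be moved into a small helper. The name `format_range` is made up for illustration and is not part of the code above:

```php
<?php
// Hypothetical helper (not in the original answer): formats one finished run
// as either a single character or a "start-end" range.
function format_range(string $start, string $end): string
{
    return ($start === $end) ? $start : ($start . '-' . $end);
}

// Each of the four repeated lines could then be written as:
//   $b[] = format_range($range_start, $a[$i-1]);
echo format_range('A', 'C'), "\n"; // A-C
echo format_range('F', 'F'), "\n"; // F
```

This only removes the duplication; the surrounding branching logic stays exactly as it is.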
CodePudding user response:
Here is a technique that relies on ord() and the ASCII integer values of the characters to detect consecutive runs. It does not treat letters, digits, and punctuation as separate classes, but it suffices for your sample data. If you need to tweak the consecutive-check logic, the rest of the algorithm can stay the same.
In a nutshell: if the current range is empty, start one with the current character; if the character immediately follows the last character of the range order-wise, extend the range; otherwise push the imploded range and start again.
Code:
$chars = ['A','B','C','F','a','b','c','d','e','f','h','i','j','k','l','o','q','!','?','0','2','3','4','9'];
$result = [];
$range = [null];
foreach ($chars as $char) {
if ($range[0] === null) {
$range = [$char];
} elseif (ord($range[1] ?? $range[0]) === ord($char) - 1) {
$range[1] = $char;
} else {
$result[] = implode('-', $range);
$range = [$char];
}
}
if ($range[0] !== null) {
$result[] = implode('-', $range);
}
var_export($result);
Output:
array (
0 => 'A-C',
1 => 'F',
2 => 'a-f',
3 => 'h-l',
4 => 'o',
5 => 'q',
6 => '!',
7 => '?',
8 => '0',
9 => '2-4',
10 => '9',
)
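One possible tweak to the consecutive check, in case it matters for real data: the plain ord() comparison above would also join neighbours from different classes, such as '9' and ':' (ASCII 57 and 58). The sketch below assumes ranges should only form within uppercase letters, lowercase letters, or digits. The function names `char_class`, `is_consecutive`, and `collapse_ranges` are invented for this example.

```php
<?php
// Hypothetical helper: label the class a character belongs to,
// or null if the character should never be part of a range.
function char_class(string $ch): ?string
{
    if (ctype_upper($ch)) return 'upper';
    if (ctype_lower($ch)) return 'lower';
    if (ctype_digit($ch)) return 'digit';
    return null;
}

// True only when $next directly follows $prev AND both share a class.
function is_consecutive(string $prev, string $next): bool
{
    $class = char_class($prev);
    return $class !== null
        && $class === char_class($next)
        && ord($prev) + 1 === ord($next);
}

// Same loop shape as the answer above, with the check swapped out.
function collapse_ranges(array $chars): array
{
    $result = [];
    $range = []; // holds [start] or [start, end] of the run being built
    foreach ($chars as $char) {
        if ($range && is_consecutive(end($range), $char)) {
            $range[1] = $char;                  // extend the current run
        } else {
            if ($range) {
                $result[] = implode('-', $range); // flush the finished run
            }
            $range = [$char];                   // start a new run
        }
    }
    if ($range) {
        $result[] = implode('-', $range);       // flush the last run
    }
    return $result;
}

var_export(collapse_ranges(['A','B','C','F','9',':']));
```

For the sample array in the question this gives the same result as the code above; the difference only shows up for inputs like '9' followed by ':'.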