I have a text like this:
81. Text1
82. Text2
82.1. Some text3
82.2. Some long text goes there in two or more lines... Some more text goes here...
83. Text4
84. Text5
There is some random spacing between the lines. I'm trying to extract the number and the text of every item separately. For example, my output for 82.2. should be "82.2." and "Some long text goes there in two or more lines... Some more text goes here...".
I've already tried this:
$exp = explode(". ", $text);
foreach ($exp as $newline) {
    echo explode(". ", $newline)[0];
}
But that's probably not the best idea, because sometimes there's a ". " at the end of a sentence.
CodePudding user response:
You're on the right track using explode:
$output = [];
$input = '81. Text1
82. Text2
82.1. Some text3
82.2. Some long text goes there in two or more lines... Some more text goes here...
83. Text4
84. Text5';
// split lines, trim any whitespace on each line and remove any that are empty
// PHP_EOL may need to be changed to how newlines are encoded in the text file
$lines = array_filter(array_map('trim', explode(PHP_EOL, $input)));
foreach ($lines as $line) {
    $split = explode('. ', $line);
    // The number will be the first element
    $number = trim(array_shift($split));
    // Join the rest of the elements together, restoring the '. ' separators
    // that explode() removed from inside the text
    $text = implode('. ', $split);
    $output[] = [
        'number' => $number,
        'text' => $text
    ];
}
var_dump($output);
This yields:
array(6) {
  [0]=>
  array(2) {
    ["number"]=>
    string(2) "81"
    ["text"]=>
    string(5) "Text1"
  }
  [1]=>
  array(2) {
    ["number"]=>
    string(2) "82"
    ["text"]=>
    string(5) "Text2"
  }
  [2]=>
  array(2) {
    ["number"]=>
    string(4) "82.1"
    ["text"]=>
    string(10) "Some text3"
  }
  [3]=>
  array(2) {
    ["number"]=>
    string(4) "82.2"
    ["text"]=>
    string(77) "Some long text goes there in two or more lines... Some more text goes here..."
  }
  [4]=>
  array(2) {
    ["number"]=>
    string(2) "83"
    ["text"]=>
    string(5) "Text4"
  }
  [5]=>
  array(2) {
    ["number"]=>
    string(2) "84"
    ["text"]=>
    string(5) "Text5"
  }
}
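The comment about PHP_EOL matters when the text was produced on a different platform than the one running the script. As a minimal sketch, assuming the input may mix "\n", "\r\n" and "\r" line endings, preg_split with the \R escape splits on any of them, and a limit of 2 on explode keeps every later ". " inside the text. The function name splitNumberedLines and the sample input are hypothetical, not part of the answer above.

```php
<?php
// Hypothetical helper: split numbered lines into number/text pairs,
// regardless of how newlines are encoded in the input.
function splitNumberedLines(string $input): array
{
    $output = [];
    // \R matches \n, \r\n and \r; PREG_SPLIT_NO_EMPTY drops empty pieces
    $lines = preg_split('/\R/', $input, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // the line contained only whitespace
        }
        // Limit of 2: only the first ". " separates the number from the text
        $parts = explode('. ', $line, 2);
        $output[] = [
            'number' => $parts[0],
            'text'   => $parts[1] ?? '',
        ];
    }
    return $output;
}

$rows = splitNumberedLines("81. Text1\r\n\r\n82.2. Some text... More text...\n83. Text4");
print_r($rows);
```

Because the split happens only once per line, there is nothing to join back together afterwards.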
CodePudding user response:
You can use the limit parameter of the explode function to only get two results:
$str = <<<EOD
81. Text1
82. Text2
82.1. Some text3
82.2. Some long text goes there in two or more lines... Some more text goes here...
83. Text4
84. Text5
EOD;
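foreach (explode("\n", $str) as $line) {
    // Trim first so leading whitespace or a trailing "\r" cannot end up in the prefix or text
    $line = trim($line);
    if ($line === "") {
        continue;
    }
    // A limit of 2 splits only at the first space
    list($prefix, $text) = explode(" ", $line, 2);
    echo $prefix . " -> " . $text . "\n";
}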
This prints:
81. -> Text1
82. -> Text2
82.1. -> Some text3
82.2. -> Some long text goes there in two or more lines... Some more text goes here...
83. -> Text4
84. -> Text5
CodePudding user response:
You can use a multiline regex to split the text in just two lines of code.
- Match all digits and periods from the start of the line and capture them in a group: ^([\d.]+)
- Match the rest of the line in another group: (.*)$
- Use preg_match_all to match all of those lines, passing an array as the third parameter to store the matches (say, $matches).
- Use array_map to pair up captured groups 1 and 2.
Snippet:
<?php
// $str holds the input text
preg_match_all('/^([\d.]+)(.*)$/m', $str, $matches);
// trim() drops the space left between the number and the text
$result = array_map(fn($v1, $v2) => [$v1, trim($v2)], $matches[1], $matches[2]);
print_r($result);
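As a sketch of the regex approach end to end, assuming lines may carry stray leading whitespace and that the number should keep its trailing period as in the question: this variant adds \h* and \h+ (horizontal whitespace) around the number group, so the text group never starts with a space. The function name parseNumberedText and the sample input are illustrative only.

```php
<?php
// Hypothetical helper: return [number, text] pairs for each numbered line.
function parseNumberedText(string $str): array
{
    // ^\h*      optional spaces/tabs at the start of a line (m flag)
    // ([\d.]+)  the number, e.g. "82.2."
    // \h+       spaces/tabs between the number and the text
    // (.*)$     the rest of the line
    preg_match_all('/^\h*([\d.]+)\h+(.*)$/m', $str, $matches);
    // Pair up group 1 and group 2; trim() removes a trailing "\r" or spaces
    return array_map(
        fn($number, $text) => [$number, trim($text)],
        $matches[1],
        $matches[2]
    );
}

$result = parseNumberedText("81. Text1\n\n  82.2. Long text... More text...\n83. Text4");
print_r($result);
```

Empty lines never match because the pattern requires at least one digit or period, so no separate filtering step is needed.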