Extracting data from a template string

Time:09-13

I need to extract data from a string using a template. An example should make it clearer:

// What I have
$utterance = 'This is a brown bear with 7 kids';
$template = 'This is a {color} bear with {kids} kids';

// What I want
[
  "color" => "brown",
  "kids" => "7",
]

I have a very ugly solution to this:

$regex = '/' . preg_replace('/{.*?}/' , '.*?', $template) . '/i';

foreach(preg_split('/\{.*?\}/', $template) as $part) {      
  $utterance = str_replace($part, Str::startsWith($template, $part) || Str::endsWith($template, $part)  ? '' : '|', $utterance);
}

preg_match_all('/{(.*?)}/', $template, $variables);

$values = explode('|', $utterance);
$variables = $variables[1];

$result = array_combine($variables, $values);

Does anyone have a nicer way of doing this? Seems like an ugly approach...

CodePudding user response:

I would go with something like this: first explode the template into words, then loop over that array and find the words that start with {. When you find one, you know its index, so you can look up the same index in the exploded utterance. Note that this only works when each placeholder value is a single word.

<?php

$utterance = 'This is a brown bear with 7 kids';
$template = 'This is a {color} bear with {kids} kids';

$utter_words = explode(" ",$utterance);
$temp_words = explode(" ",$template);

$output = array();
foreach($temp_words as $i=>$temp_word){
    if(strpos($temp_word, "{")===0){
        $key = str_replace(array("{","}"), "", $temp_word);
        $output[$key] = $utter_words[$i];
    }
}

var_dump($output);

output

array(2) {
  ["color"]=>
  string(5) "brown"
  ["kids"]=>
  string(1) "7"
}

CodePudding user response:

Try this code, which writes the template directly as a regular expression with one capture group per value:

$utterance = 'This is a brown bear with 7 kids';
$template = '/This is a (.*?) bear with (.*?) kids/';

preg_match($template, $utterance, $m);

print_r($m);

echo "color = ".$m[1].PHP_EOL;
echo "kids = ".$m[2].PHP_EOL;

output

Array
(
    [0] => This is a brown bear with 7 kids
    [1] => brown
    [2] => 7
)
color = brown
kids = 7

CodePudding user response:

You can convert your template into a regular expression that uses "named captures", which look like (?<name>pattern).

In your example, the template 'This is a {color} bear with {kids} kids' becomes '/This is a (?<color>.*?) bear with (?<kids>.*?) kids/'.

To generate that, you use a different regular expression to find all the placeholders - /\{(.*?)\}/ - and a replacement string using the back-reference \1 - (?<\1>.*?)

Then you match the final regex against the utterance, and the named matches will show up in the by-reference matches array:

$utterance = 'This is a brown bear with 7 kids';
$template = 'This is a {color} bear with {kids} kids';

$templateRegex = '/' . preg_replace('/\{(.*?)\}/', '(?<\1>.*?)', $template) . '/';

$matches = [];
preg_match($templateRegex, $utterance, $matches);

var_dump($matches);

Gives:

array(5) {
  [0]=>
  string(32) "This is a brown bear with 7 kids"
  ["color"]=>
  string(5) "brown"
  [1]=>
  string(5) "brown"
  ["kids"]=>
  string(1) "7"
  [2]=>
  string(1) "7"
}

So $matches['color'] is 'brown'. You can filter out the numeric offsets which you don't want, and you'll just have the key-value list you wanted.
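
One possible way to drop the numeric offsets is to filter the array by key type. This is only a sketch: the $matches array below is hard-coded to mirror the dump above, and $result is a name chosen for illustration.

```php
<?php

// Same shape as the var_dump output above: named keys plus numeric offsets.
$matches = [
    0       => 'This is a brown bear with 7 kids',
    'color' => 'brown',
    1       => 'brown',
    'kids'  => '7',
    2       => '7',
];

// ARRAY_FILTER_USE_KEY passes each key to the callback; numeric offsets are
// integers, so is_string() keeps only the named captures.
$result = array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
```

After this, $result holds only the "color" and "kids" entries.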

Note that you may need to do some extra preparation on your string using preg_quote to make sure everything other than the placeholders is matched literally.
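
A rough sketch of that preparation follows. It assumes placeholder names consist of letters, digits and underscores. The function name extractFromTemplate is hypothetical, not part of the answer above. The sketch splits the template so that only the literal parts go through preg_quote. It also anchors the pattern with ^ and $, because without anchors a lazy .*? at the very end of the template can match an empty string.

```php
<?php

// Hypothetical helper: returns the placeholder values keyed by name,
// or null when the utterance does not fit the template.
function extractFromTemplate(string $template, string $utterance): ?array
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates between literal
    // text (even indexes) and captured placeholder names (odd indexes).
    $parts = preg_split(
        '/\{([A-Za-z_]\w*)\}/',
        $template,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $regex = '';
    foreach ($parts as $i => $part) {
        $regex .= ($i % 2 === 0)
            ? preg_quote($part, '/')      // literal text, matched as-is
            : '(?<' . $part . '>.*?)';    // placeholder becomes a named capture
    }

    // Anchor both ends so the whole utterance must fit the template.
    if (preg_match('/^' . $regex . '$/', $utterance, $matches) !== 1) {
        return null;
    }

    // Keep only the named (string) keys, dropping the numeric offsets.
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

$result = extractFromTemplate(
    'This is a {color} bear with {kids} kids',
    'This is a brown bear with 7 kids'
);
```

Because the literal parts are quoted, a template such as 'Cost (USD): {price}?' should treat the parentheses and the question mark as plain characters rather than regex syntax.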
