extracting data from a string in PHP

Time:02-12

I am a complete beginner in PHP. I have several hundred product descriptions, for example:

  • Superpremium cat food light chicken 1 kg

  • Superpremium cat food light chicken 10kg

  • Superpremium cat food beef 2x10 kg

  • Superpremium cat food beef 2 x 3,8 kg

  • Superpremium cat food beef 2 x 2kg

  • Superpremium cat food beef 42 x 85g

I need to extract the flavor, the quantity, and the weight from the product name, like this:

For example:

From this: Superpremium cat food beef 42 x 85g

To this:

  • Flavor: beef
  • Number of pieces: 42
  • Weight: 85
  • KG/g: g

I tried to use an array, but it doesn't work for me at all.

The numbers are inconsistent: sometimes the name says 10kg, sometimes 10 kg with a space, as you can see in the examples above.

I have a list of product names (brands), a list of flavors, etc., so my idea was to use an array: if the string contains a flavor, return "Flavor: beef", and so on. I spent hours looking at how to use the functions, but as a beginner I have no idea how to make it work.

Thanks for help!

<?php
function flavor($product) {

$arr = array('turkey', 'chicken', 'fish', 'salmon', 'rabbit', 'beef', 'cod');
foreach ($arr as $product) {
    return $product;
    
}   
}

?>

CodePudding user response:

This is what a solution could look like:

<?php

$description = <<<EOD
    Superpremium cat food light chicken 1 kg

    Superpremium cat food light chicken 10kg

    Superpremium cat food beef 2x10 kg

    Superpremium cat food beef 2 x 3,8 kg

    Superpremium cat food beef 2 x 2kg

    Superpremium cat food beef 42 x 85g
EOD;

// Groups: 1 = flavor, 2 = number of pieces (optional), 3 = weight, 4 = unit
preg_match_all('/.*food\s+(\D*?)\s*(?:(\d+)\s*x\s*)?([\d,]+)\s*(\w+)/im', $description, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    echo 'Flavor: ' . $match[1] . PHP_EOL;
    echo 'Number of pieces: ' . ($match[2] ?: 1) . PHP_EOL;
    echo 'Weight: ' . $match[3] . PHP_EOL;
    echo 'kg/g: ' . $match[4] . PHP_EOL . PHP_EOL;
}

Explanations:

  • <<<EOD is PHP's heredoc syntax for strings; it lets you keep multiline text in a single variable
  • preg_match_all searches the entire description text for the pattern and saves each parenthesized group into the $matches variable
  • im at the end of the regex pattern are flags: i means case-insensitive, m means multiline
  • the PREG_SET_ORDER constant orders the results per match, so each $match holds all the groups found for one product line
  • (?:(\d+)\s*x\s*)? makes the "N x" part optional, so a name like "10kg" still puts 10 into the weight group instead of the pieces group
  • the PHP_EOL constant is the end-of-line sequence of the current platform ("\n" on Linux); it prints the following text on a new line
  • ($match[2] ?: 1) is short for ($match[2] ? $match[2] : 1): if $match[2] has a value we use it, otherwise we use 1 (one piece when the text gives no count)
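
Since a list of known flavors already exists, the array lookup described in the question can be combined with a small regular expression for the numbers. The following is only a sketch under some assumptions: the function name parseProduct, the keys of the returned array, and the conversion of the decimal comma to a float are made up for illustration and are not part of the answer above.

```php
<?php
// Hypothetical helper: returns the parsed parts of one product name,
// or null when no "weight + unit" is found at the end of the name.
function parseProduct(string $name, array $flavors): ?array
{
    // Flavor: the first entry of the known list that occurs in the name.
    // This is a plain substring check, so 'cod' would also match 'code'.
    $flavor = null;
    foreach ($flavors as $candidate) {
        if (stripos($name, $candidate) !== false) {
            $flavor = $candidate;
            break;
        }
    }

    // Optional "N x", then the weight (comma as decimal separator), then kg or g.
    if (!preg_match('/(?:(\d+)\s*x\s*)?(\d+(?:,\d+)?)\s*(kg|g)\s*$/i', $name, $m)) {
        return null;
    }

    return [
        'flavor' => $flavor,
        'pieces' => $m[1] !== '' ? (int) $m[1] : 1,          // default: one piece
        'weight' => (float) str_replace(',', '.', $m[2]),    // "3,8" becomes 3.8
        'unit'   => strtolower($m[3]),
    ];
}

$flavors = ['turkey', 'chicken', 'fish', 'salmon', 'rabbit', 'beef', 'cod'];
print_r(parseProduct('Superpremium cat food beef 2 x 3,8 kg', $flavors));
print_r(parseProduct('Superpremium cat food light chicken 10kg', $flavors));
```

Looking the flavor up in the list avoids picking up extra words such as "light", at the cost of only recognizing flavors that are in the list.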

CodePudding user response:

Regular expressions are probably the best way to parse this.

If you have T-Regx installed, you can do:

$pattern = pattern('^(.*?) (?:(\d+) ?x ?)?((?:\d+)(?:,\d+)?) ?(k?g)$', 'm');
$match = $pattern->match('Superpremium cat food beef 2 x 3,8 kg');

$match->first(function (Detail $detail) {
    $text = $detail->get(1);
    $quantity = $detail->group(2)->orReturn(null);
    $amount = $detail->get(3);
    $unit = $detail->get(4);

    var_dump($text, $quantity, $amount, $unit);
});
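
Without the library, roughly the same thing can be sketched with PHP's built-in preg_match. This is an illustration that mirrors the T-Regx call above, not a drop-in replacement. The PREG_UNMATCHED_AS_NULL flag (available since PHP 7.2) is used here so that a missing quantity comes back as null, similar to orReturn(null).

```php
<?php
// Pattern for names shaped like "text [N x] weight unit", one name per line.
$pattern = '/^(.*?) (?:(\d+) ?x ?)?(\d+(?:,\d+)?) ?(k?g)$/m';
$subject = 'Superpremium cat food beef 2 x 3,8 kg';

if (preg_match($pattern, $subject, $m, PREG_UNMATCHED_AS_NULL)) {
    // $m[0] is the whole match; skip it and unpack the four groups.
    [, $text, $quantity, $amount, $unit] = $m;
    var_dump($text, $quantity, $amount, $unit);
}
```

For a name without a count, such as "Superpremium cat food light chicken 10kg", $quantity is null and can be defaulted to 1 by the caller.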