My string may look like this:
@ *lorem.jpg,,, ip sum.jpg,dolor ..jpg,-/ ?
It is in fact a dirty CSV string holding the names of JPG images.
I need to remove any non-alphanumeric characters from both ends of the string.
Then, inside the resulting string, remove the same characters, except commas and dots.
Finally, replace any duplicated commas and dots with single ones.
So the final result should be:
lorem.jpg,ipsum.jpg,dolor.jpg
I first tried to remove all whitespace, anywhere in the string:
$str = str_replace(" ", "", $str);
Then I used various forms of the trim functions, but that is tedious and takes a lot of code.
An additional problem is that the duplicated commas and dots may come in runs of two or more, for example .. or ,,,,
Is there a way to solve this using regex, please?
CodePudding user response:
Here are the steps, modeled on your description:
Step 1
- "remove any non-alphanum chars from both sides of the string"
- translated: remove leading and trailing runs of [^a-zA-Z0-9] characters
- regex: replace
^[^a-zA-Z0-9]*(.*?)[^a-zA-Z0-9]*$
with $1
Step 2
- "inside the resulting string - remove the same - except commas and dots"
- translated: remove any [^a-zA-Z0-9.,]
- regex: replace
[^a-zA-Z0-9.,]
with an empty string
Step 3
- "remove duplicates commas and dots - if any - replace them with single ones"
- translated: replace each run of consecutive dots or commas with a single instance
- regex: replace
(\.{2,})
with .
- regex: replace
(,{2,})
with ,
PHP Demo:
<?php
$subject = " @ *lorem.jpg,,, ip sum.jpg,dolor ..jpg,-/ ?";
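// Step 1: strip non-alphanumeric characters from both ends
$firstStep = preg_replace('/^[^a-zA-Z0-9]*(.*?)[^a-zA-Z0-9]*$/', '$1', $subject);
// Step 2: remove everything except alphanumerics, dots and commas
$secondStep = preg_replace('/[^a-zA-Z0-9.,]/', '', $firstStep);
// Step 3: collapse runs of dots, then runs of commas
$thirdStepA = preg_replace('/\.{2,}/', '.', $secondStep);
$thirdStepB = preg_replace('/,{2,}/', ',', $thirdStepA);
echo $thirdStepB; // lorem.jpg,ipsum.jpg,dolor.jpg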
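The same steps can also be sketched as a single preg_replace call, since the function accepts arrays of patterns and replacements and applies the pairs in order. The helper name cleanImageList is hypothetical, and step 1 is written here as an alternation that matches either end rather than with a capture group; treat it as one possible variant, not the only way to do it.

```php
<?php
// Hypothetical helper; the name cleanImageList is not from the answer above.
function cleanImageList(string $subject): string
{
    $patterns = [
        '/^[^a-zA-Z0-9]+|[^a-zA-Z0-9]+$/', // step 1: trim non-alphanumerics from both ends
        '/[^a-zA-Z0-9.,]/',                // step 2: keep only alphanumerics, dots, commas
        '/\.{2,}/',                        // step 3a: collapse runs of dots
        '/,{2,}/',                         // step 3b: collapse runs of commas
    ];
    $replacements = ['', '', '.', ','];

    // Each pattern/replacement pair is applied to the result of the previous one.
    // preg_replace returns null on error, so fall back to an empty string.
    return preg_replace($patterns, $replacements, $subject) ?? '';
}

echo cleanImageList(" @ *lorem.jpg,,, ip sum.jpg,dolor ..jpg,-/ ?"), PHP_EOL;
// prints: lorem.jpg,ipsum.jpg,dolor.jpg
```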
CodePudding user response:
Look at
https://www.php.net/manual/en/function.preg-replace.php
It replaces anything inside a string based on a pattern. \s represents any whitespace character, but take care with NBSP (non-breaking space; \h matches it).
Example 4
$str = preg_replace('/\s\s+/', '', $str);
It will be something like that.
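To illustrate the NBSP caveat, here is a minimal sketch that strips ordinary whitespace and non-breaking spaces in one pass. It assumes the input is UTF-8, and it names the NBSP code point explicitly instead of relying on how \s or \h behave in a given PCRE configuration. The function name stripAllWhitespace is made up for this example.

```php
<?php
// Hypothetical helper; stripAllWhitespace is an illustrative name.
function stripAllWhitespace(string $str): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8, so
    // \x{A0} refers to the non-breaking space code point, not a raw byte.
    // With /u, preg_replace returns null on invalid UTF-8, hence the fallback.
    return preg_replace('/[\s\x{A0}]+/u', '', $str) ?? $str;
}

$dirty = "ip\u{00A0}sum .jpg"; // "\u{00A0}" is a UTF-8 encoded NBSP (PHP 7+ syntax)
echo stripAllWhitespace($dirty), PHP_EOL; // prints: ipsum.jpg
```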
CodePudding user response:
Can you try this:
$string = ' @ *lorem.jpg,,,, ip sum.jpg,dolor .jpg,-/ ?';
// this keeps only alphanumerics, commas and dots
$result = preg_replace("/[^A-Za-z0-9,.]/", '', $string);
// this collapses duplicated commas and dots into single ones
$result = preg_replace('/,+/', ',', $result);
$result = preg_replace('/\.+/', '.', $result);
// this removes commas, semicolons, dots and spaces from the end
$result = preg_replace("/[ ,;.]*$/", '', $result);
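A different way to get the same result, sketched here as an alternative rather than as what the answer above does, is to split on commas, clean each file name separately and drop the empty pieces. This also copes with stray commas or dots at the start of the string. The function name normalizeCsvNames is hypothetical.

```php
<?php
// Hypothetical helper; normalizeCsvNames is an illustrative name.
function normalizeCsvNames(string $csv): string
{
    $names = [];
    foreach (explode(',', $csv) as $piece) {
        // Keep only alphanumerics and dots inside each name.
        $name = preg_replace('/[^A-Za-z0-9.]/', '', $piece);
        // Collapse runs of dots, then drop dots left at either end.
        $name = trim(preg_replace('/\.+/', '.', $name), '.');
        if ($name !== '') {
            $names[] = $name; // skip pieces that were only junk or empty
        }
    }
    return implode(',', $names);
}

echo normalizeCsvNames(' @ *lorem.jpg,,,, ip sum.jpg,dolor .jpg,-/ ?'), PHP_EOL;
// prints: lorem.jpg,ipsum.jpg,dolor.jpg
```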