I want to shorten the field to its first four words:
unaccent('unaccent', lower(regexp_replace(titre, '[^\w]+','_','g')))
CodePudding user response:
You can use this (with {5} it keeps the first six words; change the count to {3} to keep four):
=regexextract(A2,"\w+(?:\W+\w+){5}")
EXPLANATION
--------------------------------------------------------------------------------
  \w+        word characters (a-z, A-Z, 0-9, _) (1 or more times
             (matching the most amount possible))
--------------------------------------------------------------------------------
  (?:        group, but do not capture (5 times):
--------------------------------------------------------------------------------
    \W+      non-word characters (all but a-z, A-Z, 0-9, _) (1 or more
             times (matching the most amount possible))
--------------------------------------------------------------------------------
    \w+      word characters (a-z, A-Z, 0-9, _) (1 or more times
             (matching the most amount possible))
--------------------------------------------------------------------------------
  ){5}       end of grouping
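To see how the repeat count relates to the number of words kept, here is a minimal sketch in Python's standard re module (not Google Sheets' REGEXEXTRACT, whose regex flavour may differ in details). The helper name first_words is hypothetical. It builds the same pattern with a repeat count of n - 1, because the leading \w+ already accounts for the first word.

```python
import re

def first_words(text: str, n: int) -> str:
    # Hypothetical helper: one leading word plus (n - 1) repetitions of
    # "separator run + word" matches exactly n words, separators included.
    pattern = r"\w+(?:\W+\w+){%d}" % (n - 1)
    match = re.search(pattern, text)
    # Assumption: fall back to the whole text when it has fewer than n words.
    return match.group(0) if match else text

sample = "one two three four five six seven"
print(first_words(sample, 4))  # repeat count {3}
print(first_words(sample, 6))  # repeat count {5}, as in the formula above
```

Unlike a split-and-rejoin approach, this keeps the original separators between the matched words unchanged.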
CodePudding user response:
If you don't need the remaining words to be separated by exactly the same amount of whitespace as the input, then you could turn the string into an array, take the first four elements and convert that back into a string where the words are delimited with a single space:
array_to_string((regexp_split_to_array(titre, '[^\w]+'))[1:4], ' ')
regexp_split_to_array(titre, '[^\w]+') will turn e.g. the string 'one two three four five six' into an array with six elements. [1:4] then extracts the first four elements (or all of them if there are fewer than four), and array_to_string converts this back into a string. So 'one two three four five six' will be converted to 'one two three four'.
However, an input with irregular spacing, such as 'one   two  three four   five six', will also be converted to 'one two three four'.
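The split, slice, and rejoin steps can be tried outside the database with a small Python sketch. This is an analogue of the PostgreSQL expression, not a replacement for it; the function name first_four_words is hypothetical, and Python's \w may differ slightly from PostgreSQL's for non-ASCII text.

```python
import re

def first_four_words(titre: str) -> str:
    # Split on runs of non-word characters, mirroring
    # regexp_split_to_array(titre, '[^\w]+').
    words = re.split(r"[^\w]+", titre)
    # Python slices are 0-based and end-exclusive, so [:4] plays the
    # role of PostgreSQL's 1-based, inclusive [1:4].
    return " ".join(words[:4])

print(first_four_words("one two three four five six"))
print(first_four_words("one   two  three four   five six"))
```

Both calls produce the same single-spaced result, which is the whitespace normalization described above.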