I have input like this:
sie%Qu7s Kuux"oh9 ohc9ahG% hoe8Toh: Eix*ohd1 doh:bo2U Cu0doo|t zo`L9xaW
fie5Du[h Phe8aid# Opu&fai5 ieZ<aek6 hu4ga&Di Oose}p1p aiD@oos2 nu-a1Fub
ahqu5To/ ahtie[H3 ioK&u5Ai nei1Za#d poo_Th9r gu|aGh7h uZ%io2ah IeNah&v7
eif\e8AE Ieb,ing4 reph1oW* eeSh'ee8 Ah ei4ai Oi0Ca,vu Esh1xe?e Wei&k4ic
ue5OhQu. aaf-i8uP eedae%T5 sei?M9Pu ieH[oh2l ieh~ah8A aev"oo9A Ohf"i8de
Foh:x2zi aLoo'qu2 Ia6aig-e La{vie1E IeFoh{cI Au_h7Hee Se)f4ebi Cah$yu7m
where each word in a line is a password. I am trying to print the lines in which any word begins and ends with the same letter, without distinguishing between uppercase and lowercase letters.
I know I can do something like this with grep:
cat passwords.txt | grep -e ' \([A-Z]\)......\1 ' -e ' \([a-z]\)......\1 '
but this only matches words that start and end with the same letter in the same case (both uppercase or both lowercase), like
Foh:x2zi aLoo'qu2 Ia6aig-e La{vie1E IeFoh{cI Au_h7Hee Se)f4ebi Cah$yu7m
Expected output:
eif\e8AE Ieb,ing4 reph1oW* eeSh'ee8 Ah ei4ai Oi0Ca,vu Esh1xe?e Wei&k4ic
sie%Qu7s Kuux"oh9 ohc9ahG% hoe8Toh: Eix*ohd1 doh:bo2U Cu0doo|t zo`L9xaW
ue5OhQu. aaf-i8uP eedae%T5 sei?M9Pu ieH[oh2l ieh~ah8A aev"oo9A Ohf"i8de
Foh:x2zi aLoo'qu2 Ia6aig-e La{vie1E IeFoh{cI Au_h7Hee Se)f4ebi Cah$yu7m
ahqu5To/ ahtie[H3 ioK&u5Ai nei1Za#d poo_Th9r gu|aGh7h uZ%io2ah IeNah&v7
CodePudding user response:
With GNU grep:
grep -iE '(.)[^ ]{6}\1' passwords.txt
Output:
sie%Qu7s Kuux"oh9 ohc9ahG% hoe8Toh: Eix*ohd1 doh:bo2U Cu0doo|t zo`L9xaW
ahqu5To/ ahtie[H3 ioK&u5Ai nei1Za#d poo_Th9r gu|aGh7h uZ%io2ah IeNah&v7
eif\e8AE Ieb,ing4 reph1oW* eeSh'ee8 Ah ei4ai Oi0Ca,vu Esh1xe?e Wei&k4ic
ue5OhQu. aaf-i8uP eedae%T5 sei?M9Pu ieH[oh2l ieh~ah8A aev"oo9A Ohf"i8de
Foh:x2zi aLoo'qu2 Ia6aig-e La{vie1E IeFoh{cI Au_h7Hee Se)f4ebi Cah$yu7m
-i: Ignore case distinctions in patterns and input data, so that characters that differ only in case match each other.
-E: Interpret (.)[^ ]{6}\1 as an extended regular expression.
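Note that (.) can also capture a space and the pattern is not tied to word boundaries. That appears harmless for the sample input, where nearly every word is exactly 8 characters, but on input with longer words it could match in the middle of a word. A stricter variant might anchor the word explicitly. The following is only a sketch, assuming GNU grep (back-references under -E are a GNU extension); sample_lines is a hypothetical helper, and its third line is made up to show the difference:

```shell
# Hypothetical sample: two lines taken from the question plus one made-up
# 10-character word that contains "abcdefga" in the middle.
sample_lines() {
  printf '%s\n' \
    'ahqu5To/ ahtie[H3 ioK&u5Ai' \
    'fie5Du[h Phe8aid# Opu&fai5' \
    'xxabcdefga'
}

# Pattern from the answer above: not anchored to word boundaries.
loose=$(sample_lines | grep -iE '(.)[^ ]{6}\1')

# Anchored variant: (^| ) and ( |$) tie the match to a whole
# space-separated word, and [^ ] keeps the captured first character
# from being a space. It still assumes 8-character words.
strict=$(sample_lines | grep -iE '(^| )([^ ])[^ ]{6}\2( |$)')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

With this sample, the unanchored pattern also reports the made-up line, while the anchored one reports only the line containing ioK&u5Ai.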
CodePudding user response:
Use GNU grep:
grep -i -P '(?<!\S)(\S)(?:\S*\1)?(?!\S)' passwords.txt
The -i option turns on case insensitivity; -P turns on the PCRE flavor (which supports lookbehinds/lookaheads).
EXPLANATION
--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \S*                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (0 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    \1                       what was matched by capture \1
--------------------------------------------------------------------------------
  )?                       end of grouping
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of look-ahead
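
If grep -P is not available, the same per-word check can be approximated with awk. This is only a sketch, not part of the answer above: same_ends and sample_lines are hypothetical names, and, like the pattern explained above, it compares the first and last character of each whitespace-separated word whatever they are (so a one-character word also counts):

```shell
# Hypothetical sample built from lines in the question.
sample_lines() {
  printf '%s\n' \
    'ahqu5To/ ahtie[H3 ioK&u5Ai' \
    'fie5Du[h Phe8aid# Opu&fai5' \
    'ieh~ah8A aev"oo9A Ohf"i8de'
}

# Print each line that has at least one word whose first and last
# characters are equal, ignoring case.
same_ends() {
  awk '{
    for (i = 1; i <= NF; i++) {
      first = tolower(substr($i, 1, 1))
      last  = tolower(substr($i, length($i), 1))
      if (first == last) { print; next }   # one hit is enough for this line
    }
  }'
}

result=$(sample_lines | same_ends)
printf '%s\n' "$result"
```

Here the first line qualifies because of ioK&u5Ai and the third because of aev"oo9A (a and A differ only in case); the second line has no such word. To run it on the real file, something like same_ends < passwords.txt should work.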