I have a DataFrame like the one below:
COL1 | COL2 | COL3
-----|------|--------
abc | P | 123
b.bb | , | 22
1 | B | 2
... |... | ...
I need to find the columns that contain a value consisting only of a punctuation mark, such as !"#$%&'()* ,-./:;<=>?@[]^_`{|}~
As a result I need something like the output below: only COL2, because COL1 also contains a punctuation mark, but there it appears together with other characters.
COL2
-------
P
,
B
...
CodePudding user response:
Using a regex with str.fullmatch and any:
import re
chars = '''!"#$%&'()* ,-./:;<=>?@[]^_`{|}~'''
pattern = f'[{re.escape(chars)}]'
# [!"\#\$%\&'\(\)\*\ ,\-\./:;<=>\?@\[\]\^_`\{\|\}\~]
out = df.loc[:, df.astype(str).apply(lambda s: s.str.fullmatch(pattern).any())]
Or with isin:
out = df.loc[:, df.isin(set(chars)).any()]
Output:
COL2
0 P
1 ,
2 B
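For anyone who wants to reproduce this, here is a minimal self-contained sketch of the regex approach. The sample DataFrame is an assumption built from the first three rows of the question's table (with COL3 as integers), and the trailing `+` in the pattern is an optional variation, not part of the answer above: it also accepts values made of several punctuation characters, such as "?!".

```python
import re

import pandas as pd

# Sample data assumed from the question's table (first three rows only)
df = pd.DataFrame({
    "COL1": ["abc", "b.bb", "1"],
    "COL2": ["P", ",", "B"],
    "COL3": [123, 22, 2],
})

chars = '''!"#$%&'()* ,-./:;<=>?@[]^_`{|}~'''
# '+' lets a value consist of one or more punctuation characters
pattern = f'[{re.escape(chars)}]+'

# Cast to str so numeric columns can be tested, then keep every column
# in which at least one value matches the pattern in full
mask = df.astype(str).apply(lambda s: s.str.fullmatch(pattern).any())
out = df.loc[:, mask]
print(out)
```

Dropping the `+` restores the single-character behaviour of the pattern shown earlier.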
CodePudding user response:
punc = set("!\"#$%&'()* ,-./:;<=>?@[]^_`{|}~")
# str(x) keeps non-string cells (e.g. the integers in COL3) from raising a TypeError in set()
df.loc[:, df.applymap(lambda x: set(str(x)).issubset(punc)).any()]
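The set-based check can also be written as a small helper, sketched below under a few assumptions. The name `is_only_punctuation` is hypothetical, and the sample DataFrame is reconstructed from the question's table. The sketch uses the standard library's `string.punctuation`, which differs slightly from the question's list: it also contains `+` and `\` and has no space. It converts each cell to text first and rejects empty strings, since an empty set is a subset of any set.

```python
import string

import pandas as pd

# Sample data assumed from the question's table (first three rows only)
df = pd.DataFrame({
    "COL1": ["abc", "b.bb", "1"],
    "COL2": ["P", ",", "B"],
    "COL3": [123, 22, 2],
})

punc = set(string.punctuation)


def is_only_punctuation(value):
    """Return True if value, as text, is non-empty and all punctuation."""
    text = str(value)
    return bool(text) and set(text).issubset(punc)


# Check each column with Series.map and keep the columns with any hit
mask = df.apply(lambda col: col.map(is_only_punctuation).any())
out = df.loc[:, mask]
print(out)
```

Going column by column with `Series.map` avoids relying on `DataFrame.applymap`, which is deprecated in recent pandas versions in favour of `DataFrame.map`.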