I have a dataframe as shown below. It has 3 columns named "TTN_163_2.5_-40", "TTN_163_2.7_-40" and "TTN_163_3.6_-40".
I need to select all columns whose name contains '2.5', '3.6' or '2.7'.
I also have some column names that contain 1.6, 1.62 and 1.656, and I need to select
these separately. When I write df_psrr_funct_1V6.filter(regex='1\.6|^xvalues$')
I get all the columns corresponding to 1.6, 1.656 and 1.62. I don't want this. How can I select each one uniquely?
I used this method (df_psrr_funct = df_psrr_funct.filter(regex='2.5')), but it does not capture the first column (xvalues).
Sample dataframe
xvalues TTN_163_2.5_-40 TTN_163_2.7_-40 TTN_163_3.6_-40
23.0279 -58.7591 -58.5892 -60.0966
30.5284 -58.6903 -57.3153 -59.9111
CodePudding user response:
Extend the regex with | for OR. ^ anchors the start of the string and $ anchors the end, so ^xvalues$ extracts only the column named exactly xvalues and avoids column names that merely contain it as a substring, such as "xvalues 1" or "aaa xvalues":
df_psrr_funct = df_psrr_funct.filter(regex='2\.5|^xvalues$')
print (df_psrr_funct)
xvalues TTN_163_2.5_-40
0 23.0279 -58.7591
1 30.5284 -58.6903
EDIT: If you need to match the value between the _ characters, use:
print (df_psrr_funct)
xvalues TTN_163_1.6_-40 TTN_163_1.62_-40 TTN_163_1.656_-40
0 23.0279 -58.7591 -58.5892 -60.0966
1 30.5284 -58.6903 -57.3153 -59.9111
df_psrr_funct = df_psrr_funct.filter(regex='_1\.6_|^xvalues$')
print (df_psrr_funct)
xvalues TTN_163_1.6_-40
0 23.0279 -58.7591
1 30.5284 -58.6903
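The same idea can be wrapped in a small helper when several values are needed at once (for example '2.5', '3.6' and '2.7' from the question). This is only a sketch: select_by_value and its keep parameter are names made up for illustration, not part of pandas, and it assumes every value sits between two underscores in the column name, as in the sample data.

```python
import re
import pandas as pd

# Sample data modelled on the frame shown in the EDIT above.
df = pd.DataFrame({
    "xvalues": [23.0279, 30.5284],
    "TTN_163_1.6_-40": [-58.7591, -58.6903],
    "TTN_163_1.62_-40": [-58.5892, -57.3153],
    "TTN_163_1.656_-40": [-60.0966, -59.9111],
})

def select_by_value(frame, values, keep=("xvalues",)):
    """Hypothetical helper: keep the `keep` columns plus every column
    whose name contains one of `values` between two underscores."""
    # re.escape makes the dot literal, so '1.6' cannot match '1x6'.
    tokens = [f"_{re.escape(v)}_" for v in values]
    # ^...$ forces an exact match on the columns that must always stay.
    exact = [f"^{re.escape(k)}$" for k in keep]
    return frame.filter(regex="|".join(exact + tokens))

only_1v6 = select_by_value(df, ["1.6"])
print(only_1v6.columns.tolist())

others = select_by_value(df, ["1.62", "1.656"])
print(others.columns.tolist())
```

Because the trailing underscore is part of each token, '1.6' no longer picks up the 1.62 and 1.656 columns.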
CodePudding user response:
Another approach:
df_psrr_funct.filter(regex = r'^\D+$|2\.5')
xvalues TTN_163_2.5_-40
0 23.0279 -58.7591
1 30.5284 -58.6903
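For what it's worth, ^\D+$ keeps any column whose name contains no digits at all, which in this frame is only xvalues. Whether the dot is escaped also matters, since an unescaped dot matches any character. The sketch below shows the difference; the TTN_163_205_-40 column is invented purely for the demonstration and does not appear in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "xvalues": [23.0279, 30.5284],
    "TTN_163_2.5_-40": [-58.7591, -58.6903],
    "TTN_163_205_-40": [0.0, 0.0],  # hypothetical column, for illustration only
})

# Unescaped dot: '2.5' also matches '205', so the extra column slips in.
loose = df.filter(regex=r"^\D+$|2.5").columns.tolist()

# Escaped dot: only a literal '2.5' matches.
strict = df.filter(regex=r"^\D+$|2\.5").columns.tolist()

print(loose)
print(strict)
```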
CodePudding user response:
Using regex for this doesn't make any sense... just do
columns_with_2point5 = [c for c in df.columns if "2.5" in c]
only_cool_cols = df[['xvalues'] + columns_with_2point5]
Don't overcomplicate it ...
If you don't need the first column, you can just use filter with like instead of one of the regex solutions (see the first comment from @BeRT2me).
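A minimal sketch of the like variant, assuming the sample frame from the question. The names only_2v5 and with_x are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "xvalues": [23.0279, 30.5284],
    "TTN_163_2.5_-40": [-58.7591, -58.6903],
    "TTN_163_2.7_-40": [-58.5892, -57.3153],
    "TTN_163_3.6_-40": [-60.0966, -59.9111],
})

# like= is a plain substring test on the column labels, no regex involved.
only_2v5 = df.filter(like="2.5")

# If xvalues is wanted after all, prepend it by concatenating the column lists.
with_x = df[["xvalues"] + only_2v5.columns.tolist()]

print(only_2v5.columns.tolist())
print(with_x.columns.tolist())
```

Keep in mind that like is still a substring match, so like="1.6" would also pick up the 1.62 and 1.656 columns; something like like="_1.6_" should avoid that.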