Home > Enterprise >  How can I get the values which only contains alphanumeric values?
How can I get the values which only contains alphanumeric values?

Time:01-06

I have 2 columns of object datatype like

  col1             col2  
AB              AB123
BC123             SG
BG274             HF
  DC               DG789
DG156            HD

I want my output to be like

 output
                                                                                                   BC123
BG274
DG789
DG156

I have tried rg expression and I am getting my col1 as my output

CodePudding user response:

If I understand correctly, you can mask the data and fillna:

m1 = df['col1'].str.fullmatch(r'[a-zA-Z] \d ')
m2 = df['col2'].str.fullmatch(r'[a-zA-Z] \d ')

(df['col1'].where(m1)
 .fillna(df['col2'].where(m2))
 )

Generic method for an arbitrary number of columns:

cols = ['col1', 'col2']

df[cols].where(
  df[cols]
 .apply(lambda s: s.str.fullmatch(r'[a-zA-Z] \d '))
 ).bfill(axis=1).iloc[:, 0]

Output:

0    AB123
1    BC123
2    BG274
3    DG789
4    DG156
Name: col1, dtype: object

CodePudding user response:

You can do it in your database query:

(For SQL Server)

To select alphanumeric values from col1

select col1 from table1 where col1 like  '%[A-Za-z]%' and col1 like '%[0-9]%'

To select unique alphanumeric values from col1 and col2

create table table1(col1 varchar(50), col2 varchar(50));
insert into table1 values('AB','AB123');
insert into table1 values('BC123','SG');
insert into table1 values('BG274','1');
insert into table1 values('DC','DG789');
insert into table1 values('DG156','HD');

Query:

select col1 from table1 where col1 like  '%[A-Za-z]%' and col1 like '%[0-9]%'
  union
select col2 from table1 where col2 like  '%[A-Za-z]%' and col2 like '%[0-9]%'

Output:

col1
AB123
BC123
BG274
DG156
DG789

fiddle

For MySQL:

Query:

select col1 from table1 where col1 regexp '[A-Za-z][0-9]'
  union
select col2 from table1 where col2 regexp '[A-Za-z][0-9]'

Output:

| col1 |
|------|
| BC123 |
| BG274 |
| DG156 |
| AB123 |
| DG789 |

fiddle

CodePudding user response:

You can filter with regx

def regex_filter(val, rex):
    if val:
        mo = re.search(rex, val)
        if mo:
            return True
        else:
            return False
    else:
        return False
df[df['col1'].apply(lambda x : regex_filter(x,"[A-Z] [0-9] "))]['col1']
Out[103]: 
1    BC123
2    BG274
4    DG156
  • Related