How to check if a number (or str) from a list is in another column? - Python

Time:11-14

I have a problem cross-checking numbers between a list and a column.

I have a list called allowed_number with 40 different phone numbers, and a column imported from an Excel sheet with 8000 calls, df['B-NUMBER']. I believe around 90% of these 8000 calls are to numbers in the allowed_number list, but I need to cross-check this and see which numbers aren't in the list, preferably storing those numbers in a variable called "fraud".

So I turned allowed_number into a list of strings. An excerpt of the list looks like this, with the code I used to build it below:

'21114169202',
'27518725605',
'514140099453',
'5144123173905',

import re

allowed_number = re.sub(",", "", allowed_number)
allowed_number = allowed_number.split(" ")

Then I tried to cross-check this against the column df['B-NUMBER'] in different ways, but nothing works and I need help. I've tried these:

df[df['B-NUMBER'].isin(allowed_number)]
fraud = [df['B-NUMBER'] in allowed_number if allowed_number not in df["B-NUMBER"]]
fraud = df['B-NUMBER'].apply(lambda x: ''.join(y for y in x if y not in allowed_number))

I try to avoid loops because of the run time, but if it is possible with a loop, please share your insight :) Cheers

CodePudding user response:

To summarize the discussion in the comments: using

df['B-NUMBER'].isin(allowed_number)

works once the contents of allowed_number are converted to integers, so that they have the same type as the values in the column:

allowed_number = [int(x) for x in allowed_number]
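To illustrate why the conversion matters, here is a minimal sketch with made-up data. It assumes, as the comment discussion suggests, that the Excel column was read in as integers; the sample numbers and the names allowed_as_str, allowed_as_int, mask_str and mask_int are only for illustration.

```python
import pandas as pd

# Made-up sample data; the real column comes from the Excel sheet.
df = pd.DataFrame({"B-NUMBER": [21114169202, 27518725605, 99999999999]})
allowed_as_str = ["21114169202", "27518725605"]

# A string never equals an integer, so every row comes back False.
mask_str = df["B-NUMBER"].isin(allowed_as_str)

# After converting the list to int, the values compare as expected.
allowed_as_int = [int(x) for x in allowed_as_str]
mask_int = df["B-NUMBER"].isin(allowed_as_int)

print(mask_str.tolist())  # [False, False, False]
print(mask_int.tolist())  # [True, True, False]
```

Going the other way should work too: df["B-NUMBER"].astype(str) compared against the list of strings. That may be the safer choice if any of the phone numbers can start with a zero, since int() would drop it.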

So, to get the fraudulent numbers, something like this works:

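import re

# allowed_number starts out as a single string here
allowed_number = re.sub(",", "", allowed_number)
allowed_number = allowed_number.split(" ")
allowed_number = [int(x) for x in allowed_number]

# True where the called number is in the allowed list
df["allowed"] = df["B-NUMBER"].isin(allowed_number)

# rows whose number is not in the allowed list
df_fraudulent = df.loc[~df["allowed"]]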
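Since the question asks for the numbers themselves in a variable called "fraud", the same idea can be sketched end to end as below. The DataFrame contents are invented stand-ins for the Excel data, and is_allowed and df_not_allowed are just illustrative names. It also assumes allowed_number has already been converted to integers.

```python
import pandas as pd

# Hypothetical stand-ins for the Excel column and the cleaned allowed list.
df = pd.DataFrame(
    {"B-NUMBER": [21114169202, 5550001, 27518725605, 5550001, 5550002]}
)
allowed_number = [21114169202, 27518725605, 514140099453, 5144123173905]

# Boolean mask: True where the called number is in the allowed list.
is_allowed = df["B-NUMBER"].isin(allowed_number)

# Rows whose number is not allowed, then the distinct numbers among them.
df_not_allowed = df.loc[~is_allowed]
fraud = df_not_allowed["B-NUMBER"].unique().tolist()

print(fraud)              # [5550001, 5550002]
print(is_allowed.mean())  # share of calls that went to allowed numbers
```

No explicit loop is needed here: isin is vectorized, so 8000 rows against 40 numbers should run quickly. The mean of the boolean mask gives a quick check of the "around 90%" estimate.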
