I'm using python and having trouble finding a code that matches 3 or more numbers and return the Id number. For example, I would like to lookup these 6 numbers in one sheet ( 2,4,6,8,9,10 )
lookup 6 number above and find 3 or more numbers in this sheet below.
I would like my result to be something like this in another excel sheet but only return the ID only if it matches 3 or more numbers. If you look at id u it matches 2,4,6,8 and id 6 matches 6,8,9,10 and id 7 matches 2,6,8
CodePudding user response:
Here is what I think you are trying to achieve. I have made the assumption that you have imported both excel files and have named them df1
and df2
respectively. With df1
being the smaller dataframe containing the values to search. I created my own dataframes for testing purposes. I only recreated df2
up to ID=7.
Here is the code to build the tables:
import pandas as pd
cols = ['Num1','Num2','Num3','Num4','Num5','Num6']
df1 = pd.DataFrame([[2,4,6,8,9,10]], columns=cols)
df2 = pd.DataFrame([[1,1,2,4,5,6,8],
[2,5,6,20,22,23,34],
[3,8,12,13,34,45,46],
[4,9,10,14,29,32,33],
[5,1,22,13,23,33,35],
[6,1,6,7,8,9,10],
[7,0,2,3,5,6,8]],
columns = ['Id', 'Num1','Num2','Num3','Num4','Num5','Num6'])
And now the code to find the IDs:
# convert the values in the first dataframe to a list
vals_to_find = df1.iloc[0].tolist()
# Print the values to find
print("Vals to find:", vals_to_find)
# Create an empty list to hold the matching IDs
matching_ids = []
# iterate through the big dataframe
for index, row in df2.iterrows():
rowlist = row.tolist() # convert the row to a list
# keep the id for later, and extract the other values for evaluation
id = rowlist[0]
vals = rowlist[1:]
# count the number of values in one list against another list
counter = sum(elem in vals_to_find for elem in vals)
# If the number of matches is greater than 2, then grab the ID
if counter > 2:
matching_ids.append({'ID': id})
# Print the matching IDs
print('Matching IDS:', matching_ids)
# Convert to dataframe and export to excel spreadsheet
df3 = pd.DataFrame(matching_ids)
df3.to_excel("output.xlsx", index=False)
OUTPUT:
Vals to find: [2, 4, 6, 8, 9, 10]
Matching IDS: [{'ID': 1}, {'ID': 6}, {'ID': 7}]
And the output.xlsx: