I have a dataframe like as below
Company,year
T123 Inc Ltd,1990
T124 PVT ltd,1991
T345 Ltd,1990
T789 Pvt.LTd,2001
ABC Limited,1992
ABCDE Ltd,1994
ABC Ltd,1997
ABFE,1987
Tesla ltd,1995
AMAZON Inc,2001
Apple ltd,2003
compare = pd.MultiIndex.from_product([tf['Company'].astype(str),tf['Company'].astype(str)]).to_series()
compare = compare[compare.index.get_level_values(0) != compare.index.get_level_values(1)]
my data looks like below
I would like to do the below
a) print the index number (i) at the end of comparison of each input key.for ex: T123 Inc Ltd is compared with ten other strings. So, once it is done and moves to a new/next string/input_key which is T124 PVT ltd, I want the value of i to be incremented by 1 and shown using print function
.
I tried the below
def metrics(tup):
print(compare.loc[tup]) # doesn't work. I want the index number
return list(tup)
I don't know how to increment/iterate within apply
function
I expect my output of print statement to be like as below. Print statement should be executed only after each input key's comparison is over.
comparison of 1st input key with 10 strings is done
comparison of 2nd input key with 10 strings is done
update - code
ls = []
ks=[]
for i, k in enumerate(compare, 1):
ls.append(metrics(k))
ks.append(k)
print(f'comparison of {i} input key: {k} with {len(ls)} strings is done')
pd.concat(pd.DataFrame(ks),pd.DataFrame(ls)) # error
pd.concat(ks,ls) # error
metrics
def metrics(tup):
return pd.Series([fuzz.ratio(*tup),
fuzz.token_sort_ratio(*tup),
fuzz.token_set_ratio(*tup),
fuzz.QRatio(*tup),
fuzz.UQRatio(*tup),
fuzz.UWRatio(*tup)],
['ratio', 'token','set','qr','uqr','uwr'])
CodePudding user response:
Let us try enumerating over the unique values in level 0
index:
groups = []
grouper = compare.groupby(level=0, sort=False)
for i, (k, g) in enumerate(grouper, 1):
# Execute statements here
groups.append(g.apply(metrics))
print(f'comparison of {i} input key: {k} with {len(g)} strings is done')
df_out = pd.concat(groups)
comparison of 1 input key: T123 Inc Ltd with 10 strings is done
comparison of 2 input key: T124 PVT ltd with 10 strings is done
...
comparison of 11 input key: Apple ltd with 10 strings is done