Let's say I have a function like this, to merge names in two lists:
import pandas as pd

def merge_name(list1="John",list2="Doe"):
    # Helper that joins two names with a hyphen
    def merge(name1=list1,name2=list2):
        merge=name1+"-"+name2
        data={"Merged":merge}
        return data
    d = pd.DataFrame()
    # Walk through every (i, j) pair from the two lists
    for i,j in [(i,j) for i in list1 for j in list2]:
        if i==j:
            d=d
        else:
            x = merge(name1=i,name2=j)
            ans=pd.DataFrame({"Merged":[x["Merged"]]})
            d=pd.concat([d,ans])
    return d
What I am interested in are unique combinations, i.e., "John-Doe" and "Doe-John" are the same for my purposes. So if I run something like this:
names1=["John","Doe","Richard"]
names2=["John","Doe","Richard","Joana"]
df=merge_name(list1=names1,list2=names2)
I will get:
- John-Doe
- John-Richard
- John-Joana
- **Doe-John**
- Doe-Richard
- Doe-Joana
- **Richard-John**
- **Richard-Doe**
- Richard-Joana
The groups in bold are all repetitions. Essentially, every time the loop moves on to the next i, it creates n-1 repeated groups, where n is the position of i in names1. Is there a way to avoid this, for example by dropping the first name in "list2" every time j reaches the last element of the list?
Thanks in advance.
I have tried to update list2 inside the loop, but that does not work.
CodePudding user response:
The code below may be useful:
import pandas as pd

def merge_name(list1="John", list2="Doe"):
    merged=[]
    for i in list1:
        for j in list2:
            # Skip identical names and pairs already stored in reverse order
            if (i!=j) and (f"{j} - {i}" not in merged):
                merged.append(f"{i} - {j}")
    # A set has no defined order, so sort it to get a stable row order
    df = pd.DataFrame(sorted(set(merged)))
    return df

names1 = ["John", "Doe", "Richard"]
names2 = ["John", "Doe", "Richard", "Joana"]
df = merge_name(list1=names1, list2=names2)
print(df)
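As a possible variation on the same idea, here is a minimal sketch that remembers each pair as a frozenset, so "John-Doe" and "Doe-John" map to the same key and the check does not need to scan the whole list of strings. The function name unique_pairs and the sep parameter are my own choices for illustration, not part of the original code. It still pairs names from list1 with names from list2 only, as in the question.

```python
from itertools import product

def unique_pairs(list1, list2, sep="-"):
    """Pair each name in list1 with each name in list2, ignoring order."""
    seen = set()    # frozensets of the pairs produced so far
    merged = []     # joined names, in first-seen order
    for a, b in product(list1, list2):
        key = frozenset((a, b))  # same key for (a, b) and (b, a)
        if a == b or key in seen:
            continue
        seen.add(key)
        merged.append(f"{a}{sep}{b}")
    return merged

names1 = ["John", "Doe", "Richard"]
names2 = ["John", "Doe", "Richard", "Joana"]
print(unique_pairs(names1, names2))
```

If a DataFrame is needed, the returned list can be passed on as pd.DataFrame({"Merged": unique_pairs(names1, names2)}).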
CodePudding user response:
Below is my solution with some explanations:
def combineName(listName):
    res = []
    for i in range(len(listName)):
        # Start at i + 1 so each name is only paired with the names after it
        for j in range(i + 1, len(listName)):
            res.append(listName[i] + "-" + listName[j])
    return res

names1=["John","Doe","Richard"]
names2=["John","Doe","Richard","Joana"]
# Pool both lists and drop duplicate names
listName = list(set(names1 + names2))
print(listName)
print(combineName(listName))
First, build a single list without repetitions, so that each name appears only once. I used a set for this and then converted it back to a list, because the function accesses the names by position, and a set neither supports indexing nor guarantees any order.
Second, the function creates all the combinations with two nested loops. Note the range of the inner loop: it starts at i + 1, so each name is paired only with the names that come after it. That is why you never get both "John-Doe" and "Doe-John": each pair is created exactly once.
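For comparison, the nested loops above do the same job as itertools.combinations from the standard library. Below is a minimal sketch, assuming, as above, that it is acceptable to pool both lists into one. The name combine_names and the sep parameter are hypothetical, and dict.fromkeys is used in place of a set so that the names keep their first-seen order.

```python
from itertools import combinations

def combine_names(*name_lists, sep="-"):
    """Join every unordered pair of distinct names exactly once."""
    # dict.fromkeys removes duplicates while keeping first-seen order
    unique = list(dict.fromkeys(name for names in name_lists for name in names))
    # combinations(unique, 2) yields each pair once, never its reverse
    return [sep.join(pair) for pair in combinations(unique, 2)]

names1 = ["John", "Doe", "Richard"]
names2 = ["John", "Doe", "Richard", "Joana"]
pairs = combine_names(names1, names2)
print(pairs)
```

With a set the order of the output can change between runs, while this version always lists the pairs in the order the names first appear.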