I have two lists and am trying to join them on a key using core Python, without any additional libraries. The key is the first value of each tuple: 100, 101 and 102.
List 1 = [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568'), (102, 'Lex', '515.123.4569')]
List 2 = [(100, 'Engineer', '515.123.4567'), (101, 'Doctor', '515.123.4568')]
Expected Result
Inner join
[(100, 'Steven', '515.123.4567', 'Engineer'), (101, 'Neena', '515.123.4568', 'Doctor')]
Left outer
[(100, 'Steven', '515.123.4567', 'Engineer'), (101, 'Neena', '515.123.4568', 'Doctor'), (102, 'Lex', '515.123.4569', None)]
We can easily do this using pandas, but I am trying to do it in plain Python. Any suggestions would be helpful.
I tried using collections and itertools, but I am not getting the expected results.
CodePudding user response:
List_1 = [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568'), (102, 'Lex', '515.123.4569')]
List_2 = [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568')]
def inner_join(list_1, list_2):
    return [entries for entries in list_1 if entries in list_2]
def left_outer_join(list_1, list_2):
    return [entries if entries in list_2 else None for entries in list_1]
print(inner_join(List_1, List_2))
print(left_outer_join(List_1, List_2))
This gives the following result:
[(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568')]
[(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568'), None]
CodePudding user response:
UPDATE
The question was updated with different examples. Below are two functions that would yield the correct result:
def inner_join(list1: list[tuple], list2: list[tuple]):
    # Keys (first tuple element) of list2, in the same order as list2
    ids_list2 = [item[0] for item in list2]
    return_list = []
    for item in list1:
        if item[0] in ids_list2:
            item_list_1 = list(item)
            # Second field of the matching list2 row
            engineer = list2[ids_list2.index(item[0])][1]
            return_list.append(tuple(item_list_1 + [engineer]))
    return return_list
print(inner_join(list1, list2))
# [(100, 'Steven', '515.123.4567', 'Engineer'), (101, 'Neena', '515.123.4568', 'Doctor')]
For a left outer join, you can easily amend the above:
def left_outer_join(list1: list[tuple], list2: list[tuple]):
    # Keys (first tuple element) of list2, in the same order as list2
    ids_list2 = [item[0] for item in list2]
    return_list = []
    for item in list1:
        if item[0] in ids_list2:
            item_list_1 = list(item)
            # Second field of the matching list2 row
            engineer = list2[ids_list2.index(item[0])][1]
            return_list.append(tuple(item_list_1 + [engineer]))
        else:
            # No matching key in list2
            return_list.append(None)
    return return_list
print(left_outer_join(list1, list2))
# [(100, 'Steven', '515.123.4567', 'Engineer'), (101, 'Neena', '515.123.4568', 'Doctor'), None]
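Note that the left outer version above emits a bare None for an unmatched row, whereas the question asks for the left row to be kept and padded with None. A minimal sketch of one way to get that, assuming the keys in list2 are unique: build a dict from list2 once, then look each key up. The function name join and its how parameter are hypothetical, chosen here only for illustration.

```python
def join(left, right, how="inner"):
    """Join two lists of tuples on their first element.

    Hypothetical helper: `how` is "inner" or "left".
    Assumes keys in `right` are unique; a later duplicate overwrites an earlier one.
    """
    # Map each key in `right` to its second field (the job title in the question's data).
    lookup = {row[0]: row[1] for row in right}
    result = []
    for row in left:
        if row[0] in lookup:
            result.append(row + (lookup[row[0]],))
        elif how == "left":
            # No match: keep the left row and pad the missing field with None.
            result.append(row + (None,))
    return result


list1 = [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568'), (102, 'Lex', '515.123.4569')]
list2 = [(100, 'Engineer', '515.123.4567'), (101, 'Doctor', '515.123.4568')]

print(join(list1, list2))
# [(100, 'Steven', '515.123.4567', 'Engineer'), (101, 'Neena', '515.123.4568', 'Doctor')]
print(join(list1, list2, how="left"))
# [(100, 'Steven', '515.123.4567', 'Engineer'), (101, 'Neena', '515.123.4568', 'Doctor'), (102, 'Lex', '515.123.4569', None)]
```

The dict lookup also avoids rescanning list2 with index() for every row of list1, which may matter for longer lists.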
OLD ANSWER
I am not sure that 'join' is the right terminology here. A join enriches one dataset with fields from another related dataset, based on common key(s). What you are illustrating is an intersection of two lists ('inner join'), which can be amended to simulate what you call 'left outer'.
list1 = [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568'), (102, 'Lex', '515.123.4569')]
list2 = [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568')]
inner_join = [item for item in list1 if item in list2]
left_outer = [item if item in list2 else None for item in list1]
print(inner_join)
print(left_outer)
# [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568')]
# [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568'), None]
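If the lists grow, the same intersection can be sketched with a set, since tuples of strings and integers are hashable. This is only a variation on the comprehensions above, not a different result; the name seen is an assumption made for this example.

```python
list1 = [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568'), (102, 'Lex', '515.123.4569')]
list2 = [(100, 'Steven', '515.123.4567'), (101, 'Neena', '515.123.4568')]

# Build the set once so each membership test does not rescan list2.
seen = set(list2)

inner_join = [item for item in list1 if item in seen]
left_outer = [item if item in seen else None for item in list1]

print(inner_join)
print(left_outer)
```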