Unable to find values unique to a string list when compared with another in Python

Time:06-10

I have two databases, old_root and root, containing 309 and 483 tables respectively. I extracted the table names of each database into a list using a function I wrote called table_search(), which returns a boolean flag (irrelevant here) and a list of all tables in the database; it is called in the snippet below. The function works correctly, and the lists contain the names of all tables for both databases.

unique_d1_tables = []
unique_d2_tables = []
d1_tables = []
d2_tables = []
unique_total_tables = []
temp_d1 = []

f, tab_list = table_search("old_root", "x")
d1_tables = tab_list
temp_d1=tab_list
print("Total number of tables in old_root:", len(d1_tables))
          
f, tab_list = table_search("root", "x")
unique_total_tables = d1_tables
d2_tables = tab_list

print("Total number of tables in root:", len(tab_list))
print("Total number of tables in both old_root & root (including duplicates):", len(d1_tables + d2_tables))

unique_total_tables.extend(x for x in tab_list if x not in unique_total_tables)
print("Total number of tables in both old_root & root (excluding duplicates):", len(unique_total_tables))

unique_d1_tables.extend(x for x in d1_tables if x not in d2_tables)
print("Total number of tables in old_root not present in root:", len(unique_d1_tables))

unique_d2_tables.extend(x for x in d2_tables if x not in d1_tables)
print("Total number of tables in root not present in old_root:", len(unique_d2_tables))

The snippet works perfectly except for the last case, where the unique tables of D2 (root) are to be found. The output:

Total number of tables in old_root: 309
Total number of tables in root: 483
Total number of tables in both old_root & root (including duplicates): 792
Total number of tables in both old_root & root (excluding duplicates): 484
Total number of tables in old_root not present in root: 1
Total number of tables in root not present in old_root: 0

The last value should be 175, but unique_d2_tables comes out as an empty list. I also tested with smaller dummy databases containing 2 and 3 tables, and the last case failed again. Please let me know the flaw in my code, as something is not being handled correctly.

User response:

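unique_total_tables = d1_tables does not copy the list; it binds a second name to the same list object that d1_tables refers to. This means that when you extend unique_total_tables you are actually extending d1_tables as well. By the time the last comparison runs, d1_tables already contains every table from d2_tables, so nothing in d2_tables is "not in d1_tables" and the result is empty.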
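A minimal sketch of the aliasing, assuming short made-up table names in place of the real table_search() results (the lists below are hypothetical, not the asker's data):

```python
# Hypothetical stand-ins for the lists returned by table_search()
d1_tables = ["a", "b", "c"]
d2_tables = ["b", "c", "d", "e"]

# Plain assignment: no copy is made, both names refer to one list object
unique_total_tables = d1_tables
print(unique_total_tables is d1_tables)  # True

# Extending through one name changes the list seen through the other
unique_total_tables.extend(x for x in d2_tables if x not in unique_total_tables)
print(d1_tables)  # ['a', 'b', 'c', 'd', 'e']

# d1_tables now holds everything in d2_tables, so this is empty
unique_d2_tables = [x for x in d2_tables if x not in d1_tables]
print(unique_d2_tables)  # []
```

This also matches the reported output: after the extend, d1_tables holds all 484 distinct names, so exactly one of them is missing from root and none of root's names are missing from d1_tables.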

You can fix this by making a copy of the list.

Use unique_total_tables = d1_tables[:]. This creates a shallow copy and is equivalent to unique_total_tables = [x for x in d1_tables], list(d1_tables), or d1_tables.copy().
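As a rough illustration of the fix, here is the same logic with the copy in place, again using hypothetical table names instead of the real table_search() results. The set-based lines at the end are an alternative not taken from the original snippet; they assume table names are unique within each database and that order does not matter.

```python
# Hypothetical stand-ins for the lists returned by table_search()
d1_tables = ["a", "b", "c"]
d2_tables = ["b", "c", "d", "e"]

# Shallow copy: unique_total_tables is now a separate list object
unique_total_tables = d1_tables[:]
unique_total_tables.extend(x for x in d2_tables if x not in unique_total_tables)

# d1_tables is untouched, so both differences come out as expected
unique_d1_tables = [x for x in d1_tables if x not in d2_tables]
unique_d2_tables = [x for x in d2_tables if x not in d1_tables]

print(len(unique_total_tables))  # 5
print(unique_d1_tables)          # ['a']
print(unique_d2_tables)          # ['d', 'e']

# Alternative sketch with sets (order is not preserved, hence sorted())
only_in_d2 = sorted(set(d2_tables) - set(d1_tables))
print(only_in_d2)                # ['d', 'e']
```

The set version avoids the repeated "x not in list" scans, which can matter for longer lists, but for a few hundred table names either approach should be fine.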
