Python: Compare values of two fields with multiple strings and add all unique values to a 3rd field

Time:05-15

I am still a Python rookie and am stuck on an idea. I want to compare the values of Type1 with those of Type2 and add all unique values to a third field.

My demo data: Demo Data Car Types

I tried to import it into a pandas DataFrame and loop through it to append the missing values to Type2 (if Type1 not in Type2, Type2.append(Type1)).

I also tried to concatenate both lists and reduce them to unique values by using set().

I get plenty of results, but not what I need. The result should be: Demo Data Car Types with results

How would you approach this?

Thank you for your support, Konstantin

CodePudding user response:

How is your data stored?

If you work with lists:

Type1 = ["Scirocco", "Golf"]
Type2 = ["Golf", "Polo", "Phaeton"]
Result = set(Type1).union(Type2)  # Result is a set, not a list

If you work with strings:

Type1 = "Scirocco, Golf"
Type2 = "Golf, Polo, Phaeton"
# Convert each comma-separated string into a list
l_Type1 = Type1.split(", ")  # Split on the ", " separator
l_Type2 = Type2.split(", ")
# Build the union from the lists, not from the raw strings
set_Result = set(l_Type1).union(l_Type2)
# Create a string from set_Result values separated by ", "
# (a set does not guarantee the order of its elements)
Result = ", ".join(set_Result)
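
Since a set does not keep the original order, here is a minimal sketch of an order-preserving variant for the string case. The helper name merge_unique and its sep parameter are my own assumptions for illustration, not something from the question:

```python
def merge_unique(first: str, second: str, sep: str = ", ") -> str:
    """Join two comma-separated strings, keeping each value once in first-seen order."""
    # Split both strings on commas and strip stray whitespace around each value.
    values = [v.strip() for v in (first.split(",") + second.split(","))]
    # dict.fromkeys drops duplicates and keeps insertion order (Python 3.7+).
    unique = dict.fromkeys(v for v in values if v)
    return sep.join(unique)


print(merge_unique("Scirocco, Golf", "Golf, Polo, Phaeton"))
# Scirocco, Golf, Polo, Phaeton
```

Empty fields are skipped, so an empty Type1 or Type2 simply contributes nothing.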

CodePudding user response:

You can do it like this:

If the columns store strings:

   Marke           Type1                Type2
0     VW  Scirocco, Golf  Golf, Polo, Phaeton
1   Opel  Ascona, Rekord        Kadett, Corsa
2   Ford  Focus, Granada  Taunus, Focus, Kuga
3  Volvo   V40, V70, 240             V90, S80

df['Result'] = (df['Type1'] + ', ' + df['Type2']).str.split(', ').map(dict.fromkeys).str.join(', ')

print(df)

If the columns store lists:

   Marke             Type1                  Type2
0     VW  [Scirocco, Golf]  [Golf, Polo, Phaeton]
1   Opel  [Ascona, Rekord]        [Kadett, Corsa]
2   Ford  [Focus, Granada]  [Taunus, Focus, Kuga]
3  Volvo   [V40, V70, 240]             [V90, S80]

df['Result'] = (df['Type1'] + df['Type2']).map(dict.fromkeys).str.join(', ')

print(df)
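
For the list case, a self-contained sketch could look like the following. The two-row frame is a shortened copy of the demo data, and the explicit lambda is just one assumed way to spell out the de-duplication step:

```python
import pandas as pd

# Shortened demo data with list-valued columns.
df = pd.DataFrame({
    "Marke": ["VW", "Opel"],
    "Type1": [["Scirocco", "Golf"], ["Ascona", "Rekord"]],
    "Type2": [["Golf", "Polo", "Phaeton"], ["Kadett", "Corsa"]],
})

# Adding two object columns that hold lists concatenates the lists row by row.
combined = df["Type1"] + df["Type2"]

# dict.fromkeys removes duplicates in first-seen order; join the keys back into one string.
df["Result"] = combined.map(lambda values: ", ".join(dict.fromkeys(values)))

print(df["Result"].tolist())
```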

dict.fromkeys removes the duplicates while keeping the keys in insertion order (guaranteed since Python 3.7), so the values come out in the same order as in your target result.

Result:

   Marke           Type1                Type2                         Result
0     VW  Scirocco, Golf  Golf, Polo, Phaeton  Scirocco, Golf, Polo, Phaeton
1   Opel  Ascona, Rekord        Kadett, Corsa  Ascona, Rekord, Kadett, Corsa
2   Ford  Focus, Granada  Taunus, Focus, Kuga   Focus, Granada, Taunus, Kuga
3  Volvo   V40, V70, 240             V90, S80        V40, V70, 240, V90, S80
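
To reproduce the result above end to end for the string case, here is a rough sketch. The DataFrame is rebuilt by hand from the demo table, and the lambda is an assumed, more explicit spelling of the same de-duplication step:

```python
import pandas as pd

# Demo data rebuilt by hand from the table above (string-valued columns).
df = pd.DataFrame({
    "Marke": ["VW", "Opel", "Ford", "Volvo"],
    "Type1": ["Scirocco, Golf", "Ascona, Rekord", "Focus, Granada", "V40, V70, 240"],
    "Type2": ["Golf, Polo, Phaeton", "Kadett, Corsa", "Taunus, Focus, Kuga", "V90, S80"],
})

# Glue both columns into one comma-separated string per row, then split into parts.
parts = (df["Type1"] + ", " + df["Type2"]).str.split(", ")

# Drop duplicates in first-seen order and join the values back into one string.
df["Result"] = parts.map(lambda values: ", ".join(dict.fromkeys(values)))

print(df)
```

This assumes every value is separated by exactly ", "; data with irregular spacing would need stripping first.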