How to compare the components of two lists python

Time:06-12

I want to compare two lists, row by row. If 2 rows are equal, add just one of the 2 rows to a new dataframe. But if not, add both rows to the new dataframe.

These are my two lists:

original = data_Unos['text']

13      Speaking to Africa Review   he also pointed ou...
17      Through Gawad Kalinga   Meloto has proven to b...
21      May you attain Nibbana Sena   thank you so muc...
22      Dodgeballs were flying fast and hard at Mornin...
26      Most are from desperately poor Horn of Africa ...
                              ...                        
3155    The statement signed by Ikonomwan Francis   le...
3159      Most of them   the homeless   have the abili...
3162      In Metro Manila   7 464 families of disabled...
3163      We are working with an aim to build a countr...
3172    Our hearts go out to the hundreds of thousands...
Name: text, Length: 794, dtype: object

And:

backTranslated = backTranslated['text']
backTranslated

0      Talking to Africa Review also noted that most ...
1      Through Gawad Kalinga Meloto has proven to be ...
2      May you reach Nibbana Sena thank you so much f...
3      Dodgeballs were flying fast and hard at Mornin...
4      Most of them are from poor countries in the Ho...
                             ...                        
789    The declaration signed by Ikonomwan Francis le...
790    Most of them homeless have the ability to work...
791    In Metro Manila 7 464 families of disabled cyc...
792    We are working with the objective of building ...
793    Our hearts are directed to the hundreds of tho...
Name: text, Length: 794, dtype: object

And this is what I'm trying to do:

final = pd.DataFrame()

for i in original:
  for j in backTranslated:
    if(set(i)!=set(j)):
      final = final.append(i,ignore_index=True) 
      final = final.append(j,ignore_index=True) 
    else:
      final = final.append(i,ignore_index=True) 

But the following error appears in this line:

final = final.append(j,ignore_index=True)

   TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid

How can I do that?

CodePudding user response:

The easiest way is to append both of them and remove duplicates:

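# DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
# Assumes backTranslated is still the full DataFrame, not the 'text' Series.
final = pd.concat([data_Unos, backTranslated])
final = final.drop_duplicates(subset=['text'])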

Iterating in pandas should be a last resort.
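Note that drop_duplicates removes repeated text wherever it occurs, not only within a matching pair of rows. If the comparison really has to be row by row, here is a minimal sketch that still avoids an explicit loop. The function name merge_rowwise is made up for illustration, and it assumes both Series have the same length and are paired by position.

```python
import pandas as pd


def merge_rowwise(original: pd.Series, back: pd.Series) -> pd.DataFrame:
    """Keep one copy of each identical pair and both rows of each differing pair."""
    # Align the two Series by position, since their index labels differ.
    a = original.reset_index(drop=True)
    b = back.reset_index(drop=True)

    # Boolean mask: True where the pair of rows is not identical.
    differs = a != b

    # Append only the differing back-translations, then sort by pair number.
    # mergesort is stable, so each original row stays ahead of its back-translation.
    combined = pd.concat([a, b[differs]]).sort_index(kind="mergesort")
    return combined.reset_index(drop=True).to_frame(name="text")


original = pd.Series(["same text", "Speaking to X"], index=[13, 17])
back = pd.Series(["same text", "Talking to X"])
final = merge_rowwise(original, back)
```

Here the first pair is identical and is kept once, while the second pair differs and both versions are kept.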

CodePudding user response:

The pandas.DataFrame.append method has been deprecated since pandas 1.4.0 and was removed in pandas 2.0. The alternative is the pandas.concat function.

This is the signature of pandas.concat:

pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)

The objs parameter must be a sequence of Series or DataFrame objects, not plain strings. So the correct way to do it in your code is:

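import pandas as pd

final = pd.DataFrame()

# zip pairs the rows by position; nested loops would compare every row with every other row
for i, j in zip(original, backTranslated):
    series_i = pd.Series(i)
    # set(i) != set(j) compares the sets of characters; use i != j for an exact string match
    if set(i) != set(j):
        series_j = pd.Series(j)
        final = pd.concat((final, series_i, series_j), ignore_index=True)
    else:
        final = pd.concat((final, series_i), ignore_index=True)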
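To see why each string has to be wrapped in a pd.Series first, here is a small sketch with made-up values. It concatenates two Series and then captures the error raised when a plain string is passed instead.

```python
import pandas as pd

s1 = pd.Series(["first row"])
s2 = pd.Series(["second row"])

# Works: every element of objs is a Series.
combined = pd.concat([s1, s2], ignore_index=True)

# Fails: a plain str is neither a Series nor a DataFrame.
error_message = ""
try:
    pd.concat([s1, "a plain string"])
except TypeError as exc:
    error_message = str(exc)

print(combined.tolist())
print(error_message)
```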

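Furthermore, note that the parameter is called keys, not key. When Series are concatenated side by side (axis=1), the keys become the column names; for a row-wise concatenation like the one above, rename the column afterwards instead.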
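As a small illustration of the keys parameter of pd.concat, the sketch below puts two Series side by side with axis=1, so that the keys end up as column names. The sample values and the extra "changed" column are assumptions added for the example, not part of the question.

```python
import pandas as pd

original = pd.Series(["same text", "Speaking to X"])
back = pd.Series(["same text", "Talking to X"])

# With axis=1 each Series becomes a column, labelled by the matching key.
side_by_side = pd.concat(
    [original, back], axis=1, keys=["original", "backTranslated"]
)

# Flag the rows where the back-translation differs from the original.
side_by_side["changed"] = side_by_side["original"] != side_by_side["backTranslated"]
```

This layout keeps each pair on one row, which can be easier to inspect than interleaving the rows.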
