How to look up data between two columns pandas

Time:04-13

I have Excel files that contain two columns. I want to check whether the value in each cell of column 1 is present anywhere in column 2.

If the value in a column 1 cell is present in column 2, the output must be 1; if not, 0.

Here's the dataframe:

     COLUMN 1                                    COLUMN 2
ZUBEDA SALIBOKO JUMANNE                     REDEMPTHA MATINDI     
STEPHEN STAFFORD MIHUNGO                    PETER G. DATTAN 
JUMANNE MWALIMU                             JOANES PETER LUGAZIA 
HUWAIDA IDRISSA  JUMBE                      HAMIS JUMA IDD  ISAKA  
AIDANIA LUAMBANO                            EDWIN MARTIN  MUHONDEZI  
KESSY BONIFAS FULANO                        RICHARD THOMAS MLIWA   
KENEDY  STEPHEN MSHOMI                      JUMANNE MWALIMU     
JOANES PETER LUGAZIA                        ISAAC RUGEMALILA ABRAHAM 
MWANAISHA MOHAMED MUNGIA                    ZAITUN SALUM MGAZA    
PETRO ZACHARIA MAGANGA                      STEPHEN STAFFORD MIHUNGO 

Desired output

    COLUMN 1                                    COLUMN 2                       RESULTS
ZUBEDA SALIBOKO JUMANNE                     REDEMPTHA MATINDI                      0
STEPHEN STAFFORD MIHUNGO                    PETER G. DATTAN                        1
JUMANNE MWALIMU                             JOANES PETER LUGAZIA                   1
HUWAIDA IDRISSA  JUMBE                      HAMIS JUMA IDD  ISAKA                  0
AIDANIA LUAMBANO                            EDWIN MARTIN  MUHONDEZI                0
KESSY BONIFAS FULANO                        RICHARD THOMAS MLIWA                   0
KENEDY  STEPHEN MSHOMI                      JUMANNE MWALIMU                        0
JOANES PETER LUGAZIA                        ISAAC RUGEMALILA ABRAHAM               1
MWANAISHA MOHAMED MUNGIA                    ZAITUN SALUM MGAZA                     0
PETRO ZACHARIA MAGANGA                      STEPHEN STAFFORD MIHUNGO               0

This is what I tried:

df['RESULTS'] = df['COLUMN 1'] isin df['COLUMN 2']

CodePudding user response:

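You almost had it. isin is a Series method, so call it with a dot and parentheses, then cast the boolean result to int: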

df["RESULTS"] = df["COLUMN 1"].isin(df["COLUMN 2"]).astype(int)

>>> df
                   COLUMN 1                  COLUMN 2  RESULTS
0   ZUBEDA SALIBOKO JUMANNE         REDEMPTHA MATINDI        0
1  STEPHEN STAFFORD MIHUNGO           PETER G. DATTAN        1
2           JUMANNE MWALIMU      JOANES PETER LUGAZIA        1
3    HUWAIDA IDRISSA  JUMBE     HAMIS JUMA IDD  ISAKA        0
4          AIDANIA LUAMBANO   EDWIN MARTIN  MUHONDEZI        0
5      KESSY BONIFAS FULANO      RICHARD THOMAS MLIWA        0
6    KENEDY  STEPHEN MSHOMI           JUMANNE MWALIMU        0
7      JOANES PETER LUGAZIA  ISAAC RUGEMALILA ABRAHAM        1
8  MWANAISHA MOHAMED MUNGIA        ZAITUN SALUM MGAZA        0
9    PETRO ZACHARIA MAGANGA  STEPHEN STAFFORD MIHUNGO        0
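
Some of the names in the question contain double spaces or trailing spaces, which is common in cells read from Excel, and isin only matches exact strings. The following is a minimal sketch, not part of the original answer, of matching on whitespace-normalized copies. The sample rows are a hypothetical rearrangement of names from the question with stray spaces added, and normalize and RESULTS_RAW are illustrative names.

```python
import pandas as pd

# Hypothetical sample: names from the question, with stray spaces
# added to mimic cells read from an Excel sheet.
df = pd.DataFrame({
    "COLUMN 1": ["ZUBEDA SALIBOKO JUMANNE", "JUMANNE MWALIMU ", "JOANES  PETER LUGAZIA"],
    "COLUMN 2": ["REDEMPTHA MATINDI", "JOANES PETER LUGAZIA", " JUMANNE MWALIMU"],
})


def normalize(names: pd.Series) -> pd.Series:
    """Trim both ends and collapse runs of whitespace to a single space."""
    return names.str.strip().str.replace(r"\s+", " ", regex=True)


# Exact matching on the raw strings: the stray spaces hide the matches.
df["RESULTS_RAW"] = df["COLUMN 1"].isin(df["COLUMN 2"]).astype(int)

# Matching on normalized copies; the original columns are left unchanged.
df["RESULTS"] = normalize(df["COLUMN 1"]).isin(normalize(df["COLUMN 2"])).astype(int)

print(df[["RESULTS_RAW", "RESULTS"]])
```

Whether this kind of normalization is appropriate depends on the data; if spacing differences are meaningful, the plain isin call above is the safer choice.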

CodePudding user response:

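Use np.where together with isin. A plain == would only compare the two values in the same row, not check the whole of column 2:

import numpy as np

df["RESULTS"] = np.where(df["COLUMN 1"].isin(df["COLUMN 2"]), 1, 0)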

https://numpy.org/doc/stable/reference/generated/numpy.where.html
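
As a rough illustration of how the condition passed to np.where changes the result, the sketch below compares a row-wise == with a membership test using isin. The three rows are a hypothetical arrangement of names from the question, and same_row and anywhere are illustrative variable names.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: every COLUMN 1 name occurs somewhere in COLUMN 2,
# but only the middle row has the same name in both columns.
df = pd.DataFrame({
    "COLUMN 1": ["JUMANNE MWALIMU", "AIDANIA LUAMBANO", "PETER G. DATTAN"],
    "COLUMN 2": ["PETER G. DATTAN", "AIDANIA LUAMBANO", "JUMANNE MWALIMU"],
})

# Row-wise equality: 1 only where both columns hold the same value in that row.
same_row = np.where(df["COLUMN 1"] == df["COLUMN 2"], 1, 0)

# Membership: 1 where the COLUMN 1 value appears in any row of COLUMN 2.
anywhere = np.where(df["COLUMN 1"].isin(df["COLUMN 2"]), 1, 0)

print(same_row)
print(anywhere)
```

For the lookup described in the question, the membership form is the one that matches the stated requirement.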
