Home > database >  If column's items doesn't equal another column's items, append to a new column
If column's items doesn't equal another column's items, append to a new column

Time:02-04

Let's say I have a CSV invites.csv:

Email Invite                    Email Denied                                                                                       
[email protected]                [email protected]                       
[email protected]             [email protected]             
[email protected]              [email protected]           
[email protected]                           
[email protected]                             
[email protected]   

I want to compare both columns and have a new column made, Emails Left, with just the emails that are not in the Email Denied column. An output like this:

Email Invite                    Email Denied               Emails Left                                                                                 
[email protected]                [email protected]           [email protected]               
[email protected]             [email protected]          [email protected]   
[email protected]              [email protected]        [email protected]   
[email protected]                                          [email protected]
[email protected]                             
[email protected]
[email protected]       

This is my code so far:

import pandas as pd

Dir='invites.csv'

df = pd.read_csv(Dir)
df = pd.DataFrame(df)

a = len(df['Email invite'])
aList = []

for i in range(a):
    if df['Email invite'][i] != df['Email Denied'][i]:
        aList.append(df['Email Invite'][i])  

#place list as third column df['Emails Left']

CodePudding user response:

I figured it out.

I had to make the column rows match each other first before doing the IF statement.

df = pd.read_csv(Dir)
df = pd.DataFrame(df)

df = pd.merge(df[["Email Invite"]],
              df[['Email Denied']],
              left_on='Email Invite',
              right_on='Email Denied',
              how='left')

This then leaves the DataFrame looking like this:

Email Invite                    Email Denied                                                                                       
[email protected]                [email protected]                       
[email protected]                          
[email protected]                         
[email protected]               [email protected]             
[email protected]                             
[email protected]             [email protected]

I continue to my FOR loop and IF statement:

a = len(df['Email invite'])
aList = []

for i in range(a):
    if df['Email invite'][i] != df['Email Denied'][i]:
        aList.append(df['Email Invite'][i])  

df['Emails Left'] = pd.Series(aList)

Now I have my extra column:

Email Invite                  Email Denied           Emails Left                                                                                      
[email protected]              [email protected]       [email protected]                
[email protected]                                  [email protected]  
[email protected]                                   [email protected]
[email protected]             [email protected]              
[email protected]                             
[email protected]           [email protected] 

And I can now transfer it to a new CSV.

df.to_csv("NewInvite.csv", index=False)

Now the program is running good.

CodePudding user response:

It is not recommended to use dataframes with columns of different lengths. You have to pad out remaining elements with NaN or ' ' or something to make them equal lengths. List would work better here:

import pandas as pd

df = pd.DataFrame({'Email Invite': ['[email protected]', '[email protected]', '[email protected]','[email protected]' ,'[email protected]' ,'[email protected]'], 'Email Denied': ['[email protected]', '[email protected]' ,'[email protected]', '','','']})


email_invited = list(df['Email Invite'])
email_denied = list(df[df['Email Denied']!='']['Email Denied'])

email_left = [email for email in email_invited if email not in email_denied]
print(email_left)
  • Related