Home > database >  Pandas creating a column after comparing two columns of different sheets
Pandas creating a column after comparing two columns of different sheets

Time:06-15

im working on a project that automaticly using py scripts create finals excel files with sheets.

Now i need to compare a column of current sheet with another column of another sheet and after to create on current sheet a column with return of comparation Yes or NO

CURRENT SHEET: [SvnUsers] firstsheet(curent where need to add)

SECOND SHEET[UserDetails]second sheet

HOW CURRENT SHEET SHOULD LOOK AFTER [SvnUsers]: [output SvnUsers]

So, i need to compare the element of first sheet from column 1 "accountName" with the element from second sheet column1 "Account" if the elements.

If the elements from sheet1 exist in column1 of sheet2, on the created column "Svnaccount?" on sheet1, for the specified element of current sheet from column1, will return Yes if the element exist in secondary sheet or no.

How i started: SCRIPT PHOTO

import pandas as pd
from timestampdirectory import  createdir
import os
import time
def svnanalysis():

    dest = createdir()
    dfSvnUsers = pd.read_excel(os.path.join(dest, "SvnUsers.xlsx"))
    dfSvnGroupMembership = pd.read_excel(os.path.join(dest, "SvnGroupMembership.xlsx"))
    dfSvnRepoGroupAccess = pd.read_excel(os.path.join(dest, "SvnRepoGroupAccess.xlsx"))
    dfsvnReposSize = pd.read_excel(os.path.join(dest, "svnReposSize.xlsx"))
    dfsvnRepoLastChangeDate = pd.read_excel(os.path.join(dest, "svnRepoLastChangeDate.xlsx"))
    dfUserDetails = pd.read_excel(r"D:\GIT-files\Automate-Stats\SVN_sample_files\CM_UsersDetails.xlsx")

    timestr = time.strftime("%Y-%m-%d-")
    xlwriter = pd.ExcelWriter(os.path.join(dest,f'{timestr}Usage-SvnAnalysis.xlsx'))

    dfUserDetails.to_excel(xlwriter, sheet_name='UserDetails',index = False)
    dfSvnUsers.to_excel(xlwriter, sheet_name='SvnUsers', index = False )
    dfSvnGroupMembership.to_excel(xlwriter, sheet_name='SvnGroupMembership', index = False )
    dfSvnRepoGroupAccess.to_excel(xlwriter, sheet_name='SvnRepoGroupAccess', index = False)
    dfsvnReposSize.to_excel(xlwriter, sheet_name='svnReposSize', index = False)
    dfsvnRepoLastChangeDate.to_excel(xlwriter, sheet_name='svnRepoLastChangeDate',index= False)
    xlwriter.close()

    dfUsageSvnAnalysis = pd.read_excel(os.path.join(dest, f'{timestr}Usage-SvnAnalysis.xlsx'), sheet_name='SvnUsers')
    dfUsageSvnAnalysis = pd.read_excel(os.path.join(dest, f'{timestr}Usage-SvnAnalysis.xlsx'), sheet_name='UserDetails')
    dfUsageSvnAnalysis.insert(2,'SVNaccount?',False)
    #here code
       #compare elements of('SvnUsers') sheet column1 with ('SvnUserDetails) column1 sheet
       #In ('SvnUsers') create column [SVNaccount?] and if the elements of column [users] was found on sheet two return Yes if not return NO
    print(dfUsageSvnAnalysis)
svnanalysis()

On # lines should be the right lines

I tried using wk and applying excel formulas but its not good SCRIPT PHOTO OF WK TRY

CodePudding user response:

Thank you for providing such great description, this should do what you want:

import pandas as pd

# Dataframe from which we want to find out whether a pair exists
caller = pd.read_csv('caller.csv').set_index('id')
# Other dataframe we want to search
other = pd.read_csv('other.csv').set_index('id')
# add a column to caller with true if a match in other is found
caller['match'] = caller.index.isin(other.index)
# replace True with 'Yes' and False with 'No'
caller['match'].replace(True, 'Yes', inplace=True)
caller['match'].replace(False, 'No', inplace=True)
# print the result
print(caller)
  • Related