dfA = pd.read_excel("DocumentA.xlsx")
dfB = pd.read_excel("DocumentB.xlsx")
dfB_compare = dfB[dfB["Status"]=="Vaccinated"]
dfA_userID = dfA["userID"].tolist()
dfB_userID = dfB_compare["userID"].tolist()
eligible_users.to_excel("eligible_users.xlsx", index=False)
I believe these are the steps I need to take.
Steps
- Filter ExcelDataFrame_A by vaccination status == ‘Vaccinated’.
- Create new data frame (dfA) containing the vaccinated.
- Determine which users (userID) in dfA are present in ExcelDataFrame_B
- Create new DataFrame (dfA_present_in_B) of the rows that have a userID present in ExcelDataFrame_B.
CodePudding user response:
import pandas as pd
documentA = pd.read_excel('documentA.xlsx')
documentB = pd.read_excel('documentB.xlsx')
list_of_mutual = []
for i in documentA['name']:
print (i)
for j in documentB['name']:
print (j)
if i == j:
list_of_mutual.append(j)
documentC = pd.DataFrame(list_of_mutual, columns = ['mutual'])
CodePudding user response:
I believe I figured it out.
import pandas as pd
### Creates dataframe objects of Excel documents for analysis
documentA = pd.read_excel("documentA.xlsx")
documentB = pd.read_excel("documentB.xlsx")
### Filters Vaccination dataframe by those who are vaccinated
vaccinated_users = documentB[documentB.Status.eq('Vaccinated')]
vaccinated_users_ID = vaccinated_users['UserID']
### Filters users that are in documentA by those who also have been
### vaccinated and keeps data fields (rows) from documentA
registered_and_vaccinated = documentA[documentA['UserID'].isin(vaccinated_users_ID)]
### Creates new Excel Document for Vaccinated users titled "vaccinated_users.xlsx"
vaccinated_users.to_excel("vaccinated_users.xlsx", index=False)
### Creates new Excel Document for Registered AND Vaccinated titled "registered_and_vaccinated.xlsx"
registered_and_vaccinated.to_excel("registered_and_vaccinated.xlsx", index=False)