Home > database >  Trying to take a subset of dataframe rows in document A that have a column value shared with documen
Trying to take a subset of dataframe rows in document A that have a column value shared with documen

Time:12-10

enter image description here

dfA = pd.read_excel("DocumentA.xlsx")
dfB = pd.read_excel("DocumentB.xlsx")
dfB_compare = dfB[dfB["Status"]=="Vaccinated"]
dfA_userID = dfA["userID"].tolist()
dfB_userID = dfB_compare["userID"].tolist()
eligible_users.to_excel("eligible_users.xlsx", index=False)

I believe these are the steps I need to take.

Steps

  1. Filter ExcelDataFrame_A by vaccination status == ‘Vaccinated’.
  2. Create new data frame (dfA) containing the vaccinated.
  3. Determine which users (userID) in dfA are present in ExcelDataFrame_B
  4. Create new DataFrame (dfA_present_in_B) of the rows that have a userID present in ExcelDataFrame_B.

CodePudding user response:

import pandas as pd

documentA = pd.read_excel('documentA.xlsx')
documentB = pd.read_excel('documentB.xlsx')
list_of_mutual = []

for i in documentA['name']:
    print (i)
    for j in documentB['name']:
        print (j)
        if i == j:
            list_of_mutual.append(j)

documentC = pd.DataFrame(list_of_mutual, columns = ['mutual'])

CodePudding user response:

I believe I figured it out.

import pandas as pd

### Creates dataframe objects of Excel documents for analysis
documentA = pd.read_excel("documentA.xlsx")
documentB = pd.read_excel("documentB.xlsx")

### Filters Vaccination dataframe by those who are vaccinated
vaccinated_users = documentB[documentB.Status.eq('Vaccinated')]
vaccinated_users_ID = vaccinated_users['UserID']

### Filters users that are in documentA by those who also have been
### vaccinated and keeps data fields (rows) from documentA

registered_and_vaccinated = documentA[documentA['UserID'].isin(vaccinated_users_ID)]

### Creates new Excel Document for Vaccinated users titled "vaccinated_users.xlsx"
vaccinated_users.to_excel("vaccinated_users.xlsx", index=False)
### Creates new Excel Document for Registered AND Vaccinated titled "registered_and_vaccinated.xlsx"
registered_and_vaccinated.to_excel("registered_and_vaccinated.xlsx", index=False)
  • Related