I have 3 excels A,B and C with some data as below.
Characteristics of data in Excel A,B
Each title is unique
For each title, there is a corresponding ID. Please find sample data below.
Excel A Excel B > > **New_Title** **New_ID** **Existing_Title** **Existing_ID** > Title1 W-1 Title3 W-3 > Title2 W-2 Title4 W-4
Excel C
> > **IncidentTitle** > Title1 Title1 Title1 Title2 Title2 Title3 Title4
Excel C can have multiple rows with same title.
I am trying to compare data in A, B to C and write ID column to excel C. So that for each row that has same Title, it gets written with respective ID. For eg: Title1 is appeared 3 times in Excel C, so it's ID: W-1 has to be written thrice in the Excel C and so on. I have tried with some approaches but they were unsuccessful. Any help in identifying where I am going wrong is appreciated.
for i in range(length):
\\Omitted some lines of code here.
for j in range(c_len): //length of excel C.
Title = id_map.at[j, 'IncidentTitle'] //id_map is the dataframe containing the data of Excel C
if Title == New_Title:
Jira_ID.insert(j, New_ID)
elif Title == Existing_Title:
Jira_ID.insert(j, Existing_ID)
else:
continue // Can be any placeholder.
CodePudding user response:
Defining the dataframes:
dfA = pd.DataFrame({
'New_Title': ['Title1', 'Title2'],
'New_ID': ['W-1', 'W-2'],
})
dfB = pd.DataFrame({
'Existing_Title': ['Title3', 'Title4'],
'Existing_ID': ['W-3', 'W-4'],
})
dfC = pd.DataFrame({
'IncidentTitle': ['Title1', 'Title1', 'Title1', 'Title2', 'Title2', 'Title3', 'Title4']
})
First we can union the dfA
and dfB
by pd.concat
,
dfAB = pd.concat([
dfA.rename(columns={'New_Title': 'Title', 'New_ID': 'ID'}),
dfB.rename(columns={'Existing_Title': 'Title', 'Existing_ID': 'ID'}),
])
print(dfAB)
Title ID
0 Title1 W-1
1 Title2 W-2
0 Title3 W-3
1 Title4 W-4
Then one of the pandas ways to do your job is by merge
, it finds matching keys among the two dataframes dfC
and dfAB
, where 'IncidentTitle'
is used for dfC
and 'Title'
is used for dfAB
, and in case there is a match it merges data from dfAB
to dfC
(this merging direction is by using how='left'
)
dfC.merge(dfAB, left_on='IncidentTitle', right_on='Title', how='left').drop(columns='Title')
Final outcome:
IncidentTitle ID
0 Title1 W-1
1 Title1 W-1
2 Title1 W-1
3 Title2 W-2
4 Title2 W-2
5 Title3 W-3
6 Title4 W-4
Note that if you have duplicated Titles in dfAB
, the outcome may not be as expected.