First of all, I'm quite new to programming overall (< 2 months), so I'm sorry if this is a 'simple, no need to ask for help, try it yourself until you get it done' problem.
I have two dataframes with partially the same content: a general overview of the company's mobile numbers with their cost centers, and the monthly invoices with the affected mobile numbers and their invoice amounts.
I'd like to compare the 'mobile-numbers' column of the monthly invoices DF with the 'mobile-numbers' column of the general overview DF and, where a number matches, assign the respective cost center to that mobile number in the monthly invoices DF.
I'd love to share my code with you, but unfortunately I have no idea how to approach this problem.
Thanks
Edit: I'm from Germany and tried my best to explain the problem in English. If there is anything I messed up (so you don't get it), just tell me :)
CodePudding user response:
This program should meet your needs. In the second dataframe I put the value '40' in one row to demonstrate that cost centers which are already filled in are not reset to empty: a value is only replaced when the same mobile number also appears in the general overview dataframe. I also added some print calls for a better view of the result, but in general they are not necessary. If you want a better explanation of the program, comment below, and don't forget to vote and mark the question as solved.
import pandas as pd
general_df = pd.DataFrame({"mobile_number": [1234, 3456, 6545, 4534, 9874],
                           "cost_center": ['23F', '67F', '32W', '42W', '98W']})
invoice_df = pd.DataFrame({"mobile_number": [4534, 5567, 1234, 4871, 1298],
                           "invoice_amount": ['19,99E', '19,99E', '19,99E', '19,99E', '19,99E'],
                           "cost_center": ['', '', '', '', '40']})
print(f"""GENERAL OVERVIEW DF
{general_df}
________________________________________
INVOICE DF
{invoice_df}
_________________________________________
INVOICE RESULT
""")
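def func(line):
    # Cost centers in the general overview whose number matches this invoice row.
    match = general_df.loc[general_df['mobile_number'] == line['mobile_number'], 'cost_center']
    if match.empty:
        # No match: keep whatever the invoice row already contains.
        return line['cost_center']
    # Match found: take the cost center of the first matching row.
    return match.iloc[0]

# Apply the lookup to every invoice row (axis=1 passes one row at a time).
invoice_df['cost_center'] = invoice_df.apply(func, axis=1)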
print(invoice_df)
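
As a side note, the same lookup can probably be done without a row-by-row function. The sketch below is only one possible alternative, not the approach used above: it builds a lookup Series from the general overview and uses Series.map plus fillna. It reuses the sample data from this answer and assumes that every mobile number appears only once in the general overview (with duplicates, map would raise an error); the names lookup and mapped are just illustrative.

```python
import pandas as pd

# Same sample data as in the answer above.
general_df = pd.DataFrame({
    "mobile_number": [1234, 3456, 6545, 4534, 9874],
    "cost_center": ["23F", "67F", "32W", "42W", "98W"],
})
invoice_df = pd.DataFrame({
    "mobile_number": [4534, 5567, 1234, 4871, 1298],
    "invoice_amount": ["19,99E"] * 5,
    "cost_center": ["", "", "", "", "40"],
})

# Lookup Series: index = mobile number, value = cost center.
# Assumption: mobile numbers are unique in general_df.
lookup = general_df.set_index("mobile_number")["cost_center"]

# map() gives NaN for numbers that are missing from the overview ...
mapped = invoice_df["mobile_number"].map(lookup)

# ... and fillna() keeps the existing invoice value in exactly those rows.
invoice_df["cost_center"] = mapped.fillna(invoice_df["cost_center"])

print(invoice_df)
```

With the sample data, the rows for 4534 and 1234 get '42W' and '23F', while the other rows keep what they had, including the '40'. On large invoices this tends to be faster than apply with axis=1, because the matching is done in one vectorized step.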