I'm trying to do a calculation on two columns with pandas. The use case is:
I have values like 100, 101, 102, ... in the column "hesapKodu1". I have split this column into 3 columns holding its leading characters: "hesapKodu1_1" is the first character of "hesapKodu1" (e.g. "1"), "hesapKodu1_2" is the first 2 characters (e.g. "10", "11"), and "hesapKodu1_3" is the first 3 characters.
What I'm trying to do: when hesapKodu1 is 123, 125 or 130, the calculation on the columns BORC and ALACAK should be BORC - ALACAK.
For all other hesapKodu1 values it should be ALACAK - BORC.
At the end, all of the results should be summed and returned.
My current code is below. It can only compute BORC - ALACAK when hesapKodu1 starts with 1; I can't find a way to apply the conditions above.
source_df['hesapKodu1_1']=source_df['hesapKodu1'].str[:1]
source_df['hesapKodu1_2']=source_df['hesapKodu1'].str[:2]
source_df['hesapKodu1_3']=source_df['hesapKodu1'].str[:3]
hk1 = round(source_df.loc[source_df['hesapKodu1_1'] == '1', 'BORÇ'].sum() - source_df.loc[source_df['hesapKodu1_1'] == '1', 'ALACAK'].sum(),2)
CodePudding user response:
You can use np.where(), which would be faster than apply().
import numpy as np
source_df["new_column"] = np.where(
    source_df["hesapKodu1"].isin(["123", "125", "130"]),
    source_df["BORC"] - source_df["ALACAK"],
    source_df["ALACAK"] - source_df["BORC"],
)
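Since the question also asks for the summed result, here is a minimal end-to-end sketch of the same np.where() idea. The sample data and the names debit_codes, prefix and total are made up for illustration, and I'm assuming hesapKodu1 holds strings and that the columns are named BORC and ALACAK as in the question text.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
source_df = pd.DataFrame({
    "hesapKodu1": ["123", "125", "130", "100", "200"],
    "BORC": [100.0, 50.0, 20.0, 10.0, 5.0],
    "ALACAK": [30.0, 10.0, 5.0, 40.0, 25.0],
})

# Codes that use BORC - ALACAK; every other code uses ALACAK - BORC.
debit_codes = ["123", "125", "130"]

# First three characters of the code, like the question's hesapKodu1_3 column.
prefix = source_df["hesapKodu1"].astype(str).str[:3]

# Pick the sign row by row without a Python-level loop.
source_df["new_column"] = np.where(
    prefix.isin(debit_codes),
    source_df["BORC"] - source_df["ALACAK"],
    source_df["ALACAK"] - source_df["BORC"],
)

# Sum all row results and round, as the question's hk1 line does.
total = round(source_df["new_column"].sum(), 2)
print(total)
```

If the codes are longer than three characters, matching on the prefix rather than on the full value keeps the rule tied to the first three characters; if they are always exactly three characters, the two are equivalent.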
CodePudding user response:
You can use apply for that:
def my_func(record):
    # Inspect the current row, not the whole DataFrame
    if record['hesapKodu1'] in ['123', '125', '130']:
        record['new_column'] = record['BORC'] - record['ALACAK']
    else:
        record['new_column'] = record['ALACAK'] - record['BORC']
    return record
target_df = source_df.apply(my_func, axis=1)
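If you only need the final sum, a possible variant of the apply approach is to return one number per row and sum the resulting Series. This is just a sketch: the sample data and the names DEBIT_CODES, signed_balance, balances and total are hypothetical, and it assumes hesapKodu1 is stored as strings.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
source_df = pd.DataFrame({
    "hesapKodu1": ["123", "600"],
    "BORC": [100.0, 10.0],
    "ALACAK": [30.0, 40.0],
})

# Codes that use BORC - ALACAK.
DEBIT_CODES = {"123", "125", "130"}

def signed_balance(record):
    """Return the signed difference for a single row."""
    if record["hesapKodu1"] in DEBIT_CODES:
        return record["BORC"] - record["ALACAK"]
    return record["ALACAK"] - record["BORC"]

# apply with axis=1 calls the function once per row and collects a Series.
balances = source_df.apply(signed_balance, axis=1)
total = round(balances.sum(), 2)
print(total)
```

Row-wise apply runs a Python function for every row, so on large frames it will usually be slower than a vectorized approach.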