I'm new to programming and Python.
I'm trying to write a function that iterates over a DataFrame and stores the result for each row directly in the DataFrame. Here is what I have so far:
import numpy_financial as npf

def principal_loop2(df, col1, col2, col3):
    for i, row in df.iterrows():
        balance = row[col1]
        Suku_bunga = row[col2]
        terms = int(row[col3])
        periode = range(1, terms + 1)
        if balance > 0:
            p = npf.ppmt(
                rate=Suku_bunga/12, per=periode, nper=terms, pv=-balance
            )
            return (p)  # exits the function on the first row with balance > 0
After running it I can get the NumPy array from p, store it in a variable, and turn it into a DataFrame, but that only works for the first data point, because the return exits the function as soon as the first row satisfies the condition. What can I do instead so that I get the results for all rows, either as NumPy arrays or saved directly to the DataFrame?
Thank you.
CodePudding user response:
When making the transition to DataFrames, it's important not to hold too tightly to programming patterns you use with things like lists and dicts.
In this case, you're iterating over the rows of a DataFrame as if it were a list. That's not illegal or anything, but with DataFrames you usually want to express a per-row operation like this as a single call rather than an explicit loop.
There are two things you want to do:
- move the code currently in your loop into a function that takes one row of the DataFrame as its input.
- use the apply() method on the DataFrame to apply that function to every row in the DataFrame, as a single line of code.
So for the function, you'll have something like this:
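import numpy_financial as npf

# The default column names are placeholders; replace them with your own.
def someLambdaFunc(row, col1="balance", col2="Suku_bunga", col3="terms"):
    balance = row[col1]
    Suku_bunga = row[col2]
    terms = int(row[col3])
    periode = range(1, terms + 1)
    if balance > 0:
        p = npf.ppmt(
            rate=Suku_bunga/12, per=periode, nper=terms, pv=-balance
        )
        return p
    # No outstanding balance: nothing to compute for this row.
    return None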
And your one-liner, which is not obvious to someone new to pandas, is:
df["NewValue"] = df.apply(someLambdaFunc, axis=1)
This one-liner applies the function to the DataFrame, and the key bit is axis=1, which passes each row (rather than each column) to the function. Each cell of NewValue then holds whatever the function returned for that row.
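If the goal is a DataFrame with one row per payment period rather than one array per loan, the column of arrays produced by apply() can be unpacked afterwards. The following is only a sketch under assumptions: the column names and sample values are made up, and `principal_schedule` and `row_schedule` are hypothetical helpers. To keep the example free of extra dependencies, `principal_schedule` uses the closed-form annuity formula as a stand-in for the `npf.ppmt` call above (it assumes a nonzero rate); in your own code you would call `npf.ppmt` there instead.

```python
import numpy as np
import pandas as pd


def principal_schedule(balance, annual_rate, terms):
    """Principal part of each monthly payment for periods 1..terms.

    Hypothetical stand-in for
    npf.ppmt(rate=annual_rate / 12, per=range(1, terms + 1), nper=terms, pv=-balance).
    Assumes annual_rate is not zero.
    """
    r = annual_rate / 12
    periods = np.arange(1, terms + 1)
    growth = (1 + r) ** terms
    payment = balance * r * growth / (growth - 1)  # fixed monthly payment
    # The principal repaid in period k grows by a factor (1 + r) per period.
    return (payment - balance * r) * (1 + r) ** (periods - 1)


def row_schedule(row):
    # Rows without an outstanding balance get None instead of an array.
    if row["balance"] <= 0:
        return None
    return principal_schedule(row["balance"], row["Suku_bunga"], int(row["terms"]))


# Made-up sample data; the column names are assumptions.
df = pd.DataFrame({
    "balance": [10000.0, 6000.0, 0.0],
    "Suku_bunga": [0.018, 0.018, 0.018],
    "terms": [10, 5, 10],
})

# One array per row, stored in an object column.
df["principal"] = df.apply(row_schedule, axis=1)

# Long format: one row per (loan, period); the index still identifies the loan.
long_df = df.dropna(subset=["principal"]).explode("principal")
long_df["periode"] = long_df.groupby(level=0).cumcount() + 1
long_df["principal"] = long_df["principal"].astype(float)
```

Here `long_df` keeps the original row label as its index, so all payments of one loan can be selected with `long_df.loc[label]`.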
CodePudding user response:
Here's one way to do it.
My data is made up, so the values might not be representative of the problem you are trying to solve, but it will work with your DataFrame provided the columns are in the same positions as assumed in the solution below.
# payment against loan principal
# numpy_financial.ppmt(rate, per, nper, pv, fv=0, when='end')
import numpy_financial as npf
import pandas as pd

data = {
    "Suku_bunga": [0.018, 0.018, 0.018, 0.018, 0.018, 0.018],
    "periode": [10, 10, 10, 10, 10, 10],
    "terms": [10, 10, 10, 10, 10, 10],
    "balance": [10000, 9000, 8000, 7000, 6000, 0]
}
data = pd.DataFrame(data)

# x is one row; .iloc reads its values by position:
# x.iloc[0] = Suku_bunga, x.iloc[1] = periode, x.iloc[2] = terms, x.iloc[3] = balance
get_principal = lambda x: (
    npf.ppmt(rate=x.iloc[0]/12, per=x.iloc[1], nper=x.iloc[2], pv=-x.iloc[3])
    if x.iloc[3] > 0
    else None
)
data["principal"] = data.apply(get_principal, axis=1)
data
# Output
# Suku_bunga  periode  terms  balance    principal
#      0.018       10     10    10000  1006.758411
#      0.018       10     10     9000   906.082570
#      0.018       10     10     8000   805.406729
#      0.018       10     10     7000   704.730888
#      0.018       10     10     6000   604.055047
#      0.018       10     10        0          NaN
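When only a single period per row is needed, as in the data above, the same column can also be computed with plain column arithmetic and no apply() at all. This is a sketch, not the method used in the answer: it replaces `npf.ppmt` with the closed-form annuity formula for the principal part of a payment, assumes a nonzero interest rate, and reuses the made-up column names from the example.

```python
import pandas as pd

data = pd.DataFrame({
    "Suku_bunga": [0.018] * 6,
    "periode": [10] * 6,
    "terms": [10] * 6,
    "balance": [10000, 9000, 8000, 7000, 6000, 0],
})

r = data["Suku_bunga"] / 12                              # monthly rate
growth = (1 + r) ** data["terms"]
payment = data["balance"] * r * growth / (growth - 1)    # fixed monthly payment

# Principal part of payment number `periode` (closed form, not npf.ppmt).
principal = (payment - data["balance"] * r) * (1 + r) ** (data["periode"] - 1)

# Keep NaN where there is no outstanding balance, as in the apply() version.
data["principal"] = principal.where(data["balance"] > 0)
```

Working on whole columns avoids calling a Python function once per row, which tends to matter once the DataFrame gets large.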