I have a DataFrame with four columns. I want to take the natural log of three of them and then build a new DataFrame from the results.
   year     gnp  labor  capital
0  1955  114043   8310   182113
1  1956  120410   8529   193745
2  1957  129187   8738   205192
3  1958  134705   8952   215130
This is what my DataFrame looks like. I want to calculate the log of every column except 'year'. Here is my code:
ln_gnp = np.log(df.gnp)
ln_labor = np.log(df.labor)
ln_capital = np.log(df.capital)
Now, I want to create a new DataFrame with 'year', 'ln_gnp', 'ln_labor', and 'ln_capital'. I have tried some methods but none of them gave me the result that I want.
CodePudding user response:
Consider the following simple dataset:
import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [1, 2, 3, 4, 5],
                   'gnp': [100, 200, 300, 400, 500],
                   'labor': [1000, 2000, 3000, 4000, 5000],
                   'capital': [1e4, 2e4, 3e4, 4e4, 5e4]})
df
Here is one solution. It adds the log columns to df and then copies the columns you want into a new DataFrame:
df['ln_gnp'] = np.log(df['gnp'])
df['ln_labor'] = np.log(df['labor'])
df['ln_capital'] = np.log(df['capital'])
df1=df[['year', 'ln_gnp', 'ln_labor', 'ln_capital']].copy()
df1
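If you would rather not add the log columns to the original df, a minimal sketch using assign() is shown below. The sample data is rebuilt from the table in the question; df1 is the same illustrative name as above, and this is only one possible variant of the approach.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "year": [1955, 1956, 1957, 1958],
    "gnp": [114043, 120410, 129187, 134705],
    "labor": [8310, 8529, 8738, 8952],
    "capital": [182113, 193745, 205192, 215130],
})

# Start from the year column and attach the log columns with assign(),
# which returns a new DataFrame and leaves df unchanged.
df1 = df[["year"]].assign(
    ln_gnp=np.log(df["gnp"]),
    ln_labor=np.log(df["labor"]),
    ln_capital=np.log(df["capital"]),
)
print(df1)
```

Because assign() returns a new object, df still has only its four original columns afterwards.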
CodePudding user response:
Here is a simpler approach:
# use applymap to take the log of the three columns
# (applymap is deprecated since pandas 2.1 in favor of DataFrame.map),
# then concat the result with the year column
df2 = pd.concat([df['year'],
                 df[['gnp', 'labor', 'capital']].applymap(np.log)],
                axis=1)
df2
   year        gnp     labor    capital
0  1955  11.644331  9.025215  12.112383
1  1956  11.698658  9.051227  12.174298
2  1957  11.769016  9.075437  12.231701
3  1958  11.810842  9.099632  12.278998
If you want to reuse the Series you already created:
# create a DataFrame from the Series you already created
df2 = pd.DataFrame({'year': df['year'], 'gnp': ln_gnp, 'labor': ln_labor, 'capital': ln_capital})
df2
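Since np.log is a NumPy ufunc, it can also be applied to a whole slice of the DataFrame at once, without applymap. The sketch below assumes the data from the question; the list name cols is only illustrative. It also prefixes the new columns with ln_ so they match the names asked for in the question.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "year": [1955, 1956, 1957, 1958],
    "gnp": [114043, 120410, 129187, 134705],
    "labor": [8310, 8529, 8738, 8952],
    "capital": [182113, 193745, 205192, 215130],
})

cols = ["gnp", "labor", "capital"]  # columns to transform (illustrative name)

# np.log works element-wise on the selected columns and returns a DataFrame;
# add_prefix renames the columns, and join aligns them with year on the index.
df2 = df[["year"]].join(np.log(df[cols]).add_prefix("ln_"))
print(df2)
```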
CodePudding user response:
Another possible solution:
df['ln_' + df.columns[1:]] = np.log(df.iloc[:, 1:])
Output:
   year     gnp  labor  capital     ln_gnp  ln_labor  ln_capital
0  1955  114043   8310   182113  11.644331  9.025215   12.112383
1  1956  120410   8529   193745  11.698658  9.051227   12.174298
2  1957  129187   8738   205192  11.769016  9.075437   12.231701
3  1958  134705   8952   215130  11.810842  9.099632   12.278998
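This output keeps the original columns next to the new ones. To get only 'year' and the ln_ columns as a separate DataFrame, one option is to filter the columns by name afterwards. The sketch below is one possible way, assuming the data from the question; the names wide and result are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "year": [1955, 1956, 1957, 1958],
    "gnp": [114043, 120410, 129187, 134705],
    "labor": [8310, 8529, 8738, 8952],
    "capital": [182113, 193745, 205192, 215130],
})

# Work on a copy so the original df is not modified.
wide = df.copy()
# Add one ln_-prefixed column for every column after 'year'.
wide["ln_" + wide.columns[1:]] = np.log(wide.iloc[:, 1:])

# Keep only 'year' and the columns whose names start with 'ln_'.
result = wide.filter(regex=r"^(year|ln_)")
print(result)
```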