How to replace some columns' values with their log forms and create a new DataFrame in Python?

Time:10-25

I have a DataFrame with 4 columns. I want to calculate the log form of three columns and then make a new DataFrame.

   year     gnp  labor  capital
0  1955  114043   8310   182113
1  1956  120410   8529   193745
2  1957  129187   8738   205192
3  1958  134705   8952   215130

This is what my DataFrame looks like. I want to calculate the log values of all the columns except 'year'. Here is my code:

ln_gnp = np.log(df.gnp)
ln_labor = np.log(df.labor)
ln_capital = np.log(df.capital)

Now, I want to create a new DataFrame with 'year', 'ln_gnp', 'ln_labor', and 'ln_capital'. I have tried some methods but none of them gave me the result that I want.

CodePudding user response:

Consider the following simple dataset:

import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [1, 2, 3, 4, 5],
                   'gnp': [100, 200, 300, 400, 500],
                   'labor': [1000, 2000, 3000, 4000, 5000],
                   'capital': [1e4, 2e4, 3e4, 4e4, 5e4]})
df

Here is one solution: add the log columns to df, then copy the columns you need into a new DataFrame:

df['ln_gnp'] = np.log(df['gnp'])
df['ln_labor'] = np.log(df['labor'])
df['ln_capital'] = np.log(df['capital'])

df1 = df[['year', 'ln_gnp', 'ln_labor', 'ln_capital']].copy()
df1

CodePudding user response:

Here is one way to do it.

A simpler approach:

# Using applymap, take the log of the three columns,
# then concat the result with the year column.
# Note: applymap is deprecated since pandas 2.1, where DataFrame.map replaces it.

df2 = pd.concat([df['year'],
                 df[['gnp', 'labor', 'capital']].applymap(np.log)],
                axis=1)
df2

   year        gnp     labor    capital
0  1955  11.644331  9.025215  12.112383
1  1956  11.698658  9.051227  12.174298
2  1957  11.769016  9.075437  12.231701
3  1958  11.810842  9.099632  12.278998

If you need to use the Series you already created:

# Create a DataFrame from the Series you already created

df2 = pd.DataFrame({'year': df['year'], 'gnp': ln_gnp, 'labor': ln_labor, 'capital': ln_capital})
df2
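
As a possible variation on the concat approach above, np.log can be applied to the whole sub-DataFrame in one call, and add_prefix can give the columns the 'ln_' names the question asks for. This is only a sketch: the names cols and logs are made up for illustration, and the data is the four rows from the question.

```python
import numpy as np
import pandas as pd

# The four sample rows from the question
df = pd.DataFrame({
    "year": [1955, 1956, 1957, 1958],
    "gnp": [114043, 120410, 129187, 134705],
    "labor": [8310, 8529, 8738, 8952],
    "capital": [182113, 193745, 205192, 215130],
})

# 'cols' and 'logs' are illustrative names, not from the original answers
cols = ["gnp", "labor", "capital"]

# np.log is a ufunc, so it accepts the whole sub-DataFrame at once;
# add_prefix renames the result to ln_gnp, ln_labor, ln_capital
logs = np.log(df[cols]).add_prefix("ln_")

# Put 'year' next to the log columns; df itself is left unchanged
df2 = pd.concat([df["year"], logs], axis=1)
print(df2)
```

Because nothing is assigned back to df, the original DataFrame keeps its four columns.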

CodePudding user response:

Another possible solution:

df['ln_' + df.columns[1:]] = np.log(df.iloc[:, 1:])

Output:

   year     gnp  labor  capital     ln_gnp  ln_labor  ln_capital
0  1955  114043   8310   182113  11.644331  9.025215   12.112383
1  1956  120410   8529   193745  11.698658  9.051227   12.174298
2  1957  129187   8738   205192  11.769016  9.075437   12.231701
3  1958  134705   8952   215130  11.810842  9.099632   12.278998
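
The assignment above adds the log columns to df itself. If the goal is a separate DataFrame holding only 'year' and the log columns, one option is to run the same assignment on a copy and then select the columns. A minimal sketch, assuming 'year' is the first column; work, new_cols and df_new are illustrative names:

```python
import numpy as np
import pandas as pd

# The four sample rows from the question
df = pd.DataFrame({
    "year": [1955, 1956, 1957, 1958],
    "gnp": [114043, 120410, 129187, 134705],
    "labor": [8310, 8529, 8738, 8952],
    "capital": [182113, 193745, 205192, 215130],
})

# Work on a copy so the original df keeps only its four columns
work = df.copy()

# Build the new column labels: ln_gnp, ln_labor, ln_capital
new_cols = "ln_" + work.columns[1:]

# Same assignment as in the answer, applied to the copy
work[new_cols] = np.log(work.iloc[:, 1:])

# Keep only 'year' and the log columns in the new DataFrame
df_new = work[["year", *new_cols]]
print(df_new)
```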