Diagonalizing a pandas DataFrame

Time:03-15

Consider the following pandas DataFrame:

import numpy as np
import pandas as pd

df_foo = pd.DataFrame([1,2,3])

I believe I used to be able to diagonalize this DataFrame as follows (see e.g. the thread "Diagonalising a Pandas series"):

df_foo_diag = pd.DataFrame(np.diag(df_foo), index=df_foo.index, columns = df_foo.index)

However, when I do this now, np.diag(df_foo) returns a one-element array containing only the first value of the DataFrame. In other words, numpy seems to extract the diagonal instead of constructing a diagonal array.

How can I construct a diagonal DataFrame out of a single-column DataFrame?

CodePudding user response:

Convert the one-column DataFrame to a Series with DataFrame.squeeze, and then your solution works:

df_foo_diag = pd.DataFrame(np.diag(df_foo.squeeze()),
                           index=df_foo.index,
                           columns=df_foo.index)
print(df_foo_diag)
   0  1  2
0  1  0  0
1  0  2  0
2  0  0  3

df_foo = pd.DataFrame([10,20,30])

df_foo_diag = pd.DataFrame(np.diag(df_foo.squeeze()),
                           index=df_foo.index,
                           columns=df_foo.index)
print(df_foo_diag)
    0   1   2
0  10   0   0
1   0  20   0
2   0   0  30
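
One caveat with squeeze is that a DataFrame with a single row and a single column squeezes to a scalar rather than a Series, which np.diag cannot turn into a matrix. The following is a minimal sketch of a more defensive variant; the helper name diagonalize and the labeled example index are assumptions made for illustration, not part of the original question. It selects the column by position, so the input to np.diag is always 1D, and it reuses the original index for both rows and columns.

```python
import numpy as np
import pandas as pd


def diagonalize(df):
    """Hypothetical helper: diagonal DataFrame from a one-column DataFrame."""
    if df.shape[1] != 1:
        raise ValueError("expected a DataFrame with exactly one column")
    # iloc[:, 0] always yields a Series, so the array is 1D even for one row
    values = df.iloc[:, 0].to_numpy()
    return pd.DataFrame(np.diag(values), index=df.index, columns=df.index)


# Example with a non-default index (labels chosen for illustration)
df_labeled = pd.DataFrame([10, 20, 30], index=["a", "b", "c"])
df_diag = diagonalize(df_labeled)
print(df_diag)

# A 1x1 DataFrame still works, because no squeeze to a scalar happens
single = diagonalize(pd.DataFrame([7]))
print(single.shape)
```

For the three-row inputs used in this thread, this should give the same result as the squeeze version.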

CodePudding user response:

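It doesn't make much sense to use a 2D input: np.diag builds a diagonal array only from a 1D input, and for a 2D input it extracts the diagonal instead.

Just use the relevant column of your DataFrame, and you are back to the original 1D case: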

df_foo_diag = pd.DataFrame(np.diag(df_foo[0]),
                           index=df_foo.index, columns=df_foo.index)

Output:

   0  1  2
0  1  0  0
1  0  2  0
2  0  0  3
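
The behaviour described in the question follows from the two modes of np.diag, which can be checked directly. This is a small sketch using the same df_foo as above; the variable names extracted and built are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df_foo = pd.DataFrame([1, 2, 3])

# 2D input of shape (3, 1): np.diag extracts the main diagonal,
# which here holds only the first value
extracted = np.diag(df_foo.to_numpy())
print(extracted.shape)  # (1,)

# 1D input of shape (3,): np.diag constructs a 3x3 diagonal matrix
built = np.diag(df_foo[0].to_numpy())
print(built.shape)  # (3, 3)
```

Passing a single column (or a squeezed Series) is what switches np.diag from extraction to construction.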