Transform a specific nested list to a pandas dataframe

Time:06-27

My nested list looks like:

[['NP-00002',
  Motor1    0.126878
  Lpi           0.099597
  dtype: float64],
 ['NP-00067',
  Health    0.253135
  Travel     0.157896
  dtype: float64],
 ['LE-00035',
  Train      0.134382
  Property    0.126089
  dtype: float64],
 ['NP-00009',
  Start    0.171959
  Casco    0.163557
  dtype: float64]]

I would like the data in three columns of a pandas DataFrame, with the "dtype: float64" line dropped. I am having trouble splitting on ' ', and also with .astype(str).

Example for the first item in the nested list (two rows of output):

1st column  2nd column  3rd column
NP-00002    Motor1      0.126878
NP-00002    Lpi         0.099597

CodePudding user response:

Use pd.concat:

import pandas as pd

# lst is the nested list from the question: [[label, Series], ...]
df = (pd.concat(dict(lst)).rename_axis(['Type', 'Property'])
        .rename('Value').reset_index())
print(df)

# Output
       Type  Property     Value
0  NP-00002    Motor1  0.126878
1  NP-00002       Lpi  0.099597
2  NP-00067    Health  0.253135
3  NP-00067    Travel  0.157896
4  LE-00035     Train  0.134382
5  LE-00035  Property  0.126089
6  NP-00009     Start  0.171959
7  NP-00009     Casco  0.163557
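
To make it clearer why this works, here is a step-by-step sketch of the same one-liner. The list is rebuilt by hand from the values in the question, and the intermediate names by_label and stacked are only for illustration.

```python
import pandas as pd

# Rebuild the nested list from the question: each item is [label, Series].
lst = [
    ["NP-00002", pd.Series({"Motor1": 0.126878, "Lpi": 0.099597})],
    ["NP-00067", pd.Series({"Health": 0.253135, "Travel": 0.157896})],
    ["LE-00035", pd.Series({"Train": 0.134382, "Property": 0.126089})],
    ["NP-00009", pd.Series({"Start": 0.171959, "Casco": 0.163557})],
]

# dict() accepts any iterable of 2-item sequences, so each label becomes a key
# and each Series becomes the matching value.
by_label = dict(lst)

# Concatenating a dict of Series uses the keys as the outer level of a MultiIndex.
stacked = pd.concat(by_label)

# Name both index levels, name the values, then move the index into columns.
df = stacked.rename_axis(["Type", "Property"]).rename("Value").reset_index()
print(df)
```

Because the Series objects are never converted to strings, the "dtype: float64" line and the padding spaces never enter the data.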

CodePudding user response:

It turned out that my real problem was extra whitespace that is not visible when the pandas DataFrame is displayed. My solution is not very elegant, but it works.

list_output = pd.DataFrame(n_largest, columns=["Policyholder", "Recommendation"])

list_output["Recommendation"] = list_output["Recommendation"].astype(str)
list_output["Recommendation"] = list_output["Recommendation"].str.replace('\n',' ', regex=True)
list_output["Recommendation"] = list_output["Recommendation"].str.replace('dtype: float64',' ', regex=True)
# Collapse every run of whitespace into one space and trim the ends
list_output["Recommendation"] = list_output["Recommendation"].str.replace(r'\s+', ' ', regex=True).str.strip()

output = pd.concat([list_output["Policyholder"],list_output["Recommendation"].str.split(' ', expand=True)], axis=1)
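
For anyone wondering where the hidden whitespace comes from, here is a small sketch of what converting a single Series to a string produces. The values are taken from the first item in the question, and the names s, text and tokens are only for illustration.

```python
import pandas as pd

s = pd.Series({"Motor1": 0.126878, "Lpi": 0.099597})

# str() returns the printed form of the Series: one line per entry, padded
# with spaces for alignment, followed by a final "dtype: float64" line.
text = str(s)
print(repr(text))  # repr() makes the newlines and the padding visible

# Dropping the dtype line and splitting on any run of whitespace leaves
# alternating labels and values.
tokens = text.replace("dtype: float64", "").split()
print(tokens)
```

Calling split() with no argument splits on any run of whitespace, which avoids the empty strings that split(' ') produces when several spaces sit next to each other.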

In the end my output looks a bit different, which is still fine:

   Policyholder  Property1   Value1    Property2   Value2
0  NP-00002      Motor1      0.126878  Lpi         0.099597
1  NP-00067      Health      0.253135  Travel      0.157896
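
The same wide layout can probably be built without going through strings at all. A minimal sketch, assuming n_largest has the [label, Series] structure shown in the question (rebuilt here by hand for the first two items). The Property1/Value1 column names are generated in the loop to mirror the table above.

```python
import pandas as pd

# Hand-built stand-in for n_largest, using the first two items from the question.
n_largest = [
    ["NP-00002", pd.Series({"Motor1": 0.126878, "Lpi": 0.099597})],
    ["NP-00067", pd.Series({"Health": 0.253135, "Travel": 0.157896})],
]

rows = []
for policyholder, recommendation in n_largest:
    row = {"Policyholder": policyholder}
    # Series.items() yields (index label, value) pairs in order.
    for rank, (prop, value) in enumerate(recommendation.items(), start=1):
        row[f"Property{rank}"] = prop
        row[f"Value{rank}"] = value
    rows.append(row)

# One dict per policyholder becomes one row; the dict keys become the columns.
output = pd.DataFrame(rows)
print(output)
```

This keeps the values as floats instead of strings. If one Series had fewer entries than the others, the missing cells would show up as NaN.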

Thank you for all the help!
