Add series as new column to dataframe using specific index level

Time:09-11

I have this series

n_position
0      0.000000e+00
1      9.960938e-07
2      2.001953e-06
3      2.998047e-06
4      4.003906e-06
           ...     
329    3.289941e-04
330    3.300000e-04
331    3.309961e-04
332    3.320020e-04
333    3.329980e-04
Name: Distance (m), Length: 334, dtype: float64

and this data frame

                                           x (m)     y (m)     z (m)  ...
n_position n_trigger n_channel n_pulse                                ...
0          0         1         1       -0.002926  0.001314  0.071339  ...
                               2       -0.002926  0.001314  0.071339  ...
                     4         1       -0.002926  0.001314  0.071339  ...
                               2       -0.002926  0.001314  0.071339  ...
           1         1         1       -0.002926  0.001314  0.071339  ...
...                                          ...       ...       ...  ...
333        109       4         2       -0.002926  0.001647  0.071339  ...
           110       1         1       -0.002926  0.001647  0.071339  ...
                               2       -0.002926  0.001647  0.071339  ...
                     4         1       -0.002926  0.001647  0.071339  ...
                               2       -0.002926  0.001647  0.071339  ...

[148296 rows x 36 columns]

I want to add the series as a column following the n_position index level. I am trying with

df[series.name] = series

but this adds the column with all the values NaN. Why? And, how can this be done?

CodePudding user response:

Join on the n_position index level. Note that join returns a new DataFrame, so assign the result back:

df = df.join(s, on='n_position')
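A minimal sketch of the join approach on toy data. The frame below is only a stand-in for the one in the question: it has two index levels instead of four, and all values are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for the question's data (made-up values, fewer index levels).
idx = pd.MultiIndex.from_product(
    [[0, 1, 2], [0, 1]], names=["n_position", "n_trigger"]
)
df = pd.DataFrame({"x (m)": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]}, index=idx)
s = pd.Series(
    [0.0, 1.0e-6, 2.0e-6],
    index=pd.Index([0, 1, 2], name="n_position"),
    name="Distance (m)",
)

# `on` may name an index level of the caller; it is matched against the index of s.
# join does not modify df in place, so the result is assigned back.
df = df.join(s, on="n_position")
print(df)
```

Each value of s is repeated for every row that shares the same n_position label.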

CodePudding user response:

You can use a map after extracting the index level, converting to Series:

df['new'] = pd.Series(df.index.get_level_values('n_position')).map(s).to_numpy()

output (slightly modified):

                                            x(m)      y(m)      z(m)  ...           new
n_position n_trigger n_channel n_pulse                                                 
0          0         1         1       -0.002926  0.001314  0.071339  ...  0.000000e+00
                               2       -0.002926  0.001314  0.071339  ...  0.000000e+00
                     4         1       -0.002926  0.001314  0.071339  ...  0.000000e+00
                               2       -0.002926  0.001314  0.071339  ...  0.000000e+00
1          1         1         1       -0.002926  0.001314  0.071339  ...  9.960938e-07
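The map approach can also be written step by step, which may make it easier to see what happens. This is a sketch on made-up toy data (two index levels instead of four), and it calls map directly on the extracted Index rather than wrapping it in a Series first.

```python
import pandas as pd

# Toy stand-ins for the question's data (made-up values, fewer index levels).
idx = pd.MultiIndex.from_product(
    [[0, 1, 2], [0, 1]], names=["n_position", "n_trigger"]
)
df = pd.DataFrame({"x (m)": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]}, index=idx)
s = pd.Series(
    [0.0, 1.0e-6, 2.0e-6],
    index=pd.Index([0, 1, 2], name="n_position"),
    name="Distance (m)",
)

# One n_position label per row of df.
positions = df.index.get_level_values("n_position")

# Use s as a lookup table: each label is replaced by the value s holds for it.
# to_numpy() drops the index, so the assignment is purely positional.
df[s.name] = positions.map(s).to_numpy()
print(df)
```

Labels that do not occur in the index of s end up as NaN in the new column.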

CodePudding user response:

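This normally happens when the index of the Series can't be matched to the index of the DataFrame: on assignment, pandas aligns the Series to the DataFrame's index, and every row without a matching label gets NaN. Sometimes the cause is a mismatched index dtype; in your case, it is because the DataFrame df has a MultiIndex and s does not.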

Example

Producing NaNs

import pandas as pd
df = pd.DataFrame(
    {'a':[1,2,3,4]},
    index=pd.MultiIndex.from_product([['a', 'b'], ['A', 'B']], names=['small', 'captial'])
)
s = pd.Series([5,5,5,5], name='test')
df[s.name]=s
>>> df
               a  test
small captial         
a     A        1   NaN
      B        2   NaN
b     A        3   NaN
      B        4   NaN

Making the index the same before adding the Series to the DataFrame gives

import pandas as pd
df = pd.DataFrame(
    {'a':[1,2,3,4]},
    index=pd.MultiIndex.from_product([['a', 'b'], ['A', 'B']], names=['small', 'captial'])
)
s = pd.Series([5,5,5,5], name='test')
s.index = df.index
df[s.name]=s
>>> df
               a  test
small captial         
a     A        1     5
      B        2     5
b     A        3     5
      B        4     5

Comment

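Setting the index of the Series to the index of the DataFrame is not always the best solution: it only works when the Series has exactly one value per row of the DataFrame, in the same order. Another option is to flatten the MultiIndex into ordinary columns with reset_index(), which leaves a single-level index.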
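A possible sketch of the reset_index() route, reusing the small example frame. Here s is assumed to hold one value per label of the 'small' level (the values 5 and 6 are made up); that assumption is not part of the example above, where s has a default integer index.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [1, 2, 3, 4]},
    index=pd.MultiIndex.from_product([['a', 'b'], ['A', 'B']], names=['small', 'captial'])
)
# Assumption: s is indexed by the labels of the 'small' level.
s = pd.Series([5, 6], index=pd.Index(['a', 'b'], name='small'), name='test')

# Turn the index levels into ordinary columns; the index becomes a plain RangeIndex.
flat = df.reset_index()

# Look up each row's 'small' label in s.
flat[s.name] = flat['small'].map(s)

# Restore the original MultiIndex.
df = flat.set_index(['small', 'captial'])
print(df)
```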

CodePudding user response:

Another option:

df = df.merge(series, left_index=True, right_index=True)
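One thing to keep in mind with merge is that it defaults to how="inner", so rows whose n_position has no entry in the series are dropped. The sketch below illustrates this on made-up toy data (two index levels, and a series that deliberately lacks one position); passing how="left" keeps those rows and fills the new column with NaN instead.

```python
import pandas as pd

# Toy stand-ins for the question's data (made-up values, fewer index levels).
idx = pd.MultiIndex.from_product(
    [[0, 1, 2], [0, 1]], names=["n_position", "n_trigger"]
)
df = pd.DataFrame({"x (m)": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]}, index=idx)

# The series deliberately has no entry for n_position == 2.
series = pd.Series(
    [0.0, 1.0e-6],
    index=pd.Index([0, 1], name="n_position"),
    name="Distance (m)",
)

# Default inner merge: rows with n_position == 2 disappear.
inner = df.merge(series, left_index=True, right_index=True)

# Left merge: every row of df is kept; unmatched rows get NaN.
left = df.merge(series, left_index=True, right_index=True, how="left")

print(len(inner), len(left))
```

The index-on-index merge relies on the series index being named n_position, so that pandas can match it to the level of the same name in the MultiIndex.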