How to define a correct index by constructing simple pandas Series?

Time:11-30

I have the following python dictionary:

import pandas as pd

sdata = {'Ohio': 35000, 'Oregon': 16000, 'Texas': 71000, 'Utah': 5000}

Suppose I want to create a pandas Series from this dictionary. For some reason, I want to construct the Series with additional index labels that are not keys in the dictionary:

states = ['California', 'Damascus', 'Ohio', 'Oregon', 'Texas', 'Regensburg', 'Munich']
obj4 = pd.Series(sdata, index=states)
obj4

And the output will be:

California        NaN
Damascus          NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
Regensburg        NaN
Munich            NaN
dtype: float64

In this case, the three values found in sdata (for Ohio, Oregon, and Texas) were placed at the matching labels. Since sdata has no value for California, Damascus, Regensburg, or Munich, those entries appear as NaN. In other words, an index label without a corresponding key in sdata appears as NaN.
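A minimal sketch of this label alignment, reusing sdata and states from above. The names by_label, missing, and dropped are only illustrative:

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Oregon': 16000, 'Texas': 71000, 'Utah': 5000}
states = ['California', 'Damascus', 'Ohio', 'Oregon', 'Texas', 'Regensburg', 'Munich']

# With a dict, pandas looks up each index label among the dict keys.
by_label = pd.Series(sdata, index=states)

# Labels with no matching key end up as NaN.
missing = by_label[by_label.isna()].index.tolist()

# Keys that are not in the requested index are left out of the result.
dropped = [key for key in sdata if key not in by_label.index]

print(missing)
print(dropped)
```

Note that Utah is in sdata but not in states, so it does not appear in the resulting Series at all.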

However, this does not work when I try to create a Series from a list:

labels = ['Covid', 'Delta', 'Omicron', 'Mu']
obj2 = pd.Series([1.5, -2.5, 0], index=labels) 
obj2

The error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-87-3f289c72627f> in <module>()
      1 # use the above created index object as an index in this Serie
----> 2 obj2 = pd.Series([1.5, -2.5, 0], index=labels)
      3 obj2

/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    312                     if len(index) != len(data):
    313                         raise ValueError(
--> 314                             f"Length of passed values is {len(data)}, "
    315                             f"index implies {len(index)}."
    316                         )

ValueError: Length of passed values is 3, index implies 4.

I do not understand why I get this error, given that creating a Series with NaN values is allowed in the first case.

Thank you in advance!

CodePudding user response:

Pass only the dictionary to pd.Series, then call Series.reindex:

obj4 = pd.Series(sdata).reindex(states)

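A dictionary carries its own labels, so pandas can align the values by label. A list has no labels, so its values are matched to the index by position and the index must have the same length as the data. When creating from a list, first slice the index to the length of the data (here, the first 3 values of labels), then reindex to the full list: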

labels = ['Covid', 'Delta', 'Omicron', 'Mu']
obj2 = pd.Series([1.5, -2.5, 0], index=labels[:3]).reindex(labels)
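As an alternative sketch, not part of the answer above, the list can first be paired with its labels via dict(zip(...)), so that the construction aligns by label just as in the dictionary case. The names values and paired are illustrative, and this assumes the values belong to the first labels in order:

```python
import pandas as pd

labels = ['Covid', 'Delta', 'Omicron', 'Mu']
values = [1.5, -2.5, 0]

# zip stops at the shorter sequence, so only the first three labels get a value.
paired = dict(zip(labels, values))

# Building from a dict aligns by label; 'Mu' has no key, so it becomes NaN.
obj2 = pd.Series(paired, index=labels)

# The slicing-plus-reindex approach from the answer, for comparison.
obj2_reindexed = pd.Series(values, index=labels[:len(values)]).reindex(labels)

print(obj2)
```

Both approaches should give the same Series. The dict(zip(...)) form avoids hard-coding the slice length.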