I have the following Python dictionary:
sdata = {'Ohio': 35000, 'Oregon': 16000, 'Texas': 71000, 'Utah': 5000}
Suppose I want to create a pandas Series from this dictionary. For some reason, I want to construct the Series with additional index labels that are not keys of the dictionary:
import pandas as pd
states = ['California', 'Damascus', 'Ohio', 'Oregon', 'Texas', 'Regensburg', 'Munich']
obj4 = pd.Series(sdata, index=states)
obj4
The output is:
California NaN
Damascus NaN
Ohio 35000.0
Oregon 16000.0
Texas 71000.0
Regensburg NaN
Munich NaN
dtype: float64
In this case, the 3 values found in sdata were placed in the appropriate locations, but since no values were found for California, Damascus, Regensburg, and Munich, they appear as NaN.
In other words, an index label without a corresponding value in sdata will appear as NaN.
However, this does not work when I try to create a Series from a list:
labels = ['Covid', 'Delta', 'Omicron', 'Mu']
obj2 = pd.Series([1.5, -2.5, 0], index=labels)
obj2
The error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-87-3f289c72627f> in <module>()
1 # use the above created index object as an index in this Serie
----> 2 obj2 = pd.Series([1.5, -2.5, 0], index=labels)
3 obj2
/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
312 if len(index) != len(data):
313 raise ValueError(
--> 314 f"Length of passed values is {len(data)}, "
315 f"index implies {len(index)}."
316 )
ValueError: Length of passed values is 3, index implies 4.
I do not understand why I get this error message, given that creating a Series with NaN values is allowed in the first case.
Thank you in advance!
CodePudding user response:
Pass only the dictionary to pd.Series and then call Series.reindex:
obj4 = pd.Series(sdata).reindex(states)
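As a quick check, the sketch below compares this reindex form with the constructor call from the question. It reuses sdata and states from above; the names direct and reindexed are only illustrative. Both forms align values by label, so a label missing from sdata becomes NaN and a key missing from states (here Utah) is dropped.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Oregon': 16000, 'Texas': 71000, 'Utah': 5000}
states = ['California', 'Damascus', 'Ohio', 'Oregon', 'Texas', 'Regensburg', 'Munich']

# Constructor form from the question: dict keys are matched against `states`.
direct = pd.Series(sdata, index=states)

# Two-step form from the answer: build from the dict, then reindex by label.
reindexed = pd.Series(sdata).reindex(states)

# Series.equals treats NaN values in the same positions as equal.
print(direct.equals(reindexed))
print(reindexed)
```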
If you need to create the Series from a list, the index must first have the same length as the data list. For example, for 3 values, take the first 3 entries of the list labels and then reindex with the full list:
labels = ['Covid', 'Delta', 'Omicron', 'Mu']
obj2 = pd.Series([1.5, -2.5, 0], index=labels[:3]).reindex(labels)
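The difference between the two cases seems to come down to labels: a dictionary carries its own keys, so pandas can align by label, while a plain list has only positions, so its length must match the index. The sketch below illustrates this and shows one possible alternative, assuming the values belong to the first labels in order: pair them with zip before building the Series. The variable names values and raised are only illustrative.

```python
import numpy as np
import pandas as pd

labels = ['Covid', 'Delta', 'Omicron', 'Mu']
values = [1.5, -2.5, 0]

# A list has no labels, so pandas pairs values with the index by position
# and raises ValueError when the lengths differ.
try:
    pd.Series(values, index=labels)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# zip stops at the shorter input, so only the first three labels get values.
# Passing the resulting dict lets pandas align by label and fill 'Mu' with NaN.
obj2 = pd.Series(dict(zip(labels, values)), index=labels)
print(obj2)
```

This relies on the list order matching the label order; if the values belong to other labels, the pairing would have to be spelled out explicitly.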