Consider the following example:
import pandas as pd

a = pd.DataFrame([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]], index=['a', 'b', 'c', 'd', 'e'], columns=['A', 'B'])
b = pd.Series([10], index=['c'])
a.loc['a':'c', 'A'] = b
print(a)
This correctly sets the third value of A. I believe this is also the correct way to set a slice of the dataframe.
      A    B
a   NaN  2.0
b   NaN  3.0
c  10.0  4.0
d   4.0  5.0
e   5.0  6.0
Next, consider an example with a MultiIndex.
d = pd.DataFrame([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]], index=pd.MultiIndex.from_tuples([(1, 'a'), (1, 'b'), (1, 'c'), (2, 'd'), (2, 'e')], names=['First', 'Second']), columns=['A', 'B'])
d.loc[1, 'A'] = b
print(d)
This does not set the third value: every row under First == 1 ends up as NaN.
                A  B
First Second
1     a       NaN  2
      b       NaN  3
      c       NaN  4
2     d       4.0  5
      e       5.0  6
[Edit] Here is a more direct example of the problem. I expected the following to copy the A values from group 2 into group 1.
d = pd.DataFrame([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [7, 8]], index=pd.MultiIndex.from_tuples([(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (2, 'b'), (2, 'c')], names=['First', 'Second']), columns=['A', 'B'])
print(d)
#               A  B
# First Second
# 1     a       1  2
#       b       2  3
#       c       3  4
# 2     a       4  5
#       b       5  6
#       c       7  8
d.loc[1, 'A'] = d.loc[2, 'A']
print(d)
#                 A  B
# First Second
# 1     a       NaN  2
#       b       NaN  3
#       c       NaN  4
# 2     a       4.0  5
#       b       5.0  6
#       c       7.0  8
How do you set a slice of a DataFrame that has a MultiIndex?
Answer:
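Index alignment is the reason the MultiIndex case does not work. When you assign a Series through .loc, pandas aligns the Series index with the index of the rows being set. In the single-index case both sides carry the same kind of labels, so 'c' matches 'c'. In the MultiIndex case the target rows are labelled by (First, Second) tuples, while b (and likewise d.loc[2, 'A'], which drops the First level) is labelled only by values from the Second level, so no label matches and every row is filled with NaN.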
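To see the mismatch directly, it can help to compare the index of the right-hand side with the index of the frame being assigned to. The sketch below rebuilds the six-row frame from the question's edit. The names `rhs` and `broken` are illustrative only, and the frame is created with `dtype=float` (an assumption, not in the original) so that NaN can be stored without changing the column dtype.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (2, 'b'), (2, 'c')],
    names=['First', 'Second'],
)
# dtype=float is an assumption made here so NaN fits without an upcast
d = pd.DataFrame(
    [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [7, 8]],
    index=index, columns=['A', 'B'], dtype=float,
)

# Selecting First == 2 drops that level: the result is indexed by Second only
rhs = d.loc[2, 'A']
print(rhs.index.nlevels)   # number of levels on the right-hand side
print(d.index.nlevels)     # number of levels on the frame being assigned to

# Assigning the flat-indexed Series finds no matching (First, Second) labels
broken = d.copy()
broken.loc[1, 'A'] = rhs
print(broken.loc[1, 'A'])
```

The right-hand side has one index level and the frame has two, which is why the assignment cannot line the labels up.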
One way around it is to give the right-hand side the same index as d, so the labels can be matched. In this case a reindex on the last level suffices:
d.loc[1, 'A'] = b.reindex(d.index, level = -1)
                 A  B
First Second
1     a        NaN  2
      b        NaN  3
      c       10.0  4
2     a        4.0  5
      b        5.0  6
      c        7.0  8
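It may help to look at what reindex produces before the assignment. In this sketch, `expanded` is an illustrative name, and the frame is built with `dtype=float` (an assumption, not in the original) so the NaN rows do not force a dtype change. Passing `level=-1` matches the labels of b against the last level of d.index and broadcasts them across the other level, so the result carries the full (First, Second) index.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (2, 'b'), (2, 'c')],
    names=['First', 'Second'],
)
# dtype=float is an assumption made here so NaN fits without an upcast
d = pd.DataFrame(
    [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [7, 8]],
    index=index, columns=['A', 'B'], dtype=float,
)
b = pd.Series([10], index=['c'])

# Expand b onto the full MultiIndex, matching on the last level (Second)
expanded = b.reindex(d.index, level=-1)
print(expanded)

# The assignment now aligns on complete (First, Second) labels
d.loc[1, 'A'] = expanded
print(d)
```

Only the rows under First == 1 are written, so the value that `expanded` holds for (2, 'c') is not used.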
Use the same concept for the second example in your question:
d.loc[1, 'A'] = d.loc[2, 'A'].reindex(d.index, level = -1)
d
              A  B
First Second
1     a       4  2
      b       5  3
      c       7  4
2     a       4  5
      b       5  6
      c       7  8
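If this pattern comes up often, it could be wrapped in a small helper. The function `copy_between_groups` below is hypothetical (it is not part of pandas) and assumes a two-level MultiIndex where the labels on the last level identify matching rows in both groups.

```python
import pandas as pd


def copy_between_groups(df, source, target, column):
    """Hypothetical helper: copy `column` from group `source` to group `target`.

    Rows are matched on the last index level; assumes a two-level MultiIndex.
    The frame is modified in place and also returned.
    """
    # Broadcast the source values across the full index so .loc can align them
    values = df.loc[source, column].reindex(df.index, level=-1)
    df.loc[target, column] = values
    return df


index = pd.MultiIndex.from_tuples(
    [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (2, 'b'), (2, 'c')],
    names=['First', 'Second'],
)
d = pd.DataFrame(
    [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [7, 8]],
    index=index, columns=['A', 'B'],
)

copy_between_groups(d, source=2, target=1, column='A')
print(d)
```

Because every label in group 1 has a counterpart in group 2, no NaN is introduced here. If some labels were missing from the source group, the corresponding target rows would become NaN.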