Context

Say I have a multi-indexed dataframe as follows:

import numpy as np
import pandas as pd

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
data = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
    [9, 10],
    [11, 12],
    [13, 14],
    [15, 16],
])
df = pd.DataFrame(data, index=index, columns=('a', 'b'))

which looks something like this:

               a   b
first second        
bar   one      1   2
      two      3   4
baz   one      5   6
      two      7   8
foo   one      9  10
      two     11  12
qux   one     13  14
      two     15  16

I would like to copy the values of column a for the first index level bar into the same column for the first index level qux, aligned on the second level of the index (here called second). In other words, I would like to obtain the following dataframe from the one above:

               a   b
first second        
bar   one      1   2
      two      3   4
baz   one      5   6
      two      7   8
foo   one      9  10
      two     11  12
qux   one      1  14  # <-- column a changed to match first = bar for second = one
      two      3  16  # <-- column a changed to match first = bar for second = two

I understand, based on the answer to a related question, that I can accomplish this by using pd.IndexSlice in conjunction with .loc and .values as follows:

df.loc[pd.IndexSlice['qux', :], 'a'] = df.loc[pd.IndexSlice['bar', :], 'a'].values

I don't intuitively like this (perhaps unjustifiably), as it's not clear to me whether the values will always be aligned on the second index level.

Question:

Can I guarantee that the above assignment (accessing using .values) will always be aligned on the second level of the multi-index?

If not, is there a way of accomplishing what I'm trying to achieve?

Answer:

No, alignment is not guaranteed. .values (for which the pandas documentation now recommends .to_numpy()) returns the underlying NumPy array, which discards all index and column information, so the assignment is purely positional. The result matches on second only when the bar and qux rows happen to appear in the same order.
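To make the positional behaviour concrete, here is a minimal sketch. The small frame `demo` and the boolean masks `is_bar` and `is_qux` are names made up for this illustration (they are not from the question), and the qux rows are deliberately stored in the order two, one:

```python
import pandas as pd

# Hypothetical frame: the qux rows are stored as (two, one), unlike bar.
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"), ("qux", "two"), ("qux", "one")],
    names=["first", "second"],
)
demo = pd.DataFrame({"a": [1, 3, 13, 15]}, index=index)

# Boolean masks select the rows of each first-level group.
is_bar = demo.index.get_level_values("first") == "bar"
is_qux = demo.index.get_level_values("first") == "qux"

# The array carries no labels, so values are written in row order:
# (qux, two) receives 1, which is the value of (bar, one).
demo.loc[is_qux, "a"] = demo.loc[is_bar, "a"].to_numpy()
print(demo)
```

With the sorted index from the question the two groups share the same order, so the same assignment happens to give the desired result there.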

Here's one solution to preserve the alignment:

df.loc['qux', 'a'] = df.loc['qux', 'a'].index.map(df.loc['bar', 'a'].to_dict())

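Output (the bar rows here appear in the order two, one, which shows the values being matched by label rather than by position):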

>>> df
                 a   b
first second          
bar   two      1.0   2
      one      3.0   4
baz   one      5.0   6
      two      7.0   8
foo   one      9.0  10
      two     11.0  12
qux   one      3.0  14
      two      1.0  16