Pandas: assigning a full dataframe to a subdataframe

Time:01-02

I need to assign a 'full' dataframe to a part of another bigger dataframe based on some conditions.

I have two dataframes. The first is, say:

import pandas as pd

df_1 = pd.DataFrame({
  'A': [0, 0, 1, 1, 2, 2],
  'B': [1, 2, 3, 4, 5, 6],
  'C': ['a', 'b', 'c', 'd', 'e',  'f']
}) 

and


df_2 = pd.DataFrame({
  'A': [0, 0, 0],
  'B': [5, 5, 6],
  'C': ['z', 'z', 't']
}) 

What I want to do is something like:

df_1.loc[df_1.A == 0][[ 'B', 'C' ]] = df_2[['B', 'C']]

so that df_1 receives the values of df_2. Instead, the rows of df_1 with A == 0 become NaN.

How can I fix this issue? Thanks for the answers.

CodePudding user response:

With the sample data, a single .loc call with both the row mask and the column list works, because the index labels of the first two rows of df_1 match index labels in df_2. In real data that is usually not the case:

df_1.loc[df_1.A == 0, [ 'B', 'C' ]] = df_2[['B', 'C']]

print (df_1)
   A  B  C
0  0  5  z
1  0  5  z
2  1  3  c
3  1  4  d
4  2  5  e
5  2  6  f

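For a general solution, the examples that follow give df_1 a different index. With that index, the assignment above produces NaNs, because pandas aligns the assigned dataframe on index labels and none of them match.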
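To see why the labels matter, here is a minimal sketch that reindexes df_2 to the row labels selected in df_1. This only approximates the label alignment pandas performs during the assignment; the name `aligned` is introduced here for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({
    'A': [0, 0, 1, 1, 2, 2],
    'B': [1, 2, 3, 4, 5, 6],
    'C': ['a', 'b', 'c', 'd', 'e', 'f']
}, index=list('efghik'))

df_2 = pd.DataFrame({
    'A': [0, 0, 0],
    'B': [5, 5, 6],
    'C': ['z', 'z', 't']
})

m = df_1.A == 0

# df_2 is labelled 0, 1, 2, while the selected rows of df_1 are labelled
# 'e' and 'f'. Looking up those labels in df_2 finds nothing, so every
# cell of the aligned frame is missing.
aligned = df_2[['B', 'C']].reindex(df_1.index[m])
print(aligned)
```

Both rows of `aligned` contain only missing values, which matches the NaNs seen in df_1 after a label-aligned assignment.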

You can count the True values in the mask with sum, take that many rows from df_2, and assign them as a NumPy array, which skips index alignment:

df_1 = pd.DataFrame({
  'A': [0, 0, 1, 1, 2, 2],
  'B': [1, 2, 3, 4, 5, 6],
  'C': ['a', 'b', 'c', 'd', 'e',  'f']
}, index=list('efghik')) 

m = df_1.A == 0
df_1.loc[m, [ 'B', 'C' ]] = df_2[['B', 'C']].iloc[:m.sum()].to_numpy()

print (df_1)
   A  B  C
e  0  5  z
f  0  5  z
g  1  3  c
h  1  4  d
i  2  5  e
k  2  6  f
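The positional approach assumes df_2 has at least as many rows as the mask selects. Below is a hedged sketch that wraps it in a helper with an explicit length check; `assign_by_position` is a hypothetical name, not a pandas function.

```python
import pandas as pd


def assign_by_position(target, mask, source, cols):
    """Copy the first mask.sum() rows of source[cols] into target by position.

    Hypothetical helper for illustration; it modifies target in place
    and also returns it.
    """
    n = int(mask.sum())
    if len(source) < n:
        # Fail early with a clear message instead of a shape-mismatch error.
        raise ValueError(
            f"source has {len(source)} rows but the mask selects {n}"
        )
    # to_numpy() drops the index labels, so values are matched by position.
    target.loc[mask, cols] = source[cols].iloc[:n].to_numpy()
    return target


df_1 = pd.DataFrame({
    'A': [0, 0, 1, 1, 2, 2],
    'B': [1, 2, 3, 4, 5, 6],
    'C': ['a', 'b', 'c', 'd', 'e', 'f']
}, index=list('efghik'))

df_2 = pd.DataFrame({
    'A': [0, 0, 0],
    'B': [5, 5, 6],
    'C': ['z', 'z', 't']
})

assign_by_position(df_1, df_1.A == 0, df_2, ['B', 'C'])
print(df_1)
```

Note that any extra rows in df_2 (here the third one) are silently ignored; whether that is acceptable depends on your data.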

Another idea is to rename the index of df_2 so that its labels match the selected rows of df_1:

m = df_1.A == 0
df_1.loc[m, [ 'B', 'C' ]] = df_2[['B', 'C']].rename(dict(zip(df_2.index, df_1.index[m])))

print (df_1)
   A  B  C
e  0  5  z
f  0  5  z
g  1  3  c
h  1  4  d
i  2  5  e
k  2  6  f
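A possible variation on the renaming idea, sketched under the assumption that df_2 has at least as many rows as the mask selects: replace the index of the needed rows wholesale with `set_axis` instead of building a rename mapping. This avoids the mapping but is otherwise the same label-matching technique; the names `labels` and `replacement` are introduced here for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({
    'A': [0, 0, 1, 1, 2, 2],
    'B': [1, 2, 3, 4, 5, 6],
    'C': ['a', 'b', 'c', 'd', 'e', 'f']
}, index=list('efghik'))

df_2 = pd.DataFrame({
    'A': [0, 0, 0],
    'B': [5, 5, 6],
    'C': ['z', 'z', 't']
})

m = df_1.A == 0
labels = df_1.index[m]  # row labels selected by the mask: 'e' and 'f'

# Keep only as many rows of df_2 as are needed, then give them the
# target labels so that label alignment finds a match for each row.
replacement = df_2[['B', 'C']].iloc[:len(labels)].set_axis(labels, axis=0)

df_1.loc[m, ['B', 'C']] = replacement
print(df_1)
```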