Replace empty strings in a dataframe with values from another dataframe with a different index

Time:02-27

Two sample dataframes with different index values, but identical column names and order:

df1 = pd.DataFrame([[1, '', 3], ['', 2, '']], columns=['A', 'B', 'C'], index=[2,4])

df2 = pd.DataFrame([['', 4, ''], [5, '', 6]], columns=['A', 'B', 'C'], index=[7,9])

df1

    A   B   C
2   1       3
4       2   

df2

   A    B   C
7       4   
9   5       6

I know how to concat the two dataframes, but that gives this:

   A    B   C
2   1       3
4       2   

It omits the rows with non-matching indexes from the other df.

The result I am trying to achieve is:

    A   B   C
0   1   4   3   
1   5   2   6

I want to combine the rows with the same index values from each df so that missing values in one df are replaced by the corresponding value in the other.

I have found that concat and merge are not up to the job.
I assume I have to have identical indexes in each df which correspond to the values I want to merge into one row. But, so far, no luck getting it to come out correctly. Any pandas transformational wisdom is appreciated.

This merge attempt did not do the trick:

df1.merge(df2, on='A', how='outer')

The solutions below were all offered before I edited the question. My fault there: I neglected to point out that my actual data has different indexes in the two dataframes.

CodePudding user response:

Let us try mask

out = df1.mask(df1=='',df2)
Out[428]: 
   A  B  C
0  1  4  3
1  5  2  6
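
Note that `mask` aligns the replacement frame on index labels, so with the indexes from the edited question (2, 4 versus 7, 9) nothing would line up. A minimal sketch, assuming the rows should be paired by position rather than by label, is to reset both indexes first. The names `a` and `b` are only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame([[1, '', 3], ['', 2, '']], columns=['A', 'B', 'C'], index=[2, 4])
df2 = pd.DataFrame([['', 4, ''], [5, '', 6]], columns=['A', 'B', 'C'], index=[7, 9])

# Drop the original labels so both frames share a 0..n-1 index
a = df1.reset_index(drop=True)
b = df2.reset_index(drop=True)

# Wherever a holds an empty string, take the value at the same position in b
out = a.mask(a == '', b)
print(out)
```

This assumes both frames have the same number of rows and that row order is what ties them together.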

CodePudding user response:

for i in range(df1.shape[0]):
    for j in range(df1.shape[1]):
        if df1.iloc[i,j]=="":
            df1.iloc[i,j] = df2.iloc[i,j]

print(df1)


    A   B   C
0   1   4   3
1   5   2   6

CodePudding user response:

Since the indexes of your two dataframes are different, it's easier to give them the same index first.

index = [i for i in range(len(df1))]
df1.index = index
df2.index = index

# requires: import numpy as np
ddf = df1.replace('', np.nan).fillna(df2)
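
The same idea can probably be sketched without modifying the original frames, again assuming rows are meant to be matched by position. `reset_index(drop=True)` returns copies with a fresh 0..n-1 index; `a` and `b` are illustrative names, not part of the original answer:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1, '', 3], ['', 2, '']], columns=['A', 'B', 'C'], index=[2, 4])
df2 = pd.DataFrame([['', 4, ''], [5, '', 6]], columns=['A', 'B', 'C'], index=[7, 9])

# Work on re-indexed copies; df1 and df2 keep their original labels
a = df1.reset_index(drop=True)
b = df2.reset_index(drop=True)

# Turn empty strings into NaN so fillna can see them, then fill from b
ddf = a.replace('', np.nan).fillna(b)
print(ddf)
```

Depending on the pandas version, the filled columns may come back as floats or objects rather than integers, so a cast may be needed afterwards.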