I am trying to merge two dataframes that do not have the same number of rows or columns. The merge produces NaN values.
I want to fill these NaN values with the previous value in the column.
import pandas as pd
import numpy as np
dflist = [[1, "a", "b"], [2, "a", "b"], [3, "a", "b"]]
df = pd.DataFrame(dflist)
dflist1 = [[1, "a", "b", "c", "e"], [1, "a", "b", "c", "e"], [2, "a", "b", "c", "e"], [3, "a", "b", "c", "e"], [1, "a", "b", "c", "e"],[4, "a", "b", "c", "e"], [5, "a", "b", "c", "e"]]
df1 = pd.DataFrame(dflist1)
df.columns = ["col1", "col2", "col3"]
df1.columns = ["col1", "col21", "col31", "col45", "col56"]
result = pd.merge(df1, df, how='outer')
print(result)
It results in:
   col1 col21 col31 col45 col56 col2 col3
0     1     a     b     c     e    a    b
1     1     a     b     c     e    a    b
2     1     a     b     c     e    a    b
3     2     a     b     c     e    a    b
4     3     a     b     c     e    a    b
5     4     a     b     c     e  NaN  NaN
6     5     a     b     c     e  NaN  NaN
But in the desired table, the NaN values should be filled with the previous value in each column:
   col1 col21 col31 col45 col56 col2 col3
0     1     a     b     c     e    a    b
1     1     a     b     c     e    a    b
2     1     a     b     c     e    a    b
3     2     a     b     c     e    a    b
4     3     a     b     c     e    a    b
5     4     a     b     c     e    a    b
6     5     a     b     c     e    a    b
What I tried to do is get the indices of the NaN values, but it is not giving the desired result.
indices = list(np.where(result['col3'].isna()[0]))
print(indices)
This results in [array([], dtype=int64)]
How can this be accomplished?
CodePudding user response:
In this case all you need is the ffill() method, which fills each NaN with the last non-NaN value above it in the same column:
result = pd.merge(df1, df, how='outer').ffill()  # forward-fill NaN values column by column
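A small sketch building on this, assuming the same frames as in the question. A bare ffill() fills every column, so if only the columns that came from df should be filled, it can be restricted to those. The sketch also shows why the np.where attempt returned an empty array: the [0] was applied to the boolean Series (picking its first value, False) instead of to the tuple that np.where returns. The names merged, nan_rows and filled are only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame(
    [[1, "a", "b"], [2, "a", "b"], [3, "a", "b"]],
    columns=["col1", "col2", "col3"],
)
df1 = pd.DataFrame(
    [[1, "a", "b", "c", "e"], [1, "a", "b", "c", "e"], [2, "a", "b", "c", "e"],
     [3, "a", "b", "c", "e"], [1, "a", "b", "c", "e"], [4, "a", "b", "c", "e"],
     [5, "a", "b", "c", "e"]],
    columns=["col1", "col21", "col31", "col45", "col56"],
)

# Outer merge on the only shared column (col1)
merged = pd.merge(df1, df, how="outer")

# Row positions where col3 is NaN: index the tuple returned by np.where,
# not the boolean Series passed into it
nan_rows = np.where(merged["col3"].isna())[0]
print(nan_rows)

# Forward-fill only the columns that came from df, leaving the rest untouched
filled = merged.copy()
filled[["col2", "col3"]] = filled[["col2", "col3"]].ffill()
print(filled)
```

Keep in mind that ffill() depends on row order. It produces the desired table here because the unmatched keys (4 and 5) come after rows that do have values in col2 and col3; a NaN in the very first row would stay NaN.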