I have problem that I have been trying to find a solution for. You would think that it wouldn't be that hard to figure out.
I have a pandas DataFrame with the below format:
Id Name Now Then There Sold Needed
0 1 Caden 8.1 3.40 3.95 NaN NaN
1 7 Bankist NaN 2.45 2.20 NaN NaN
2 1 Artistes 8.1 3.40 3.95 NaN NaN
0 1 NaN NaN NaN NaN 33.75 670,904
1 7 NaN NaN NaN NaN 33.75 670,904
I would like to have the DataFrame merge its rows based on the 'Id' column so that it looks like this:
Id Name Now Then There Sold Needed
0 1 Caden 8.1 3.40 3.95 33.75 670,904
1 7 Bankist NaN 2.45 2.20 33.75 670,904
2 1 Artistes 8.1 3.40 3.95 33.75 670,904
As you can see, the 'Id' column has two Id# 1 that each have a unique 'Name'. I have not been able to figure out how to ask the question that might provide some sample code. So far I have tried different methods, and have failed, including different combinations of merge, join, and concat. The best result has lead to the current DataFrame with NaN values.
I am trying to accomplish having the 'Sold' and 'Needed' columns (which have only one value) aligned with the appropriate 'Id' row when there are repeating Ids.
CodePudding user response:
here is one way to do it
# using groupby on Id, backfill the Sold and Needed where values are null
df[['Sold','Needed']] = df.groupby(['Id'], as_index=False)[['Sold','Needed']].bfill()
# drop the rows that has Null in a name
out=df.dropna(subset='Name')
out
Id Name Now Then There Sold Needed
0 1 Caden 8.1 3.40 3.95 33.75 670,904
1 7 Bankist NaN 2.45 2.20 33.75 670,904
2 1 Artistes 8.1 3.40 3.95 33.75 670,904
CodePudding user response:
Make a copy, separate the columns, drop rows where all values are nan, then merge:
*Assuming your dataframe is df1
df2=df1.copy()
df1.drop(['Sold', 'Needed'],axis=1,inplace=True)
df2.drop(['Name', 'Now', 'Then', 'There'],axis=1,inplace=True)
df1.dropna(subset="Name","Now","Then","There"],inplace=True,how='all',axis='rows')
df2.dropna(subset=["Sold","Needed"], inplace=True, how='all',axis='rows')
newdf=df1.merge(df2,how='left',left_on='Id',right_on='Id')