df1:
Customer Manager Time Period Amount
Samsung Rosalie 90D 46726190
Samsung Rosalie 18M 46726190
Samsung Rosalie 18M 46726190
Samsung Rosalie 18M 46726190
Samsung Rosalie 30D 46726190
Samsung Rosalie 30D 46726190
Apple Abir 24M 359233179
Apple Abir 30D 359233179
Apple Abir 30D 25000000
Nokia Abir 90D 571711209
Nokia Abir 24M 571711209
Nokia Abir 24M -284461
Nokia Abir 1M 571711209
Nokia Abir 1M 49715539
Google Michael 90D 49850165
Google Michael 12M 49850165
Google Michael 12M 49850165
Google Michael 12M 33048028
Google Michael 12M 49850165
Google Michael 12M 33048028
Blackberry Zec 90D 27975518
Blackberry Zec 18M 27975518
Blackberry Zec 30D 27975518
Blackberry Zec 30D 27975518
Expected Output
df1:
Customer Manager Time Period Amount
Samsung Rosalie Navarrete 90D 46726190
Samsung Rosalie Navarrete 18M 46726190
Samsung Rosalie Navarrete 18M 46726190
Samsung Rosalie Navarrete 18M 46726190
Samsung Rosalie Navarrete 30D 46726190
Samsung Rosalie Navarrete 30D 46726190
Apple Abir Paul 24M 359233179
Apple Abir Paul 30D 359233179
Apple Abir Paul 30D 25000000
Nokia Abir Paul 90D 571711209
Nokia Abir Paul 24M 571711209
Nokia Abir Paul 24M -284461
Nokia Abir Paul 1M 571711209
Nokia Abir Paul 1M 49715539
Google MichaelZec 90D 49850165
Google MichaelZec 12M 49850165
Google MichaelZec 12M 49850165
Google MichaelZec 12M 33048028
Google MichaelZec 12M 49850165
Google MichaelZec 12M 33048028
How to add a blank row after each of the customer in dataframe, as shown in expected output?
Tried Code:
for index, row in df.iterrrows(): if df.loc[index,'Customer Code'] != df.loc[index 1,'Customer Code'] and not(pd.isna(df.iloc[index,'Customer Code'])) and not(pd.isna(df.iloc[index 1,'Type'])) df.loc[index 1] = pd.Series([np.nan,np.nan, np.nan, np.nan])
CodePudding user response:
if you have range index
Customer Manager Time Period Amount
0 Samsung Rosalie 90D 46726190 NaN
1 Samsung Rosalie 18M 46726190 NaN
2 Samsung Rosalie 18M 46726190 NaN
3 Samsung Rosalie 18M 46726190 NaN
4 Samsung Rosalie 30D 46726190 NaN
5 Samsung Rosalie 30D 46726190 NaN
6 Apple Abir 24M 359233179 NaN
7 Apple Abir 30D 359233179 NaN
get index of groups tail(1) and 0.5, declare
new_idx
add
new_idx
todf1.index
reindex by union of index (make NaN rows)
finally reset_index
new_idx = df1.groupby('Customer').tail(1).index 0.5
df1.reindex(index=df1.index.union(new_idx)).reset_index(drop=True)
CodePudding user response:
Create a dictionary of Customer dfs. Iterate keys filter with .query() and append a NaN row. Append Customer df to dfs empty list. Finally concat list of dfs, fillna, reset index and drop last row.
df_mapping = {x: pd.DataFrame for x in df["Customer"].unique().tolist()}
dfs = []
for key in df_mapping.keys():
df_mapping[key] = df.query("Customer.eq(@key)").append(pd.Series(dtype="object"), ignore_index=True)
dfs.append(df_mapping[key])
final_df = pd.concat(dfs).fillna("").reset_index(drop=True).iloc[:-1, :]
print(final_df)
Customer Manager Time Period Amount
0 Samsung Rosalie 90D 46726190.0
1 Samsung Rosalie 18M 46726190.0
2 Samsung Rosalie 18M 46726190.0
3 Samsung Rosalie 18M 46726190.0
4 Samsung Rosalie 30D 46726190.0
5 Samsung Rosalie 30D 46726190.0
6
7 Apple Abir 24M 359233179.0
8 Apple Abir 30D 359233179.0
9 Apple Abir 30D 25000000.0
10
11 Nokia Abir 90D 571711209.0
12 Nokia Abir 24M 571711209.0
13 Nokia Abir 24M -284461.0
14 Nokia Abir 1M 571711209.0
15 Nokia Abir 1M 49715539.0
16
17 Google Michael 90D 49850165.0
18 Google Michael 12M 49850165.0
19 Google Michael 12M 49850165.0
20 Google Michael 12M 33048028.0
21 Google Michael 12M 49850165.0
22 Google Michael 12M 33048028.0
23
24 Blackberry Zec 90D 27975518.0
25 Blackberry Zec 18M 27975518.0
26 Blackberry Zec 30D 27975518.0
27 Blackberry Zec 30D 27975518.0