I have a pandas DataFrame as shown below. I want to replace the values for 3Monday to 3Sunday with medians.
|wk&day | %dict |
|----------|---------|
|1Friday | 6 |
|1Monday | 6 |
|1Saturday | 1.7 |
|1Sunday |1.698750 |
|1Thursday |6.538169 |
|1Tuesday |7.117872 |
|1Wednesday|6.545507 |
|2Friday |2.3775163|
|2Monday |2.843480 |
|2Saturday |1.918890 |
|2Sunday |1.7391091|
|2Thursday |2.5646356|
|2Tuesday |2.7847760|
|2Wednesday|2.7921140|
|3Friday |2.8728322|
|3Monday |2.7994520|
|3Saturday |2.2601081|
|3Sunday |1.7170951|
|3Thursday |2.8581562|
|3Tuesday |2.8838392|
|3Wednesday|2.6526918|
For example, to replace the value for 3Monday in the same DataFrame, I need to calculate the median of 1Monday and 2Monday: 3Monday = median(1Monday, 2Monday).
CodePudding user response:
Use:
#extract the leading digit(s) and the day name from column wk&day
df1 = df['wk&day'].str.extract(r'^(\d+)(\D+)')
#keep only the rows for weeks 1 and 2 and compute the median per day name
s = df.groupby(df1.loc[df1[0].isin(['1','2']), 1])['%dict'].median()
print (s)
1
Friday       4.188758
Monday       4.421740
Saturday     1.809445
Sunday       1.718930
Thursday     4.551402
Tuesday      4.951324
Wednesday    4.668811
Name: %dict, dtype: float64
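To check the medians independently, here is a minimal self-contained sketch that rebuilds the week 1 and week 2 rows from the question and repeats the extract-and-group step. The names `days`, `values`, `parts`, and `in_weeks_1_2` are introduced only for this illustration and do not appear in the answer above.

```python
import pandas as pd

# Rebuild the week 1 and week 2 rows from the question's table.
days = ["Friday", "Monday", "Saturday", "Sunday", "Thursday", "Tuesday", "Wednesday"]
values = {
    1: [6, 6, 1.7, 1.698750, 6.538169, 7.117872, 6.545507],
    2: [2.3775163, 2.843480, 1.918890, 1.7391091, 2.5646356, 2.7847760, 2.7921140],
}
df = pd.DataFrame(
    [(f"{week}{day}", v) for week, vals in values.items() for day, v in zip(days, vals)],
    columns=["wk&day", "%dict"],
)

# Column 0 holds the week number (as a string), column 1 the day name.
parts = df["wk&day"].str.extract(r"^(\d+)(\D+)")

# Restrict to weeks 1 and 2, then take the median of %dict per day name.
in_weeks_1_2 = parts[0].isin(["1", "2"])
s = df.loc[in_weeks_1_2, "%dict"].groupby(parts.loc[in_weeks_1_2, 1]).median()
print(s)
```

With only two weeks per day, each median is simply the mean of the two values, for example (6 + 2.843480) / 2 for Monday.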
#convert wk&day to the index so rows can be selected by start and end label
#replace with the medians (values are aligned on the index labels)
df = df.set_index('wk&day')
df.loc['3Monday':'3Sunday', '%dict'] = s.add_prefix('3')
df = df.reset_index()
print (df.tail(7))
        wk&day     %dict
14     3Friday  2.872832
15     3Monday  4.421740
16   3Saturday  1.809445
17     3Sunday  1.718930
18   3Thursday  2.858156
19    3Tuesday  2.883839
20  3Wednesday  2.652692
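Note that the label slice `'3Monday':'3Sunday'` only covers the rows that sit between those two labels in the current row order, so in the output above 3Friday, 3Thursday, 3Tuesday and 3Wednesday keep their original values. If the intent is to replace every week 3 row, one possible variant is to select the rows with a boolean mask and map each day name to its median. This is a sketch under that assumption, not part of the answer above; the names `days`, `values`, `week`, `day`, `medians`, and `is_week3` are illustrative.

```python
import pandas as pd

# Rebuild the full table from the question.
days = ["Friday", "Monday", "Saturday", "Sunday", "Thursday", "Tuesday", "Wednesday"]
values = {
    1: [6, 6, 1.7, 1.698750, 6.538169, 7.117872, 6.545507],
    2: [2.3775163, 2.843480, 1.918890, 1.7391091, 2.5646356, 2.7847760, 2.7921140],
    3: [2.8728322, 2.7994520, 2.2601081, 1.7170951, 2.8581562, 2.8838392, 2.6526918],
}
df = pd.DataFrame(
    [(f"{w}{d}", v) for w, vals in values.items() for d, v in zip(days, vals)],
    columns=["wk&day", "%dict"],
)

# Split the label into week number (string) and day name.
parts = df["wk&day"].str.extract(r"^(\d+)(\D+)")
week, day = parts[0], parts[1]

# Median per day name, computed from weeks 1 and 2 only.
in_weeks_1_2 = week.isin(["1", "2"])
medians = df.loc[in_weeks_1_2, "%dict"].groupby(day[in_weeks_1_2]).median()

# Overwrite every week 3 row with the median for its day name.
is_week3 = week == "3"
df.loc[is_week3, "%dict"] = day[is_week3].map(medians)
print(df.tail(7))
```

Because the rows are chosen by mask rather than by label range, this variant does not depend on how the rows are sorted.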