Given a DataFrame df1:

   value  mesh
0     10     2
1     12     3
2      5     2

obtain a new DataFrame df2 in which each row of df1 is repeated mesh times, and each repeated row holds the corresponding value of df1 divided by its mesh:

df2:

   value/mesh
0         5
1         5
2         4
3         4
4         4
5       2.5
6       2.5
More generally, given df1:

   value  mesh_value  other_value
0     10           2            0
1     12           3            1
2      5           2            2

obtain df2, where the remaining columns are repeated along with the quotient:

   value/mesh_value  other_value
0                 5            0
1                 5            0
2                 4            1
3                 4            1
4                 4            1
5               2.5            2
6               2.5            2
CodePudding user response:
You can use map to look up, for each quotient in df2, the index of the df1 row it came from:
df2['new'] = df2['value/mesh'].map(dict(zip(df1.eval('value/mesh'),df1.index)))
Out[243]:
0 0
1 0
2 1
3 1
4 1
5 2
6 2
Name: value/mesh, dtype: int64
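The one-liner above assumes that df2 already exists with a value/mesh column. As a rough, self-contained sketch (the construction of df2 via Series.repeat is an assumption added here for illustration, not part of the answer), the lookup could be tried like this. Note that the dictionary is keyed by the quotient, so this only gives the intended result when every row of df1 produces a distinct value/mesh.

```python
import pandas as pd

# Input from the question (simple case).
df1 = pd.DataFrame({'value': [10, 12, 5], 'mesh': [2, 3, 2]})

# Assumption: build df2 by repeating each quotient `mesh` times.
quotient = df1['value'] / df1['mesh']
df2 = (quotient.repeat(df1['mesh'].to_numpy())
       .reset_index(drop=True)
       .to_frame('value/mesh'))

# Lookup from quotient to the df1 index it came from.
# Caveat: duplicate quotients would overwrite each other in the dict.
lookup = dict(zip(df1.eval('value/mesh'), df1.index))
df2['new'] = df2['value/mesh'].map(lookup)

print(df2)
```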
CodePudding user response:
Try as follows:
- Use Series.div for value / mesh_value, and apply Series.reindex using np.repeat with df.mesh_value as the input array for the repeats parameter.
- Next, use pd.concat to combine the result with df.other_value along axis=1.
- Finally, rename the column with the result of value / mesh_value (its default name will be 0) using df.rename, and chain df.reset_index to reset to a standard index.
import numpy as np
import pandas as pd
# df is the input DataFrame (df1 in the question)
df2 = pd.concat([df.value.div(df.mesh_value).reindex(
    np.repeat(df.index,df.mesh_value)),df.other_value], axis=1)\
    .rename(columns={0:'value_mesh_value'}).reset_index(drop=True)
print(df2)
value_mesh_value other_value
0 5.0 0
1 5.0 0
2 4.0 1
3 4.0 1
4 4.0 1
5 2.5 2
6 2.5 2
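For comparison, here is a minimal self-contained sketch of the same expansion that repeats whole rows with df.loc and Index.repeat rather than combining pieces with pd.concat. This is a swapped-in alternative, not the approach described above; the sample frame is taken from the question, and it assumes df has a unique index.

```python
import pandas as pd

# Input from the question (general case).
df = pd.DataFrame({
    'value': [10, 12, 5],
    'mesh_value': [2, 3, 2],
    'other_value': [0, 1, 2],
})

# Repeat each row label mesh_value times, then select those rows.
repeated = df.loc[df.index.repeat(df['mesh_value'])]

# Compute the quotient on the repeated rows and keep only the wanted columns.
df2 = (repeated
       .assign(value_mesh_value=repeated['value'] / repeated['mesh_value'])
       [['value_mesh_value', 'other_value']]
       .reset_index(drop=True))

print(df2)
```

Selecting the output columns by name keeps the result independent of the column order in the input.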
Or, slightly differently:
- Use df.assign to add a column with the result of df.value.div(df.mesh_value), and reindex / rename in the same way as above.
- Use df.drop to get rid of the columns that you don't want (value, mesh_value) and use df.iloc to change the column order (we want ['value_mesh_value', 'other_value'] instead of the other way around, hence [1, 0]). And again, reset the index.
- We wrap all of this in parentheses and assign it to df2.
df2 = (df.assign(tmp=df.value.div(df.mesh_value))
       .reindex(np.repeat(df.index, df.mesh_value))
       .rename(columns={'tmp': 'value_mesh_value'})
       .drop(columns=['value', 'mesh_value']).iloc[:, [1, 0]]
       .reset_index(drop=True))
# same result
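If the frame has more than one extra column, the positional iloc[:, [1, 0]] step no longer fits. A possible generalisation is sketched below; the helper name expand_rows and its parameters are hypothetical, introduced only for illustration. It uses the same reindex / np.repeat idea, places the quotient first, and keeps every remaining column. It assumes the index of df is unique.

```python
import numpy as np
import pandas as pd


def expand_rows(df, value_col, repeat_col, out_col):
    """Hypothetical helper: repeat each row `repeat_col` times and
    replace `value_col` / `repeat_col` with their quotient."""
    quotient = df[value_col] / df[repeat_col]
    # Drop the two source columns and put the quotient in front.
    out = df.drop(columns=[value_col, repeat_col])
    out.insert(0, out_col, quotient)
    # Repeat each index label as often as repeat_col says, then renumber.
    labels = np.repeat(df.index, df[repeat_col].to_numpy())
    return out.reindex(labels).reset_index(drop=True)


df = pd.DataFrame({
    'value': [10, 12, 5],
    'mesh_value': [2, 3, 2],
    'other_value': [0, 1, 2],
})
df2 = expand_rows(df, 'value', 'mesh_value', 'value_mesh_value')
print(df2)
```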