I want to create a column based upon another column which have values like 0,25,50,75,100 and so on . So I want to create a dynamic code so that no need to define manually.
So suppose I have a data frame with 10000 rows
Input [enter image description here][1] https://i.stack.imgur.com/pcPkS.jpg
Output [enter image description here][2] https://i.stack.imgur.com/hHt0W.jpg
So basically I want 0-25 as 1 , 25 to 50 as 2 , 50 to 75 as 3 and so on
Please help me out how to do this in a dynamic way [1]: https://i.stack.imgur.com/pcPkS.jpg [2]: https://i.stack.imgur.com/hHt0W.jpg
Please click on the image to check the input and output
CodePudding user response:
Could use pandas .cut()
import pandas as pd
data = [0,25,50,75,100,125,150]
df = pd.DataFrame(data, columns=['A'])
lower = 0
upper = max(df['A'])
bins = range(lower, upper 1, 25)
df['B'] = pd.cut(x=df['A'], bins=bins, include_lowest=True, right=True)
bins = list(set(df['B']))
bins.sort()
bin_label = {k: idx for idx, k in enumerate(bins, start=1)}
df['C'] = df['B'].map(bin_label)
CodePudding user response:
You could do it with assign
and lambda
:
import pandas as pd
import numpy as np
# prepare dataframe
rows = 10000
step_size = 25
df = pd.DataFrame({"A":np.arange(0, limit*step_size, step_size)})
# create new column C
df = df.assign(C= lambda x: np.where((x.A/ 25).astype(int) == 0, 1, (x.A/ 25).astype(int)))
Output:
df
A C
0 0 1
1 25 1
2 50 2
3 75 3
4 100 4
... ... ...
9995 249875 9995
9996 249900 9996
9997 249925 9997
9998 249950 9998
9999 249975 9999
10000 rows × 2 columns