I am trying to convert some columns that include number and alphabet at the same time. I want to convert to simple numerical value such as int. I don't want to just transform randomly because they are isbn involved. I want to able to match again. I want to transform that matching as below.
d = {'id': [156972, 102154, 214717, 84897, 275220, 165759, 42099, 265749, 130474, 15822],
'isbn': ['2842630521', '037570745X','689710879','783547528','786014091','1561581747','553571095','451210220','034540288X','345832710X',]}
df = pd.DataFrame(data=d)
I want to transform this way.
for id
2842630521 as 0
783547528 as 1
.
.
.
130474 as 9
15822 as 10
for isbn
2842630521 as 0
037570745X as 1 # alphabet involved
.
.
.
034540288X as 9
345832710X as 10
Is there anyway to do this to column?
Optional: I would also like a way to track back if the number 0
in isbn column to 2842630521.
CodePudding user response:
Use:
#random shuffle order of rows
df1 = df.sample(frac=1)
#create mapping dictionary by id
a, b = pd.factorize(df1['id'])
d = dict(zip(b, a))
#map original column
df['id'] = df['id'].map(d)