Convert to ASCII portions of numpy/pandas then reconvert back

Time:03-19

I have a NumPy array or pandas DataFrame in which most cells hold numerical values, but a few scattered cells hold characters (they are not confined to particular columns, so I can't use a label encoder). I am looking for a way to convert these sparse character values, which could be anywhere, into their ASCII codes so that I can feed the array into deep learning models. Afterwards I need to know which cells were converted so that I can convert them back to characters. Any ideas would be highly appreciated!

Example values could be (1,2,f,5,3) on row 1 and (7,k,1,j,9) on some other row, in either a NumPy array or a pandas DataFrame. The question is: how can I encode the letters as ASCII codes so that everything is numeric, and how do I then decode them back?

CodePudding user response:

A possible solution is to use ord() and chr() to encode and decode your characters. ord() returns "an integer representing the Unicode code point of that character", and chr() is its inverse.

>>> df
  characters
0          f
1          k
>>> df["encoded"] = df["characters"].apply(ord)
>>> df["encoded"]
0    102
1    107
>>> df["decoded"] = df["encoded"].apply(chr)
>>> df["decoded"]
0    f
1    k
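
The example above covers a column that holds only characters. For the mixed case in the question, where letters can sit in any cell, one option is to record a boolean mask of the character cells before encoding and reuse it when decoding. The following is a minimal sketch under that assumption, not a definitive implementation; the names data, is_char, encoded and decoded are made up for illustration, and it assumes every non-numeric cell is a single-character string.

```python
import numpy as np

# Example rows from the question; dtype=object lets numbers and letters share one array.
data = np.array([[1, 2, "f", 5, 3],
                 [7, "k", 1, "j", 9]], dtype=object)

# Boolean mask marking the cells that hold a string (assumed to be one character each).
is_char = np.vectorize(lambda v: isinstance(v, str), otypes=[bool])(data)

# Encode: replace each marked cell with its code point, then cast to a numeric dtype.
encoded = data.copy()
encoded[is_char] = [ord(c) for c in data[is_char]]
encoded = encoded.astype(np.int64)

# Decode: the saved mask says which cells to turn back into characters.
decoded = encoded.astype(object)
decoded[is_char] = [chr(n) for n in encoded[is_char]]
```

The mask has to be kept alongside the encoded array, because after encoding a 102 that came from "f" cannot be told apart from a genuine 102. For a DataFrame, the same steps should work on df.to_numpy(dtype=object).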