How to write a string encoding function in Python Pandas?

I need to write a string encoding function that returns the output as follows:

encode_string("Jack000Jons000446") ➞ {
    "given_name": "Jack",
    "family_name": "Jons",
    "id": "446"
}
encode_string("Ann0funs003567") ➞ {
    "given_name": "Ann",
    "family_name": "funs",
    "id": "3567"
}

Furthermore:

The fields always appear in the same order: given name, family name, id
The "id" field never contains a 0

How can I write this kind of function in Python with Pandas/NumPy?

CodePudding user response:

You can use re.split() to split the string on runs of one or more zeroes (the pattern 0+). NumPy and pandas aren't really necessary for solving the problem.

import re

def encode_string(s):
    # "0+" matches a run of one or more zeroes, which separates the fields
    given_name, family_name, id = re.split(r"0+", s)
    return {
        'given_name': given_name,
        'family_name': family_name,
        'id': id
    }

# Prints {'given_name': 'Jack', 'family_name': 'Jons', 'id': '446'}
print(encode_string("Jack000Jons000446"))
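
The re.split() version above assumes the input is always well formed: a string with a missing field makes the tuple unpacking fail with a less obvious error. As a hedged sketch, the variant below checks the whole string with re.fullmatch() and named groups, and raises ValueError when it does not fit. The name parse_record and the assumption that names never contain the digit 0 are mine, not part of the question.

```python
import re

# Assumption: names contain no "0", and fields are separated by one or more zeroes.
_RECORD = re.compile(r"(?P<given_name>[^0]+)0+(?P<family_name>[^0]+)0+(?P<id>[^0]+)")

def parse_record(s):
    """Return the three fields as a dict, or raise ValueError for malformed input."""
    match = _RECORD.fullmatch(s)
    if match is None:
        raise ValueError(f"unexpected format: {s!r}")
    # groupdict() maps each named group to the text it matched.
    return match.groupdict()

print(parse_record("Ann0funs003567"))
```

Named groups keep the field names next to the pattern, so the dict keys and the regex cannot drift apart.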

CodePudding user response:

I'm going to rename the function because it's really decoding rather than encoding. Therefore:

import re

def decode_string(s):
    # Split on runs of one or more zeroes, then pair each piece with its field name
    return dict(zip(['given_name', 'family_name', 'id'], re.split('0+', s)))

print(decode_string('Jack000Jons000446'))

Output:

{'given_name': 'Jack', 'family_name': 'Jons', 'id': '446'}
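
Since the question mentions pandas, here is one hedged way to bridge the two: decode a list of strings into a list of dicts, one per input. The helper name decode_many and the sample list are illustrative assumptions, not something from the question.

```python
import re

FIELDS = ("given_name", "family_name", "id")

def decode_string(s):
    # Split on runs of one or more zeroes and pair the pieces with field names.
    return dict(zip(FIELDS, re.split(r"0+", s)))

def decode_many(strings):
    # Hypothetical helper: one dict per input string, in input order.
    return [decode_string(s) for s in strings]

records = decode_many(["Jack000Jons000446", "Ann0funs003567"])
for record in records:
    print(record)
```

If pandas is available, passing records to pd.DataFrame() should give a table with one column per field, since the DataFrame constructor accepts a list of dicts.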