camelcase to underscore pythonically, Most logical approach

Time:12-22

As the title says: I want to convert camelCase to underscore (snake_case) without regex.

word = 'theDayAMan'
newWord = ''
for i in word:
    if i.isupper():
        # start a new word: underscore plus the lowercased letter
        newWord += '_' + i.lower()
    else:
        newWord += i

Or is regex better? I'm just looking for the fastest conversion. Thank you.

the above, and so i make more than twenty characters to ask a simple question to the wiser lol

CodePudding user response:

List comp (strictly speaking, a generator expression passed to join):

word = 'theDayAMan'
newWord = ''.join((i, '_' + i.lower())[i.isupper()] for i in word)
print(newWord)

# the_day_a_man

or map:

word = 'theDayAMan'
newWord = ''.join(map(lambda i: (i, '_' + i.lower())[i.isupper()], word))
print(newWord)

# the_day_a_man

But list comprehension is slightly faster on my machine in this case:

import timeit

Map = """
word = 'theDayAMan'
newWord = ''.join(map(lambda i: (i, '_' + i.lower())[i.isupper()], word))
"""

Lst = """
word = 'theDayAMan'
newWord = ''.join((i, '_' + i.lower())[i.isupper()] for i in word)
"""

Reg = """
import re
word = 'theDayAMan'
newWord = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', word)
newWord = re.sub('([a-z0-9])([A-Z])', r'\1_\2', newWord).lower()
"""

print('map',timeit.timeit(Map, number=1000))
print('lst',timeit.timeit(Lst, number=1000))
print('reg',timeit.timeit(Reg, number=1000))

# map 0.004132400004891679
# lst 0.003967999975429848
# reg 0.00403300000471063
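With only 1000 iterations the three timings are within noise of each other, and the regex snippet also runs `import re` inside the timed statement. A rough sketch of a steadier comparison follows. The function names (`with_map`, `with_genexp`, `with_regex`) and the iteration counts are my own choices for illustration, and the patterns are compiled once up front, which is an assumption about how you would use them in practice:

```python
import re
import timeit

WORD = 'theDayAMan'

# Compiled once, outside the timed code, so only the substitution is measured.
FIRST = re.compile(r'(.)([A-Z][a-z]+)')
SECOND = re.compile(r'([a-z0-9])([A-Z])')


def with_map(word):
    return ''.join(map(lambda c: '_' + c.lower() if c.isupper() else c, word))


def with_genexp(word):
    return ''.join('_' + c.lower() if c.isupper() else c for c in word)


def with_regex(word):
    partial = FIRST.sub(r'\1_\2', word)
    return SECOND.sub(r'\1_\2', partial).lower()


if __name__ == '__main__':
    for fn in (with_map, with_genexp, with_regex):
        # repeat() returns one total per run; the minimum is the least noisy.
        best = min(timeit.repeat(lambda: fn(WORD), number=10_000, repeat=5))
        print(f'{fn.__name__:12} {best:.4f}s')
```

The numbers will differ from machine to machine, so treat the ranking as a hint rather than a rule.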

CodePudding user response:

Regex is relatively easy for this:

>>> import re
>>> word = 'theDayAMan'
>>> re.sub(r'([a-z])([A-Z])', lambda m:f'{m.group(1)}_{m.group(2).lower()}', word)
'the_day_aMan'

See https://docs.python.org/3/library/re.html#re.sub for details.

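The 'A' in AMan is only lowercased by the replacement, so the boundary between 'a' and 'M' does not match until a second pass. To handle the AMan case (and not use a lambda), repeat the substitution until nothing changes: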

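import re

word = 'theDayAMan'


def to_snake(m):
    # m.group(1) is the lowercase letter, m.group(2) the uppercase one after it
    return f'{m.group(1)}_{m.group(2).lower()}'


# Keep substituting until a pass leaves the string unchanged.
while True:
    res = re.sub(r'([a-z])([A-Z])', to_snake, word)
    if res == word:
        break
    word = res

print(res)
# the_day_a_man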
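If you would rather avoid the loop, a single pass with lookarounds is another option. This is a different technique from the one above, shown only as a sketch; the names `BOUNDARY` and `camel_to_snake` are made up for the example. The pattern matches an empty position, so no letter is consumed and consecutive capitals are each handled in the same pass:

```python
import re

# Zero-width match: a position that is not the start of the string
# and is immediately followed by an uppercase ASCII letter.
BOUNDARY = re.compile(r'(?<!^)(?=[A-Z])')


def camel_to_snake(word):
    # Insert '_' at every boundary, then lowercase the whole string.
    return BOUNDARY.sub('_', word).lower()


print(camel_to_snake('theDayAMan'))
# the_day_a_man
```

Note that this splits every capital letter, so an acronym such as 'HTTPServer' becomes 'h_t_t_p_server'. A leading capital does not get an underscore in front of it.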