Version strings in python using patterns

Time:01-11

I wrote some code to version names in Python. The idea is to append _v1, _v2, ... when a name already exists in a list. I tried the following code:

import pandas as pd

list_names = pd.Series(['name_1', 'name_1_v1'])
name = 'name_1'
new_name = name
i = 1
while list_names.str.contains(new_name).any() == True:
    new_name = f'{name}_v{i}'
    if list_names.str.contains(new_name).any() == False:
        break
    i = i + 1

It works fine when I input 'name_1' (output: 'name_1_v2'). However, when I enter 'name_1_v1', the output is 'name_1_v1_v1' (the correct result would be 'name_1_v2'). I thought of using a regex with the pattern _v[0-9]$, but I wasn't able to make it work.

<<< edit >>>

The output should be new_name = 'name_1_v2'. The idea is to find a suitable versioned name, not to change the names already in the list.
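
One possible way to get the behaviour described above, offered as a sketch rather than a definitive fix: strip any trailing _v<number> from the input, then count up until the candidate is not in the list. Membership is tested exactly instead of with str.contains, which matches substrings (so 'name_1_v1' would also match 'name_1_v10'). Note that _v[0-9]$ only covers single-digit versions; _v\d+$ covers any number of digits. The helper name next_free_name and the plain-list input are assumptions for illustration; a pandas Series should work too, since the function only iterates over the values.

```python
import re

# Hypothetical helper, not part of the original post.
SUFFIX = re.compile(r'_v\d+$')  # trailing version suffix such as '_v1' or '_v12'

def next_free_name(existing, name):
    """Return `name` if unused, else the first unused '<base>_v<i>'."""
    taken = set(existing)          # exact matches only, no substring matching
    if name not in taken:
        return name
    base = SUFFIX.sub('', name)    # 'name_1_v1' -> 'name_1'
    i = 1
    while f'{base}_v{i}' in taken:
        i += 1
    return f'{base}_v{i}'

list_names = ['name_1', 'name_1_v1']
print(next_free_name(list_names, 'name_1'))     # name_1_v2
print(next_free_name(list_names, 'name_1_v1'))  # name_1_v2
print(next_free_name(list_names, 'name_2'))     # name_2
```

This version returns the first unused number, so it fills gaps in the sequence instead of always going one past the highest existing version.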

Answer:

Proposed code :

import pandas as pd
import re

basename = 'name_1'

def new_version(lnam, basename):
    i, lat_v = 0, 0
    # look for the latest version: a trailing '_v<number>' suffix
    while i < len(lnam):
        m = re.search(r'_v(\d+)$', lnam[i])
        if m is not None:
            lat_v = max(int(m.group(1)), lat_v)
        i += 1
    if lat_v == 0:
        return basename + '_v1'
    else:
        return basename + '_v%s' % (lat_v + 1)


lnam = pd.Series(['name_1'])
new_name = new_version(lnam, basename)
print("new_name : ", new_name)
# new_name :  name_1_v1

lnam = pd.Series(['name_1', 'name_1_v1'])
new_name = new_version(lnam, basename)
print("new_name : ", new_name)
# new_name :  name_1_v2

Result :

new_name :  name_1_v1
new_name :  name_1_v2

Let's now try an unordered list of names (the next version should be 101):

lnam = pd.Series(['name_1', 'name_1_v4', 'name_1_v100', 'name_1_v12', 'name_1_v17'])
new_name = new_version(lnam, basename)
print("new_name : ", new_name)
# new_name :  name_1_v101
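
The function above takes the highest version number found anywhere in the list. If the list can mix several basenames, a variant that only counts entries of the exact form <basename>_v<number> may be safer. A minimal sketch, where next_version_for is a hypothetical name and a plain list stands in for the Series:

```python
import re

def next_version_for(names, basename):
    # Hypothetical helper: only entries shaped '<basename>_v<number>' count.
    # re.escape guards against regex metacharacters in the basename.
    pattern = re.compile(re.escape(basename) + r'_v(\d+)')
    versions = []
    for n in names:
        m = pattern.fullmatch(n)
        if m is not None:
            versions.append(int(m.group(1)))
    # default=0 covers the case where no versioned entry exists yet
    return f'{basename}_v{max(versions, default=0) + 1}'

names = ['name_1', 'name_1_v4', 'name_2_v100', 'name_1_v12']
print(next_version_for(names, 'name_1'))  # name_1_v13
print(next_version_for(names, 'name_2'))  # name_2_v101
print(next_version_for(names, 'name_3'))  # name_3_v1
```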

Bonus: automatic identification of the basename

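def get_basename(lnam, pos=2):
    # strip a trailing '_v<number>' suffix, if any, so that an
    # unversioned entry such as 'name_1' is returned unchanged
    return re.sub(r'_v\d+$', '', lnam[pos])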

basename = get_basename(lnam)
print(basename)
# name_1
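
A related sketch, assuming versions always appear as a trailing _v<number>: split each name into its basename and version number, which also handles entries that carry no suffix. The helper name split_version is hypothetical.

```python
import re

def split_version(name):
    # Hypothetical helper: return (basename, version), version is None if absent.
    m = re.search(r'_v(\d+)$', name)
    if m is None:
        return name, None
    return name[:m.start()], int(m.group(1))

names = ['name_1', 'name_1_v4', 'name_1_v100', 'other_v2']
print(split_version('name_1_v100'))                  # ('name_1', 100)
print(split_version('name_1'))                       # ('name_1', None)
print(sorted({split_version(n)[0] for n in names}))  # ['name_1', 'other']
```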