Using Regex to move some letter of a string to a new location in the same string in a Series of stri

Time:03-03

I have a list of 4000 strings. The naming convention needs to be changed for each string and I do not want to go through and edit each one individually.

The list looks like this:

data = ['V2-FG2110-EMA-COMPRESSION',
'V2-FG2110-SA-COMPRESSION',
'V2-FG2110-UMA-COMPRESSION',
'V2-FG2120-EMA-DISTRIBUTION',
'V2-FG2120-SA-DISTRIBUTION',
'V2-FG2120-UMA-DISTRIBUTION',
'V2-FG2140-EMA-HEATING',
'V2-FG2140-SA-HEATING',
'V2-FG2140-UMA-HEATING',
'V2-FG2150-EMA-COOLING',
'V2-FG2150-SA-COOLING',
'V2-FG2150-UMA-COOLING',
'V2-FG2160-EMA-TEMPERATURE CONTROL']

I need each 'SA', 'UMA', and 'EMA' to be moved to just before the -FG.

Desired output is:

V2-EMA-FG2110-Compression
V2-SA-FG2110-Compression
V2-UMA-FG2110-Compression
...

The V2-FG2 prefix does not change throughout the list, so I started there. I tried re.sub and re.search, but I am pretty new to Python and have gotten a mess of different results. Any help is appreciated.

CodePudding user response:

You can split each string on the hyphens and swap the second and third parts.

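new_list = []
for word in data:
    # Split into at most four parts so any hyphen in the trailing name is kept.
    prefix, group, tag, name = word.split('-', 3)
    # Swap the second and third parts, e.g. FG2110 and EMA.
    new_list.append(f'{prefix}-{tag}-{group}-{name}')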



CodePudding user response:

You can replace matches of the following regular expression with the contents of capture group 1:

(?<=^V2)(?=.*(-(?:EMA|SA|UMA))(?=-))|-(?:EMA|SA|UMA)(?=-)


The regular expression can be broken down as follows.

(?<=^V2)             # current string position must be preceded by 'V2'
                     # at the beginning of the string
(?=                  # begin a positive lookahead
  .*                 # match zero or more characters other than a
                     # line terminator
  (-(?:EMA|SA|UMA))  # match a hyphen followed by one of the three strings
                     # and save to capture group 1
  (?=-)              # the next character must be a hyphen
)                    # end positive lookahead
|                    # or
-(?:EMA|SA|UMA)      # match a hyphen followed by one of the three strings
(?=-)                # the next character must be a hyphen

Evidently this may not work for versions of Python prior to 3.5, because the match in the second part of the alternation does not assign a value to capture group 1: "Before Python 3.5, backreferences to failed capture groups in Python re.sub were not populated with an empty string." This quote is from @WiktorStribiżew's (accepted) answer on that topic. For what it's worth, I confirmed that Ruby also treats the unmatched group as an empty string ("V2-FG2110-EMA-COMPRESSION".gsub(rgx,'\1') #=> "V2-EMA-FG2110-COMPRESSION").
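
For reference, here is a minimal sketch of how that substitution might look in Python, assuming Python 3.5 or later so that the unmatched group 1 is replaced by an empty string. The names PATTERN, move_tag, and sample are made up for this illustration; only the regular expression itself comes from the answer.

```python
import re

# The regular expression from the answer, compiled once for reuse.
PATTERN = re.compile(
    r'(?<=^V2)(?=.*(-(?:EMA|SA|UMA))(?=-))|-(?:EMA|SA|UMA)(?=-)'
)


def move_tag(name):
    """Hypothetical helper: replace every match with capture group 1.

    The zero-width match right after 'V2' inserts the tag there; the later
    match of the tag itself has no group 1, so it is replaced by ''.
    """
    return PATTERN.sub(r'\1', name)


# A few strings taken from the question's list.
sample = [
    'V2-FG2110-EMA-COMPRESSION',
    'V2-FG2120-SA-DISTRIBUTION',
    'V2-FG2160-EMA-TEMPERATURE CONTROL',
]

renamed = [move_tag(s) for s in sample]
for old, new in zip(sample, renamed):
    print(old, '->', new)
```

The same helper could be applied to the full list with a list comprehension such as [move_tag(s) for s in data].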

One could of course match the first three parts of the string, partitioned into three capture groups, and substitute groups 1, 3, and 2 in that order ($1$3$2, or \1\3\2 in Python). That's probably more sensible even if it's less interesting.
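
A rough sketch of that simpler approach in Python follows. The pattern here is an assumption, not the answerer's own: it captures the first three hyphen-separated parts without their hyphens and writes the hyphens back in the replacement. SWAP and swap_parts are hypothetical names.

```python
import re

# Assumed pattern: three hyphen-separated parts at the start of the string.
# Everything after the third part is left untouched by the substitution.
SWAP = re.compile(r'^([^-]+)-([^-]+)-([^-]+)')


def swap_parts(name):
    """Hypothetical helper: reorder parts 1-2-3 as 1-3-2."""
    return SWAP.sub(r'\1-\3-\2', name)


print(swap_parts('V2-FG2110-UMA-COMPRESSION'))
print(swap_parts('V2-FG2160-EMA-TEMPERATURE CONTROL'))
```

Because every group always participates in the match, this version does not depend on how unmatched groups are handled in older Python versions.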
