How to efficiently extract substrings from a string (with - or _)

Time:10-03

I have a list of wall names of a building, and it looks like below:

wall_list = ['W1_1F-12F', 'W2_1F-9F', 'W3_10F-12F']

I want to split each name into its three parts (for example W1, 1F, 12F) so that I can use the wall names and floor information in another process.

wall_name = [W1, W2, W3...]
Floor_from = [1F, 1F, 10F...]
Floor_to = [12F, 9F, 12F...]

This is the result I want to get in the end.

I think it would be efficient to solve this by reading the text before and after _ and -, if such a method exists.

CodePudding user response:

wall_list = ["W1_1F-12F", "W2_1F-9F", "W3_10F-12F"]
wall_name = [] #[W1, W2, W3...]
Floor_from = [] #[1F, 1F, 10F...]
Floor_to = [] #[12F, 9F, 12F...]
for i in wall_list:
    wall_name.append(i.split("_")[0])
    Floor_from.append(i.split("_")[1].split("-")[0])
    Floor_to.append(i.split("_")[1].split("-")[1])
print(wall_name,Floor_from,Floor_to)
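
The loop above calls split several times per string. A possible variant, sketched here on the assumption that every name contains exactly one _ followed by one -, uses str.partition so that each string is cut only once per separator:

```python
wall_list = ["W1_1F-12F", "W2_1F-9F", "W3_10F-12F"]

wall_name, floor_from, floor_to = [], [], []
for item in wall_list:
    # partition cuts at the first occurrence of the separator only
    # and returns (text before, separator, text after).
    name, _, floors = item.partition("_")
    start, _, end = floors.partition("-")
    wall_name.append(name)
    floor_from.append(start)
    floor_to.append(end)

print(wall_name, floor_from, floor_to)
```

If a separator is missing, partition returns an empty string for the part after it instead of raising an error, so malformed names may need a separate check.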

CodePudding user response:

You can use the regex version of the split function with a simple pattern:

import re

wall_list = ['W1_1F-12F', 'W2_1F-9F', 'W3_10F-12F']

for s in wall_list:
    print(re.split('[_-]', s))

Which will give:

['W1', '1F', '12F']
['W2', '1F', '9F']
['W3', '10F', '12F']

To regroup the parts into one sequence per field, pass the results to zip:

import re

wall_list = ['W1_1F-12F', 'W2_1F-9F', 'W3_10F-12F']

walls, floor_from, floor_to = zip(*(re.split('[_-]', s) for s in wall_list))
print(walls, floor_from, floor_to, sep='\n')

This now gives:

('W1', 'W2', 'W3')
('1F', '1F', '10F')
('12F', '9F', '12F')
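
Note that zip produces tuples, whereas the question shows lists. If lists are needed, one possible adjustment (a sketch that reuses the variable names from the question) is to convert each column explicitly:

```python
import re

wall_list = ['W1_1F-12F', 'W2_1F-9F', 'W3_10F-12F']

# Split every name on either separator, giving one [name, from, to] row per wall.
rows = [re.split(r'[_-]', s) for s in wall_list]

# zip(*rows) transposes rows into columns; list() turns each tuple into a list.
wall_name, floor_from, floor_to = (list(col) for col in zip(*rows))

print(wall_name, floor_from, floor_to, sep='\n')
```

The unpacking assumes wall_list is non-empty and that every name splits into exactly three parts.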

CodePudding user response:

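import re

wall_list = ['W1_1F-12F', 'W2_1F-9F', 'W3_10F-12F']

def extract_components(wall):
    # Capture the wall name, the starting floor and the ending floor.
    match = re.match(r"^(W\d+)_(\d+F)-(\d+F)", wall)
    return match.groups()

def extract(walls):
    # Transpose the per-wall tuples into one tuple per field.
    return list(zip(*[extract_components(wall) for wall in walls]))

wall_name, floor_from, floor_to = extract(wall_list)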

Results:

>>> wall_name
('W1', 'W2', 'W3')
>>> floor_from
('1F', '1F', '10F')
>>> floor_to
('12F', '9F', '12F')
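
One caveat: re.match returns None when a string does not fit the pattern, so extract_components would fail with an AttributeError on match.groups(). A minimal sketch of a stricter variant follows; the names WALL_PATTERN and parse_wall are hypothetical, and the pattern assumes names always look like W<digits>_<digits>F-<digits>F:

```python
import re

# Hypothetical pattern, assuming names of the form W<digits>_<digits>F-<digits>F.
WALL_PATTERN = re.compile(r"(W\d+)_(\d+F)-(\d+F)")

def parse_wall(wall):
    # fullmatch requires the whole string to fit, so trailing text is rejected.
    match = WALL_PATTERN.fullmatch(wall)
    if match is None:
        raise ValueError(f"unexpected wall name: {wall!r}")
    return match.groups()

wall_list = ['W1_1F-12F', 'W2_1F-9F', 'W3_10F-12F']
wall_name, floor_from, floor_to = zip(*(parse_wall(w) for w in wall_list))
```

Raising ValueError with the offending name makes it easier to spot a malformed entry than a bare AttributeError would.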

CodePudding user response:

Try this:

wallList = ["W1_1F-12F", "W2_1F-9F", "W3_10F-12F"]
wallName = []
floorFrom = []
floorTo = []

for element in wallList:
    wallName.append( element.split("_")[0] )
    floorFrom.append( element.split("-")[0].split("_")[1] )
    floorTo.append( element.split("-")[1] )

print(wallName)
print(floorFrom)
print(floorTo)