Python String remove everything after specific character showed up 3 times

Time:10-06

I have the following question.

I have a list full of file names and I want to filter out a specific part of it. Thing is, I won't be able to know the exact position of the information as it can change depending on the file itself. I can only be sure of the relative positions of the underscores.

An example would look like:

'C:\\Path...\\SomeInfo_MoreInfo_123_456_789.PDF'

What would I have to do to only get the 123? My initial idea would be to remove everything before the third _ and fourth _ but I do not know how to accomplish that with .split()

CodePudding user response:

To make clear how .split() works, take a look at its output:

>>> a = 'C:\\Path...\\SomeInfo_MoreInfo_123_456_789.PDF'
>>> a.split('_')
['C:\\Path...\\SomeInfo', 'MoreInfo', '123', '456', '789.PDF']

Now, if the part of the string that you want always comes after the first two '_', you can extract it with:

>>> a.split('_')[2]
'123'
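
If the folder part of the path could itself contain underscores, index 2 would no longer point at the right field. A minimal sketch that splits only the file name, assuming Windows-style paths as in the question; the helper name field_from_filename is hypothetical:

```python
from pathlib import PureWindowsPath

def field_from_filename(path, index=2, sep="_"):
    """Return the index-th sep-separated field of the file name, extension dropped."""
    # PureWindowsPath parses backslash-separated paths on any operating system.
    stem = PureWindowsPath(path).stem
    return stem.split(sep)[index]

print(field_from_filename('C:\\Path...\\SomeInfo_MoreInfo_123_456_789.PDF'))  # 123
```

Working on the stem also removes the extension, so the last field comes back as '789' rather than '789.PDF'.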

CodePudding user response:

string = 'C:\\Path...\\SomeInfo_MoreInfo_123_456_789.PDF'
new_list = string.split("_")

The new_list output would be:

['C:\\Path...\\SomeInfo', 'MoreInfo', '123', '456', '789.PDF']

and now we want to get the index of 123:

target = new_list.index("123")

The output would be:

2

and now we get the result:

result = new_list[target]

the output would be:

123

NOTE: With this method you can get any part of the string; just replace 123 with the value you are looking for.

Of course we could use new_list[2] directly, but what if the position of 123 changes?
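
Note that index() searches for the value itself, so it only helps when "123" is already known. If it is not, one option is to count fields from the end instead. A sketch under the assumption that the number of trailing fields is fixed (three, as in the example) while the number of leading fields may vary; the file names below are made up for illustration:

```python
# Hypothetical file names; only the last three fields are assumed to be stable.
file_names = [
    'C:\\Path...\\SomeInfo_MoreInfo_123_456_789.PDF',
    'C:\\Path...\\Extra_SomeInfo_MoreInfo_321_654_987.PDF',
]

# parts[-3] is the third field from the end, however many fields precede it.
results = [name.split('_')[-3] for name in file_names]
print(results)  # ['123', '321']
```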

CodePudding user response:

You could use:

parts = a.split('_')  # ['C:\\Path...\\SomeInfo', 'MoreInfo', '123', '456', '789.PDF']

and finally:

result = parts[2]  # '123'

Keep in mind that split returns a list of the elements separated by the delimiter ('_') you specified.
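
For the literal request in the title (dropping everything after the third '_'), the optional maxsplit argument of split may be useful. A small sketch using the same example string; treat it as one possible approach rather than the only one:

```python
a = 'C:\\Path...\\SomeInfo_MoreInfo_123_456_789.PDF'

# maxsplit=3 stops after the third underscore; the remainder stays in one piece.
parts = a.split('_', 3)

print(parts[2])             # 123
print('_'.join(parts[:3]))  # C:\Path...\SomeInfo_MoreInfo_123
```

Joining the first three pieces rebuilds the string up to, but not including, the third underscore.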
