I have a string that I've split almost to the level of what I need but not completely. My string looks like this to start:
str1 = 'C:\\\\Users\\\\U321103\\\\OneDrive - IBERDROLA S.A\\VARIABILIDAD CLIMATICA\\\\VORTEX\\\\WIND8\\349039.SPAIN.ESTE.CARRASCOSA.Power.csv'
I have used a split technique to get it to here:
str2 = str1.split('WIND8\\')[1].split('.csv')[0]
Out[132]: '349039.SPAIN.ESTE.CARRASCOSA.Power'
However, I really need this final answer:
str3 = 'SPAIN.ESTE.Power'
I'm not sure how to remove the content before "SPAIN" and between "ESTE" and ".Power". The substrings "349039", "SPAIN", "ESTE" (a region of the country), and "CARRASCOSA" in str1 will all change each time the script is run, so I think the code needs to select by position between the periods "." in str2. Thank you for your help!
CodePudding user response:
_str = 'C:\\\\Users\\\\U321103\\\\OneDrive - IBERDROLA S.A\\VARIABILIDAD CLIMATICA\\\\VORTEX\\\\WIND8\\349039.SPAIN.ESTE.CARRASCOSA.Power.csv'
# Split by '\', then by '.', then slice off the leading ID and the 'csv' extension.
_str = _str.split('\\')[-1].split('.')[1:-1]
Output:
['SPAIN', 'ESTE', 'CARRASCOSA', 'Power']
If you want to join:
_str = '.'.join(_str)
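Note that joining the slice above gives 'SPAIN.ESTE.CARRASCOSA.Power', which still contains the site name. A minimal sketch of getting from the full path to the requested result in one pass, assuming the file name always has the layout id.country.region.site.variable.csv (the names path, fields, and result are only illustrative):

```python
path = 'C:\\\\Users\\\\U321103\\\\OneDrive - IBERDROLA S.A\\VARIABILIDAD CLIMATICA\\\\VORTEX\\\\WIND8\\349039.SPAIN.ESTE.CARRASCOSA.Power.csv'

# Take the file name (the last backslash-separated piece) and split it on '.'.
fields = path.split('\\')[-1].split('.')
# fields is now ['349039', 'SPAIN', 'ESTE', 'CARRASCOSA', 'Power', 'csv']

# Keep the country, the region, and the field just before the extension.
# Assumption: the layout of the file name never changes.
result = '.'.join([fields[1], fields[2], fields[-2]])
print(result)  # SPAIN.ESTE.Power
```

Picking fields by position is only safe while the number of dot-separated parts stays the same; if a site or region name could itself contain a period, this would need a different approach.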
CodePudding user response:
As McLovin said in a comment, you should be able to split by "." and then rejoin by index, assuming that the structure remains the same.
str2 = '349039.SPAIN.ESTE.CARRASCOSA.Power'
substrs = str2.split('.')
str3 = '.'.join([substrs[i] for i in [1,2,-1]])
str3
>> 'SPAIN.ESTE.Power'
For more complicated or flexible splitting and parsing, consider using regular expressions with the re module. There is a bit of a learning curve, but they're very useful and there are lots of tutorials out there.
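As a rough illustration of the regular-expression route, here is a sketch that assumes the name always consists of exactly five period-separated fields (id, country, region, site, variable). The pattern and variable names are just one possible choice, not the only way to write it:

```python
import re

str2 = '349039.SPAIN.ESTE.CARRASCOSA.Power'

# Assumed layout: <id>.<country>.<region>.<site>.<variable>
# [^.]+ matches one field (any run of characters other than '.').
# Parentheses capture the fields we want to keep.
pattern = re.compile(r'^[^.]+\.([^.]+)\.([^.]+)\.[^.]+\.([^.]+)$')

match = pattern.match(str2)
if match is None:
    # The name did not have exactly five fields.
    raise ValueError(f'unexpected name format: {str2!r}')

str3 = '.'.join(match.groups())
print(str3)  # SPAIN.ESTE.Power
```

Compared with plain split and index, the pattern fails loudly when a name does not match the expected shape instead of silently picking the wrong fields.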