I would like to select some text from a file in Python and replace it with only the part of the selected text that comes before a certain character.
import re

with open('searchfile.txt', 'r') as f:
    content = f.read()
content_new = re.sub('^\S*', '(.*?\/)', content, flags=re.M)
with open('searchfile.txt', 'w') as f:
    f.write(content_new)
searchfile.txt contains the below text:
abc/def/efg 212 234 asjakj
hij/klm/mno 213 121 ashasj
My aim is to select everything from the start of the line until the first space and then replace it with the text up to the first occurrence of a forward slash /
Example: ^\S* selects everything until the first space in my file, which is "abc/def/efg".
I would like to replace this text with only "abc" (and with "hij" on the second line).
My regexp (.*?\/) does not work for me here.
CodePudding user response:
You can split the content on whitespace, get the first item, split it on / and take the first item:
content_new = content.split()[0].split('/')[0]
See the Python demo.
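Note that the expression above yields only the first token of the whole content. A minimal sketch of applying the same split to every line while keeping the rest of each line, assuming each line looks like `<path> <rest>`; the helper name `shorten_first_field` is hypothetical and not part of the original answer:

```python
def shorten_first_field(content):
    """Replace the first space-delimited field of each line with its part before the first '/'."""
    new_lines = []
    for line in content.splitlines():
        fields = line.split(' ', 1)          # [first_field, rest] or just [first_field]
        fields[0] = fields[0].split('/')[0]  # keep only the text before the first '/'
        new_lines.append(' '.join(fields))
    # splitlines() drops line endings, so a trailing newline is not preserved
    return '\n'.join(new_lines)

content = "abc/def/efg 212 234 asjakj\nhij/klm/mno 213 121 ashasj"
content_new = shorten_first_field(content)
print(content_new)
```

The result could then be written back to the file with `f.write(content_new)` as in the question.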
If you plan to use a regex, you may use
match = re.search(r'^[^\s/]+', content, flags = re.M)
if match:
    content_new = match.group()
See the Python demo. Details:
^ - start of a line (due to re.M)
[^\s/]+ - one or more chars other than whitespace and /.
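If the goal is to rewrite every line rather than extract a single match, the same character class can be used with `re.sub` and a capture group. This is a sketch, assuming every first field starts with at least one character other than whitespace and `/`; the sample `content` string mirrors searchfile.txt from the question:

```python
import re

content = "abc/def/efg 212 234 asjakj\nhij/klm/mno 213 121 ashasj\n"

# Group 1 captures the text before the first '/'; \S* consumes the rest of
# the first field, so only group 1 survives the substitution.
content_new = re.sub(r'^([^\s/]+)\S*', r'\1', content, flags=re.M)
print(content_new)
```

Because `\S` never matches a newline, each substitution stays within its own line.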
CodePudding user response:
Try this:
>>> s = 'abc/def/efg 212 234 asjakj'
>>> p = s.split(' ', maxsplit=1)
>>> p
['abc/def/efg', '212 234 asjakj']
>>> p[0] = p[0].split('/', maxsplit=1)[0]
>>> p
['abc', '212 234 asjakj']
>>> s = ' '.join(p)
>>> s
'abc 212 234 asjakj'
One-liner solution:
>>> s.replace(s[:s.index(' ')], s[:s.index('/')], 1)
'abc 212 234 asjakj'
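The one-liner raises `ValueError` when the string contains no space or no `/`, and it misbehaves if the first `/` comes after the first space. A more forgiving sketch using `str.partition`, which returns empty strings instead of raising when the separator is missing; the function name `keep_first_segment` is made up for illustration:

```python
def keep_first_segment(s):
    """Shorten the first space-delimited field of s to its part before the first '/'."""
    head, sep, rest = s.partition(' ')         # sep and rest are '' if there is no space
    return head.partition('/')[0] + sep + rest  # head is unchanged if it has no '/'

print(keep_first_segment('abc/def/efg 212 234 asjakj'))
print(keep_first_segment('abc 212/3'))
```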
CodePudding user response:
Maybe this can help:
import re
s = "abc/def/efg 212 234 asjakj"
pattern = r"^(.*?\/)"
replace = "xyz/"
op = re.sub(pattern, replace, s)
print(op)
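The snippet above swaps the first segment for a fixed string. For what the question asks (keep the first segment and drop the rest of the field), one option is to pass a function as the replacement argument of `re.sub`. A sketch under that assumption; `first_segment` is a hypothetical helper name:

```python
import re

def first_segment(match):
    # match.group() is the whole first field, e.g. 'abc/def/efg'
    return match.group().split('/', 1)[0]

content = "abc/def/efg 212 234 asjakj\nhij/klm/mno 213 121 ashasj\n"

# ^\S+ with re.M selects the first field of every line, as in the question
content_new = re.sub(r'^\S+', first_segment, content, flags=re.M)
print(content_new)
```

This keeps the question's own selection pattern and moves the "cut at the first slash" logic into plain string code.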
CodePudding user response:
Rephrased expected behavior
- Given a string that has this pattern: <path><space>.
- If the first part of the given string (<path>) has at least one slash / surrounded by words.
- Then return the string before the slash.
- Else return empty string.
Where path is words delimited by slashes. For example abc/de. But not one of those:
abc
/de
abc/file.txt
abc/
Solution
Matching lines
You could also check that the line matches the pattern first, and only then extract the first path element before the slash.
import re
line = "abc/def/efg 212 234 asjakj"
extracted = '' # default
if re.match(r'^(\w+/\w+)+', line):
    extracted = line.split('/')[0]  # even simpler than Wiktor's split
print(extracted)
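A stricter variant of the check, sketched here so that the counter-examples listed under the rephrased behavior (abc, /de, abc/file.txt, abc/) all yield an empty string. It assumes the path is everything before the first space and uses `re.fullmatch` on that part; the names `PATH_RE` and `first_path_element` are illustrative only:

```python
import re

# A word, followed by one or more '/word' groups, and nothing else
PATH_RE = re.compile(r'(\w+)(?:/\w+)+')

def first_path_element(line):
    """Return the first path element, or '' if the path does not fit the pattern."""
    path = line.split(' ', 1)[0]    # everything before the first space
    m = PATH_RE.fullmatch(path)     # the whole path must match, not just a prefix
    return m.group(1) if m else ''

print(first_path_element("abc/def/efg 212 234 asjakj"))
```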
Extraction
The extraction can be done in two ways:
(1) Just the first path-element, like Wiktor answered.
first_path_element = "abc/def/efg 212 234 asjakj".split('/')[0]
print(first_path_element)
(2) Some may find a regex shorter and more expressive:
import re
first_path_element = re.findall(r'^(\w+)/', "abc/def/efg 212 234 asjakj")[0]
print(first_path_element)