I have the following string in Python:
datastring = """
Animals {
idAnimal
nameAnimal
animalko5854hg[name="Jazz"]
animal6ljkjh[name="Pinky"]
animal595s422d1252g55[name="Steven"]
animalko5854hg[name="David"]
}
"""
print(type(datastring))  # -> str
The string is data that I previously read from a text file; it is now stored in datastring.
In datastring, the fourth line always has the following form: animalidAnimal[name="nameAnimal"]
I would like to write a function that takes a string like the one above as a parameter and returns the idAnimal part of the first line that has the form animalidAnimal[name="nameAnimal"]
For example, for the first string my expected output would be:
ko5854hg
Another example:
datastring = """
Animals {
idAnimal
nameAnimal
animal456jlk165ut[name="Dalty"]
animal6ljkj[name="Moon"]
}
"""
Expected output:
456jlk165ut
Last example:
datastring = """
Animals {
idAnimal
nameAnimal
animalk45lil69lhfr5942lk[name="Jazz"]
animal6ljkjh[name="Pinky"]
animal595s422d1252g55[name="Steven"]
animalko5854hg[name="David"]
animalko5854hg[name="Oty"]
animalko5854hg[name="Dan"]
}
"""
Expected output:
k45lil69lhfr5942lk
I don't want to come across as lazy, but I really don't know how to start coding this. I read about the startswith and endswith functions, but those only return True/False values.
Thanks.
CodePudding user response:
Have you tried using regular expressions?
re.findall(r"(?<=animal)(.*?)(?=\[)", datastring) returns the list of IDs, so if you want the first occurrence you can take the element at index 0. Good luck.
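A minimal sketch of this approach, assuming the function meant here is `re.findall` from the standard `re` module. The names `ids` and `first_id` are only illustrative, and the sample input is copied from the question.

```python
import re

# Sample input copied from the question.
datastring = """
Animals {
idAnimal
nameAnimal
animalko5854hg[name="Jazz"]
animal6ljkjh[name="Pinky"]
animal595s422d1252g55[name="Steven"]
animalko5854hg[name="David"]
}
"""

# The lookbehind requires a preceding "animal"; the lookahead requires a following "[".
# With one capture group, findall returns the captured text of every match.
ids = re.findall(r"(?<=animal)(.*?)(?=\[)", datastring)

# Guard against an empty list before taking the first element.
first_id = ids[0] if ids else None
print(first_id)
```

Because the match is case-sensitive, the header lines idAnimal and nameAnimal (capital "A") are not picked up.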
Thanks for pointing that out. Here is a simpler way:
for line in datastring.splitlines():
    if line.startswith("animal"):
        # Drop the "animal" prefix and keep everything before "[".
        animal_id = line[len("animal"):].split("[")[0]
        break
I think KillerRebooted's answer is more effective, but as I said, this one is simpler.
CodePudding user response:
You can start the match with { and use a capture group for the animalId:
{[^{}]*?\banimal(\w+)\[name="[^\s"*]*"]
The pattern matches:
{ : Match a { char
[^{}]*? : Match any character except { and }, as few as possible
\banimal : Match animal with a leading word boundary
(\w+) : Capture group 1, match 1 or more word characters
\[name="[^\s"*]*"] : Match the [name="...."] part
Example code
import re
pattern = r"{[^{}]*?\banimal(\w+)\[name=\"[^\s\"*]*\"]"
s = ("Animals {\n"
" idAnimal\n"
" nameAnimal\n"
" animal456jlk165ut[name=\"Dalty\"]\n"
" animal6ljkj[name=\"Moon\"]\n\n"
"}")
m = re.search(pattern, s)
if m:
    print(m.group(1))
Output
456jlk165ut
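The same idea can be wrapped in a reusable function. This is only a sketch: it assumes the capture group is meant to be `(\w+)` (one or more word characters), and the names `ANIMAL_ID` and `first_animal_id` are hypothetical. The two inputs are the second and third examples from the question.

```python
import re

# Compiled once; assumes the capture group is (\w+), i.e. one or more word characters.
ANIMAL_ID = re.compile(r'{[^{}]*?\banimal(\w+)\[name="[^\s"*]*"]')


def first_animal_id(text):
    """Return the id of the first animal entry after "{", or None if there is none."""
    m = ANIMAL_ID.search(text)
    return m.group(1) if m else None


example_two = """
Animals {
idAnimal
nameAnimal
animal456jlk165ut[name="Dalty"]
animal6ljkj[name="Moon"]
}
"""

example_three = """
Animals {
idAnimal
nameAnimal
animalk45lil69lhfr5942lk[name="Jazz"]
animal6ljkjh[name="Pinky"]
}
"""

print(first_animal_id(example_two))
print(first_animal_id(example_three))
```

Returning None when nothing matches avoids the AttributeError that calling .group(1) directly on a failed search would raise.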
CodePudding user response:
You should probably allow for the line starting with 'animal' not necessarily being the fourth line. This might be more robust:
datastring = """
Animals {
idAnimal
nameAnimal
animalko5854hg[name="Jazz"]
animal6ljkjh[name="Pinky"]
animal595s422d1252g55[name="Steven"]
animalko5854hg[name="David"]
}
"""
ANIMAL = 'animal'
def get_animal_id(ds):
    # Strip leading whitespace so indented lines are handled too.
    for line in map(str.lstrip, ds.splitlines()):
        if line.startswith(ANIMAL):
            return line[len(ANIMAL):line.index('[')]
print(get_animal_id(datastring))
Output:
ko5854hg
Note:
If the first line that starts with 'animal' does not contain '[', this will fail with a ValueError. If no line starts with 'animal', the function returns None.
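If the ValueError is a concern, one possible variant uses str.partition, which never raises. This is a sketch only: `get_animal_id_safe` is a hypothetical name, and returning None for a malformed line is an assumption about the desired behaviour.

```python
ANIMAL = "animal"


def get_animal_id_safe(ds):
    """Like get_animal_id, but return None instead of raising when '[' is missing."""
    for line in map(str.lstrip, ds.splitlines()):
        if line.startswith(ANIMAL):
            # partition returns an empty separator when "[" is not found.
            animal_id, sep, _ = line[len(ANIMAL):].partition("[")
            return animal_id if sep else None
    # No line started with "animal".
    return None


print(get_animal_id_safe('animalko5854hg[name="Jazz"]'))
print(get_animal_id_safe("animalbroken"))
```

This version stops at the first line starting with 'animal', as the original does; skipping malformed lines and continuing the search would be an equally reasonable choice.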
You could also do this using a regular expression thus:
import re
print(re.search(r'(?<=animal)(.*?)(?=\[)', datastring).group(1))