I needs to convert base string to target string. I have a working code right now, but if there is a "," character where it says tvg-name, the code is broken and doesn't work. How can I fix this bug?
Base Working String: {tvg-id: , tvg-name: A beautiful Day - 2016, tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg, group-title: 2017-16-15 Germany Cinema}
Base Problem String: {tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg, group-title: 2017-16-15 Germany Cinema}
Target: {"tvg-id": "None", "tvg-name": "Antonio, ihm schmeckt's nicht! (2016)", "tvg-logo": "https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg", "group-title": "2017-16-15 Germany Cinema"}
My Convert Function
def convert(example):
#split the string into a list
example= example.replace("{", "").replace("}", "").split(",")
#create a dictionary
final = {}
#loop through the list
for i in example:
#split the string into a list
i = i.split(":")
#if http or https is in the list merge with next item
if "http" in i[1] or "https" in i[1]:
i[1] = i[1] ":" i[2]
i.pop(2)
#remove first char whitespace
if i[0][0] == " ":
i[0]=i[0][1:]
#remove first char whitespace
if i[1][0] == " ":
i[1]=i[1][1:]
final[i[0]] = i[1]
#return the dictionary
return final
CodePudding user response:
Instead of normal .split(',')
, we can use regular expression to help us handle the split.
import re
def convert(example):
kv_pairs = re.split(', (?=\w -?\w :)', example[1:-1])
result = {}
for kv_pair in kv_pairs:
key, value = kv_pair.split(': ', 1)
result[key] = value
return result
In re.split(', (?=\w -?\w :)', example[1:-1])
, we only split those commas that are followed by the pattern (?=\w -?\w :)
, for example tvg-logo:
.
In key, value = kv_pair.split(': ', 1)
, we specify maxsplit=1
, so that we don't need to worry about colons in values (like URLs).
Hope it helps.
CodePudding user response:
You can't really do this without some heuristics.
Here's a code that works -
from typing import Dict, Optional
def convert(input: str) -> Dict[str, Optional[str]]:
input = input.strip()[1:-1] # Remove the curly braces {...}
result: Dict[str, Optional[str]] = {}
carryover = ''
for pair in input.split(','):
kv = (carryover pair).strip().split(':', 1)
if len(kv) == 1:
carryover = pair ','
continue
result[kv[0]] = kv[1] if kv[1] else None
carryover = ''
return result
This works by preventing an output if there's no ':'
up to the current string.
Note that this will break if you have strings like '{ab,cd:ef,gh}'
since it won't know what to do with 'gh'. It's actually a bit ambiguous.
To handle all cases correctly, the only option is to change the input source to quote the string if possible. If that's not possible, or if it's a one-time thing, you can try to extend the heuristics to cover all your cases.
CodePudding user response:
Regex does good things:
import re
def convert(s):
s = s[1:-1] # Remove {}
# Split on commas followed by a space then group of characters that end in ':'
s = re.split(', (?=\S :)', s)
# Split each of these groups on the first ': '. Now it's basically a dict.
return dict(i.split(': ', 1) for i in s)
>>> x = '{tvg-id: , tvg-name: A beautiful Day - 2016, tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg, group-title: 2017-16-15 Germany Cinema}'
>>> print(convert(x))
# Output:
{'tvg-id': '', 'tvg-name': 'A beautiful Day - 2016', 'tvg-logo': 'https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg', 'group-title': '2017-16-15 Germany Cinema'}
>>> x = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg, group-title: 2017-16-15 Germany Cinema}"
>>> print(convert(x))
# Output:
{'tvg-id': '', 'tvg-name': "Antonio, ihm schmeckt's nicht! (2016)", 'tvg-logo': 'https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg', 'group-title': '2017-16-15 Germany Cinema'}