How to fix converting a string bug?

Time:07-10

I need to convert a base string to a target string. I have working code right now, but if there is a "," character inside the tvg-name value, the code breaks. How can I fix this bug?

Base Working String: {tvg-id: , tvg-name: A beautiful Day - 2016, tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg, group-title: 2017-16-15 Germany Cinema}

Base Problem String: {tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg, group-title: 2017-16-15 Germany Cinema}

Target: {"tvg-id": "None", "tvg-name": "Antonio, ihm schmeckt's nicht! (2016)", "tvg-logo": "https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg", "group-title": "2017-16-15 Germany Cinema"}

My Convert Function

def convert(example):
    #split the string into a list
    example= example.replace("{", "").replace("}", "").split(",")
    #create a dictionary
    final = {}
    #loop through the list
    for i in example:
        #split the string into a list
        i = i.split(":")

        #if http or https is in the list merge with next item
        if "http" in i[1] or "https" in i[1]:
            i[1] = i[1] + ":" + i[2]
            i.pop(2)
        

        #remove first char whitespace
        if i[0][0] == " ":
            i[0]=i[0][1:]

        #remove first char whitespace
        if i[1][0] == " ":
            i[1]=i[1][1:]


        final[i[0]] = i[1]
        
            
    #return the dictionary
    return final
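
For reference, the failure can be reproduced in a few lines. This is only a minimal sketch: it uses a shortened version of the problem string (the logo URL is a placeholder, not the real one) and shows that splitting on every comma leaves a fragment with no ":" in it, which is where indexing i[1] fails.

```python
# Shortened problem string; the logo URL is a placeholder for readability.
problem = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://example.com/a.jpg}"

# Same first step as convert(): drop the braces and split on every comma.
pieces = problem.replace("{", "").replace("}", "").split(",")
for piece in pieces:
    print(repr(piece), "->", piece.split(":"))

# A fragment without ":" splits into a one-element list,
# so accessing index 1 on it raises IndexError.
fragments_without_colon = [p for p in pieces if ":" not in p]
print(fragments_without_colon)
```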

CodePudding user response:

Instead of a plain .split(','), we can use a regular expression to handle the split.

import re

def convert(example):
    kv_pairs = re.split(r', (?=\w+-?\w+:)', example[1:-1])
    result = {}
    for kv_pair in kv_pairs:
        key, value = kv_pair.split(': ', 1)
        result[key] = value
    return result

In re.split(r', (?=\w+-?\w+:)', example[1:-1]), we split only on commas that are followed by a key such as tvg-logo:, which is what the lookahead (?=\w+-?\w+:) matches.

In key, value = kv_pair.split(': ', 1), we specify maxsplit=1, so that we don't need to worry about colons in values (like URLs).
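
To make the two steps concrete, here is a small sketch that prints the intermediate pieces. It assumes the pattern is a comma and space followed by a key of the form word, optional hyphen, word, colon, and it uses a shortened sample string with a placeholder URL rather than the real one.

```python
import re

# Shortened sample; the URL is a placeholder for illustration.
sample = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://example.com/a.jpg}"

# Split only on ", " that is directly followed by something like "tvg-logo:".
# The comma inside the title is not followed by such a key, so it is kept.
kv_pairs = re.split(r', (?=\w+-?\w+:)', sample[1:-1])
print(kv_pairs)

# maxsplit=1 keeps the ":" inside the URL as part of the value.
key, value = kv_pairs[-1].split(': ', 1)
print(key, '|', value)
```

Note that the first piece is 'tvg-id: ' with a trailing space, which is why splitting on ': ' yields an empty string as its value.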

Hope it helps.

CodePudding user response:

You can't really do this without some heuristics.

Here's one such heuristic -

from typing import Dict, Optional

def convert(input: str) -> Dict[str, Optional[str]]:
  input = input.strip()[1:-1]  # Remove the curly braces {...}
  result: Dict[str, Optional[str]] = {}
  carryover = ''
  for pair in input.split(','):
    kv = (carryover + pair).strip().split(':', 1)
    if len(kv) == 1:
      carryover += pair + ','
      continue
    result[kv[0]] = kv[1] if kv[1] else None
    carryover = ''
  return result

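This works by holding back any piece that has no ':' in it: the piece is carried over and joined onto the next one, and nothing is written to the result until a ':' has been seen.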

Note that this will break if you have strings like '{ab,cd:ef,gh}' since it won't know what to do with 'gh'. It's actually a bit ambiguous.

To handle all cases correctly, the only option is to change the input source to quote the string if possible. If that's not possible, or if it's a one-time thing, you can try to extend the heuristics to cover all your cases.
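
As one possible way to extend the heuristics, the sketch below treats a piece that has no ':' as a continuation of the previous value, which fits titles that contain a comma. The function name convert_with_continuation is hypothetical, and the approach assumes that keys never contain commas and that the tail of a value never contains a ':'. The sample string uses a placeholder URL.

```python
from typing import Dict, Optional

def convert_with_continuation(text: str) -> Dict[str, Optional[str]]:
    """Hypothetical variant: a piece without ':' continues the previous value."""
    body = text.strip()[1:-1]  # Remove the curly braces {...}
    result: Dict[str, Optional[str]] = {}
    last_key: Optional[str] = None
    for piece in body.split(','):
        if ':' not in piece and last_key is not None:
            # Re-attach the comma that split(',') removed.
            previous = result[last_key] or ''
            result[last_key] = previous + ',' + piece
            continue
        # Split on the first ':' only, so URLs stay intact.
        key, _, value = piece.partition(':')
        last_key = key.strip()
        result[last_key] = value.strip() or None
    return result

# Shortened sample; the URL is a placeholder for illustration.
sample = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://example.com/a.jpg}"
print(convert_with_continuation(sample))
```

This is still a guess for inputs like '{ab,cd:ef,gh}': the sketch attaches 'gh' to the value of 'cd', which may or may not be what the data means.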

CodePudding user response:

Regex does good things:

import re

def convert(s):
    s = s[1:-1] # Remove {}
    # Split on commas followed by a space then group of characters that end in ':'
    s = re.split(r', (?=\S+:)', s)
    # Split each of these groups on the first ': '. Now it's basically a dict.
    return dict(i.split(': ', 1) for i in s)

>>> x = '{tvg-id: , tvg-name: A beautiful Day - 2016, tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg, group-title: 2017-16-15 Germany Cinema}'
>>> print(convert(x))

# Output: 
{'tvg-id': '', 'tvg-name': 'A beautiful Day - 2016', 'tvg-logo': 'https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg', 'group-title': '2017-16-15 Germany Cinema'}

>>> x = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg, group-title: 2017-16-15 Germany Cinema}"
>>> print(convert(x))

# Output:
{'tvg-id': '', 'tvg-name': "Antonio, ihm schmeckt's nicht! (2016)", 'tvg-logo': 'https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg', 'group-title': '2017-16-15 Germany Cinema'}
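
The Target in the question is written with double quotes and shows "None" where the value is empty. A minimal sketch of getting from the dict to that form is given below; to_target_json is a hypothetical helper name, the choice of the string "None" for empty values is read off the Target line, and the sample uses a placeholder URL.

```python
import json
import re

def convert(s):
    # Split on ", " only when it is followed by a run of non-space characters ending in ":".
    pairs = re.split(r', (?=\S+:)', s[1:-1])
    return dict(p.split(': ', 1) for p in pairs)

def to_target_json(s):
    """Hypothetical helper: render the dict in the question's Target form."""
    # The Target shows the string "None" where the value is empty.
    filled = {k: (v if v else "None") for k, v in convert(s).items()}
    # ensure_ascii=False keeps any non-ASCII characters in titles readable.
    return json.dumps(filled, ensure_ascii=False)

# Shortened sample; the URL is a placeholder for illustration.
x = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://example.com/a.jpg}"
print(to_target_json(x))
```

If a real null is wanted instead of the string "None", mapping empty values to Python's None makes json.dumps emit null.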