How to remove the double quotes around numbers in a string

Time:11-16

I have several strings like this:

"apple cost": "2.78" 
"orange cost": "12.59"
"melone cost": "42.12"

The number varies, and I want to remove the double quotes around it. The result should look like this:

"apple cost": 2.78
"orange cost": 12.59
"melone cost": 42.12

The prices are between 0.01 and 999.99. The text varies only in the name of the fruit and is always in the form

"fruit name cost":. This part of the string does not need to be changed.

How can I do this in Python?

I tried an expression like this, but the double quotes weren't removed.

string = '"apple cost": "2.78"'
string = string.replace('"apple cost": "([0-9]).([1-9]|[0-9][0-9])"', '"apple cost": ([0-9]).([1-9]|[0-9][0-9])')
string = string.replace('"orange cost": "([0-9]).([1-9]|[0-9][0-9])"', '"orange cost": ([0-9]).([1-9]|[0-9][0-9])')

CodePudding user response:

Does this get you where you want to go?

strings = ['"apple cost": "2.78"', 
           '"orange cost": "12.59"', 
           '"melone cost": "42.12"']

return_dict = {sub[0]: sub[-1].split('"')[-2] for sub in [string.split(":") for string in strings]}

for key, val in return_dict.items():
    print (f'{key} : {val}')

Producing this output:

"apple cost" : 2.78
"orange cost" : 12.59
"melone cost" : 42.12

This approach keeps the numbers as strings and just removes the double quotes, which is specifically what you asked about. You could of course also convert the numbers from strings to floats if you wished (the variant below also strips the quotes from the keys):

return_dict = {sub[0].split('"')[-2]: float(sub[-1].split('"')[-2]) for sub in [string.split(":") for string in strings]}

EDIT

If you're looking to return a list of strings identical to the input ones EXCEPT for the removal of the double quotes around the number, this should work:

newstrings = []

for string in strings:
    key, val = string.split(':')
    key += ':'
    val = val.split('"')[-2]
    newstrings.append(f'{key} {val}')
    
for newstring in newstrings:
    print (newstring)

producing the output:

"apple cost": 2.78
"orange cost": 12.59
"melone cost": 42.12

This obviously assumes that your input data will always look the same as the sample of data you've provided... If you have more edge cases, please provide them for a more expansive answer.
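If one such edge case were a colon inside the fruit name or stray whitespace after the number, a slightly more tolerant variant might look like the sketch below. The helper name unquote_number is made up for illustration, and it assumes each line still ends with a quoted number; it does not check that the value is actually numeric.

```python
def unquote_number(line):
    """Return `line` with the double quotes around the trailing value removed."""
    # rpartition splits on the LAST colon only, so a colon inside the key survives
    key, sep, val = line.rpartition(':')
    # drop surrounding whitespace first, then the surrounding double quotes
    val = val.strip().strip('"')
    return f'{key}{sep} {val}'


for line in ['"apple cost": "2.78" ', '"blood:orange cost": "12.59"']:
    print(unquote_number(line))
```

The second input line is an invented example, used only to show that the key is left untouched.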

CodePudding user response:

You can use two capture groups and reference them in the replacement as \1\2:

("[^"]* cost":\s*)"(\d{1,3}(?:\.\d{1,2})?)"

The pattern matches:

  • ( Capture group 1
    • "[^"]* cost":\s* Match ", then optional chars other than ", then cost": followed by optional whitespace chars
  • ) Close group 1
  • " Match the double quote that you don't want to keep
  • ( Capture group 2
    • \d{1,3}(?:\.\d{1,2})? Match 1-3 digits and optionally match . and 1-2 digits
  • ) Close group 2
  • " Match the double quote that you don't want to keep
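To see what each group captures, a quick check along these lines may help. It is only a sketch: the sample strings are taken from the question, apart from the out-of-range value "1234.5", which is invented to show a non-match.

```python
import re

# Same pattern as described above, written with single quotes to avoid escaping
pattern = re.compile(r'("[^"]* cost":\s*)"(\d{1,3}(?:\.\d{1,2})?)"')

match = pattern.search('"orange cost": "12.59"')
prefix, number = match.groups()
print(repr(prefix))  # group 1: the key, the colon and the whitespace after it
print(repr(number))  # group 2: the number without its quotes

# A value with four digits before the dot falls outside \d{1,3}, so nothing matches
no_match = pattern.search('"orange cost": "1234.5"')
print(no_match)
```

Because group 1 includes the trailing whitespace, joining the two groups as \1\2 keeps the spacing of the original line.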


Example

import re
 
pattern = r"(\"[^\"]* cost\":\s*)\"(\d{1,3}(?:\.\d{1,2})?)\""
 
s = ("\"apple cost\": \"2.78\" \n"
    "\"orange cost\": \"12.59\"\n"
    "\"melone cost\": \"42.12\"")
 
print(re.sub(pattern, r"\1\2", s))

Output

"apple cost": 2.78 
"orange cost": 12.59
"melone cost": 42.12
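
As for why the attempt in the question left the string unchanged: str.replace treats its first argument as literal text, not as a regular expression, so it never finds anything to replace. The sketch below contrasts the two; the short pattern passed to re.sub is a simplified stand-in chosen for this illustration, not the full pattern from above.

```python
import re

s = '"apple cost": "2.78"'

# str.replace searches for the literal characters "([0-9])..." which do not
# occur in s, so the string comes back unchanged
literal_result = s.replace('"([0-9]).([1-9]|[0-9][0-9])"', '')
print(literal_result == s)

# re.sub interprets the pattern as a regex; \1 re-inserts the captured number
regex_result = re.sub(r'"(\d+\.\d+)"', r'\1', s)
print(regex_result)
```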