Old title: Convert scientific notation string without 'e' to float in Python
I use a program that allows some odd formats. One format for real numbers is scientific notation without the letter 'E'. For example, "-1.67E-6" could be written as "-1.67-6". Obviously, float() doesn't like this. I am writing many classes that need this same check in multiple fields, so I need a function that does it cleanly. Is there a way to override the builtin definition of float() so that it can handle this format? I think that would be ideal, but I'm not sure if it's possible.
My current workaround is to use my own string-to-float function that looks something like the following.
import re

def str2float(s):
    # A mantissa followed directly by a signed exponent, e.g. "-1.67-6" or "1.67+6".
    m = re.fullmatch(r'([-+]?(?:[0-9]+\.?[0-9]*|\.[0-9]+))([-+][0-9]+)', s.strip())
    if m:
        base, exp = m.groups()
        s = f'{base}E{exp}'
    return float(s)
CodePudding user response:
How about something like this?
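def str2float(s):
    # Prefer '+' as the separator if present, otherwise fall back to '-'.
    sign = '+' if '+' in s else '-'
    l = s.split(sign)
    if (len(l) == 3) and (l[0] == ''):
        # Leading sign plus exponent with the same sign, e.g. "-1.67-6".
        return float(sign + l[1] + "E" + sign + l[2])
    elif (len(l) == 2) and (l[0] != ''):
        # Mantissa followed by a signed exponent, e.g. "1.67+6" or "-1.67+6".
        return float(l[0] + "E" + sign + l[1])
    else:
        return float(s)

str2float("-1.67+6")  # -1670000.0
str2float("-1.67-6")  # -1.67e-06
str2float("1.67+6")   # 1670000.0
str2float("1.67")     # 1.67
str2float("-1.67")    # -1.67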
CodePudding user response:
You could use re.sub in a way that also leaves already valid scientific notation alone.
import re

def str_to_float(s: str) -> float:
    s = re.sub(r'(?<=\d)-(?=\d)', 'E-', s)
    s = re.sub(r'(?<=\d)\+(?=\d)', 'E', s)
    return float(s)

print(str_to_float('-1.67E-6'))  # -1.67e-06
print(str_to_float('-1.67-6'))   # -1.67e-06
print(str_to_float('1.67+6'))    # 1670000.0
print(str_to_float('1.67E6'))    # 1670000.0
print(str_to_float('-1.67'))     # -1.67
print(str_to_float('1.67'))      # 1.67
(?<=\d) - positive lookbehind for a single digit (0 through 9).
- and \+ - match the characters - and + literally.
(?=\d) - positive lookahead for a single digit (0 through 9).
This locates the invalid scientific notation (as float sees it) and replaces it with valid scientific notation.
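To make the lookaround behaviour easier to see, here is a small sketch that only does the rewriting step and shows the intermediate string. The helper name normalize is mine, not part of the answer above, and it folds both substitutions into one pattern with a captured sign. It assumes the sign always sits directly between two digits, so a mantissa that ends in a bare dot (for example "1.-6") would not be rewritten.

```python
import re

def normalize(s: str) -> str:
    """Insert the missing 'E' between a digit and a following sign (hypothetical helper)."""
    # A '-' or '+' directly between two digits can only be an exponent sign here.
    # A leading sign, or a sign right after 'E', has no digit before it and is left alone.
    return re.sub(r'(?<=\d)([-+])(?=\d)', r'E\1', s)

for text in ['-1.67-6', '1.67+6', '-1.67E-6', '-1.67']:
    fixed = normalize(text)
    print(f'{text!r} -> {fixed!r} -> {float(fixed)}')
```

Because the lookbehind and lookahead consume no characters, the digits on either side of the sign stay in place and only the 'E' is added.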
CodePudding user response:
I think I've found what I was looking for.
import re

class float(float):
    def __new__(cls, s):
        # Only rewrite strings such as "-1.67-6"; everything else goes straight to the builtin.
        if isinstance(s, str):
            m = re.fullmatch(r'([-+]?(?:[0-9]+\.?[0-9]*|\.[0-9]+))([-+][0-9]+)', s.strip())
            if m:
                s = f'{m.group(1)}E{m.group(2)}'
        return super().__new__(cls, s)
This appears to let me override the base behavior of float() the way I want. Is this a good idea? Bad idea? Why?
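One thing to weigh here: once the name float is rebound to a subclass in a module, isinstance(1.0, float) in that module becomes False, because ordinary floats are not instances of the subclass. A possible alternative, sketched below under my own assumptions, is to keep the builtin untouched and give the subclass its own name. LooseFloat and _NO_E are hypothetical names, and the pattern assumes the mantissa contains at least one digit.

```python
import re

# Mantissa with at least one digit, followed directly by a signed exponent.
_NO_E = re.compile(r'([-+]?(?:[0-9]+\.?[0-9]*|\.[0-9]+))([-+][0-9]+)')

class LooseFloat(float):
    """float subclass that also accepts strings such as '-1.67-6'.

    Hypothetical name; the builtin float is left as it is.
    """
    def __new__(cls, value=0.0):
        if isinstance(value, str):
            m = _NO_E.fullmatch(value.strip())
            if m:
                # Re-insert the missing 'E' before handing the string to float.
                value = f'{m.group(1)}E{m.group(2)}'
        return super().__new__(cls, value)

x = LooseFloat('-1.67-6')
print(x, isinstance(x, float))          # still a float for isinstance checks
print(type(LooseFloat('1.67+6') + 1))   # arithmetic falls back to the builtin float
```

Each field that needs the odd format can then call LooseFloat(...) explicitly, while code that relies on the real float keeps working.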