We have a table in HANA with temperature data. Years ago the genius decision was made in our old database to make the temperature field a string, since temps were entered manually and people could use the field to add exception codes and text when a temp was bad or couldn't be taken.
Now I'm trying to extract only the rows with valid temps (some form of decimal or integer) so I can cast the temps as decimal and do analysis. Using regex, I can filter out all non-numeric fields...except those like this:
52.3.
I'm currently using /^[+-]?((\d+(.\d*)?)|(.\d+))$/ as my expression.
There are a lot of weird decimal formats this does catch, but not numbers with an additional, separate period at the end.
They're not going to fix the data, and even if they did, it would take them forever to get around to it. So I need a new expression to handle these. Hoping someone has an idea because my google-fu has failed me so far.
CodePudding user response:
This should do the trick:
/^[+-]?\d+\.\d+\.$/
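To see what this pattern actually selects, here is a minimal sketch in Python's re module. It is only an illustration: it assumes the quantifiers are + (one or more) and the sign class is [+-], and Python's regex engine is not HANA's, so behavior there should be checked separately. The helper name has_trailing_dot and the sample values are made up for the example.

```python
import re

# Pattern from the answer above, without the surrounding / delimiters.
# Assumption: + quantifiers after each \d and a [+-] sign class.
TRAILING_DOT = re.compile(r"^[+-]?\d+\.\d+\.$")


def has_trailing_dot(value: str) -> bool:
    """Return True for values shaped like 52.3. (a decimal plus one extra period)."""
    return TRAILING_DOT.match(value) is not None


# Hypothetical sample values, not taken from the real table.
samples = ["52.3.", "52.3", "42", "-6.5.", "N/A"]
flagged = [s for s in samples if has_trailing_dot(s)]
print(flagged)
```

Note that this pattern matches only the values with the extra trailing period, so it is better suited to finding those rows than to selecting the valid ones.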
CodePudding user response:
You could use this:
^[+-]?\d+(?:\.\d+)?$
\d+ matches 1 or more digits.
(?:\.\d+)? is a non-capturing group with a decimal separator followed by 1 or more digits. Since the group can either exist or not, this ensures there is at most one decimal separator.
So it matches:
52.3
42
-6
52.0
0.61
0.6
but doesn't match:
52.
test
192.0.0.1
-6.0.1
.8
6+
-6-
-3.
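As a quick sanity check, the lists above can be run through the pattern. This is a sketch in Python's re module rather than in HANA, assuming + quantifiers and a [+-] sign class; the function name is_valid_temp is hypothetical, and the samples are a subset of the ones listed plus the 52.3. case from the question.

```python
import re
from decimal import Decimal

# Assumption: + quantifiers and a [+-] sign class in the pattern.
STRICT_NUMBER = re.compile(r"^[+-]?\d+(?:\.\d+)?$")


def is_valid_temp(value: str) -> bool:
    """Return True when the value is an integer or a decimal with digits on both sides."""
    return STRICT_NUMBER.match(value) is not None


should_match = ["52.3", "42", "-6", "52.0", "0.61", "0.6"]
should_not_match = ["52.", "test", "192.0.0.1", "-6.0.1", ".8", "-6-", "-3.", "52.3."]

matched = [s for s in should_match if is_valid_temp(s)]
rejected = [s for s in should_not_match if not is_valid_temp(s)]

# Values that pass the filter can be converted safely for analysis.
temps = [Decimal(s) for s in matched]
print(temps)
```

Every value in the first list passes and every value in the second is rejected, including the 52.3. case the question asks about.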
CodePudding user response:
That's a pretty sensible looking regex, it's a good start. Kudos for using ^ and $ anchors at front and back.
What you have to understand is that in a regex . is quite different from \. -- the 1st matches any character while the 2nd matches just a literal . dot character.
So you'll want
/^[+-]?((\d+(.\d*)?)|(.\d+))$/
to become
/^[+-]?((\d+(\.\d*)?)|(\.\d+))$/
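The difference between the two is easy to see on a few inputs. The following is a rough sketch using Python's re module, not HANA, and it assumes the quantifiers are + and the sign class is [+-]; the sample strings are invented for the illustration.

```python
import re

# Unescaped dot: "." matches any character.
UNESCAPED = re.compile(r"^[+-]?((\d+(.\d*)?)|(.\d+))$")
# Escaped dot: "\." matches only a literal period.
ESCAPED = re.compile(r"^[+-]?((\d+(\.\d*)?)|(\.\d+))$")

# Hypothetical sample values.
samples = ["52.3", "52x3", "x5", ".8", "52.", "52.3."]

unescaped_hits = [s for s in samples if UNESCAPED.match(s)]
escaped_hits = [s for s in samples if ESCAPED.match(s)]

print(unescaped_hits)
print(escaped_hits)
```

With the unescaped dot, strings such as 52x3 and x5 slip through because any character can stand in for the period. The escaped version accepts 52.3, .8 and 52., and rejects 52x3, x5 and 52.3.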