I have a string in Python like this one:
'speed=36.2448,course=331.35,gps_time=2021-11-22T00:43:22.678Z,fix=1,message_source=device_gps,period_km=0.436,location=Middle of no where,x=3.2'
and I need to add double quotes around every non-numeric value that sits between an '=' and a ','. The result should look like this:
'speed=36.2448,course=331.35,gps_time="2021-11-22T00:43:22.678Z",fix=1,message_source="device_gps",period_km=0.436,location="Middle of no where",x=3.2'
I've been trying with regex for hours and it's driving me crazy. Any help would be welcome. Thank you in advance.
CodePudding user response:
If you don't care about handling escaped commas, you can simply split the string on commas, split each item on the first =, handle the right-hand side depending on whether it is numeric, and finally join everything back together.
s = 'speed=36.2448,course=331.35,gps_time=2021-11-22T00:43:22.678Z,fix=1,message_source=device_gps,period_km=0.436,location=Middle of no where,x=3.2'
items = []
for item in s.split(','):
    lhs, rhs = item.split('=', 1)
    try:
        float(rhs)
        # Could convert rhs to float, so leave item unchanged
        items.append(item)
    except ValueError:
        # Could not convert rhs to float, so it is not numeric. Surround rhs with quotes
        items.append(f'{lhs}="{rhs}"')
modified_s = ",".join(items)
which gives
modified_s = 'speed=36.2448,course=331.35,gps_time="2021-11-22T00:43:22.678Z",fix=1,message_source="device_gps",period_km=0.436,location="Middle of no where",x=3.2'
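The same loop can be wrapped in a reusable function. This is only a sketch: the name quote_non_numeric is hypothetical, and it assumes every comma-separated item contains an '='. Note that float() also accepts strings such as 'nan', 'inf', and '1e5', so those values are treated as numeric and left unquoted; whether that is what you want depends on your data.

```python
def quote_non_numeric(s):
    """Quote every value in a 'key=value,...' string that float() cannot parse.

    Hypothetical helper; assumes each comma-separated item contains an '='.
    """
    items = []
    for item in s.split(','):
        lhs, rhs = item.split('=', 1)
        try:
            float(rhs)
            # rhs parses as a float, so keep the item unchanged
            items.append(item)
        except ValueError:
            # rhs is not numeric, so wrap it in double quotes
            items.append(f'{lhs}="{rhs}"')
    return ','.join(items)


print(quote_non_numeric('fix=1,location=Middle of no where,x=3.2'))
# fix=1,location="Middle of no where",x=3.2
print(quote_non_numeric('a=nan,b=1e5,c=abc'))
# a=nan,b=1e5,c="abc"
```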
CodePudding user response:
You might use a pattern with a capture group, and in the replacement use the capture group between double quotes.
=(?!\d+(?:\.\d+)?(?:,|$))([^",\n]+)
The pattern matches:
=  Match literally
(?!  Negative lookahead, assert that what is directly to the right is not
\d+(?:\.\d+)?(?:,|$)  Match 1+ digits with an optional decimal part, followed by either a comma or the end of the string
)  Close lookahead
(  Capture group 1
[^",\n]+  Match 1+ times any character except a double quote, a comma, or a newline
)  Close group 1
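The breakdown above can also be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace outside character classes and allows inline comments. This is a sketch of the same pattern in a different layout; the variable name pattern is just for illustration.

```python
import re

pattern = re.compile(r'''
    =                   # a literal equals sign
    (?!                 # negative lookahead: what follows must not be
        \d+(?:\.\d+)?   #   digits with an optional decimal part
        (?:,|$)         #   followed by a comma or the end of the string
    )
    ([^",\n]+)          # group 1: one or more characters except double quote, comma, newline
''', re.VERBOSE)

print(pattern.sub(r'="\1"', 'fix=1,message_source=device_gps'))
# fix=1,message_source="device_gps"
```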
For example
import re
regex = r"=(?!\d+(?:\.\d+)?(?:,|$))([^\",\n]+)"
s = 'speed=36.2448,course=331.35,gps_time=2021-11-22T00:43:22.678Z,fix=1,message_source=device_gps,period_km=0.436,location=Middle of no where,x=3.2'
result = re.sub(regex, r'="\1"', s)
print(result)
Output
speed=36.2448,course=331.35,gps_time="2021-11-22T00:43:22.678Z",fix=1,message_source="device_gps",period_km=0.436,location="Middle of no where",x=3.2
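One thing to be aware of: the lookahead only recognises unsigned numbers, so a value such as -3.2 would be quoted. The question's data has no signed values, so the following is only a sketch under the assumption that they might appear; it adds an optional sign to the lookahead. The name signed_regex and the sample string are made up for illustration.

```python
import re

# Assumption: values may carry a leading + or -, which the original data does not show.
signed_regex = r'=(?![+-]?\d+(?:\.\d+)?(?:,|$))([^",\n]+)'

s = 'offset=-3.2,name=abc'
print(re.sub(signed_regex, r'="\1"', s))
# offset=-3.2,name="abc"
```

Because the character class excludes double quotes, values that are already quoted are not matched, so running the substitution a second time leaves the string unchanged.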