Encoding json to bytes

Time:01-07

I have a problem that bites its own tail. In the code below I am creating a JSON object containing file paths, and these paths contain special characters. Encoding the object produces Unicode escape sequences, and the server receiving the JSON cannot read the paths. If I try to encode the strings to bytes before building the JSON, the json library can't serialize the content.

import urllib.request
import json

SERVER_URL = "The_server_url:80"
REPOSITORY = "Dashboards"
WORKSPACE = "workspace.fmw"
TOKEN = "Here_goes_the_token"

# Set up the published parameters as object
params = {
    "publishedParameters" : [
        {
            "name" : "NuvAfvPath",
            "value" : 'T:/Projects/362/2021/3622100225 - Høje Gladsaxe Parken - naturlig hydrologi/Stofberegninger/Hestefolden/Hestefolden_afvandingskort_nuværende.tif'
        },
        {
            "name" : "DestDataset_XLSXW_7",
            "value" : '//corp.pbwan.net/dk/Projects/362/2021/3622100225 - Høje Gladsaxe Parken - naturlig hydrologi/Stofberegninger/20230105'
            
        },
        {
            "name" : "ProjAfvPath",
            "value": 'T:/Projects/362/2021/3622100225 - Høje Gladsaxe Parken - naturlig hydrologi/Stofberegninger/Hestefolden/Hestefolden_afvandingskort_projekt.tif'
        },
        {
            "name" : "ProjOmrPath",
            "value": 'T:/Projects/362/2021/3622100225 - Høje Gladsaxe Parken - naturlig hydrologi/Stofberegninger/Hestefolden/Projektområde_Hestefolden.shp'
        }
    ]
}

url = '{0}/fmerest/v2/transformations/commands/submit/{1}/{2}'.format(SERVER_URL, REPOSITORY, WORKSPACE)

# Request constructor expects bytes, so we need to encode the string

body = json.dumps(params).encode('utf-8')

headers = {
    'Content-Type' : 'application/json',
    'Accept' : 'application/json',
    'Authorization' : 'fmetoken token={0}'.format(TOKEN)
}
print(url)
print(body)
print(headers)

The print of the request body:

b'{"publishedParameters": [{"name": "NuvAfvPath", "value": "T:/Projects/362/2021/3622100225 - H\\u00f8je Gladsaxe Parken - naturlig hydrologi/Stofberegninger/Hestefolden/Hestefolden_afvandingskort_nuv\\u00e6rende.tif"}, {"name": "DestDataset_XLSXW_7", "value": "//corp.pbwan.net/dk/Projects/362/2021/3622100225 - H\\u00f8je Gladsaxe Parken - naturlig hydrologi/Stofberegninger/20230105"}, {"name": "ProjAfvPath", "value": "T:/Projects/362/2021/3622100225 - H\\u00f8je Gladsaxe Parken - naturlig hydrologi/Stofberegninger/Hestefolden/Hestefolden_afvandingskort_projekt.tif"}, {"name": "ProjOmrPath", "value": "T:/Projects/362/2021/3622100225 - H\\u00f8je Gladsaxe Parken - naturlig hydrologi/Stofberegninger/Hestefolden/Projektomr\\u00e5de_Hestefolden.shp"}]}'

Declaring the source encoding at the top of the script doesn't change anything.

CodePudding user response:

Solution 1. URL-encode only the "value" properties, then URL-decode them at the receiver. Also try putting double quotes around the path values.

Solution 2 (quick, ad hoc). Base64-encode the "value" properties, then Base64-decode them at the receiver.
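Both solutions can be sketched roughly as follows. This is only an illustration: the shortened path and the variable names are made up for the example, and it assumes the receiving side can be changed to decode the values, which the question does not confirm.

```python
import base64
import json
from urllib.parse import quote, unquote

# Shortened example value; the real paths are longer.
path = "T:/Projects/362/2021/3622100225 - Høje Gladsaxe Parken - naturlig hydrologi"

# Solution 1: percent-encode the value. Non-ASCII characters are first
# encoded as UTF-8 and then written as %XX sequences.
quoted = quote(path, safe="/:")

# Solution 2: Base64-encode the UTF-8 bytes of the value.
b64 = base64.b64encode(path.encode("utf-8")).decode("ascii")

# Either string is pure ASCII, so json.dumps has nothing left to escape.
body = json.dumps({"name": "NuvAfvPath", "value": quoted}).encode("utf-8")

# What a cooperating receiver would have to do to recover the path.
restored_from_quote = unquote(quoted)
restored_from_b64 = base64.b64decode(b64).decode("utf-8")
```

Note that both variants only move the problem: the receiver gets the original text back only if it performs the matching decode step.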

I hope this helps! Cheers!

CodePudding user response:

File paths depend on the specific file system and its encoding. "ø" may be encoded in UTF-8, UTF-16, or some ANSI encoding in the actual file system. You have two options:

  1. You know the required encoding and can decode the JSON to the character "ø", then encode this character to the correct encoding used in the file system when trying to access the file. (This may or may not happen transparently depending on what API you use to access the file.)
  2. You read the file path from the file system in its binary encoding, and never treat the path as text but keep it as a binary blob. Since JSON cannot represent binary data, you need to encode the binary path (with Base64, for example) and store the Base64-encoded blob as the path in JSON.
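The two options can be sketched as below, as a rough illustration rather than a definitive fix. The shortened path is made up for the example, and whether the server accepts either form is an assumption. For option 1 it helps to know that the `\u00e5` escapes in the printed body are valid JSON and decode back to the original character, and that `json.dumps` emits the characters literally when called with `ensure_ascii=False`.

```python
import base64
import json

# Shortened example value; the real paths are longer.
path = "T:/Projects/362/2021/Projektområde_Hestefolden.shp"

# Option 1: keep the path as text.
# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps({"value": path}).encode("utf-8")
# ensure_ascii=False: characters are written literally, as UTF-8 bytes.
raw = json.dumps({"value": path}, ensure_ascii=False).encode("utf-8")

# Both bodies are valid JSON and decode to the same text.
same_text = json.loads(escaped) == json.loads(raw) == {"value": path}

# Option 2: keep the path as bytes and ship it Base64-encoded.
# path_bytes stands in for bytes obtained from the file system,
# e.g. from os.listdir(b".").
path_bytes = path.encode("utf-8")
blob = base64.b64encode(path_bytes).decode("ascii")
body = json.dumps({"value": blob}).encode("utf-8")

# The receiver reverses the Base64 step to get the exact bytes back.
restored_bytes = base64.b64decode(json.loads(body)["value"])
```

If the receiver parses the body with a conforming JSON parser, both forms in option 1 yield the same string, so the escapes alone should not be what makes the path unreadable.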