Home > Net >  Uploading Office Open XML files using API POST method in Python
Uploading Office Open XML files using API POST method in Python

Time:03-01

I am trying to write a script to help me automate some work with our CAT tool (Memsource). To this end, I need to upload some files using API.

I rely on Memsource API documentation available here: https://cloud.memsource.com/web/docs/api#operation/createJob

I wrote a short code to test file uploading before moving to making it async, and I have some serious problem: text files are uploaded correctly, although the body of the text contains some additions after uploading:

--4002a5507da490554ad71ce8591ccf69    
Content-Disposition: form-data; name="file"; filename=“test.txt"

I also tried to upload DOCX file, but it cannot be even opened in Memsource online editor — I guess the content is modified along the way, but I am unable to find where...

The code responsible for the upload is as follows:

def test_upload(self):
    # Assemble "Memsource" header as mentioned in the API docs
    Memsource_header = {
        "targetLangs": ["pl"],
    }

    # Open the file to be uploaded and extract file name
    f = open("Own/TMS_CAT/test.txt", "rb")
    f_name = os.path.basename(f.name)

    # Assemble the request header
    header = {
        "Memsource": json.dumps(Memsource_header),
        "Content-Disposition": f'attachment; filename="{f_name}"',
        "Authorization": f"ApiToken {self.authToken}",
        "Content-Type": "application/octet-stream; charset=utf-8",
    }

    # Make POST request and catch results
    file = {"file": f}

    req = requests.post(
        "https://cloud.memsource.com/web/api2/v1/projects/{project-id}/jobs",
        headers=header,
        files=file,
    )
    print(req.request.headers)
    print(req.json())

The request header:

{
   "User-Agent":"python-requests/2.27.1",
   "Accept-Encoding":"gzip, deflate",
   "Accept":"*/*",
   "Connection":"keep-alive",
   "Memsource":"{\"targetLangs\": [\"pl\"]}",
   "Content-Disposition":"attachment; filename=\"test.txt\"",
   "Authorization":"ApiToken {secret}",
   "Content-Type":"application/octet-stream; charset=utf-8",
   "Content-Length":"2902"
}

And the response from Memsource:

    {
   "asyncRequest":{
      "action":"IMPORT_JOB",
      "dateCreated":"2022-02-22T18:36:30 0000",
      "id":"{id}"
   },
   "jobs":[
      {
         "workflowLevel":1,
         "workflowStep":{
            "uid":"{uid}",
            "order":2,
            "id":"{id}",
            "name":"Tra"
         },
         "imported":false,
         "dateCreated":"2022-02-22T18:36:30 0000",
         "notificationIntervalInMinutes":-1,
         "updateSourceDate":"None",
         "dateDue":"2022-10-10T12:00:00 0000",
         "targetLang":"pl",
         "continuous":false,
         "jobAssignedEmailTemplate":"None",
         "uid":"{id}",
         "status":"NEW",
         "filename":"test.txt",
         "sourceFileUid":"{id}",
         "providers":[
            
         ]
      }
   ],
   "unsupportedFiles":[
      
   ]
}

both look okay to me...

I will appreciate any suggestions on how to get this thing working! :-)

CodePudding user response:

I managed to fix this problem — noticed that requests are adding some limited headers to the body of the request, i.e., the content of the file passed in files parameter.

I simply got rid of that and changed the code as follows:

# Open the file to be uploaded and extract file name
        with open(
            "/file.ext", "rb"
        ) as f:
            f_name = os.path.basename(f.name)

            # Assemble the request header
            header = {
                "Memsource": json.dumps(Memsource_header),
                "Content-Disposition": f'attachment; filename="{f_name}"',
                "Authorization": f"ApiToken {self.authToken}",
                "Content-Type": "application/octet-stream; charset=utf-8",
            }

            req = requests.post(
                "https://cloud.memsource.com/web/api2/v1/projects/{project-id}/jobs",
                headers=header,
                data=f,
            )
  • Related