I am trying to write a script to automate some work with our CAT tool (Memsource). To this end, I need to upload some files using the API.
I rely on the Memsource API documentation available here: https://cloud.memsource.com/web/docs/api#operation/createJob
I wrote a short piece of code to test file uploading before making it async, and I ran into a serious problem: text files are uploaded, but after uploading, the body of the text contains some extra lines:
--4002a5507da490554ad71ce8591ccf69
Content-Disposition: form-data; name="file"; filename="test.txt"
I also tried to upload a DOCX file, but it cannot even be opened in the Memsource online editor. I guess the content is modified along the way, but I am unable to find where...
The code responsible for the upload is as follows:
import json
import os

import requests

def test_upload(self):
    # Assemble "Memsource" header as mentioned in the API docs
    Memsource_header = {
        "targetLangs": ["pl"],
    }
    # Open the file to be uploaded and extract file name
    f = open("Own/TMS_CAT/test.txt", "rb")
    f_name = os.path.basename(f.name)
    # Assemble the request header
    header = {
        "Memsource": json.dumps(Memsource_header),
        "Content-Disposition": f'attachment; filename="{f_name}"',
        "Authorization": f"ApiToken {self.authToken}",
        "Content-Type": "application/octet-stream; charset=utf-8",
    }
    # Make POST request and catch results
    file = {"file": f}
    req = requests.post(
        "https://cloud.memsource.com/web/api2/v1/projects/{project-id}/jobs",
        headers=header,
        files=file,
    )
    print(req.request.headers)
    print(req.json())
The request headers:
{
"User-Agent":"python-requests/2.27.1",
"Accept-Encoding":"gzip, deflate",
"Accept":"*/*",
"Connection":"keep-alive",
"Memsource":"{\"targetLangs\": [\"pl\"]}",
"Content-Disposition":"attachment; filename=\"test.txt\"",
"Authorization":"ApiToken {secret}",
"Content-Type":"application/octet-stream; charset=utf-8",
"Content-Length":"2902"
}
And the response from Memsource:
{
"asyncRequest":{
"action":"IMPORT_JOB",
"dateCreated":"2022-02-22T18:36:30+0000",
"id":"{id}"
},
"jobs":[
{
"workflowLevel":1,
"workflowStep":{
"uid":"{uid}",
"order":2,
"id":"{id}",
"name":"Tra"
},
"imported":false,
"dateCreated":"2022-02-22T18:36:30+0000",
"notificationIntervalInMinutes":-1,
"updateSourceDate":"None",
"dateDue":"2022-10-10T12:00:00+0000",
"targetLang":"pl",
"continuous":false,
"jobAssignedEmailTemplate":"None",
"uid":"{id}",
"status":"NEW",
"filename":"test.txt",
"sourceFileUid":"{id}",
"providers":[
]
}
],
"unsupportedFiles":[
]
}
Both look okay to me...
I would appreciate any suggestions on how to get this working! :-)
Answer:
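I managed to fix this problem. I noticed that requests encodes anything passed in the files parameter as multipart/form-data, so it adds a boundary line and part headers to the body of the request, around the content of the file. Memsource then stored those extra lines as part of the file.
I simply got rid of that by passing the open file in the data parameter instead, and changed the code as follows: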
# Open the file to be uploaded and extract file name
with open("/file.ext", "rb") as f:
    f_name = os.path.basename(f.name)
    # Assemble the request header
    header = {
        "Memsource": json.dumps(Memsource_header),
        "Content-Disposition": f'attachment; filename="{f_name}"',
        "Authorization": f"ApiToken {self.authToken}",
        "Content-Type": "application/octet-stream; charset=utf-8",
    }
    # Send the raw file content as the request body (no multipart wrapping)
    req = requests.post(
        "https://cloud.memsource.com/web/api2/v1/projects/{project-id}/jobs",
        headers=header,
        data=f,
    )
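
To see the difference without sending anything to Memsource, one option is to let requests prepare both variants and compare the bodies. This is only a minimal sketch: the URL, the payload, and the prepared_body helper are made up for illustration, and no network call is made.

```python
import io

import requests

# Hypothetical URL, used only to build the request; nothing is sent.
URL = "https://example.invalid/upload"
payload = b"Hello, Memsource!"


def prepared_body(**kwargs) -> bytes:
    """Prepare a POST request without sending it and return its body as bytes."""
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    body = prepared.body
    # With data=<file object> the body stays a stream, so read it to compare.
    return body.read() if hasattr(body, "read") else body


# files= wraps the content in a multipart/form-data envelope.
multipart_body = prepared_body(files={"file": ("test.txt", io.BytesIO(payload))})
# data= with a file-like object sends the content unchanged.
raw_body = prepared_body(data=io.BytesIO(payload))

print(multipart_body.decode())  # boundary lines and part headers around the payload
print(raw_body.decode())        # the payload only
```

The first body contains the same kind of boundary and Content-Disposition: form-data lines that showed up in the uploaded text file, while the second is just the file content. As far as I can tell, setting Content-Type to application/octet-stream by hand in the original code also stopped requests from announcing the body as multipart, which is probably why the server stored the whole envelope as the file.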