Encoding problem while processing a multipart request on Indy HTTP server

Time:12-01

I have a web server based on TIdHTTPServer, built in Delphi Sydney. From a webpage I'm receiving the following multipart/form-data post stream:

-----------------------------16857441221270830881532229640 
Content-Disposition: form-data; name="d"

83AAAFUaVVs4Q07z
-----------------------------16857441221270830881532229640 
Content-Disposition: form-data; name="dir"

Upload
-----------------------------16857441221270830881532229640 
Content-Disposition: form-data; name="file_name"; filename="česká tečka.png"
Content-Type: image/png

PNG_DATA    
-----------------------------16857441221270830881532229640--

The problem is that the text parts are not received correctly. I read "Indy MIME decoding of Multipart/Form-Data Requests returns trailing CR/LF" and changed the transfer encoding to 8bit, which helps to receive the file correctly, but the received values are still wrong (dir should be Upload and the filename should be česká tečka.png). This is the output I get:

d=83AAAFUaVVs4Q07z
dir=UploadW
??esk?? te??ka.png 75

To demonstrate the issue I simplified my code to a console app (please note that the MIME.txt file contains the same as is in post stream above):

program MIMEMultiPartTest;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.Classes, System.SysUtils,
  IdGlobal, IdCoder, IdMessage, IdMessageCoder, IdGlobalProtocols, IdCoderMIME, IdMessageCoderMIME,
  IdCoderQuotedPrintable, IdCoderBinHex4;


procedure ProcessAttachmentPart(var Decoder: TIdMessageDecoder; var MsgEnd: Boolean);
var
  MS: TMemoryStream;
  Name: string;
  Value: string;
  NewDecoder: TIdMessageDecoder;
begin
  MS := TMemoryStream.Create;
  try
    // http://stackoverflow.com/questions/27257577/indy-mime-decoding-of-multipart-form-data-requests-returns-trailing-cr-lf
    TIdMessageDecoderMIME(Decoder).Headers.Values['Content-Transfer-Encoding'] := '8bit';
    TIdMessageDecoderMIME(Decoder).BodyEncoded := False;
    NewDecoder := Decoder.ReadBody(MS, MsgEnd);
    MS.Position := 0; // necessary?
    if Decoder.Filename <> EmptyStr then // it is an attachment
    begin
      try
        Writeln(Decoder.Filename + ' ' + IntToStr(MS.Size));
      except
        FreeAndNil(NewDecoder);
        Writeln('Error processing MIME');
      end;
    end
    else // it is a parameter
    begin
      Name := ExtractHeaderSubItem(Decoder.Headers.Text, 'name', QuoteHTTP);
      if Name <> EmptyStr then
      begin
        Value := string(PAnsiChar(MS.Memory));
        try
          Writeln(Name + '=' + Value);
        except
          FreeAndNil(NewDecoder);
          Writeln('Error processing MIME');
        end;
      end;
    end;
    Decoder.Free;
    Decoder := NewDecoder;
  finally
    MS.Free;
  end;
end;

function ProcessMultiPart(const ContentType: string; Stream: TStream): Boolean;
var
  Boundary: string;
  BoundaryStart: string;
  BoundaryEnd: string;
  Decoder: TIdMessageDecoder;
  Line: string;
  BoundaryFound: Boolean;
  IsStartBoundary: Boolean;
  MsgEnd: Boolean;
begin
  Result := False;
  Boundary := ExtractHeaderSubItem('multipart/form-data; boundary=---------------------------16857441221270830881532229640', 'boundary', QuoteHTTP);
  if Boundary <> EmptyStr then
  begin
    BoundaryStart := '--' + Boundary;
    BoundaryEnd := BoundaryStart + '--';
    Decoder := TIdMessageDecoderMIME.Create(nil);
    try
      TIdMessageDecoderMIME(Decoder).MIMEBoundary := Boundary;
      Decoder.SourceStream := Stream;
      Decoder.FreeSourceStream := False;
      BoundaryFound := False;
      IsStartBoundary := False;
      repeat
        Line := ReadLnFromStream(Stream, -1, True);
        if Line = BoundaryStart then
        begin
          BoundaryFound := True;
          IsStartBoundary := True;
        end
        else
        begin
          if Line = BoundaryEnd then
            BoundaryFound := True;
        end;
      until BoundaryFound;
      if BoundaryFound and IsStartBoundary then
      begin
        MsgEnd := False;
        repeat
          TIdMessageDecoderMIME(Decoder).MIMEBoundary := Boundary;
          Decoder.SourceStream := Stream;
          Decoder.FreeSourceStream := False;
          Decoder.ReadHeader;
          case Decoder.PartType of
            mcptText,
            mcptAttachment:
              begin
                ProcessAttachmentPart(Decoder, MsgEnd);
              end;
            mcptIgnore:
              begin
                Decoder.Free;
                Decoder := TIdMessageDecoderMIME.Create(nil);
              end;
            mcptEOF:
              begin
                Decoder.Free;
                MsgEnd := True;
              end;
          end;
        until (Decoder = nil) or MsgEnd;
        Result := True;
      end
    finally
      Decoder.Free;
    end;
  end;
end;

var
  Stream: TMemoryStream;
begin
  Stream := TMemoryStream.Create;
  try
    Stream.LoadFromFile('MIME.txt');
    ProcessMultiPart('multipart/form-data; boundary=---------------------------16857441221270830881532229640', Stream);
  finally
    Stream.Free;
  end;
  Readln;
end.

Could someone help me find what is wrong with my code? Thank you.

CodePudding user response:

The stream data you have shown is malformed, as most of the required line breaks are missing, as is the ending MIME boundary after the PNG data. The data should look more like this:

-----------------------------16857441221270830881532229640
Content-Disposition: form-data; name="d"

83AAAFUaVVs4Q07z
-----------------------------16857441221270830881532229640
Content-Disposition: form-data; name="dir"

Upload
-----------------------------16857441221270830881532229640
Content-Disposition: form-data; name="file_name"; filename="ÄŤeská teÄŤka.png"
Content-Type: image/png

[PNG_DATA]
-----------------------------16857441221270830881532229640--

Your call to ExtractHeaderSubItem() in ProcessMultiPart() is wrong: it needs to pass in the ContentType string parameter, not a hard-coded string literal.

Your call to ExtractHeaderSubItem() in ProcessAttachmentPart() is also wrong: it needs to pass in only the content of the Content-Disposition header, not the entire Headers list.
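As a minimal sketch of the two corrected calls, the helpers below wrap them in a small unit. The unit name and the function names GetBoundary and GetPartName are hypothetical and used only for illustration; the sketch assumes that the decoder's Headers list can be queried with Values['Content-Disposition'] in the same way the question's code already uses Values['Content-Transfer-Encoding'].

```delphi
unit MultipartHelpers; // hypothetical unit name, for illustration only

interface

uses
  IdGlobalProtocols, IdMessageCoder;

function GetBoundary(const ContentType: string): string;
function GetPartName(Decoder: TIdMessageDecoder): string;

implementation

// The boundary is taken from the Content-Type value that was actually
// passed in, not from a hard-coded literal.
function GetBoundary(const ContentType: string): string;
begin
  Result := ExtractHeaderSubItem(ContentType, 'boundary', QuoteHTTP);
end;

// The field name is taken from the Content-Disposition header of this
// part only (assumption: Headers.Values returns that header's value).
function GetPartName(Decoder: TIdMessageDecoder): string;
begin
  Result := ExtractHeaderSubItem(
    Decoder.Headers.Values['Content-Disposition'], 'name', QuoteHTTP);
end;

end.
```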

Regarding the dir MIME part, there is no reason why Indy should be returning the body data as UploadW instead of Upload. I don't know where that W is coming from, you are going to have to debug that one for yourself.
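One possible explanation, offered as an assumption rather than something confirmed in this thread: the question's code converts the body with string(PAnsiChar(MS.Memory)). A TMemoryStream buffer is not null-terminated, so that cast keeps reading past MS.Size until it happens to hit a zero byte, which could pick up a stray character such as the W. The sketch below converts exactly MS.Size bytes instead. StreamToString is a hypothetical helper name, and treating the body as UTF-8 is also an assumption.

```delphi
program BodyToStringDemo;

{$APPTYPE CONSOLE}

uses
  System.Classes, System.SysUtils;

// Hypothetical helper: decodes exactly MS.Size bytes as UTF-8 instead of
// relying on a terminating #0 that the stream buffer does not guarantee.
function StreamToString(MS: TMemoryStream): string;
var
  Bytes: TBytes;
  Len: Integer;
begin
  Len := Integer(MS.Size);
  SetLength(Bytes, Len);
  if Len > 0 then
    Move(MS.Memory^, Bytes[0], Len);
  Result := TEncoding.UTF8.GetString(Bytes);
end;

var
  MS: TMemoryStream;
  Raw: TBytes;
begin
  MS := TMemoryStream.Create;
  try
    // Simulate a decoded part body holding the text "Upload".
    Raw := TEncoding.UTF8.GetBytes('Upload');
    MS.WriteBuffer(Raw[0], Length(Raw));
    Writeln(StreamToString(MS));
  finally
    MS.Free;
  end;
end.
```

In ProcessAttachmentPart() this would replace the PAnsiChar cast, for example Value := StreamToString(MS).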

But, regarding Decoder.FileName, that value is not affected by the Content-Transfer-Encoding header at all. MIME headers simply do not allow unencoded Unicode characters. Currently, Indy's MIME decoder supports RFC 2047-style encodings for Unicode characters in headers, per RFC 7578 Section 5.1.3, but your stream data is not using that format. It looks like your data is using raw UTF-8 octets (which 5.1.3 also mentions as a possible encoding, but Indy does not currently support that). However, ÄŤeská teÄŤka.png is not the correct UTF-8 form of česká tečka.png. It looks like that data has possibly been double-encoded (i.e., česká tečka.png was UTF-8 encoded, and then the resulting bytes were UTF-8 encoded again). That is not a fault of Indy; your source data is just wrong to begin with. So, you are going to have to decode the original filename value manually.
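To illustrate what an RFC 2047-style encoded filename looks like, here is a small sketch. It assumes that Indy's IdCoderHeader unit provides DecodeHeader(const Header: string): string. The encoded-word is hand-built for this example and does not come from the request in the question.

```delphi
program EncodedWordDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, IdCoderHeader;

var
  FileName, Expected: string;
begin
  // Hand-built Q-encoded word: =C4=8D is U+010D, =C3=A1 is U+00E1,
  // and "_" stands for a space.
  FileName := DecodeHeader('=?UTF-8?Q?=C4=8Desk=C3=A1_te=C4=8Dka.png?=');

  // The expected name is written with character codes so that the
  // encoding of the source file does not matter.
  Expected := #$010D'esk'#$00E1' te'#$010D'ka.png';

  // Compare in code rather than relying on the console code page.
  Writeln(FileName = Expected);
end.
```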

CodePudding user response:

Nowadays the filename parameter should only be added as a fallback, while filename* should be added to state clearly which text encoding the filename uses. Otherwise each client has to guess the encoding, and that guess may be wrong.

  • RFC 5987 §3.2 defines the format of that filename* parameter:

    charset ' [ language ] ' value-chars

    ...where:

    charset can be UTF-8 or ISO-8859-1 or any MIME-charset

    ...and the language is optional.

  • RFC 6266 §4.3 defines that filename* should be used and comes up with examples in §5:

    Content-Disposition: attachment; filename="EURO rates"; filename*=utf-8''%e2%82%ac%20rates
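A minimal sketch of decoding such a filename* value by hand, following the charset ' [ language ] ' value-chars layout described above. DecodeExtValue is a hypothetical helper and not part of Indy. The sample value utf-8''%e2%82%ac%20rates is the one used in the RFC 6266 Section 5 example, and the sketch does no validation beyond locating the two apostrophes.

```delphi
program ExtValueDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: decodes an RFC 5987 ext-value of the form
// charset'[language]'value-chars, where value-chars is percent-encoded.
function DecodeExtValue(const S: string): string;
var
  P1, P2, I, Len: Integer;
  Charset, Rest: string;
  Bytes: TBytes;
  Enc: TEncoding;
begin
  P1 := Pos('''', S);
  P2 := Pos('''', S, P1 + 1);
  if (P1 = 0) or (P2 = 0) then
    raise Exception.Create('Not an RFC 5987 ext-value');
  Charset := Copy(S, 1, P1 - 1);
  Rest := Copy(S, P2 + 1, MaxInt);

  // Turn %XX escapes and literal ASCII characters into raw bytes.
  SetLength(Bytes, Length(Rest));
  Len := 0;
  I := 1;
  while I <= Length(Rest) do
  begin
    if (Rest[I] = '%') and (I + 2 <= Length(Rest)) then
    begin
      Bytes[Len] := Byte(StrToInt('$' + Copy(Rest, I + 1, 2)));
      Inc(I, 3);
    end
    else
    begin
      Bytes[Len] := Byte(Ord(Rest[I]));
      Inc(I);
    end;
    Inc(Len);
  end;
  SetLength(Bytes, Len);

  // Interpret the bytes in the declared charset.
  if SameText(Charset, 'UTF-8') then
    Result := TEncoding.UTF8.GetString(Bytes)
  else
  begin
    Enc := TEncoding.GetEncoding(Charset);
    try
      Result := Enc.GetString(Bytes);
    finally
      Enc.Free;
    end;
  end;
end;

var
  Name: string;
begin
  Name := DecodeExtValue('utf-8''''%e2%82%ac%20rates');
  // The bytes E2 82 AC are the euro sign (U+20AC) and %20 is a space.
  Writeln(Name = #$20AC' rates');
end.
```

The comparison is done in code because the console may not be able to display the euro sign correctly.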