I am receiving binary data as a string, and it can be in different file formats. How can I save it to a "real" file?
I tried using BinaryWriter, but the resulting file is not correct: I get an encoding error when opening it, even though I do set the encoding.
https://docs.microsoft.com/en-us/dotnet/api/system.io.binarywriter?redirectedfrom=MSDN&view=net-6.0
I can provide code later if needed, but I am not sure if BinaryWriter is the correct class for this.
Below is what the binary string looks like for a Word document (truncated):
------=_Part_174495_1036280534.1637933726817
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Content-Transfer-Encoding: binary
Content-Disposition: attachment; filename="Dummy_attachment_Ariba.docx"
Content-ID: <000D3A2BB3F41EEC928A7BA5E05A5B2C>
PK ! ?b\?x ? [Content_Types].xml ?(? !jgs6?, ??v????Sz???*a???? ????b4?y??4?m????q?J3??R?p?Hj?^?w? ~=?p?,?? 6=@!V??-??I?????????h)??|m???I?H??K??50~4??|??^h4A H?"?(??o\P\9?*I???9??BKh???NB?4??dm?????3?????D??8w"l`??'?N??9????u'X????s?D17????M?sx6???T$uN??6[?õ??R?ta??I??d}????
?o??*? ??m????Of? ?? PK ! ?U~? ? _rels/.rels ?(? ??MK1???!?;?*"??^D?Md?C2????????(?.??3y??3C??? ?4xW??(A??????yX?JB???Wp????b??#InJ????*?E?b?=[J???M?%???a ??????9m?.?????3???Y? ?? PK ! ??f1? ?b?R???1?EF7Z?n???hY?jy??#1'?<???7
word/document.xml??[o?0??'?? ?[CBsAM???=L???yr?V?E????C?Tt?/??|????????I??????? 2a"]??~~?????X$8??.?#5?????"N$?s*a?B???Y?b??(??3???[{M$Gr?e??B???0(??????8`?p?-? e?????e?Cn???? D8
???U r^u@? x?!?#??di?%M???]?l?SN?[?RQ?[?9???)?X???
?
?'??^?????">?_5??????5?????:e?H?r!??jv8J???????Z?Pa????iU???q???W??O? ??F^?=?P???A?9Kn?? ??`BX??U6!?<?z??#o?z??U??{????h??_?[????w???3?Vp$pK??x??GPC??W???ªxn??Kx*ldrt???????i4~??v???h~?oWt???=?)1k?]5?Hp???G??y=?N?U~??@l??j?????b???{?6??J?J??????,W?V`Y??$?`?????"i$ ????n??_B???.&85?p??"??2*?*???J8??(*=?,?l??Hk%o?9??f'?N???n??g?to?nG??|? ?d?axW>iW=q?]3K?????????
9 word/_rels/document.xml.rels ?(? ???N?0??H???w?@A?N/?R??M6?"YG???c??PE=??c???Zu??@?C
?(?????J?[??y?XS?[C?`@???j???f???w»?SP3?OR???N???H??4???G[?^??B???SHO<YP`??-?l??oS?M??&?wH|&B~??????BV0#?<?CH????
CodePudding user response:
BinaryWriter is almost never the right tool for any job - it doesn't do what people usually think. What you probably want here is simply: a Stream (i.e. File.Create(...)). You would obtain the data from ... wherever it is coming from, and use the various Write APIs to append it, usually in chunks.
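As a minimal sketch of the stream approach described above, assuming the incoming data is available as a readable Stream. The class name BinarySaver, the method name SaveToFile, and the 81920-byte buffer size are illustrative choices, not something taken from the question:

```csharp
using System;
using System.IO;

public static class BinarySaver
{
    // Copies everything from 'source' into a new file at 'path', chunk by chunk,
    // without ever converting the bytes to string or char[].
    // Returns the number of bytes written.
    public static long SaveToFile(Stream source, string path)
    {
        var buffer = new byte[81920]; // chunk size is an arbitrary choice
        long total = 0;

        using (FileStream target = File.Create(path))
        {
            int read;
            // Read returns 0 once the source is exhausted.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                target.Write(buffer, 0, read);
                total += read;
            }
        }

        return total;
    }
}
```

If no byte count is needed, source.CopyTo(target) performs the same loop internally.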
If the data is not known to be encoded text, then any moment you have string or char[] (or similar) data: you've corrupted it, so: don't do that. Stay purely in binary.
If the data is known to be encoded text, but you don't know the precise encoding used, then frankly: treat it as binary.
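To illustrate why passing arbitrary bytes through a string is lossy, here is a small sketch that decodes a byte array as UTF-8 and encodes it again. The class and method names are hypothetical, and UTF-8 is only one example encoding; which encoding the question's code actually used is not known:

```csharp
using System;
using System.Linq;
using System.Text;

public static class RoundTripDemo
{
    // Returns true only if decoding 'original' as UTF-8 text and encoding
    // that text again yields exactly the same bytes.
    public static bool SurvivesUtf8RoundTrip(byte[] original)
    {
        string asText = Encoding.UTF8.GetString(original); // invalid bytes become U+FFFD
        byte[] back = Encoding.UTF8.GetBytes(asText);
        return original.SequenceEqual(back);
    }
}
```

Plain ASCII bytes survive the round trip, but a payload that contains bytes such as 0xFF or a lone 0x80 (common in zip-based formats like .docx) does not, because those bytes are not valid UTF-8 and are replaced during decoding.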
Anything more than that: would require specific examples of what you're doing.
CodePudding user response:
The real question is which encoding it is; maybe the data is already corrupted. This may be the wrong way to go about it, but since it is a Word file, I would try to brute-force it: write the data out using every available encoding, then try to open each file with a Word API. Maybe one will work, maybe they will all fail, but it wouldn't take long to find out.
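using System;
using System.IO;
using System.Linq;
using System.Text;

// 'data' is assumed to be the received string from the question.
var encodings = Encoding.GetEncodings().ToList();
// Write one candidate file per encoding known to the runtime.
encodings.ForEach(encoding =>
{
    File.WriteAllBytes($"{encoding.Name}.docx", encoding.GetEncoding().GetBytes(data));
});
// Try to open each candidate; the ones that open without an exception are plausible.
encodings.ForEach(encoding =>
{
    try
    {
        /* to do: open $"{encoding.Name}.docx" with a Word API */
        Console.WriteLine($"{encoding.Name} works");
    }
    catch { /* this candidate could not be opened; move on to the next */ }
});
Console.WriteLine("finished");
Console.ReadKey();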
If you have control over the sending side, use Base64; it has usually worked fine for me for HTTP requests. But if I understand correctly, that is not the case here.
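For the case where the sender can be changed, a minimal sketch of the Base64 approach might look like this. The class name Base64Transport and its method names are hypothetical; Convert.ToBase64String and Convert.FromBase64String are the standard .NET calls:

```csharp
using System;

public static class Base64Transport
{
    // Sender side: turn raw bytes into text that is safe to embed in a text-based message.
    public static string Encode(byte[] payload) => Convert.ToBase64String(payload);

    // Receiver side: recover the exact original bytes from the Base64 text.
    public static byte[] Decode(string text) => Convert.FromBase64String(text);
}
```

On the receiving end, the decoded bytes can be written directly with File.WriteAllBytes. The trade-off is size: Base64 text is roughly one third larger than the raw bytes.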