C# winform - Find file type based on data, not extensions

Time:11-03

So here's my problem - I've got a ton of files that all have the same file extension (e.g., .aabc) but are all widely different file types. Some could be emails, others could be videos. Point is, I don't know what these files are and I need to convert them to something usable but to do that, I need to find out what they are.

I've tried the Mime-Detective NuGet package, but I can't get it to work. The "ContentInspectorBuilder" class doesn't have an Inspect method like the one used in the example code on GitHub. Below is what I have so far:

using System;
using System.IO;
using System.Security;
using System.Windows.Forms;
using MimeDetective;

namespace MimeDetectiveTest
{
    public partial class Form1 : Form
    {
        private OpenFileDialog openFile;
        private ContentInspectorBuilder inspector;
        public Form1()
        {
            InitializeComponent();
            openFile = new OpenFileDialog();
            inspector = new ContentInspectorBuilder()
            {
                Definitions = new MimeDetective.Definitions.ExhaustiveBuilder()
                {
                    UsageType = MimeDetective.Definitions.Licensing.UsageType.PersonalNonCommercial
                }.Build()
            };
        }

        private String findFileType(String path)
        {
            var content = ContentReader.Default.ReadFromFile(path);
            var results = ContentInspectorExtensions.Inspect(inspector, content);
            return results.ByFileExtension().ToString();
        }

        private void openFileButton_Click(object sender, EventArgs e)
        {
            if (openFile.ShowDialog() == DialogResult.OK)
            {
                try
                {
                    var filePath = openFile.FileName;
                    var fileExtension = Path.GetExtension(filePath);
                    var newFileExtension = findFileType(filePath);
                    filePathLabel.Text = filePath;
                    extensionLabel.Visible = true;
                    extensionLabel.Text = "File Identified - " + newFileExtension;
                }
                catch (SecurityException ex)
                {
                    MessageBox.Show($"Security error.\n\nError message: {ex.Message}\n\n" +
                    $"Details:\n\n{ex.StackTrace}");
                }
            }
        }
    }
}

The closest I've gotten to what I want is with another Mime-Detective library from GitHub, but for certain file formats (.xml or .flac, for example) the whole thing crashes, and for others (.htm, for example) it outputs nothing. The code is similar to the one above: just remove the inspector from the constructor and change the findFileType method to this:

private String findFileType(String path)
{
    Stream fileDataStream = File.Open(path, FileMode.Open);
    FileType fileType = fileDataStream.GetFileType();
    return fileType.Extension;
}

Answer:

I think you just missed one Build().

If you compare the example code on GitHub:

var Inspector = new ContentInspectorBuilder() {
    Definitions = new Definitions.ExhaustiveBuilder() {
        UsageType = Definitions.Licensing.UsageType.PersonalNonCommercial
    }.Build()
}.Build(); // <=====

to your code:

inspector = new ContentInspectorBuilder()
{
    Definitions = new MimeDetective.Definitions.ExhaustiveBuilder()
    {
        UsageType = MimeDetective.Definitions.Licensing.UsageType.PersonalNonCommercial
    }.Build()
}; // <=====

you'll find that the example operates on the object the builder produces (a ContentInspector, not a ContentInspectorBuilder) rather than on the builder itself, which also explains the missing method.

And actually, I am wondering how this line

var results = ContentInspectorExtensions.Inspect(inspector, content);

didn't throw. Or did it? I couldn't find a matching extension method for the builder type, so I would have expected a related error.
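
Putting the fix together, here is a minimal sketch of what a corrected findFileType could look like. It only uses the calls already shown in the question and in the GitHub example (ContentInspectorBuilder, ContentReader.Default.ReadFromFile, Inspect, ByFileExtension). The class name FileTypeSniffer and the "unknown" fallback are made up for illustration. It also assumes that each entry returned by ByFileExtension() exposes an Extension property and that the strongest match comes first, so please check both against the version of the package you have installed. Note that calling ToString() on the whole result collection, as in the question, would most likely print a type name rather than an extension.

```csharp
using System.Linq;
using MimeDetective;

// Hypothetical helper class, not part of Mime-Detective.
public static class FileTypeSniffer
{
    public static string FindFileType(string path)
    {
        // Note the second Build(): it turns the builder into the actual inspector.
        // In a form you would build this once and keep it in a field typed as
        // whatever Build() returns, because loading the definitions is not cheap.
        var inspector = new ContentInspectorBuilder()
        {
            Definitions = new MimeDetective.Definitions.ExhaustiveBuilder()
            {
                UsageType = MimeDetective.Definitions.Licensing.UsageType.PersonalNonCommercial
            }.Build()
        }.Build();

        // Read the leading bytes of the file and match them against the definitions.
        var content = ContentReader.Default.ReadFromFile(path);
        var results = inspector.Inspect(content);

        // Assumption: entries expose an Extension property and are ordered
        // best match first. Fall back to a placeholder when nothing matches.
        return results.ByFileExtension()
                      .Select(match => match.Extension)
                      .FirstOrDefault() ?? "unknown";
    }
}
```

In the form from the question, the click handler could then call FileTypeSniffer.FindFileType(filePath) in place of the original findFileType(filePath).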
