The files are named as follows: 5000023_abc_2000045.pdf, 5000023_def_2000045.pdf.
All files are in the same directory.
I want to read all these files, sorted, into an array and then pass them to a third-party program that merges them. There are about 60,000 files.
I tried with GetFiles, but that didn't work. Sorry, I'm a total beginner.
Many thanks in advance.
Here's my code:
using System.Diagnostics;
using System.IO;
using bin_vMergePdfNeu.Properties;
namespace System.Configuration
{
class Program
{
static void Main(string[] args)
{
Console.Title = "bin_vMergePDF";
// Folder from which the PDFs are read
string altDirIn = Settings.Default.AlternativeInputDir;
// Folder in which the merged PDF should be stored
string altDirOut = Settings.Default.AlternativeOutputDir;
////////////////////////////////////////////////////////////////////
//var paths = new Collections.Generic.List<string>();
//paths.Add(altDirIn);
//var cmd = String.Join(" ", altDirIn) + " cat output " + altDirOut;
////////////////////////////////////////////////////////////////////
string[] rchg = Directory.GetFiles(altDirIn, "*.pdf".Split('_')[0]);
string cmd = string.Join(" ", rchg) + " cat output " + altDirOut;
Process p = new Process();
p.StartInfo.WorkingDirectory = Environment.CurrentDirectory;
p.StartInfo.FileName = "pdftk.exe";
p.StartInfo.Arguments = cmd;
p.StartInfo.UseShellExecute = false;
p.StartInfo.RedirectStandardOutput = true;
p.StartInfo.RedirectStandardError = true;
p.Start();
Console.WriteLine(p.StandardError.ReadToEnd());
Console.WriteLine();
Console.ReadKey();
p.WaitForExit();
}
}
}
CodePudding user response:
This would do the job:
var folder = new DirectoryInfo(altDirIn);
var files = folder.EnumerateFiles("*.pdf").OrderBy(fi => Convert.ToInt32(fi.Name.Split('_')[0]));
var cmd = $"{string.Join(" ", files.Select(fi => fi.FullName))} cat output {altDirOut}";
That assumes that every file starts with a number and an underscore.
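If some file names might not follow that pattern, a more defensive variant could skip them instead of throwing. The following is only a sketch under a few assumptions: the helper name `SortByLeadingNumber` is made up for illustration, it uses `long.TryParse` rather than `Convert.ToInt32`, and it breaks ties by file name, which the snippet above does not do.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

static class PdfSorting
{
    // Hypothetical helper (not from the original answer): orders paths by the
    // digits before the first underscore and drops names without such a prefix.
    public static List<string> SortByLeadingNumber(IEnumerable<string> paths)
    {
        return paths
            .Select(path =>
            {
                string name = Path.GetFileName(path);
                int underscore = name.IndexOf('_');
                long number = 0;
                // NumberStyles.None accepts digits only (no sign, no whitespace).
                bool ok = underscore > 0
                    && long.TryParse(name.Substring(0, underscore),
                                     NumberStyles.None,
                                     CultureInfo.InvariantCulture,
                                     out number);
                return new { path, name, number, ok };
            })
            .Where(x => x.ok)
            .OrderBy(x => x.number)
            // Assumption: files sharing a number are ordered by name.
            .ThenBy(x => x.name, StringComparer.OrdinalIgnoreCase)
            .Select(x => x.path)
            .ToList();
    }
}
```

It could be called as `PdfSorting.SortByLeadingNumber(Directory.EnumerateFiles(altDirIn, "*.pdf"))`, with the result joined into the command string as before.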
It's also worth noting that, if you want to sort file names the same way File Explorer does, you can use the same Windows API that File Explorer uses. You can create a comparer that incorporates that API:
using System.Collections;
using System.Collections.Generic;
using System.Runtime.InteropServices;
public class LogicalStringComparer : IComparer, IComparer<string>
{
[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
private static extern int StrCmpLogicalW(string x, string y);
int IComparer.Compare(object x, object y)
{
return Compare((string) x, (string) y);
}
public int Compare(string x, string y)
{
return StrCmpLogicalW(x, y);
}
}
You can then use that class to sort your file paths:
var filePaths = Directory.EnumerateFiles(altDirIn, "*.pdf").OrderBy(s => s, new LogicalStringComparer());
var cmd = $"{string.Join(" ", filePaths)} cat output {altDirOut}";
Note, too, that this code, which is based on what you wrote yourself, will probably fail if the folder path or any file name contains a space. You could quote each path to handle that:
var quotedPaths = filePaths.Select(path => $"\"{path}\"");
var cmd = $"{string.Join(" ", quotedPaths)} cat output \"{altDirOut}\"";
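On newer runtimes it may be possible to avoid building the quoted string by hand. As a sketch, assuming the project targets .NET Core 2.1 or later (where `ProcessStartInfo.ArgumentList` exists; it is not available on .NET Framework), each argument can be added separately and the runtime takes care of the escaping. The helper name `BuildPdftkStartInfo` is hypothetical.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

static class PdftkLauncher
{
    // Hypothetical helper: builds the start info for "pdftk <inputs> cat output <outputFile>".
    // Each argument is added on its own, so paths containing spaces need no manual quoting.
    public static ProcessStartInfo BuildPdftkStartInfo(IEnumerable<string> inputFiles, string outputFile)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "pdftk.exe",
            UseShellExecute = false,
            RedirectStandardError = true
        };
        foreach (string file in inputFiles)
        {
            startInfo.ArgumentList.Add(file);
        }
        startInfo.ArgumentList.Add("cat");
        startInfo.ArgumentList.Add("output");
        startInfo.ArgumentList.Add(outputFile);
        return startInfo;
    }
}
```

The result would then be passed to `Process.Start(...)` instead of setting `StartInfo.Arguments`. This does not lift the overall limit on command-line length.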
CodePudding user response:
So, it seems that you want to query the files within a directory. The steps are:
- Obtain (enumerate) all *.pdf files within the directory
- Check if the file name matches the number_name pattern
- Obtain the number part to sort by it
- Since number is in fact a string of arbitrary length, we can't convert it to int; so we sort first by length and then by value
- Get rid of the number part; we want to get file only
- Materialize the results as an array
Code:
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
...
string[] rchg = Directory
.EnumerateFiles(altDirIn, "*.pdf")
.Select(file => Path.GetFileName(file))
.Where(file => Regex.IsMatch(file, "^[0-9]+_"))
// Strip leading zeros up front so that numbers of equal length compare correctly as strings
.Select(file => (file : file, number : file.Substring(0, file.IndexOf('_')).TrimStart('0')))
.OrderBy(pair => pair.number.Length)
.ThenBy(pair => pair.number)
.Select(pair => pair.file)
.ToArray();
Be careful, though: with 60,000 files the array can be huge, and you will hardly be able to pass all of it on the command line. You can try saving the results to a file and passing that file's name to the exe instead.
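If the program cannot read its inputs from a file, another option is to split the sorted list into batches that each fit on a command line and to merge in several passes. The sketch below only does the splitting. The names `SplitIntoBatches` and `maxChars` are illustrative, and the right budget depends on the platform (on Windows, `CreateProcess` limits the command line to 32,767 characters).

```csharp
using System.Collections.Generic;

static class CommandLineBatching
{
    // Hypothetical helper: groups paths, in order, so that the quoted paths of one
    // batch stay within a budget of maxChars characters. A single path that exceeds
    // the budget on its own still gets a batch to itself.
    public static List<List<string>> SplitIntoBatches(IEnumerable<string> paths, int maxChars)
    {
        var batches = new List<List<string>>();
        var current = new List<string>();
        int used = 0;

        foreach (string path in paths)
        {
            // Two quote characters plus one separating space per path.
            int cost = path.Length + 3;

            if (current.Count > 0 && used + cost > maxChars)
            {
                batches.Add(current);
                current = new List<string>();
                used = 0;
            }

            current.Add(path);
            used += cost;
        }

        if (current.Count > 0)
        {
            batches.Add(current);
        }
        return batches;
    }
}
```

Each batch could then be merged into an intermediate PDF, and the intermediate files merged in a final pass. The caller has to leave room in `maxChars` for the program name, the ` cat output ` part, and the output path.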