I have a Java program that walks through all the folders and files inside a given folder and prints them. The issue is that I now want to count all the files in the folder by their extension. How can I do that?
This is my code so far:
package ut5_3;
import java.io.File;
import java.util.ArrayList;
import org.apache.commons.io.FilenameUtils;
public class UT5_3 {
ArrayList<String> extensionArray = new ArrayList<String>();
ArrayList<String> extensionArrayTemp = new ArrayList<String>();
public static void main(String[] args) {
File directoryPath = new File("C://Users//Anima//Downloads//MiDir");
File[] directoryContent = directoryPath.listFiles();
UT5_3 prueba = new UT5_3();
prueba.goOverDirectory(directoryContent);
}
public void goOverDirectory(File[] directoryContent) {
for (File content : directoryContent) {
if (content.isDirectory()) {
System.out.println("===========================================");
System.out.println(" " + content.getName());
goOverDirectory(content.listFiles()); // Calls same method again.
} else {
String directoryName = content.getName();
System.out.println(" -" + directoryName);
String fileExtension = FilenameUtils.getExtension(directoryName);
}
}
}
}
So far I've tried making two ArrayLists: one to store the extensions of all the files and another to store the distinct extensions. But I got really confused and didn't know how to continue with that idea.
If it is too complex, it'd be great if you could explain an easier way to do it too. Thanks!
CodePudding user response:
So if I understand you correctly, you want to store the count of all of the file extensions you see. Here is how I would approach it.
First, declare a map with a String as the key and an Integer as the value at the top of your class (this needs import java.util.HashMap;), like so:
private HashMap<String, Integer> extensions = new HashMap<String, Integer>();
Then you can run this whenever you need to add an extension to the count.
String in = "exe";
if(extensions.containsKey(in))
extensions.put(in, extensions.get(in) + 1);
else
extensions.put(in, 1);
If you want to retrieve the count, simply run extensions.get(in).
If you didn't know already, a HashMap allows you to assign a key to a value. In this case, the key is the file extension and the value is the count.
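To show how the map could be wired into the recursive method from the question, here is a minimal sketch. The class name ExtensionCounter and the helper methods record and counts are made up for illustration. It also swaps two things: Map.merge replaces the containsKey/put pair, and a plain lastIndexOf replaces FilenameUtils.getExtension so that the example runs without Apache Commons IO.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name; the structure mirrors goOverDirectory from the question.
class ExtensionCounter {
    // Key: extension without the dot. Value: number of files seen with it.
    // A TreeMap is used only so that the printed result is sorted by extension.
    private final Map<String, Integer> extensions = new TreeMap<>();

    // Adds one to the count for this extension (inserts 1 if it is not present yet).
    void record(String extension) {
        extensions.merge(extension, 1, Integer::sum);
    }

    Map<String, Integer> counts() {
        return extensions;
    }

    // Recursively visits every entry and counts regular files by extension.
    void goOverDirectory(File[] directoryContent) {
        if (directoryContent == null) {
            return; // listFiles() returns null for a missing or unreadable directory
        }
        for (File content : directoryContent) {
            if (content.isDirectory()) {
                goOverDirectory(content.listFiles());
            } else {
                String name = content.getName();
                int dot = name.lastIndexOf('.');
                // Assumption: files without a dot are counted under the empty string.
                record(dot < 0 ? "" : name.substring(dot + 1));
            }
        }
    }

    public static void main(String[] args) {
        // Uses the first argument as the start folder, or the current folder if none is given.
        File start = new File(args.length > 0 ? args[0] : ".");
        ExtensionCounter counter = new ExtensionCounter();
        counter.goOverDirectory(start.listFiles());
        counter.counts().forEach((ext, n) -> System.out.println(ext + ": " + n));
    }
}
```

The merge call does the same job as the if/else above: it stores 1 for a new key and otherwise adds 1 to the existing value.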
Good day
CodePudding user response:
With Java 8 or higher, this is actually not too difficult. First, you collect the file names. Then, you group them by extension to get a map from each extension to its count.
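import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
// The method below goes inside a class of your choice.
public static void main(String[] args) throws IOException {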
Path start = Paths.get("C:/Users/Hector/Documents");
List<String> fileNames = new ArrayList<>();
try (DirectoryStream<Path> stream = Files.newDirectoryStream(start)){
for (Path entry : stream) {
if (!Files.isDirectory(entry)) {
fileNames.add(entry.getFileName().toString());// add file name in to name list
}
}
}
fileNames.stream().map(String::valueOf).sorted()
.collect(Collectors.groupingBy(fileName -> fileName.substring(fileName.lastIndexOf(".") + 1)))
.entrySet().forEach(entry -> {
System.out.println("File extension: " + entry.getKey() + ", Count: " + entry.getValue().size());
});
}
This produced the following output:
File extension: zargo~, Count: 3
File extension: txt, Count: 1
File extension: exe, Count: 1
File extension: pdf, Count: 3
File extension: ini, Count: 1
File extension: log, Count: 1
File extension: svg, Count: 1
File extension: zargo, Count: 3
File extension: lnk, Count: 1
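The grouping step above collects a list of names per extension and then takes its size. As a possible variation, the sketch below lets the collector do the counting directly with Collectors.counting(). The class name ExtensionGrouping and the method countByExtension are hypothetical, and the sample file names are invented. The classifier is the same substring expression as in the answer, so a name without a period would still be grouped under its full name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration.
class ExtensionGrouping {
    // Groups the names by the text after the last '.', counting each group.
    // TreeMap::new keeps the result sorted by extension.
    static Map<String, Long> countByExtension(List<String> fileNames) {
        return fileNames.stream()
                .collect(Collectors.groupingBy(
                        name -> name.substring(name.lastIndexOf('.') + 1),
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        // Invented sample names; in the answer these come from the DirectoryStream loop.
        List<String> names = Arrays.asList("a.txt", "b.pdf", "c.txt", "archive.tar.gz");
        countByExtension(names).forEach((ext, count) ->
                System.out.println("File extension: " + ext + ", Count: " + count));
    }
}
```

Note that Collectors.counting() produces Long values, so the map type is Map<String, Long> rather than Map<String, Integer>.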
I chose this solution over Files.walk() because that method can fail with an AccessDeniedException, which is problematic when you need to traverse different folders. However, if you can check access to a folder beforehand by calling the canRead() method on the File object, you could use it. I personally don't like it, and many Java developers have issues with that function.
You could adapt this by moving the listing into a method that calls itself whenever the current path is a directory. Since it is unclear from the OP's requirements whether the count should cover only the given folder or the folder and all its subfolders, I decided to show the count for the current folder only.
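If subfolders should be included, one option that avoids hand-written recursion is Files.walkFileTree with a SimpleFileVisitor, which lets you decide what happens when an entry cannot be read. This is a different technique from the DirectoryStream loop above, sketched here under the assumption that unreadable entries should simply be skipped. The class name RecursiveExtensionCount and the method count are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name for illustration.
class RecursiveExtensionCount {
    // Walks the tree below start and counts files by the text after the last '.'.
    static Map<String, Integer> count(Path start) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                String name = file.getFileName().toString();
                counts.merge(name.substring(name.lastIndexOf('.') + 1), 1, Integer::sum);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Assumption: entries that cannot be read (for example because of an
                // AccessDeniedException) are skipped instead of aborting the walk.
                return FileVisitResult.CONTINUE;
            }
        });
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Uses the first argument as the start folder, or the current folder if none is given.
        Path start = Paths.get(args.length > 0 ? args[0] : ".");
        count(start).forEach((ext, n) ->
                System.out.println("File extension: " + ext + ", Count: " + n));
    }
}
```

The default visitFileFailed in SimpleFileVisitor rethrows the exception, so overriding it is what keeps the walk going past protected folders.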
One last observation: when collecting the extensions, notice that I tell substring to start right after the last index of the period character. This is important to capture the real extension for files with multiple periods in the name, for example Setup.x64.en-US_ProPlusRetail_RDD8N-JF2VC-W7CC6-TVXJC-3YF3D_TX_PR_act_1_OFFICE_13.exe. In a file name, the extension always starts after the last period.
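One thing to watch with the substring approach is a name without any period: lastIndexOf returns -1, so substring(0) yields the whole name as the "extension". A small helper could make those cases explicit. The sketch below uses a hypothetical class Extensions and method extensionOf, and its treatment of names that start or end with a period (returning an empty string) is an assumption, not something the answer specifies.

```java
// Hypothetical helper for illustration.
class Extensions {
    // Returns the text after the last '.', or "" when there is no usable extension.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // Assumption: no dot ("README"), only a leading dot (".gitignore"),
        // or a trailing dot ("archive.") all count as "no extension".
        if (dot <= 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Setup.x64.exe", "README", ".gitignore", "archive."}) {
            System.out.println(name + " -> \"" + extensionOf(name) + "\"");
        }
    }
}
```

If hidden files such as .gitignore should be counted under "gitignore" instead, changing the first condition from dot <= 0 to dot < 0 would do that.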