I have gone through a link on how to extract a .tar file and several posts on Stack Overflow about doing it in Java.
However, I didn't find any that relates to my concern, which is a multilevel or nested .tar/.tgz/.zip file.
My concern is with something like the structure below:
Abc.tar.gz
--DEF.tar
--sample1.txt
--sample2.txt
--FGH.tgz
--sample3.txt
-sample4.txt
This is the simplest example I can give here. The archive can contain any combination of compressed files and folders, such as a .tar inside a .tar, inside a .gz, inside a .tgz, and so on.
My problem is that I can only extract the first level using the Apache Commons Compress library. That is, when Abc.tar.gz is extracted, only DEF.tar is available in the destination/output folder. Beyond that, my extraction does not work.
I tried to feed the output of the first extraction to the second one on the fly, but I got stuck with a FileNotFoundException: at that point the output file has not been written yet, so the second extraction cannot find it.
Pseudocode:
public class CommonExtraction {

    public static void extract(String sourcePath, String destPath) {
        String name = sourcePath.trim().toLowerCase();
        if (name.endsWith(".tar.gz") || name.endsWith(".tgz")) {
            try {
                TarArchiveInputStream tar = new TarArchiveInputStream(
                        new GzipCompressorInputStream(new BufferedInputStream(new FileInputStream(sourcePath))));
                extractTar(tar, destPath);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public static void extractTar(TarArchiveInputStream tar, String outputFolder) {
        try {
            TarArchiveEntry entry;
            while (null != (entry = tar.getNextTarEntry())) {
                if (entry.getName().trim().toLowerCase().endsWith(".tar")) {
                    final String path = outputFolder + entry.getName();
                    // failing: the .tar file decompressed from the .gz is not yet available at the destination path
                    tar = new TarArchiveInputStream(new BufferedInputStream(new FileInputStream(path)));
                    extractTar(tar, outputFolder);
                }
                extractEntry(entry, tar, outputFolder);
            }
            tar.close();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    public static void extractEntry(TarArchiveEntry entry, InputStream tar, String outputFolder) {
        final String path = outputFolder + entry.getName();
        if (entry.isDirectory()) {
            new File(path).mkdirs();
        } else {
            // create the parent directory for the file if it does not exist
        }
        // code to read and write until the last byte is encountered
    }
}
P.S. Please ignore syntax errors and the like in the code.
CodePudding user response:
Try this:
try (InputStream fi = file.getInputStream();
     InputStream bi = new BufferedInputStream(fi);
     InputStream gzi = new GzipCompressorInputStream(bi, false);
     ArchiveInputStream archive = new TarArchiveInputStream(gzi)) {
    withArchiveStream(archive, result::appendEntry);
}
As far as I can see, .tar.gz and .tgz are the same format. My method withArchiveStream is:
private void withArchiveStream(ArchiveInputStream archInStream, BiConsumer<ArchiveInputStream, ArchiveEntry> entryConsumer) throws IOException {
    ArchiveEntry entry;
    while ((entry = archInStream.getNextEntry()) != null) {
        entryConsumer.accept(archInStream, entry);
    }
}

private void appendEntry(ArchiveInputStream archive, ArchiveEntry entry) {
    // BiConsumer.accept cannot throw checked exceptions, so IOExceptions are wrapped
    if (!archive.canReadEntryData(entry)) {
        throw new UncheckedIOException(new IOException("Can't read archive entry"));
    }
    if (entry.isDirectory()) {
        return;
    }
    // And, for example, print the entry content
    try {
        String content = new String(archive.readAllBytes(), StandardCharsets.UTF_8);
        System.out.println(content);
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}
CodePudding user response:
You have a recursive problem, so you can use recursion to solve it. Here is some pseudocode to show how it can be done:
public class ArchiveExtractor
{
    public void extract(File file)
    {
        List<File> files = new ArrayList<>(); // list of extracted files; stays empty if the type is unknown
        if(isZip(file))
            files = extractZip(file);
        else if(isTGZ(file))
            files = extractTGZ(file);
        else if(isTar(file))
            files = extractTar(file);
        else if(isGZip(file))
            files = extractGZip(file);
        for(File f : files)
        {
            if(isArchive(f))
                extract(f); // recursive call
        }
    }

    private List<File> extractZip(File file)
    {
        // extract archive and return list of extracted files
    }

    private List<File> extractTGZ(File file)
    {
        // extract archive and return list of extracted files
    }

    private List<File> extractTar(File file)
    {
        // extract archive and return list of extracted files
    }

    private List<File> extractGZip(File file)
    {
        // extract archive and return list of extracted files
    }
}
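To make the recursion concrete, here is a minimal sketch that uses only the JDK, so it handles nested .zip files and nothing else. The class name NestedZipExtractor and the choice to unpack a nested archive into a sibling folder named after it are my own assumptions, not something from the question. The useful detail is that the nested archive is read straight from the entry that is currently open in the outer stream, so it never has to exist on disk first. I would expect the same wrapping to work for .tar and .tgz with TarArchiveInputStream and GzipCompressorInputStream from Commons Compress, but I have not included that here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, zip-only: a sketch of the recursive idea, not a full extractor.
final class NestedZipExtractor {

    // Extracts every entry of the zip stream into destDir and returns the plain files written.
    // The stream is not closed here because the caller (or an outer archive) owns it.
    public static List<Path> extract(InputStream in, Path destDir) throws IOException {
        Path root = destDir.toAbsolutePath().normalize();
        List<Path> written = new ArrayList<>();
        ZipInputStream zip = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            Path target = root.resolve(entry.getName()).normalize();
            // Reject entries such as "../x" that would land outside the destination.
            if (!target.startsWith(root)) {
                throw new IOException("Entry escapes destination: " + entry.getName());
            }
            if (entry.isDirectory()) {
                Files.createDirectories(target);
            } else if (entry.getName().toLowerCase(Locale.ROOT).endsWith(".zip")) {
                // Nested archive: recurse on the bytes of the current entry,
                // unpacking into a sibling folder named after the archive.
                String name = target.getFileName().toString();
                Path nestedDir = target.resolveSibling(name.substring(0, name.length() - 4));
                written.addAll(extract(zip, nestedDir));
            } else {
                Files.createDirectories(target.getParent());
                // Copies up to the end of the current entry; does not close the stream.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }
}
```

You would call it as NestedZipExtractor.extract(Files.newInputStream(archivePath), outputDir) and close the outermost stream yourself, for example with try-with-resources. The inner streams are deliberately left open, because closing one of them would also close the outer archive it reads from.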