How can I preserve directory structure when creating a TAR file?

Time:09-17

I'm using microtar in a Qt 6 C++ project.

I'm trying to archive a directory: its contents, its subfolders, and its (text-only) files, with the directory structure preserved inside the output TAR file.

However, while I can produce a TAR file containing all of a given directory's contents, the directory structure is not preserved: every entry in the TAR file sits at the same depth.

I'm using the code below, which iterates through a given directory and recurses into each folder to pick up its subfolders and files.


tar_test.h:

#pragma once

#include "microtar.h"
#include <filesystem>
#include <QDir>
#include <QFile>
#include <QIODevice>
#include <QString>
#include <QTextStream>

const QString readFile(QString path);
void writeFile(QString text, QString path);
void dirToTar_entry(QString readPath, QString writePath);
void dirToTar_recursor(mtar_t& tar, QString readPath);
void checkMakeDir(QString path);

tar_test.cpp:

#include "tar_test.h"

const QString readFile(QString path)
{
    QString text;
    QFile file(path);
    if (file.open(QFile::ReadOnly | QIODevice::Text))
    {
        QTextStream in(&file);
        text = in.readAll();
        file.close();
    }
    return text;
}

void writeFile(QString text, QString path)
{
    QFile file(path);
    if (file.open(QIODevice::WriteOnly | QIODevice::Text))
    {
        QTextStream out(&file);
        out << text;
        file.close();
    }
}

void dirToTar_entry(QString readPath, QString writePath)
{
    auto tar_path_tmp = writePath.toLocal8Bit();
    auto tar_path = tar_path_tmp.data();
    mtar_t tar;
    mtar_open(&tar, tar_path, "w");
    dirToTar_recursor(tar, readPath);
    mtar_finalize(&tar);
    mtar_close(&tar);
}

void dirToTar_recursor(mtar_t& tar, QString readPath)
{
    const auto dirs = QDir(readPath).entryList(QDir::Dirs | QDir::NoDotAndDotDot);
    for (const auto& dir : dirs)
    {
        auto name_tmp = dir.toLocal8Bit();
        auto name = name_tmp.data();
        mtar_write_dir_header(&tar, name);
        //mtar_write_data(&tar, 0, 0); // I don't know if this is the issue, but the archive writes the same even if I omit this line?
        dirToTar_recursor(tar, QDir(readPath + "/" + dir).absolutePath());
    }
    const auto files = QDir(readPath).entryList(QDir::Files | QDir::NoDotAndDotDot);
    for (const auto& file : files)
    {
        auto name_tmp = file.toLocal8Bit();
        auto name = name_tmp.data();
        auto str_tmp = readFile(readPath + "/" + file).toLocal8Bit();
        auto str = str_tmp.data();
        mtar_write_file_header(&tar, name, strlen(str));
        mtar_write_data(&tar, str, strlen(str));
    }
}

void checkMakeDir(QString path)
{
    if (QDir(path).exists()) return;
    std::filesystem::path fs_path = path.toStdString();
    std::filesystem::create_directories(fs_path);
}

And then I have a test button where I create a test directory and output the TAR file:

connect(testButton, &QPushButton::clicked, this, [&]()
        {
            const QString path = "C:\\Dev\\NPT\\NewPaneTest\\TestProjects\\TestRoot\\";
            auto temp = QString::fromStdString(tempFolder.string()) + "\\TestRoot\\";
            checkMakeDir(temp);
            auto test_file = temp + "\\test.tar";
            dirToTar_entry(path, test_file);
        });

I appreciate any and all help very much. I'm so sure I'm missing something obvious, but I just can't see it at the moment.


EDIT:

The solution, in my case: QDir::entryList returns bare directory or file names, not full or relative paths. Without relative paths, every entry was archived at the same level. Here's what worked for me:

void dirToTar_entry(QString readPath, QString writePath)
{
    auto tar_path_tmp = writePath.toLocal8Bit();
    auto tar_path = tar_path_tmp.data();
    mtar_t tar;
    mtar_open(&tar, tar_path, "w");
    dirToTar_recursor(tar, readPath);
    mtar_finalize(&tar);
    mtar_close(&tar);
}

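// currentReadPath is optional. This assumes the declaration in tar_test.h gives it
// a default value, e.g. QString currentReadPath = QString().
void dirToTar_recursor(mtar_t& tar, QString rootPath, QString currentReadPath)
{
    if (currentReadPath.isEmpty())
        currentReadPath = rootPath;
    // Path of the current directory relative to the root; empty at the top level.
    QString relative_path;
    if (currentReadPath != rootPath)
        relative_path = QDir(rootPath).relativeFilePath(currentReadPath) + "/";
    else
        relative_path = "";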

    const auto dirs = QDir(currentReadPath).entryList(QDir::Dirs | QDir::NoDotAndDotDot);
    for (const auto& dir : dirs)
    {
        auto path = relative_path + dir;
        auto name_tmp = path.toLocal8Bit();
        auto name = name_tmp.data();
        mtar_write_dir_header(&tar, name);
        mtar_write_data(&tar, 0, 0);
        dirToTar_recursor(tar, rootPath, QDir(currentReadPath + "/" + dir).absolutePath());
    }
    const auto files = QDir(currentReadPath).entryList(QDir::Files | QDir::NoDotAndDotDot);
    for (const auto& file : files)
    {
        auto path = relative_path + file;
        auto name_tmp = path.toLocal8Bit();
        auto name = name_tmp.data();
        auto str_tmp = readFile(currentReadPath + "/" + file).toLocal8Bit();
        auto str = str_tmp.data();
        mtar_write_file_header(&tar, name, strlen(str));
        mtar_write_data(&tar, str, strlen(str));
    }
}

Dividing the readPath parameter into a static rootPath, plus the optional currentReadPath to be passed only on recursions, allowed me to always get the relative path using QDir::relativeFilePath at the start of the function.
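
For comparison, the same "name relative to the root" idea can be sketched with only the standard library, which the code above already uses in checkMakeDir. This is a minimal sketch, not the code from the project: archiveNames is a hypothetical helper, and adding a trailing slash to directory names is an assumption based on the tar convention rather than something microtar requires.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: lists the archive entry names for everything under
// `root`, relative to `root`, with '/' separators on every platform.
std::vector<std::string> archiveNames(const fs::path& root)
{
    std::vector<std::string> names;
    for (const auto& entry : fs::recursive_directory_iterator(root))
    {
        // lexically_relative strips the root prefix, so "root/two/three"
        // becomes "two/three" and the hierarchy survives in the name.
        std::string name = entry.path().lexically_relative(root).generic_string();
        if (entry.is_directory())
            name += '/'; // assumption: mark directories with a trailing slash
        names.push_back(name);
    }
    // Directory iteration order is unspecified, so sort for stable output.
    std::sort(names.begin(), names.end());
    return names;
}
```

Each returned name could then be passed to mtar_write_dir_header or mtar_write_file_header in place of the bare name from QDir::entryList.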

CodePudding user response:

Below is a working example that recurses through the directory with QDirIterator instead, based on the answer here. It also follows the approach demonstrated in the nest program here.

I believe the issue in the posted sample comes from the details of how tar directory headers and tar file headers are created. The tar archive format carries a lot of legacy conventions, as described in the format document here.

// FILE: test1.cpp
#include "microtar.h"
#include <QDirIterator>
#include <QDir>
#include <QFile>
#include <QIODevice>
#include <QString>
#include <QTextStream>

const QString readFile(QString path)
{
    QString text;
    QFile file(path);
    if (file.open(QFile::ReadOnly | QIODevice::Text))
    {
        QTextStream in(&file);
        text = in.readAll();
        file.close();
    }
    return text;
}

int main() {
 const QString path = "abc";
 auto test_file = "test.tar";
 mtar_t tar;
 mtar_open(&tar, test_file, "w");
 QDirIterator it( path, QStringList() << "*.csv", QDir::Files, QDirIterator::Subdirectories);
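 while (it.hasNext())
 {
   // Keep the QByteArrays alive in named variables: calling data() on a
   // temporary would leave a dangling pointer.
   const QString filePath = it.next();
   const QByteArray name = filePath.toLocal8Bit();
   const QByteArray contents = readFile(filePath).toLocal8Bit();
   const auto size = static_cast<unsigned>(contents.size());
   mtar_write_file_header(&tar, name.constData(), size);
   mtar_write_data(&tar, contents.constData(), size);
 }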
 mtar_finalize(&tar);
 mtar_close(&tar);
 return 0;
}

Created using these commands:

qmake -project
echo "CONFIG -= app_bundle" >> test1.pro
qmake
make
./test1
tar -tf test.tar 
abc/one/x00.csv
abc/two/x01.csv
abc/two/three/x02.csv

You can also explore microtar from plain C code, using the nest program:

git clone https://github.com/rxi/microtar.git
git clone https://github.com/domeengine/nest.git
cd nest/src
cp ../../microtar/src/microtar.c .
cp ../../microtar/src/microtar.h .
gcc -I. -o nest main.c
mv ../../abc .
./nest -z -o abc.tar abc
Bundling: abc/one/x00.csv
Bundling: abc/two/x01.csv
Bundling: abc/two/three/x02.csv
Created bundle abc.tar.
tar -tf abc.tar 
one/x00.csv
two/x01.csv
two/three/x02.csv

The tar format described here indicates:

New archives should be created using REGTYPE. Also, for backward compatibility, tar treats a regular file whose name ends with a slash as a directory.

The example code only creates regular file entries, using mtar_write_file_header; the directory structure comes from the path in each entry name.
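
Because each entry name carries its full path, the parent directories are implied by the name itself. As a rough illustration, the implied directories can be listed with a small string helper; parentDirs is a hypothetical name, not part of microtar or Qt.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns every parent directory implied by a tar entry
// name, outermost first, each ending in '/'.
std::vector<std::string> parentDirs(const std::string& entryName)
{
    std::vector<std::string> dirs;
    for (std::string::size_type pos = entryName.find('/');
         pos != std::string::npos;
         pos = entryName.find('/', pos + 1))
    {
        // Keep everything up to and including this slash.
        dirs.push_back(entryName.substr(0, pos + 1));
    }
    return dirs;
}
```

For "abc/two/three/x02.csv" this yields "abc/", "abc/two/" and "abc/two/three/". If explicit directory entries are wanted, these are the names that would be written before the file entry.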

Another option is to examine mtar_write_dir_header and refer to the TAR file format, which says things like this:

The directory name in the name field should end with a slash
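
To make the directory header concrete, here is a minimal sketch of a 512-byte tar header for a directory, following the classic field layout (name at offset 0, mode at 100, size at 124, mtime at 136, checksum at 148, type flag at 156). It is not microtar's implementation: makeDirHeader and headerChecksum are hypothetical names, the mode 0755 and zero mtime are arbitrary choices, and names longer than 99 bytes are not handled.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

using TarHeader = std::array<char, 512>;

// Sums all header bytes as unsigned values, counting the 8-byte checksum
// field (offsets 148..155) as spaces, as the tar format specifies.
unsigned headerChecksum(const TarHeader& h)
{
    unsigned sum = 0;
    for (std::size_t i = 0; i < h.size(); ++i)
        sum += (i >= 148 && i < 156) ? static_cast<unsigned>(' ')
                                     : static_cast<unsigned char>(h[i]);
    return sum;
}

// Hypothetical helper: builds a header for a directory entry.
// Assumption: `name` fits in 99 bytes including the trailing slash.
TarHeader makeDirHeader(std::string name)
{
    if (name.empty() || name.back() != '/')
        name += '/';                                  // directory names end with a slash
    TarHeader h{};                                    // zero-filled
    std::strncpy(h.data(), name.c_str(), 99);         // name
    std::snprintf(h.data() + 100, 8, "%07o", 0755u);  // mode (arbitrary choice)
    std::snprintf(h.data() + 108, 8, "%07o", 0u);     // uid
    std::snprintf(h.data() + 116, 8, "%07o", 0u);     // gid
    std::snprintf(h.data() + 124, 12, "%011o", 0u);   // size: 0 for a directory
    std::snprintf(h.data() + 136, 12, "%011o", 0u);   // mtime (arbitrary choice)
    h[156] = '5';                                     // type flag for a directory
    // Checksum is stored as six octal digits, a NUL, then a space.
    std::snprintf(h.data() + 148, 7, "%06o", headerChecksum(h));
    h[155] = ' ';
    return h;
}
```

A directory entry has size 0, so no data blocks follow the header.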