Home > Software design >  Cleanup empty categories from category tree
Cleanup empty categories from category tree

Time:11-04

I have created a category tree where each category contains sub-categories and each category contains its associated content. Some of the categories in this tree contain sub-categories but have no associated content. How can I clean up the category tree so that the tree structure contains only categories that have associated content, or that have sub-categories that have associated content? That is, in the category tree only paths should exist that lead to a category that has associated content.

The structure I have is an array:

[uid_of_category]
   => (array)content
      => empty
   => (array)sub_categories
      => [uid_of_category]
         => (array)content
             => empty
         => (array)sub_categories
      => [uid_of_category]
         => (array)content
             => [...associated content...]
         => (array)sub_categories
 [uid_of_category]
   => (array)content
      => empty
   => (array)sub_categories
      => [uid_of_category]
         => (array)content
             => [...associated content...]
         => (array)sub_categories
      => [uid_of_category]
         => (array)content
             => empty
         => (array)sub_categories
            => [uid_of_category]
               => (array)content
                 => [...associated content...]
            => (array)sub_categories
               ...

I tried to use a recursive function to get down to the lowest elements of the tree, but I don't know how to implement that even those elements remain in the tree that have no associated content, but whose sub-element has associated content.

CodePudding user response:

Obviously, recursion will help. But let use two functions instead of one. We will use one function to determine if a category meets the conditions to be cleaned (deleted). An another function to do the clean. In this function, the param will be passed by reference (&$categories). All of this assuming the structure is an array.

cleanCategories($categories);

function cleanCategories(&$categories)
{
    foreach ($categories as $key=>&$category) {
        if (isCleanable($category)) {
            unset($categories[$key]);
        } else {
            cleanCategories($category['sub_categories']);
        }
    }
}

function isCleanable($category)
{
    if (!empty($category['content'])) {
        return false;
    }

    foreach ($category['sub_categories'] as $category) {
        if (!isCleanable($category)) {
            return false;
        }
    }

    return true;
}

From this approach, you can do a better, more effient solution.

And don't forget Stack Overflow is not a writting code service. You must think by yourself, think again, find a way (just thinking), put it in code, try the code, debug the code...

CodePudding user response:

Tried to build an array based on the OP's post:

$arr = [
    "uid_of_category_0" => [
        "content"        => [],
        "sub_categories" => [
            "uid_of_category_0_0" => [
                "content"        => [],
                "sub_categories" => []
            ],
            "uid_of_category_0_1" => [
                "content"        => [
                    "associated_content_0_1_0",
                    "associated_content_0_1_1",
                    "associated_content_0_1_2"
                ],
                "sub_categories" => []
            ]
        ]
    ],
    "uid_of_category_1" => [
        "content"        => [],
        "sub_categories" => [
            "uid_of_category_1_0" => [
                "content"        => [
                    "associated_content_1_0_0",
                    "associated_content_1_0_1",
                    "associated_content_1_0_2"
                ],
                "sub_categories" => []
            ],
            "uid_of_category_1_1" => [
                "content"        => [],
                "sub_categories" => [
                    "uid_of_category_1_1_0" => [
                        "content"        => [
                            "associated_content_1_1_0_0",
                            "associated_content_1_1_0_1",
                            "associated_content_1_1_0_2"
                        ],
                        "sub_categories" => []
                    ]
                ]
            ]
        ]
    ],
];

This would be the code to unset all empty arrays within $arr:

function unset_empty_arrays(array $arr): array {
  foreach (array_keys($arr) as $key) {
    if (is_array($arr[$key])) {
      $arr[$key] = unset_empty_arrays($arr[$key]);
    }
    if ($arr[$key] === [] && $key === 'sub_categories') {
      unset($arr[$key]);
    }
  }
  return $arr;
}

print_r(unset_empty_arrays($arr));

Edit: added && $key === 'sub_categories'

CodePudding user response:

This case is best solved with postorder traversal of tree.

In postorder traversal you want to do recursion over all sub-trees first. Then do required operation on the current node.

function cleanUp(&$categories) {
    if (empty($categories)) {
        //no categories, nothing to do here
        return;
    }

    foreach ($categories as $key => &$category) {
        //first clean up sub categories
        cleanUp($category['sub_categories']);
 
        if (empty($category['sub_categories']) && empty($category['content'])) {
            //If there are no sub_categories left and there is no content remove category
            unset($categories[$key]);
        }
    }
}

After doing clean up on sub categories, you know that if there are any sub categories left, they must have content or sub categories with content. So, at that point you can easily decide if current category needs to be removed.

CodePudding user response:

One solution is to iterate the array recursively and copy the "non-empty" nodes into another array, leaving the original array as-is. But there is a catch:

For example, you encounter a non-empty node three levels deep and all of its parents are empty nodes, then you need to create the whole path in the destination array before you can add that node. Fortunately there are solutions. So we have:

/**
 * Copy non-empty categories inside the specified array of categories recursively
 * The definition of "non-empty" could be found inside the OP
 *
 * @param array $categories An array of categories - the root category or sub_categories of a category
 * @param string $path Ordered list of parent uids
 * @param array &$result An array into which categories will be copied
 * @return void
 */
function copy_categories_recursive($categories, $path, &$result) {
    foreach ($categories as $uid => $category) {
        if ($category['content']) {
            $ref = & $result;
            foreach ($path as $p_uid) {
                $ref = & $ref[$p_uid]['sub_categories'];
            }
            $ref[$uid]['content'] = $category['content'];
            unset($ref);
        }
        copy_categories_recursive($category['sub_categories'], array_merge($path, [$uid]) , $result);
    }
}

/**
 * Usage
 */
$source = [];
$result = [];
copy_categories_recursive($source, [], $result);

Sample $source:

[
    "1" => [
        "content" => ["foo"],
        "sub_categories" => [
            "1.1" => [
                "content" => ["foo"],
                "sub_categories" => []
            ],
            "1.2" => ["
                content" => [],
                "sub_categories" => []
            ]
        ]
    ],
    "2" => [
        "content" => ["foo"],
        "sub_categories" => []
    ],
    "3" => [
        "content" => [],
        "sub_categories" => [
            "3.1" => [
                "content" => ["foo"],
                "sub_categories" => []
            ],
            "3.2" => [
                "content" => [],
                "sub_categories" => []
            ]
        ]
    ]
];

Result:

[
    "1" => [
        "content" => ["foo"],
        "sub_categories" => [
            "1.1" => [
                "content" => ["foo"]
            ]
        ]
    ],
    "2" => [
        "content" => ["foo"]
    ],
    "3" => [
        "sub_categories" => [
            "3.1" => [
                "content" => ["foo"]
            ]
        ]
    ]
]

CodePudding user response:

As I understood your request, you have a tree and you want to clean it so as no branch without leaf remains in it.

In that case, I would like to propose a generic solution, working no matter the key names you chose. This solution is based on array_walk() and array_walk_recursive() functions. The first one allowing to walk the tree (or the branch) and the second one allowing to count the number of leafs in a tree (or branch), no matter the number of dimensions in it.

Here is a proposition:

function cleanTree($tree, $cleanTree = []) {

    // Walk the tree
    array_walk($tree, function ($branch, $node) use (&$cleanTree) {

        // Here is a leaf (the lowest element in the tree), we keep it
        if (!is_array($branch)) {
            $cleanTree[$node] = $branch;

        // Here is a branch that may contain some leafs
        } else {

            // Count the leafs of this branch
            $leafs = 0;
            array_walk_recursive($branch, function () use (&$leafs) { $leafs  ; });

            // There is at least one leaf on this branch, so we walk it
            if ($leafs) {
                $cleanTree[$node] = cleanTree($branch);
            }
        }

    });

    return $cleanTree;

}

Using this function, all the branches without leaf are not kept.

Declare your tree as an array then clean it:

$myTree = [...];
$myCleanTree = cleanTree($myTree);

I tried it with the array sample provided by @lukas.j (thank you for this effort) and it worked.

  • Related