How to find all Confluence pages with PowerShell

Time:05-10

I'm trying to pull all Confluence pages in my instance that haven't been modified since 1/1/21. I was able to get all of the parent pages that haven't been modified since 1/1/21 fairly easily. However, I'm now trying to get all of the child pages.

I know get-confluencechildpage has a -recurse option (Output of script

CodePudding user response:

I've just tested the ConfluencePS module (v2.5.1, published by the AtlassianPS community project rather than by Atlassian itself) on a self-hosted Confluence instance (v7.3.5), and the Get-ConfluencePage cmdlet appears to return a flattened list of the entire document tree for a given space.

Based on that, your code would simply be:

# get all pages from a Confluence "Space"
$all_pages = Get-ConfluencePage -SpaceKey "myspacekey";

# filter all the pages to just get those last edited before a specified date
$timestamp = (get-date -Year 2021 -Month 1 -Day 1).Date;
$filtered  = $all_pages | where-object { $_.Version.When -lt $timestamp };

$filtered | format-table "Id"

   ID
   --
17714
67261
..etc
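
If you want to check the date filter on its own before pointing it at a live instance, here is a minimal sketch that runs it against stand-in objects. The sample IDs and dates are made up for illustration, and the objects only model the two properties the filter touches (ID and Version.When), so this is not what ConfluencePS actually returns in full. It assumes PowerShell 5.0 or later for the ::new() syntax.

```powershell
# Stand-in page objects (hypothetical data, not real Confluence output).
# Only ID and Version.When are modelled because the filter uses nothing else.
$samplePages = @(
    [pscustomobject]@{ ID = 101; Version = [pscustomobject]@{ When = [datetime]::new(2020, 6, 15) } }
    [pscustomobject]@{ ID = 102; Version = [pscustomobject]@{ When = [datetime]::new(2021, 3, 2) } }
    [pscustomobject]@{ ID = 103; Version = [pscustomobject]@{ When = [datetime]::new(2019, 11, 30) } }
)

# Midnight on 1 January 2021, built without depending on the current time.
$cutoff = [datetime]::new(2021, 1, 1)

# Keep only the pages whose latest version predates the cutoff.
# The @( ) wrapper guarantees an array even if zero or one page matches.
$stale = @($samplePages | Where-Object { $_.Version.When -lt $cutoff })

$stale | Format-Table ID
```

With the sample data above, pages 101 and 103 pass the filter and 102 does not. Against a real space you would swap $samplePages for the output of Get-ConfluencePage.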

Update

If, for whatever reason, you don't get all the pages in the Space returned from Get-ConfluencePage, you could do a depth-first search of the root pages in the tree using Get-ConfluenceChildPage:

# get root pages in the space
$rootPages = ...

# push root pages onto a stack
$stack = new-object System.Collections.ArrayList;
foreach( $rootPage in $rootPages )
{
    $null = $stack.Add($rootPage);
}

# initialise the result set
$all_pages = new-object System.Collections.ArrayList;

# while stack not empty
while( $stack.Count -gt 0 )
{

    # pop the top page off the stack
    $parent = $stack[$stack.Count - 1];
    $stack.RemoveAt($stack.Count - 1);

    # add the top page to the result set
    $null = $all_pages.Add($parent);

    # get child pages
    write-host "getting child pages for '$($parent.Title)' ($($parent.ID))";
    $children = Get-ConfluenceChildPage -PageId $parent.ID;
    write-host ($children | format-table | out-string);

    # push child pages onto the stack
    foreach( $child in $children )
    {
        $null = $stack.Add($child);
    }

}
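
The same traversal can also be wrapped in a function, which makes it easier to test without a Confluence connection. The sketch below is only one way to do it: Get-PageTree and its -GetChildren parameter are names I've made up, not part of ConfluencePS, and the in-memory tree is hypothetical test data. It uses a generic Stack instead of an ArrayList and tracks visited IDs so a page is never added twice. It assumes PowerShell 5.0 or later.

```powershell
# Hypothetical helper (not part of ConfluencePS): depth-first walk of a page tree.
# -GetChildren is a script block that takes a page and returns its child pages.
function Get-PageTree {
    param(
        [Parameter(Mandatory)] [object[]]    $RootPages,
        [Parameter(Mandatory)] [scriptblock] $GetChildren
    )

    $stack   = [System.Collections.Generic.Stack[object]]::new()
    $seen    = [System.Collections.Generic.HashSet[string]]::new()
    $results = [System.Collections.Generic.List[object]]::new()

    # push root pages onto the stack
    foreach ($root in $RootPages) { $stack.Push($root) }

    while ($stack.Count -gt 0) {
        $page = $stack.Pop()

        # HashSet.Add returns $false for an ID that was already visited
        if (-not $seen.Add([string]$page.ID)) { continue }
        $results.Add($page)

        # push child pages onto the stack, skipping the $null a childless page yields
        foreach ($child in @(& $GetChildren $page)) {
            if ($null -ne $child) { $stack.Push($child) }
        }
    }

    # emit the collected pages to the pipeline
    $results
}

# Hypothetical in-memory tree standing in for Confluence: parent ID -> child pages.
$childrenOf = @{
    '1' = @(
        [pscustomobject]@{ ID = 2; Title = 'Child A' }
        [pscustomobject]@{ ID = 3; Title = 'Child B' }
    )
    '2' = @(
        [pscustomobject]@{ ID = 4; Title = 'Grandchild' }
    )
}
$roots = @([pscustomobject]@{ ID = 1; Title = 'Root' })

$all = @(Get-PageTree -RootPages $roots -GetChildren { param($p) $childrenOf[[string]$p.ID] })
$all | Format-Table ID, Title
```

Against a real instance you would presumably pass something like { param($p) Get-ConfluenceChildPage -PageID $p.ID } as -GetChildren, though I haven't run that exact combination. The result can then go through the same Where-Object date filter as in the first snippet.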