I have a dataset that represents objects in a hierarchy (there are no cycles). I want to analyse it in Contour and, for each object, work out the list of its top-level related objects.
Say object A depends on objects B and C, and object C in turn depends on objects D and E. I want to find the "final", or highest-level, dependencies of A, and I expect the result to be B, D and E.
CodePudding user response:
While the SQL world has dedicated constructs for hierarchical queries (look for CONNECT BY), the engine underpinning Contour and Palantir Foundry overall (i.e. Apache Spark) has no built-in recursive construct.
So, whilst it is possible to perform recursive queries with custom functions, I strongly doubt it would be feasible to implement them in Contour.
CodePudding user response:
Given that PySpark is used with Palantir Foundry, you can simulate CONNECT BY or a recursive CTE using PySpark DataFrames.
This excellent read https://medium.com/globant/how-to-implement-recursive-queries-in-spark-3d26f7ed3bc9 shows you how.
Out of the box, Spark SQL has no such capability.
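To make the idea concrete, here is a minimal sketch of the iterative approach in plain Python rather than Spark, so it runs anywhere. The function name `top_level_dependencies` and the `(object, dependency)` edge format are my own assumptions, not anything from Foundry or the linked article. Each pass of the loop stands in for one join of the current frontier against the edge list; in PySpark you would express that pass as a DataFrame join and repeat it until no rows are left to expand. I have not checked whether the same loop can be built in Contour.

```python
from collections import defaultdict

def top_level_dependencies(edges):
    """Map every object to its set of "final" (leaf) dependencies.

    edges: iterable of (obj, dependency) pairs, assumed to be acyclic.
    """
    children = defaultdict(set)
    for obj, dep in edges:
        children[obj].add(dep)

    # Start with one (root, current) row per direct dependency.
    frontier = {(obj, dep) for obj, dep in edges}
    result = defaultdict(set)

    # Each pass plays the role of one join against the edge list.
    while frontier:
        next_frontier = set()
        for root, current in frontier:
            if current in children:
                # Not final yet: step one level further down.
                next_frontier.update((root, dep) for dep in children[current])
            else:
                # No outgoing edges: current is a final dependency of root.
                result[root].add(current)
        frontier = next_frontier
    return dict(result)

edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E")]
deps = top_level_dependencies(edges)
print(sorted(deps["A"]))  # ['B', 'D', 'E']
```

The loop stops only because the hierarchy has no cycles, as stated in the question; with cyclic data you would need a depth limit or a visited check.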