I am trying to use Jolt to keep only the files whose size is greater than 0. This is my input:
{
  "objectName": "data",
  "path": "/user/testuser/test_project/processing",
  "type": "directory",
  "owner": "testuser",
  "group": "testuser",
  "length": "2733",
  "countFiles": "7",
  "countDirs": "1",
  "content": [
    {
      "objectName": "part-00000-f56d8bfa-2a3d-438c-89a5-d9a2460e6c66-c000.json",
      "path": "/user/testuser/test_project/processing/data",
      "type": "file",
      "owner": "testuser",
      "group": "testuser",
      "length": "0"
    },
    {
      "objectName": "part-00043-f56d8bfa-2a3d-438c-89a5-d9a2460e6c66-c000.json",
      "path": "/user/testuser/test_project/processing/data",
      "type": "file",
      "owner": "testuser",
      "group": "testuser",
      "length": "782"
    }
  ]
}
Below is my Jolt spec:
[
  {
    "operation": "shift",
    "spec": {
      "content": {
        "*": {
          "type": {
            "file": {
              "@2": "files[]"
            }
          }
        }
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "files": {
        "*": {
          "objectName": {
            "*": {
              "@2": "files[]"
            }
          }
        }
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "files": {
        "*": {
          "objectName": "files.[&1].filename",
          "path": "files.[&1].filepath",
          "length": "files.[&1].filesize"
        }
      }
    }
  }
]
In the output I want only the files whose size is greater than 0.
Currently it gives:
{
  "files": [
    {
      "filename": "part-00000-f56d8bfa-2a3d-438c-89a5-d9a2460e6c66-c000.json",
      "filepath": "/user/testuser/test_project/processing/data",
      "filesize": "0"
    },
    {
      "filename": "part-00043-f56d8bfa-2a3d-438c-89a5-d9a2460e6c66-c000.json",
      "filepath": "/user/testuser/test_project/processing/data",
      "filesize": "782"
    }
  ]
}
I tried using a remove spec, but I can only find examples of removing null values; I am not sure how to remove entries based on a number or how to apply a filter such as length > 0.
I also tried using the shift operation with "0" and "*" on the length attribute, but it doesn't remove the entry.
Answer:
You can use successive shift transformations with conditional logic that separates the case where length is "0" from the case where it is not, and then use a remove transformation to drop the unneeded attribute, such as:
[
  {
    "operation": "shift",
    "spec": {
      "*": "&", // elements other than the "content" array
      "content": {
        "*": {
          "length": {
            "0": "AttributeToRemove", // case when length = 0
            "*": { // case when length != 0
              "@(2,objectName)": "files.&3.filename",
              "@(2,path)": "files.&3.filepath",
              "@1": "files.&3.filesize"
            }
          }
        }
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": "&",
      "fi*": {
        "*": {
          "@": "&2[]" // &2 goes 2 levels up the tree to reuse the literal `files` as the output key
        }
      }
    }
  },
  {
    // get rid of the redundant attribute generated by the length = 0 case
    "operation": "remove",
    "spec": {
      "AttributeToRemove": ""
    }
  }
]
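To cross-check what the "files" part of the result should contain, the same filter can be sketched outside Jolt. The snippet below is a plain Python illustration, not part of the Jolt spec: the function name nonempty_files and the trimmed sample input are hypothetical, and it deliberately ignores the top-level attributes that the "*": "&" lines in the spec carry over. It also converts length to an integer before comparing, whereas the spec matches the literal string "0".

```python
import json


def nonempty_files(listing: dict) -> dict:
    """Keep only the "content" entries whose length is greater than zero."""
    files = [
        {
            "filename": entry["objectName"],
            "filepath": entry["path"],
            "filesize": entry["length"],
        }
        for entry in listing.get("content", [])
        # "length" arrives as a string in the input, so convert before comparing
        if int(entry["length"]) > 0
    ]
    return {"files": files}


# Trimmed copy of the input from the question (only the fields the filter reads)
sample = {
    "objectName": "data",
    "content": [
        {
            "objectName": "part-00000-f56d8bfa-2a3d-438c-89a5-d9a2460e6c66-c000.json",
            "path": "/user/testuser/test_project/processing/data",
            "length": "0",
        },
        {
            "objectName": "part-00043-f56d8bfa-2a3d-438c-89a5-d9a2460e6c66-c000.json",
            "path": "/user/testuser/test_project/processing/data",
            "length": "782",
        },
    ],
}

result = nonempty_files(sample)
print(json.dumps(result, indent=2))
```

Only the part-00043 entry should remain under "files", which is the entry to look for when running the spec against the same input.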