Filter keys with specific JSON values in JSON-LD files

I have a gzip-compressed file (.gz) which, when decompressed, contains one JSON object per line. Below is one sample line. I am trying to extract specific fields to a CSV file using jq, but only for records whose type key has the value dissertation.

{
  "id": "https://openalex.org/W2777209504",
  "doi": "https://doi.org/10.24026/1818-1384.1(42).2013.77470",
  "display_name": "Hyperandrogenism as a factor of reproductive losses",
  "title": "Hyperandrogenism as a factor of reproductive losses",
  "publication_year": 2013, 
  "publication_date": "2013-03-27",
  "ids": {
    "openalex": "https://openalex.org/W2777209504",
    "doi": "https://doi.org/10.24026/1818-1384.1(42).2013.77470",
    "mag": 2777209504
  },
  "type": "journal-article",
  "counts_by_year": [
    {
      "year": 2019,
      "cited_by_count": 1
    }
  ],
  "cited_by_api_url": "https://api.openalex.org/works?filter=cites:W2777209504",
  "updated_date": "2021-11-03",
  "created_date": "2018-01-05",
  "abstract_inverted_index": {}
}

I tried the two commands below, and neither of them worked:

gzcat -c sample.gz | jq -rc '[.doi,.title, .publication_year, .publication_date, .type] | select(.type |contains("dissertation")) | @csv'>target.csv
gzcat -c sample.gz | jq -rc '[.doi,.title, .publication_year, .publication_date, .type] | select(.type=="dissertation") | @csv'>target.csv

The output received for both of them is:
jq: error (at <stdin>:108753): Cannot index string with string "title"

I have tried every way I can think of to filter my JSON-LD file, without success. Any pointers would be a great help.

CodePudding user response:

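In both of your attempts, the select is incorrectly formulated (or in the wrong place, depending on your point of view): it runs after the array has been built, so .type is applied to the array rather than to the original object. Select on the object first, then build the array. This would work: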

select(.type == "dissertation")
| [.doi,.title, .publication_year, .publication_date, .type]
| @csv
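
For completeness, here is a minimal end-to-end sketch of that filter inside a full pipeline. The file names sample_demo.gz and target_demo.csv and the three input lines are made up for illustration. The leading objects filter is an assumption on my part: the error "Cannot index string with string" suggests that at least one input line is a JSON string rather than an object, and objects simply skips such lines. If every line really is an object, it can be dropped. gzip -dc is used here as a portable stand-in for gzcat -c.

```shell
# Build a tiny gzipped sample (hypothetical data): two objects and one stray JSON string.
printf '%s\n' \
  '{"doi":"https://doi.org/10.1/a","title":"Thesis A","publication_year":2013,"publication_date":"2013-03-27","type":"dissertation"}' \
  '{"doi":"https://doi.org/10.1/b","title":"Paper B","publication_year":2014,"publication_date":"2014-01-01","type":"journal-article"}' \
  '"not an object"' \
  | gzip > sample_demo.gz

# 1. objects: skip inputs that are not JSON objects (assumed cause of the error).
# 2. select:  keep only records whose type is exactly "dissertation".
# 3. [...]:   build the row from the selected object.
# 4. @csv:    format the row as one CSV line; -r prints it without JSON quoting.
gzip -dc sample_demo.gz \
  | jq -r 'objects
           | select(.type == "dissertation")
           | [.doi, .title, .publication_year, .publication_date, .type]
           | @csv' > target_demo.csv

cat target_demo.csv
```

With the sample above, target_demo.csv should contain a single line for "Thesis A", with the strings double-quoted and the year left as a bare number. The journal article and the stray string are both dropped.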