Merge multiple JSONL files from a folder using Python

Time:10-02

I'm looking for a way to merge multiple JSONL files from one folder using a Python script, something like the script below, which works for JSON files.

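import json
import glob

result = []
for f in glob.glob("*.json"):
    # Each regular JSON file holds a single document, so json.load reads it whole.
    with open(f) as infile:
        result.append(json.load(infile))

# json.dump writes str, so the file must be opened in text mode ("w"), not "wb".
with open("merged_file.json", "w") as outfile:
    json.dump(result, outfile)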

Does anyone know how I can handle loading JSONL files instead? Thank you.

Best,

CodePudding user response:

You can update a main dict with every JSON object you load, like this:

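import json
import glob

result = {}
for f in glob.glob("*.json"):
    with open(f) as infile:
        for line in infile:  # one JSON object per line
            if line.strip():  # skip blank lines
                result.update(json.loads(line))  # merge the dicts

# json.dump writes str, so open the file in text mode ("w"), not "wb".
with open("merged_file.json", "w") as outfile:
    json.dump(result, outfile)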

But this will overwrite the values of any keys that appear in more than one object!
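To make the overwriting concrete, here is a minimal sketch that compares updating a single dict with collecting every object in a list. The helper names `merge_as_dict` and `merge_as_list` and the sample records are made up for illustration; they are not part of the answer above.

```python
import json


def merge_as_dict(lines):
    """Merge JSONL lines into one dict; later objects overwrite earlier keys."""
    merged = {}
    for line in lines:
        if line.strip():  # skip blank lines
            merged.update(json.loads(line))
    return merged


def merge_as_list(lines):
    """Keep every JSONL line as its own object, so nothing is overwritten."""
    return [json.loads(line) for line in lines if line.strip()]


# Hypothetical sample: two records that share the same keys.
sample = ['{"id": 1, "name": "a"}', '{"id": 2, "name": "b"}']

print(merge_as_dict(sample))  # {'id': 2, 'name': 'b'}
print(merge_as_list(sample))  # [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}]
```

If the records in your files share keys, as JSONL records usually do, the list version is probably closer to what you want.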

CodePudding user response:

Since each line in a JSONL file is a complete JSON object, you don't actually need to parse the JSONL files at all in order to merge them into another JSONL file. Instead, merge them by simply concatenating them:

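import glob

# Write the output under a name that "*.json" does not match, so it is not read back as an input.
with open("merged_file.jsonl", "w") as outfile:
    for filename in glob.glob("*.json"):
        with open(filename) as infile:
            outfile.write(infile.read())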
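One caveat with plain concatenation: if an input file does not end with a newline, its last record gets glued to the first record of the next file. A slightly more defensive sketch, assuming the inputs are UTF-8 text, could copy line by line instead. The function name `concat_jsonl` and its parameters, including the default `*.jsonl` pattern, are illustrative choices, not an existing API.

```python
import glob
import os


def concat_jsonl(folder, output_path, pattern="*.jsonl"):
    """Concatenate the JSONL files in `folder` into `output_path`.

    Returns the number of lines written. Still no JSON parsing: lines are
    copied as text, each one terminated with exactly one newline.
    """
    output_abs = os.path.abspath(output_path)
    # List the inputs before opening the output, and never read the output itself.
    inputs = [
        path
        for path in sorted(glob.glob(os.path.join(folder, pattern)))
        if os.path.abspath(path) != output_abs
    ]
    written = 0
    with open(output_path, "w", encoding="utf-8") as outfile:
        for path in inputs:
            with open(path, encoding="utf-8") as infile:
                for line in infile:
                    if not line.strip():
                        continue  # skip blank lines
                    outfile.write(line.rstrip("\n") + "\n")
                    written += 1
    return written
```

For example, `concat_jsonl(".", "merged_file.jsonl", pattern="*.json")` would merge the files matched in the snippet above. Sorting the paths only makes the output order repeatable between runs.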