Splitting a Value JSON

Time:09-30

Given a JSON file, I would like to split a value into parts based on the markers [..] and |..|, but only if they appear at least 2 times. By split, I mean taking the single-line "Time" string and separating it into pieces when it contains 2 or more [..] or |..| markers.

[
     {
         "Action":"Walk",
         "Time":"1 hour [c] 2 hour [dog] 1 hour [p]"
     },
     {
         "Action":"Pet",
         "Time":"1 hour [cat] 2 hour |d|"
     },
     {
         "Action":"F",
         "Time":"1 hour [cat]"
     }
]

Desired Result

[
     {
         "Action":"Walk",
         "Time":[
             "1 hour [c]",
             "2 hour [dog]",
             "1 hour [p]"
         ]
     },
     {
         "Action":"Pet",
         "Time":[
             "1 hour [cat]",
             "2 hour |d|"
         ]
     },
     {
         "Action":"F",
         "Time":"1 hour [cat]"
     }
]

Here is my code:

import json

with open(filenames, "r") as f:
    data = json.load(f)


CodePudding user response:

A regex can solve that easily, but it is quite hard to read. I recommend checking out the cheatsheet on regexr and possibly looking into the documentation for re.findall.

But here you go - that code should do what you asked for:

import json
import re

# Read the file first, then reopen it for writing ("rw" is not a valid mode)
with open(filenames, "r") as f:
    data = json.load(f)

for action in data:
    parts = [
        time_part.strip()
        for time_part
        in re.findall(r".*?(?:(?:\[.*?\])|(?:\|.*?\|))", action["Time"])
    ]
    if len(parts) >= 2:  # fewer than two markers: keep the original string, not an array
        action["Time"] = parts

with open(filenames, "w") as f:
    json.dump(data, f)

Here is the regex with the special-character escapes (like \[ for [), the non-capturing groups (the (?:...) parts) and the lazy matching removed, just so it's easier to understand. (Regex quantifiers match as many characters as possible by default; using *? instead is called lazy matching.)

This is basically the logic used above: .*([.*] OR |.*|)
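
To make the pattern easier to follow, here is a minimal sketch of the same regex written with re.VERBOSE so each piece can carry a comment. The names CHUNK and split_time are made up for this illustration and are not part of the answer above; the sample strings come from the question.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE.
# CHUNK and split_time are hypothetical names used only for this sketch.
CHUNK = re.compile(
    r"""
    .*?              # as little leading text as possible (lazy)
    (?:              # non-capturing group: one of the two markers
        \[ .*? \]    #   a bracketed marker such as [dog]
      | \| .*? \|    #   or a piped marker such as |d|
    )
    """,
    re.VERBOSE,
)


def split_time(value):
    """Return a list of parts if value holds 2+ markers, else value unchanged."""
    parts = [part.strip() for part in CHUNK.findall(value)]
    return parts if len(parts) >= 2 else value


print(split_time("1 hour [c] 2 hour [dog] 1 hour [p]"))
# ['1 hour [c]', '2 hour [dog]', '1 hour [p]']
print(split_time("1 hour [cat] 2 hour |d|"))
# ['1 hour [cat]', '2 hour |d|']
print(split_time("1 hour [cat]"))
# 1 hour [cat]
```

Note that any text after the last marker is not matched by this pattern, so it would be dropped from the parts; the sample data never has such trailing text.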
