Convert text into JSON using a pandas DataFrame with a custom delimiter

Time:04-20

I have a text file called sample.txt. It contains data like this:

0: 480x640 2 persons, 1 tv, 1: 480x640 5 persons,  2 tvs, 1 oven, Done. (0.759s) Mon, 04 April 11:39:48 status : Low
0: 480x640 2 persons, 1 tv, 1: 480x640 4 persons, 3 chairs,  1 oven, Done. (0.763s) Mon, 04 April 11:39:50 status : High

Every line in the file follows this layout.

I tried this code to convert the text file into JSON format:

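import pandas as pd

# Split on the "<n>: <width>x<height>" camera prefixes and on the "Done. (...)" timing block
cam_details = pd.read_csv('sample.txt', sep=r'(?:,\s*|^)(?:\d+: \d+x\d+|Done[^)]+\)\s*)',
                 header=None, engine='python', names=(None, 'a', 'b', 'date', 'status')).iloc[:, 1:]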


cam_details.to_json('output.json', orient = "records", date_format = "epoch", double_precision = 10, 
                        force_ascii = True, date_unit = "ms", default_handler = None)

I tried this code but didn't get JSON in the right format. How can I convert the text into the JSON format shown below using a pandas DataFrame with a custom delimiter?

I got output like this:

{
        "a": " 2 persons, 1 tv, 1 laptop, 1 clock",
        "b": " 4 persons, 1 car, 1 bottle, 3 chairs, 2 tvs, 1 oven",
        "date": "Mon, 04 April 11:39:51 status : Low"
    }

I expected to convert it into a JSON file like this:

[
    {
        "a": " 2 persons, 1 tv, 1 laptop, 1 clock",
        "b": " 5 persons, 1 bottle, 3 chairs, 2 tvs, 1 cell phone, 1 oven",
        "date": "Mon, 04 April 11:39:48" , 
        "status": "Low"
    },
    {
        "a": " 2 persons, 1 tv, 1 laptop, 2 clocks",
        "b": " 4 persons, 1 car, 3 chairs, 2 tvs, 1 laptop, 1 oven",
        "date": "Mon, 04 April 11:39:50",
        "status": "Low"
    } ]

Answer:

The problem seems to be that no delimiter separates the status part from the date, so both end up in the date column. You can work around it by adding a processing step in pandas that splits the date column on the status keyword and strips out the colon before writing to JSON:

# Splits the date part and the status part into two columns (your status is being dragged into the date column)
cam_details[['date', 'status']] = cam_details['date'].map(lambda x: x.split('status')).tolist()

# Clean up the status column which still has the colons and extra whitespaces
cam_details['status'] = cam_details['status'].map(lambda x: x.replace(':', '').strip())
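
For comparison, here is a minimal pandas-free sketch of the same idea: each line is matched against one regular expression and the date and status are captured as separate fields. The pattern LINE_RE and the helper parse_line are hypothetical names introduced for illustration, and the sketch assumes every line has exactly two camera blocks followed by "Done. (...)", a date, and "status : <value>", as in the two sample lines from the question.

```python
import json
import re

# Hypothetical pattern (not from the original post). It assumes two camera
# blocks, then "Done. (<time>)", then the date, then "status : <value>".
LINE_RE = re.compile(
    r'^\d+: \d+x\d+ (?P<a>.*?),\s*'   # first camera block, e.g. "0: 480x640 ..."
    r'\d+: \d+x\d+ (?P<b>.*?),\s*'    # second camera block, e.g. "1: 480x640 ..."
    r'Done\. \([^)]*\)\s*'            # timing such as "(0.759s)" is discarded
    r'(?P<date>.*?)\s*status\s*:\s*(?P<status>\S+)\s*$'
)


def parse_line(line):
    """Return a dict with keys a, b, date, status, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {key: value.strip() for key, value in match.groupdict().items()}


# The two sample lines from the question.
sample = [
    "0: 480x640 2 persons, 1 tv, 1: 480x640 5 persons,  2 tvs, 1 oven, "
    "Done. (0.759s) Mon, 04 April 11:39:48 status : Low",
    "0: 480x640 2 persons, 1 tv, 1: 480x640 4 persons, 3 chairs,  1 oven, "
    "Done. (0.763s) Mon, 04 April 11:39:50 status : High",
]

records = [rec for rec in map(parse_line, sample) if rec is not None]
print(json.dumps(records, indent=4))
```

To write a file instead of printing, json.dump(records, f, indent=4) on an open file handle should give the list-of-objects layout the question asks for. Lines that do not fit the assumed layout are skipped silently here, which may or may not be what you want for real logs.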