Remove rows containing a specific word in a loop

Time:01-17

I have a file, events.txt, that contains multiple records for delete events. How can I remove the delete events?

The records in events.txt look like this:

delete|109393509715446004

{"id": 109472787571426436, "created_at": "2022-12-07T14:09:27 00:00", "in_reply_to_id": null, "in_reply_to_account_id": null, "sensitive": false}

{"id": 109472787901758948, "created_at": "2022-12-07T14:09:37 00:00", "in_reply_to_id": null, "in_reply_to_account_id": null, "sensitive": false}

delete|109393512606515336

{"id": 109472787957427984, "created_at": "2022-12-07T14:09:38 00:00","in_reply_to_id": null, "in_reply_to_account_id": null, "sensitive": false}

I used the approach below to read the file and transform each line:

with open('events.txt',encoding='utf-8') as f:
    for line in f:
        event = line.replace('update|', '').replace('status.update|', '').replace('status.','')
        print(type(event))
        print(event)

type of event - <class 'str'>

Please suggest how I can remove or skip the delete event rows while processing the file in the loop above.

CodePudding user response:

It looks like the lines in the file you care about are valid JSON, while the lines you want to ignore are not. If true, and assuming there is no possibility of a JSON decode error with your valid entries, then you could leverage that difference like this:

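import json

# events.txt is the file from the question
with open("events.txt", encoding="utf-8") as file:
    for line in file:
        try:
            # Valid JSON lines are the events to keep
            d = json.loads(line)
            print(d)
        except json.JSONDecodeError:
            # Rows such as "delete|109393509715446004" (and blank lines) are not JSON, so skip them
            pass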

Output:

{'id': 109472787571426436, 'created_at': '2022-12-07T14:09:27 00:00', 'in_reply_to_id': None, 'in_reply_to_account_id': None, 'sensitive': False}
{'id': 109472787901758948, 'created_at': '2022-12-07T14:09:37 00:00', 'in_reply_to_id': None, 'in_reply_to_account_id': None, 'sensitive': False}
{'id': 109472787957427984, 'created_at': '2022-12-07T14:09:38 00:00', 'in_reply_to_id': None, 'in_reply_to_account_id': None, 'sensitive': False}
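
If you would rather not rely on the JSON decode error, another option is to skip the delete rows explicitly by checking how each line starts. The following is only a minimal sketch: the helper name iter_events and the PREFIXES tuple are made up for illustration, and it assumes that delete rows always begin with "delete|" and that the "update|" / "status.update|" prefixes from the question appear only at the start of a line.

```python
import json

# Hypothetical prefixes, taken from the replace() calls in the question.
# Assumption: they only ever appear at the very start of a line.
PREFIXES = ("status.update|", "update|")


def iter_events(lines):
    """Yield parsed events, skipping blank lines and delete rows."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and rows such as "delete|109393509715446004".
        if not line or line.startswith("delete|"):
            continue
        # Strip a leading prefix, if present, so that only the JSON remains.
        for prefix in PREFIXES:
            if line.startswith(prefix):
                line = line[len(prefix):]
                break
        yield json.loads(line)


# Small in-memory sample; the "update|" row is an assumed format.
sample = [
    'delete|109393509715446004\n',
    '{"id": 109472787571426436, "sensitive": false}\n',
    '\n',
    'update|{"id": 109472787901758948, "sensitive": false}\n',
]

print([event["id"] for event in iter_events(sample)])
# prints [109472787571426436, 109472787901758948]
```

To run it against the real file, pass the open file object instead of the sample list, for example by opening events.txt with encoding='utf-8' and looping over iter_events(f). Unlike the try/except version, this one raises an error on a malformed JSON line instead of silently dropping it, which may or may not be what you want.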