I have a dataframe with a wide amount of text in one of the columns. It looks something like this:
[{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:07:58', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097683000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:08:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097712000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:08:57', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097742000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:09:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097772000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:09:57', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097803000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:10:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097833000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:10:58', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097863000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:11:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097892000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:11:57', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097922000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:12:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097952000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:12:57', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097983000'}}, {'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:13:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620098013000'}}
I want to be able to delete everything that isn't in a line that matches minute 59 and keep the line where it does so the column would only show:
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:59:58', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '16200976821350'}}
I tried using regex but I can't seem to make it work.
/({'occupancy_state': '[1],[0-9] ,[0-9] ,[0-9] ', 'rtc_utc_time': '[2][0][0-9][0-9]-[0-1][0-9]-[0-9][0-9] [0-2][0-9]:*59:[0-9][0-9]', 'sdcard_site': '[a-zA-Z] ', 'sdcard_chain': '[a-zA-Z] ', 'sdcard_line': '[0-9] ,[a-zA-Z] ', 'utc': {'.numberLong': '[0-9] '}})/g
Any help is appreciated
CodePudding user response:
Using following Code you can select dictionaries with 59 minutes:
[i for i in myList if i['rtc_utc_time'].partition(":")[2][:2] == '59']
NB : Your provided input not contaning any dictionary
with 59
minute.
CodePudding user response:
use a list comprehension and split to find the minutes on the rtc_utc_time
data=[{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:07:58', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097683000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:08:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097712000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:08:57', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097742000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:09:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097772000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:59:57', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097803000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:10:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097833000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:10:58', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097863000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:11:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097892000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:59:57', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097922000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:12:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097952000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:12:57', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620097983000'}},
{'occupancy_state': '1,0,0,0', 'rtc_utc_time': '2021-05-04 03:13:27', 'sdcard_site': 'BE', 'sdcard_chain': 'DAG', 'sdcard_line': '1,BE', 'utc': {'$numberLong': '1620098013000'}}]
[print (row) for row in data if row["rtc_utc_time"].split(':')[1]!= '59']