I have a .txt file. The format is like this:
12:12 {"name": "alice", "id":"1", "password":"123"}
12:14 {"name": "bob", "id":"2", "password":"1fsdf3"}
12:18 {"name": "claire", "id":"3", "password":"12fs3"}
I want to convert it to a pandas DataFrame with the columns [timestamp, name, id, password], where each column holds the corresponding values. Any idea how to do it? Much appreciated!
CodePudding user response:
Create the rows of the DataFrame by processing one line of the file at a time. Then, call the DataFrame constructor:
import pandas as pd
import json

rows = []
with open("data.txt") as input_file:
    for line in input_file:
        line = line.strip()
        if line:  # skip blank lines
            # Split off the leading timestamp; the rest of the line is the JSON object
            timestamp, blob = line.split(maxsplit=1)
            # Use dict(**json.loads(blob), timestamp=timestamp) for Python <3.9
            blob = json.loads(blob) | dict(timestamp=timestamp)
            rows.append(blob)

result = pd.DataFrame(rows)
print(result)
This outputs:
     name id password timestamp
0   alice  1      123     12:12
1     bob  2   1fsdf3     12:14
2  claire  3    12fs3     12:18
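Note that the output above puts timestamp last, while the question asked for [timestamp, name, id, password]. One way to get that order is to pass columns= to the DataFrame constructor. The following is a minimal sketch, not the only approach: it assumes every line has the same three JSON keys, and it reads the sample lines from an in-memory io.StringIO (a stand-in for the real file) so it runs on its own.

```python
import io
import json

import pandas as pd

# Sample data copied from the question; a real script would open the .txt file instead.
sample = io.StringIO(
    '12:12 {"name": "alice", "id":"1", "password":"123"}\n'
    '12:14 {"name": "bob", "id":"2", "password":"1fsdf3"}\n'
    '12:18 {"name": "claire", "id":"3", "password":"12fs3"}\n'
)

rows = []
for line in sample:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    # Split off the leading timestamp; the rest of the line is the JSON object.
    timestamp, blob = line.split(maxsplit=1)
    # Dict unpacking also works on Python versions before 3.9.
    rows.append({"timestamp": timestamp, **json.loads(blob)})

# columns= fixes the column order explicitly, whatever the key order in the JSON.
result = pd.DataFrame(rows, columns=["timestamp", "name", "id", "password"])
print(result)
```

All values, including id, stay as strings because that is how they appear in the JSON. Convert them afterwards (for example with result["id"].astype(int)) if numeric columns are needed.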