Does panda can convert the key value to customized table. Here is the sample of the data.
1675484100 customer=A.1 area=1 height=20 width={10,10} length=1
1675484101 customer=B.1 area=10 height=30 width={20,11} length=2
1675484102 customer=C.1 area=11 height=40 width={30,12} length=3 remarks=call
Generate a table with key as a header and the associated value. First field as a time.
CodePudding user response:
I would use a regex to get each key/value pair, then reshape:
data = '''1675484100 customer=A.1 area=1 height=20 width={10,10} length=1
1675484101 customer=B.1 area=10 height=30 width={20,11} length=2
1675484102 customer=C.1 area=11 height=40 width={30,12} length=3 remarks=call'''
df = (pd.Series(data.splitlines()).radd('time=')
.str.extractall(r'([^\s=] )=([^\s=] )')
.droplevel('match').set_index(0, append=True)[1]
# unstack keeping order
.pipe(lambda d: d.unstack()[d.index.get_level_values(-1).unique()])
)
print(df)
Output:
0 time customer area height width length remarks
0 1675484100 A.1 1 20 {10,10} 1 NaN
1 1675484101 B.1 10 30 {20,11} 2 NaN
2 1675484102 C.1 11 40 {30,12} 3 call
CodePudding user response:
Assuming that your input is a string
defined as data
, you can use this :
L = [{k: v for k, v in (x.split("=") for x in l.split()[1:])}
for l in data.split("\n") if l.strip()]
df = pd.DataFrame(L)
df.insert(0, "time", [pd.to_datetime(int(x.split()[0]), unit="s")
for x in data.split("\n")])
Otherwise, if the data are stored in some sort of a (.txt
) file, add this at the beginning :
with open("file.txt", "r") as f:
data = f.read()
Output :
print(df)
time customer area height width length remarks
0 2023-02-04 04:15:00 A.1 1 20 {10,10} 1 NaN
1 2023-02-04 04:15:01 B.1 10 30 {20,11} 2 NaN
2 2023-02-04 04:15:02 C.1 11 40 {30,12} 3 call