Pandas key-value pair data frame

Can pandas convert key=value data into a customized table? Here is a sample of the data:

1675484100 customer=A.1 area=1 height=20 width={10,10} length=1
1675484101 customer=B.1 area=10 height=30 width={20,11} length=2
1675484102 customer=C.1 area=11 height=40 width={30,12} length=3 remarks=call

I want to generate a table with each key as a column header and the associated values underneath. The first field of each line should become the time column.

CodePudding user response：

I would use a regex to get each key/value pair, then reshape:

import pandas as pd

data = '''1675484100 customer=A.1 area=1 height=20 width={10,10} length=1
1675484101 customer=B.1 area=10 height=30 width={20,11} length=2
1675484102 customer=C.1 area=11 height=40 width={30,12} length=3 remarks=call'''

df = (pd.Series(data.splitlines()).radd('time=')
      .str.extractall(r'([^\s=]+)=([^\s=]+)')
      .droplevel('match').set_index(0, append=True)[1]
      # unstack keeping order
      .pipe(lambda d: d.unstack()[d.index.get_level_values(-1).unique()])
      )

print(df)

Output:


0        time customer area height    width length remarks
0  1675484100      A.1    1     20  {10,10}      1     NaN
1  1675484101      B.1   10     30  {20,11}      2     NaN
2  1675484102      C.1   11     40  {30,12}      3    call
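
To see what the regex step contributes before any reshaping, here is a minimal sketch that applies the same pattern to a single line with the standard `re` module instead of pandas. The names `PAIR`, `line`, `pairs` and `row` are illustrative only and not part of the answer above.

```python
import re

# Same pattern as in the answer: a key and a value, each a run of
# characters that are neither whitespace nor "=".
PAIR = re.compile(r'([^\s=]+)=([^\s=]+)')

line = ('1675484102 customer=C.1 area=11 height=40 '
        'width={30,12} length=3 remarks=call')

# Prefixing "time=" turns the leading timestamp into an ordinary
# key=value pair, which is what .radd('time=') does for every row.
pairs = PAIR.findall('time=' + line)

# Each match is a (key, value) tuple, so the matches form one table row.
row = dict(pairs)
print(row)
```

In the pandas version, `extractall` produces these same (key, value) pairs for every line at once, and `unstack` then pivots the keys into columns.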

CodePudding user response：

Assuming your input is a string named data, you can use this:

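import pandas as pd

# Skip blank lines so a trailing newline (e.g. from a file) does not break parsing
lines = [l for l in data.split("\n") if l.strip()]

# One dict per line, built from the key=value fields after the timestamp
L = [dict(x.split("=", 1) for x in l.split()[1:]) for l in lines]

df = pd.DataFrame(L)

# The first field of each line is a Unix timestamp in seconds
df.insert(0, "time", [pd.to_datetime(int(l.split()[0]), unit="s") for l in lines])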

If the data is stored in a text file instead, add this at the beginning:

with open("file.txt", "r") as f:
    data = f.read()

Output:

print(df)

                 time customer area height    width length remarks
0 2023-02-04 04:15:00      A.1    1     20  {10,10}      1     NaN
1 2023-02-04 04:15:01      B.1   10     30  {20,11}      2     NaN
2 2023-02-04 04:15:02      C.1   11     40  {30,12}      3    call
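
The dictionary approach can also be written as a small helper, which makes it easier to see how blank lines and a missing key such as remarks are handled. This is only a sketch: `parse_kv_lines` is a hypothetical name, it keeps the timestamp as a plain integer, and it uses only the standard library.

```python
def parse_kv_lines(text):
    """Parse 'timestamp key=value ...' lines into a list of dicts.

    Hypothetical helper for illustration; not part of pandas.
    """
    records = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:  # skip blank lines, e.g. after a trailing newline
            continue
        record = {"time": int(fields[0])}  # first field: Unix timestamp
        for field in fields[1:]:
            # partition splits on the first "=" only
            key, _, value = field.partition("=")
            record[key] = value
        records.append(record)
    return records


data = """1675484100 customer=A.1 area=1 height=20 width={10,10} length=1
1675484101 customer=B.1 area=10 height=30 width={20,11} length=2
1675484102 customer=C.1 area=11 height=40 width={30,12} length=3 remarks=call
"""

records = parse_kv_lines(data)
for record in records:
    print(record)
```

Passing `records` to `pd.DataFrame` should give one column per key, with NaN where a line has no remarks entry, and `pd.to_datetime(df["time"], unit="s")` would then convert the integer timestamps as in the answer above.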