Convert a txt file to dataframe

How do I convert a .txt file with the following form of data to a pandas dataframe?

For example, the txt file has the structure (x1, y1, z1), (x2, y2, z2), ... (xn, yn, zn), like this:

(108.222994365147, 16.077177357808345, 17.5), (108.22299074891866, 16.07718225312858, 17.5), (108.2229869226013, 16.077186986051835, 17.5), (108.22298289347849, 16.077191547568788, 17.5),

After converting, I want it to look like this:

                    x                     y       z
1    108.222994365147    16.077177357808345    17.5   
2  108.22299074891866     16.07718225312858    17.5
3   108.2229869226013    16.077186986051835    17.5
4  108.22298289347849    16.077191547568788    17.5

CodePudding user response：

This approach would solve your problem

import pandas as pd
import re

with open({YOUR_FILE_LOCATION}, "r") as f:
    s = f.read()

# Each group captures one number: one or more digits or dots
pattern = re.compile(r"\(([\d\.]+),[ ]*([\d\.]+),[ ]*([\d\.]+)\)")
pd.DataFrame(pattern.findall(s), columns=["x","y","z"]).astype(float)

OUTPUT

            x          y     z
0  108.222994  16.077177  17.5
1  108.222991  16.077182  17.5
2  108.222987  16.077187  17.5
3  108.222983  16.077192  17.5

Once the file is read, every pattern of interest (three comma-separated floats inside parentheses) is matched and passed to the DataFrame constructor as a list of tuples of strings. Everything is then cast to float.
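
As a minimal sketch of the steps described above, here is the same idea run on the sample text from the question instead of a file. The in-memory `text` variable and the 1-based index (to match the table in the question) are assumptions added for illustration.

```python
import re

import pandas as pd

# Sample text copied from the question; in practice this would come from f.read().
text = (
    "(108.222994365147, 16.077177357808345, 17.5), "
    "(108.22299074891866, 16.07718225312858, 17.5), "
    "(108.2229869226013, 16.077186986051835, 17.5), "
    "(108.22298289347849, 16.077191547568788, 17.5),"
)

# Three capture groups per "(x, y, z)" triple; "+" means one or more digits or dots.
pattern = re.compile(r"\(([\d.]+),\s*([\d.]+),\s*([\d.]+)\)")

# findall returns a list of 3-tuples of strings, one tuple per triple.
rows = pattern.findall(text)

df = pd.DataFrame(rows, columns=["x", "y", "z"]).astype(float)

# Assumption: shift to a 1-based index, as in the table the question asks for.
df.index = df.index + 1

print(df)
```

Note that this pattern only matches unsigned decimals, so negative values or scientific notation would need a wider character class.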

CodePudding user response：

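Try this. It assumes the file has one space-separated x y z triple per line, so the single-line parenthesized format in the question would need to be reshaped first:

import pandas as pd

data = pd.read_csv('file1.txt', sep=" ", header=None)
data.columns = ["x", "y", "z"]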

CodePudding user response：

An alternative solution using io.StringIO and pd.read_csv:

import pandas as pd
from io import StringIO

with open({YOUR_FILE_LOCATION}, "r") as file:
    data = file.read()

data = data.replace('(', '').replace('),', '\n')[:-1]
df = pd.read_csv(StringIO(data), header=None)
df.columns = ["x", "y", "z"]

Output:

    x           y           z
0   108.222994  16.077177   17.5
1   108.222991  16.077182   17.5
2   108.222987  16.077187   17.5
3   108.222983  16.077192   17.5
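
The printed output shows six decimal places only because of pandas' default display precision; the stored values keep their full precision. Here is a minimal sketch, assuming the data sits in an in-memory string rather than a file, that runs the same replace-then-read_csv idea and prints more digits. The `names` and `skipinitialspace` arguments and `pd.option_context` are swapped in for convenience and are not part of the answer above.

```python
from io import StringIO

import pandas as pd

# Two triples copied from the question; in practice this would come from file.read().
text = (
    "(108.222994365147, 16.077177357808345, 17.5), "
    "(108.22299074891866, 16.07718225312858, 17.5),"
)

# Drop "(" and turn each ")," into a line break so every triple becomes one CSV row.
csv_text = text.replace("(", "").replace("),", "\n")

df = pd.read_csv(
    StringIO(csv_text),
    header=None,
    names=["x", "y", "z"],
    skipinitialspace=True,  # ignore the space that follows each comma
)

# Temporarily raise the display precision; the underlying floats are unchanged.
with pd.option_context("display.precision", 15):
    print(df)
```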