How to convert a text file with variable number of delimiters into a dataframe?

Time:11-22

How can I convert the text data below into a dataframe? Also, is there a way to apply the explode function to certain columns only, say data3 and data4, while ignoring the first two data points (data1 and data2)?

Attribute1,data1,data2
Attribute2,data1,data2,data3,data4
Attribute3,data1,data2,data3
Attribute4,data1,data2,data3,data4,data5,data6

The dataframe built from the text should look like this:

Attribute1|data1|data2
Attribute2|data1|data2|data3|data4
Attribute3|data1|data2|data3
Attribute4|data1|data2|data3|data4|data5|data6

The output of the explode step should look like this:

Attribute2|data3
Attribute2|data4
Attribute3|data3
Attribute4|data3
Attribute4|data4
Attribute4|data5
Attribute4|data6

CodePudding user response:

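import pandas as pd

# ';' never occurs in the file, so each line lands in a single column
df = pd.read_csv('test.txt', header=None, sep=';')

# Split on ','; shorter rows are padded with missing values
df = df[0].str.split(',', expand=True)
df = df.set_index(0)
# One value per row; dropna() removes the padding, droplevel(1) the column number
df = df.stack().dropna().droplevel(1)
print(df)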

output:

0
Attribute1    data1
Attribute1    data2
Attribute2    data1
Attribute2    data2
Attribute2    data3
Attribute2    data4
Attribute3    data1
Attribute3    data2
Attribute3    data3
Attribute4    data1
Attribute4    data2
Attribute4    data3
Attribute4    data4
Attribute4    data5
Attribute4    data6
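
The output above still includes data1 and data2 for every attribute. To get only the later values, as asked in the question, one possible approach is to drop the first two data columns before stacking. This is a minimal sketch, not part of the original answer: the io.StringIO buffer stands in for test.txt, and the names wide, tail, result, 'attribute' and 'value' are made up for illustration.

```python
import io
import pandas as pd

# Sample data from the question; io.StringIO stands in for 'test.txt'.
raw = io.StringIO(
    "Attribute1,data1,data2\n"
    "Attribute2,data1,data2,data3,data4\n"
    "Attribute3,data1,data2,data3\n"
    "Attribute4,data1,data2,data3,data4,data5,data6\n"
)

# Read each line as a single column (';' never occurs), then split on ','.
lines = pd.read_csv(raw, header=None, sep=';')
wide = lines[0].str.split(',', expand=True).set_index(0)

# Columns 1 and 2 hold data1 and data2; drop them and keep the rest.
tail = wide.drop(columns=[1, 2])

# Stack to one value per row. dropna() removes the padding left by
# shorter rows, droplevel(1) removes the column-number index level.
exploded = tail.stack().dropna().droplevel(1).rename('value')
exploded.index.name = 'attribute'

result = exploded.reset_index()
print(result)
```

Attribute1 has no values beyond data2, so it does not appear in result at all. The wide frame is also the "text to dataframe" form from the question, with missing values in the unused cells.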