How to split text file with Pipe delimiter using Python and then pick columns based on condition?

Time:11-23

I am trying to split a text file and select data based on the second column:

Attribute1|Number|7
Attribute2|Text||"sample text"
Attribute3|Columns|4||"data1"|"data2"|"data3"|"data4"

If the second field says Number, it should pick the data in the third field. If it says Text, it should pick the data in the fourth field. If it says Columns, it should produce as many entries as the number in the third field.

The final data should be in a data frame like this:

         Col_1          Col_2
    Attribute1_value    7
    Attribute2_value    "sample text"
    Attribute3_value_0  data1
    Attribute3_value_1  data2
    Attribute3_value_2  data3
    Attribute3_value_3  data4

CodePudding user response:

You can store the split lines in a dictionary and make a Series out of it:

import pandas as pd

# Maps each attribute name to its value (a string, or a list for "Columns")
output_dict = {}
with open("file.txt", "r") as f:
    for line in f:
        fields = line.strip("\n").split('|')
        if fields[1] == "Number":
            output_dict[fields[0]] = fields[2]
        elif fields[1] == "Text":
            output_dict[fields[0]] = fields[3]
        elif fields[1] == "Columns":
            # The third field is the count; the values start at the fifth field
            output_dict[fields[0]] = fields[4:4 + int(fields[2])]

#print(output_dict)

series = pd.Series(output_dict)
print(series.explode())

Output:

Attribute1                7
Attribute2    "sample text"
Attribute3          "data1"
Attribute3          "data2"
Attribute3          "data3"
Attribute3          "data4"
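
The Series above keeps the original attribute names and the surrounding quotes, so it does not quite match the two-column layout requested in the question. A minimal sketch of one way to get closer, assuming the labels should follow the `Attribute3_value_0` pattern and that quotes are stripped only for Columns entries (as the expected table suggests). The helper name `parse_attributes` is hypothetical, and `io.StringIO` stands in for `file.txt` so the example is self-contained:

```python
import io

import pandas as pd


def parse_attributes(lines):
    """Build a two-column DataFrame from pipe-delimited attribute lines."""
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, kind = fields[0], fields[1]
        if kind == "Number":
            # Value is the third field
            rows.append((f"{name}_value", fields[2]))
        elif kind == "Text":
            # Value is the fourth field; quotes kept, as in the question
            rows.append((f"{name}_value", fields[3]))
        elif kind == "Columns":
            # Third field is the count; values start at the fifth field
            count = int(fields[2])
            for i, value in enumerate(fields[4:4 + count]):
                rows.append((f"{name}_value_{i}", value.strip('"')))
    return pd.DataFrame(rows, columns=["Col_1", "Col_2"])


# Sample data copied from the question; replace with open("file.txt") in practice
sample = io.StringIO(
    'Attribute1|Number|7\n'
    'Attribute2|Text||"sample text"\n'
    'Attribute3|Columns|4||"data1"|"data2"|"data3"|"data4"\n'
)
df = parse_attributes(sample)
print(df)
```

All values stay as strings here; converting the Number rows to integers would be a separate step if needed.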