How to remove everything except what's between given characters?

I am automating a task where I need to open a .csv or .txt file that contains a lot of information I don't need. I want to remove everything except what's in a specific range.

For example: adhjnwadk'symbol:xxx'abcahjda.

So I would want to keep everything in 'symbol:xxx' and remove everything outside of it. Please note that xxx is a variable and is different in every line.

Is this possible?

CodePudding user response：

Is this a dataset? If so, you can use the pandas module in Python to load the data in tabular form and filter it accordingly.

CodePudding user response：

There are many ways to skin this particular cat:

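symbols = []
with open("data.txt", "r") as infile:
    # Iterate over the file directly instead of loading every line with readlines()
    for line in infile:
        # partition() returns an empty third element when "'symbol:" is absent
        _, _, lineend = line.partition("'symbol:")
        if lineend:
            # Keep only the text before the closing quote
            symbol = lineend.split("'")[0]
            symbols.append(f"'symbol:{symbol}'")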

For each line, this takes everything after the first 'symbol:, discards everything from the next ' onward, and stores the rebuilt 'symbol:xxx' in a list. Lines that don't contain 'symbol: are skipped.
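
Another way to skin the same cat, as a rough sketch rather than the definitive answer: a regular expression can pull out the quoted token directly. The names SYMBOL_RE, extract_symbols and sample are made up for illustration, and this assumes xxx never contains a single quote. Unlike the partition version, it requires the closing quote to be present and picks up every match on a line, not just the first.

```python
import re

# Matches 'symbol:xxx' including both quotes; xxx is any run of non-quote characters.
SYMBOL_RE = re.compile(r"'symbol:[^']*'")


def extract_symbols(lines):
    """Return every 'symbol:xxx' token found in the given lines, in order."""
    found = []
    for line in lines:
        # findall() returns an empty list for lines with no match
        found.extend(SYMBOL_RE.findall(line))
    return found


# Hypothetical sample data based on the example in the question
sample = [
    "adhjnwadk'symbol:xxx'abcahjda",
    "no match here",
    "q'symbol:AAPL'zz'symbol:MSFT'",
]
print(extract_symbols(sample))
```

To run it on a real file, pass the open file object instead of the list, e.g. extract_symbols(infile) inside a with open("data.txt") block, since a file object yields its lines when iterated.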