I am automating a task that needs to open a .csv or .txt file containing a lot of information I don't need. I want to remove everything except what falls within a specific range.
For example: adhjnwadk'symbol:xxx'abcahjda. Here I would want to keep 'symbol:xxx' and remove everything outside of it. Please note that xxx is a variable and is different on every line.
Is this possible?
CodePudding user response:
Is this a dataset? If so, you can use the pandas module in Python to load the data in tabular form and filter it accordingly.
CodePudding user response:
There are many ways to skin this particular cat:
with open("data.txt", "r") as infile:
    symbols = []
    for line in infile.readlines():
        # Keep only the text that follows the first 'symbol: marker
        _, _, lineend = line.partition("'symbol:")
        if lineend:
            # Cut at the next quote to isolate xxx
            symbol = lineend.split("'")[0]
            symbols.append(f"'symbol:{symbol}'")
This finds everything after 'symbol:, discards everything from the next ' onward, and stores 'symbol:xxx' in a list. Lines that do not contain 'symbol: are skipped.
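Another way to do the same thing is with a regular expression, which also picks up more than one match per line. This is only a sketch: the names SYMBOL_RE, extract_symbols and filter_file, and the idea of writing one token per line to a second file, are my own assumptions and not something stated in the question.

```python
import re

# Matches 'symbol:xxx' including both quotes; xxx is any run of
# characters that are not a single quote (assumed format).
SYMBOL_RE = re.compile(r"'symbol:[^']*'")


def extract_symbols(lines):
    """Return every 'symbol:xxx' token found in the given lines, in order."""
    found = []
    for line in lines:
        found.extend(SYMBOL_RE.findall(line))
    return found


def filter_file(src_path, dst_path):
    """Read src_path and write one extracted token per line to dst_path."""
    with open(src_path, "r") as infile:
        symbols = extract_symbols(infile)
    with open(dst_path, "w") as outfile:
        outfile.writelines(s + "\n" for s in symbols)
    return symbols


print(extract_symbols(["adhjnwadk'symbol:xxx'abcahjda"]))
# prints ["'symbol:xxx'"]
```

Unlike the partition version, this requires the closing ' to be present, so a line where the token is cut off before its closing quote would be skipped rather than kept.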