I believe this is a three-step process, so please bear with me. I'm reading shell output that is saved to a file, and the output looks like this:
Current Output:
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 123.345.789:1234 0.0.0.0:* LISTEN 23044/test
tcp 0 0 0.0.0.0:5915 0.0.0.0:* LISTEN 99800/./serv
tcp 0 0 0.0.0.0:1501 0.0.0.0:* LISTEN -
I'm trying to access each column's information based on the header value. This is something I was able to do in PowerShell, but I'm not sure how to achieve it in Python.
Expected Output:
Proto,Recv-Q,Send-Q,Local Address,Foreign Address,State,PID/Program name
tcp,0,0,123.345.789:1234,0.0.0.0:*,LISTEN,23044/test
tcp,0,0,0.0.0.0:5915,0.0.0.0:*,LISTEN,99800/./serv
tcp,0,0,0.0.0.0:1501,0.0.0.0:*,LISTEN,-
proto = data["Proto"]
for p in proto:
    print(p)
Output: tcp tcp tcp
What I've tried:
Where do I begin... I've tried splitting, replacing, and translate. I also tried regex but couldn't quite figure it out :/
Proto,Recv-Q,Send-Q,Local,Address,,,,,,,,,,,Foreign Address,,,,,,,,,State,,,,,, PID/Program,name
tcp,,,,,,,,0,,,,,,0 123.345.789:1234,,,,,,,,0.0.0.0:*,,,,,,,,,,,,,,,LISTEN,,,,,,23021/java,,,,,,,,
tcp,,,,,,,,0,,,,,,0 0.0.0.0:5915,,,,,,,,,,,,0.0.0.0:*,,,,,,,,,,,,,,,LISTEN,,,,,,99859/./statserv
tcp,,,,,,,,0,,,,,,0 0.0.0.0:1501,,,,,,,,,,,,0.0.0.0:*,,,,,,,,,,,,,,,LISTEN,,,,,,-
Since some of the headers contain a space, it's difficult to map the columns accordingly.
Looking for the best way to approach this.
Thank you.
CodePudding user response:
You are post-processing the output of the netstat command. netstat itself is just reformatting the information in /proc/net/tcp, which you can also read. As with the netstat output, you may need to make your own header line, but the data lines are all space separated. A simple line.split() should do it.
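For reading /proc/net/tcp directly, here is a minimal sketch. It assumes IPv4 entries and a little-endian host (such as x86), where the kernel prints each address as hex IP and hex port. The function names and the sample line are hypothetical, made up for illustration. Note that the state stays a hex code (0A means LISTEN) and the file has no program name column.

```python
import socket
import struct


def decode_address(hex_addr):
    """Convert 'HEXIP:HEXPORT' into 'a.b.c.d:port'.

    Assumption: IPv4 address printed on a little-endian host.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return f"{ip}:{int(port_hex, 16)}"


def parse_proc_net_tcp(lines):
    """Return one dict per socket from the lines of /proc/net/tcp."""
    rows = []
    for ln in lines[1:]:  # the first line is the kernel's own header
        fields = ln.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        rows.append({
            "Local Address": decode_address(fields[1]),
            "Foreign Address": decode_address(fields[2]),
            "State": fields[3],  # hex state code, e.g. 0A for LISTEN
        })
    return rows


# Hypothetical sample in the layout of /proc/net/tcp
sample = [
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode\n",
    "   0: 00000000:171B 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1\n",
]
rows = parse_proc_net_tcp(sample)
print(rows[0]["Local Address"])
```

On a real system you would pass `open("/proc/net/tcp").readlines()` instead of the sample list.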
If you still want to use netstat, as I said, just throw away the header line and use split. You know what the columns are.
for ln in output:
    fields = ln.split()
    print(','.join(fields))
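To get the header-based access the question asks for (data["Proto"]), the same split can feed a dictionary of columns. This is only a sketch: the column names are typed in by hand because two of them contain spaces, the parse_netstat helper is a hypothetical name, and it assumes every data row has all seven columns filled in, as in the sample from the question.

```python
# Column names written out by hand; splitting the header line itself
# would break "Local Address" and "Foreign Address" apart.
COLUMNS = ["Proto", "Recv-Q", "Send-Q", "Local Address",
           "Foreign Address", "State", "PID/Program name"]


def parse_netstat(lines):
    """Return a dict mapping each column name to a list of its values."""
    data = {name: [] for name in COLUMNS}
    for ln in lines:
        # maxsplit keeps everything after the sixth gap in the last field
        fields = ln.split(None, len(COLUMNS) - 1)
        if len(fields) != len(COLUMNS) or fields[0] == "Proto":
            continue  # skip the header line, blank lines, and short rows
        for name, value in zip(COLUMNS, fields):
            data[name].append(value.strip())
    return data


sample = [
    "Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name\n",
    "tcp 0 0 123.345.789:1234 0.0.0.0:* LISTEN 23044/test\n",
    "tcp 0 0 0.0.0.0:5915 0.0.0.0:* LISTEN 99800/./serv\n",
    "tcp 0 0 0.0.0.0:1501 0.0.0.0:* LISTEN -\n",
]
data = parse_netstat(sample)
for p in data["Proto"]:
    print(p)
```

Rows with an empty State column (UDP sockets, for example) would have fewer fields and are skipped here, so they would need separate handling.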
CodePudding user response:
Skip the first row, indicate that there is no header, assign header names and then split on one or more spaces.
import pandas as pd

df = pd.read_csv('netstat.txt', skiprows=1, header=None, sep=r'\s+',
                 names=['Proto','Recv-Q','Send-Q','Local Address','Foreign Address','State','PID/Program name'])
print(df)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
0 tcp 0 0 123.345.789:1234 0.0.0.0:* LISTEN 23044/test
1 tcp 0 0 0.0.0.0:5915 0.0.0.0:* LISTEN 99800/./serv
2 tcp 0 0 0.0.0.0:1501 0.0.0.0:* LISTEN -
df.to_csv('output.csv', index=False)
CodePudding user response:
Use a regex to split each line on runs of two or more whitespace characters.
import re

for ln in testset:
    splitted = re.split(r'\s{2,}', ln.replace("\n", ""))
    print(splitted)
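A sketch of how that split could be turned into header-keyed columns. The sample below is hypothetical and assumes every column, including those in the header, is separated by at least two spaces. If two adjacent columns are separated by only one space in your file, they will end up fused into a single field.

```python
import re

# Hypothetical input: every column is separated by two or more spaces,
# so the single space inside "Local Address" survives the split.
testset = [
    "Proto  Recv-Q  Send-Q  Local Address     Foreign Address  State   PID/Program name\n",
    "tcp    0       0       0.0.0.0:5915      0.0.0.0:*        LISTEN  99800/./serv\n",
    "tcp    0       0       0.0.0.0:1501      0.0.0.0:*        LISTEN  -\n",
]

# Strip each line, then split on runs of two or more whitespace characters
rows = [re.split(r"\s{2,}", ln.strip()) for ln in testset]
header, records = rows[0], rows[1:]

# Build a dict of columns keyed by the header names
data = {name: [rec[i] for rec in records] for i, name in enumerate(header)}
print(data["Local Address"])
```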