I have the following source CSV:
"TreeDepth","PID","PPID","ImageFileName","Offset(V)","Threads","Handles","SessionId","Wow64","CreateTime","ExitTime"
0,4,0,"System","0xac818d45d080",158,,,False,"2021-04-01 05:04:58.000000 ",
1,88,4,"Registry","0xac818d5ab040",4,,,False,"2021-04-01 05:04:54.000000 ",
1,404,4,"smss.exe","0xac818dea7040",2,,,False,"2021-04-01 05:04:58.000000 ",
0,556,548,"csrss.exe","0xac81900e4140",10,,0,False,"2021-04-01 05:05:00.000000 ",
0,632,548,"wininit.exe","0xac81901ee080",1,,0,False,"2021-04-01 05:05:00.000000 ",
1,768,632,"services.exe","0xac8190e52100",7,,0,False,"2021-04-01 05:05:01.000000 ",
2,1152,768,"svchost.exe","0xac8191034300",2,,0,False,"2021-04-01 05:05:02.000000 ",
2,2560,768,"svchost.exe","0xac8191485080",6,,0,False,"2021-04-01 05:05:03.000000 ",
In my Python script I'm trying to print the 4th cell value of every row (i.e. the process name). The following code prints it fine, but it repeats itself about 3 times. What am I doing wrong?
dfProcs = pd.read_csv(args.path + '/windows.pstree.PsTree.csv')
for ImageFileName in dfProcs:
    print(dfProcs[ImageFileName].values[0])
Get a list of all the Image File Names from csv data.
CodePudding user response:
Your for loop iterates over the column names, so on each pass it prints only the first-row value of that column.
Instead, it seems you want to iterate over the fourth column (ImageFileName) and print every value. This is one way of doing that:
for fileName in dfProcs["ImageFileName"]:
    print(fileName)
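To make the difference concrete, here is a minimal sketch that contrasts the two loops. It assumes a few rows from the question's CSV inlined as a string (the names `CSV_TEXT`, `column_labels` and `image_names` are made up for illustration), so it runs without the original file. Since the title asks for a list, it also uses `tolist()` on the column.

```python
import io

import pandas as pd

# A few rows and columns copied from the CSV in the question,
# inlined so the sketch runs on its own (assumption: the real file has more of both).
CSV_TEXT = '''"TreeDepth","PID","PPID","ImageFileName"
0,4,0,"System"
1,88,4,"Registry"
1,404,4,"smss.exe"
'''

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Iterating over a DataFrame yields its column labels, not its rows.
column_labels = [label for label in df]

# Selecting one column gives a Series; tolist() turns it into a plain Python list.
image_names = df["ImageFileName"].tolist()

print(column_labels)
print(image_names)
```

The first print shows the column labels, which is what the loop in the question was stepping through. The second shows the process names as a list.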
CodePudding user response:
I would recommend using iterrows()
because it is more versatile: it returns (index, Series) pairs, so you can also use the row index and the other columns of each row.
How to use it in your case:
for index, row in dfProcs.iterrows():
    print(row["ImageFileName"])
Output:
System
Registry
smss.exe
csrss.exe
wininit.exe
services.exe
svchost.exe
svchost.exe
Usage and theory: https://towardsdatascience.com/how-to-iterate-over-rows-in-a-panas-dataframe-6aa173fc6c84
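As a rough illustration of why the (index, Series) pair can be handy, the sketch below reads several columns of the same row inside one loop. It assumes a cut-down copy of the question's CSV inlined as a string; `CSV_TEXT` and `lines` are illustrative names, not part of the original script.

```python
import io

import pandas as pd

# Cut-down sample of the question's CSV (assumption: only three columns kept).
CSV_TEXT = '''"PID","PPID","ImageFileName"
4,0,"System"
88,4,"Registry"
404,4,"smss.exe"
'''

df = pd.read_csv(io.StringIO(CSV_TEXT))

lines = []
for index, row in df.iterrows():
    # index is the row label (0, 1, 2 with the default index);
    # row is a Series keyed by column name, so any column of this row is available.
    lines.append(f"{index}: {row['ImageFileName']} (PID {row['PID']}, parent {row['PPID']})")

print("\n".join(lines))
```

If only the process names are needed, iterating over the single column is simpler and usually faster. iterrows() is mainly useful when several fields of each row are used together.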