Check if csv cell contains a string and modify column value in Python

I have a CSV file containing network packets. Each row has a column for the protocol and a column for the packet size. I want to distinguish between two protocols: if a packet is of type http, its size should be made negative, e.g. 168 becomes -168.

I am using pandas to read the CSV file, but I have not found a way to iterate through the rows and check whether the protocol column contains the string 'http'.

I have tried the following, but the output prints all rows, not just the http ones:

dataset = pandas.read_csv('test.csv', engine='python')
dataset.columns = ['packet', 'Time', 'Source', 'Dest', 'Proto', 'Length', 'Info']
dataset.index.name = 'packet'
for x in dataset.index:
    if dataset['Proto'].eq('http').any():
        print(dataset['Length'])

CodePudding user response：

If I understand the question correctly, this should do what you're looking for.

    import pandas as pd
    data = pd.DataFrame({'protocol': ['http', 'http', 'ssh'], 'packet': [1, 2, 3]})
    # Negate the packet value only on rows where the protocol is http
    data.loc[data.protocol == 'http', 'packet'] = -data.loc[data.protocol == 'http', 'packet']

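In your code, dataset['Proto'].eq('http').any() asks whether any value in the whole Proto column equals 'http'. The answer is the same on every pass through the loop, so as long as at least one http row exists, the entire Length column is printed once per index instead of only the http rows.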
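Here is a minimal sketch of the per-row check the question was aiming for. It assumes the Proto and Length column names from the question and uses made-up sample rows in place of test.csv. It uses str.contains with case=False in case the capture stores values like HTTP; that is a guess about the data. Note that a substring match also hits https, so use == 'http' if only exact matches should be flipped.

```python
import pandas as pd

# Hypothetical sample rows standing in for test.csv
dataset = pd.DataFrame({
    'Proto': ['HTTP', 'TCP', 'http', 'DNS'],
    'Length': [168, 60, 512, 74],
})

# Boolean mask with one True/False per row, not a single value for the column
is_http = dataset['Proto'].str.contains('http', case=False, na=False)

# Print only the lengths of the matching rows
print(dataset.loc[is_http, 'Length'])

# Negate the packet size on the matching rows only
dataset.loc[is_http, 'Length'] *= -1
```

No explicit loop is needed: the mask selects the rows, and the assignment through loc changes only those rows.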

CodePudding user response：

I understood the same as Marc. I have a similar answer, although using fillna at the end:

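import pandas as pd
df = pd.DataFrame({'type': ['http', 'https'], 'packet': [168, 168]})
# Negate the size for http rows only; the other rows of new_col start as NaN
df.loc[df['type'] == 'http', 'new_col'] = df['packet'] * -1
# Fill the rest with the original size and cast back to int (NaN forces float)
df['new_col'] = df['new_col'].fillna(df['packet']).astype(int)
df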


    type  packet  new_col
0   http     168     -168
1  https     168      168
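
As a possible alternative to the fillna step, the same column can be sketched in one assignment with Series.mask, which replaces values where the condition is True and keeps the rest, so no intermediate NaN is created. The column names follow the answer above; this is a sketch on the same two sample rows, not a tested drop-in for the original CSV.

```python
import pandas as pd

df = pd.DataFrame({'type': ['http', 'https'], 'packet': [168, 168]})

# mask() swaps in the negated size where type is exactly 'http'
# and leaves every other row's packet size untouched
df['new_col'] = df['packet'].mask(df['type'] == 'http', -df['packet'])

print(df)
```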