Iterating through rows in Pandas

Time:03-19

I am having an issue applying some maths to my dataframe.

Current df:

name lastConnected check
test1 1647609274746 Connection
test2 1647609274785 Connection
test3 1647000000000 Connection
test4 1647609274756 Connection

Desired df (I want the check column to show whether each server is still online):

name lastConnected check
test1 1647609274746 Connection
test2 1647609274785 Connection
test3 1647000000000 No connection
test4 1647609274756 Connection

current code:

def checkServer():
    for index, row in df.iterrows():
        currentTime = time.time_ns() // 1000000  # current time in ms
        lastSeenTime = row['lastConnected']
        timeDifference = currentTime - lastSeenTime
        if timeDifference > 5000:
            df['check'] = "No connection"
        else:
            df['check'] = "Connection"
    return df

My issue:

As you can see in my current dataframe, every row gets Connection even though test3 should have No connection. While troubleshooting, I printed timeDifference for each row and got the same value every time, even though the rows have different timestamps. As a result, I think my for loop might be the issue.

Use this site to get the current time in milliseconds: currentmillis.com

Where am I going wrong?

CodePudding user response:

IIUC, don't use iterrows; use a vectorized operation instead:

N = 5000
df.loc[df['lastConnected'].diff().lt(-N), 'check'] = 'No connection'

or to create the column from scratch:

N = 5000
df['check'] = np.where(df['lastConnected'].diff().lt(-N),
                       'No connection', 'Connection')

output:

    name  lastConnected          check
0  test1        1647609     Connection
1  test2        1647579     Connection
2  test3        1640009  No connection
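
Note that diff() compares each row with the row before it, not with the current time. A minimal sketch of what it computes, assuming the truncated sample values from the output above (the names step and flagged are only illustrative):

```python
import pandas as pd

# Sample values copied from the output above (truncated timestamps).
df = pd.DataFrame({
    "name": ["test1", "test2", "test3"],
    "lastConnected": [1647609, 1647579, 1640009],
})

# diff() subtracts the previous row from each row; the first row has no
# predecessor, so its result is NaN.
step = df["lastConnected"].diff()
print(step.tolist())  # [nan, -30.0, -7570.0]

N = 5000
# A comparison against NaN is False, so the first row is never flagged.
flagged = step.lt(-N)
print(flagged.tolist())  # [False, False, True]
```

So this approach only flags a row whose timestamp drops by more than N relative to the previous row; whether that matches the intent depends on how the rows are ordered.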

CodePudding user response:

Maybe you expect:

# tz-aware "now", so the epoch value does not depend on the local timezone
now_ms = pd.Timestamp.now(tz='UTC').timestamp() * 1000
df['check'] = np.where(now_ms - df['lastConnected'] > 5000,
                       'No connection', 'Connection')
print(df)

# Output
    name  lastConnected          check
0  test1  1647609274746     Connection
1  test2  1647609274785     Connection
2  test3  1647000000000  No connection
3  test4  1647609274756     Connection
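
Because the result depends on the moment the code runs, it can help to pass the reference time in explicitly while testing. A rough sketch, assuming the question's data; check_servers, now_ms, and timeout_ms are hypothetical names, and the fixed reference time is made up so the output is reproducible:

```python
import time

import numpy as np
import pandas as pd


def check_servers(df, now_ms=None, timeout_ms=5000):
    """Return a copy of df with 'check' recomputed from 'lastConnected'."""
    if now_ms is None:
        # Current epoch time in milliseconds.
        now_ms = time.time_ns() // 1_000_000
    out = df.copy()
    age_ms = now_ms - out["lastConnected"]
    # One vectorized comparison for the whole column, no Python loop.
    out["check"] = np.where(age_ms > timeout_ms, "No connection", "Connection")
    return out


df = pd.DataFrame({
    "name": ["test1", "test2", "test3", "test4"],
    "lastConnected": [1647609274746, 1647609274785,
                      1647000000000, 1647609274756],
})

# Hypothetical fixed "now", a few hundred ms after the newest timestamp.
result = check_servers(df, now_ms=1647609275000)
print(result["check"].tolist())
# ['Connection', 'Connection', 'No connection', 'Connection']
```

Called without now_ms, the function falls back to the real clock, in which case every timestamp from the question is long past the 5000 ms threshold.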

Old answer

Use np.where:

df['check'] = np.where(df['lastConnected'].diff().abs().gt(5000),
                       'No connection', 'Connection')
print(df)

# Output
    name  lastConnected          check
0  test1        1647609     Connection
1  test2        1647579     Connection
2  test3        1640009  No connection
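
As a footnote on why the loop in the question gives every row the same value: df['check'] = "..." assigns to the whole column on each pass, so the last row processed decides the result for all rows. If a loop is really wanted, a single cell can be written with df.at instead. A sketch under the assumption of a fixed, made-up now_ms (the vectorized versions above are still preferable):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["test1", "test2", "test3", "test4"],
    "lastConnected": [1647609274746, 1647609274785,
                      1647000000000, 1647609274756],
    "check": "Connection",
})
now_ms = 1647609275000  # hypothetical fixed "current time" in ms

for index, row in df.iterrows():
    stale = now_ms - row["lastConnected"] > 5000
    # df.at[index, "check"] writes one cell; df["check"] = ... would
    # overwrite the entire column on every iteration.
    df.at[index, "check"] = "No connection" if stale else "Connection"

print(df["check"].tolist())
# ['Connection', 'Connection', 'No connection', 'Connection']
```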