I am having an issue applying some maths to my dataframe.
Current df:
name | lastConnected | check |
---|---|---|
test1 | 1647609274746 | Connection |
test2 | 1647609274785 | Connection |
test3 | 1647000000000 | Connection |
test4 | 1647609274756 | Connection |
Desired df: I now want to create a new column and check whether each server is still online.
name | lastConnected | check |
---|---|---|
test1 | 1647609274746 | Connection |
test2 | 1647609274785 | Connection |
test3 | 1647000000000 | No connection |
test4 | 1647609274756 | Connection |
current code:
def checkServer():
    for index, row in df.iterrows():
        currentTime = int(time.time_ns() // 1000000)
        lastSeenTime = row['lastConnected']
        timeDifference = currentTime - lastSeenTime
        if timeDifference > 5000:
            df['check'] = "No connection"
        else:
            df['check'] = "Connection"
    return df
My issue:
As you can see in my current dataframe, every row gets Connection even though test3 should get No connection. While troubleshooting, I printed timeDifference for each row and got the same value every time, even though the rows all have different timestamps. As a result, I think my for loop might be the issue.
Use this site to get the current time in milliseconds: currentmillis.com
Where am I going wrong?
CodePudding user response:
IIUC, don't use iterrows but a vectorized function:
N = 5000
df.loc[df['lastConnected'].diff().lt(-N), 'check'] = 'No connection'
or to create the column from scratch:
N = 5000
df['check'] = np.where(df['lastConnected'].diff().lt(-N),
                       'No connection', 'Connection')
output:
name lastConnected check
0 test1 1647609 Connection
1 test2 1647579 Connection
2 test3 1640009 No connection
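Note that diff() compares each row with the row above it, not with the current time, so it only flags a server whose timestamp is much older than its neighbour's. A minimal sketch on the question's data to show what is actually being compared (the names row_gap and flagged are just for illustration):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": ["test1", "test2", "test3", "test4"],
    "lastConnected": [1647609274746, 1647609274785,
                      1647000000000, 1647609274756],
})

N = 5000  # threshold in milliseconds, as in the question

# Each value is "this row minus the previous row"; the first row has no
# predecessor, so its gap is NaN and never passes the comparison.
row_gap = df["lastConnected"].diff()

# True only where a row is more than N ms older than the row above it
flagged = row_gap.lt(-N)

print(pd.DataFrame({"name": df["name"], "row_gap": row_gap, "flagged": flagged}))
```

If all servers went quiet at the same moment, the gaps would stay small and nothing would be flagged, so this approach only fits when a row-to-row comparison is what you want.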
CodePudding user response:
Maybe you expect:
df['check'] = np.where(pd.Timestamp.now(tz='UTC').timestamp() * 1000 - df['lastConnected'] > 5000,
                       'No connection', 'Connection')
print(df)
# Output
name lastConnected check
0 test1 1647609274746 Connection
1 test2 1647609274785 Connection
2 test3 1647000000000 No connection
3 test4 1647609274756 Connection
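As for why the original loop gave every row the same label: df['check'] = "..." inside the loop assigns to the whole column, so each iteration overwrites all rows and the last row decides the result. A self-contained sketch of the vectorized version follows. The function name check_server and the now_ms parameter are my own additions (now_ms only exists so the example can be run with a fixed clock), and it assumes lastConnected holds epoch milliseconds:

```python
import time

import numpy as np
import pandas as pd


def check_server(df, threshold_ms=5000, now_ms=None):
    """Return a copy of df with 'check' set per row from lastConnected."""
    if now_ms is None:
        # Current time in epoch milliseconds
        now_ms = time.time_ns() // 1_000_000
    # Vectorized: one age per row, no Python-level loop
    age_ms = now_ms - df["lastConnected"]
    out = df.copy()
    out["check"] = np.where(age_ms > threshold_ms, "No connection", "Connection")
    return out


# Sample data copied from the question
df = pd.DataFrame({
    "name": ["test1", "test2", "test3", "test4"],
    "lastConnected": [1647609274746, 1647609274785,
                      1647000000000, 1647609274756],
})

# Fixed "now", one second after the newest timestamp, so the result is
# reproducible; omit now_ms to compare against the real clock instead.
result = check_server(df, now_ms=1647609275785)
print(result)
```

Run against the real clock today, every row in this sample would be far older than 5 seconds, so all four would come back as No connection.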
Old answer
Use np.where:
df['check'] = np.where(df['lastConnected'].diff().abs().gt(5000),
'No connection', 'Connection')
print(df)
# Output
name lastConnected check
0 test1 1647609 Connection
1 test2 1647579 Connection
2 test3 1640009 No connection