I would like to fetch the start datetime and end datetime while the value of data is zero. The data is in Postgresql. If I get the Postgresql solution much help full or Python using numpy or pandas. for example
column 1 will contain datetime column 2 will contain values.
DateTime Value
06-07-2021 12:00 -521362.8779
06-07-2021 12:15 -57275.52732
06-07-2021 12:30 0
06-07-2021 12:45 0
06-07-2021 13:00 0
06-07-2021 13:15 0
06-07-2021 13:30 0
06-07-2021 13:45 0
06-07-2021 14:00 -57275.52732
06-07-2021 14:15 -377411.4886
06-07-2021 14:30 -377411.4886
06-07-2021 14:45 0
06-07-2021 15:00 0
06-07-2021 15:15 0
06-07-2021 15:30 -889863.5254
06-07-2021 15:45 -1194683.49
06-07-2021 16:00 0
06-07-2021 16:15 0
06-07-2021 16:30 0
06-07-2021 16:45 0
06-07-2021 17:00 -89539.05766
06-07-2021 17:15 -1117269.624
06-07-2021 17:30 -857357.2725
The required output shall be Column 1 serial no, Column 2 Start DateTime, Column 3 End DateTime
Serial No Start DateTime End DateTime
1 06-07-2021 12:30 06-07-2021 13:45
2 06-07-2021 14:45 06-07-2021 15:15
3 06-07-2021 16:00 06-07-2021 16:45
CodePudding user response:
Assuming the type of your DateTime column is already datetime
or you transform your above string into a dataframe
using
df = pd.read_csv(io.StringIO(df_string), sep='\s{2,}',engine='python',parse_dates=['DateTime'])
then you do
x = df['Value'].to_numpy()
mask = np.empty(x.shape[0], 'bool')
mask[0] = x[0] == 0
mask[1:] = (x[1:] == 0) & (x[:-1] != 0)
mask2 = np.empty(x.shape[0], 'bool')
mask2[-1] = x[0] == 0
mask2[:-1] = (x[1:] != 0) & (x[:-1] == 0)
df2 = pd.DataFrame({'Start': df['DateTime'][mask].reset_index(drop=True),
'End' :df['DateTime'][mask2].reset_index(drop=True)})
and you get
Start End
0 2021-06-07 12:30:00 2021-06-07 13:45:00
1 2021-06-07 14:45:00 2021-06-07 15:15:00
2 2021-06-07 16:00:00 2021-06-07 16:45:00
CodePudding user response:
I just compare the current row with next/previous row values. If one is zero and the other is not, then it's a Start or End.
You can use the shift method to shift the rows.
df1 = pd.DataFrame()
df1['Start DateTime'] = (
df[(df['Value'] == 0) & (df['Value'].shift() != 0)]
['DateTime'].reset_index(drop=True) )
df1['End DateTime'] = (
df[(df['Value'] == 0) & (df['Value'].shift(-1) != 0)]
['DateTime'].reset_index(drop=True))
Start DateTime | End DateTime | |
---|---|---|
0 | 06-07-2021 12:30 | 06-07-2021 13:45 |
1 | 06-07-2021 14:45 | 06-07-2021 15:15 |
2 | 06-07-2021 16:00 | 06-07-2021 16:45 |