OpenVINO + OpenCV: recognizing head nods and head shakes

Model introduction
OpenVINO provides a head pose estimation model, head-pose-estimation-adas-0001, which recognizes head movement along three axes:

Pitch is the pitch angle, i.e. nodding
Yaw is the yaw angle, i.e. shaking the head
Roll is the roll angle, i.e. tilting the head sideways

The supported angle ranges are:

YAW [-90, 90], PITCH [-70, 70], ROLL [-70, 70]
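
As a quick illustration, the sketch below checks whether a set of predicted angles falls inside the supported ranges, assuming those ranges are YAW [-90, 90], PITCH [-70, 70] and ROLL [-70, 70] degrees. The names SUPPORTED_RANGES and in_supported_range are hypothetical helpers made up for this example; they are not part of OpenVINO.

```python
# Supported angle ranges in degrees (assumed: yaw +/-90, pitch +/-70, roll +/-70).
SUPPORTED_RANGES = {
    "yaw": (-90.0, 90.0),
    "pitch": (-70.0, 70.0),
    "roll": (-70.0, 70.0),
}


def in_supported_range(yaw, pitch, roll):
    """Return True if all three angles lie inside the supported ranges."""
    angles = {"yaw": yaw, "pitch": pitch, "roll": roll}
    return all(lo <= angles[name] <= hi
               for name, (lo, hi) in SUPPORTED_RANGES.items())


print(in_supported_range(10.0, -20.0, 5.0))
print(in_supported_range(95.0, 0.0, 0.0))
```

Predictions outside these ranges are probably best treated as unreliable rather than clipped.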

These three terms come from unmanned aerial vehicles (UAVs) and aviation. Rather than coin new words, computer vision researchers simply borrowed them for head pose estimation. Their meanings are as follows:

Applied to head pose estimation:

Input format: [1x3x60x60], BGR order
Output format:

name: "angle_y_fc", shape: [1, 1] - Estimated yaw
name: "angle_p_fc", shape: [1, 1] - Estimated pitch
name: "angle_r_fc", shape: [1, 1] - Estimated roll
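
Since each output is a [1, 1] array identified by name, one way to read the three angles is to index the inference result by output name instead of relying on iteration order. The sketch below assumes the result is a dict mapping output names to [1, 1] array-likes; parse_head_pose is a hypothetical helper, and the values in fake_outputs are made up to stand in for a real inference result.

```python
def parse_head_pose(outputs):
    """Extract (yaw, pitch, roll) in degrees from a dict of model outputs.

    `outputs` maps each output name to a [1, 1] array-like
    (a nested list or a NumPy array both work).
    """
    yaw = float(outputs["angle_y_fc"][0][0])
    pitch = float(outputs["angle_p_fc"][0][0])
    roll = float(outputs["angle_r_fc"][0][0])
    return yaw, pitch, roll


# Made-up values standing in for a real inference result.
fake_outputs = {
    "angle_y_fc": [[12.5]],
    "angle_p_fc": [[-30.0]],
    "angle_r_fc": [[3.0]],
}
yaw, pitch, roll = parse_head_pose(fake_outputs)
print(yaw, pitch, roll)
```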

Code demo

Face detection
Face detection uses OpenVINO's MobileNetV2-SSD face detection model. Once a face is detected, its ROI is extracted and head pose estimation runs on that ROI to recognize the head movement. Only movements beyond plus or minus 20 degrees are recognized here. The code for loading the models and parsing their input and output formats is as follows:

import time

import cv2 as cv
from openvino.inference_engine import IECore

# model_xml / model_bin and head_xml / head_bin are the paths to the IR files
# of the face detection and head pose models; they must be defined beforehand.

ie = IECore()
for device in ie.available_devices:
    print(device)

# Face detection network
net = ie.read_network(model=model_xml, weights=model_bin)
input_blob = next(iter(net.input_info))
out_blob = next(iter(net.outputs))

n, c, h, w = net.input_info[input_blob].input_data.shape
print(n, c, h, w)

# cap = cv.VideoCapture("D:/images/video/Boogie_Up.mp4")
cap = cv.VideoCapture(0)
exec_net = ie.load_network(network=net, device_name="CPU")

# Head pose estimation network
head_net = ie.read_network(model=head_xml, weights=head_bin)
em_input_blob = next(iter(head_net.input_info))
head_it = iter(head_net.outputs)
head_out_blob1 = next(head_it)  # angle_p_fc
head_out_blob2 = next(head_it)  # angle_r_fc
head_out_blob3 = next(head_it)  # angle_y_fc
print(head_out_blob1, head_out_blob2, head_out_blob3)

en, ec, eh, ew = head_net.input_info[em_input_blob].input_data.shape
print(en, ec, eh, ew)

em_exec_net = ie.load_network(network=head_net, device_name="CPU")
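
The rule of only reporting movements beyond plus or minus 20 degrees can be isolated into a small pure function, which makes it easy to test without a camera or a model. This is a minimal sketch; describe_head_pose and THRESHOLD_DEG are hypothetical names introduced for illustration.

```python
THRESHOLD_DEG = 20.0  # angles within +/-20 degrees are ignored


def describe_head_pose(pitch, roll, yaw, threshold=THRESHOLD_DEG):
    """Return the names of the axes whose absolute angle exceeds the threshold."""
    labels = []
    if abs(pitch) > threshold:
        labels.append("pitch")
    if abs(roll) > threshold:
        labels.append("roll")
    if abs(yaw) > threshold:
        labels.append("yaw")
    return ", ".join(labels)


print(describe_head_pose(pitch=25.0, roll=-3.0, yaw=-40.0))  # pitch, yaw
```

Using abs() is equivalent to testing "greater than 20 or less than -20" for each angle.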

Detecting head movements
The code below parses the model outputs on the video stream to perform face detection and head movement recognition:

 
while True:
    ret, frame = cap.read()
    if ret is not True:
        break
    # Resize to the detector's input size and convert HWC -> CHW
    image = cv.resize(frame, (w, h))
    image = image.transpose(2, 0, 1)
    inf_start = time.time()
    res = exec_net.infer(inputs={input_blob: [image]})
    inf_end = time.time() - inf_start
    # print("infer time (ms): %.3f" % (inf_end * 1000))
    ih, iw, ic = frame.shape
    res = res[out_blob]
    for obj in res[0][0]:
        # obj[2] is the confidence, obj[3:7] the normalized box corners
        if obj[2] > 0.75:
            xmin = int(obj[3] * iw)
            ymin = int(obj[4] * ih)
            xmax = int(obj[5] * iw)
            ymax = int(obj[6] * ih)
            # Clamp the box to the frame boundaries
            if xmin < 0:
                xmin = 0
            if ymin < 0:
                ymin = 0
            if xmax >= iw:
                xmax = iw - 1
            if ymax >= ih:
                ymax = ih - 1
            # Crop the face ROI and run head pose estimation on it
            roi = frame[ymin:ymax, xmin:xmax, :]
            roi_img = cv.resize(roi, (ew, eh))
            roi_img = roi_img.transpose(2, 0, 1)
            head_res = em_exec_net.infer(inputs={em_input_blob: [roi_img]})
            angle_p_fc = head_res[head_out_blob1][0][0]
            angle_r_fc = head_res[head_out_blob2][0][0]
            angle_y_fc = head_res[head_out_blob3][0][0]
            # Report only angles beyond +/-20 degrees
            head_pose = ""
            if angle_p_fc > 20 or angle_p_fc < -20:
                head_pose += "pitch, "
            if angle_r_fc > 20 or angle_r_fc < -20:
                head_pose += "roll, "
            if angle_y_fc > 20 or angle_y_fc < -20:
                head_pose += "yaw, "
            cv.rectangle(frame, (xmin, ymin), (xmax, ymax), (0, 255, 255), 2, 8)
            cv.putText(frame, head_pose, (xmin, ymin), cv.FONT_HERSHEY_SIMPLEX, 1.0, (255, 0, 255), 2, 8)

    cv.putText(frame, "infer time (ms): %.3f, FPS: %.2f" % (inf_end * 1000, 1 / inf_end), (50, 50),
               cv.FONT_HERSHEY_SIMPLEX, 1.0, (255, 0, 255), 2, 8)
    cv.imshow("Face+emotion Detection", frame)
    c = cv.waitKey(1)
    if c == 27:  # Esc key
        break
cv.waitKey(0)
cv.destroyAllWindows()
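
The box handling in the loop can also be pulled into a helper that additionally rejects empty boxes, since an empty ROI would make the resize call fail. This is a sketch only: clamp_box is a hypothetical name, and it assumes each detection row has the layout [image_id, label, confidence, xmin, ymin, xmax, ymax] with normalized coordinates, which matches how the loop indexes obj.

```python
def clamp_box(obj, iw, ih):
    """Convert a normalized detection row to a pixel box clamped to the frame.

    Returns (xmin, ymin, xmax, ymax), or None if the clamped box is empty.
    """
    xmin = max(0, int(obj[3] * iw))
    ymin = max(0, int(obj[4] * ih))
    xmax = min(iw - 1, int(obj[5] * iw))
    ymax = min(ih - 1, int(obj[6] * ih))
    if xmax <= xmin or ymax <= ymin:
        return None  # nothing to crop
    return xmin, ymin, xmax, ymax


# A box that spills over the left and right edges of a 640x480 frame.
print(clamp_box([0, 1, 0.9, -0.1, 0.25, 1.2, 0.75], 640, 480))
```

In the loop, a detection whose clamp_box result is None would simply be skipped before cropping.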

The results are as follows:

If you are interested, I suggest trying it yourself: switch the video file to the camera, and it recognizes nodding, head shaking, and head turning in real time without any trouble. I have not included screenshots of my own test, mainly because I am too ugly! The speed is also real-time. Really good!

CodePudding user response:

This is interesting; it could be used for an application scenario at our factory.