Learning the history of computer vision and the concepts behind computer vision tasks - CodePudding

Assignment (questions):

1. What are the main sources of image data? (Name a few.)

2. What is the SIFT feature, what can it be used for, and what is the idea behind it? What can pyramid matching be used for? What is the HOG feature, and what can it be used for?

3. Neural networks were proposed early on, so why did they only take off recently? (Hint: consider data and hardware.)

4. What are the main image tasks, and what kind of problem does each one solve? (E.g., image classification determines which object a given image shows.)

Homework:
1. ImageNet is a project built for computer vision recognition research, and it is commonly described as the largest image recognition database in the world.
The PASCAL VOC dataset is a benchmark for visual object classification, recognition, and detection. It provides annotated image data for measuring detection performance and training learning algorithms, together with a standard evaluation protocol.
The main features of the LabelMe dataset are:
(1) It is designed for recognizing object classes, rather than only recognizing individual instances.
(2) It is designed for studying objects embedded in a scene.
(3) It has high-quality pixel-level annotation, including polygons and segmentation masks.
(4) It covers a wide variety of object classes, and the objects within each class also vary widely.
(5) All images were taken with a camera by the contributors themselves, rather than copied from elsewhere.
COCO is a newer dataset for image recognition, segmentation, and captioning.
2. SIFT, the scale-invariant feature transform, is a descriptor used in image processing. It is scale-invariant, it can detect keypoints in an image, and it is a local feature descriptor.

An image pyramid is a multi-resolution representation of an image. The original image is repeatedly downsampled to produce N images at different resolutions. The highest-resolution image sits at the bottom, and each level above it has fewer pixels (a smaller size), until the top of the pyramid holds an image of a single pixel. This is the image pyramid in the traditional sense. More broadly, an image pyramid is a way of building a scale space. An Internet search mostly turns up its use in the SIFT algorithm, but it is also applied in optical flow, pose estimation in SLAM, accelerating template matching, and so on.
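
The construction described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `downsample` and `build_pyramid` are made up for this example, the image is a toy list of lists, and each level is produced by simple 2x2 averaging, whereas practical pyramids (for example the Gaussian pyramid used by SIFT) blur with a Gaussian kernel before subsampling.

```python
def downsample(image):
    """Halve width and height by averaging each 2x2 block of pixels.

    Assumption: a plain box average stands in for the Gaussian blur
    that real pyramid implementations apply before subsampling.
    """
    height, width = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def build_pyramid(image):
    """Return [level 0 (original), level 1, ...], stopping once a level
    can no longer be halved in both directions."""
    levels = [image]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(downsample(levels[-1]))
    return levels


# Toy 8x8 grayscale image: a horizontal ramp from 0 to 7.
image = [[float(x) for x in range(8)] for _ in range(8)]
pyramid = build_pyramid(image)
sizes = [(len(level), len(level[0])) for level in pyramid]
print(sizes)
```

For the 8x8 input, the levels shrink to 4x4, 2x2, and finally a single pixel, which matches the "bottom is the original, top is one pixel" picture. Searching for a pattern at a coarse level first and refining at finer levels is the idea behind pyramid-accelerated template matching.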

The HOG feature (Histogram of Oriented Gradients) is a feature descriptor used for object detection in computer vision and image processing. It builds its features by computing histograms of gradient orientations over local regions of the image. First, because HOG operates on local grid cells of the image, it stays largely invariant to geometric and photometric deformations, since such deformations only show up over larger spatial regions. Second, with coarse spatial sampling, fine orientation sampling, and strong local photometric normalization, a pedestrian who stays roughly upright can make small body movements, and these movements can be ignored without affecting detection. HOG features are therefore particularly well suited to detecting people in images.
Pedestrian detection: HOG features capture the outline of the human body and are insensitive to changes in image brightness and color, so they perform very well in detection.
Vehicle detection: because HOG features are insensitive to lighting, vehicles can still be detected when they are partly in shadow, which gives good robustness on complex roads and in parking areas.
Tracking: HOG features are useful for tracking moving targets that have distinct edge contours.
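
To make the "histogram of gradient orientations over a local cell" idea concrete, here is a minimal single-cell sketch in plain Python. It is an assumption-laden simplification, not a full HOG implementation: the names `cell_histogram` and `l2_normalize` are invented for this example, the 9 unsigned orientation bins over 0 to 180 degrees are a common choice rather than something stated above, votes go to a single bin with no interpolation, and normalization is done per cell instead of over overlapping blocks of cells.

```python
import math


def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) over the interior pixels of one cell."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(hist, eps=1e-6):
    """Scale the histogram to (approximately) unit L2 length."""
    norm = math.sqrt(sum(v * v for v in hist) + eps)
    return [v / norm for v in hist]


# Toy 8x8 cell with a vertical edge: dark left half, bright right half.
edge = [[0] * 4 + [10] * 4 for _ in range(8)]
hist = cell_histogram(edge)

# Same edge under a brightness and contrast change.
brighter = [[2 * v + 50 for v in row] for row in edge]
hist_brighter = cell_histogram(brighter)

print(hist)
print(l2_normalize(hist))
print(l2_normalize(hist_brighter))
```

All of the gradient energy of the vertical edge falls into the first orientation bin, and after normalization the histogram of the brightened copy is essentially the same as the original. That is the mechanism behind the insensitivity to lighting mentioned above: a constant offset vanishes in the gradient, and a contrast change is removed by normalization.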

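3. In the early days the hardware could not keep up with the computation that neural networks need, and only a small amount of data was available. Neural networks became practical only recently, once faster hardware and much larger datasets were available.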

4. Image classification looks at an image and determines what the object in it is.
Object detection finds the positions of the objects in a given image.
Semantic segmentation identifies both the content of an image and where it is located.
Instance segmentation identifies the contour of each individual object at the pixel level.
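
The difference between the four tasks is easiest to see in the shape of their outputs. The sketch below is only an illustration: real systems predict these outputs with trained models, whereas here everything is derived from a hand-made toy mask. The names `label_instances` and `bounding_boxes`, the class table, and the use of 4-connected components to separate instances are all assumptions made for this example.

```python
from collections import deque

# Toy semantic segmentation output: one class ID per pixel.
# 0 = background, 1 = cat. The image contains two separate cats.
CLASS_NAMES = {0: "background", 1: "cat"}
semantic_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]


def label_instances(mask):
    """Split a semantic mask into instances via 4-connected components.

    Returns (instance_mask, count), where instance_mask holds 0 for
    background and 1..count for the individual objects.
    """
    height, width = len(mask), len(mask[0])
    instances = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 0 or instances[y][x] != 0:
                continue
            count += 1
            instances[y][x] = count
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == mask[cy][cx]
                            and instances[ny][nx] == 0):
                        instances[ny][nx] = count
                        queue.append((ny, nx))
    return instances, count


def bounding_boxes(instances):
    """Return {instance_id: (x_min, y_min, x_max, y_max)}."""
    boxes = {}
    for y, row in enumerate(instances):
        for x, inst in enumerate(row):
            if inst == 0:
                continue
            x0, y0, x1, y1 = boxes.get(inst, (x, y, x, y))
            boxes[inst] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


# Classification: which classes appear in the image at all.
image_labels = {CLASS_NAMES[c] for row in semantic_mask for c in row if c != 0}
# Instance segmentation: a separate ID for each object, per pixel.
instance_mask, count = label_instances(semantic_mask)
# Detection: one box per object.
boxes = bounding_boxes(instance_mask)

print(image_labels)
print(count, boxes)
```

Classification yields a single answer for the whole image, detection yields one box per object, semantic segmentation gives both cats the same per-pixel label, and instance segmentation tells the two cats apart.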