J. Cent. South Univ. Technol. (2010) 17: 580-586
DOI: 10.1007/s11771-010-0526-0
Novel histogram descriptor for global feature extraction and description
ZHANG Gang(张 刚)1, 2, MA Zong-min(马宗民)1, DENG Li-guo(邓立国)1, XU Chang-ming(徐长明)1
1. College of Information Science and Engineering, Northeastern University, Shenyang 110004, China;
2. School of Software, Shenyang University of Technology, Shenyang 110023, China
? Central South University Press and Springer-Verlag Berlin Heidelberg 2010
Abstract: A novel histogram descriptor for global feature extraction and description was presented. Three elementary primitives for a 2×2 pixel grid were defined. The complex primitives were computed by matrix transforms. These primitives and equivalence class were used for an image to compute the feature image that consisted of three elementary primitives. Histogram was used for the transformed image to extract and describe the features. Furthermore, comparisons were made among the novel histogram descriptor, the gray histogram and the edge histogram with regard to feature vector dimension and retrieval performance. The experimental results show that the novel histogram can not only reduce the effect of noise and illumination change, but also compute the feature vector of lower dimension. Furthermore, the system using the novel histogram has better retrieval performance.
Key words: feature extraction and description; histogram descriptor; gray histogram; edge histogram
1 Introduction
Histogram is one of the most important approaches to feature extraction and description [1-2]. With simple computation and invariance to translation and rotation, moreover invariance to scale after being normalized, the histogram is one of the commonly used approaches for global feature extraction and description [3-6]. However, the approach is susceptible to noise and illumination change [7-9].
Edge histogram [10] is based on research results of psychophysiology that human eyes are sensitive to edge features for image perception, and extract and describe edge features by the frequency and the directionality of brightness changes in the image. The information unit in the edge histogram is not a pixel, but a primitive compared with the gray histogram. Besides, it is not sensitive to noise and illumination change. The dimension of the texture feature vectors computed by the edge histogram is 80. The approach can be used for local feature extraction and description. WON et al [11] proposed a modified approach. The approach can not only extract and describe the local features, but also the global features. When the edge histogram and its modified version are used for content-based image retrieval, retrieval performance of the system can be improved greatly [10-11]. However, statistics of primitives are computed only in the sub-images, regardless of the primitives among the adjacent sub-images.
A two-dimensional histogram descriptor, whose idea is similar to that in Refs.[10-11], was presented by JHANWAR et al [12]. The approach defines six primitives, i.e., z, n, u, c, γ and α. An image is divided into a group of 2×2 pixel grids. Then, the primitive of each pixel grid is computed. The feature image consists of the primitives. Occurrence matrix is used to extract and describe the image features. The approach is not sensitive to noise and illumination change.
HE et al [13] presented a histogram descriptor. The approach computes the primitives by gray changes in the neighborhood of an image pixel, and the primitive for each pixel in the image. The histogram is used for the feature image to compute statistics of distribution. The approach is similar to that in Refs.[10-12]. Besides, it solves the problems in Refs.[10-12] effectively. However, the dimension of feature vector is high. SHI et al [14] presented a texture spectrum histogram descriptor. The approach can solve the problems in Ref.[13] effectively. It defines primitives by the approach which is similar to that in Ref.[13], and computes the feature image and statistics of distribution. Besides, the primitives are categorized into several equivalence classes by symmetry invariance, which reduces the dimension of a feature vector. The approach computes a primitive by eight pixels except the center in a 3×3 neighborhood of a pixel. Although the approach can reduce the amount of different primitives, the used primitive set is only sub-set of the set that consists of all primitives in the 3×3 neighborhood.
The contribution of this work is to present a novel histogram descriptor for global feature extraction and description. The approach defines the primitives according to Ref.[12], and defines three elementary primitives. Three equivalence classes of primitives are computed by three elementary primitives and matrix transforms. The approach uses the idea that is similar to that in Ref.[14], and introduces a transform matrix set that contains 19 transform matrix. The primitive class is computed for each 2×2 pixel grid in an image, and the feature image consists of computed primitives. The histogram is used for the feature image to extract and describe the features. The approach keeps the merits of the approaches in Refs.[12,14], and uses different approaches to extract and describe the features. Besides, the approach can be used for global feature extraction and description.
2 Novel histogram descriptor for global feature extraction and description
The novel histogram defines three elementary primitives and computes three equivalence classes of primitives by matrix transforms. Then, the feature image is computed by the equivalence classes. The definition of primitives is different from that in the edge histogram. Moreover, the approach does not compute the local histogram, but the global histogram.
Assume that f (x, y) denotes an image of size M×N and the gray level 256. The centers of pixels are connected to form a primitive according to gray values from high to low for a 2×2 pixel grid of f (x, y), as shown in Fig.1(a). There are 12 primitives for a 2×2 pixel grid, as shown in Fig.1(b). It is found from Fig.1(b) that if the primitives are invariant to rotation and symmetry, there will be only three different primitives, i.e., (1), (3) and (5). These primitives are called the elementary primitives, and denoted as 1, 2, and 3, respectively.
Twelve primitives can be computed by use of matrix transforms for three different primitives. The matrix transform set contains rotation transforms, symmetry transforms, and complex transforms. Here, the rotation transforms contains rotation 90?, 180? and 270? transforms. The symmetry transforms contains symmetry to x-axis, y-axis, y=x and y=-x. The elementary transform matrices are shown in Fig.2. Twelve complex transform matrices can be computed by the composition of seven elementary transform matrices, for example, rotation 90? and symmetry to x-axis.
Three equivalence classes of primitives can be computed by the elementary primitives and matrix transforms. If the primitive of a 2×2 pixel grid can be transformed into an elementary primitive by one of 19 transforms, it will be considered to be equivalent to this elementary primitive. There are only three equivalence classes of primitives as there are only three elementary primitives.
Fig.1 Twelve primitives of 2×2 pixel grids: (a) Primitive of 2×2 pixel grid; (b) Twelve primitives
Fig.2 Seven elementary transform matrices
For each 2×2 pixel grid of the image, its equivalence class of primitive can be computed by use of transform matrices for the pixel grid. The feature image consisting of the primitives is of size (M-1)×(N-1).
An image of size 8×8 and its feature image are shown in Fig.3. When computing the primitive of a 2×2 pixel grid, we can find that the primitive of the 2×2 pixel grid in the real line boundary refers to two kinds of elementary primitives, i.e., 2 and 3, and the primitive of the 2×2 pixel grid in the dash line boundary refers to two kinds of primitives, i.e., 2 and 3. To solve the divergence, descending priority is designated for elementary primitives 1, 2 and 3. For other cases of a 2×2 pixel grid, the 2×2 pixel grid is regarded as no primitive and expressed as blank area. Finally, the feature image is shown in Fig.3(b). The histogram was used for the feature image to compute the statistics of primitive distribution.
3 Feature extraction and description by novel histogram descriptor
The novel histogram was used for each image in the database, and then the features were extracted and described. Assume thatdenotes the feature image of f (x, y) and SF denotes the feature vector of . Then, SF can be defined as
SF={h[k]|k?[1, 3]} (1)
where h(k) can be denoted as
(2)
and t(i, j) can be denoted as
(3)
When a query image is provided, the novel histogram is used to extract and describe the image features. Euclidean distance was used to measure similarity between images. Assume that SFa denotes the feature vector of the query image, SFb denotes the feature vector of an image in the image database, M denotes the dimension of the feature vectors, and LM denotes the Euclidean distance between images. Then, LM can be defined as
(4)
If there are similar images, these images will be located in the image database and returned to the user in descending order. If no matching information is found in the feature database, the user will receive a prompting message.
4 Experiments and performance analyses
To verify the retrieval performance of content-based image retrieval system which uses the novel histogram, experiments were conducted on the image database which consists of 3 000 medical images from the image database of hospital to determine the computation time of a feature vector and measure retrieval performance of system. All of the images are gray images of size 481×481, and the numbers of cervical vertebra images, cervical disc images and lumbar disc images are all 1 000. Test codes are compiled by Matlab. Machine configuration for test is Intel(R) Pentium(R) Dual CPU E2180 2.00 GHz and 1.00 GB memory.
4.1 Computation time of feature vectors
Thirty images, where each class contains 10 images, were selected from the image database. The gray histogram, edge histogram and novel histogram were used for feature extraction and description. The computation time of a feature vector was measured by average time. Finally, the computation time of a feature vector of three approaches is 0.02, 0.12 and 1.29 s, respectively. Taking into account the efficiency of retrieval systems, feature extraction and description, retrieval are divided into two stages. Furthermore, the feature extraction and description of images are carried out before retrieval in the image database.
4.2 Retrieval performance of system using novel histogram
Two test sets were formed from the image database. Each test set contained 300 images, and the numbers of cervical vertebra images, cervical disc images and lumbar disc images were 100, respectively. In the first test set, 60 images were selected as the basic images from the image database. Then, each basic image was deformed by three kinds of transformation, including two rotations, one scaling, and one translation (see Fig.4(a)). In the second test set, 300 images were selected as the basic images from the image database (see Fig.4(b)). Three different kinds of images were selected from each test set to form a query image set. Besides, the similarity measure was used to measure similarity among images, and precision and recall in Ref.[15] were used to measure retrieval performance of the system.
Fig.3 8×8 image (a) and its feature image (b)
Fig.4 Two sets of test images: (a) The first test set; (b) The second test set
Three test results for the first and second test sets are shown in Figs.5 and 6, respectively. The first image on the left is a query image. The images on the right are the retrieved images when similarity measures are carried out among the feature vectors from the gray histogram, the edge histogram and the novel histogram descriptor, respectively. For the purpose of comparisons, the most similar eight images are shown. It can be found from Figs.5 and 6 that the retrieval precision of system using the novel histogram is higher than that of systems using the gray histogram and edge histogram.
To measure the retrieval performance of the system using the novel histogram, the gray histogram, the edge histogram and the novel histogram were used for the images in the image database, respectively. The computed features were stored into the feature database. When a query image is provided, the gray histogram, the edge histogram and the novel histogram are used to extract and describe the features, respectively. Statistical analyses are carried out for the former eight images. After nine times retrieval tests, the curves of retrieval precision and recall at the first test set and the second test set are shown in Figs.7 and 8, respectively. The horizontal axis of the curves denotes the images where the first three are cervical disc images, the middle three are lumbar disc images and the last three are cervical vertebra images.
Fig.5 Three test results for the first test set
It can be found from Figs.7 and 8 that the retrieval performance of the system using the gray histogram is better than that of the system using the edge histogram, and the retrieval performance of the system using the novel histogram is better than that of system using the gray histogram. The similar conclusions are drawn for recall.
5 Conclusions
(1) A novel histogram descriptor for global feature extraction and description is presented. The approach uses a primary as an elementary information unit to reduce the effect of noise and illumination change. It uses equivalence class to reduce the dimension of feature vector. Then, the histogram is used to extract and describe the image features. The dimension of the feature rector from the novel histogram is 3.
(2) The novel histogram can reduce the effect of noise and illumination change effectively.
(3) The feature vector from the novel histogram has lower dimension than that of the gray histogram and the edge histogram.
(4) The novel histogram has better retrieval performance than the gray histogram and the edge histogram, which means that the features from the novel histogram have stronger discrimination.
Fig.6 Three test results for the second test set
Fig.7 Curves of retrieval precision (a) and recall (b) for the first test
Fig.8 Curves of retrieval precision (a) and recall (b) for the second test
References
[1] SWAIN M J, BALLARD D H. Color indexing [J]. International Journal of Computer Vision, 1991, 7(1): 11-32.
[2] DING Nan, ZHOU Shu-de, SUN Zeng-qi. Histogram-based estimation of distribution algorithm: A competent method for continuous optimization [J]. Journal of Computer Science and Technology, 2008, 23(1): 35-43.
[3] PI M H, TONG C S, CHOY S K, ZHANG H. A fast and effective model for wavelet subband histograms and its application in texture image retrieval [J]. IEEE Transactions on Image Processing, 2006, 15(10): 3078-3088.
[4] KIM C R, CHUNG C W. A multi-step approach for partial similarity search in large image data using histogram intersection [J]. Information and Software Technology, 2003, 45(4): 203-215.
[5] ZHANG Hong-liang, ZOU Zhong, LI Jie, CHEN Xiang-tao. Flame image recognition of alumina rotary kiln by artificial neural network and support vector machine methods [J]. Journal of Central South University of Technology, 2008, 15(1): 39-43.
[6] ZHOU Jie, XIN Le-ping, ZHANG David. Scale-orientation histogram for texture image retrieval [J]. Pattern Recognition, 2003, 36(4): 1061-1063.
[7] SHAN Y, SAWHNEY H S, MATEI B, KUMAR R. Shapeme histogram projection and matching for partial object recognition [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(4): 568-577.
[8] LEOW W K, LI R. The analysis and applications of adaptive-binning color histograms [J]. Computer Vision and Image Understanding, 2004, 94(1/3): 67-91.
[9] SONG K T, TAI J C. Real-time background estimation of traffic imagery using group-based histogram [J]. Journal of Information Science and Engineering, 2008, 24(2): 411-423.
[10] MANJUNATH B S, OHM J R, VASUDEVAN V V, YAMADA A. Color and texture descriptors [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2001, 11(6): 703-715.
[11] WON C S, PARK D K, PARK S J. Efficient use of MPEG-7 edge histogram descriptor [J]. ETRI Journal, 2002, 24(1): 23-30.
[12] JHANWAR N, CHAUDHURI S, SEETHARAMAN G, ZAVIDOVIQUE B. Content based image retrieval using motif cooccurrence matrix [J]. Image and Vision Computing, 2004, 22(14): 1211-1220.
[13] HE Dong-chen, WANG Li. Texture features based on texture spectrum [J]. Pattern Recognition, 1991, 24(5): 391-399.
[14] SHI Zhi-ping, HU Hong, LI Qing-yong, SHI Zhong-zhi, DUAN Chan-lun. Texture spectrum descriptor based image retrieval [J]. Journal of Software, 2005, 16(6): 1039-1045. (in Chinese)
[15] M?LLER H, MICHOUX N, BANDON D, GEISSBUHLER A. A review of content-based image retrieval systems in medical applications—Clinical benefits and future directions [J]. International Journal of Medical Informatics, 2004, 73(1): 1-23.
Foundation item: Project(60873010) supported by the National Natural Science Foundation of China; Projects(N090504005, N090604012, N090104001) supported by the Fundamental Research Funds for the Central Universities; Project(NCET-05-0288) supported by Program for New Century Excellent Talents in University
Received date: 2009-05-17; Accepted date: 2009-10-11
Corresponding author: MA Zong-min, PhD, Professor; Tel: +86-24-83681582; E-mail: zongmin_ma@yahoo.com
(Edited by LIU Hua-sen)