With advances in computer technology, images have become easy to access. At the same time, the growth in the number of digital images has made searching for a desired image more complex. As image collections have grown, large databases have been created, and many systems now need to organize, store, and retrieve images from them. Image retrieval methods have developed accordingly. The image retrieval process allows database users to search for the images they want. Initially, images were retrieved with keywords that captured their semantic information. This approach was very useful for small databases, because all the images in a database could be described with just a few hundred keywords. However, as databases expanded, the images themselves also changed: larger images contain different components that a keyword alone cannot describe. Therefore, content-based image retrieval (CBIR) was proposed. A CBIR system has three main components: extracting low-level image features such as color, texture, and shape; storing these features; and measuring similarity. CBIR systems differ mainly in how features are extracted and how similarity is measured.
One of the most important databases in e-commerce is the tile and ceramic database, for which no specific retrieval method has been provided so far. The aim of this paper is to design a retrieval method based on the content of tile and ceramic images that saves the user time and money. To prepare the image database, 520 different tiles and ceramics available in the market were photographed at different angles and directions. Additional information, such as thickness and degree, was then attached to each image. In such a CBIR system, a user who queries with a tile or ceramic image of interest not only sees all the similar images in the database but also receives additional information that cannot be obtained just by viewing the image. Because the images are very similar to one another, keyword-based retrieval methods could not be used, so a CBIR system is used instead. Suitable features therefore have to be selected for the different tile and ceramic images.
After reviewing and implementing several previous studies [4-24], the average color dominance (ACD) and HSV (Hue-Saturation-Value) color features and the energy and contrast texture features are selected. According to the existing designs of tiles and ceramics in the market, a database of 520 different tile and ceramic images is created. Each image of the database is divided into nine sub-images, and all the images are then grouped into three clusters based on their sub-images. Next, the selected features are extracted from each database image and from the query image. These are the minimum features required, which reduces the amount of computation and stored information and speeds up the retrieval. Finally, a feature vector is formed for each image, and the feature vector of the query image is matched against the feature vectors of the images in our database. The retrieval results show that the accuracy and speed of the proposed method are improved by 16.55% and 23.88%, respectively, compared to the most similar methods. The rest of the paper is organized as follows. Related works are presented in section 2. The proposed method is introduced in section 3. In section 4, the retrieval results of the proposed method on our tile and ceramic database are shown, and the proposed method is compared with a number of the most similar methods. Section 5 concludes the paper.
2- Related Work
The use of color and texture visual features in the CBIR was enhanced in  by adding a new color feature called average color dominance (ACD), which tried to enhance the color description using the dominant colors of an image. Images were also retrieved in  by their contents such as color, texture, shape, or objects. Thus, the degree of similarity between the query image and database images was measured by color features, texture features, shape features, or object presence in the two images. It was shown that a better way to retrieve images was to use multiple visual features. Furthermore, both scale and rotational invariance in images were provided in  for the CBIR. A CBIR method was proposed in , which was a modified form of the color averaging technique. In the previous technique, the row mean, column mean, forward diagonal mean, backward diagonal mean, and the sums of the row, column, and diagonal intensity values were calculated for the similarity measure. However, the sums could be the same while the intensity values at the corresponding positions in the query image and database image differed; as a result, irrelevant images were retrieved. To overcome this problem, the positions of the intensity values in both the query image and database images were also considered for the similarity measure. A CBIR method was also introduced in  using both the color and texture features. To extract the color feature, the color moment was calculated with the image in the HSV color space. To extract the texture feature, the Ranklet transform was performed on the grey-scale image, and then the texture feature was extracted from the Ranklet images by calculating the texture moments. Furthermore, it was shown in  that image retrieval using a single feature did not provide a good solution in terms of accuracy and efficiency. It was also shown that the most important visual features were color and texture.
Therefore, a technique for retrieving the images was presented based on their content namely color, texture, and combination of both. In addition, an algorithm was introduced in  that incorporated all three features of color, shape and texture to give the advantages of various other algorithms to improve the accuracy and performance of image retrieval. The accuracy of HSV color space-based color histogram-based matching gave better retrieval results. The speed of shape-based retrieval was enhanced by considering approximate shape rather than the exact shape. The Gray-Level Co-occurrence Matrix (GLCM) was used to extract the texture features. The feature matching procedure was based on the Canberra distance. A CBIR method was proposed in  by extracting both color and texture feature vectors using the discrete Wavelet transform and the Self Organizing Map (SOM) artificial neural networks. At query time, texture vectors were compared using a similarity measure of the Euclidean distance, and the most similar image was retrieved. In addition, other relevant images were retrieved using the neighborhood of the most similar image from the clustered data set via SOM. Two algorithms of the color histogram and Wavelet-based color histogram (WBCH) were implemented in  which were used respectively for the color and combined color and texture features extraction from both the query image and database images. Then the retrieved images by both algorithms were compared on the basic values of parameters that showed the WBCH algorithm was better than the color histogram algorithm in terms of the retrieval time as it took less time for the image retrieval. A method was introduced in  which presented the query image as an input and got related images as the output that matched the content of the query image. The findings were not only based on the color, texture, and shape but also traced the underlying points of an image. 
At first, the images were retrieved based on the color, then taken after by texture, and finally traced by the underlying graphical structure. Two algorithms of color and texture extraction were introduced in . The Color histogram that was rotation invariant about the view axis, was used to represent color features but it could not entirely characterize the image. The texture feature extraction was presented based on the Gabor filter. A CBIR system was also presented in  that used a combination of the completed local binary pattern (CLBP) and color autocorrelogram. The CLBP features were extracted on a multi-resolution multi-direction filtered domain of value component. The color autocorrelogram features were extracted in two dimensions of hue and saturation components. Furthermore, an image retrieval method was presented in  based on the texture structure histogram (TSH) and Gabor texture feature extraction. In the TSH technique to describe the texture feature, the edge orientation and HSV color information methods were used. To make the image content more reasonable non-equal interval quantization scheme was used. The image texture was retrieved using the mean and variance of the Gabor filtered image. To get the same dominant direction, rotation normalization was used. The comparison of both texture techniques was discussed. A hybrid-feature extraction approach was described in  to solve the problem of designing a CBIR system manually. Two features were used to retrieve images such as color and texture. The color feature was extracted using different color spaces. The texture feature was extracted using the GLCM. An image was retrieved by combining the color and texture features. A method was introduced in  that combined both color and texture features in a hierarchical manner to retrieve an image and showed its advantage. They also introduced a method of image segmentation for feature extraction. 
The proposed hierarchical approach was applied to the standard INRIA dataset. A two phases approach was proposed in  to retrieve images from the data set based on color and texture. In the first phase, the HSV global color histogram was used and an automatic cropping technique was introduced to accelerate the features extraction process and enhanced the retrieval accuracy. In the second phase, the joint histogram and GLCM were deployed and the color and texture features were combined to enhance the retrieval accuracy. Finally, the images were classified and retrieved using the K-means algorithm. Two experiments were conducted using the WANG database consisting of 10 different classes each with 100 images. A three-level hierarchical CBIR system was presented in  where, each level of the hierarchy used either the texture, shape, or color image features to reduce the size of the image database by discarding the irrelevant images and at the final level of the hierarchy, it extracted the most analogous images from the reduced image database. The adaptive Tetrolet transform was used to extract the texture features from the regions of interest in images. To extract the shape features, the edge joint histogram was proposed which used the orientation of edge pixels and their distance from the origin together to create the joint histogram. For the color feature extraction, another color channel correlation histogram was introduced. The order of the three different feature extraction processes on each level of the hierarchy was not rigid because it was difficult to predict the proper order for the highest retrieval.
A CBIR approach was proposed in  that combined the visual and textual features to retrieve images. Firstly, the method classified the query image as textual or non-textual. The textual query image formed a bag of textual words. The visual salient features were extracted from the non-textual query image and formed a bag of visual words. Next, the method fused the visual and textual features, and the top similar images were retrieved based on the fused feature vector. Three modes of retrieval were used: image query, keywords, and a combination of both. A technique was proposed in  to fuse the spatial color information with the shape-extracted features and object recognition. For the RGB (Red-Green-Blue) channels, L2 spatial color arrangements were applied and features were extracted, and these were then fused with the intensity-ranged shapes formed by connecting the discovered edges and corners for the grey level image. They used the perifoveal receptive field estimation with 128-bit cascade matching with symmetric sampling on the detected interest points to discover information for the complex, overlay, foreground, and background objects. The process was accomplished, firstly, by reducing the massive feature vectors and selecting a high variance coefficient, and secondly, by obtaining the indexing and retrieval with a Bag-of-Words approach. An image retrieval method was presented in  based on a combination of the local texture information derived from two different texture descriptors. First, the color channels of the input image were separated. The texture information was extracted using two descriptors, namely the evaluated local binary patterns and predefined pattern units. After extracting the features, the similarity matching was done based on the distance criteria.
A deep cross-modality Hashing network was proposed in , in which the optical images with three channels were first transformed into four different types of single channel images to increase the diversity of training modalities. This helped the network to focus mainly on extracting the contour and texture shared features and made it less sensitive to the color information for images across modalities. Second, it combined any type of randomly selected transformed images and their corresponding optical images to form image pairs that were fed into the networks. The training strategy, with paired image data, eliminated the large cross-modality variations caused by different modalities. Finally, the triplet loss, in combination with the Hash function, helped the model to extract the discriminative features of images and upgraded the retrieval efficiency. An image retrieval algorithm was introduced in  by integrating the image color information and surface geometry principal curvatures information. First, the color histogram of the quantized color image was obtained. Simultaneously, the Hessian matrix was used to extract the image texture information, and the joint histogram of the oriented gradient with the mixed sampling and multi-scale was constructed. Then, the color histogram and histogram of the oriented gradient were fused to obtain the final joint histogram. A CBIR method was presented in  that first modified the traditional microstructure descriptor (MSD) to capture the direct relationship between the shape and texture features and that between the color and texture features. Then, the image uniform local binary patterns (LBP) histogram was extracted to capture the color difference information. At the image comparison stage, first, the image descriptors were compared to compute their similarities. The similarity between each pair of images was then updated by considering the similarities to comparable images within the dataset.
Accordingly, the final similarities of the images were obtained. A CBIR technique was proposed in  that focused on the extraction and reduction of multiple features. To obtain the multi-level decomposition of the image, the discrete Wavelet transformation was applied to the RGB channels initially, and a local binary pattern texture descriptor was applied to the transformed images. The additional information was also extracted using the magnitude information. Then, the GLCM description was used to extract the statistical characteristics for texture image classification. The proposed technique was applied to the CORAL dataset with the help of the particle swarm optimization-based feature selector to minimize the number of features that could be used during the classification process. In addition, three classifiers, namely the support vector machine, K-nearest neighbor, and decision tree, were trained and tested. A CBIR system was presented in  in a hierarchical mode based on three visual features, namely color, texture, and shape, for retrieving the most relevant images from a large-scale image database. In each stage, the retrieval process discarded the irrelevant images by a filtering process, and as a result the search space was reduced in subsequent stages. The system implementation results were evaluated on the GHIM-10k and Corel-1k databases according to the average precision criterion. The value of average precision was reported at 74.87% and 80.08% on the GHIM-10k and Corel-1k databases respectively. A feature descriptor named “correlated microstructure descriptor” (CMSD) was proposed in  to incorporate the high-level semantic concepts for image retrieval. The CMSD relied on the various aspects of an input image by correlating color, texture orientation, and intensity information. The correlated information was mapped with the microstructure to extract all the fine detail within the subject area.
A richer multi-directional edge orientation map was also constructed by quantizing the edge obtained by the 4D Sobel operator into 6 levels. Moreover, the local information was obtained by quantizing the V component of HSV into 10 levels. Experiments were performed on standard databases of Corel-1k, Corel-5k, and Corel-10k. The method implementation results evaluation according to the average precision criterion showed the best value of 78.54%.
A relevance feedback retrieval method (RFRM) was introduced in  for content-based medical image retrieval (CBMIR). The feedback was based on voting values performed by each class in the image repository. A group of color and texture features was extracted based on the color moments and GLCM texture features. For the similarity measure, eight common similarity coefficients were used. After a brief search with a single random image query, the top images retrieved from each class were used as voters to select the most effective similarity coefficient, which was then used for the final searching process. The method was implemented on the Kvasir database, which has 4000 images divided into 8 classes. The method implementation results showed an average precision of 85% on the Kvasir database. A method was proposed in  that mainly concentrated on extracting the dominant color information of the image using the clustering process. The clustering process was initiated by the proposed seed point selection approach. This approach derived the number of seed points using the first-order statistical measure and the maximum range of the distributed pixel values. Moreover, the method gave equal priority to the dominant color and its occurrence information in calculating the similarity between query and database images. The performance of the method was investigated on the SIMPLIcity, Corel-10k, OT-scene, Oxford flower, and GHIM databases. A feature detector was presented in  by performing non-max suppression after detecting edges and corners based on corner score and pixel derivation-based shapes on intensity-based interest points. Thereafter, interest point description was applied to the interest point feature set by using symmetric sampling to cascade matching produced by validating dense distributed receptive fields after estimating perifoveal receptive fields.
A spatial color-based feature vector was fused with the retinal and color-based feature vector extracted after applying L2 normalization on a spatially arranged color image. Dimensions were reduced by using PCA on the massive feature vectors produced after symmetric sampling, and the result was passed in fused form to a bag-of-words model for indexing and retrieval of images. Experiments were performed on the Corel-1k, Corel-10k, Caltech-101, ImageNet, ALOT, COIL, FTVL, 102-flowers, and 17-flowers databases. The method was also compared with seven other descriptors. A comprehensive survey of deep learning based developments in the past decade was presented in  for the CBIR. The existing methods were categorized from different perspectives for a better understanding of the progress. The taxonomy used in this survey covered different supervision, different networks, different descriptor types, and different retrieval types. A performance analysis was also performed using the state-of-the-art methods.
3- Proposed Method
The steps of the proposed retrieval method are shown in Fig. 1. These steps are described as follows.
3-1- Image Division
In this step, the tile and ceramic images in the database, which were photographed by a Sony Cyber-shot camera, 1/12 Megapixels, are resized to 240*150 pixels. Then, each image is divided into nine equal sub-images of size 80*50 pixels. Since many tile and ceramic designs are concentrated in the middle, this division places the main design of a tile or ceramic either in the middle sub-image or in the sub-images around the perimeter. The image division process increases the retrieval accuracy, but it also increases the retrieval time. On the other hand, based on the existing designs of tiles and ceramics in the market, the images of our database can be grouped into three clusters: images that have a design in the middle, images that have a design in the margins, and images that have both. Therefore, we group all the tiles and ceramics in our database into three clusters in order to retrieve them faster, with less computation, and to compensate for the time added by the image division process.
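The division step can be illustrated with a minimal sketch. It assumes that 240 is the image width and 150 the height, and it represents an image as a plain list of pixel rows to stay dependency-free. The function name `split_into_nine` and the row-major block order are illustrative choices, not taken from the paper.

```python
def split_into_nine(image):
    """Split an image (a list of pixel rows) into a 3 x 3 grid of equal blocks.

    Blocks are returned in row-major order, so index 4 is the middle
    sub-image and indices 1, 3, 5, 7 are the top, left, right and bottom ones.
    """
    height, width = len(image), len(image[0])
    if height % 3 or width % 3:
        raise ValueError("height and width must both be divisible by 3")
    sub_h, sub_w = height // 3, width // 3
    blocks = []
    for r in range(3):
        for c in range(3):
            rows = image[r * sub_h:(r + 1) * sub_h]
            blocks.append([row[c * sub_w:(c + 1) * sub_w] for row in rows])
    return blocks


# A 240*150 image (assumed width*height) holds 150 rows of 240 pixels each.
image = [[(0, 0, 0)] * 240 for _ in range(150)]
blocks = split_into_nine(image)
middle = blocks[4]
cross = [blocks[i] for i in (1, 3, 5, 7)]  # top, left, right, bottom
```

Under these assumptions every block has 50 rows of 80 pixels, and the `middle` and `cross` blocks are the five sub-images that the grouping step examines.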
Fig. 1 Steps of the proposed retrieval method for tile and ceramic images database
3-2- Grouping of the Database Images
The grouping process is performed according to the flowchart shown in Fig. 2. Five of the nine sub-images of each database image are examined: the middle sub-image and the four sub-images at the top, left, right, and bottom. If a dominant color occupies more than half of the pixels in the middle sub-image, the design is not in the middle but in the margin (group 1). In this case, there is no need to examine the dominant colors of the other four sub-images. However, the feature vector still has to be extracted from these four sub-images, because in the images of group 1 the color design may be at the top, left, right, or bottom of the image (Fig. 3). Finally, the feature vectors of the four sub-images are combined to form a feature vector for the original image.
Fig. 2 Grouping process of the database of tile and ceramic images
In the second group, the design is in the middle of the image. Therefore, counting the pixels of the dominant colors in the middle sub-image yields more than one dominant color, so the pixels of the sub-image are divided between two or three colors. In this case, the number of pixels of the dominant colors in the four other sub-images also has to be counted. If the number of dominant colors is equal in all four sub-images, the design is certainly in the middle, and the feature vector has to be extracted only from the middle sub-image (Fig. 4). The third group includes images in which the design is both in the middle and in the margin. Here, the middle sub-image and the four other sub-images each have more than one dominant color, and the number of dominant colors is not equal across the four sub-images. So, the feature vector has to be extracted from all five sub-images of the original image, and finally the five feature vectors are combined to form a feature vector for the original image (Fig. 5).
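The grouping rule can be sketched as follows. This is only an illustration under stated assumptions: each sub-image is given as a 2-D list of already quantized color labels, and a color counts as "dominant" when it covers at least a hypothetical share `min_share` of the sub-image. The text does not state such a threshold, so the value 0.2 and all function names here are assumptions.

```python
from collections import Counter


def color_shares(block):
    """Return {color label: fraction of pixels} for one sub-image."""
    counts = Counter(label for row in block for label in row)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def count_dominant(block, min_share=0.2):
    """Number of colors covering at least min_share of the sub-image."""
    return sum(1 for share in color_shares(block).values() if share >= min_share)


def assign_group(middle, cross, min_share=0.2):
    """Return 1 (design in the margin), 2 (design in the middle) or 3 (both).

    middle: the central sub-image; cross: top, left, right, bottom sub-images.
    """
    # Group 1: a single color fills more than half of the middle sub-image.
    if max(color_shares(middle).values()) > 0.5:
        return 1
    # Otherwise compare the dominant-color counts of the four outer sub-images.
    counts = {count_dominant(block, min_share) for block in cross}
    return 2 if len(counts) == 1 else 3


plain = [[0] * 4 for _ in range(4)]        # one flat color
pattern = [[0, 0, 1, 1] for _ in range(4)]  # two colors, half each
group_margin = assign_group(plain, [pattern] * 4)
group_middle = assign_group(pattern, [plain] * 4)
group_both = assign_group(pattern, [plain, plain, pattern, plain])
```

In this toy example the three calls fall into groups 1, 2 and 3 respectively. On real photographs the result would depend strongly on the color quantization and on the chosen `min_share`.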
3-3- Feature Extraction
In CBIR, features have to be selected so as to increase accuracy and decrease retrieval time. The smaller the number of features and the amount of computation, the faster the retrieval. Of course, even a small number of features should be sufficient to maintain retrieval accuracy. In the proposed method, in order to select the appropriate features for tile and ceramic images, we reviewed previous research available in various references. Then we tested several features and selected the ACD, HSV, energy, and contrast features for the reasons mentioned above. The results of these experiments are shown in section 4. Regarding the feature extraction method, once the group is specified, the features are extracted only from a few sub-images instead of from the whole original image, which reduces the computational volume and thus speeds up the retrieval.
Fig. 3 The sample images of tiles and ceramics in group 1
Fig. 4 The sample images of tiles and ceramics in group 2
Fig. 5 The sample images of tiles and ceramics in group 3
3-3-1- HSV Color Feature
The HSV color space is closer to human color perception due to its uniformity and is therefore more suitable for retrieval. Each of the HSV components (H, S, and V) has a wide range of values. In the proposed method, the H component is quantized into 8 bins and each of the S and V components into 3 bins. Therefore, 8*3*3 = 72 color features are obtained according to Equation (1). Then, the three components are combined by Equation (2), and the final HSV feature vector is formed.
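A minimal sketch of an 8*3*3 HSV quantization is given below. The bin edges and the way the three quantized components are merged into one index are assumptions: uniform bins and the common weighting 9*H + 3*S + V are used here, which may differ from Equations (1) and (2) of the paper. The function names are illustrative.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 8, 3, 3  # 8 * 3 * 3 = 72 combined bins


def hsv_bin(r, g, b):
    """Map an 8-bit RGB pixel to a combined HSV bin index in [0, 71]."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Uniform quantization (an assumption); min() keeps the value 1.0 in the top bin.
    hq = min(int(h * H_BINS), H_BINS - 1)
    sq = min(int(s * S_BINS), S_BINS - 1)
    vq = min(int(v * V_BINS), V_BINS - 1)
    return hq * S_BINS * V_BINS + sq * V_BINS + vq


def hsv_histogram(pixels):
    """Normalized 72-bin histogram of an iterable of (r, g, b) pixels."""
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    count = 0
    for r, g, b in pixels:
        hist[hsv_bin(r, g, b)] += 1.0
        count += 1
    return [x / count for x in hist] if count else hist


hist = hsv_histogram([(255, 0, 0), (255, 0, 0), (0, 0, 0), (255, 255, 255)])
```

Normalizing the histogram by the pixel count makes sub-images of different sizes comparable, which is a design choice of this sketch rather than something stated in the text.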
3-3-2- ACD Color Feature
To extract the ACD color feature, the original image is first quantized into 38 colors. Then the pixels that have the same color as all 4 of their neighbors are determined. Among these pixels, the 3 most frequent colors are selected, and for each of these three colors the average of the R, G, and B components is calculated. Finally, these three average values are averaged again, and the result is taken as the ACD color feature of the original image.
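The ACD steps can be sketched as below. The sketch assumes the 38-color quantization has already been done and is passed in as a 2-D list of color labels, since the quantization scheme itself is not described in the text. It also assumes that border pixels are skipped and that the final feature is a single scalar. The function name `acd_feature` is hypothetical.

```python
from collections import Counter


def acd_feature(labels, rgb, top_k=3):
    """Scalar ACD-style feature.

    labels: 2-D list of quantized color indices; rgb: 2-D list of (r, g, b)
    tuples of the same size. Border pixels are skipped because they lack
    four neighbors (an assumption of this sketch).
    """
    height, width = len(labels), len(labels[0])
    sums = {}           # label -> [sum_r, sum_g, sum_b] over uniform pixels
    counts = Counter()  # label -> number of uniform pixels
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            c = labels[y][x]
            if (labels[y - 1][x] == c and labels[y + 1][x] == c
                    and labels[y][x - 1] == c and labels[y][x + 1] == c):
                counts[c] += 1
                acc = sums.setdefault(c, [0, 0, 0])
                for k in range(3):
                    acc[k] += rgb[y][x][k]
    if not counts:
        return 0.0
    per_color = []
    for c, n in counts.most_common(top_k):
        mean_r, mean_g, mean_b = (s / n for s in sums[c])
        per_color.append((mean_r + mean_g + mean_b) / 3.0)
    # Average the per-color values of the most frequent uniform colors.
    return sum(per_color) / len(per_color)


flat_labels = [[0] * 5 for _ in range(5)]
flat_rgb = [[(30, 60, 90)] * 5 for _ in range(5)]
flat_value = acd_feature(flat_labels, flat_rgb)
```

For the flat example image, the only uniform color has R, G, B means of 30, 60 and 90, so the feature is their average.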
3-3-3- Contrast and Energy
To determine the contrast and energy as the texture features, the gray-level co-occurrence matrix (GLCM) of each image is formed first. Then, energy and contrast are calculated according to Equations (3) and (4),
where C(i, j) is defined by first specifying a displacement vector d and counting all pairs of pixels separated by d having gray levels i and j. Energy is a measure of the textural uniformity of an image. Contrast is the difference moment of the C matrix, and it measures the amount of local variation in an image.
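A minimal sketch of the two texture features follows. It assumes the usual textbook definitions, with energy as the sum of C(i, j)^2 and contrast as the sum of (i - j)^2 * C(i, j) over all gray-level pairs, and it normalizes C so that its entries sum to 1. Whether the paper's own equations use the same normalization is not known from the text. The default displacement d = (0, 1), one pixel to the right, is also an assumption.

```python
def glcm(gray, levels, d=(0, 1)):
    """Normalized co-occurrence matrix C for displacement d = (dy, dx).

    gray: 2-D list of integer gray levels in [0, levels).
    C[i][j] is the fraction of pixel pairs (p, p + d) with levels (i, j).
    """
    dy, dx = d
    height, width = len(gray), len(gray[0])
    C = [[0.0] * levels for _ in range(levels)]
    pairs = 0
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < height and 0 <= x2 < width:
                C[gray[y][x]][gray[y2][x2]] += 1.0
                pairs += 1
    if pairs:
        C = [[v / pairs for v in row] for row in C]
    return C


def energy(C):
    """Sum of squared entries; equals 1 for a perfectly uniform image."""
    return sum(v * v for row in C for v in row)


def contrast(C):
    """Difference moment: sum of (i - j)^2 * C[i][j]."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(C)
               for j, v in enumerate(row))


gray = [[0, 0, 1, 1],
        [0, 0, 1, 1]]
C = glcm(gray, levels=2)
texture = (energy(C), contrast(C))
```

In practice the gray levels are usually reduced to a small number (for example 8 or 16) before building the matrix, so that C stays small and well populated.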
3-3-4- Feature Vector of Each Group
Finally, in our database, for the 150 images of tiles and ceramics in group 1, four feature vectors are first extracted, one from each of the 4 sub-images of an image, and placed side by side in a feature vector of size 4*4. Then the 150 feature vectors of size 4*4 are combined into a feature array of size 4*4*150 for group 1. Similarly, for the 120 images of tiles and ceramics in group 2, because we only consider the middle sub-image, the final feature array has a size of 1*4*120. In addition, the final feature array of group 3, with 250 images, has a size of 5*4*250, because we consider 5 sub-images of the original image.
When a query image is given to the proposed CBIR system, it is first divided into nine equal sub-images, and then its group is determined according to the flowchart in Fig. 2. Next, the feature vector of the query image is extracted. Finally, the feature vector of the query image is compared only with the feature vectors of the images in that group, and the images with the smallest Euclidean distance are retrieved. Note that all the previous steps for the database images are performed offline, and only the retrieval step is online.
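The online matching step can be sketched as a nearest-neighbor search inside one group. The image identifiers, the feature values and the `top_k` parameter below are made up for illustration, and each image is assumed to be stored as one flat feature vector.

```python
import math


def retrieve(query_vector, group_db, top_k=5):
    """Rank the images of one group by Euclidean distance to the query.

    group_db: dict mapping image id -> feature vector (same length as query).
    Returns the top_k (image id, distance) pairs, nearest first.
    """
    ranked = sorted(
        ((image_id, math.dist(query_vector, vector))
         for image_id, vector in group_db.items()),
        key=lambda item: item[1],
    )
    return ranked[:top_k]


# Hypothetical group-2 database with made-up feature values.
group2 = {
    "tile_a": [0.40, 0.10, 0.30, 0.20],
    "tile_b": [0.90, 0.80, 0.10, 0.70],
    "tile_c": [0.42, 0.12, 0.28, 0.22],
}
results = retrieve([0.40, 0.10, 0.30, 0.21], group2, top_k=2)
```

Because only the vectors of the query's own group are scanned, the number of distance computations drops from the whole database to the size of that group, which is the source of the speed-up described in the text.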
4- Simulation Results
4-1- Evaluation criterion
To compare the retrieval performance of the proposed method with the most similar methods, the precision criterion is calculated according to Equation (5).
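As a small illustration, precision is sketched here in its standard form, the number of relevant retrieved images divided by the number of retrieved images, and the average is taken over queries. This is assumed to match Equation (5) and the average precision criterion used in the tables. The function names are illustrative.

```python
def precision(retrieved_ids, relevant_ids):
    """Fraction of retrieved images that are relevant to the query."""
    if not retrieved_ids:
        return 0.0
    relevant = set(relevant_ids)
    hits = sum(1 for image_id in retrieved_ids if image_id in relevant)
    return hits / len(retrieved_ids)


def mean_precision(results):
    """Mean precision over (retrieved_ids, relevant_ids) pairs, one per query."""
    values = [precision(retrieved, relevant) for retrieved, relevant in results]
    return sum(values) / len(values) if values else 0.0


single = precision(["a", "b", "c", "d"], {"a", "c", "x"})
overall = mean_precision([(["a", "b"], {"a"}), (["c", "d"], {"c", "d"})])
```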
4-2- Simulation Results
Methods in  and  are very similar to the proposed method. Therefore, we first compare the retrieval accuracy and retrieval time of the proposed method with these two methods. The comparison results are shown in Table 1 according to the average precision criterion and the online running time of the methods. As can be seen, on our tile and ceramic database the retrieval accuracy of the proposed method is higher and its retrieval time is lower, which is achieved by selecting fewer features and extracting them from only a few sub-images of the query image. In addition, the retrieval accuracy of the proposed method is compared with a number of other methods mentioned in the references section. These results are shown in Table 2 using the average precision criterion. Furthermore, the proposed method is tested on other well-known image databases to show its retrieval performance on different image databases. The results are shown in Table 3.
4-3- Results of the Best Features Selection
Retrieval of all the tile and ceramic images in our database was performed by applying the proposed method with various extracted features: the ACD, HSV, color moment, energy, contrast, and Ranklet features. These features were chosen from research carried out over the 10 years from 2012 to 2022. Evaluations of these retrieval results are reported in Fig. 6 and Fig. 7, using both the time and precision criteria. As can be seen, the ACD, HSV, energy, and contrast are the best choice among the different features. Therefore, in the feature extraction stage of the proposed method, only these four features are extracted; this minimum number of features reduces the computational volume and thus speeds up the retrieval.
Table 1: Retrieval accuracy and time of the proposed method, in comparison with the two most similar methods on our database
[Table 1 body not recovered; surviving column header: Average Time (Sec.)]
Table 2: Retrieval accuracy of the proposed method, in comparison with a number of other methods on our database
Table 3: Retrieval accuracy of the proposed method on other databases
[Table 3 body not recovered; surviving column header: Total number of images]
4-4- Examples of Retrieval Results
A few examples of the query images and their retrieval results by using the proposed method are shown in Figs. 8, 9, and 10.
Fig. 6 Retrieval time of the proposed method on our database based on extracting different features
Fig. 7 Retrieval evaluation of the proposed method on our database based on various features extraction using the average precision criterion
5- Conclusion
To provide a specific retrieval approach for the tile and ceramic database in e-commerce, a CBIR method was proposed. In the proposed method, each database image was divided into nine sub-images, and the images were grouped accordingly. Then, four selected features were extracted from the main sub-images in each group and the feature vectors of the database images were formed. The same process was performed for a query image until its feature vector was formed. Finally, retrieval results were obtained by comparing the query feature vector with the feature vector of each database image based on the Euclidean distance. After implementing the proposed method on our database, the results were first compared to the most similar methods, which showed that the retrieval accuracy and speed were improved by 16.55% and 23.88%, respectively. In addition, these results were compared to a number of recent, less similar methods, and the retrieval accuracy of the proposed method was still 1.5% higher, even with fewer extracted features. Furthermore, the proposed method was run on other well-known image databases. The results showed that on the standard databases, which have high-quality imaging and more images, the accuracy of the proposed method was even higher than on our own database.
The implementation results on feature selection showed that the contrast, energy, HSV, and ACD were the best choice among the different features. The retrieval time of the proposed method was 0.51 sec., which is fast enough for a real-time method. As a result, selecting a minimal but sufficient set of features was very effective in improving the speed and accuracy of retrieval. Also, using only the main sub-images defined for each group instead of the whole input image reduced the computational volume, and therefore reduced storage space and increased speed without compromising accuracy. For future work, an investigation of the number of sub-images used in the division process is recommended.
Fig. 8 The query image and its retrieved images, obtained by applying the proposed method on our database. All the similar images, even with different angles and directions, are retrieved for the query image.
Fig. 9 The query image and its retrieved images, obtained by applying the proposed method on our database. All the similar images are retrieved.