Iplab.dmi.unict.it
A Benchmark Dataset to Study the Representation of Food Images
Giovanni Maria Farinella, Dario Allegra, Filippo Stanco
{gfarinella, allegra, fstanco}@dmi.unict.it, Image Processing Laboratory, University of Catania
Food is an essential component of human life and it is well known that people love food. Nevertheless, an unhealthy diet can cause general health problems. Automatic recognition of food images (e.g., acquired with mobile/wearable cameras) has a key role in building monitoring systems to assess the daily food intake. A food recognition system could replace the traditional dietary assessment based on self-reporting in a food diary, which is often inaccurate. It could be important when a patient (e.g., with obesity, diabetes, or food allergy) has to be assisted during daily meals. Moreover, experts (e.g., nutritionists) could use the food intake monitoring system to study the daily diet of patients to better understand their habits and/or eating disorders.

However, food recognition is a challenging task since food is intrinsically deformable and presents high variability in appearance. The image representation employed in a food recognition engine plays the most important role. To properly study the peculiarities of the image representation in the food application context, a benchmark dataset is needed.

We introduce a food dataset composed of 889 distinct plates of food of different nationalities (e.g., Italian, English, Thai, Indian, Japanese, etc.). Each dish has been acquired multiple times by users (with a smartphone) during real meals and in unconstrained settings (e.g., background, lighting conditions, etc.). The dataset presents both photometric (e.g., flash vs no flash) and geometric variabilities (rotation, scale, point of view changes). The dataset is designed to push research in this application domain with the aim of finding a good way to represent food images.

The first question we try to answer is the following: are we able to perform near duplicate image retrieval (NDIR) in the case of food images?

The UNICT-FD889 dataset
The overall dataset contains 3583 images related to 889 distinct plates of food. The image on the left shows examples of 96 dishes of the proposed dataset. The image on the right shows three instances of each dish for 32 different plates of the UNICT-FD889 dataset.

Representation of food images
To benchmark the proposed dataset for Near Duplicate Image Retrieval (NDIR), we explore three standard state-of-the-art image representations in our tests: Bag of Textons [1], PRICoLBP [2] and SIFT features [3]. We decided to use the Bag of Textons model for its power in representing textures and because it has obtained the best results so far on the PFID dataset [4]. We tested both class-based and global Bag of Textons representations with different vocabulary sizes. The PRICoLBP descriptor has been chosen since it encodes the spatial co-occurrence of local LBP features, which are useful in representing textures. Finally, SIFT features have been considered due to their good performance in the context of near duplicate image retrieval [5].

Experimental settings and results
For testing purposes, images have been resized to 320 x 240 pixels. We have employed the χ² distance to measure the similarity between two images represented as Bags of Textons [1,4] or PRICoLBP [2]. The similarity measure tested with SIFT [3] is based on the number of matches. We have also tested a similarity measure in which the SIFT matches are inversely weighted by their matching distances. All the considered local descriptors are rotationally invariant. SIFT is also scale invariant. The representations have been considered in both grayscale and color domains.

To properly evaluate the different representation methods, the experiments have been repeated three times. At each run, the different approaches are executed on the same training and test sets. To this purpose, at each run we have built a training set composed of 889 images by selecting one image of the UNICT-FD889 dataset per dish, whereas the remaining images have been used for testing. The images selected for the three training sets are different. At each run, test images are used to perform queries on the corresponding training set. Given an image representation, the final results are obtained by averaging over the three runs.

[Figure: retrieval results for the compared representations. Legend entries: 8890 Textons - Gray - Class Based; 8890, 4445, 2220 and 1110 Textons - Gray - Global; 1110 Textons - Color - Global; PriLBP - Gray; PriLBP - Color; SIFT - Gray - Match; SIFT - Gray - Score; SIFT - Color - Match; SIFT - Color - Score.]

The retrieval performance on each run has been evaluated with the probability of successful retrieval P(n) over a number of test queries:

$$P(n) = \frac{Q_n}{Q}$$

where $Q_n$ is the number of successful queries according to the top-n criterion, i.e., the correct near duplicate image is among the first n retrieved images, and $Q$ is the total number of queries. We also consider the precision/recall values at top-n = 1. Note that precision and recall for top-n = 1 are equivalent because there is only one correct match for each query in the training set. Finally, the retrieval results are evaluated through the mean average precision (mAP) measure, i.e., the area under the precision-recall curve.
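
As a minimal sketch of the histogram-based retrieval and its top-n evaluation described in this poster, the Python code below ranks training histograms by χ² distance and computes the probability of successful retrieval, P(n) = Q_n / Q. The 0.5 factor and the small epsilon in the distance, the function names, and the toy three-bin histograms are illustrative assumptions; this is not the authors' implementation, and the poster does not state which form of the χ² distance was used.

```python
import numpy as np


def chi2_distance(h1, h2, eps=1e-10):
    """Chi-square distance between two non-negative histograms.

    Assumption: the symmetric form 0.5 * sum((a - b)^2 / (a + b));
    eps only guards against division by zero for empty bins.
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return 0.5 * float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))


def top_n_probability(test_hists, test_labels, train_hists, train_labels, n):
    """P(n) = Q_n / Q: fraction of queries whose dish label appears
    among the n training images closest in chi-square distance."""
    successes = 0
    for hist, label in zip(test_hists, test_labels):
        dists = [chi2_distance(hist, t) for t in train_hists]
        ranked = np.argsort(dists)[:n]  # indices of the n nearest training images
        if any(train_labels[i] == label for i in ranked):
            successes += 1
    return successes / len(test_hists)


# Toy data (hypothetical): one training histogram per dish, as in the protocol.
train_hists = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
train_labels = ["dish_a", "dish_b", "dish_c"]
test_hists = [[0.6, 0.3, 0.1], [0.15, 0.7, 0.15], [0.4, 0.4, 0.2]]
test_labels = ["dish_a", "dish_b", "dish_c"]

p1 = top_n_probability(test_hists, test_labels, train_hists, train_labels, n=1)
p2 = top_n_probability(test_hists, test_labels, train_hists, train_labels, n=2)
print(f"P(1) = {p1:.3f}, P(2) = {p2:.3f}")
```

In this toy example the third query is closest to the wrong dish but finds its correct match at rank 2, so P(n) grows with n. Because the training set holds exactly one image per dish, P(1) here coincides with precision and recall at top-1.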
[1] Varma M. et al., A Statistical Approach to Texture Classification from Single Images, International Journal of Computer Vision, 62(1-2), 61-81, 2005
[2] Qi X. et al., Pairwise rotation invariant co-occurrence local binary pattern, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014
[3] Lowe D.G., Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, 60(2), 91-110, 2004
[4] Farinella G.M. et al., Classifying food images represented as bag of textons, IEEE International Conference on Image Processing, 2014
[5] Nistér D. et al., Scalable recognition with a vocabulary tree, IEEE Conference on Computer Vision and Pattern Recognition, 2006
Source: http://iplab.dmi.unict.it/UNICT-FD889/Poster-W22-25.pdf