Dietary tracking today requires constantly updating numerous nutritional metrics, such as the calorie and fat content of the food being consumed. This nutrition information is primarily conveyed to consumers through the nutrition labels found on all packaged food products.
However, consumers often find it challenging to make use of the information on these labels, whether because they are unfamiliar with nutritional terms or because they lack the time and motivation. It is therefore valuable to automate the data collection and interpretation process using computer vision techniques. To make dietary tracking more manageable and enjoyable, we present a computer vision solution that extracts data directly from the nutrition label on the food product itself.
We built a nutrition information extraction module that takes images as input, classifies those containing nutrition labels, and performs extraction on them. The module uses computer vision and optical character recognition (OCR) techniques to extract the relevant data from the images.
The module accepts one or more images as input and retains only those that contain a nutrition label.
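One lightweight way to implement such a filter (a hypothetical sketch, not necessarily the module's actual classifier) is to run a quick OCR pass and keep only images whose text contains enough nutrition-label keywords. The keyword set and threshold below are illustrative assumptions:

```python
# Hypothetical keyword-based filter; the keyword set and min_hits threshold
# are illustrative assumptions, not the module's actual classifier.
LABEL_KEYWORDS = {"nutrition facts", "serving size", "calories", "% daily value"}

def is_nutrition_label(ocr_text: str, min_hits: int = 2) -> bool:
    """Return True if the OCR'd text looks like a nutrition label."""
    text = ocr_text.lower()
    return sum(kw in text for kw in LABEL_KEYWORDS) >= min_hits
```

Images that fail this check are simply dropped before the more expensive pre-processing and extraction stages run.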
The pre-processing stage transforms each image into a clean, organized format in which text and numerical values are easily identifiable. Since only the relevant details (calories, cholesterol, etc.) are needed, pre-processing removes noise and irrelevant information from the image. The region of interest, i.e. the nutrition label box, is then identified and segmented.
The OCR module takes the pre-processed images and extracts all the text they contain. We use the open-source Tesseract engine, which is maintained by Google. Tesseract's output contains some inaccuracies as well as extraneous information, such as lines of dashes and other irrelevant special characters.
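The OCR pass and the subsequent cleanup could be sketched as follows; the cleanup regex covering dash rows and stray separator characters is an illustrative assumption:

```python
import re

def ocr_image(image_path: str) -> str:
    """Run Tesseract on a pre-processed image (requires pytesseract + Tesseract)."""
    import pytesseract  # deferred import: needs the Tesseract binary installed
    from PIL import Image
    return clean_ocr_text(pytesseract.image_to_string(Image.open(image_path)))

def clean_ocr_text(text: str) -> str:
    """Drop empty lines and separator rows (dashes, pipes, etc.) from OCR output."""
    kept = []
    for line in text.splitlines():
        line = line.strip()
        if not line or re.fullmatch(r"[-_=~.|*]+", line):
            continue
        kept.append(line)
    return "\n".join(kept)
```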
The Tesseract output may contain errors such as misspellings or misidentified characters, as well as additional characters that are not present on the actual label. This module organizes the nutrient information into key-value pairs and writes them out as an Excel sheet.
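A sketch of the parsing step, assuming a fixed list of nutrient names and a simple value-plus-unit pattern (both illustrative assumptions):

```python
import re

# Nutrient names to look for; the list is illustrative, not exhaustive.
NUTRIENTS = ["calories", "total fat", "saturated fat", "trans fat", "cholesterol",
             "sodium", "total carbohydrate", "dietary fiber", "sugars", "protein"]

def parse_nutrients(text: str) -> dict:
    """Map each recognised nutrient line to its value-plus-unit string."""
    pairs = {}
    for line in text.lower().splitlines():
        for name in NUTRIENTS:
            if line.startswith(name):
                m = re.search(r"\d+(?:\.\d+)?\s*(?:mg|g|kcal)?", line[len(name):])
                if m:
                    pairs[name] = m.group(0).strip()
                break
    return pairs
```

The resulting dictionary can then be written to a spreadsheet, e.g. with pandas via `DataFrame([pairs]).to_excel(...)`.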
Our module also falls back to the Google Cloud Vision OCR when, after text correction, many key-value pairs are still missing from the output. It compares both outputs and returns the better one.
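The fallback decision and output comparison might be sketched as below. The expected-nutrient set and 50% coverage threshold are assumptions, and the Google Cloud Vision call itself is omitted:

```python
# Nutrients we expect a complete parse to recover (illustrative set).
EXPECTED = {"calories", "total fat", "cholesterol", "sodium",
            "total carbohydrate", "protein"}

def needs_fallback(pairs: dict, min_coverage: float = 0.5) -> bool:
    """Trigger the cloud OCR when too few expected nutrients were recovered."""
    return len(EXPECTED & pairs.keys()) / len(EXPECTED) < min_coverage

def pick_best(tesseract_pairs: dict, vision_pairs: dict) -> dict:
    """Return whichever parse recovered more of the expected nutrients."""
    return max(tesseract_pairs, vision_pairs,
               key=lambda p: len(EXPECTED & p.keys()))
```

Keeping the cheaper Tesseract pass as the default and invoking the paid cloud API only on poor parses keeps per-image cost low.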
Traditionally, the data entry process is done manually, which can take a week or more for all images and is prone to human error. Any type of image can be given as input to the nutrition extraction module; it identifies and processes only the nutrition label images. The module can process a single image or run in batch mode; it takes approximately 10 seconds per image and can process 200 images in under half an hour in batch mode.
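Batch mode can be as simple as mapping the per-image pipeline over the input paths; the thread pool and the `process` callable below are illustrative assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

def run_batch(paths, process, workers=4):
    """Apply the per-image pipeline (classify -> pre-process -> OCR -> parse)
    to every path, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, paths))
```

Since the pipeline is I/O- and subprocess-bound (file reads, the Tesseract binary), a thread pool is a reasonable way to overlap work on multiple images.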
The mundane alternative is a handwritten or electronic log of eating habits, with tedious calculations to keep one's progress up to date. Health and fitness applications now provide automated means of tracking nutritional data, but many still require the user to input all of the necessary information, which demands tedious repetition. Our module can be plugged into a mobile or web application to provide a simple and accurate means of tracking diet.