Cooking recipes generator utilizing a deep learning-based language model

Published in PUT Poznań Bachelor Thesis, 2020

Cooking recipes are a very specific type of text that allows people to share culinary ideas by providing an algorithm for their realization. Creating a recipe requires a certain dose of creativity, and some of the best recipes rely on unlikely ingredient combinations. An automatic recipe generator can enable the creation of truly unique dishes with combinations no one has thought of before. With current technology, it would be very time-consuming and difficult to create a generator that can distinguish a good recipe from a bad one in terms of taste, but a model that produces recipes with a viable text format and sensible instructions is a step toward a new generation of machine-conceived dishes. Therefore, this work focuses on creating viable and original recipes that can pass as human-made when presented to a person.

The method chosen for recipe generation is a deep learning model trained on real-life recipes. The first step was to acquire training data for the model. In addition to using existing datasets, more data was gathered by scraping cooking websites to increase the variety of the training data. Next, the gathered data was analyzed to understand how recipes are constructed. The results of this analysis were used to clean the data: texts that were ungrammatical, irrelevant, or lacking crucial features (such as a list of ingredients) were removed. We used this data to train a deep learning language model capable of generating a recipe from a set of input ingredients. We then served the model in the form of a recipe generation website. A crucial part of the work was the evaluation of the generated text, which was done using NLG (Natural Language Generation) metrics as well as a human-based study.
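To make the cleaning step more concrete, the following is a minimal sketch of a filter that drops recipes lacking crucial features. It is not the code used in the thesis: the record layout (`title`, `ingredients`, `instructions`), the function names `is_usable` and `clean`, and the `MIN_INSTRUCTION_WORDS` threshold are all assumptions made for illustration, and the grammaticality and relevance checks mentioned above are not covered.

```python
from typing import Dict, List

# Hypothetical record layout (an assumption, not taken from the thesis):
# each recipe is a dict with "title", "ingredients" (list of str)
# and "instructions" (str).
Recipe = Dict[str, object]

# Assumed threshold for illustration only.
MIN_INSTRUCTION_WORDS = 5


def is_usable(recipe: Recipe) -> bool:
    """Return True if the recipe has at least one non-empty ingredient
    and instructions of at least MIN_INSTRUCTION_WORDS words."""
    ingredients = recipe.get("ingredients") or []
    instructions = str(recipe.get("instructions") or "")
    has_ingredients = any(str(item).strip() for item in ingredients)
    long_enough = len(instructions.split()) >= MIN_INSTRUCTION_WORDS
    return has_ingredients and long_enough


def clean(recipes: List[Recipe]) -> List[Recipe]:
    """Keep only the recipes that pass the is_usable check."""
    return [recipe for recipe in recipes if is_usable(recipe)]


if __name__ == "__main__":
    sample = [
        {
            "title": "Tomato soup",
            "ingredients": ["tomatoes", "onion", "salt"],
            "instructions": "Chop the vegetables, simmer for twenty minutes, then blend.",
        },
        # Dropped: no list of ingredients.
        {"title": "Mystery dish", "ingredients": [], "instructions": "Mix everything and bake until done."},
        # Dropped: instructions are too short to be useful.
        {"title": "Toast", "ingredients": ["bread"], "instructions": "Toast it."},
    ]
    for recipe in clean(sample):
        print(recipe["title"])
```

A rule-based filter like this is cheap to run over a large scraped corpus; checks for ungrammatical or irrelevant text would need additional tooling on top of it.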

Download paper here