Ingredients as Data

[Image: a drawing of groceries in a grocery bag being converted into items in a database]

To create a ‘combined ingredients list’ for a set of recipes, the program must be able to tell which ingredients each ingredients list actually calls for. This post covers the challenges of representing a standard list of ingredients as a meaningful set of data.

Normalization

Because cooks are usually human beings, a standard recipe has little need to follow rigorous rules for how a single ingredient is written.

So before covering the process of normalizing human-composed ingredients, we first need a definition of an ingredient that the computer can understand.

What is an Ingredient?

An ingredient should have all 3 of the following properties without exception.

  1. A Quantity Value

  2. A Unit Of Measure

  3. What thing is being measured

In code that might look like this:

 
{
    quantity: 1.5,
    measurementUnit: 'cups',
    ingredient: 'milk'
}
 


Without all 3 of these items stated explicitly, an accurate shopping list cannot be generated: there is no way to know how much of a specific item to buy, which units to add to the ingredient total, or even which ingredient should be added to the list.
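To make that concrete, here is a minimal sketch of how a combined list might be totalled once every ingredient carries all 3 properties. The function names `isCompleteIngredient` and `combineIngredients` are hypothetical and are not taken from the actual project. The sketch also assumes that matching ingredients already share the same unit, so it does no unit conversion.

```javascript
// Hypothetical helper: true only when all three properties are present.
function isCompleteIngredient(item) {
    return (
        item != null &&
        typeof item.quantity === 'number' &&
        Number.isFinite(item.quantity) &&
        typeof item.measurementUnit === 'string' &&
        item.measurementUnit.length > 0 &&
        typeof item.ingredient === 'string' &&
        item.ingredient.length > 0
    );
}

// Hypothetical helper: sums quantities for entries that share both an
// ingredient name and a unit. No unit conversion is attempted.
function combineIngredients(items) {
    const totals = new Map();
    for (const item of items) {
        if (!isCompleteIngredient(item)) {
            throw new Error('Incomplete ingredient: ' + JSON.stringify(item));
        }
        const key = item.ingredient + '|' + item.measurementUnit;
        const existing = totals.get(key);
        if (existing) {
            existing.quantity += item.quantity;
        } else {
            // Copy the item so the caller's objects are not mutated.
            totals.set(key, { ...item });
        }
    }
    return Array.from(totals.values());
}

const shoppingList = combineIngredients([
    { quantity: 1.5, measurementUnit: 'cups', ingredient: 'milk' },
    { quantity: 0.5, measurementUnit: 'cups', ingredient: 'milk' },
    { quantity: 4, measurementUnit: 'oz', ingredient: 'instant mashed potatoes' }
]);
console.log(shoppingList);
```

An entry that is missing any one of the three properties is rejected rather than guessed at, which is the reason the definition above is so strict.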


Why worry about Normalization?

Humans don’t have too much trouble interpreting context from an ingredient.

Here is an example of a real ingredient item from a real recipe on allrecipes.com.

1 (4 ounce) package instant mashed potatoes (such as Idahoan® Buttery Homestyle®)

A human can readily produce the object we talked about earlier.

 

{
    quantity: 4,
    measurementUnit: 'oz',
    ingredient: 'instant mashed potatoes'
}

 

But how is the program supposed to do that?

There are probably 100 different ways to go about this. I chose an approach that I knew I could implement quickly and that would keep my parser as simple as possible, so that I can actually complete this project.

Symbols

Symbols are a tool used in all programming languages. The machine has no built-in understanding of them; as far as it’s concerned, there is little difference between the character 0 and the $ symbol.

A symbol is just a standardized marker that assigns meaning to the characters before or after it, or between it and another symbol.

There are some great papers on lexers out there, and it’s a super interesting topic if you’re getting into the world of string parsing. But that’s another post for another time.

Simple Symbols

I am creating just four symbols for now to handle my recipes. I’ll copy-paste recipes I like from the internet into my vault and manually update each recipe so that it conforms to my protocol. That’s a bit more work than I’d like, but it only has to be done once per recipe.

Here are the symbols:

  • :q:

  • :m:

  • :mst:

  • >>

:q:

Quantity - the numeric amount of the ingredient that is needed.

:m:

Measurement Unit - the unit of measure associated with the quantity, which will already have been collected by this point.

:mst:

Measurement Unit Subtype - handles ingredients that come in non-standardized packaging, where those packages have different sizes.

>>

Ingredient - finally, a standardized name for the actual material that will become food. This name will be as simple as possible.
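As a rough sketch of how the four symbols could be read back into the ingredient object, consider the following. This is not the actual parser from this project. The line format is an assumption (each symbol is followed by its value, which runs until the next symbol), and the names `parseIngredientLine`, `SYMBOL_TO_FIELD`, and `measurementUnitSubtype` are hypothetical.

```javascript
// Assumed line format (not confirmed anywhere in this post):
//   :q: 1.5 :m: cups >> milk
// Each symbol is followed by its value, which runs until the next symbol.
const SYMBOL_TO_FIELD = {
    ':q:': 'quantity',
    ':m:': 'measurementUnit',
    ':mst:': 'measurementUnitSubtype', // hypothetical field name
    '>>': 'ingredient'
};

function parseIngredientLine(line) {
    // Splitting on a capture group keeps the symbols in the result,
    // so the array alternates: text, symbol, text, symbol, ..., text.
    const parts = line.split(/(:q:|:mst:|:m:|>>)/);
    const result = {};
    // parts[0] is whatever precedes the first symbol and is ignored;
    // symbols sit at the odd indexes, their values right after them.
    for (let i = 1; i < parts.length; i += 2) {
        const field = SYMBOL_TO_FIELD[parts[i]];
        const value = parts[i + 1].trim();
        // Only plain decimal numbers are handled; "1 1/2" becomes NaN.
        result[field] = field === 'quantity' ? Number(value) : value;
    }
    return result;
}

console.log(parseIngredientLine(':q: 1.5 :m: cups >> milk'));
console.log(parseIngredientLine(
    ':q: 1 :m: package :mst: 4 ounce >> instant mashed potatoes'
));
```

The symbols carry no meaning on their own; the lookup table is what assigns each one to a field. How a subtype such as a 4 ounce package should feed into the final quantity and unit is left open here.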

Conclusion

And that’s the basics of how the program works. Once ingredients are represented in a form the program can work with, that data can be moved around and reused by other parts of the larger system as new tasks come up.

I will cover in a future post how I run combinations of recipes, and how I parse them from my vault to begin with.
