Building MongoDB document structures

For the purposes of this book, we will not focus on actual accounting database storage: most small enterprises either outsource their accounting to a separate firm or use a canned software accounting package, which has its own database and storage capabilities. Instead, we will focus on defining the following collections:

  • Products
  • Customers
  • Purchases

In addition, we will include one more collection called common, which holds the defaults for drop-down menus, product categories, units of measurement, social-media codes (for example, FB is the code for Facebook), and so forth.

Defining the products document structure

Following the preceding suggestions, after reviewing the customer requirements and interviewing stakeholders, you might discover that each of the groups defined previously needs the following information about products:

It is the job of your application to determine the website visitor's role. Your application must also determine, based on the role, what level of access to the database should be granted and what information this user can view. 

A decision that you need to make for this example is how to store information about the photo. One obvious approach is to simply store a URL that points to the photo. Another possibility is to store the photo itself directly in the database, using Base64 encoding. Yet a third approach would be to store the actual photo file in MongoDB using GridFS. For our illustration, in this section, we will assume that product photo information is stored as either a URL or a Base64-encoded string. The next chapter, Chapter 5Mission-Critical MongoDB Database Tasks, will discuss how to deliver an image directly from the database. Chapter 10, Working with Complex Documents across Collections, discusses how to use GridFS.

The information that should be stored for each product could be summarized in the following JSON document:

product = 
{
"productPhoto" : "URL or base 64 encoded PNG",
"skuNumber" : "4 letters from title + 4 digit number",
"category" : "drawn from common.categories",
"title" : "Alpha-numeric + spaces",
"description" : "Detailed description of the product",
"price" : "floating point number",
"unit" : "drawn from common.unit",
"costPerUnit" : "floating point number",
"unitsOnHand" : "integer"
}
For the actual technical specification for Base64 encoding, see RFC 4648 ( https://tools.ietf.org/html/rfc4648).
Primary key considerations

For those of you used to working with RDBMS, you will notice that a critical field is missing: the famous primary key. In MongoDB, when a document is inserted into the database, a unique key is generated and given the reserved field name _id. Here is an example drawn from the products collection:

"_id" : ObjectId("5c5f8e011a2656b4af405319")

It is an instance of BSON data type ObjectId, and is generated as follows:

  • 4-byte UNIX timestamp
  • 5-byte random value
  • 3-byte counter (starting with a random value)

This means that, in MongoDB, there is no need to define a unique key; however, it is often convenient to store a unique key of your own creation that is somehow tied to the data entered into the collection. For our purposes, we will create a unique field, productKey, which is a condensed version of the product title.

It's interesting to note that because the ObjectId contains a UNIX timestamp, it's possible to use it to determine the date and time a document was created. You can use the ObjectId.getTimestamp() method for this purpose ( https://docs.mongodb.com/manual/reference/method/ObjectId.getTimestamp/#objectid-gettimestamp).

The following is an example of the first product entered into the collection, along with our completed document structure:

db.products.findOne();
{
"_id" : ObjectId("5c7ded398c3a9b9b9577b7e9"),
"productKey" : "apple_pie",
"productPhoto" : "iVBORw0KGgoAAAANSUhEUgAAASwAAADCCAYAAADzRP8zA ... ",
"skuNumber" : "APPL501",
"category" : "pie",
"title" : "Apple Pie",
"description" : "Donec sed dui hendrerit iaculis vestibulum ... ",
"price" : "4.99",
"unit" : "tin",
"costPerUnit" : "4.491",
"unitsOnHand" : 693
}
Please note that we truncated the contents of productPhoto and description to conserve book space.
The reference for  ObjectId is  https://docs.mongodb.com/manual/reference/bson-types/#objectid.
Defining the purchases document structure

Following the preceding suggestions, after reviewing the customer requirements and interviewing stakeholders, you might discover that each of the groups defined previously needs the following information about purchases:

 

You will note from the preceding table that documents in the purchases collection need three distinct pieces of information, which are summarized in the next table:

 

As always, there are decisions that you must make regarding how much customer and product information you will store. In legacy RDBMS circles, you run the risk of violating the cardinal rule: no duplication of data. Accordingly, assuming that you are accustomed to the RDBMS way of thinking, the tendency would be to establish a relationship between the products, purchases, and customers collections (that is, a SQL JOIN). Alternatively, you might decide to embed the associated customer and products directly into each purchase document.

In Chapter 7, Advanced MongoDB Database Design, we will cover how to embed documents. For the purposes of this chapter, however, in order to keep things simple (for now), we will simply copy the appropriate pieces of customer and product data directly into the purchase document without the benefit of embedded objects. 

Furthermore, in order to avoid having to create a join table, joining multiple products to a single purchase, we will simply define an embedded array of products. In this part of the book, we will use very simple operators to access embedded array information. In Chapter 8, Using Documents with Embedded Lists and Objects, we will go into further detail on how to actually manipulate embedded array information.

Another decision that needs to be made is whether or not to define a unique identifying field. As we mentioned in the previous section, MongoDB already creates a unique identifier, the mysterious _id field, which is an ObjectId instance. For our purposes, we add a new identifying field, transactionId, which is useful for accounting purposes. Although we can extract information on the date and time from the _id field, for the purposes of this illustration, we will use the date to form our new identifier, followed by two letters representing customer initials, followed by a random four-digit number.

Finally, we need to decide whether or not to include the extended price in each purchase document. An argument in favor of this approach is that by precalculating this information and storing it, we experience faster speed when it's time to generate sales reports. Also, by storing this value at the time the purchase document is created, we avoid possible errors that might come from overly complicated reporting programming code. For the purposes of this example, we have decided to precalculate the extended price and store it in each purchase document.

Therefore, the final structure for purchase would appear as shown here from a find() query:

purchase = {
"_id" : ObjectId("5c8098d90f583b515e4d3f41"),
"transactionId" : "20181008VC5473",
"dateOfPurchase" : "2018-10-08 11:03:37",
"extendedPrice" : 303.36,
"customerKey" : "VASHCARS8772",
"firstName" : "Vashti",
"lastName" : "Carson",
"phoneNumber" : "+1-148-236-8772",
"email" : "vcarson148@Swisscom.com",
"streetAddressOfBuilding" : "5161 Green Mound Ride",
"city" : "West Haldimand (Port Dover)",
"stateProvince" : "ON",
"locality" : "Ontario",
"country" : "CA",
"postalCode" : "N0A",
"latitude" : "42.9403",
"longitude" : "-79.945",
"productsPurchased" : [
{
"productKey" : "glow_in_the_dark_donut",
"qtyPurchased" : 384,
"skuNumber" : "GLOW437",
"category" : "donut",
"title" : "Glow In The Dark Donut",
"price" : "0.79"
}
]
}
Coordinate pairs such as latitude and longitude lend themselves to geospatial queries. Accordingly, it might be convenient to represent them as MongoDB GeoJSON objects. This subject is covered in more detail in Chapter 9, Handling Complex Queries in MongoDB.
Defining the common document structure

When creating an application based upon a traditional RDBMS, you would normally define a plethora of tiny tables that are consulted when presenting the website user with HTML select options, radio buttons, or checkboxes. The only other alternative would be to hardcode such defaults and options into configuration files.

When using MongoDB, you can take advantage of two radical differences from a legacy RDBMS:

  • Each document in a collection does not have to have the same schema.
  • Documents can contain arrays.

The following table summarizes the options needs for Sweets Complete:

As you can see, for these two items, a dictionary is defined with key–value pairs. For other items, summarized in the following table, there are lists for which there is no key:

 

To add flexibility, we can simply add each of the aforementioned items as a separate document, consisting of a single list. Later, when we need to display options or to confirm form submissions against the options, we perform a lookup based on the item. So, finally, here is how the JavaScript to populate the common collection might appear:

db.common.insertMany( [
{ "key" : "gender",
"data" : { "M" : "male", "F" : "female", "X" : "Other" }
},
{ "key" : "socialMedia",
"data" : {"FB":"facebook", "GO":"google", "LI":"linkedin",
"LN":"line", "SK":"skype", "TW":"twitter"}
},
{ "key" : "categories",
"data" : ["cake","chocolate","pie","cookie","donut"]
},
{ "key" : "unit",
"data" : ["box","tin","piece","item"]
}]);

We will now turn our attention to developing application code that takes advantage of these data structures.