jsoendermann/MongoStyleGuide

MongoDB Style Guide

Table of Contents

  1. General
  2. Enumerations
  3. Booleans
  4. Dates
  5. Null and undefined
  6. Other types
  7. Names
  8. Object modelling

General

  • 1.1 Avoid surprises: Your goal when creating a new schema should be to surprise readers as little as possible. The rules in this style guide are written with this in mind. If there is a trade-off between ease of creating data and ease of understanding its structure, choose the latter.

    Why? Having others be able to understand your data is usually more important than making your code slightly more convenient.

  • 1.2 Priorities: Several dimensions will impact your database design, such as disk usage, read/write speed and ease of development. Spend some time thinking about which ones are important in your case. Then come to the conclusion that ease of development is the only thing that matters because you're not Google.

    Why? It's easy to fall into the trap of wasting time making things scale from the start. That wasted time makes it less likely that you will ever get to a point where scaling problems become real.

Enumerations

An enumeration is a type that allows a limited number of values, e.g. a role field that can contain the values 'DOCTOR', 'NURSE', 'PATIENT' and 'ADMINISTRATOR'.

  • 2.1 Modelling: Always model your enums as uppercase string constants, e.g. 'WAITING', 'IN_PROGRESS' and 'COMPLETED'.

    Why?

    1. Favoring strings over booleans or numbers means your data is self-documenting. If your gender field contains the value 1, you don't know if that means male, female or something else.
    2. Using uppercase strings makes it easy to see at a glance whether something is an enum or a user-facing value.

    | ❌ gender | ❌ gender | ✅ gender |
    |-----------|-----------|-----------|
    | 0         | 'Male'    | 'MALE'    |
    | 1         | 'Female'  | 'FEMALE'  |

  • 2.2 Explicit states: Don't use null, undefined, or any value except uppercase string constants in your enums. This includes initial, undecided or unknown states.

    Why? It makes meaning explicit. If you have an assessmentState field that's set to null, you don't know whether that means 'no assessment necessary', 'not applicable', 'patient didn't show up', 'undecided' or any other possible state.

    | ❌ assessmentState | ✅ assessmentState |
    |--------------------|--------------------|
    | null               | 'NOT_REQUIRED'     |
    | missing            | 'NOT_APPLICABLE'   |
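
Rules 2.1 and 2.2 can be checked in application code before a document is written. The following is a minimal sketch in plain JavaScript, not a prescribed implementation: the names `ASSESSMENT_STATES` and `assertAssessmentState` are hypothetical, and the list of states simply combines the example constants mentioned in this section.

```javascript
// Hypothetical enum for the assessmentState field. Every state, including
// "nothing to do" states, gets an explicit uppercase string constant.
const ASSESSMENT_STATES = Object.freeze([
  'NOT_REQUIRED',
  'NOT_APPLICABLE',
  'WAITING',
  'IN_PROGRESS',
  'COMPLETED',
]);

// Throws unless value is one of the allowed constants.
// null, undefined, numbers and lowercase strings are all rejected.
function assertAssessmentState(value) {
  if (!ASSESSMENT_STATES.includes(value)) {
    throw new TypeError(`Invalid assessmentState: ${String(value)}`);
  }
  return value;
}

const appointment = {
  assessmentState: assertAssessmentState('NOT_REQUIRED'),
};
```

Calling `assertAssessmentState(null)` throws, which forces the writer to pick one of the named states instead of leaving the meaning implicit.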

Booleans

  • 3.1 Booleans and enums: Don't model things as booleans that have no natural correspondence to true/false, even if they only have two possible states. Use enums instead.

    Why?

    1. Booleans cannot be extended beyond two possible values, e.g. a boolean field 'hasArrived' could not be changed later to include the possibility of cancelled appointments.

    | ❌ gender | ✅ gender |
    |-----------|-----------|
    | false     | 'MALE'    |
    | true      | 'FEMALE'  |

  • 3.2 Prefix: Prefix your boolean field names with verbs such as 'is...' or 'has...' (e.g. 'isDoctor', 'didAnswerPhoneCall' or 'hasDiabetes').

  • 3.3 Orthogonality: If you have several mutually exclusive boolean fields in your collection, merge them into an enum.

    Why? So that it's impossible to save invalid data in your db, like a car that's green and red simultaneously.

    | ❌ isRed | isBlue | isGreen |
    |----------|--------|---------|
    | false    | true   | false   |
    | true     | false  | false   |

    | ✅ color |
    |----------|
    | 'BLUE'   |
    | 'RED'    |
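
When existing data already uses mutually exclusive flags, rule 3.3 implies a one-off conversion. The sketch below shows one possible way to do that in plain JavaScript; `colorFromFlags` is a hypothetical helper, and the decision to throw on invalid combinations is an assumption rather than something this guide prescribes.

```javascript
// Hypothetical migration helper: collapses mutually exclusive boolean
// flags into one enum value and refuses documents that set zero or
// several flags at once.
function colorFromFlags({ isRed, isBlue, isGreen }) {
  const setFlags = [
    isRed && 'RED',
    isBlue && 'BLUE',
    isGreen && 'GREEN',
  ].filter(Boolean);

  if (setFlags.length !== 1) {
    throw new Error(`Expected exactly one color flag, got ${setFlags.length}`);
  }
  return setFlags[0];
}

const migrated = [
  { isRed: false, isBlue: true, isGreen: false },
  { isRed: true, isBlue: false, isGreen: false },
].map((car) => ({ color: colorFromFlags(car) }));
```

After the conversion, a car that is green and red at the same time can no longer be represented at all.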

Dates

  • 4.1 ISO strings: Never save dates as ISO strings like '2017-04-14T06:41:21.616Z'; store them as Date objects instead. Strings can easily slip in after a JSON round trip, because JSON.parse returns dates as plain strings.

    | ❌                         | ✅                                  |
    |----------------------------|-------------------------------------|
    | '2017-04-14T06:41:21.616Z' | ISODate('2017-04-14T06:41:21.616Z') |
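
One way to guard against rule 4.1 being violated is to revive dates while parsing JSON. This is a sketch under stated assumptions: `reviveDates` and `ISO_TIMESTAMP` are hypothetical names, and the pattern converts every string that looks like a full UTC timestamp, which may be too eager if a text field can legitimately hold such a string.

```javascript
// Matches the format Date.prototype.toISOString() produces for
// four-digit years, e.g. '2017-04-14T06:41:21.616Z'.
const ISO_TIMESTAMP = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z$/;

// Reviver for JSON.parse: turns ISO timestamp strings back into Dates
// and leaves every other value untouched.
function reviveDates(key, value) {
  return typeof value === 'string' && ISO_TIMESTAMP.test(value)
    ? new Date(value)
    : value;
}

const json = JSON.stringify({
  createdAt: new Date('2017-04-14T06:41:21.616Z'),
});

// Without a reviver the date comes back as a plain string.
const naive = JSON.parse(json);

// With the reviver it is a Date again and safe to save.
const revived = JSON.parse(json, reviveDates);
```

Day strings of the form 'YYYY-MM-DD' do not match the pattern, so they stay strings.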

  • 4.2 Day strings: Don't use the Date type when all you are concerned with is the day component. Instead, use strings of the form 'YYYY-MM-DD'.

    Why? It makes common operations much easier, like checking whether a date falls on a certain day or getting all the documents that fall on a certain day.

    | ❌ dateOfBirth                      | ✅ dateOfBirth |
    |-------------------------------------|----------------|
    | ISODate('1989-10-03T06:41:21.616Z') | '1989-10-03'   |
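
The claim in rule 4.2 that day strings make common operations easier can be illustrated with plain string comparison. In this sketch, `toDayString` and the sample `patients` array are hypothetical, and the helper assumes the day should be taken in UTC; which time zone defines "the day" is a decision the application has to make.

```javascript
// Formats a Date as a 'YYYY-MM-DD' day string, using the UTC calendar day.
function toDayString(date) {
  return date.toISOString().slice(0, 10);
}

const patients = [
  { name: 'A', dateOfBirth: '1989-10-03' },
  { name: 'B', dateOfBirth: '1989-10-04' },
  { name: 'C', dateOfBirth: '1990-01-15' },
];

// Same day: plain string equality, no time-of-day arithmetic.
const bornOnDay = patients.filter((p) => p.dateOfBirth === '1989-10-03');

// Date range: zero-padded day strings sort lexicographically in
// chronological order, so ordinary string comparison works.
const bornIn1989 = patients.filter(
  (p) => p.dateOfBirth >= '1989-01-01' && p.dateOfBirth <= '1989-12-31',
);
```

With Date values, the same "born on this day" check would need a start-of-day and end-of-day range.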

Null and undefined

  • 5.1 No overloading: Don't overload the meaning of null and undefined to mean anything other than 'unset'.

    Why? It breaks expectations and makes it impossible to interpret the data by looking at it.

  • 5.2 Default: Don't use null or undefined when there is a sensible default value like 0, '' or [].

    Why? It keeps every value in the column the same type, so consumers don't need a null check before using it.

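    | ❌ notes                       | ✅ notes                       |
    |--------------------------------|--------------------------------|
    | 'Some interesting observation' | 'Some interesting observation' |
    | null                           | ''                             |

    | ❌ comments                 | ✅ comments                 |
    |-----------------------------|-----------------------------|
    | [ 'First!', 'Great post!' ] | [ 'First!', 'Great post!' ] |
    | null                        | []                          |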
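
Rule 5.2 can be applied at the point where documents are created. The following is a small sketch, assuming a hypothetical post document with `notes` and `comments` fields; `withDefaults` and the `title` field are illustrative names only.

```javascript
// Hypothetical normalizer for a post document. `??` only replaces null
// and undefined, so an existing '' or [] is left untouched.
function withDefaults(post) {
  return {
    ...post,
    notes: post.notes ?? '',
    comments: post.comments ?? [],
  };
}

const normalized = withDefaults({ title: 'Hello', notes: null });
// normalized.notes is '' and normalized.comments is []
```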

  • 5.3 Primitive types: In columns that contain primitive types (booleans, numbers or strings) or Date objects, use null to express absent values.

    | ❌      | ✅   |
    |---------|------|
    | 1       | 1    |
    | 2       | 2    |
    | missing | null |

  • 5.4 Complex types: To express the absence of a value in columns that contain objects or arrays, add another column that determines whether your array or object is present. If it isn't, it should be undefined.

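    | ❌ productType | chapterTitles                  |
    |----------------|--------------------------------|
    | 'BOOK'         | ['1 Get Started', '2 The End'] |
    | 'CAR'          | []                             |
    | 'CAR'          | null                           |

    | ✅ productType | chapterTitles                  |
    |----------------|--------------------------------|
    | 'BOOK'         | ['1 Get Started', '2 The End'] |
    | 'CAR'          | missing                        |
    | 'CAR'          | missing                        |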

  • 5.5 Don't mix the two: Don't mix null and undefined in the same column.

    Why? It makes it hard to understand what the two values are supposed to represent.

    | ❌ height | ✅ height |
    |-----------|-----------|
    | 178       | 178       |
    | null      | null      |
    | missing   | null      |

  • 5.6 Multiple missing value states: If you need two ways to represent the absence of a value, convert the value to an object with state and value keys.

    | ❌ weight |
    |-----------|
    | null      |
    | missing   |
    | 97        |
    | ''        |

    | ✅ weight                        |
    |----------------------------------|
    | { state: 'NOT_APPLICABLE' }      |
    | { state: 'USER_DOES_NOT_TELL' }  |
    | { state: 'SET', value: 97 }      |
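
A consumer of the state/value shape from rule 5.6 can branch on `state` without guessing what an empty value means. This is a sketch only: `describeWeight` and the display labels are hypothetical, while the three states are the ones from the example above.

```javascript
// Renders a weight stored as { state, value }. Every way of being
// absent is a named state, so no meaning is hidden in null or ''.
function describeWeight(weight) {
  switch (weight.state) {
    case 'SET':
      return String(weight.value);
    case 'NOT_APPLICABLE':
      return 'not applicable';
    case 'USER_DOES_NOT_TELL':
      return 'not disclosed';
    default:
      throw new Error(`Unknown weight state: ${weight.state}`);
  }
}

const labels = [
  { state: 'NOT_APPLICABLE' },
  { state: 'USER_DOES_NOT_TELL' },
  { state: 'SET', value: 97 },
].map((weight) => describeWeight(weight));
```

Because `state` is itself an enum, adding a third kind of absence later only requires a new constant and a new case.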

Other types

  • 6.1 Mixed types: Don't mix values of different types in one column. Restrict yourself to one type per column.

    Why? So you don't have to typecheck or cast when you consume the data.

    | ❌           | ✅ |
    |--------------|----|
    | 1            | 1  |
    | '2'          | 2  |
    | { value: 3 } | 3  |

  • 6.2 Object columns: If you have a column or array that contains objects, make sure all objects share the same schema.

    | ❌                    | ✅              |
    |-----------------------|-----------------|
    | { value: 42 }         | { value: 42 }   |
    | { otherValue: 'foo' } | { value: 43 }   |
    | { }                   | { value: null } |

  • 6.3 Falsiness: Don't use '' or 0 for their falsiness.

    Why? It makes it very easy to write buggy code.

    | ❌ height | ✅ height |
    |-----------|-----------|
    | 182       | 182       |
    | 167       | 167       |
    | 0         | null      |
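
The kind of bug rule 6.3 warns about shows up as soon as values are aggregated. The sketch below uses hypothetical sample data to contrast an explicit null with a 0 used as "unknown"; the variable names are illustrative.

```javascript
const patients = [{ height: 182 }, { height: 167 }, { height: null }];

// Explicit null check: only real measurements enter the calculation.
const knownHeights = patients
  .map((p) => p.height)
  .filter((h) => h !== null);

const averageHeight =
  knownHeights.reduce((sum, h) => sum + h, 0) / knownHeights.length;
// averageHeight is 174.5

// If "unknown" had been stored as 0, it would be indistinguishable from
// a real value and would silently drag the average down.
const withZero = [182, 167, 0];
const skewedAverage =
  withZero.reduce((sum, h) => sum + h, 0) / withZero.length;
```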

  • 6.4 Numbers as strings: Don't save numbers as strings except when saving data like phone numbers or ID numbers for which leading zeroes are significant and for which arithmetic operations don't make sense.

    Why? It means you don't have to cast when performing arithmetic or when comparing values.

    | ❌ height | ✅ height |
    |-----------|-----------|
    | '178'     | 178       |
    | '182'     | 182       |

    | ❌ phoneNumber | ✅ phoneNumber |
    |----------------|----------------|
    | 20123123       | '020123123'    |
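
Both halves of rule 6.4 can be seen in a few lines of plain JavaScript. The sample values below are hypothetical apart from the phone number taken from the example above.

```javascript
const heightsAsStrings = ['178', '92', '182'];
const heightsAsNumbers = [178, 92, 182];

// Strings sort character by character, so '92' lands after '182'.
const sortedStrings = [...heightsAsStrings].sort();
// sortedStrings is ['178', '182', '92']

// Numbers sort by magnitude when given a numeric comparator.
const sortedNumbers = [...heightsAsNumbers].sort((a, b) => a - b);
// sortedNumbers is [92, 178, 182]

// The opposite case: converting a phone number to a number drops the
// significant leading zero.
const roundTripped = String(Number('020123123'));
// roundTripped is '20123123'
```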

  • 6.5 Units: If you really need to include units with your numbers (think about whether you do; could you convert and save them in metric instead?), save them as objects with a unit and a value field.

    Why? It makes it possible to perform comparisons and arithmetic operations on your values.

    | ❌          | ✅                            |
    |-------------|-------------------------------|
    | '10 kg'     | { unit: 'KG', value: 10 }     |
    | '20 stones' | { unit: 'STONES', value: 20 } |
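
With the unit stored separately, as rule 6.5 suggests, values in different units can be compared after converting to a common one. This is an illustrative sketch: `KG_PER_UNIT` and `toKilograms` are hypothetical names, and the table only covers the two units from the example above.

```javascript
// Conversion factors to kilograms. One stone is 14 pounds,
// which is 6.35029318 kg.
const KG_PER_UNIT = new Map([
  ['KG', 1],
  ['STONES', 6.35029318],
]);

// Converts a { unit, value } object to a plain number of kilograms.
function toKilograms({ unit, value }) {
  const factor = KG_PER_UNIT.get(unit);
  if (factor === undefined) {
    throw new Error(`Unknown unit: ${unit}`);
  }
  return value * factor;
}

const weights = [
  { unit: 'STONES', value: 20 },
  { unit: 'KG', value: 10 },
];

// Sort lightest first by comparing in a common unit.
const sorted = [...weights].sort(
  (a, b) => toKilograms(a) - toKilograms(b),
);
```

The same comparison on strings like '10 kg' and '20 stones' would first require parsing each string.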

  • 6.6 Sets: Model sets as arrays containing uppercase string constants. Use JavaScript's Set class in your code where appropriate.

    Why? You can look at sets as multi-valued enumerations.

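    | ❌                                           | ✅                           |
    |----------------------------------------------|------------------------------|
    | ['Dolphin', 'Pigeon', 'Bee']                 | ['DOLPHIN', 'PIGEON', 'BEE'] |
    | { DOLPHIN: true, PIGEON: false, BEE: false } | ['DOLPHIN']                  |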
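
Rule 6.6 stores the set as an array and works with a `Set` in code. A minimal sketch of that round trip follows; sorting before saving is an assumption made here to keep the stored order stable, not something the guide requires.

```javascript
// Array as it would be stored in the document.
const stored = ['DOLPHIN', 'PIGEON', 'BEE'];

// Work with a Set in code: membership checks are direct and
// duplicates cannot occur.
const animals = new Set(stored);
const hasBee = animals.has('BEE');

animals.add('DOLPHIN');   // already present, the set is unchanged
animals.delete('PIGEON');

// Convert back to a sorted array before saving.
const toStore = [...animals].sort();
// toStore is ['BEE', 'DOLPHIN']
```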

  • 6.7 ObjectIds: Don't use MongoDB's ObjectId type. Instead, set _id fields yourself, either by using a property of your data that is naturally unique or by creating random strings.

    Why? Serializing and deserializing ObjectIds is just too much of a hassle and easily leads to bugs.

    | ❌ _id                               | ✅ _id            |
    |--------------------------------------|-------------------|
    | ObjectId("5937a6a76cb02c00018577fe") | '6xAySKn98aZ66vN' |
    | ObjectId("5937a7136cb02c00018577ff") | 'eOiga4lkLaW99ER' |
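
Rule 6.7 leaves open how the random strings are created. One possible approach, assuming a Node.js environment, is to use the built-in `crypto` module; `newId` is a hypothetical helper, and the hex alphabet and 12-byte length are arbitrary choices that differ from the sample ids shown above.

```javascript
const { randomBytes } = require('crypto');

// Hypothetical id generator: 12 random bytes rendered as a
// 24-character lowercase hex string.
function newId(byteLength = 12) {
  return randomBytes(byteLength).toString('hex');
}

// The id is an ordinary string, so it survives JSON serialization
// unchanged.
const patient = { _id: newId(), name: 'Example' };
const roundTripped = JSON.parse(JSON.stringify(patient));
```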

Names

  • 7.1 Abbreviations: Don't use abbreviations except for domain specific language. When you do abbreviate, capitalize properly.

    Why? Unfamiliar abbreviations make your data and code hard to read.

    • ❌ apTime
    • ✅ appointmentTime
    • ❌ ankleBrachialPressureIndexRight
    • ✅ ABIRight
    • ❌ healthInformationSystemNumber
    • ❌ hisNumber
    • ✅ HISNumber

  • 7.2 Case: Use camelCase over snake_case for key names.

    Why? It's what we use in JavaScript which means we don't have to convert or mix the two in our code.
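
When data arrives from a source that uses snake_case, rule 7.2 suggests converting keys once at the boundary instead of mixing styles. The helpers below are a hypothetical sketch; they only convert top-level keys and deliberately leave a leading underscore (as in `_id`) alone.

```javascript
// Converts one snake_case key to camelCase. The lookbehind ensures that
// only underscores between two characters are treated as separators,
// so '_id' is left unchanged.
function toCamelCase(key) {
  return key.replace(/(?<=[a-z0-9])_([a-z0-9])/g, (match, char) =>
    char.toUpperCase(),
  );
}

// Returns a shallow copy of doc with camelCase keys.
function camelizeKeys(doc) {
  return Object.fromEntries(
    Object.entries(doc).map(([key, value]) => [toCamelCase(key), value]),
  );
}

const converted = camelizeKeys({
  _id: 'eOiga4lkLaW99ER',
  date_of_birth: '1989-10-03',
});
// converted has the keys '_id' and 'dateOfBirth'
```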

  • 7.3 Collection names: Collection names should be pluralized and in camelCase. Use dots when there is a relationship between collections (e.g. users and users.appointments).

Object modelling

  • 8.1 Growth: Don't let your objects keep growing. Prune and merge properties into nested objects where appropriate (e.g. by combining 'footAssessmentState', 'eyeAssessmentState' and 'nutritionAssessmentState' into a nested 'assessmentStates' object with properties 'foot', 'eye' and 'nutrition').
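
The merge described in rule 8.1 can be sketched as a small transformation. `nestAssessmentStates` is a hypothetical helper, and the sample state values are illustrative.

```javascript
// Moves the three flat assessment fields into one nested object and
// keeps every other property as it is.
function nestAssessmentStates(doc) {
  const {
    footAssessmentState,
    eyeAssessmentState,
    nutritionAssessmentState,
    ...rest
  } = doc;

  return {
    ...rest,
    assessmentStates: {
      foot: footAssessmentState,
      eye: eyeAssessmentState,
      nutrition: nutritionAssessmentState,
    },
  };
}

const nested = nestAssessmentStates({
  name: 'Example',
  footAssessmentState: 'COMPLETED',
  eyeAssessmentState: 'WAITING',
  nutritionAssessmentState: 'NOT_REQUIRED',
});
```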

  • 8.2 Nesting: Don't excessively nest objects. Consider breaking up your data if you find yourself needing deeply nested objects.

Todo (pull requests welcome)

  • Normalization/denormalization

About

📗 An opinionated guide to data modeling with MongoDB.
