May 9, 2017

Entities and Security: identity matters

This article is an excerpt of topics discussed in the book Secure by Design that I'm currently writing together with Dan Bergh-Johnsson and Daniel Deogun.


Entities

Each part of your domain model has certain characteristics and a certain meaning. Entities are one type of model object with distinct properties. What makes an entity special is that:
  • It has an identity that defines it and makes it distinguishable from others.
  • It has an identity which is consistent during its lifecycle.
  • It can contain other objects, such as other entities or value objects.
  • It’s responsible for the coordination of operations on the objects it owns.
What this means is that if we need to know whether two entities are the same, we look at their identities instead of their attributes. It’s the identity of the entity that defines it, regardless of its attributes, and the identity is consistent over time.
During the lifecycle of an entity it may transform and take on many different attributes and behaviors, but its identity always remains the same. Let’s consider a car, for example. Many attributes of a car can change during its existence. It can change owners, have parts replaced, or be repainted. But it’s still the same car. In this case the identity of the car can be defined by its vehicle identification number (VIN), which is a unique 17-character identifier given to every car when it’s manufactured.
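The car example could be sketched in Java roughly as follows. This is a minimal illustration rather than code from the book: the class name Car, its fields, and the method sameIdentityAs are hypothetical names chosen here, and the only rule taken from the text is that a VIN is 17 characters long.

```java
import java.util.Objects;

// Hypothetical Car entity: the VIN is the identity, everything else may change.
final class Car {
    private final String vin;   // identity, fixed for the car's whole lifecycle
    private String owner;       // attribute, may change
    private String color;       // attribute, may change

    Car(String vin, String owner, String color) {
        // Only the length rule mentioned in the text is checked here.
        if (vin == null || vin.length() != 17) {
            throw new IllegalArgumentException("A VIN must be 17 characters long");
        }
        this.vin = vin;
        this.owner = Objects.requireNonNull(owner);
        this.color = Objects.requireNonNull(color);
    }

    String vin() { return vin; }
    String owner() { return owner; }
    String color() { return color; }

    // Attributes change through the entity; the VIN has no setter at all.
    void changeOwner(String newOwner) { this.owner = Objects.requireNonNull(newOwner); }
    void repaint(String newColor) { this.color = Objects.requireNonNull(newColor); }

    // Two Car objects denote the same car when their VINs match.
    boolean sameIdentityAs(Car other) {
        return other != null && vin.equals(other.vin);
    }
}
```

Because the vin field is final and has no setter, a change of owner or color can never alter which car the object represents.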

Sometimes an entity’s identity is unique within the system, but sometimes its uniqueness is constrained to a certain scope. In certain cases the identity of an entity can even be unique and relevant outside of the current system. The identity is also what other parts of the model use to reference an entity.

Another important trait of an entity is that it’s responsible for the behavior and coordination of the objects it owns, not only to provide cohesion, but also to maintain its internal invariants.

The ability to identify information in a precise manner, and to coordinate and control behavior, is crucial if you want to avoid security bugs sneaking into your code.

The Continuity of Identity


The attributes of the customer change, but the identity remains the same.

Sometimes a domain object is defined by its attributes. But sometimes those attributes change over time without implying a change of identity of the domain object. For example, a representation of a customer can be described by the attributes name, age, and address. Most of these attributes can change during the time the customer exists in the system, but it’s still the same customer with the same trail of history in the system, and its identity shouldn’t change. It would quickly become messy if the system were to create a new customer every time an address got updated. The customer in this case isn’t defined by its attributes, but rather by its identity, and should therefore be modeled as an entity. This way the customer’s identity stays consistent for as long as the customer exists in the system, regardless of how many state changes it goes through during that existence.

Choosing the right way to define an entity’s identity is essential and should be done carefully. The result of that definition typically takes the form of an identifier. This means that the identity, and uniqueness, of an entity is determined by its identifier. Sometimes the identifier can be a generated unique ID, and sometimes it can be the result of applying some function to a selected set of attributes of the entity. In the latter case, you need to pay careful attention to avoid including any attributes that may change over time. This can be hard because attributes that seem fixed today may change as the system evolves. As a rule of thumb, favor generated IDs over an identity based on attributes.

It’s also important to note that what we mean by identity in DDD isn’t the same concept of identity, or equality, which is built into many programming languages. In Java, for example, object equality is, by default, the same as instance equality. Unless we explicitly define our own method for equality, two object instances representing the same customer won’t be equal. The identity isn’t dependent on a specific representation of the entity. Regardless of whether the customer is represented as an object instance, a JSON document, or binary data, it’s still the same entity.
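To make the difference between instance equality and entity identity concrete, here is a small Java sketch. It assumes a generated ID based on java.util.UUID, which is only one of several possible choices, and the class and method names (Customer, create, moveTo) are hypothetical. The equals and hashCode methods look only at the identifier, never at the attributes.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical Customer entity with a generated identifier.
final class Customer {
    private final UUID id;      // generated once, never changes
    private String name;
    private String address;

    // Used when a new customer enters the system: a fresh ID is generated.
    static Customer create(String name, String address) {
        return new Customer(UUID.randomUUID(), name, address);
    }

    // Used when rebuilding an existing customer, for example from storage.
    Customer(UUID id, String name, String address) {
        this.id = Objects.requireNonNull(id);
        this.name = Objects.requireNonNull(name);
        this.address = Objects.requireNonNull(address);
    }

    UUID id() { return id; }
    String name() { return name; }
    String address() { return address; }

    void moveTo(String newAddress) { this.address = Objects.requireNonNull(newAddress); }

    // Equality is based on identity only, not on name or address.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        return id.equals(((Customer) o).id);
    }

    @Override
    public int hashCode() { return id.hashCode(); }
}
```

With this in place, two instances that carry the same ID compare as equal even if one of them holds an outdated address, while the == operator still reports them as different object instances.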

Local, Global, or External Uniqueness

The identity of an entity is important, but the scope in which its identity is unique can vary. Consider, for example, our customer entity. A system could use an identifier which is unique not only within the current system, but also outside of it. This is an example of an externally unique identifier. One such identifier is the national identification number that many countries use to identify their citizens. In the United States, this would be the Social Security number. Using an externally defined identifier comes with certain drawbacks, however, and security implications are among them.

Some entities need to be globally unique.

Perhaps more common than externally unique identifiers are identities made to be unique within the scope of the system or within the boundaries of the current model. Such identifiers can be referred to as being globally unique. An example of this is a unique ID generated by the system when a new customer is created. There can be some interesting technical challenges involved that are worth pointing out. If the method used to generate IDs can guarantee the uniqueness of each ID, assigning them is a fairly straightforward process. But if you’re dealing with distributed systems, generating globally unique IDs in an efficient way can be a technical feat in itself.

Some entities only have local identities.

Some entities are contained within another entity. Because such encapsulated entities are managed by the entity that holds them, it’s usually enough if they have an identity which is only unique inside the owning entity. Such an entity’s identity is said to be local to the owning entity. To go back to our customer entity, say our system is a customer management system for retail stores and every customer belongs to one, and only one, store. In this case the identity only needs to be unique within the store the customer entity belongs to. Modeling an identity to have local uniqueness can simplify the ID generation function. It also makes it clearer that the responsibility for managing those entities lies with the encapsulating entity.
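A rough Java sketch of local identity for the retail store scenario could look like this. All names (Store, Store.Customer, register, find) are hypothetical, and the plain counter is an assumption that only holds for single-threaded use. The point is that the owning entity hands out the IDs and is the only way to create or look up its customers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical Store entity that owns its customers and hands out local IDs.
final class Store {

    // A customer whose ID is only unique within the owning store.
    static final class Customer {
        private final int localId;
        private final String name;

        // Private: only the owning Store can create customers.
        private Customer(int localId, String name) {
            this.localId = localId;
            this.name = Objects.requireNonNull(name);
        }

        int localId() { return localId; }
        String name() { return name; }
    }

    private final String storeId;   // the store's own identity
    private final Map<Integer, Customer> customers = new LinkedHashMap<>();
    private int nextLocalId = 1;    // a simple counter suffices for local uniqueness

    Store(String storeId) { this.storeId = Objects.requireNonNull(storeId); }

    String storeId() { return storeId; }

    // The store assigns the next free local ID and keeps track of the customer.
    Customer register(String name) {
        Customer customer = new Customer(nextLocalId++, name);
        customers.put(customer.localId(), customer);
        return customer;
    }

    Optional<Customer> find(int localId) {
        return Optional.ofNullable(customers.get(localId));
    }
}
```

Two different stores can both have a customer with local ID 1 without conflict, because a customer is always referenced through the store that owns it.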

Keep Entities Focused

One thing to keep in mind when you’re modeling entities is to try to only add attributes and behaviors that are essential for the definition of the entity, or that help to identify it. Other attributes and behaviors should be moved out of the entity itself and put into other model objects that can then be part of the entity. These model objects can be other entities or they can be value objects. Entities are concerned with the coordination of operations on themselves and the objects they own. This is important because there may be certain invariants, or rules, that apply to a certain operation, and because the entity is responsible for maintaining its internal state and encapsulating its behavior, it must also own the operations on the internals. If the operations were to be moved outside of the entity, this would make it anemic [1].

Entities coordinate operations.

When boarding an airplane, each passenger must present a boarding card in order to verify that they’re about to board the correct plane, and to make it easy to keep track of whether anyone is missing when the plane is about to depart. If passengers were allowed to freely walk in and out of the airplane, the stewards would need to check all the boarding cards after everyone was seated. This would be a lot more time-consuming and possibly cause confusion if passengers had taken a seat in the wrong plane. With this in mind, it makes sense to control and coordinate the boarding of passengers. The same goes for the software model that handles this. If we have the airplane modeled as an entity with a list of boarded passengers, then other parts of the system shouldn’t be allowed to freely add passengers to that list, as it would be too easy to bypass the invariants. A passenger should be added by a method board(BoardingCard) on the airplane entity. This way the airplane entity controls the boarding of passengers and can maintain a valid state. It only allows the boarding of passengers with a boarding card that matches the current flight.
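The boarding example might be sketched in Java as follows. Only the method board(BoardingCard) comes from the text. The fields, the flight-number comparison, and the duplicate-boarding check are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical boarding card: which passenger, for which flight.
final class BoardingCard {
    private final String flightNumber;
    private final String passengerName;

    BoardingCard(String flightNumber, String passengerName) {
        this.flightNumber = Objects.requireNonNull(flightNumber);
        this.passengerName = Objects.requireNonNull(passengerName);
    }

    String flightNumber() { return flightNumber; }
    String passengerName() { return passengerName; }
}

// The Airplane entity owns the passenger list and guards every change to it.
final class Airplane {
    private final String flightNumber;
    private final List<String> boardedPassengers = new ArrayList<>();

    Airplane(String flightNumber) {
        this.flightNumber = Objects.requireNonNull(flightNumber);
    }

    // The only way to add a passenger; the invariants are checked here.
    void board(BoardingCard card) {
        Objects.requireNonNull(card);
        if (!flightNumber.equals(card.flightNumber())) {
            throw new IllegalArgumentException("Boarding card is for another flight");
        }
        if (boardedPassengers.contains(card.passengerName())) {
            throw new IllegalStateException("Passenger has already boarded");
        }
        boardedPassengers.add(card.passengerName());
    }

    // Callers get a read-only view, so they cannot bypass board().
    List<String> boardedPassengers() {
        return Collections.unmodifiableList(boardedPassengers);
    }
}
```

Returning an unmodifiable view is one way to keep other parts of the system from adding passengers directly. Returning a copy of the list would work as well.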

Entities play a central role in representing concepts in a domain model, but not everything in a model is defined by its identity. Other building blocks of your domain model that are important from a security perspective are value objects and aggregates, which you’ll also learn about in the book.



--------
[1] Fowler M., "AnemicDomainModel" (2003), https://www.martinfowler.com/bliki/AnemicDomainModel.html
