With microservices architectures, one key question is where the boundaries should be. Is every entity a separate service? How fine-grained should the divisions between services be? We can offer some simple rules of thumb.
Probably the most useful way to start approaching an API breakdown is to consider entities by topic. We can group entities with related data and a similar lifetime or lifecycle into topics.
What are Topics?
A topic is a group of related entities, usually accessed together and sharing a similar lifecycle. Topics can be related to domains in Domain-Driven Design; a domain is normally made up of a number of topics. For example:
- Customer, Address and Preferences can often be formed into a cohesive ‘Customer’ topic.
- Products, Catalog and Images often form a cohesive ‘Product Catalog’ topic.
- Orders, Order Lines and Order Status will likely form a cohesive ‘Orders’ topic.
- ‘Facebook Likes’ might usefully be a separate topic from ‘Posts’ to allow for the much greater volume & lower data-consistency requirements for displaying Likes.
Child data used only in conjunction with the parent entity, for example Customer Address, should be kept together — either as a child element within the Customer request/response, or as a separate API within the same topic.
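As a minimal sketch of the first option (child data nested within the parent's response), the following assumes a hypothetical ‘Customer’ topic modelled with Python dataclasses. The field names and the customer_response helper are illustrative assumptions, not part of any particular API.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Address:
    # Child data: only meaningful in the context of its Customer.
    street: str
    city: str
    postcode: str


@dataclass
class Preferences:
    # Hypothetical preference fields, for illustration only.
    newsletter: bool = False
    language: str = "en"


@dataclass
class Customer:
    # Parent entity of the illustrative 'Customer' topic.
    customer_id: int
    name: str
    addresses: List[Address] = field(default_factory=list)
    preferences: Preferences = field(default_factory=Preferences)


def customer_response(customer: Customer) -> dict:
    """Build a response body with the child data nested inside the parent."""
    # asdict() converts nested dataclasses and lists recursively.
    return asdict(customer)
```

Calling customer_response on a Customer with one Address returns a single dictionary in which "addresses" is a list of address dictionaries, so a client gets the parent and its children in one request.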
Data with a significant lifecycle of its own, or with important security or storage constraints, is usually a sign that the entity should be given its own topic. For example:
- Orders should not be grouped with Customer, since Orders may be accessed individually by fulfilment & shipping, and have a shorter lifecycle than the Customer overall.
- Credit Card details should not be grouped with Customer, since strong security and storage constraints apply.
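The rule of thumb above can be sketched as a small classification function. This is only an illustration under stated assumptions: the EntityProfile type, its two flags and the assign_topic function are hypothetical names, and real modelling decisions will involve more judgement than two booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityProfile:
    name: str
    parent: str                          # entity it is most closely related to
    independent_lifecycle: bool = False  # accessed or retired on its own schedule
    special_security: bool = False       # stricter security or storage constraints


def assign_topic(entity: EntityProfile) -> str:
    """Return the topic an entity falls into under the rule of thumb."""
    if entity.independent_lifecycle or entity.special_security:
        return entity.name   # flagged: the entity gets its own topic
    return entity.parent     # otherwise it stays with its parent


entities = [
    EntityProfile("Address", parent="Customer"),
    EntityProfile("Orders", parent="Customer", independent_lifecycle=True),
    EntityProfile("Credit Card", parent="Customer", special_security=True),
]
topics = {e.name: assign_topic(e) for e in entities}
```

Here Address stays in the ‘Customer’ topic, while Orders and Credit Card are each assigned a topic of their own.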
Other Commonality: Topics by Lifecycle
Sometimes entities within a domain are only loosely related, but have something in common in terms of Data Source, Lifetime, or Lifecycle.
In these cases, modelling decisions are more fluid, but the following are candidates for grouping together, provided the result is a meaningful, cohesive topic:
- A set of Reference Data tables.
- Data coming from a particular External Source.
- Data with common lifetime or storage requirements (e.g. History).
As above, entities should belong to the same domain, and usually be at least loosely related, before we consider grouping them into a single topic based on these commonalities.
Mapping Topics to Services
Now that we have grouped our entities & operations into topics, how should we divide these into services?
One original “microservice” school of thought was that every entity should be an individual service, possibly with its own database. These architectures typically have scores or hundreds of services. However, the overheads of integrating and deploying so many fine-grained services have been found to be very high.
These days, the pendulum has swung back towards a “midi” style of service architecture, with a moderate number of coarse-grained topics and a separately deployed service for each of these.
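As a rough sketch of this “midi” style, the mapping from topics to services can be written down as plain data: one separately deployed service per topic, with every entity owned by exactly one service. The topic keys, entity names and the "-service" naming scheme below are assumptions made for illustration.

```python
# Illustrative topic groupings, loosely based on the examples earlier on.
TOPICS = {
    "customer": ["Customer", "Address", "Preferences"],
    "product-catalog": ["Product", "Catalog", "Image"],
    "orders": ["Order", "Order Line", "Order Status"],
}


def service_name(topic: str) -> str:
    """One separately deployed service per topic (naming scheme is assumed)."""
    return f"{topic}-service"


def build_entity_index(topics: dict) -> dict:
    """Map each entity to the single service that owns it."""
    index = {}
    for topic, entities in topics.items():
        for entity in entities:
            if entity in index:
                # An entity claimed by two topics suggests the boundaries need rethinking.
                raise ValueError(f"{entity} is claimed by more than one topic")
            index[entity] = service_name(topic)
    return index


ENTITY_OWNER = build_entity_index(TOPICS)
```

With nine entities grouped into three topics, this yields three services rather than nine, which is the reduction in integration and deployment overhead that the coarser-grained approach aims for.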
When Should I Use a Separate Database?
Databases provide the crucial storage underlying our services. While some purists advocate a separate database for each separately deployed service, there are pros and cons.
Sharing the same database offers benefits when:
- Lifetime & volume of data are compatible (within the bounds of what your DB can handle).
- Ownership & security of the data are similar.
- Bulk operations or performance may be an issue — having the data in a single database enables filtering, joining & stored procedures in the database.
- BI, analytics & reporting are easier.
Using a separate database offers benefits when:
- Volume & rate of transactional or timeseries data are vastly greater than for other data.
- Ownership or security of the data require physical separation.
- Storage lifetimes or storage management benefit from use of a separate schema, separate physical storage or separate DB.
- Technical factors require different database technology.
This is probably a slightly controversial view, but personally I tend to lean by default towards sharing databases where possible. I have had many experiences in performance and BI work where being able to use a cohesive database offered key advantages.
Technical factors do distinguish some databases, but selecting a good general-purpose database (PostgreSQL or Oracle come to mind) on modern hardware will enable a common database to cover a very broad range of needs.
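The criteria in this section can be summarised as a simple checklist that defaults to a shared database and recommends a separate one only when a specific reason applies. This is a hedged sketch: the DataProfile flags and function names are hypothetical, and treating any single criterion as sufficient is a simplification (differing storage lifetimes, for instance, may be handled with a separate schema rather than a separate database).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataProfile:
    # Each flag mirrors one of the "separate database" criteria listed above.
    vastly_higher_volume: bool = False
    needs_physical_separation: bool = False
    distinct_storage_lifetime: bool = False
    needs_different_technology: bool = False


def separation_reasons(profile: DataProfile) -> List[str]:
    """List the reasons, if any, for giving this data a separate database."""
    checks = [
        (profile.vastly_higher_volume, "volume or rate far exceeds other data"),
        (profile.needs_physical_separation, "ownership or security requires physical separation"),
        (profile.distinct_storage_lifetime, "storage lifetime or management differs"),
        (profile.needs_different_technology, "a different database technology is required"),
    ]
    return [reason for flag, reason in checks if flag]


def recommend_database(profile: DataProfile) -> str:
    """Default to 'shared'; choose 'separate' only when a criterion applies."""
    return "separate" if separation_reasons(profile) else "shared"
```

For example, recommend_database(DataProfile()) returns "shared", while a profile with vastly_higher_volume=True returns "separate", and separation_reasons reports why.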