Architecture in an Agile environment is quite different from a design-first approach: rather than building up a complete picture up front, it is an evolutionary process. The role of architects shifts from designing the solution to providing an overall view of the system and communicating the principles behind it.
These principles, formerly often hidden behind complete solutions, now take centre stage. And they gain a new dimension: delivery speed. Architecture that calls for a lengthy environment creation process¹, or for half of everything to be built before anything can be delivered, fails to incorporate this new dimension - and automatically prevents any agile delivery.
One of the main aspects to look out for is dependencies. Most of us have been there: we have to build services A, B and C, because B depends on C, and A on B. Unfortunately these aren't easy to unpick: an ordering service requires a product service to work, right? And all of them need to be deployed on a set of in-house clustered HA servers, right? How long is that going to take?
I find "latency", as coined by Edmund Jorgensen in his blog article "Why Your Team Has Slowed Down, Why That's Worse than You Think, and How to Fix it", a useful term. It describes the delays associated with delivery activities - and to "go fast" you need to invest in reducing your latency.
All components have a latency factor. Where a library may have a small initial latency for learning, a dedicated server in your data centre has a much higher one. As components are integrated these latencies add up, and without "supervision" the resulting latency is often the sum of the component latencies.
Mind your categories
Components usually fall into these categories:
- Infrastructure (servers, networking, cloud)
- Business software (code that solves your problem; this includes configuration)
- Technical software (libraries, software like databases, external services; not specific to your domain)
Mixing any of these is a recipe for disaster: a library or database providing business logic, or infrastructure containing software configuration, quickly leads to a chain of actions when developing and deploying new features, because these categories typically have different lifecycles. Re-deploy the whole application because the behaviour of a shared library has changed, anyone?
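To sketch what keeping the categories apart can look like in code (a minimal, hypothetical example - all names are made up): the technical library below knows only how to apply rules, while the business rules themselves live with the service that owns them, so changing a discount never forces a library release.

```java
// Hypothetical sketch: business software vs. technical software.
import java.util.List;
import java.util.function.UnaryOperator;

public class CategorySeparation {

    // Technical software: a generic, domain-free pipeline. It can ship
    // on its own lifecycle because it contains no business behaviour.
    static double applyAll(double input, List<UnaryOperator<Double>> rules) {
        double result = input;
        for (UnaryOperator<Double> rule : rules) {
            result = rule.apply(result);
        }
        return result;
    }

    // Business software: pricing rules live with the ordering service,
    // not inside the shared library.
    static final UnaryOperator<Double> TEN_PERCENT_OFF = p -> p * 0.9;
    static final UnaryOperator<Double> FLAT_SHIPPING = p -> p + 5.0;

    public static void main(String[] args) {
        double price = applyAll(100.0, List.of(TEN_PERCENT_OFF, FLAT_SHIPPING));
        System.out.println(price); // 95.0
    }
}
```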
Story-Service Misalignment (artificial latency)
Conway's law - these days we can't go without referencing it - says that the software design will reproduce the social and communication structures within the organisation. I am even inclined to add that the communication mechanisms are reproduced as well, as a large number of batch-oriented process designs will attest - long live Microsoft Excel as the source of those.
These communication structures and mechanisms have evolved from process optimisations, and often don't reflect the customer's view of the organisation (ever been pushed between departments?). Stories, though, are customer-focussed ("As a customer I want ...").
To illustrate the issue: a single customer story frequently cuts across several services, each shaped after a different department.
The mismatch between the organisational structure and how it serves the customer is a difficult problem to solve - and one beyond the scope of this article. The architecture, though, must recognise this mismatch and align services to stories - and likewise question stories that align to multiple services.
Good on you for having a big ball of mud - that solves, at least, this issue ...
Foundation Services (component latency)
We promote out-of-the-box components, for numerous reasons: buying is usually cheaper than building, easier to support - and the product is opinionated in its concepts, which is mostly a good thing. Anyone involved in the installation and configuration of an enterprise product, though, will have another story to tell, one that usually spans several vacation leaves - I don't think I have to tell you one, do I?
High-latency components are not only external enterprise products, though. Architects and developers, in their quest for better and higher levels of abstraction, have created universal solutions that can cover anything: database-layer-as-a-service and rules-engine-as-a-service come to mind - and who has never thought that products and inventory are basically the same, except for a few negligible business rules that can be configured?
Interestingly these things tend to gravitate towards the foundation of your architecture; does this diagram look familiar? (I made it up, but have seen many just like it)
All the red lines indicate communication - dependencies again. If you need the session service, data service and transaction service ready before you can practically start, you have a problem. And because foundation services also tend to take the longest to develop, and even longer to deploy (huge latency) ... well, you have a large problem.
Design Patterns (design latency)
Latency even lurks in your design patterns. It is commonly considered good practice to separate your concerns into layers: a presentation layer with some MVC pattern, a business layer, perhaps a data access layer with a repository pattern?
The latency hides in the re-use of components (more about that below) and their (well-intentioned) dependencies. There is nothing wrong with layers, but when layers drive the design, the implied meaning is that re-use is encouraged - and this is often manifested in the package structure. Layering should be an implementation detail.
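One way to keep layering an implementation detail, sketched below with illustrative names: the feature exposes a single public entry point, while its internal layers are package-private, so other features cannot grow dependencies on them.

```java
// Sketch (hypothetical names): the order feature's public API ...
public class OrderService {
    private final OrderRepository repository = new OrderRepository();

    public String place(String productId) {
        // layering happens inside the feature, invisible to callers
        return repository.save(productId);
    }
}

// ... and its data access layer, package-private so it cannot be
// re-used (and depended upon) from outside the feature's package.
class OrderRepository {
    String save(String productId) {
        return "order-for-" + productId; // stand-in for persistence
    }
}
```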
I have to admit I'm a fan of XML-based declarative dependency configuration (the way Spring used to be configured, or OSGi's Blueprint): not so much the XML part, but the fact that it is separate from the code. I know, why keep it in two places - but at least it is a relatively easy way to assess the dependencies in one location.
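For those who never saw this style, a minimal Spring-flavoured sketch (the class names are made up): the dependency from the order service to the product client is visible in one place, rather than scattered through annotations in the code.

```xml
<beans xmlns="http://www.springframework.org/schema/beans">
  <bean id="productClient" class="com.example.ProductClient"/>
  <bean id="orderService" class="com.example.OrderService">
    <constructor-arg ref="productClient"/>
  </bean>
</beans>
```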
One of the worst sources of latency, though, is data. Data frequently forms a bi-directional dependency: you can't change one side without the other. Then there is the idea of a shared database: it sounds good, but it now forms a bi-directional dependency tree - even worse than it sounds. Now try migrating this between environments ...
And then there is its evil cousin: The Canonical Data Model. It should be considered an anti-pattern, given that it ties whole networks of components together, usually under the term Enterprise Data Model. I think I might write some more about that another time ...
Re-use (code latency)
If you are a CS student: re-use is good, please don't read any further. For all others: re-use is a double-edged sword. Re-use helps to abstract common concepts - and the prospect of maintaining one code base sounds good, too. However, it introduces a new dependency, and the less thought through it is, the more coupled it is to its consumers.
And yes, re-use also has its evil cousin: inheritance. Not the good type of inheritance, behaviour common to all sub-classes; no, the practice of arbitrarily moving seemingly duplicate code into parent classes, mindlessly introducing dependencies where there are none. AbstractController, GenericRepository and similar names are indicators of this.
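The usual way out is composition: the shared behaviour becomes a small collaborator that each class chooses to use, instead of a parent class that accumulates everyone's dependencies. A minimal sketch, with hypothetical names:

```java
// Sketch: composition instead of an AbstractController parent class.
import java.util.ArrayList;
import java.util.List;

public class CompositionOverInheritance {

    // The formerly-inherited behaviour, now a standalone collaborator.
    static class RequestLog {
        final List<String> entries = new ArrayList<>();
        void record(String path) { entries.add("handled " + path); }
    }

    // A controller that composes only what it actually needs; no
    // inherited dependency on whatever else a parent class drags in.
    static class OrderController {
        private final RequestLog log;
        OrderController(RequestLog log) { this.log = log; }

        String handle(String path) {
            log.record(path);
            return "OK:" + path;
        }
    }

    public static void main(String[] args) {
        RequestLog log = new RequestLog();
        OrderController controller = new OrderController(log);
        System.out.println(controller.handle("/orders/1")); // OK:/orders/1
        System.out.println(log.entries);                    // [handled /orders/1]
    }
}
```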
After seeing too many projects severely slowed down by re-use I've come to the conclusion that the promise of re-use is broken, both on the architectural and on the code level - and in Uwe Friedrichsen I found at least one person who agrees.
So where does this leave us?
If you are like me, and only look at the pictures and the summary, then the theme is: red lines are bad. For those who read the rest (thank you), there are hopefully some takeaways:
- Identify the latency for each deployable (make it part of your architecture!)
- Identify latency that doesn't align with (customer) features
- Identify dependencies, especially those that accumulate latency
- Reduce latency - particularly accumulating latency and latency across components
Coming back to the order service that depends on the product service that depends on an in-house HA cluster (as a hypothetical answer to a hypothetical question): what a customer wants shouldn't need more than one service, so design services to be autonomous. Design them so they don't need HA - for example, an asynchronous design with eventual consistency goes a long way. Avoid high-latency application servers and prefer standalone applications. And build them so they can run on low-latency infrastructure - container-based cloud infrastructure and foundations (e.g. databases, message queues) as-a-service make all of this possible.
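To make the asynchronous, eventually consistent idea concrete, here is a deliberately small sketch (queue, event and service names are all made up): the order service keeps its own replica of product prices, updated by events, so pricing an order never blocks on a synchronous call to the product service.

```java
// Sketch of an autonomous order service with an eventually
// consistent replica of product data.
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class AutonomousOrderService {

    // Event published by the (hypothetical) product service.
    record ProductPriceChanged(String productId, double price) {}

    // Stand-in for a message queue between the two services.
    static final Queue<ProductPriceChanged> queue = new ArrayDeque<>();

    // The order service's local, eventually consistent replica.
    static final Map<String, Double> priceReplica = new HashMap<>();

    // Applies pending events; in reality this would be a subscriber.
    static void drainEvents() {
        ProductPriceChanged event;
        while ((event = queue.poll()) != null) {
            priceReplica.put(event.productId(), event.price());
        }
    }

    // Pricing an order uses only local state - no remote dependency,
    // no HA cluster required to stay available.
    static double priceOrder(String productId, int quantity) {
        return priceReplica.getOrDefault(productId, 0.0) * quantity;
    }

    public static void main(String[] args) {
        queue.add(new ProductPriceChanged("widget", 2.5)); // product service publishes
        drainEvents();                                     // replica catches up
        System.out.println(priceOrder("widget", 4));       // 10.0
    }
}
```

The trade-off is that the replica can briefly lag behind the product service - the eventual consistency the text refers to.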
Every bit counts - so even if you can't operate production in a cloud environment, you gain a lot by having QA/UAT on low-latency infrastructure.
¹ There is an argument that projects requiring special hardware or software (for example DSLAMs in the telco space, or large application servers in, well, most enterprises) require a long setup process. I'd argue that these projects cannot be agile, because the setup doesn't allow for change. Implementing agile here is often counterproductive, and often driven by other needs, like reporting. However, given the speed of change, there soon won't be many applications that require special hardware or software. ↩