Monday, February 18, 2008

SOA and the "Universal Truth"

I was having a discussion with an experienced EA from another organisation the other day. His view was that a decentralised data architecture could never work because there would always be a certain amount of entropy in the data. That is, there would always be delays as the data is published between the interested services and as such there would not be one universal version of the truth.

In his opinion, in the centralised world you could point management at the data in a given service and know the whole truth.

In my opinion, however, this is merely an illusion of the truth. In order to get at "the truth", we must first define what the truth is. The truth is a matter of business definition. IT systems support a business; it is not the other way around. Truth begins in the real world as a business concept, and eventually ends up being recorded in a database somewhere. The event did not occur when the database was updated; it happened beforehand.

As such, with either approach you will not be able to look to a single database for the universal truth. There will always be delays between the "truth" and the database. That is, you will always have entropy.

Also worth noting is that information in organisations has natural entropy. This is due to the fact that no one person can know everything at once. A customer first arrives via the sales department and eventually ends up at billing. The view each department has of the customer at various stages during the sales process is quite different. Also, the information does not travel instantly between the people in each department.

By mirroring this behaviour with our services (that is, by decentralising our data), we have a better chance of getting at the truth as the business would define it.

So although in theory there is a "universal truth" to the universe (except at the quantum level perhaps), in reality it is something we cannot practically attain; not due to the selection of a decentralised as opposed to centralised approach, but due to human limitations.


Unknown said...

Hi Bill,

Interesting post.

I think what the other EA was trying to say is that if data is centralised, it's easier to interrogate (takes less effort). The reference to entropy seems irrelevant to me and is supported by your points.

I subscribe to the view that there is no one ideal architecture. Real-world requirements and resource constraints dictate what works.

For example, if a decentralised system was developed to facilitate high availability but there was a need for holistic reporting, why not add a central data warehouse which contains a snapshot of the different data sets at a point in time? In a SOA/Message Bus environment, that could be achieved by a service that subscribes to be notified of changes in all other services.

In my experience, the reason data is centralised is so that it can be analysed, and generally this is big-picture stuff, so the business doesn't need live information.

So we could be talking about two different paradigms, both of which can be satisfied by mixing architectures.

Bill said...

Hi Mat,

I'm glad you enjoyed the post!

Indeed, in many enterprises it is necessary to have a holistic view of data.

A BI solution is most often appropriate, and implementing that by having the BI service subscribe to all relevant notifications is indeed the best approach.

I wouldn't refer to this as mixing paradigms however, as we now have even more decentralisation of data.

The centralised approach as I've described it stipulates that any given piece of information is stored only in one location in the enterprise, and then accessed from that location by all other systems that need it.

Having all data in the enterprise duplicated in a centralised data warehouse still constitutes a decentralised data architecture by nature of the duplication.
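To make the subscription approach concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `MessageBus`, `BillingService` and `BIService` classes, the topic name and the invoice data are all hypothetical, and the in-memory queue stands in for a real message bus. It shows the BI service holding its own duplicate copy of the data, and that copy lagging behind the source until the notification is delivered.

```python
from collections import defaultdict, deque


class MessageBus:
    """Toy in-memory bus (hypothetical); messages are queued and delivered later."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers
        self._queue = deque()                  # published but not yet delivered

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Publishing only enqueues, which models the propagation delay.
        self._queue.append((topic, message))

    def deliver_pending(self):
        # Hand every queued message to the handlers subscribed to its topic.
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(topic, message)


class BillingService:
    """Owns invoice data and announces changes on the bus."""

    def __init__(self, bus):
        self._bus = bus
        self.invoices = {}

    def raise_invoice(self, invoice_id, amount):
        self.invoices[invoice_id] = amount
        self._bus.publish(
            "billing.invoice_raised", {"id": invoice_id, "amount": amount}
        )


class BIService:
    """Subscribes to notifications and keeps its own duplicate of the data."""

    def __init__(self, bus, topics):
        self.warehouse = defaultdict(dict)  # topic -> {record id -> record}
        for topic in topics:
            bus.subscribe(topic, self.on_notification)

    def on_notification(self, topic, message):
        self.warehouse[topic][message["id"]] = message


bus = MessageBus()
billing = BillingService(bus)
bi = BIService(bus, ["billing.invoice_raised"])

billing.raise_invoice("INV-1", 250)
before = len(bi.warehouse["billing.invoice_raised"])  # not yet delivered
bus.deliver_pending()
after = len(bi.warehouse["billing.invoice_raised"])   # duplicate now recorded
print(before, after)
```

The billing service remains the owner of its invoices; the warehouse only ever holds a copy built from notifications, which is why this arrangement is still a decentralised data architecture.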

I will be doing a follow-up post on SOA and BI sometime in the next few days.

Sudip Shrestha said...

It was indeed a very good theoretical answer from you regarding decentralization. However, the EA you were having the discussion with probably does not realize that, in practice, message extraction with modern technology such as JMS and MDB/MDP is really fast, and it is up to the recipient system to process that message into itself.

Bill said...