Friday, February 29, 2008

COTS Applications and the Service Boundary

Most commercial off-the-shelf (COTS) applications these days expose Web service endpoints so they can interoperate in a platform-neutral way with other applications in your enterprise. These COTS applications are not services in their own right.

Although they expose Web service endpoints, they are not services in the context of SOA. The endpoints they expose should be considered integration points to be leveraged inside the service boundary.

The contract of a business service must be expressed in terms of your business. The contract should abstract away the implementation details of the service (such as which COTS application is in use), expressing the business functionality in the language of your business.

The same off-the-shelf package will fulfil different roles in different services in different organisations. For instance, in my last post we explored an example where we had a CRM system being used to support both a Sales service and a Marketing service.

Now this could be two instances of the same CRM package or, as in the example, a shared instance. Regardless, the functions being exposed to the enterprise are two distinct services that, based on their service contracts, perform Sales and Marketing functions.

If we directly exposed the CRM application's service interface as both the Sales and Marketing service contracts, then both services would have the same interface! This would imply they perform the same business function. Obviously as they are distinct services, they do not.

There needs to be an abstraction away from the service interfaces offered by COTS applications to the intended business function to be performed by the service.

So in summary, do not directly expose a COTS application's Web service endpoints to the other services in your organisation. Put a layer of abstraction in between.
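To make this concrete, here is a minimal sketch in Python (all names hypothetical) of what that layer of abstraction might look like. The vendor client below is a stand-in for the COTS application's Web service interface; the point is that only the service implementation knows about it, while consumers see a contract expressed in business terms:

```python
class VendorCrmClient:
    """Stand-in for the COTS application's exposed endpoint."""
    def upsert_contact_record(self, record: dict) -> str:
        # In reality this would be a WS-* call to the vendor endpoint.
        return "VENDOR-ID-001"

class SalesService:
    """Public contract expressed in the language of the business."""
    def __init__(self, crm: VendorCrmClient):
        self._crm = crm  # implementation detail, hidden behind the boundary

    def register_sales_lead(self, name: str, company: str) -> str:
        # Translate the business request into the vendor's representation;
        # this is the only place that mapping lives.
        vendor_record = {"FullName": name, "OrgName": company, "Type": "LEAD"}
        return self._crm.upsert_contact_record(vendor_record)
```

If the enterprise later swapped out the COTS package, only the translation inside register_sales_lead would change, not the contract.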

Wednesday, February 27, 2008

Sharing Applications Between Services

A reader posted a good question regarding my Service Boundary Example post. He asked:

Let's say as a result of our enterprise architecture planning we've decided that we're going to have a Sales Service (Leads, Opportunities, etc) and a Marketing Service (Campaigns, Promotions, etc).

However let's say we currently have one legacy CRM system that supports the marketing department and the sales department.

Which service boundary will contain that system?

The answer to that is that different parts of the legacy CRM would sit behind each service boundary. There is no issue in sharing infrastructure between services. You just need to be aware of the dependencies on that infrastructure from a risk management perspective.

Service boundaries are drawn up around data. There must be no data shared between services except through explicit message exchanges.

At the very least I would suggest that we have process isolation between services, so that data isolation is enforced at the OS level. To be fair, however, you don't strictly require process isolation between two services in order for them to function as services - it is just a good design choice.

So in this case, as long as the lead and opportunity data are never used by the Marketing service, and the campaigns and promotions data are never used by the Sales service, then we have data isolation and we haven't offended the SOA gods too deeply.

When transitioning legacy systems into an SOA, we sometimes have to compromise further on this because we have a single legacy system used by two services that doesn't enforce this data isolation. We can't control the design of this system. And to be fair, it wasn't designed to be used in this way.

The issue is we now have a message sent to one service affecting the state of the second. This is obviously not ideal, and I wouldn't consider these true isolated services in the context of SOA. However it can be a useful halfway point on the way to true SOA.

Why a Single Service?

In my last post, I illustrated an example where two legacy applications were bundled up to be included in a single CRM service. The question was raised as to why these were not just simply made to be separate CRM services.

The answer comes down to aligning the service definitions with the business. In the majority of businesses, the handling of a VIP customer is similar to that of a regular customer. The two processes have a lot in common. In order to have loose coupling and high cohesion, both of these concerns should then be placed in the same service.

Moreover, the decision as to whether or not a customer is VIP is a CRM concern. We do not want this logic in other services that send requests to the CRM service. If we had two separate CRM services, then each service sending them requests would need to decide which request to send - the VIP CRM service request, or the regular CRM service request. This decision should not be a concern of the requesting service.

Tuesday, February 26, 2008

Service Boundary Example

After my last post, I thought I'd give an example to clarify the point I was making about service boundary definition. Let's say we've got a couple of legacy custom-built CRM applications that we want to leverage in support of a new CRM service. Let's say one application is used for VIP customers, and the other for regular customers.

We have identified that we want a single CRM service to support the Sales department. The development team decides that they will expose Web service endpoints on each of the legacy CRM applications in order to provide integration points for our new CRM service.

We define our CRM service contract based on the business function we want our service to perform. Note that we now have three service contracts, but only one service. We have the public CRM service contract, which is what defines our service boundary; and then we have a service contract for each of the legacy CRM applications, which sit behind the service boundary.

These legacy CRM applications are not services in the context of SOA, despite accepting messages over a WS-* stack defined by a WSDL document. They are an implementation detail of our single CRM service, and should not be directly accessed by anything outside the CRM service's boundary.

We now need to implement the public CRM service operations, leveraging the private operations exposed on the legacy CRM applications. We might employ a tool such as BizTalk for this job.

At a future point in time, if our enterprise decided to replace the legacy CRM applications, we could do so without updating the CRM service contract - assuming of course that the required functionality offered by the CRM service to the enterprise did not change.
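As a toy sketch of this arrangement (Python, with hypothetical names; the two legacy classes are stand-ins for their Web service endpoints), the single public contract routes internally, and the VIP decision stays behind the boundary:

```python
class VipCrmApp:
    """Stand-in for the legacy application handling VIP customers."""
    def create_customer(self, name: str) -> str:
        return f"vip:{name}"

class RegularCrmApp:
    """Stand-in for the legacy application handling regular customers."""
    def create_customer(self, name: str) -> str:
        return f"reg:{name}"

class CrmService:
    """The single public CRM contract; the legacy apps sit behind it."""
    def __init__(self, vip_app, regular_app, vip_names):
        self._vip = vip_app
        self._regular = regular_app
        self._vip_names = set(vip_names)

    def register_customer(self, name: str) -> str:
        # The VIP decision is a CRM concern, made behind the boundary;
        # consumers only ever see register_customer.
        app = self._vip if name in self._vip_names else self._regular
        return app.create_customer(name)
```

Consumers send one kind of request; which legacy application handles it is invisible to them.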

Saturday, February 23, 2008

Defining the Service Boundary

I think maybe we need a break from our lengthy thread of discussion on data and its influence on SOA. So I've decided to start off a discussion on service definition and the service boundary.

Perhaps one of the reasons why there has been so much confusion over the definition of SOA is that we haven't been very clear on the definition of a service itself. A service in any given architecture is defined by its boundary. The boundary tells us what parts make up the service (what is inside the boundary), and how the service interacts with the outside world (determined by the service contract).

One problem is that the WS-* stack has become pretty much ubiquitous, and is used today for almost any type of communication between distributed components. WCF as Microsoft's unified communications stack perpetuates this. When we have WCF artifacts such as service hosts, service contracts, endpoints, etc. floating around everywhere, people tend to align their definition of a service with these constructs.

As a result of this WS-* proliferation, it is extremely likely that we will have a lot of WS-* communications going on behind the boundary of a distributed service.

A distributed service is one where communications over a network occur between distributed components behind the service boundary. Although these communications are message based and defined by a contract described in WSDL, these components do not sit on the service boundary. Whether the parts of a service are distributed or not is an implementation detail of the service - it does not affect its contract, and as such does not affect its boundary.

So with all these endpoints and WSDL documents flying around, how does one identify the service boundary? Well, we need to classify all distributed components exposed in a service as public or private. The public components define the service boundary. The private components are an implementation concern of a distributed service and must be hidden from the service's consumers.

No system outside the service boundary shall communicate directly with any system behind the service boundary. Data cannot be directly shared between services. It can be shared only by way of message exchange through the explicitly defined public endpoints at the service boundary.

Enforcing this is really a matter of IT governance. Service boundaries must be defined, documented and published by the architect. Only the public endpoints should be known to service consumers. This in turn defines the intended service boundaries.

Friday, February 22, 2008

Fear the DSP

Udi Dahan has just made a post which brought to my attention a new TLA (Three Letter Acronym) in the SOA space - DSP (Data Services Platform). I agree with Udi - this is all bad. Hopefully it will go away on its own.

Udi made a previous post citing similar criticism of Microsoft's then named Astoria product, which has now been re-badged as ADO.NET Data Services. I agree that this product has the potential to encourage people to expose CRUD interfaces at the service boundary, which for reasons I've previously explained are bad news.

The space in which this product should play is exposing data for consumption by Web clients in AJAX or Silverlight applications. In this case, the CRUD interface is used only inside the service boundary, which is fine.

If it is unclear to you as to why the Web client is inside the service boundary, stay tuned for my next series of posts which should clear the fog a bit.

SOA and Business Intelligence

This post follows on from my post on SOA and the "Universal Truth". The "Universal Truth", as it was described, referred to the ability (or inability, as it turned out) to find the complete truth about a given business entity (such as a customer or an order) at any given point in time. What was not addressed was the ability to see a holistic view of a number of entities distributed across a number of services.

For this, we create a BI service which sits on the service bus, subscribed to all relevant business events. When these events occur, the BI service stores the relevant information extracted from the notification messages into a local data mart, which is then used to populate one or more OLAP cubes. The business can then use a standard BI package to query and present the data.
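Here's a minimal sketch of that arrangement in Python (hypothetical names; a toy in-process bus stands in for the real messaging infrastructure, and the OLAP cube population step is omitted):

```python
from collections import defaultdict

class ServiceBus:
    """Toy stand-in for the real publish/subscribe infrastructure."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The publisher sends one message; the bus routes it to each subscriber.
        for handler in self._subscribers[topic]:
            handler(event)

class BiService:
    """Subscribes to relevant business events and records the extracted
    elements in a local data mart (here, just a list)."""
    def __init__(self, bus):
        self.data_mart = []
        bus.subscribe("OrderPlaced", self._on_order_placed)

    def _on_order_placed(self, event):
        # Extract only the elements relevant to reporting.
        self.data_mart.append({"customer": event["customer"],
                               "value": event["value"]})
```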

Note that this BI service does not constitute centralisation of data in the context of my previous post on Centralised vs. Decentralised Data. Although we have a large part of enterprise data stored in a single location, that data is also still duplicated in all the various business services throughout the enterprise. As such, data has in fact become more decentralised.

There is also some debate as to whether a BI service would constitute a service in the strict sense of the term in the context of SOA. A service should have both behaviour and data. A BI service that has only a reporting capability would probably be considered a shared component of all services whose data is duplicated within the data warehouse.

In this regard it is similar to an enterprise portal (where portlets can be exposed in a central portal although actually being hosted in other portals elsewhere in the enterprise). The central enterprise portal is just an aggregation of portlets in a central location and doesn't itself constitute a service. I'll be doing a series of posts on defining services and service boundaries next, so stay tuned!

Wednesday, February 20, 2008

CRUD is Bad

We keep hearing that CRUD interfaces (that is interfaces with create/read/update/delete operations) are considered bad practice (an anti-pattern) in the context of SOA. But why? And how do we go about designing our service interfaces to avoid CRUD?

Let's start with the why. Services have both data and business logic. For reasons of encapsulation and loose coupling between services, we want to keep our business logic near the data upon which it operates. If our service contract permits direct manipulation of the data held within the service, this means that the business logic can leak outside the service boundary. It also means that the business logic inside the service boundary can be bypassed by direct manipulation of the service's data. All bad.

The same holds true in traditional OOP: other classes cannot directly manipulate the state of an object. State changes are made by passing messages to (calling methods on) that object. This helps enforce loose coupling and high cohesion.

Even more compelling is the issue of updating multiple entities as part of a single logical atomic operation. CRUD interfaces usually will have create/read/update/delete operations for each entity housed by the service. But what if you want to update two different entities where either both updates succeed or neither?

You have the following options:

  • Use a distributed transaction
  • Implement compensation logic to handle failures yourself
  • Create new create/update/delete operations for specific combinations of entities

None of these options are satisfactory. The first option may not even be possible if the service stack doesn't support distributed transactions (e.g. ASMX, WSE). And even if it does support transactions, cross-service transactions are incredibly bad practice because services have the ability to lock each other's records, which severely hurts service autonomy.

The second option is certainly not an easy task to do properly, and takes a lot of additional effort. And the third option isn't really practical. There are too many combinations of entities that may need to be updated in a single transaction, and it would take a lot of additional effort to implement them all.

Lastly, if a service must go to other services to pick up the data it is going to operate on, this means synchronous request/reply message exchanges between services. These are bad news because they are really slow and introduce temporal coupling between services (the service with the data must be available at the time the service without the data needs it).

So hopefully this is enough to convince you that CRUD is bad. But how do we design our services to avoid CRUD? Well, firstly we decentralise our data! This way all the data a service needs to operate on as part of a single logical operation is held locally within the service. Secondly, we make our service operations task centric, rather than data centric. The operations should be more like "make reservation" and "cancel reservation" rather than "retrieve reservation" and "update reservation". Udi Dahan has recently made a couple of posts discussing this very point.
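As a rough sketch of the difference (Python, names hypothetical), a task-centric contract keeps the business rules inside the service, right next to the locally held data, rather than exposing raw state for callers to manipulate:

```python
class ReservationService:
    """Task-centric contract: operations express business intent, so the
    business rules stay with the data inside the service boundary."""
    def __init__(self):
        self._reservations = {}  # data held locally (decentralised)
        self._next_id = 1

    def make_reservation(self, guest: str, room: int) -> int:
        # Business rule enforced here, not by the caller: a room can
        # hold only one active reservation at a time.
        if any(r["room"] == room and r["active"]
               for r in self._reservations.values()):
            raise ValueError(f"room {room} is already reserved")
        rid = self._next_id
        self._next_id += 1
        self._reservations[rid] = {"guest": guest, "room": room,
                                   "active": True}
        return rid

    def cancel_reservation(self, reservation_id: int) -> None:
        self._reservations[reservation_id]["active"] = False
```

With a CRUD-style update operation, any caller could flip a reservation's state and bypass the double-booking rule entirely; with the task-centric operation, the rule cannot be bypassed.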

A final point I'll make on this is that CRUD operations are fine inside the service boundary, so for example what you might see between a smart client and the service back end. But this point will be discussed in more detail in future posts. Stay tuned!

Tuesday, February 19, 2008

Designing for the Worst Case Scenario

Nick Randolph posted an interesting entry on The Right and Wrong way to Centralise Services. Firstly, I'd like to clarify the use of the word "Services". The referenced article is referring to the centralisation of IT systems and data so that regional health care units can share data efficiently. I don't believe Nick is referring to "Services" in the context of SOA.

Nick makes a very good point regarding assumptions that architects make when designing systems. It is the responsibility of an architect to design a system so that it will meet all its service level agreements. These SLAs must be met by the system even when things go wrong. It is of course important to determine that the SLAs are realistic in the first place.

Too many times have I seen systems designed for the best case scenario. I tend to find myself asking questions like, "what if this server crashes or is rebooted?". A common response to these questions is "that will never happen," closely followed by "why are you over complicating this?".

The complication comes from the nature of the requirements themselves. Systems that handle unreliable connections are more complex by nature. Burying our heads in the sand and hoping for the best will not deliver a system that does the job. Architects really must start thinking more about designing for the worst case scenario, such that the minimum SLAs are met.

To finish, I'd just like to thank Nick for the warm welcome to the blogosphere. Nick works for Intilecta and is very knowledgeable in the area of mobile applications. You can find out more about Nick's travels here.

SOA Design Patterns and Best Practice

Join me as I present to the Perth .NET Community of Practice on SOA design patterns and best practice. The session will try to cut through much of the hype surrounding SOA and deliver some clear and practical guidance on how to design and build services on the Microsoft platform. Details below:

DATE: April 3, 5:30pm
VENUE: Excom, Level 2, 23 Barrack Street, Perth
COST: Free. All Welcome

Monday, February 18, 2008

SOA and the "Universal Truth"

I was having a discussion with an experienced EA from another organisation the other day. His view was that a decentralised data architecture could never work because there would always be a certain amount of entropy in the data. That is, there would always be delays as the data is published between the interested services and as such there would not be one universal version of the truth.

In his opinion, in the centralised world you could point management at the data in a given service and know the whole truth.

This is however in my opinion merely an illusion of the truth. In order to get at "the truth", we must first define what the truth is. The truth is a matter of business definition. IT systems support a business - it is not the other way around. Truth begins in the real world as a business concept, and eventually ends up being recorded in a database somewhere. The event did not occur when the database was updated, it happened beforehand.

As such, with both approaches you will not be able to look to a single database for the universal truth. There will always be delays between the "truth" and the database. That is, you will always have entropy.

Also worth noting is that information in organisations has natural entropy. This is due to the fact that no one person can know everything at once. A customer first arrives via the sales department and eventually ends up at billing. The view each department has of the customer at various stages during the sales process is quite different. Also, the information does not travel instantly between the people in each department.

By mirroring this behaviour with our services (by decentralising our data), then we have a better chance at getting at the truth as the business would define it.

So although in theory there is a "universal truth" to the universe (except at the quantum level perhaps), in reality it is something we cannot practically attain; not due to the selection of a decentralised as opposed to centralised approach, but due to human limitations.

Friday, February 15, 2008

The Data Debate Continues...

A reader has just left a number of very good questions regarding my last post. The answers may interest everyone, so I thought I'd address them in a follow-up post. I'll tackle each question in turn:

With a decentralised approach would you double up on data around the place?

Indeed this is the case. The data is not exactly doubled up however. Each service holds a different representation of the data, although there will be some elements in common. This serves our interests very well because we are able to optimise the representation for the specific business domain handled by each service.

For example, a customer as seen by sales involves opportunities, volumes, revenue, etc. According to billing the customer involves credit history, risk, billing history, etc. Some elements such as name and address however will be in common.
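As a trivial sketch (Python, with hypothetical fields), the two representations might look like this: the same real-world customer, but a different service-local shape in each, with only elements like name and address in common:

```python
from dataclasses import dataclass, field

@dataclass
class SalesCustomer:
    """Customer as represented inside the Sales service."""
    name: str
    address: str
    opportunities: list = field(default_factory=list)
    projected_revenue: float = 0.0

@dataclass
class BillingCustomer:
    """Customer as represented inside the Billing service."""
    name: str
    address: str
    credit_rating: str = "unrated"
    outstanding_balance: float = 0.0
```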

How do you keep data shared between services in sync?

Whenever an event occurs in a service that updates data in that service, an event is published onto the service bus. All other services that are interested in that event are subscribed and as such receive a notification. Each subscribed service then makes a local decision as to how to react to that notification. This may involve saving a new version of the updated data, but only the elements that the subscribing service is interested in.

When a service updates a bit of shared data does it broadcast to the world it has done this? Or do other services subscribe to that service so it has a list of services to inform of updates?

Only services subscribed to the topic on which that event is published will receive a notification.

How does it work if two or more services try to update the same thing at the same time? Broadcast some kind of lock event to all other interested services?

This question is really a business issue rather than a technology one. Firstly, this can only happen if you have two separate users using two separate systems that update the same data at the same time. The business may not require or even desire this. Usually, specific information is owned by specific groups of people within an organisation and only they should have authority to make changes to it.

However, if the business requires multiple independent users to be able to update the same data from different services, a business decision must be made as to what should occur. The easiest solution is first in, best dressed. Another is to make one service the authoritative source, so that if there is a conflict that service always wins. Yet another is to email the details of the conflict to someone who can resolve it manually.

In order to achieve this, we need to be able to detect conflicts in the first place. One way of doing this is to place an integer version number on the entity within each service. When any one service performs an update, it places its current version number in the notification and then increments the number locally. Subscribing services apply the update in their respective databases only if the version number in the notification matches the one in their database. If it doesn't match, you know there is a conflict. This is essentially an optimistic locking strategy.
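Here's a minimal sketch of the subscribing side of that versioning scheme (Python, hypothetical names; the conflict-resolution policy itself is left out, as it's a business decision):

```python
class SubscribingService:
    """Applies an update notification only if its version number matches
    the locally held version: an optimistic concurrency check."""
    def __init__(self):
        # Local representation of a shared entity, with its version number.
        self._store = {"customer-1": {"version": 1, "address": "1 Old St"}}

    def on_update_notification(self, entity_id, notified_version, new_data):
        local = self._store[entity_id]
        if notified_version != local["version"]:
            # Versions have diverged: flag the conflict for resolution
            # (authoritative source wins, notify a human, etc.).
            return "conflict"
        local.update(new_data)
        local["version"] += 1
        return "applied"
```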

We don't want to use pessimistic locking between services. One service should never be able to lock the records of another. It would hurt performance too much and introduce too much coupling.

Is that less chatty than having it centralised?

Indeed. With a decentralised approach, all data needed by a service to service a request is held locally. This means no requests to other services to pick up data. Secondly, all data updates are done locally, meaning no requests to multiple other services to update data. The only message we really need to send out is a notification message once our local operation is complete. Yes, this message is sent to multiple subscribers, but to the publishing service, it is only one message. It is the messaging infrastructure that handles routing it to each subscriber.

Centralised vs. Decentralised Data (continued...)

I had a comment from John on my last post, which I'd like to take the opportunity to discuss.

John suggests that the centralised approach is just plain easier to design, implement and most importantly, think about. I can't say I agree. Once making the shift to thinking in an event driven paradigm, the decentralised approach is actually very natural and as such is easier to design, implement and think about.

The decentralised approach produces an architecture with considerably less coupling. As a result of this, each service is easier to design and implement as you are only really concerned with one service at a time. Moreover, the architecture is less sensitive to change, so the overall system becomes easier to implement as changes are more localised to each service.

John also questions whether the customer really cares about and is prepared to pay for the advantages that decentralisation offers. In my experience, the decentralised approach delivers a better solution faster, so it actually turns out to be the cheaper alternative. One reason for this is you have considerably fewer message exchanges to worry about and implement.

Moreover, when looking at the cost of a system over its entire lifetime, the development cost pales in comparison to the maintenance cost. Due to the looser coupling offered by the decentralised approach, maintenance costs are substantially lower.

And finally, I believe that the performance and reliability problems of the centralised approach outlined in my last post are deal breakers. Will a customer be satisfied with a solution where rebooting a single system takes down all other dependent systems?

So I would suggest that yes, the customer will in fact care about the approach taken, as the decentralised approach delivers a system that is cheaper to implement, easier to change and easier to operate.

Thursday, February 14, 2008

Centralised vs. Decentralised Data

I tend to find during my travels that there are two main approaches to SOA design. In one corner we have the people who favour a decentralised data architecture, and in the other corner, we have the people who favour a centralised data architecture.

In the decentralised corner, we have data redundancy between the various services. For instance, customer data may exist in some form or another in every service in your enterprise. When a change occurs in the customer information in one service, an event is published and the services interested in that change receive a notification. They then in turn update their local representations.

In the centralised corner, there is only one centralised source of any given piece of information. Any service that needs that information at any time makes a request to that service to retrieve the information.

The centralised approach has the following disadvantages:

  • It produces far more chatty message exchanges because every service needs to go to other services to pick up the data needed to service a given request.
  • Data must be updated transactionally across multiple services. We never want transactions to span services as it adds too much coupling. Records may be locked in one service for extended periods of time whilst waiting for other services.
  • Synchronous request/reply (which is what is used to pick up the data from the other services) is really slow. You can chew up a lot of threads waiting for responses from other services.
  • If one of these services needs to be rebooted, every service that depends on it for retrieving data will keel over.
  • One data representation must service the needs of all systems in the enterprise, which considerably increases complexity.
  • The risk of changing that representation could impact all systems in the enterprise, thus meaning considerably greater testing efforts.

Due to these shortcomings, and the success I have had with the decentralised approach, I sit squarely in the decentralised corner. However, I still tend to find that most people I come across sit in the centralised corner. Are there some advantages to this approach I am missing? Do we need more readily available guidance in this space?

Wednesday, February 13, 2008

SaaS service vs. SOA service

"Service" is a horribly overloaded term that means many different things to many different people in many different contexts. No wonder there is so much confusion over the term Service Oriented Architecture!

One instance of this confusion manifested itself in a discussion I had with a colleague regarding whether there was any specific interplay and blurring of the lines between SaaS and SOA. Both terms include our favourite word... service. And in fact, many SaaS providers expose Web service (there's that word again) interfaces as integration points, which has led some people to believe that using a SaaS application delivers you an SOA.

To clarify, "service" in the context of SaaS refers to a software delivery model, as opposed to an architectural entity. In the context of SOA, a SaaS application is architecturally identical to any third party application that is hosted in house. The integration capabilities are often similar. Most third party business applications expose Web service interfaces, as do most SaaS applications.

Such an application (SaaS or otherwise) could potentially behave as all or part of a service in the context of an SOA. Even so, you would want to wrap the application's Web service interface with an integration layer that provided richer messaging capabilities (such as guaranteed once-only message delivery and publish/subscribe) to the rest of your enterprise.

So in conclusion, you can utilise SaaS applications in your enterprise and have no SOA, you can have an SOA with no SaaS applications, or you can have both. They really are orthogonal concerns, with SOA being an architectural style, and SaaS being a software delivery model.

Hello World

Welcome to my little corner of the world. My name is Bill Poole and I hail from Perth, Western Australia. I am a Senior Consultant with Change Corporation, primarily consulting in Solution Architecture. My professional interests include SOA, systems integration, large scale application development, as well as design patterns and best practice.

The title of my blog actually originated from one of my mentors from way back. She used to refer to any productive design discussion as a process of "creative abrasion". If there wasn't any friction, then the process wasn't working. So I'm hoping to share some opinions with you and open some discussions that get that creative abrasion going!