Monday, March 31, 2008

Can DNS be my Message Router?

In my post on Guerilla SOA, a reader asked whether we could use DNS as our means of routing messages between endpoints. The reader's point was that we could define a different host name for each endpoint, and then if we needed to physically move an endpoint we could simply update the DNS entry and messages would still be routed correctly.

Although this point is valid, DNS doesn't give us the power and granularity of routing we require in an SOA environment.

Firstly, DNS will only help us route a message to a physical machine. An endpoint address contains more than just the host name. For example:

net.msmq://policy-service.myorg.com/private/policy_requests

Here we have the transport being specified (net.msmq), as well as the queue name (/private/policy_requests). If we were to change either of these details, updating our DNS record would be insufficient for our service consumers to continue communicating with our service.
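To make the split concrete, here is a minimal sketch in Python (chosen for brevity; it is not part of any messaging stack) that breaks the example address into its parts. Only the host name is something a DNS record can redirect.

```python
from urllib.parse import urlsplit

# The endpoint address from the example above, split into its parts.
address = "net.msmq://policy-service.myorg.com/private/policy_requests"
parts = urlsplit(address)

transport = parts.scheme   # chosen by the endpoint binding, invisible to DNS
host = parts.hostname      # the only part a DNS record can redirect
queue = parts.path         # interpreted by the transport, invisible to DNS

print(transport, host, queue)
```

Changing the transport or the queue name changes the first or third value, and no DNS update can hide that from a consumer holding the old address.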

Moreover, with this approach we still have the same coupling discussed in my last post. If we restructure our service contract such that our messages now need to be routed to different endpoints, we would need to update our service consumers.

So although DNS does abstract physical host addresses, we get better decoupling from a more powerful routing mechanism.

Saturday, March 29, 2008

Content Based Routing

When implementing the business logic within each of our services, it is important to get the communication infrastructure abstraction correct such that we don't have a leaking of communication concerns into our business logic.

We need to define which concerns relate to the transport of a message from its source to its destination, and which relate to the business logic of the service.

One concern which should be handled by our communication infrastructure that we often find embedded in our business logic is addressing. It is not uncommon to see implementations where the service consumer creates a proxy as a means of communicating with the service:

MyEndpointProxy proxy = new MyEndpointProxy();
DoSomethingRequest request = new DoSomethingRequest();
proxy.DoSomething( request );

Although the endpoint address in the above example is not specified in the code (let us assume it is specified in an external configuration file), we have still bound ourselves to the assumption that MyEndpoint is the logical endpoint that should receive the DoSomethingRequest message.

Although this fact is exposed in the contract of the service producer, the problem is that the service consumer must choose the logical endpoint to receive the message. As such, the structure of the service producer is now influencing the implementation of the consumer. If we were to update the structure of the producer, we would have to update its consumers. This is coupling we would like to avoid.

The solution here is to push this addressing decision down to the communication infrastructure. The business logic of the application provides only the message and mode of communication (send vs. publish). The infrastructure then determines, based on the content of the message, which endpoint(s) should receive it.

Content based routing is generally driven by a rules engine. XPath expressions evaluated against the message are a common way of determining its destination from its content.

Now if we update the structure of our service producer, all we need to do is update the routing rules and everything continues to work. We do not need to update the implementations of any service consumers.
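As a rough illustration, the sketch below shows what such a routing table might look like, written in Python using the limited XPath support in the standard library. The message names, the productLine element and the endpoint addresses are all hypothetical, and a real service bus would hold these rules in configuration rather than in code.

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: (XPath-style rule, endpoint address).
ROUTING_RULES = [
    ("./PolicyRequest[productLine='motor']",
     "net.msmq://motor.myorg.example/private/policy_requests"),
    ("./PolicyRequest[productLine='home']",
     "net.msmq://home.myorg.example/private/policy_requests"),
    ("./DoSomethingRequest",
     "net.msmq://my-endpoint.myorg.example/private/requests"),
]

def route(message_xml, rules=ROUTING_RULES):
    """Return every endpoint address whose rule matches the message content."""
    # Wrap the message so a rule can match the root element as well as its children.
    wrapper = ET.Element("envelope")
    wrapper.append(ET.fromstring(message_xml))
    return [address for rule, address in rules if wrapper.find(rule) is not None]

motor_request = "<PolicyRequest><productLine>motor</productLine></PolicyRequest>"
print(route(motor_request))
```

The consumer only hands the message to the infrastructure. If the producer is restructured, only ROUTING_RULES changes.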

Thursday, March 27, 2008

Decomposing Services

In my last post, I discussed how a service exposes one or more endpoints. But how many endpoints should a service expose? What is the rationale for exposing more than one endpoint?

The purpose of a service endpoint is to provide access to functionality exposed by the service. The functionality exposed is determined by the syntax and semantics of the messages exchanged via that endpoint.

The binding/policy of the endpoint determines the service level agreements (SLAs) (such as messaging reliability and performance) upheld by the service, as well as the demands made of the consumer in order to successfully communicate with the service (such as message security).

A service may be decomposed into multiple autonomous components. This is not a component in the traditional sense. That is, it is not a single .NET assembly or Java package. It is an entire horizontal partition of a service. An autonomous component comprises an endpoint, business logic and data. To the consumers of a service, it is assumed that no data is shared between the autonomous components of a service.

Being part of the same service, autonomous components may share the same domain model, and consequently may share the same database schema. That being said, they may or may not in fact share the same physical database. If they use different physical databases, those databases may or may not share the same schema.

So for example, we may have one autonomous component responsible for dealing with standard customers, and another responsible for dealing with VIP customers. Although the functionality exposed by each component may in fact be identical, by separating the service into two autonomous components we gain the ability to provide different SLAs for service consumers dealing with VIP and non-VIP customers. For example, we may use an unsecured HTTP transport for the non-VIP customers, and a secured MSMQ transport for the VIP customers.

Note that the VIP and non-VIP customer data, although sharing the same structure and business rules, are not in fact the same physical data. If the two autonomous components were to share the same database, we would still be dealing with separate database records for the VIP and non-VIP customers.

As another example, we may decide to have separate autonomous components for different product lines supported by a service. In this case, each component would likely have a different domain model (and database schema); although there may be some structural aspects in common.

For instance if we were dealing with different insurance product lines, we may have information common to every policy across all product lines in a table whose structure was common to every autonomous component; while maintaining the flexibility to have independent structures for each product line in tables whose schemas are not shared between autonomous components.

In conclusion, there is no value in arbitrarily exposing multiple endpoints for a service. But rather we should define multiple endpoints where there is a need for decomposing a service into autonomous components, assigning each autonomous component its own endpoint.

Wednesday, March 26, 2008

What is an Endpoint?

For a while now I've been describing the various ins and outs of services using the term endpoint, assuming that the term was well understood by my readers. Although this may be the case for many readers out there, there are many subtleties around the definition of an endpoint that are worth discussing.

Services communicate with each other by way of exchanging messages. Messages travel between services via a channel. An endpoint either sends messages onto or receives messages from a channel. All messages arrive at or leave a service via an endpoint.

Note that although a channel has two or more endpoints, it only has one address. The address is controlled by the service that defines/owns/controls the channel. But with potentially many services communicating over a given channel, who owns the channel?

Well that depends on the type of channel. In the case of a point-to-point channel (one service sending a message to one other service), the receiving party controls the channel. In the case of a publish-subscribe channel, the publishing party controls the channel.

We define the endpoints of a service to be only the endpoints of channels controlled by that service. That is, only the endpoints attached to channels owned by a service are described in that service's contract.

A service endpoint is defined by three components:

  • The schema of the messages that pass through the endpoint
  • The binding and policy (which describe the channel - that is, the means by which messages are transported)
  • The address of the channel used by the endpoint to receive or publish messages

Note that each combination of schema, binding/policy and address determines a unique endpoint. So if a service can receive the same messages via HTTP and MSMQ, then we have two distinct endpoints - one for each transport. Alternatively we could have two different endpoints on the same transport accepting different messages.
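A minimal sketch of this definition, in Python with hypothetical schema names and addresses, models an endpoint as an immutable triple. Two endpoints are the same only when all three components match.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Endpoint:
    schema: str    # the messages that pass through the endpoint
    binding: str   # the transport and policy describing the channel
    address: str   # the address of the channel

# The same messages accepted over two transports: two distinct endpoints.
http_endpoint = Endpoint(
    "PolicyRequest.xsd", "http",
    "http://policy-service.myorg.example/policy")
msmq_endpoint = Endpoint(
    "PolicyRequest.xsd", "net.msmq",
    "net.msmq://policy-service.myorg.example/private/policy_requests")

endpoints = {http_endpoint, msmq_endpoint}
print(len(endpoints))
```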

One catch though is that a service contract may or may not contain service instancing information. That is, it may or may not contain the endpoint addresses. The argument here is that a service shouldn't really care where it is deployed, so that information shouldn't really appear in its contract. We should be able to move a service without updating its contract.

In scenarios where a service registry is being employed, the service registry will contain the service instancing information and the service contracts will contain only the schema, binding and policy metadata.

Tuesday, March 25, 2008

Services are Autonomous, but Why?

In my post on synchronous request/reply, I mentioned that one tenet of SOA is that services are autonomous. But what does that mean? And why is it important?

The four tenets of SOA were originally posited by Don Box, in his article A Guide to Developing and Running Connected Systems with Indigo.

The other tenets may be a topic for another post. The services are autonomous tenet was intended by Don Box to mean that services are deployed and versioned independently from each other.

Some others, such as Udi Dahan, and I extend this constraint to also stipulate that services should be able to function autonomously at run time. That is, they should as far as possible be able to function without influence from other services. That is not to say that services live in a vacuum. Services must of course be able to interact with other services.

The nature of these interactions however should be such that no service should rely on the immediate availability of another service in order to handle a message. That is, we want to eliminate temporal coupling between services by limiting ourselves to asynchronous communication between services.

The more autonomous our services are, the more loosely coupled they are. And loose coupling is what gives us business agility, which is the key business driver for SOA.

Without this kind of autonomy between services, one service failing could cause a domino effect causing many other services across the enterprise to fail. In my opinion this degree of coupling is unacceptable.

Saturday, March 22, 2008

RPC is Bad

In my last post I illustrated how we can achieve RPC style service interactions in an interoperable way using the wrapped document/literal style. However just because we can do something doesn't mean we should.

When using the document/literal style of messaging, RPC really becomes a matter of intent rather than message syntax. As was illustrated in my last post, we had both a message oriented syntax and RPC syntax with the same message on the wire.

With RPC, our intent is to invoke a method on the service producer with a given set of parameters. This tends to encourage exposing internal object methods as service operations, leading to poor encapsulation. The number, name and type of method parameters are an implementation detail of the service which should be encapsulated behind the service boundary. With RPC, they end up in the service contract.

This then leads to poor service contract design. Deriving the service contract from method signatures is dangerous. The service contract and the messages it defines are first class citizens in SOA and need to be explicitly designed and highly visible.

Semantically, we need to be thinking about sending or publishing messages that have business-level relevance. When we are thinking in terms of RPC, we are focussed more on service implementation details than the message.

Also, RPC style interactions tend to lead to fine grained communications. That is, we end up with many small messages rather than fewer larger messages. This is because developers naturally wish to limit the number of parameters in a method signature. The extra round trips then hurt performance.

The RPC approach is also not very version tolerant. If we add or remove a parameter from our method signature, old versions of the message will not be able to be processed by newer versions of the service. This means we have fewer options for updating service producers without also updating all their consumers at the same time. This means more coupling between our services.
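The contrast can be sketched as follows, in Python, with a hypothetical priority parameter added in a later version of the service. The message-style handler shown is only one way a message centric service might tolerate older messages; it is an assumption for illustration, not something a particular stack does for you.

```python
def do_something_v2(id, value, priority):
    """RPC style: the wire format mirrors this signature exactly."""
    return {"id": id, "value": value, "priority": priority}

def handle_do_something(message):
    """Message style: read what is needed and default what is missing."""
    return {
        "id": message["id"],
        "value": message.get("value", ""),
        # Hypothetical field added in version 2; older messages omit it.
        "priority": message.get("priority", "normal"),
    }

old_message = {"id": 1234, "value": "Hello World!"}

try:
    do_something_v2(**old_message)
except TypeError as error:
    print("RPC style rejected the old message:", error)

print(handle_do_something(old_message))
```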

Furthermore, RPC tends to lead developers down the path of returning values from the methods being invoked on the service. This leads to synchronous request/reply, which is a bad idea for reasons explained in my posts on synchronous request/reply.

And finally, RPC style interactions obscure the fact that a message is being sent over a network boundary by providing an interface at the service consumer that looks local. One of the tenets of SOA is that boundaries are explicit. Due to the fallacies of distributed computing, we cannot hide the network boundary from service developers. Distributing objects in a way that makes them look local (as was done by DCOM and .NET Remoting) has led to a number of failed implementations.

So in conclusion, avoid RPC style interactions between services in favour of a message centric approach.

Wednesday, March 19, 2008

RPC over Document/Literal

In my last post, I discussed how document/literal is the recommended style to use when implementing services. One final twist to make the whole thing even more confusing is that there are two different styles of document/literal - wrapped and bare.

The difference between these two parameter styles though is really just an implementation detail at the service producer and consumer. The WSDL and message schema are exactly the same in both cases.

The difference between the two is how the message contract is conceived. With the bare style, the message is a first class citizen in both the service producer and consumer. The message has distinct representation in the code at both ends.

With the wrapped style, the code at both ends takes the form of a method and parameters. The message schema is derived from the method signature. As such, the message does not appear at all in the code.

Since the WSDL remains the same with both options, you could have different parameter styles at each end and the producer and consumer will still work together. You will just have the operation at one end taking the form of an explicit message and a method call with parameters at the other end.

So let's say the service consumer is using the bare parameter style. We might have:

DoSomethingRequest request = new DoSomethingRequest();
request.id = 1234;
request.value = "Hello World!";
proxy.DoSomething( request );

And at the producer using the wrapped parameter style we might have:

void DoSomething( int id, string value )
{
}

Just like with RPC/literal, wrapped document/literal defines a convention for packaging the method name and parameter values into the message. The difference is that the method and parameter names are explicitly defined in the service contract.

We take the name of the method we are invoking as the outer XML element, and then create inner XML elements for each parameter.

For example, the request message for the above example would be:

<DoSomething>
    <id>1234</id>
    <value>Hello World!</value>
</DoSomething>
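The convention is mechanical enough to sketch in a few lines. The Python below is an illustration only: the wrap_call helper is hypothetical, and a real SOAP stack would also add namespaces and the SOAP envelope.

```python
import xml.etree.ElementTree as ET

def wrap_call(method_name, **parameters):
    """Package a method name and its parameters using the wrapped convention."""
    root = ET.Element(method_name)            # outer element: the method name
    for name, value in parameters.items():    # one inner element per parameter
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

print(wrap_call("DoSomething", id=1234, value="Hello World!"))
```

Apart from whitespace, the result is the same message a consumer using the bare style would produce by serialising its explicit DoSomethingRequest object.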

In my next post, I'll cover why RPC style interactions (that is deriving the message schema from the method signature) are a bad idea.

Tuesday, March 18, 2008

WSDL Styles (continued...)

Just in case my explanation given in my last post wasn't clear on the difference between document and RPC styles, I thought I'd give a little bit more detail.

The document style stipulates that the SOAP body XML element contains one or more child elements (which the WSDL specification calls parts). Note that WS-I Basic Profile 1.0 stipulates that only a single part be specified for each message with the document style. There are no explicit formatting rules for these parts. The only requirement is that the service producer and consumer both agree on the representation.

The RPC style stipulates that the SOAP body XML element contains a single element with the name of the method being invoked. This element in turn contains an element for each method parameter with the element name matching the parameter name. Each WSDL part describes a single parameter.

The RPC style is simply more prescriptive than the document style. For every RPC/literal WSDL description, an equivalent document/literal WSDL description can be created such that the messages are identical on the wire.

Moreover with the RPC/literal style, the XML elements in the SOAP body corresponding to the method name and parameter names are known only by convention. The XSD schemas included in the WSDL document define only the structure of the parameters, not the elements binding them all together.

With document/literal assuming we conform to WS-I Basic Profile, we have a single XSD document describing the entire contents of the SOAP body. This is a lot simpler. So to reiterate, we only want to use the document/literal style.

Monday, March 17, 2008

WSDL Styles

As I mentioned in my last post, the SOAP specification was originally intended for performing RPC in a platform neutral way. In fact, SOAP used to stand for Simple Object Access Protocol. However since SOAP is now commonly used for message oriented interactions, the SOAP 1.2 specification stipulates that SOAP is no longer an acronym at all. But I digress...

The WSDL specification defines two different styles for interacting with a service, document and RPC. The RPC style is designed for invocation of methods on remote objects, with messages containing parameters and return values. The document style is message oriented, with messages containing whole (usually XML) documents, rather than explicit parameters.

Orthogonal to these styles are two different encoding styles, encoded and literal. The encoding styles stipulate how data are encoded on the wire (e.g. integers, strings, arrays, etc).

Before the arrival of the XSD specification, SOAP had to provide its own specification for encoding data on the wire. Once XSD arrived, service implementers had a choice as to which style they preferred.

SOAP encoding rules are generally better at handling object serialisation than XSD as they are capable of encoding more complex data graphs (i.e. those that contain cyclic references) and handling polymorphism. However, implementations of the SOAP encoding rules don't tend to be universally interoperable, unlike XSD implementations. As a result, WS-I Basic Profile 1.0 says to prefer the use of literal encoding.

Because the SOAP encoding rules are better for serialising objects, the encoded style is generally used with the RPC operation style. As XSD is better for serialising explicit messages, the literal encoding style is generally used with the document operation style.

Most SOAP stacks in fact support only these two combinations. You don't tend to see the document/encoded and rpc/literal combinations in the field that much.

So that leaves us with document/literal and rpc/encoded. Due to the interoperability issues suffered by the encoded encoding style it is best practice only to use the document/literal style.

As a final twist, in my next post I'll discuss how we can perform RPC with the document/literal style so we can achieve RPC style interactions in an interoperable way. For reasons I will discuss later, however, RPC style interactions introduce more coupling between services than message centric interactions and as such should be avoided anyway.

Sunday, March 16, 2008

Web Services Defined

We have all heard of Web Services. In fact as architects, we very likely use the term almost every day. But what are Web Services really? The term is in and of itself very misleading. Web Services in fact do not necessarily have anything to do with the Web.

Based on the SOAP protocol, Web Services were designed primarily to be a means for invoking methods on remote objects (i.e. for doing RPC) in a platform neutral way over a transport (HTTP) that could easily traverse firewalls. As such, at the time the name made a lot of sense.

Unlike previous incarnations of the SOAP protocol, SOAP 1.2 is actually transport neutral, no longer tied to the HTTP protocol. As such, we now have Web Services accessible over proprietary transports such as MSMQ.

So if we no longer must use HTTP to access Web Services, why do we still call them Web Services? Good question. The name just stuck. So how do we define a Web Service today? The answer is that a Web Service is any service that communicates based on the SOAP protocol and as such can be described with a WSDL (Web Services Description Language) document.

Although we may access a Web Service over a non-Web protocol such as MSMQ, it is still considered a Web Service.

Friday, March 14, 2008

Asynchronous Request/Reply (continued...)

In my last post, I discussed how request/reply can be achieved asynchronously in place of the synchronous variety at the service boundary. When the sending service receives its response, it needs a way of restoring the state that was present at the time the request was sent.

In order to achieve this, we must put a correlation ID in the request and response messages. The requesting party stores the relevant information it needs to process the response in persistent storage referenced by the correlation ID, and then sends the request.

The responding party places the correlation ID it received in the request into the response message. When the sending party then receives the response, it can look up the relevant state in the database and restore the state that was present at the time the request was sent.
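The exchange can be sketched as below, in Python. The dictionary is a stand-in for persistent storage, and the message layout and function names are assumptions made for illustration.

```python
import uuid

pending_requests = {}  # stand-in for persistent storage, keyed by correlation ID

def send_request(payload, state, send):
    """Requester: persist the state, then send the request carrying its ID."""
    correlation_id = str(uuid.uuid4())
    pending_requests[correlation_id] = state
    send({"correlation_id": correlation_id, "body": payload})
    return correlation_id

def build_response(request, result):
    """Responder: copy the correlation ID from the request into the response."""
    return {"correlation_id": request["correlation_id"], "body": result}

def handle_response(response):
    """Requester: look up the saved state and process the response with it."""
    state = pending_requests.pop(response["correlation_id"])
    return state, response["body"]

outbox = []  # stand-in for the outgoing channel
correlation_id = send_request({"risk_profile": "profile-42"},
                              {"policy_number": "P-1"}, outbox.append)
response = build_response(outbox[0], "low risk")
print(handle_response(response))
```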

MS Workflow Foundation natively contains the ability to correlate workflow instances to external interactions. Leveraging libraries such as this behind the service boundary can certainly make our lives easier.

Workflow Foundation persists the state of the workflow instance in a persistent store when it goes to sleep awaiting another message so that if the server is rebooted, the workflow state is not lost.

Another approach to maintaining state between asynchronous requests and responses is to store the entire thread state in the request message header, rather than just a correlation ID. The receiving service then places the state back in the header of the response.

When the response is received, the service that sent the request can extract all the relevant state from the response message header.
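A minimal sketch of this alternative, again in Python with a hypothetical message layout: the requester serialises its state into the header, the responder echoes the header back untouched, and no lookup in a database is needed.

```python
import json

def build_stateful_request(body, state):
    """Requester: carry the state itself in the message header."""
    return {"header": {"state": json.dumps(state)}, "body": body}

def echo_header_response(request, result):
    """Responder: place the request header back on the response unchanged."""
    return {"header": dict(request["header"]), "body": result}

def restore_state_from_header(response):
    """Requester: recover the state directly from the response header."""
    return json.loads(response["header"]["state"]), response["body"]

request = build_stateful_request({"risk_profile": "profile-42"},
                                 {"policy_number": "P-1", "step": 3})
response = echo_header_response(request, "low risk")
print(restore_state_from_header(response))
```

The trade-off is larger messages, and state that travels outside the service and comes back from another party.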

Thursday, March 13, 2008

Asynchronous Request/Reply

In my previous post I concluded a discussion on synchronous request/reply and why it is bad news at the service boundary. But does that mean request/reply is finished altogether? Not at all.

There are many instances in which request/reply is the appropriate choice of message exchange pattern between services. Take for example a service that assesses risk. The requesting service provides a risk profile to be assessed, while the risk assessment service delivers the risk assessment response back to the requesting party.

So how do we deal with this requirement? We use asynchronous request/reply. Asynchronous request/reply is where the sending thread continues on about its business without awaiting the response, and when the response arrives it is handled by a separate thread.
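The shape of the interaction can be sketched with two in-process queues standing in for the messaging infrastructure. This is Python with hypothetical names; the joins at the end exist only so the demonstration finishes deterministically.

```python
import queue
import threading

request_queue = queue.Queue()   # stand-in for the channel to the risk service
reply_queue = queue.Queue()     # stand-in for the channel back to the requester
handled = []

def risk_service():
    """Hypothetical risk assessment service: take one request, send one reply."""
    profile = request_queue.get()
    reply_queue.put({"profile": profile, "assessment": "low"})

def reply_handler():
    """Runs on its own thread; the sending thread never waits for the reply."""
    handled.append(reply_queue.get())

service = threading.Thread(target=risk_service)
handler = threading.Thread(target=reply_handler)
service.start()
handler.start()

request_queue.put("profile-42")   # send the request and carry on immediately
# ... the sending thread is free to do other work here ...

service.join()
handler.join()
print(handled)
```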

The difficulty with this approach is that when the response is received, we have lost the state that was previously stored in all the local instance variables on the requesting thread. We need a way of restoring this state when the response is received in order to process the response.

I'll discuss some approaches for achieving this in my next post.

Wednesday, March 12, 2008

Synchronous Request/Reply is Bad (continued...)

In my last post, I demonstrated that synchronous request/reply is a bad design choice for performance reasons.

The other reason to avoid synchronous request/reply goes to reliability. A requesting service invoking another service with synchronous request/reply will likely fall over if the receiving service is unavailable for some reason. You could have a situation where when a critical service goes down, every other service goes down.

Yes, you could build compensating logic into your service to handle retries so a failed request doesn't cause the requesting service to fail. But that is a lot of work you would have to do yourself. When using an asynchronous approach, the messaging infrastructure will handle retries for you.

Moreover, asynchronous messaging infrastructure will provide transactional, guaranteed once-only message delivery. With synchronous request/reply you must also make sure your message is handled idempotently (meaning that if it is received multiple times, the repeats cause no additional side effects), because retries may result in the same message being delivered multiple times. Again, this is more work for you. All the while, the sending service has a thread tied up awaiting the response.
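As an indication of the extra work involved, here is one common way to make a handler idempotent, sketched in Python: remember the IDs of messages already handled and ignore repeats. The message layout and the in-memory set are assumptions; a real service would need a durable record updated in the same transaction as the business change.

```python
processed_ids = set()   # stand-in for a durable record of handled message IDs
account = {"balance": 0}

def handle_payment(message):
    """Apply the payment once, however many times the message is delivered."""
    if message["message_id"] in processed_ids:
        return False                      # duplicate delivery: no side effects
    account["balance"] += message["amount"]
    processed_ids.add(message["message_id"])
    return True

payment = {"message_id": "m-1", "amount": 50}
handle_payment(payment)
handle_payment(payment)   # a retry delivers the same message again
print(account["balance"])
```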

Note that synchronous request/reply is no sin inside the service boundary. It is quite acceptable and often necessary to leverage synchronous request/reply inside the service boundary, such as between the user interface and application tiers. Here, there is no domino effect that will bring down your entire enterprise. The effect of a system going down is confined to the service in which it occurs.

But between services if you require request/reply, use the asynchronous variety. I'll cover asynchronous request/reply in my next post.

Tuesday, March 11, 2008

Synchronous Request/Reply is Bad

In previous posts, I have stated that synchronous request/reply is a bad idea between services, but I think it warrants its own post to discuss.

Synchronous request/reply is a message exchange pattern in which a service sends a request message and the sending thread is then suspended until the response is received.

So why is synchronous request/reply so bad? One reason relates to performance. Consider a situation where you have many services communicating in a complex interaction.

The first service makes a request to the second, which then makes a request to the third, which then makes a request back to the first and so on. The thread that sent the original request is still waiting for its response.

When you have many requests and responses going over many network boundaries, the requesting thread will be waiting a long time. It may in fact wait forever. One tenet of SOA is that services are autonomous. There is no requirement for them to be available all the time.

While it is waiting, the requesting thread is tied up unable to be used for any other purpose. Effectively what results is a whole bunch of threads across a number of services sitting around waiting for responses.

The performance of your entire service ecosystem grinds to a halt. Meanwhile the CPU utilisation on each of the machines is nearly zero. No work is being done; everyone is just waiting for everyone else.

Obviously this result is unacceptable. I'll cover another key reason why synchronous request/reply is a bad idea between services in my next post.

Monday, March 10, 2008

Composite Applications and the Service Boundary

Like SOA, the term "composite application" means different things to different people. To me, a composite application is an application that composes two or more separate applications into a single integrated user experience. One such example is an enterprise portal. Another is a composite smart client application.

So what does it mean when we have a single composite application providing an integrated user experience spanning multiple services? Now we have the potential to share data between services without it being done via the public endpoints of the service. Bad composite application!

But is this really a problem? Well it depends on the design of the composite application. The purpose of our service boundaries and explicit service contracts is to provide loose coupling between services. We want to isolate change behind our service boundaries.

If we design our composite application in a modular fashion, having each module contain logic for one and only one service and we do not permit these modules to reference each other in any way, then we enforce loose coupling within the composite application and thus maintain loose coupling between our services. If one module cannot directly reference another, then a change in one module cannot by definition affect any other.

Any given module can communicate with private endpoints of the service to which it belongs, and any public endpoint of any other service. If we follow these simple guidelines we will not violate any service boundaries.

Friday, March 7, 2008

Guerilla SOA

Jim Webber has posted a very entertaining video presentation on Guerilla SOA. Although I do not entirely agree with his conclusions, it is well worth the watch.

One thing Jim fails to explain is how to achieve publish/subscribe with pure WS-* on the WCF stack. In order to achieve this message exchange pattern today, one needs some kind of service bus topology.

He also talks about implementing SOA based entirely on open standards (such as WS-RM), but doesn't consider the fact that in many cases we need transactional durable asynchronous messaging.

We want the sending of a message and its receipt to be transactional, but we want the sending to be decoupled from the receipt such that the receiver does not need to wait for the sending thread to commit its transaction before it can commit its transaction.

We simply cannot have a transaction spanning two services. This is going to couple our services unacceptably. We must leverage an asynchronous messaging infrastructure such as that provided by an ESB.

Something else Jim does not consider is the need for service virtualisation and message routing. We don't want each service to have to know the location of all other services with which it collaborates. We do not want to have to configure each endpoint independently. It is well accepted that point-to-point communications are an administrative nightmare. We want a decentralised distribution model with centralised configuration.

Where I do agree wholeheartedly is that ESB technology is particularly appropriate and useful for providing an EAI capability behind the service boundary.

I also agree that putting smarts in the network (in the ESB) is a very bad idea. With SOA, the service is the top level construct and all smarts must go behind the service boundary. Once you have smarts inside the network, you are really violating your service boundaries, bleeding functionality outside the boundary. There will be a separate post on this later down the road.

In my opinion, the ESB still delivers capabilities in the fabric between services not yet matched by stacks such as WCF. Not to say that we won't necessarily get there in the end.

Thursday, March 6, 2008

Services and User Interfaces

One of the common misconceptions regarding the definition of service boundaries is that user interfaces fall outside the service boundary. User interfaces form part of an application; they are never a separate application in their own right.

Consider a Web application that exposes some service endpoints. Some of these endpoints may be public (they sit on the service boundary) and some may be private (intended for use within the service boundary).

Regardless, any endpoint should be exposed only for the purpose of accessing functionality from a remote system (either inside or outside the service boundary). Endpoints should never be utilised by a component sitting on the same physical tier. Why go to the effort of creating a message to send to a local component? Why not just invoke the local component directly?

If the application is designed appropriately, there should be proper separation of concerns between the logic that receives the message, and that which performs the business logic. That is, you should have a message handler object that receives each message, and separate business logic components invoked by the message handler in order to achieve the desired business action.

This way, we can have the user interface invoke these local business logic components directly, without going to the effort of constructing message objects that are not actually going to be sent over a boundary of any kind.
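A minimal sketch of this separation, in Python with hypothetical class names and a made-up rating rule: the message handler only unpacks the message and delegates, so a user interface on the same tier can call the business component directly.

```python
class RiskAssessor:
    """Business logic component: knows nothing about messages or transports."""

    def assess(self, age, claims):
        # Made-up rule, purely for illustration.
        return "high" if claims > 2 or age < 21 else "low"

class RiskAssessmentMessageHandler:
    """Receives a message, unpacks it and delegates to the business logic."""

    def __init__(self, assessor):
        self.assessor = assessor

    def handle(self, message):
        rating = self.assessor.assess(message["age"], message["claims"])
        return {"rating": rating}

assessor = RiskAssessor()
handler = RiskAssessmentMessageHandler(assessor)

# A remote caller sends a message through the endpoint.
print(handler.handle({"age": 30, "claims": 0}))

# A user interface on the same tier skips the message and calls the logic directly.
print(assessor.assess(30, 0))
```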

Now, where we do need messages to be constructed and passed by the user interface to the application is where we have the user interface on a separate tier. One example of this is an AJAX application where the browser invokes Web services on the application tier. Another is a smart client application.

Note that the Web service endpoints powering the UI are private. They are not intended for use outside the service. Their existence is merely an implementation detail of the service. In fact in the case of the smart client application, we could change our approach to use .NET remoting, or DCOM if we so desired. In this case, there are no Web service endpoints exposed, but the application functionality remains identical.

So with any application, the user interface (and the people that use it) sit inside the service boundary.

Saturday, March 1, 2008

Are there People in my Service?

Too often in discussions around SOA the focus tends to be on the systems, rather than the people. Applying SOA in a business context involves aligning the services with the organisational structures and business processes they support. As such, people play a big part in service definition.

In truth, a service involves more than just the IT systems; it includes the people that support the function performed by the service (assuming the service is not completely automated, in which case there would be no people). This is because the people that interface with the service (via a UI) make various decisions that support the business logic of the service.

Usually we automate the structured decisions within the service with programming, and rely on people to make the unstructured decisions. People are an implementation detail of the service, encapsulated behind the service boundary.

To make my case I'll use an example. Say we have a service that performs risk assessment in an insurance company. Initially let's say the service is implemented such that when a risk assessment request arrives, it is routed to a worker where it sits on his or her work queue.

The worker eventually pulls the item off the work queue and performs the risk assessment against the risk assessment service via the UI. The system then sends a risk assessment response back to the requesting service.

Let's say then at some point, the business decides to automate this process. The worker is taken out of the loop, the risk assessment UI is thrown away, and a rules engine is put in its place. The service contract has not changed, so to the outside world, the service remains the same; but we have now removed the worker.

The worker is therefore an implementation detail and sits squarely inside the service boundary.