Monday, June 16, 2008

Reliable Messaging

Recently, we discussed the use of idempotent messages as a strategy for achieving reliable message delivery over an unreliable transport such as HTTP. The goal of idempotent messages is to eliminate any side effect from receipt of duplicate messages so that the sending party can retransmit messages for which no receipt acknowledgement was received without fear of the retransmitted message causing problems at the receiver.
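To make the idea concrete, here is a minimal sketch of a receiver that makes message handling idempotent by remembering which message IDs it has already processed. The class name `IdempotentReceiver`, the `"ack"` return value and the use of a sender-assigned ID are assumptions made for illustration; they are not taken from any particular messaging stack.

```python
class IdempotentReceiver:
    """Applies each message at most once, keyed by a sender-assigned ID (hypothetical design)."""

    def __init__(self, handler):
        self._handler = handler
        # In-memory only: a crash would forget which IDs were seen.
        self._seen = set()

    def receive(self, message_id, payload):
        # Always acknowledge, but only run the handler for IDs not seen before.
        if message_id not in self._seen:
            self._handler(payload)
            self._seen.add(message_id)
        return "ack"


applied = []
receiver = IdempotentReceiver(applied.append)
receiver.receive("msg-1", 100)
receiver.receive("msg-1", 100)  # retransmission of the same message, ignored
receiver.receive("msg-2", 50)
```

After these three calls the handler has run only twice, so the duplicate had no side effect at the receiver.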

The problem with relying upon idempotent messages, of course, is the effort involved in writing retransmission logic for every endpoint that sends messages over an unreliable channel. In some cases it can also take considerable effort to design and build systems that compensate for operations that are not naturally idempotent so that they become idempotent.
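As a rough illustration of the retransmission logic each sending endpoint would otherwise have to carry, the sketch below retries a send until an acknowledgement arrives or a configured number of attempts is exhausted. The function `send_with_retries`, its parameters and the simulated channel are all hypothetical; a real sender would also need timeouts and some form of back-off.

```python
import time


def send_with_retries(send, message, max_attempts=3, delay_seconds=0.0):
    """Calls send(message) until it returns True (acknowledged) or attempts run out.

    Returns the number of attempts used; raises RuntimeError if never acknowledged.
    """
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # wait before retransmitting
    raise RuntimeError(f"no acknowledgement after {max_attempts} attempts")


# Simulated unreliable channel: the first two transmissions go unacknowledged.
outcomes = iter([False, False, True])
attempts_used = send_with_retries(lambda msg: next(outcomes), {"id": "msg-1"})
```

Because the sender cannot tell a lost message from a lost acknowledgement, every retry here may produce a duplicate at the receiver, which is exactly why the messages need to be idempotent.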

As such, we want to leverage reliable transports where possible, to keep idempotence and retransmission concerns from leaking into our application logic. Reliable transports handle retries and eliminate duplicate messages for us as part of the communication infrastructure.

Furthermore, in situations where messages may arrive out of order (perhaps as a result of being routed by one or more intermediaries), reliable transports are capable of reordering messages so that they are delivered in the order in which they were sent.
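The reordering described above can be sketched as a small buffer that holds messages back until every earlier one has arrived. This is only an illustration of the technique, not how any particular transport implements it; the `Resequencer` class is hypothetical, and it assumes the sender numbers its messages consecutively starting at 1.

```python
class Resequencer:
    """Delivers messages in sequence-number order, dropping duplicates (hypothetical design)."""

    def __init__(self, deliver, first_number=1):
        self._deliver = deliver
        self._next = first_number  # next sequence number we are allowed to deliver
        self._pending = {}         # messages that arrived early, keyed by number

    def receive(self, number, payload):
        if number < self._next or number in self._pending:
            return  # duplicate: already delivered or already buffered
        self._pending[number] = payload
        # Release every message that is now contiguous with what was delivered.
        while self._next in self._pending:
            self._deliver(self._pending.pop(self._next))
            self._next += 1


delivered = []
resequencer = Resequencer(delivered.append)
for number, payload in [(2, "b"), (1, "a"), (2, "b"), (4, "d"), (3, "c")]:
    resequencer.receive(number, payload)
```

Message 2 is held until message 1 arrives, and message 4 is held until message 3 arrives, so the handler sees the payloads in the order they were sent.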

A problem with reliable transports, however, is that they tend to be platform specific (MSMQ, for example, is available only on the Windows platform). Fortunately, a standard reliable messaging specification has been defined: WS-ReliableMessaging (WS-RM), one of the WS-* group of specifications.

The catch, though, is that it is left up to the WS-RM implementer to decide which delivery assurances the WS-RM stack supports and enforces. For example, the number of attempts a sending party makes to send a message before giving up is a matter of configuration at the sender. What the messaging infrastructure does with messages that fail to be delivered is out of scope for the WS-RM specification.

Whether the receiving party holds out-of-order messages aside so that they can be dispatched to message handlers in order is a matter of how the receiving WS-RM stack is implemented and/or configured. It makes no difference to the messages that are transmitted over the wire.

The same applies to whether messages are placed in a durable store at the sender before being sent, or in a durable store at the receiver before being dispatched to the message handler. This makes sense if you think about it: there is no way that a service provider could enforce that its consumers store messages durably before forwarding them on to the provider.
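To show what storing messages durably at the sender might look like, here is a minimal sketch of an outbox backed by SQLite from the Python standard library. The `DurableOutbox` class, its table layout and its method names are assumptions made for illustration; WS-RM itself says nothing about how, or whether, such a store is implemented.

```python
import os
import sqlite3
import tempfile


class DurableOutbox:
    """Keeps unacknowledged messages on disk so they survive a restart (hypothetical design)."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self._db.commit()

    def enqueue(self, message_id, body):
        # Persist the message before any attempt is made to send it.
        with self._db:
            self._db.execute(
                "INSERT OR IGNORE INTO outbox (id, body) VALUES (?, ?)",
                (message_id, body),
            )

    def acknowledge(self, message_id):
        # Remove the message only once the receiver has acknowledged it.
        with self._db:
            self._db.execute("DELETE FROM outbox WHERE id = ?", (message_id,))

    def pending(self):
        rows = self._db.execute("SELECT id, body FROM outbox ORDER BY rowid")
        return rows.fetchall()

    def close(self):
        self._db.close()


path = os.path.join(tempfile.mkdtemp(), "outbox.db")
outbox = DurableOutbox(path)
outbox.enqueue("msg-1", "debit 100")
outbox.enqueue("msg-2", "credit 50")
outbox.acknowledge("msg-1")
outbox.close()  # stands in for the sending process stopping

recovered = DurableOutbox(path).pending()  # a new process reopening the store
```

The message that was never acknowledged is still there after the store is reopened, so it can be retransmitted. Nothing about this is visible on the wire, which is why the other party has no way of checking that it is being done.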

The best we can achieve is that the service provider and its consumers are able to make claims about delivery assurances. This is achieved with WS-Policy assertions. Although WS-Policy assertions have been defined for some delivery assurances, none have yet been defined to make claims about durable messaging.

So we need to be aware when using WS-RM that either endpoint may or may not be storing messages durably. This means that if a service provider or consumer process crashes, a message could potentially be lost.

Microsoft WCF does not support durable messaging with WS-RM at all. Durable messaging with WCF is achievable only by using the MSMQ transport. In my opinion, this severely limits the usefulness of WCF's WS-RM implementation.

Another limitation of WS-RM is that, being a relatively new specification, it is not yet supported by all SOAP stacks. Where it is supported, there are no guarantees as to which delivery assurances the interacting parties enforce.

That being said, where reliable messaging is required between services on disparate platforms and WS-RM is available, it certainly beats a raw HTTP transport.
