Once in a while services need to participate in transactions. A transaction makes sure that the business logic is either carried out completely, or canceled entirely if one of the services fails. Let's look at a classic example:
A bank application needs to transfer money from one account to another. Currently the bank system has two services:
- Deposit Service
- Withdrawal Service
Here is an illustration of the bank system:
|Service Transactions - a simple bank application using two services.|
In order to transfer money from one account to another, the bank application needs to call the withdrawal service and deposit service. For instance, first withdraw $100 from account A, and then deposit the $100 in account B.
If for some reason one of the two service calls fails, the bank system ends up in an inconsistent state. If the bank system calls the withdrawal service first and the deposit service call then fails, $100 will have been withdrawn without being deposited anywhere. The $100 is lost in cyberspace. If, on the other hand, the deposit service is called first and the withdrawal service call then fails, $100 will have been deposited without being withdrawn from any account. The $100 is created out of thin air.
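The failure described above can be reproduced with a small sketch. It is only an illustration: the in-memory `accounts` dictionary, the `withdraw`, `deposit` and `transfer` functions, and the `deposit_available` flag are hypothetical stand-ins for the two services, not part of any real bank system.

```python
class ServiceUnavailable(Exception):
    """Raised when a (simulated) service call fails."""


# Hypothetical in-memory state standing in for the bank system.
accounts = {"A": 500, "B": 200}


def withdraw(account, amount):
    # Stand-in for the withdrawal service.
    if accounts[account] < amount:
        raise ValueError("insufficient funds in account " + account)
    accounts[account] -= amount


def deposit(account, amount, available=True):
    # Stand-in for the deposit service; 'available' simulates an outage.
    if not available:
        raise ServiceUnavailable("deposit service is down")
    accounts[account] += amount


def transfer(source, target, amount, deposit_available=True):
    # Two independent calls with no transaction around them.
    withdraw(source, amount)
    deposit(target, amount, available=deposit_available)


total_before = sum(accounts.values())
try:
    transfer("A", "B", 100, deposit_available=False)
except ServiceUnavailable:
    pass  # The withdrawal has already happened and is not undone.
total_after = sum(accounts.values())
```

After the failed transfer the total amount of money in the system has dropped by $100, which is exactly the inconsistency the text describes.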
The solution to this problem is to have the withdrawal and deposit service calls be executed as a single, atomic action. This is what transactions are for.
Transactions as Atomic Actions
A transaction groups one or more actions together and makes sure they are executed as if they were a single action. If one of the actions fails, all of the actions fail. Only if all actions succeed will the result of the actions be "committed" (permanently stored) to the system.
Here is an illustration of the withdrawal and deposit service being grouped into a transaction:
|Service Transactions - two services grouped into a single transaction.|
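As a minimal sketch of the all-or-nothing idea, the code below performs every action on a private working copy and only writes the result back when all actions have succeeded. The `Transaction` class and the `transfer_atomically` function are hypothetical names made up for this example, and a plain dictionary stands in for the system's permanent storage.

```python
class Transaction:
    """Collects changes on a private copy until commit() is called."""

    def __init__(self, store):
        self.store = store
        self.working = dict(store)  # uncommitted working copy

    def withdraw(self, account, amount):
        if self.working[account] < amount:
            raise ValueError("insufficient funds in account " + account)
        self.working[account] -= amount

    def deposit(self, account, amount):
        # Raises KeyError if the account does not exist.
        self.working[account] += amount

    def commit(self):
        # Make the result of all actions visible in the store at once.
        self.store.update(self.working)


def transfer_atomically(store, source, target, amount):
    tx = Transaction(store)
    try:
        tx.withdraw(source, amount)
        tx.deposit(target, amount)
    except (ValueError, KeyError):
        return False  # working copy is discarded, store is untouched
    tx.commit()
    return True
```

If either action fails, the store never sees the partial result, so the withdrawal cannot "leak" without the matching deposit.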
Two Phase Commit Transactions
Services participating in a transaction need to be coordinated in order to ensure that either all or none of the services called "commit" their actions within that transaction. A popular way to coordinate such distributed transactions is the "Two Phase Commit" protocol.
The two phase commit protocol consists of three steps:
- Begin Transaction
- Prepare Phase
- Commit Phase
First, all participants are told to participate in a transaction. From this point on, all actions carried out by each participant that refer to this transaction must be carried out as a single action (all or none). The actions cannot be committed to the main system yet, but must be kept internally in the participant (service) until the participant is instructed to commit the actions.
Second, once all actions are executed successfully by all participants, all participants (e.g. services) are ordered to move to the "prepare phase". Once a participant is successfully in the prepare phase, the participant must guarantee that all actions carried out inside the transaction are ready to be committed. If the participant fails after it has successfully moved into the prepare phase, it must be able to commit the actions once the participant is functioning correctly again. In other words, the actions executed inside the transaction must be able to survive even a participant crash / reboot. This is usually done by writing a transaction log to a durable medium, e.g. a hard disk.
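One way a participant might make its prepared actions survive a crash is sketched below: it appends a record to a log file and forces it to disk before reporting that it is prepared. The log format, the file name `tx.log`, and the functions `log_prepared`, `log_outcome` and `recover_prepared` are assumptions made for this example, not a standard format.

```python
import json
import os
import tempfile


def _append(log_path, record):
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())  # ask the OS to write the data to disk


def log_prepared(log_path, tx_id, actions):
    # Must complete before the participant answers "prepared".
    _append(log_path, {"tx": tx_id, "state": "prepared", "actions": actions})


def log_outcome(log_path, tx_id, state):
    # state is "committed" or "aborted".
    _append(log_path, {"tx": tx_id, "state": state})


def recover_prepared(log_path):
    """After a restart: return transactions that were prepared but never resolved."""
    pending = {}
    if not os.path.exists(log_path):
        return pending
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            if record["state"] == "prepared":
                pending[record["tx"]] = record["actions"]
            else:
                pending.pop(record["tx"], None)
    return pending


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tx.log")
    log_prepared(path, "tx-1", [{"account": "A", "delta": -100}])
    log_prepared(path, "tx-2", [{"account": "B", "delta": 100}])
    log_outcome(path, "tx-2", "committed")
    # Simulated restart: only tx-1 is still waiting for a decision.
    in_doubt = recover_prepared(path)
```

After a restart, the participant would read the log, find the transactions still in the prepared state, and wait for (or ask for) the coordinator's decision on them.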
If one of the participants cannot move to the prepare phase, all participants are ordered to rollback (abort) the transaction.
If all participants successfully move to the prepare phase, then all participants are ordered to commit their actions. Once committed, the actions are visible in the system and permanently executed.
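The three steps can be sketched as a small in-process simulation. This is a simplified illustration under several assumptions: the `Participant` class, its `can_prepare` flag, and the `two_phase_commit` function are hypothetical, everything runs in a single process, and no durable log or network failure handling is included.

```python
class Participant:
    """A service that holds its actions internally until told to commit."""

    def __init__(self, name, balance, can_prepare=True):
        self.name = name
        self.balance = balance
        self.can_prepare = can_prepare  # simulates a failing prepare
        self.pending = 0
        self.state = "idle"

    def begin(self):
        self.pending = 0
        self.state = "active"

    def apply(self, delta):
        # Action is kept internally, not yet visible in the balance.
        self.pending += delta

    def prepare(self):
        if not self.can_prepare or self.balance + self.pending < 0:
            return False
        self.state = "prepared"
        return True

    def commit(self):
        self.balance += self.pending
        self.pending = 0
        self.state = "committed"

    def rollback(self):
        self.pending = 0
        self.state = "aborted"


def two_phase_commit(work):
    """work is a list of (participant, delta) pairs."""
    participants = [p for p, _ in work]
    for p in participants:              # 1. begin transaction
        p.begin()
    for p, delta in work:
        p.apply(delta)
    if all(p.prepare() for p in participants):  # 2. prepare phase
        for p in participants:          # 3. commit phase
            p.commit()
        return True
    for p in participants:              # someone could not prepare
        p.rollback()
    return False
```

For example, with a withdrawal participant holding 500 and a deposit participant holding 200, `two_phase_commit([(withdrawal, -100), (deposit, 100)])` leaves the balances at 400 and 300. If the deposit participant cannot prepare, both balances stay unchanged.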
A Two Phase Commit Weakness
The two phase commit protocol is not 100% reliable. Imagine that a service crashes after entering the prepare phase, and that all services are then ordered to commit. Imagine further that the crashed service is the last service in the chain to be ordered to commit. The commit order fails because the service has crashed. All other services in this transaction have already committed their actions, but this last one has not.
Normally, the service would have to commit its part of the transaction once it is operational again. However, what if the service never comes back up? Its actions have not been committed, so what is the status of the system then?
Here is an illustration of a two phase commit transaction where the last service fails to commit:
|Service Transactions - A two phase commit transaction failing.|
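The scenario can be simulated with a short sketch. The `Service` class, its `crash_on_commit` flag, and the `commit_phase` function are hypothetical and exist only to show the mixed outcome; a real coordinator would also need to retry the unresolved commits.

```python
class CrashedError(Exception):
    """Raised when a (simulated) service cannot be reached."""


class Service:
    def __init__(self, name, crash_on_commit=False):
        self.name = name
        self.crash_on_commit = crash_on_commit
        self.state = "idle"

    def prepare(self):
        self.state = "prepared"
        return True

    def commit(self):
        if self.crash_on_commit:
            raise CrashedError(self.name)
        self.state = "committed"


def commit_phase(services):
    """Order every service to commit; return those that could not be reached."""
    unresolved = []
    for service in services:
        try:
            service.commit()
        except CrashedError:
            unresolved.append(service)  # must be retried later
    return unresolved


withdrawal = Service("withdrawal")
deposit = Service("deposit", crash_on_commit=True)  # last in the chain
services = [withdrawal, deposit]

all_prepared = all(s.prepare() for s in services)
unresolved = commit_phase(services) if all_prepared else []
```

Here the withdrawal has been committed while the deposit is still only prepared. Until the deposit service comes back and commits, the system is in the inconsistent state the transaction was supposed to prevent.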
When multiple services are to participate in a transaction, some entity must coordinate the transaction. In other words, someone has to say when the transaction begins, and when to move to the prepare and commit phases. This entity, the transaction coordinator, can be either:
- The application
- An enterprise service bus
- A separate transaction manager / service
In the example earlier in this text the application coordinated the transaction.
One Service Per Transaction
A way to simplify transaction management in a service oriented architecture is to design the services so each transaction is contained within a single service. For instance, in the example at the beginning of this text the money transfer would be implemented as a separate service, instead of a transaction involving two services. Here is how that would look:
|Service Transactions - Each transaction is contained within its own service.|
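A minimal sketch of such a transfer service is shown below, assuming that both accounts live in a single database the service owns. The table layout and the function names `create_bank`, `transfer` and `balances` are made up for this example. Python's built-in `sqlite3` module is used with an in-memory database, and the local database transaction replaces the distributed one.

```python
import sqlite3


def create_bank():
    # Hypothetical schema: one table holding all account balances.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("A", 500), ("B", 200)]
    )
    conn.commit()
    return conn


def transfer(conn, source, target, amount):
    """Withdraw and deposit inside one local database transaction."""
    try:
        with conn:  # commits on success, rolls back on an exception
            for account, delta in ((source, -amount), (target, amount)):
                cursor = conn.execute(
                    "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                    (delta, account),
                )
                if cursor.rowcount != 1:
                    raise LookupError("unknown account " + account)
    except (sqlite3.IntegrityError, LookupError):
        return False  # nothing was changed
    return True


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Because both updates happen inside the same service and the same database, no coordinator and no prepare phase are needed. The database either applies both updates or neither.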