Many SOA solutions integrate applications, services, and backend processes to fulfill business requirements. If you design these solutions only for the best case, you are sure to be disappointed in production. A more prudent approach is, as always, to design for the real world, with its interruptions, exceptions, and unscheduled outages. Here are some pointers for designing integration solutions:
- Consider using reliable messaging, depending on how much reliability you need.
- Introduce an appropriate level of abstraction at each integration point, for example by wrapping legacy capabilities. Otherwise you may end up with a rat's nest of point-to-point integrations, only to find out that the system you integrated with is about to be decommissioned.
- Provide administrative capabilities to manage each integration point. Ideally, you should be able to enable or disable each one and throttle the number of messages going through it. This is exactly what Michael Nygard advises when talking about circuit breakers in his book Release It!
- Make sure you can smoke test integration points after scheduled and unscheduled downtime. You don't want to find out that a routine patch deployed on a Sunday is causing an outage for your business users on Monday morning.
- Provide the ability to vary logging and auditing at runtime. For instance, you might want to turn on verbose logging for a specific integration point to troubleshoot an issue, then revert to the normal level after the problem goes away. You shouldn't have to redeploy or restart the application for this.
- Build as much contextual information as possible into the exceptions you generate. Integration points can be painful to troubleshoot without appropriate error detail. You should also design a way to assess the impact on your system. Expect your business users to ask questions like: How many transactions were impacted? What about transactions that were in flight? When the problem is resolved, can the transactions resume processing? Is there a report of all impacted transactions that need to be completed today (for SLA, regulatory, or financial reasons)?
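
To make a few of these pointers concrete, here is a minimal Python sketch of a wrapper around a single integration point. It is only an illustration under stated assumptions: `IntegrationPoint`, `IntegrationError`, `max_per_second`, and `set_verbose` are hypothetical names, not part of any library. The enable/disable switch and the simple per-second throttle are manual administrative controls, not a full circuit breaker with automatic tripping and recovery.

```python
import logging
import time


class IntegrationError(Exception):
    """Failure at an integration point, carrying context for troubleshooting."""

    def __init__(self, message, **context):
        super().__init__(message)
        self.context = context  # e.g. point name, transaction id, cause

    def __str__(self):
        details = ", ".join(f"{k}={v!r}" for k, v in sorted(self.context.items()))
        return f"{self.args[0]} [{details}]"


class IntegrationPoint:
    """Hypothetical wrapper around one call to an external system."""

    def __init__(self, name, call, max_per_second=None, clock=time.monotonic):
        self.name = name
        self._call = call                     # function that talks to the remote system
        self.enabled = True                   # administrative on/off switch
        self.max_per_second = max_per_second  # None means no throttling
        self._clock = clock
        self._window_start = clock()
        self._sent_in_window = 0
        self.log = logging.getLogger(f"integration.{name}")

    def set_verbose(self, verbose):
        """Change log verbosity for this point only, without a restart."""
        self.log.setLevel(logging.DEBUG if verbose else logging.NOTSET)

    def _throttled(self):
        if self.max_per_second is None:
            return False
        now = self._clock()
        if now - self._window_start >= 1.0:   # start a new one-second window
            self._window_start = now
            self._sent_in_window = 0
        return self._sent_in_window >= self.max_per_second

    def send(self, transaction_id, payload):
        context = {"point": self.name, "transaction_id": transaction_id}
        if not self.enabled:
            raise IntegrationError("integration point disabled", **context)
        if self._throttled():
            raise IntegrationError("throttle limit reached", **context)
        self._sent_in_window += 1
        self.log.debug("sending %s: %r", transaction_id, payload)
        try:
            return self._call(payload)
        except Exception as exc:
            # Keep the original exception as the cause and add business context.
            raise IntegrationError("call failed", cause=repr(exc), **context) from exc
```

An operator-facing admin screen or script could flip `enabled`, adjust `max_per_second`, or call `set_verbose(True)` while the application keeps running. Because every `IntegrationError` carries the integration point name and transaction id, collecting these errors is one possible starting point for the impact reports that business users will ask for.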












