How to avoid Information Silos

The information silo is one of the most common and expensive pests living in IT departments and consuming their resources. Here is how it grows and how to fight it.

What an Information Silo is

An information silo is defined by Wikipedia as a “management system incapable of reciprocal operation with other, related management systems”. In any company you will surely find many examples of this villain: each application uses its own database, and integration depends on file transfers (import/export) that are subject to constant synchronization constraints and troubles.

In practice I have seen two root sources of Information Silo:

  1. Packaged applications. For various reasons (not necessarily economic), adopting packaged applications (like ERP or CRM) is now common practice. Each application comes with its own database, and integration becomes a nightmare.

  2. Departmental applications. These are stacked on their own database with the excuse of being “limited in scope” and are built using agile methodologies to meet scheduled deadlines. The term “departmental” has been abused as an escape valve, and many corporate applications result from this approach.

Please understand that I am absolutely not against packaged applications or the agile approach. Both have shown their great value. In fact, the success of agile methodologies should raise another question: what was wrong with the waterfall model and with database administration (DBA) in general?

If you ask a developer, she will possibly tell you that going through a DBA to discuss a schema is a waste of time: the schema was correctly generated by her persistence framework. The DBA will surely have a different point of view, to justify her work and enforce some curious policies (like those hard-to-spell column names). I am not defending either of them but trying to show a solution. Technical factors aside, let me illustrate my point through a practical organizational example.

DBA failure

Suppose there were no computers, you are the boss of a company, and you need to know an employee's salary. How would you proceed?

  1. Ask your human resources (HR) department for the information, or

  2. Go to the HR department and look directly into their files?

Unless your company is a real mess, you will surely choose (1). After all, you pay the HR department to provide services!

The lesson is clear: sharing data has been recognized as a poor practice since the early days of software engineering. Programmers should remember this lesson from old COBOL programs, where all data was shared in the “data division”. The apparent impression of being organized quickly evolves into chaotic maintenance.

The same applies to a database when multiple applications put their hands on the same data. Data producers are locked into a schema that cannot evolve because consumers use it to access the data. What happens if I look directly into the HR files and the HR manager decides to change the drawer they are kept in? The same happens when an application evolves its database and other applications depend on it.
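To make the coupling concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (employee, salary, monthly_salary) are invented for illustration and do not come from any real system. The consumer queries the HR table directly, so a harmless-looking schema evolution on the HR side breaks it.

```python
import sqlite3


def consumer_reads_salary(conn, employee_id):
    # A consuming application that looks directly into the HR "drawer".
    row = conn.execute(
        "SELECT salary FROM employee WHERE id = ?", (employee_id,)
    ).fetchone()
    return row[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, salary REAL)")
    conn.execute("INSERT INTO employee VALUES (1, 3000.0)")

    before = consumer_reads_salary(conn, 1)

    # HR evolves its schema: the column is now called monthly_salary.
    # (Hypothetical change; done by rebuilding the table.)
    conn.execute(
        "CREATE TABLE employee_new (id INTEGER PRIMARY KEY, monthly_salary REAL)"
    )
    conn.execute("INSERT INTO employee_new SELECT id, salary FROM employee")
    conn.execute("DROP TABLE employee")
    conn.execute("ALTER TABLE employee_new RENAME TO employee")

    # The consumer was never told, and its query no longer works.
    try:
        consumer_reads_salary(conn, 1)
        broken = False
    except sqlite3.OperationalError:
        broken = True
    return before, broken
```

Calling demo() returns the salary read before the change, plus a flag telling whether the consumer query failed afterwards.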

SQL views offer limited relief, and their maintenance also quickly becomes a pain, as shown in the figure below:

This was, in my opinion, the real cause of the failure of DBA as it was conceived in the classic sense more than 30 years ago. On the other hand, if DBA principles had been systematically applied, no “information silo” would exist today. So what can we do?

Single owner of data

I suggest adopting the following principle to solve the information silo problem:

Each piece of data shall have a single owner application that manages it

and makes it available to other applications through services.
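As a rough sketch of this principle, consider the HR example again. All names here (HrService, SalaryInfo, payroll_report) are hypothetical. The owner keeps its storage private and exposes only a service call, so it can change how the data is stored without any consumer noticing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalaryInfo:
    """The public contract that consumers rely on."""
    employee_id: int
    monthly_amount: float


class HrService:
    """Hypothetical single owner of salary data."""

    def __init__(self):
        # Private storage. Here salaries are kept in cents; the owner is
        # free to change this detail because nobody else reads it directly.
        self._salaries_in_cents = {}

    def set_salary(self, employee_id, monthly_amount):
        self._salaries_in_cents[employee_id] = round(monthly_amount * 100)

    def get_salary(self, employee_id):
        cents = self._salaries_in_cents[employee_id]
        return SalaryInfo(employee_id, cents / 100)


def payroll_report(hr, employee_ids):
    # A consumer application: it asks the owner and never touches storage.
    return sum(hr.get_salary(e).monthly_amount for e in employee_ids)
```

In a real deployment the call to get_salary would cross a process or network boundary. The point of the sketch is only that SalaryInfo is the contract, not the table layout.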

The best alternative for communication between applications is probably an Enterprise Service Bus (ESB), as shown in the figure below. In addition to giving each application full flexibility over its own database, it reduces complexity:

But keep in mind two important points:

  1. An ESB alone will not solve the problem! You must also adopt a well-engineered architecture that adheres to the single owner principle. Somebody has to coordinate the API design, which now involves multiple teams. One possible approach is to evolve your current data administrators with new disciplines (patterns, API design...), and I have seen excellent results in this sense.

  2. Service bus (or any API-based) communication is probably more expensive than brute data sharing, not only in terms of resources but also in terms of API design and implementation. Such an investment is easy to justify, but it still has to be made, and it takes time. If you are in a rush, sharing data is faster and (like any sin) more tempting.
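The idea of routing requests to a single owner can be sketched with a toy, in-process stand-in for a bus. This is not a real ESB product or its API. The ServiceBus class and the service name "hr.salary" are assumptions made for illustration. The only design point it tries to show is that the bus itself can refuse a second owner for the same service.

```python
from typing import Any, Callable, Dict


class ServiceBus:
    """Toy in-process bus: one registered owner per service name."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], Any]] = {}

    def register(self, service_name: str, handler: Callable[[dict], Any]) -> None:
        # Enforce the single owner principle at registration time.
        if service_name in self._handlers:
            raise ValueError(f"{service_name!r} already has an owner")
        self._handlers[service_name] = handler

    def request(self, service_name: str, payload: dict) -> Any:
        handler = self._handlers.get(service_name)
        if handler is None:
            raise LookupError(f"no owner registered for {service_name!r}")
        return handler(payload)


# Hypothetical owner application publishing one service on the bus.
_salaries = {1: 3000.0}


def hr_salary_handler(payload: dict) -> dict:
    employee_id = payload["employee_id"]
    return {"employee_id": employee_id, "salary": _salaries[employee_id]}
```

A consumer would then call bus.request("hr.salary", {"employee_id": 1}) without knowing where or how HR stores the value.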

Just identifying an approach doesn't solve the problem, so let's see how to handle the two root cases.

Choosing packaged solutions

Prevention is better than cure: in the case of packaged solutions, an “acid test” can reveal how candidate applications compare in terms of integration.

I understand that a detailed comparative analysis can be prohibitive, but a sample test applied to all candidates can avoid bad surprises with little effort.

Simply identify 5-7 cases of communication with other applications and ask the vendor to show the required code. Suppose you are acquiring a CRM application. A typical case would be: show me the code I have to implement to make the CRM use product information from my legacy system, or show me the code to capture communications from my specific email or VoIP system.

What is really important is to actually look at the code! That will tell you many things, including how much effort to expect, how flexible the application is (in terms of integration), and how well such facilities are documented and thus usable.

What is common instead is a bare question in the usual checklist: “can it be done?”. Such questions are always answered with a more or less evasive “yes” by vendors, who possibly add some vague constraint (but …, if …, unless...). That is absolutely not enough to avoid surprises. You should insist on seeing the code, or at least a documented API reference, and check with your developers whether it will fit their needs.

Now, what if you have already acquired the application? You still have the wrapping alternative.

Wrapping an existing application

The alternative is to wrap an existing application in an integration shell. Ideally, part of the application itself will interface with the shell. This need not be a standalone undertaking: the shell can be released as an incremental API as integration needs arise and evolve.

Suppose, for example, that another application requires access to the client balance held by the invoicing system. Once such functionality is made available through a specific service, it is also important that the invoicing system itself uses the same service whenever the client balance is required.
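The client balance case might look like the following sketch. The invoice table layout, the ClientBalanceService class and both callers are invented for illustration. A real invoicing package will have its own schema and rules for what counts as an open balance.

```python
import sqlite3


def make_demo_db():
    # Stand-in for the existing invoicing database (hypothetical schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice (client_id INTEGER, amount REAL, paid INTEGER)"
    )
    conn.executemany(
        "INSERT INTO invoice VALUES (?, ?, ?)",
        [(7, 100.0, 0), (7, 50.0, 0), (7, 70.0, 1)],
    )
    return conn


class ClientBalanceService:
    """Part of the integration shell: the only place that knows the query."""

    def __init__(self, conn):
        self._conn = conn

    def get_balance(self, client_id):
        # Open balance = sum of unpaid invoices (assumed business rule).
        row = self._conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM invoice "
            "WHERE client_id = ? AND paid = 0",
            (client_id,),
        ).fetchone()
        return row[0]


def invoicing_can_extend_credit(service, client_id, credit_limit):
    # The invoicing system itself goes through the same service.
    return service.get_balance(client_id) < credit_limit


def crm_client_summary(service, client_id):
    # An external application using the very same call.
    return f"client {client_id}: open balance {service.get_balance(client_id):.2f}"
```

Because both callers share get_balance, a change to the balance rule or to the underlying tables is made once, inside the service.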

Practicing the “eat your own dog food” principle guarantees the consistency and evolution of the API. Thus, in practice, hybrid situations like the one shown below are not ideal but are convenient:

There is no magic recipe or road map for building this kind of shell, but having access to the application data (and obviously understanding its meaning) gives enough flexibility to satisfy most requests without touching the code. Publisher-subscriber patterns, for example (through which another application can be notified when certain data changes), can be implemented by firing SQL triggers.
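One possible way to build such a trigger-based notification is sketched below with SQLite, under the assumption that the trigger writes to a change log table which the shell then drains and forwards to subscribers. The client and change_log tables, the trigger name and the publish_pending function are all hypothetical. Other databases offer different trigger and notification facilities.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE client (id INTEGER PRIMARY KEY, balance REAL NOT NULL);

        CREATE TABLE change_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            client_id INTEGER NOT NULL,
            old_balance REAL,
            new_balance REAL
        );

        -- Fired by the database itself; the application code is untouched.
        CREATE TRIGGER client_balance_changed
        AFTER UPDATE OF balance ON client
        WHEN OLD.balance <> NEW.balance
        BEGIN
            INSERT INTO change_log (client_id, old_balance, new_balance)
            VALUES (NEW.id, OLD.balance, NEW.balance);
        END;
        """
    )
    return conn


def publish_pending(conn, subscribers):
    """Forward logged changes to every subscriber, then clear them."""
    rows = conn.execute(
        "SELECT seq, client_id, old_balance, new_balance "
        "FROM change_log ORDER BY seq"
    ).fetchall()
    for seq, client_id, old_balance, new_balance in rows:
        event = {
            "client_id": client_id,
            "old_balance": old_balance,
            "new_balance": new_balance,
        }
        for notify in subscribers:
            notify(event)
        conn.execute("DELETE FROM change_log WHERE seq = ?", (seq,))
    return len(rows)
```

The shell would call publish_pending periodically, with subscribers being callables that push the event to interested applications. Note that the trigger records only real changes: an update that leaves the balance unchanged produces no event.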

You can find in Enterprise Integration Patterns a practical set of recipes and suggestions for different situations and issues.

Last modified on 2011-05-24 by Administrator