For as long as I’ve been a technology analyst, I’ve been aware that there is both a process-centric and a data-centric view of IT. For a while I believed that one of these views was correct and the other was fundamentally wrong, and I became a process-centric bigot. But later, when struck by a vision on the road to Damascus, I changed my mind.
The Hard and the Soft
In a physics lecture at university, one of the esteemed lecturers tried to explain how it might be that, from one perspective, an atomic-level phenomenon appeared to be a particle and yet, from another perspective, it appeared to be a wave. He said, simply: “It’s like a vending machine. You put your money in and maybe chocolate comes out, or maybe it’s potato chips.” He seemed delighted with his explanation, until someone asked him to explain his explanation. He explained it by repeating it, generating a little more sound, but no light whatsoever. Of course, that was in the days before string theory and, with string theory, quite a different explanation could be given.
But thinking about this, years later on the road to Damascus, it suddenly struck me that all forms of study seem to embody some kind of fundamental dichotomy.
- In religious study, the question: God or no God? (try finding a God in Taoism)
- In physics, the question: Is it a wave or is it a particle?
- In mathematics, the question: Is it continuous or is it discrete?
- In chemistry, the question: Organic or inorganic?
- In biology, the question: Life or not life? (e.g. Is a virus life?)
- In philosophy, the question: Free will or determinism? Yin or Yang?
- In linguistics, the question: Verb or noun?
- In literature, the question: Poetry or prose? (e.g. Is The Iliad poetry or prose?)
- In phonetics, the question: Is it a vowel or is it a consonant?
- In computing, the question: Is it hardware or is it software?
- And in software, the question: Is it data or is it process?
Here’s an example that lies on the border-line of the software dichotomy, the simple statement in the C language:
y = get(x);
This statement assigns a value to the variable y. But the value it assigns is whatever a function call returns, rather than a simple value written down in the program. So is that really a data assignment?
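To make the ambiguity a little more concrete, here is a minimal C sketch. The body of get() is invented purely for illustration, since nothing above says what it actually does, and int_fn, assign_result and assign_function are hypothetical names. It contrasts storing the result of a call with storing the function itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the get() in the statement above. */
static int get(int x)
{
    return x + 1;
}

/* A variable of this type holds a function, not a number. */
typedef int (*int_fn)(int);

/* Case 1: y = get(x); the call runs first and only its result is stored. */
static int assign_result(int x)
{
    int y = get(x);
    return y;
}

/* Case 2: f = get; no call happens, the variable now holds the process. */
static int_fn assign_function(void)
{
    int_fn f = get;
    assert(f != NULL);
    return f;
}
```

In the first case the variable ends up holding plain data that happens to have been manufactured by a process. In the second, the variable holds the process, and you can pass it around exactly as you would a number.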
We can pursue this dichotomy for quite a distance. For example, a program in its raw form is just a collection of text-like data. So isn’t it just data anyway? Even when we compile it and it’s no longer recognizable as a set of commands, isn’t it just a string of bits?
Or let’s consider the situation where a set of parameters is provided to a program. The parameters are, of course, just data. But, wait a minute. Maybe they determine the order of events that will take place and the files that will be acted upon and the destination of the results. Surely that’s not data; it’s a set of commands and hence it is process.
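Here is a small sketch of that idea in C, under invented assumptions: run_steps and its one-letter step codes are hypothetical, not taken from any real program. The parameter is nothing more than a string, yet it decides which operations run and in what order.

```c
#include <assert.h>
#include <stddef.h>

/* Applies a parameter string to a starting value, one character at a time.
   'd' doubles, 'i' increments, 'n' negates; any other character is ignored. */
static int run_steps(const char *steps, int value)
{
    assert(steps != NULL);
    for (const char *p = steps; *p != '\0'; ++p) {
        switch (*p) {
        case 'd': value *= 2;     break;
        case 'i': value += 1;     break;
        case 'n': value = -value; break;
        default:                  break;
        }
    }
    return value;
}
```

Starting from 3, the parameter "di" doubles and then increments, giving 7, while "id" increments and then doubles, giving 8. Same program, same starting data, and the only thing that changed was a piece of "data" that behaves suspiciously like a list of commands.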
Now click on the command in your browser that says “View Source.” Look at the gobbledegook it shows you and, well, some of it looks like data and some of it looks like god-knows-what, but the browser surely knows how to interpret it. So that gobbledegook is process, isn’t it? Of course, HTML is an unfair example, because it is a poorly designed confluence of process and data having a sordid affair with each other and doing unspeakable things together.
When Data Centric and Process Centric Collide
Roughly speaking, the data-centric view of computer systems sees a useful collection of well-defined simple and compound data items that are transformed by various processes into usable forms for the greater good of the data consumer. The Relational Database movement in computing was fundamentally data centric and the programs all orbited the database. Indeed, with referential integrity, cascade deletes, database constraints and stored procedures, the database did its best to subsume process. Data was king and process was, at best, queen or courtesan.
The process centric view is the opposite. Computer systems are a set of complex transformations that are carried out in the service of the computer user and, to that end, are fed with the appropriate data. The object oriented movement was, of course, process oriented. Objects were collections of processes to which data could be assigned. Data was something that either persisted (if it were to be used again) or was disposable once used.
It is odd that both these views of computing co-existed for many years, leading to skirmishes between their adherents. Nobody won. The OO programmers saw the database as nothing more than a cupboard in which to store persistent data. That was fine, but relational databases were never built to be depositories for oddly shaped collections of data. According to laws carved into tablets brought down from Mount Sinai, data was to be stored in two-dimensional tables or not at all. This created the famous “impedance mismatch.” An impedance mismatch occurs when an irresistible force meets an immovable object.
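For a rough picture of the mismatch, consider this C sketch. Everything in it is hypothetical: the order, line_item and order_row types and the flatten_order function are made up for illustration and bear no relation to how any real mapping product works. An object that owns a nested list has to be broken into flat rows before a table will accept it.

```c
#include <assert.h>
#include <stddef.h>

/* The "oddly shaped" object: an order that owns a small list of items. */
struct line_item {
    const char *product;
    int quantity;
};

struct order {
    int id;
    size_t item_count;            /* assumed to be at most 4 */
    struct line_item items[4];
};

/* The two-dimensional shape a table wants: one flat row per item. */
struct order_row {
    int order_id;
    const char *product;
    int quantity;
};

/* Copies each item into its own row, repeating the order id on every row.
   Returns the number of rows written, never more than max_rows. */
static size_t flatten_order(const struct order *o,
                            struct order_row *rows, size_t max_rows)
{
    assert(o != NULL && rows != NULL);
    size_t n = 0;
    for (size_t i = 0; i < o->item_count && n < max_rows; ++i) {
        rows[n].order_id = o->id;
        rows[n].product  = o->items[i].product;
        rows[n].quantity = o->items[i].quantity;
        ++n;
    }
    return n;
}
```

One order holding two items becomes two rows that each repeat the order id. Going the other way means gathering the rows back up into an object, and somebody has to write that glue in both directions.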
There would have been “great wailing and gnashing of teeth” if some programmers hadn’t written products like Hibernate to resolve the problem. And it didn’t completely resolve the problem, because despite this truce, the data-centric folk were devoted to cramming logic into stored procedures, while the process-centric folk did their best to trap the same logic in objects.
And It Happened Again
That would all have been a footnote to the history of computing were it not for the fact that it happened again. The border dispute between the OO folk and the RDBMS folk was rekindled by the onset of the much-vaunted Service Oriented Architecture. Now I don’t want to bad-mouth SOA; it is an extremely positive development in many ways. However, I remember having to produce an architecture map for SOA in about 2006, and one of the first things that I noticed was that SOA didn’t give a damn about data. It gave so little of a damn about data that none of the merry group of SOA vendors that leapt onto the SOA bandwagon had written a single line of code to enable data integration within SOA. There was nothing there. Nothing at all.
In my next posting I’ll suggest what could have been and maybe should have been there.