Sunday, December 5, 2010

Data, Information and Knowledge.

Many years ago I was working at one unit of CSIRO (Australia's major research organisation) and there were regular brown-bag sessions covering aspects of related areas.
There were a number of good talks, sometimes internal, sometimes from visitors from other units, other countries etc. There is one that I particularly remember, which was about the difference between 'Data', 'Information' and 'Knowledge'.

As I remember the definitions used were something like:
Data = a record, or a set of related values which tell you something about the world.
Information = a document. I remember the presenter spending some time justifying this point but don't remember the details. It came down to the fact that information was a concrete entity which could be written down and persisted. Thinking about this now, I don't think that a physical document should be required, just that the information should be abstractly external and therefore documentable.
Knowledge = what is in someone's head. He was clear on the fact that knowledge always required someone to know it. Knowledge is not passed on to someone else, but rather created within their own mind as a partial copy of the original. The point being, of course, that no-one can be sure that what they know about a topic is the same as someone else's view. If nothing else, the mental models and contextual links will be quite different.

Several years later I tried to see if any of this had ever been published or was available somewhere within the organisation. I was unable to pin down exactly when I had heard it, who had presented it or even what unit they were from. Naturally I was unsuccessful.
I have found other definitions of Data, Information & Knowledge* but none seem to match the approach I remember.
In other words - I am not sure if what I have written above matches in any way the actual presentation. Still, it provides a basis for further thought.

The approach has significant consequences for things such as knowledge bases or knowledge transfer sessions etc. An organisation, if treated as an organism, has knowledge of its environment and its internal processes. This knowledge is a conglomeration of the knowledge of the constituents, the individual members, of the organisation and is subject to contextual links between them. (I suspect the same applies to knowledge within individuals.) Communication patterns - how information is processed and propagated - within any group of people will have a significant impact on the operational knowledge the group uses to perform its function - and on the sensory knowledge it derives to examine its environment.
To maintain knowledge within an organisation requires that it be sufficiently dispersed amongst the members of the group so that the removal of any specific element has no significant impact. Dispersal of the knowledge requires that sufficient communication paths exist to distribute information amongst the people most involved in it. Since knowledge tends to be stratified within an organisation (i.e. each level in the hierarchy has its own priorities; despite "Undercover Boss" you would not expect the CEO to know how to work on the shop floor - their priorities are different) the most important communication paths are between peers. A good manager should understand this and encourage it; while still understanding enough of his people to be able to cover the inevitable gaps.
There is a whole discipline there about the creation of knowledge in new members of a group. Depending on the maturity and size of an organisation and the clarity of the knowledge, there are very many different mechanisms for achieving this: teaching, training, mentoring, coaching, leading, or just chatting between the old hands and the newbies. Each has its own advantages and drawbacks and areas of effectiveness. All are related to re-creating knowledge held by one person in the mind of another.
Converting knowledge to documents allows it to be propagated and stored. But storage of information in a repository can only be useful as a back-up mechanism and with full understanding of the limitations. 'Knowledge bases' are only useful if used as temporary storage and constantly updated. Nailing down knowledge as information makes it static and isolated. It removes context, and the documentation is locked in place where it quickly goes stale. If picked up in time, by someone who has some mental framework in which to place it, the information can be brought back to life as knowledge. But over time, the signal-to-noise ratio decreases and it becomes more and more difficult to identify what is relevant.
In other words, unless carefully organised and maintained - and with regular turn-over - any 'knowledge store' collects so much outdated information that it becomes a major effort to find the useful tidbits. This is the whole point of librarians. They specialise in the organisation, maintenance and search for meaningful information that may be used to re-create knowledge.

But one point that I think I may return to is that one view of the Internet is as a vast information repository. Most of it is turned over very regularly and hence is relatively useful - depending on the relevance it has to the knowledge you already have in your head**. While it is not particularly well organised (and, I think, all the more powerful because of it), the information can be reached easily, which mitigates this somewhat. However, current search mechanisms could easily become a problem. 4M responses to a simple query is not targeted, and it is very difficult to find anything when the relevant keywords are too generic (try finding out about a problem where IE8 freezes occasionally - what other search terms can I think of?). Extracting the signal from the noise becomes more and more difficult as the total volume of information increases.

I am not sure where I am going with this and, on re-reading, the post seems to ramble about a bit. But it touches on a number of points that I think I will explore in more detail later and I don't want to leave them out. Besides - since no-one is reading this anyway, who cares :-)

* For instance: Information and Knowledge as the first and second derivatives respectively of Data with respect to Intelligence. A cute definition but not particularly useful - even mathematically :-)
** Useful is a very relative term. It has a critical dependence on the inclinations of the person acquiring the knowledge and how closely it can be linked to existing knowledge.
