The Concept of a Self Help OLAP

(Unorthodox views on OLAP)

Mauro Guazzo, Codework Italia

Introduction

I have been involved in DSS-OLAP projects for over ten years - mostly for European companies.

Yet the ground rules gained from my experience in over 100 applications do not seem to match the prevailing views on OLAP.

This is probably because I have in mind a special variety of DSS-OLAP, which I call a Self Help OLAP: a robust, powerful, low-technology, high-safety, direct-use, no-nonsense piece of software that adopts a sound multidimensional data model.

Today's OLAPs must be client-server, object-oriented, relationally-enabled, network-centric, multi-standard, multi-platform, multi-tier, Java-conscious, workgroup-oriented, ... to start with.

Then they are expected to handle data well into the terabyte range.

No surprise that such colossuses need teams of multi-disciplinary experts and a very long implementation phase.

True, when such a project produces a stable system, the company also has a tremendous showpiece to impress its visitors with (and justify project costs).

OLAP companies seem to be looking at each other and at a few success stories, not at the potential user population.

For every case where money is available to build a "luxury info system", there are a thousand situations in which the budget is close to nil but a problem analyst is in a pinch.

To start with, he (or she) might not have the time/budget/company support for a major OLAP project.

This is the kind of scenario I have in mind. This note tries to explain these views and the desirable features of such an "OLAP survival kit".


What real-life managers use

In all companies that I have been in contact with, the pattern is the same: automation is of great help for low-level routine work and of little value for top managers (who, by the way, make vital decisions).

We know that decision support can take many forms and be implemented at various levels of sophistication. Leading edge technology now has a lot to offer, under the names of expert systems, natural language interfaces, knowledge engineering, workgroup support, intelligent agents and what not.

What I have in mind is the very basic level (which is wrongly taken for granted): making decisions with the relevant data in front of you.

Where the problem lies:

Obsolete or unsatisfactory operating applications;
No easy, uniform, versatile access to data;
Lack of integration within operating applications;
Data re-typing or copy-paste as a frequent last resort.

Top managers' staff mostly use only two data analysis tools: spreadsheets and hand calculators. Isn't this a cave-age situation?

Consequences:



Ground rules for a Self Help OLAP

The person in charge of a company function (like finance, personnel, production, ...) or in charge of a company unit (service, plant, office, ...) should have the right to:
define what data are relevant to his job;
have a master copy of such data in an environment on which he has complete control;
be able to do all processing and manipulation with his own staff.

The third point is the most controversial, the objections being that the manager is trying to build his own IT center, that efforts are duplicated, that everybody should be doing his job and not other people's job, etc.

What we claim is simply that a person on the manager's staff should be endowed with the necessary tools and skills to perform all data processing, both routine and ad-hoc.

The required response times and the frequency of adaptation make it impossible for traditional IT specialists to take action (this, in turn, is due to their different tools, skills, turn of mind, mission, ...).

The inevitable and unpleasant conclusion is that some user must take full responsibility for functions that were traditionally the responsibility of IT professionals.

Concept of a data laboratory as a tool for problem analysts (a.k.a. business analysts, or "people who produce commented analyses with supporting data").
Concept of separation between data definition and constitution on the one hand and data use on the other. The former deserves quite a bit of planning, the latter must be totally free and unconstrained.
Guaranteed response to a changing environment: the person who has structured the data and implemented the application from scratch is deemed capable of adjusting them to changing requirements.


Typical OLAP implementation problems

The responsibility for system conception and project management (which were formerly the realm of IT-trained people) must be shouldered by users who might have neither the proper training nor the correct frame of mind.
Level of detail must be extremely variable. Large data volumes are the rule.
One must master a very complex data structure.
The basic data structure is a multi-way table (a multidimensional hypercube) with various aggregation steps on several dimensions; the analyst needs to travel effortlessly from synthesis to analysis, and from one logical view to a different one, just to follow a train of thought.
Long cascades of calculations and complex processing.
Need to build durable solutions (as opposed to private, semi-manual, disposable applications).
Need for a simple and bare-bones system that will work "no matter what" and will not add its own glitches and malfunctions to the problem at hand.
Ability to react to a change in the rules (at any level) in very short times.
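
As a rough illustration of the multi-way table and of the move from synthesis to analysis mentioned in the list above, here is a minimal Python sketch. It stores a tiny cube as a dictionary keyed by one member per dimension and sums it over whichever dimensions are not of interest. The dimension names, the figures and the function roll_up are all invented for this example; they are assumptions and do not describe TANGRAM or any other product.

```python
from collections import defaultdict

# Hypothetical dimensions of a tiny sales cube (assumed for illustration).
DIMENSIONS = ("product", "region", "month")

# Each cell is addressed by one member per dimension; the figures are invented.
CUBE = {
    ("bolts", "north", "jan"): 120,
    ("bolts", "north", "feb"): 80,
    ("bolts", "south", "jan"): 50,
    ("nuts", "north", "jan"): 30,
    ("nuts", "south", "feb"): 70,
}


def roll_up(cube, keep):
    """Sum the cube over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for cell, value in cube.items():
        # Project the full cell address onto the dimensions we keep.
        totals[tuple(cell[p] for p in positions)] += value
    return dict(totals)


# Synthesis: one figure per product.
by_product = roll_up(CUBE, ["product"])
# One step towards analysis: break each product down by region.
by_product_region = roll_up(CUBE, ["product", "region"])
# Full synthesis: a single grand total, keyed by the empty tuple.
grand_total = roll_up(CUBE, [])
```

The same function serves every view, so changing the logical view amounts to changing the list of dimensions that are kept. A real system would of course add hierarchies within each dimension and far larger volumes.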


Learn from other people's failures

Try to define everybody's role, even in small teams (e.g. sponsor, client, administrator, manager, user, ...).
Define the proper role of each software tool (info system, data access tool, report writer, OLAP, EIS, spreadsheet, DTP, business graphics tool). Ideally, you want to have freedom of choice for the best product in each category.
Identify the benefits of your OLAP project for the whole company, for the team, for each individual.
Be parsimonious in the use of software technology and the accompanying complexity. You want just the technology that is necessary and sufficient for your objectives. In short: "keep it simple!".

State your objectives and priorities (even if they look blurred at first). For example:

  1. have the data available;
  2. print and distribute it in adequate form (standard routine reporting);
  3. learn to ad-lib and adapt to change;
  4. add the glossy presentation (slides, business graphics, DTP, HTML, multimedia...).

Look into the many aspects of data reliability:
  1. unambiguous meanings and nomenclatures;
  2. stable and rational coding schemes;
  3. single data source;
  4. prompt updating (but at given predefined moments, like monthly snapshots - no real time data!);
  5. no redundancy in the basic data structure;
  6. reliable computation and processing, with no manual fiddling with the data.


Desirable features of a Self Help OLAP

  1. A general-purpose tool or a language, not a tailor-made or parametric package (traditional programming is NOT an acceptable option to adjust to change).
    No ceiling to complexity of computations or data manipulations.
    In short: an open ended system.
  2. No impedance mismatch.
    Operations (even complex ones) that the user perceives as atomic must also be atomic for the OLAP system. Operations carried out on a small data cube must have a general validity.
  3. Applications should be naturally modular and reusable, like parts of a well-conceived construction kit.
  4. Easy data import/export with other software tools.
    Import/export should be problem-free thanks to exchange standards; it need not be dynamic and instantaneous. Data should be available across company applications to analysts of any trade (controllers, accountants, quality assurance people, production engineers, auditors, ...).
  5. Reliability of the whole OLAP system.
    "Reliability" sounds just great, until you mention the sacrifices that will improve it.
    Availability and usability have a much higher priority than any purely technical aspect (execution speed, software technology, ...).
    In short: it must work no matter what.
    The tradeoff between exciting new software technology and safe, field-tested tools is a difficult one in any application field. But you have to be twice as cautious if you are choosing a tool for an "emergency team".
  6. Encapsulate and demystify abstract, theoretical concepts.
    A forecasting module (that may contain very sophisticated statistical algorithms) can still be invoked through a simple, natural (and maybe option-free) interface.
    A relational join can be expressed by pointing to the relevant fields in a rather intuitive way, even if we know that tons of scientific literature have been written on the concept.
  7. Semantics.
    We want OLAP systems that take charge of high level responsibilities.
    To do this, we must express as much high level data semantics as possible, until some basic common sense is transferred to the system.
    Permitted operations should then coincide with meaningful (not with possible) operations.
    Averaging telephone numbers, adding dollars and kilos, overwriting computed values, extrapolating non-time dimensions,... should not be within the user's reach.
  8. Scalability.
    This means that when a pilot project works, the same concepts, architecture and software tools should apply to a larger scale project or even to the whole company.
    Remember that scalability is a two-way street and the minimum host size is just as important for a Self Help OLAP.
  9. Transparency.
    The OLAP system should be in the habit of commenting and explaining (even in a verbose way) what it is doing, what it refuses to do, suspect situations, errors, remedial actions, ...
    There must be available services for documenting all work, in a paper form readable to everyone (e.g. work sessions, code, macros, data structure, aggregation paths, computation trees, ...).
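
To make point 7 (Semantics) a little more concrete, the following Python sketch shows one possible way of letting the system refuse operations that are possible but not meaningful, such as adding dollars and kilos or averaging telephone numbers. The class Measure, the function average and the "quantity" / "code" labels are hypothetical names chosen for this example only; they are assumptions, not a description of how any existing OLAP product works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """A value tagged with its unit (hypothetical class, for illustration)."""
    value: float
    unit: str

    def __add__(self, other):
        if not isinstance(other, Measure):
            return NotImplemented
        # Adding dollars and kilos is possible arithmetic but not meaningful.
        if self.unit != other.unit:
            raise TypeError(f"cannot add {self.unit} to {other.unit}")
        return Measure(self.value + other.value, self.unit)


def average(values, kind):
    """Average only data declared as a quantity; refuse codes and identifiers."""
    if kind != "quantity":
        raise TypeError(f"averaging is not meaningful for data of kind '{kind}'")
    return sum(values) / len(values)


# A meaningful operation goes through.
revenue = Measure(100.0, "USD") + Measure(50.0, "USD")

# A meaningless one is rejected with an explanation.
try:
    Measure(100.0, "USD") + Measure(3.0, "kg")
    mixed_units_rejected = False
except TypeError:
    mixed_units_rejected = True
```

The point of the sketch is only that the declared meaning of the data, not the user's vigilance, decides which operations are within reach. The explanatory error messages also fit the spirit of point 9 (Transparency).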

About the author

Mauro Guazzo is technical director of Codework Italia. In this capacity he has been active both in OLAP field projects and in the design of TANGRAM.

He can be reached at mauro.guazzo(at)gmail.com
