Principles and bibliography

We have formed some theoretical foundations for the standard, but they are far from complete, so you are invited to share your ideas. This document is required reading if you want to contribute to the standard.

The NDI standard concerns two major areas: data indexing and navigation, and data transformations and representation.

Although data indexing and navigation are huge tasks per se, many theoretical and practical developments already exist. The standard will therefore cover particular techniques or implementations only briefly. It will list the features required of a compliant implementation, as well as recommended features, along with pointers to relevant projects and bibliography.

The data transformations and representation section will provide theoretical background by laying out the basic notions and principles behind the approach.

Data Indexing and Navigation

Views on data, data transformations and representation:

The standard will rely on a data-centric philosophy, in which the data are both the focus and the result of transformations. Data also provide certain services, so each datum is in effect a "micro-server".

Datum Lifespan

The standard will define two types of data with regard to lifespan. The first is permanent and indexed; the second is temporary. A temporary datum can only be the result of a view, is not indexed, and can only be used for presentation or as the base for another view. A temporary datum is never stored in the system.
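The two lifespans can be sketched as follows. This is only an illustration under assumptions: `Lifespan`, `Datum`, `Store` and `apply_view` are hypothetical names that the standard does not define, and the store is a plain in-memory dictionary.

```python
from dataclasses import dataclass
from enum import Enum


class Lifespan(Enum):
    PERMANENT = "permanent"   # stored and indexed
    TEMPORARY = "temporary"   # result of a view; never stored or indexed


@dataclass(frozen=True)
class Datum:
    value: object
    lifespan: Lifespan = Lifespan.TEMPORARY


class Store:
    """Hypothetical store that accepts only permanent data."""

    def __init__(self):
        self._index = {}

    def put(self, key, datum):
        if datum.lifespan is not Lifespan.PERMANENT:
            raise ValueError("temporary data are never stored")
        self._index[key] = datum

    def get(self, key):
        return self._index[key]


def apply_view(view, *data):
    """Apply a view function to one or more data; the result is temporary."""
    return Datum(view(*(d.value for d in data)), Lifespan.TEMPORARY)
```

A temporary datum returned by `apply_view` can be passed to another `apply_view` call, but `Store.put` rejects it.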

Data security and access rights

The standard will define a security framework for data, probably similar to Unix file access rights, with users and groups. Per-view access, anyone?
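Since the framework is still undecided, the following is only a sketch of what Unix-like rights on a datum could look like. The names `User`, `Acl` and `allows`, and the use of octal mode bits, are assumptions for illustration and not part of the standard.

```python
from dataclasses import dataclass, field

READ, WRITE = 4, 2  # permission bits, borrowed from Unix octal modes


@dataclass
class User:
    name: str
    groups: set = field(default_factory=set)


@dataclass
class Acl:
    owner: str
    group: str
    mode: int  # e.g. 0o640: owner read/write, group read, others nothing

    def allows(self, user, bit):
        """Pick the owner, group or other triplet, then test the requested bit."""
        if user.name == self.owner:
            shift = 6
        elif self.group in user.groups:
            shift = 3
        else:
            shift = 0
        return bool((self.mode >> shift) & bit)
```

Per-view rights could be modelled the same way by attaching an `Acl` to a view rather than to a datum.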

Primitive data types

The standard will define primitive data types that will serve as building blocks for more complex data. The only criterion for deeming a data type primitive is that the type is necessary for data interoperability. Primitive data types must be stored in a royalty-free (preferably public-domain) format. As far as the client is concerned, primitive data types are simple and self-sufficient. However, even primitive data can be cut into pieces. Each primitive type will also have a list of required views, which can only result in primitive data or presentation data.

List of primitive data:

Useful complex data:

The standard will specify a list of useful complex types, along with a list of required views for each. These will be a minimal set.
  • Event
  • Point (1D,2D,3D,...)
  • Segment
  • Views on data:

    Views are what make data manipulation easy. A view can be applied to one or more data, resulting in a new datum.

    PART(OFFSET[,OFFSET,..], SIZE[,SIZE,..], SIZE_UNIT)

    This view should return a sub-unit of the same data type, starting at OFFSET and spanning SIZE units of SIZE_UNIT.
    Multi-dimensional data (such as an image) can accept one OFFSET and one SIZE per axis.
    SIZE_UNIT can be letters, words, paragraphs, columns, pages, or pagesets. Decimal points are allowed in OFFSET and SIZE; each successive dot steps down to the next finer unit. For example, PART( OFFSET(.3.1), SIZE(..2), 'COLUMN' ) means: fetch two words, starting with the first word of the third paragraph.
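The dotted notation can be read as one field per unit, from SIZE_UNIT down to finer units. The sketch below is an assumption-laden illustration, not a reference implementation: the `HIERARCHY` ordering, the helper names `parse_dotted` and `part_words`, the 1-based offsets, and the choice of blank lines as paragraph separators are all hypothetical.

```python
# Assumed ordering from coarsest to finest unit.
HIERARCHY = ["PAGESET", "PAGE", "COLUMN", "PARAGRAPH", "WORD", "CHARACTER"]


def parse_dotted(spec, size_unit):
    """Map a dotted value such as '.3.1' to {unit: number}, skipping empty fields."""
    start = HIERARCHY.index(size_unit)
    fields = spec.split(".")
    if start + len(fields) > len(HIERARCHY):
        raise ValueError("more dots than finer units available")
    return {
        HIERARCHY[start + i]: int(text)
        for i, text in enumerate(fields)
        if text
    }


def part_words(text, offset, size, size_unit):
    """Tiny PART for single-column text: paragraph/word offset and a word count."""
    off = parse_dotted(offset, size_unit)
    siz = parse_dotted(size, size_unit)
    paragraphs = [p.split() for p in text.split("\n\n")]
    words = paragraphs[off.get("PARAGRAPH", 1) - 1]   # offsets taken as 1-based
    first = off.get("WORD", 1) - 1
    return words[first:first + siz["WORD"]]
```

With unit 'COLUMN', the offset `.3.1` parses to paragraph 3, word 1, and the size `..2` parses to two words, which matches the reading given in the text.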

    Presentation views:

    Presentation views are special views that result in a presentation on a device (such as a screen, printer, PDA or sound card). These should contain as little information as possible beyond what is necessary for formatting and presentation. To be clear, a presentation view is not the datum; it is the process that generates the datum in a format fit for presentation. HTML_out would be a presentation view. Presentation views are usually written in a programming language for reasons of efficiency.

    Aggregate views:

    It should be possible to create aggregate views and save them. An aggregate view specifies only limited information about presentation. Ideally, an aggregate view will specify structural information about how other data fit into the view. Decisions on how to display something are left to presentation views. Aggregate views are also processes; however, users can create and edit them visually or by writing commands in a specific data-manipulation language. Aggregate views are stored internally as compiled code.

    Keeping a clear separation between presentation views and aggregate views allows multiple presentations of the same aggregate data. It also allows non-presentation aggregate views to be combined into more complex views without getting mixed up with lower-level presentation where that is not desirable.

    Example of an aggregate view: if I had defined one cubic meter of uniform bricks and wanted to build a house, the aggregate view of the house would be the instructions on where, and in which order, to lay each brick (NOT the house!). The execution of the "house" view would be the house. Note that the house could not yet be presented, as it lacks a presentation view on top of it. Now, I could put this house in a 3D scene, add some lights and a camera, and generate a 3D rendering of it - THAT would be the presentation!
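The brick example can be mimicked in a few lines. This is a toy sketch: the house is reduced to a flat wall on a 2D grid, and `house_view`, `execute` and `ascii_out` are hypothetical names chosen for illustration.

```python
def house_view(width, height):
    """Aggregate view: ordered instructions on where to lay each brick."""
    return [(x, y) for y in range(height) for x in range(width)]


def execute(instructions):
    """Executing the aggregate view yields the house: a set of occupied cells."""
    return set(instructions)


def ascii_out(house):
    """Presentation view: render the house as text, top row first."""
    width = max(x for x, _ in house) + 1
    height = max(y for _, y in house) + 1
    rows = (
        "".join("#" if (x, y) in house else " " for x in range(width))
        for y in reversed(range(height))
    )
    return "\n".join(rows)
```

The instructions, the executed house and the rendered text are three different things. A second presentation view (say, one that only counts bricks) could be applied to the same executed house without touching `house_view`.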

    Snapshots

    Sometimes it may be impractical to store a complex datum as an aggregate when it is intended for frequent viewing. For example, consider storing a typical GIMP layered image together with the history of cuts, fills and other actions that result in the final image. In such cases, the system should generate snapshots containing the result of an aggregate, in order to reduce the time needed to build the aggregate. For time-consuming presentation views, the system will create a snapshot of the presentation view for quick display (e.g. a raster image). Ideally, the presentation view type will be a primitive datum. The snapshot would be marked invalid when the component data change, and would be rebuilt on the next aggregation or presentation request.
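The invalidate-and-rebuild behaviour described above amounts to a small cache. The sketch below assumes a hypothetical `Snapshot` class whose components are held in a dictionary; a real system would track component changes itself instead of relying on an explicit `update` call.

```python
class Snapshot:
    """Caches the result of a view and rebuilds it when a component changes."""

    def __init__(self, view, components):
        self.view = view                  # callable taking the component values
        self.components = components      # dict of name -> value
        self._result = None
        self._valid = False
        self.builds = 0                   # counts rebuilds, for illustration

    def update(self, name, value):
        self.components[name] = value
        self._valid = False               # mark the snapshot invalid

    def get(self):
        """Return the cached result, rebuilding it only if it is invalid."""
        if not self._valid:
            self._result = self.view(**self.components)
            self._valid = True
            self.builds += 1
        return self._result
```

Repeated `get` calls reuse the stored result; the view runs again only on the first request after an `update`.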

    Views on complex data:

    ELEMENT( NAME [,INDEX] )

    Returns the datum known by NAME in the target aggregate. If NAME holds an array of data, INDEX specifies which one to retrieve. Note that no additional presentation is attached to the datum returned (unless the datum is itself a presentation datum); a presentation view must be applied afterwards for that.
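A minimal sketch of ELEMENT, assuming the aggregate is a plain dictionary and that INDEX is 0-based. Both choices, and the decision to raise an error when INDEX is missing or superfluous, are assumptions that the text above does not settle.

```python
def element(aggregate, name, index=None):
    """Return the datum stored under `name`; `index` selects from an array of data."""
    datum = aggregate[name]
    if isinstance(datum, (list, tuple)):
        if index is None:
            raise ValueError(f"{name!r} holds several data; an index is required")
        return datum[index]
    if index is not None:
        raise ValueError(f"{name!r} holds a single datum; no index expected")
    return datum
```

For an aggregate such as `{"title": "NDI", "authors": ["Ann", "Bob"]}`, `element(article, "authors", 1)` returns the raw datum `"Bob"` with no presentation attached.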

    Queries on data:

    Every datum should implement a lookup interface that accepts an atomic datum as a parameter. The interface should allow incremental search forward and backward (with a transparent cursor) and report result totals.
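One possible shape for such a lookup interface, shown for a text datum only. The class name `Lookup`, its methods, and the choice to report hits as character positions are assumptions made for this sketch.

```python
class Lookup:
    """Cursor over the positions where `needle` occurs in a text datum."""

    def __init__(self, text, needle):
        self.hits = []
        pos = text.find(needle)
        while pos != -1:
            self.hits.append(pos)
            pos = text.find(needle, pos + 1)
        self.cursor = -1                  # before the first hit

    @property
    def total(self):
        return len(self.hits)

    def forward(self):
        """Move to the next hit, or return None when there is none."""
        if self.cursor + 1 >= len(self.hits):
            return None
        self.cursor += 1
        return self.hits[self.cursor]

    def backward(self):
        """Move to the previous hit, or return None when already at the first."""
        if self.cursor <= 0:
            return None
        self.cursor -= 1
        return self.hits[self.cursor]
```

The caller never handles the cursor directly: it only asks for the next or previous hit and for the total.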

    Complex queries on data:

    It should be possible to have complex queries with inclusion, span relationships, boolean expressions, etc.

    Timeline and Data changes

  • A complying library will provide support for data requiring change history, via transparent storage of diffs.
  • A complying library will provide support for automatic update of compound data when their components change.
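Change history via stored diffs could be sketched as below for text data, using the standard `difflib.SequenceMatcher`. The helper names `make_diff`, `apply_diff` and `History` are hypothetical, and keeping the initial version plus forward diffs is just one possible storage layout.

```python
from difflib import SequenceMatcher


def make_diff(old, new):
    """Record only the changed slices needed to turn `old` into `new`."""
    ops = SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
    return [(i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def apply_diff(old, diff):
    """Rebuild the newer version from `old` and a diff made by make_diff."""
    out, pos = [], 0
    for i1, i2, replacement in diff:
        out.append(old[pos:i1])           # unchanged text before the edit
        out.append(replacement)           # inserted or replacing text
        pos = i2                          # skip the replaced or deleted slice
    out.append(old[pos:])
    return "".join(out)


class History:
    """Keeps the initial text plus one diff per committed change."""

    def __init__(self, initial):
        self._base = initial
        self._current = initial
        self._diffs = []

    def commit(self, new):
        self._diffs.append(make_diff(self._current, new))
        self._current = new

    def version(self, n):
        """Return version n, where 0 is the initial datum."""
        text = self._base
        for diff in self._diffs[:n]:
            text = apply_diff(text, diff)
        return text
```

The client only calls `commit` and `version`; the diffs themselves stay hidden, which is one reading of "transparent" here.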
  • Representation units:

  • time axis
  • cartesian axis
  • angular axis
  • user-defined axes

    Appendix A. Possible SIZE_UNIT values

        text size:
        CHARACTER
        WORD
        PARAGRAPH
        COLUMN
        
        size:
        CENTIMETER
        METER
        KILOMETER
        LIGHTYEAR
        
        screen size:
        PIXEL
        
        angle:
        RADIAN
        DEGREE
        MINUTE
        
        temperature:
        CELSIUS
        FAHRENHEIT
        KELVIN
        
        time:
        MILLISECOND
        SECOND
        MINUTE
        HOUR
        DAY
        WEEK
        MONTH
        YEAR
        CENTURY