TAO: shapes

Note: This is subject to future work. Errors, ambiguities, and omissions are to be expected.

TAO is a clay that can take any shape. It includes only the most generic data types: trees and annotations (notes) to go with them. Any data type can be built out of those without the need for specialized syntax.

Trees can be constrained in different ways to encode any data structure. Notes can be constrained to encode any data primitive.

All data in TAO naturally has implicit types, but explicit type annotations may be included as well via ops. One typing discipline may be enforced, or the two may be mixed.

All this can be achieved within the minimal syntax of TAO. Alternatively, a custom syntax may be laid on top. However, to reap the benefits of a UTAS, common data types should be represented in a standard way across domains.

Conventions are proposed here for sensible representations of the basic shapes[1], based on common patterns in various languages. This is a starting point for future standardization. Experimental adoption of TAO is needed to help with that.

Generalizations of these conventions are independent of TAO and may be used with other syntaxes as well.

In practice, it is up to TAO endpoints[2] to decide on a mutually intelligible data representation on top of TAO, as is the case with any other TAS. TAO, though, is exceptionally flexible in that regard, offering maximum freedom.

Fundamental

The fundamental data types of TAO are defined by its grammar:

These are the building blocks of other patterns which will now be described in terms of them.
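As a rough illustration of these building blocks, the following Python sketch parses text into parts. The grammar is not reproduced on this page, so the sketch assumes one: a tao is a sequence of parts, a tree is a tao wrapped in `[` and `]`, an op is a backtick followed by any single character, and a note is a maximal run of other characters. The names `parse_tao` and `_parse_parts` and the part representation are hypothetical, chosen only for this example.

```python
from typing import List, Tuple, Union

# Assumed representation (illustrative, not part of TAO):
#   note -> str, op -> ("op", char), tree -> list of parts
Part = Union[str, Tuple[str, str], list]


def parse_tao(text: str) -> List[Part]:
    """Parse text into a list of parts under the assumed grammar."""
    parts, pos = _parse_parts(text, 0)
    if pos != len(text):
        raise ValueError(f"unmatched ']' at position {pos}")
    return parts


def _parse_parts(text: str, pos: int):
    """Read parts until the end of input or an unconsumed ']'."""
    parts: List[Part] = []
    note: List[str] = []  # characters of the note being accumulated

    def flush_note() -> None:
        if note:
            parts.append("".join(note))
            note.clear()

    while pos < len(text) and text[pos] != "]":
        ch = text[pos]
        if ch == "[":
            flush_note()
            subtree, pos = _parse_parts(text, pos + 1)
            if pos >= len(text):
                raise ValueError("unclosed '['")
            parts.append(subtree)
            pos += 1  # skip the closing ']'
        elif ch == "`":
            flush_note()
            if pos + 1 >= len(text):
                raise ValueError("op marker at end of input")
            parts.append(("op", text[pos + 1]))
            pos += 2
        else:
            note.append(ch)
            pos += 1
    flush_note()
    return parts, pos


if __name__ == "__main__":
    print(parse_tao("key [value]"))
```

Because a note is accumulated greedily, two consecutive notes can never appear in the result, which matches the remark about notes made under Primitive.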

Value

A value is a synonym for tao.

Emptiness

An emptiness is a tao with zero parts, i.e. an empty string.

It may sometimes be useful to define and distinguish different kinds of emptiness. This could be achieved with type annotations:

emptiness is []

a nil is [`: nil]
a null is [`: null]
a none is [`: none]
a unit is [`: unit]

`; or with prefix annotations:

a void is`: void []
a nothing is`: nothing []
an undefined is`: undefined []
an empty string is`: string []

These may be interpreted as different kinds of emptiness.

Space

A space is a note which contains only whitespace.

For example:

[   ]

is a tao which contains a tree with a space.

Matter

A matter is a note which contains at least one symbol other than whitespace.

matter
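The distinction between a space and a matter can be sketched as two predicates over the text of a note. This is only an illustration: the function names are hypothetical, and Python's `str.isspace` is assumed to be an acceptable definition of whitespace, which TAO itself may define differently.

```python
def is_space(note: str) -> bool:
    """True if the note is non-empty and contains only whitespace."""
    return note.isspace()  # str.isspace() is False for the empty string


def is_matter(note: str) -> bool:
    """True if the note contains at least one non-whitespace symbol."""
    return note != "" and not note.isspace()


if __name__ == "__main__":
    print(is_space("   "), is_matter("matter"))
```

Under these definitions every non-empty note is either a space or a matter, never both.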

Key

A key is usually a matter. A tao with 3 different keys:

this is a key []
as well as this []
and this []

Primitive

Synonyms: scalar, a basic or simple data type.

A primitive is a tao with zero or more parts which are all notes or ops. Note that two consecutive notes are impossible by TAO’s grammar.

this is a primitive

Number

A number may be defined as a primitive which conforms to a specific number grammar.

For example, this tao conforms to the JSON definition of a number:

36.5

This however does not:

0xcafe

even though it may be reasonably interpreted as a number according to some other grammar.

It may be useful to distinguish between different kinds of numbers, such as integers, floating point, big, short, etc. Domain-specific grammars may be used for that.
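Checking a primitive against a number grammar can be sketched with a regular expression. The pattern below follows the JSON number grammar (optional minus sign, integer part without leading zeros, optional fraction, optional exponent). The name `is_json_number` is hypothetical, and the primitive is assumed to be given as a plain string with no surrounding whitespace.

```python
import re

# JSON number grammar: -? int frac? exp?
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")


def is_json_number(primitive: str) -> bool:
    """True if the whole primitive matches the JSON number grammar."""
    return JSON_NUMBER.fullmatch(primitive) is not None


if __name__ == "__main__":
    for text in ("36.5", "0xcafe"):
        print(text, is_json_number(text))
```

A domain that wants to accept hexadecimal literals such as 0xcafe would swap in a different pattern; the primitive itself stays the same.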

Other specific primitives

Other specific primitives, such as booleans, may be defined analogously to numbers. For example:

is tao [true]
is dictionary [true]
is primitive [false]

This might be interpreted as a dictionary with 3 boolean entries.

Generic

A generic primitive may be defined as a primitive which does not conform to any known grammar.

For example:

this is a generic primitive

In various syntaxes, strings are commonly used as such a generic fallback data type when there is no adequate primitive defined in the grammar of the syntax itself. Such a catch-all primitive type is often indispensable.

String

A string may be defined as synonymous with generic or by a specific grammar:

"this is a string because it begins with a quote mark

or as a primitive with an explicit type annotation:

42`: string

Which definition to use depends on the level of restrictiveness desired.

Structures

Synonyms: a composite or compound or complex data type.

A structure is a tao with at least one part which is a tree.
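The split between primitives and structures can be sketched as a pair of predicates over an already parsed tao. The representation is an assumption made for this example only: a tao is a Python list of parts, where a note is a `str`, an op is a tuple `("op", char)`, and a tree is a nested list. The function names are hypothetical.

```python
def is_structure(parts: list) -> bool:
    """True if at least one part is a tree (represented as a nested list)."""
    return any(isinstance(part, list) for part in parts)


def is_primitive(parts: list) -> bool:
    """True if every part is a note or an op, including the case of zero parts."""
    return not is_structure(parts)


if __name__ == "__main__":
    print(is_primitive(["this is a primitive"]))
    print(is_structure(["title ", ["Patterns"]]))
```

With these definitions every tao is exactly one of the two, and an emptiness counts as a primitive because it has zero parts.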

List

Synonyms: array, vector, sequence.

A list is a structure with no notes or ops, or with notes that are all spaces.

[parsley][sage][rosemary][thyme]

or equivalent:

[parsley]
[sage]
[rosemary]
[thyme]

This is the same as the JSON array:

["parsley", "sage", "rosemary", "thyme"]

See also a note for the extreme syntactic minimalist.
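The list definition and its JSON equivalence can be sketched over a parsed tao. As an assumption for this example, a tao is a Python list of parts in which a note is a `str`, an op is a tuple, and a tree is a nested list. The sketch also assumes that a list contains no ops at all and that every item is a primitive. The names `is_list` and `list_items` are hypothetical.

```python
import json


def is_list(parts: list) -> bool:
    """True if there is at least one tree and every other part is a space note."""
    has_tree = any(isinstance(p, list) for p in parts)
    rest_ok = all(
        isinstance(p, list) or (isinstance(p, str) and p.isspace())
        for p in parts
    )
    return has_tree and rest_ok


def list_items(parts: list) -> list:
    """Return the items of a list, each assumed to be a primitive made of notes."""
    if not is_list(parts):
        raise ValueError("not a list")
    return [
        "".join(n for n in tree if isinstance(n, str))
        for tree in parts
        if isinstance(tree, list)
    ]


if __name__ == "__main__":
    # Parsed form of the multi-line example: trees separated by newline notes.
    herbs = [["parsley"], "\n", ["sage"], "\n", ["rosemary"], "\n", ["thyme"]]
    print(json.dumps(list_items(herbs)))
```

Both layouts of the herb list produce the same items, because the space notes between trees are ignored.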

Pair

A pair is a structure with a single key followed by a single tree. The value of the tree is considered the value of the pair.

Dictionary

Synonyms: dict, object, record, struct, map, symbol table, hash table, keyed list, associative array.

A dictionary is a structure with one or more pairs. This means that a single pair is the simplest dictionary.

songs [
    [
        title [Scarborough Fair / Canticle]
        length [3:10]
    ]
    [
        title [Patterns]
        length [2:45]
    ]
    [
        title [Cloudy]
        length [2:15]
    ]
]

Equivalent JSON object:

{
    "songs": [
        {
            "title": "Scarborough Fair / Canticle",
            "length": "3:10"
        },
        {
            "title": "Patterns",
            "length": "2:45"
        },
        {
            "title": "Cloudy",
            "length": "2:15"
        }
    ]
}

See also a note on orderedness, on concatenability, and on keys.
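The mapping from such a tao to a JSON-like value can be sketched as follows. This is a simplified illustration under several assumptions: the input contains no ops, keys are trimmed of surrounding whitespace, primitives are kept as strings, and duplicate keys simply overwrite earlier ones. The names `parse` and `to_value` are hypothetical.

```python
import re


def parse(text: str) -> list:
    """Tiny parser for op-free TAO: notes become str, trees become lists."""
    stack = [[]]
    for token in re.findall(r"\[|\]|[^\[\]]+", text):
        if token == "[":
            stack.append([])
        elif token == "]":
            if len(stack) == 1:
                raise ValueError("unmatched ']'")
            tree = stack.pop()
            stack[-1].append(tree)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unclosed '['")
    return stack[0]


def to_value(parts: list):
    """Interpret parts as a primitive (str), a list, or a dictionary."""
    trees = [p for p in parts if isinstance(p, list)]
    notes = [p for p in parts if isinstance(p, str)]
    if not trees:  # primitive: keep the text as a string
        return "".join(notes)
    if all(n.isspace() for n in notes):  # list: only trees and spaces
        return [to_value(t) for t in trees]
    result, key = {}, None  # dictionary: each key is followed by a tree
    for p in parts:
        if isinstance(p, str):
            if not p.isspace():
                key = p.strip()
        else:
            if key is None:
                raise ValueError("tree without a key in a dictionary")
            result[key] = to_value(p)
            key = None
    if key is not None:
        raise ValueError("key without a tree")
    return result


if __name__ == "__main__":
    print(to_value(parse("songs [[title [Patterns] length [2:45]]]")))
```

Applied to the songs example above, `to_value` yields the same nesting as the equivalent JSON object, with every leaf kept as a string.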

Comments, type annotations, and other extensions

Different variants of the above definitions are possible, but these capture the general theme.

For example, if we introduce comments or type annotations, then the above definitions have to be adjusted to accommodate them.


  1. The terms data type, pattern, and shape may be used interchangeably.

  2. A TAO endpoint is any entity that reads or writes TAO.