Note: This is subject to future work. Errors, ambiguities, and omissions are to be expected.
TAO is a clay that can take any shape. It includes only the most generic data types: trees and annotations (notes) to go with them. Any data type can be created out of those without need for specialized syntax.
Trees can be constrained in different ways to encode any data structure. Notes can be constrained to encode any data primitive.
All data in TAO naturally has implicit types, but explicit type annotations may be included as well via ops. Either typing discipline may be enforced on its own, or the two may be mixed.
All this can be achieved within the minimal syntax of TAO. Alternatively, a custom syntax may be laid on top. However, to reap the benefits of a UTAS, common data types should be represented in a standard way across domains.
Conventions are proposed here for sensible representations of the basic shapes¹ based on common patterns of various languages. This is a starting point for future standardization. Experimental adoption of TAO is needed to support that effort.
Generalizations of these conventions are independent of TAO and may be used with other syntaxes as well.
In practice, it is up to TAO endpoints² to decide on a mutually intelligible data representation on top of TAO, as is the case with any other TAS. TAO though is exceptionally flexible in that regard, offering maximum freedom.
The fundamental data types of TAO are defined by its grammar:
tao is zero or more parts each of which is a tree or a note or an op
tree is a tao inbetween a [ and a ]
note is one or more symbols that are not [ or ] or `
op is the meta symbol ` followed by any symbol
These are the building blocks of other patterns, which will now be described in terms of them.
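As a rough illustration of the grammar above, here is a minimal recursive-descent parser sketch in Python. The names `parse_tao`, `_parse_parts`, and `Op`, as well as the choice to represent a tao as a Python list (notes as `str`, trees as nested lists), are assumptions made for this example and are not part of TAO itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """An op: the single symbol that follows the meta symbol `."""
    symbol: str


def parse_tao(text):
    """Parse text into a tao: a list of parts (str note, list tree, or Op)."""
    parts, _ = _parse_parts(text, 0, 0)
    return parts


def _parse_parts(text, pos, depth):
    """Parse parts from pos until the enclosing tree closes or input ends."""
    parts = []
    while pos < len(text):
        ch = text[pos]
        if ch == "[":
            # tree: a tao between [ and ]
            inner, pos = _parse_parts(text, pos + 1, depth + 1)
            parts.append(inner)
        elif ch == "]":
            if depth == 0:
                raise ValueError(f"unmatched ] at position {pos}")
            return parts, pos + 1
        elif ch == "`":
            # op: the meta symbol followed by any one symbol
            if pos + 1 >= len(text):
                raise ValueError("meta symbol ` at end of input")
            parts.append(Op(text[pos + 1]))
            pos += 2
        else:
            # note: one or more symbols that are not [ or ] or `
            start = pos
            while pos < len(text) and text[pos] not in "[]`":
                pos += 1
            parts.append(text[start:pos])
    if depth > 0:
        raise ValueError("missing ] before end of input")
    return parts, pos
```

With this representation, `parse_tao("[parsley][sage]")` yields two trees that each hold one note, and `parse_tao("")` yields an empty list.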
A value is a synonym for tao.
An emptiness is a tao with zero parts, i.e. an empty string.
It may sometimes be useful to define and distinguish different kinds of emptiness. This could be achieved with type annotations:
emptiness is []
a nil is [`: nil]
a null is [`: null]
a none is [`: none]
a unit is [`: unit]
`; or with prefix annotations:
a void is`: void []
a nothing is`: nothing []
an undefined is`: undefined []
an empty string is`: string []
These may be interpreted as different kinds of emptiness.
A space is a note which contains only whitespace.
For example:
[ ]
is a tao which contains a tree with a space.
A matter is a note which contains at least one symbol other than whitespace.
matter
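The distinction between a space and a matter can be sketched as two small Python predicates. The function names are hypothetical, and treating "whitespace" as whatever Python's `str.strip()` removes is an assumption; an actual TAO endpoint may define whitespace differently.

```python
def is_space(note):
    """True if the note is non-empty and contains only whitespace."""
    # Assumption: whitespace means what str.strip() strips.
    return len(note) > 0 and note.strip() == ""


def is_matter(note):
    """True if the note contains at least one non-whitespace symbol."""
    return note.strip() != ""
```

Under these definitions every note is either a space or a matter, never both, since a note has at least one symbol by the grammar.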
A key is usually a matter. A tao with 3 different keys:
this is a key []
as well as this []
and this []
Synonyms: scalar, a basic or simple data type.
A primitive is a tao with zero or more parts which are all notes or ops. Note that two consecutive notes are impossible by TAO’s grammar.
this is a primitive
A number may be a primitive which conforms to a specific number grammar.
For example this tao conforms to a JSON definition of a number:
36.5
This however does not:
0xcafe
even though it may be reasonably interpreted as a number according to some other grammar.
It may be useful to distinguish between different kinds of numbers, such as integers, floating point, big, short, etc. Domain-specific grammars may be used for that.
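Checking a primitive against a number grammar can be sketched with a regular expression. The pattern below follows the JSON number grammar; the name `is_json_number` and the decision to trim surrounding whitespace before matching are assumptions of this example.

```python
import re

# JSON number grammar: optional minus, integer part without leading zeros,
# optional fraction, optional exponent.
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_json_number(note):
    """True if the note, trimmed of whitespace, is a JSON number."""
    # Assumption: surrounding whitespace is not significant.
    return JSON_NUMBER.fullmatch(note.strip()) is not None
```

Here `is_json_number("36.5")` holds while `is_json_number("0xcafe")` does not, matching the two examples above. A grammar for another kind of number could be added as a separate pattern in the same way.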
Other specific primitives, such as booleans may be defined analogously to numbers. For example:
is tao [true]
is dictionary [true]
is primitive [false]
This might be a dictionary with 3 boolean entries.
A generic primitive may be defined as a primitive which does not conform to any known grammar.
For example:
this is a generic primitive
In various syntaxes strings are commonly used as such a generic fallback data type when there is no adequate primitive defined in the grammar of the syntax itself. Such a catch-all primitive type is often indispensable.
A string may be defined as synonymous with a generic primitive, or by a specific grammar:
"this is a string because it begins with a quote mark
or as a primitive with an explicit type annotation:
42`: string
The choice depends on the level of restrictiveness desired.
Synonyms: a composite or compound or complex data type.
A structure is a tao with at least one part which is a tree.
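The definitions so far can be summarized in a small classification sketch. It assumes a parsed tao is a Python list of parts in which a tree is a nested list and any non-list part is a note or an op; both this representation and the name `classify` are hypothetical. By the definition above an emptiness is also a primitive, so the function reports the most specific category.

```python
def classify(tao):
    """Classify a parsed tao as 'emptiness', 'primitive', or 'structure'.

    Assumed representation: a tao is a list of parts; a tree is a nested
    list, and any non-list part is a note or an op.
    """
    if not tao:
        # Zero parts: an emptiness (which is also a primitive).
        return "emptiness"
    if any(isinstance(part, list) for part in tao):
        # At least one part is a tree.
        return "structure"
    # All parts are notes or ops.
    return "primitive"
```

For instance, `classify(["this is a primitive"])` gives `"primitive"`, while a key followed by a tree, such as `["key ", ["value"]]`, gives `"structure"`.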
Synonyms: array, vector, sequence.
A list is a structure with no notes or ops or with notes that are all spaces.
[parsley][sage][rosemary][thyme]
or equivalent:
[parsley]
[sage]
[rosemary]
[thyme]
The same as a JSON array:
["parsley", "sage", "rosemary", "thyme"]
See also a note for the extreme syntactic minimalist.
A pair is a structure with a single key followed by a single tree. The value of the tree is considered the value of the pair.
Synonyms: dict, object, record, struct, map, symbol table, hash table, keyed list, associative array.
A dictionary is a structure with one or more pairs. This means that a single pair is the simplest dictionary.
songs [
  [
    title [Scarborough Fair / Canticle]
    length [3:10]
  ]
  [
    title [Patterns]
    length [2:45]
  ]
  [
    title [Cloudy]
    length [2:15]
  ]
]
Equivalent JSON object:
{
  "songs": [
    {
      "title": "Scarborough Fair / Canticle",
      "length": "3:10"
    },
    {
      "title": "Patterns",
      "length": "2:45"
    },
    {
      "title": "Cloudy",
      "length": "2:15"
    }
  ]
}
See also a note on orderedness, on concatenability, and on keys.
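The correspondence between TAO lists and dictionaries and their JSON counterparts can be sketched in Python as follows. This is only one possible reading of the conventions: it assumes a parsed tao is a Python list in which notes are `str` and trees are nested lists, that keys are trimmed of surrounding whitespace, that a repeated key overwrites the earlier entry, that primitives are kept as plain strings, and that ops are absent. The name `to_value` is hypothetical.

```python
import json


def to_value(tao):
    """Convert a parsed tao into a str, list, or dict (assumed conventions)."""
    trees = [part for part in tao if isinstance(part, list)]
    notes = [part for part in tao if isinstance(part, str)]
    if len(trees) + len(notes) != len(tao):
        raise ValueError("ops are not handled in this sketch")
    if not trees:
        # Primitive (or emptiness): keep the text as a string.
        return "".join(notes)
    if all(note.strip() == "" for note in notes):
        # List: trees only, optionally separated by spaces.
        return [to_value(tree) for tree in trees]
    # Dictionary: each key (a matter) is followed by exactly one tree.
    result, key = {}, None
    for part in tao:
        if isinstance(part, list):
            if key is None:
                raise ValueError("tree without a preceding key")
            result[key] = to_value(part)
            key = None
        elif part.strip() != "":
            key = part.strip()
    if key is not None:
        raise ValueError(f"key {key!r} has no tree")
    return result


# Hand-built parse of a two-song version of the example above.
songs = [
    "songs ",
    [
        " ",
        [" title ", ["Patterns"], " length ", ["2:45"], " "],
        " ",
        [" title ", ["Cloudy"], " length ", ["2:15"], " "],
        " ",
    ],
]
print(json.dumps(to_value(songs), indent=2))
```

Keeping primitives as strings sidesteps the question of which number or boolean grammar applies; an endpoint that agrees on such grammars could convert those values in the primitive branch.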
Different variants of the above definitions are possible, but these capture the general theme.
For example, if we introduce comments or type annotations, the above definitions have to be adjusted to accommodate them.