Note: This is subject to future work. Errors, ambiguities, and omissions are to be expected.
TAO is a clay that can take any shape. It includes only the most generic data types: trees and annotations (notes) to go with them. Any data type can be created out of those without need for specialized syntax.
Trees can be constrained in different ways to encode any data structure. Notes can be constrained to encode any data primitive.
All data in TAO naturally has implicit types, but explicit type annotations may be included as well via ops. Either typing discipline may be enforced on its own, or the two can be mixed.
All this can be achieved within the minimal syntax of TAO. Alternatively, a custom syntax may be laid on top. However, to reap the benefits of a UTAS, common data types should be represented in a standard way across domains.
Conventions are proposed here for sensible representations of the basic shapes¹ based on common patterns of various languages. This is a starting point for future standardization. Experimental adoption of TAO is needed to help that.
Generalizations of these conventions are independent of TAO and may be used with other syntaxes as well.
In practice, it is up to TAO endpoints² to decide on a mutually intelligible data representation on top of TAO, as is the case with any other TAS. TAO, though, is exceptionally flexible in that regard, offering maximum freedom.
The fundamental data types of TAO are defined by its grammar:
tao is zero or more parts, each of which is a tree or a note or an op.
tree is a tao in between a [ and a ].
note is one or more symbols that are not [ or ] or `.
op is the meta symbol ` followed by any symbol.
These are the building blocks of other patterns, which will now be described in terms of them.
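As an illustration, the grammar above can be sketched as a small recursive-descent parser in Python. This is a minimal sketch, not a reference implementation. The function names (parse, _parse_tao) and the tuple representation of parts are assumptions made here for illustration, as is the choice to raise ValueError on unbalanced brackets or on a meta symbol with nothing after it.

```python
def parse(source):
    """Parse a TAO string into a list of parts.

    Each part is a tuple (hypothetical representation):
    ("tree", [parts]), ("note", text) or ("op", symbol).
    """
    parts, _ = _parse_tao(source, 0, depth=0)
    return parts


def _parse_tao(source, pos, depth):
    parts = []
    note = []  # symbols of the note currently being accumulated

    def flush():
        # Close the current note, if any, and add it to the parts.
        if note:
            parts.append(("note", "".join(note)))
            note.clear()

    while pos < len(source):
        symbol = source[pos]
        if symbol == "[":
            # A tree is a tao in between a [ and a ].
            flush()
            inner, pos = _parse_tao(source, pos + 1, depth + 1)
            parts.append(("tree", inner))
        elif symbol == "]":
            if depth == 0:
                raise ValueError(f"unexpected ] at position {pos}")
            flush()
            return parts, pos + 1
        elif symbol == "`":
            # An op is the meta symbol followed by any symbol.
            if pos + 1 >= len(source):
                raise ValueError("meta symbol at end of input")
            flush()
            parts.append(("op", source[pos + 1]))
            pos += 2
        else:
            note.append(symbol)
            pos += 1

    if depth > 0:
        raise ValueError("missing ] before end of input")
    flush()
    return parts, pos


print(parse("key [value]"))
# [('note', 'key '), ('tree', [('note', 'value')])]
```

Because notes are accumulated greedily, two consecutive notes can never appear in the result, which mirrors the grammar.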
A value is a synonym for tao.
An emptiness is a tao with zero parts, i.e. an empty string.
It may sometimes be useful to define and distinguish different kinds of emptiness. This could be achieved with type annotations:
emptiness is []
a nil is [`: nil]
a null is [`: null]
a none is [`: none]
a unit is [`: unit]
`; or with prefix annotations:
a void is`: void []
a nothing is`: nothing []
an undefined is`: undefined []
an empty string is`: string []
These may be interpreted as different kinds of emptiness.
A space is a note which contains only whitespace.
For example, [ ] is a tao which contains a tree with a space.
A matter is a note which contains at least one symbol other than whitespace.
matter
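The distinction between a space and a matter can be sketched in Python as follows. The helper names are hypothetical, and treating "whitespace" as whatever Python's str.isspace accepts is an assumption; TAO endpoints may agree on a narrower definition.

```python
def is_space(note: str) -> bool:
    """A space: a note made up only of whitespace symbols."""
    # str.isspace() is False for the empty string, which fits the grammar:
    # a note has one or more symbols.
    return note.isspace()


def is_matter(note: str) -> bool:
    """A matter: a note with at least one non-whitespace symbol."""
    return any(not symbol.isspace() for symbol in note)
```

Under these definitions every note is either a space or a matter, never both.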
A key is usually a matter. A tao with 3 different keys:
this is a key []
as well as this []
and this []
Synonyms: scalar, a basic or simple data type.
A primitive is a tao with zero or more parts which are all notes or ops. Note that two consecutive notes are impossible by TAO’s grammar.
this is a primitive
A number may be a primitive which conforms to a specific number grammar.
For example, this tao conforms to a JSON definition of a number:
36.5
This however does not:
0xcafe
even though it may be reasonably interpreted as a number according to some other grammar.
It may be useful to distinguish between different kinds of numbers, such as integers, floating point, big, short, etc. Domain-specific grammars may be used for that.
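One way to picture "conforming to a number grammar" is to match the text of a primitive against a regular expression per grammar. The sketch below is illustrative only: JSON_NUMBER follows the JSON number syntax, while HEX_INTEGER and the function name number_kind are hypothetical choices standing in for a domain-specific grammar.

```python
import re

# JSON number syntax: optional minus, integer part without leading zeros,
# optional fraction, optional exponent.
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")

# A hypothetical domain-specific grammar for hexadecimal integers.
HEX_INTEGER = re.compile(r"0[xX][0-9a-fA-F]+")


def number_kind(primitive: str):
    """Return the name of the first grammar the primitive conforms to, or None."""
    if JSON_NUMBER.fullmatch(primitive):
        return "json number"
    if HEX_INTEGER.fullmatch(primitive):
        return "hex integer"
    return None


print(number_kind("36.5"))    # json number
print(number_kind("0xcafe"))  # hex integer
```

A primitive that matches none of the known grammars returns None here, leaving it to be handled as some other kind of primitive.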
Other specific primitives, such as booleans, may be defined analogously to numbers. For example:
is tao [true]
is dictionary [true]
is primitive [false]
This might be a dictionary with 3 boolean entries.
A generic primitive may be defined as a primitive which does not conform to any known grammar.
For example:
this is a generic primitive
In various syntaxes strings are commonly used as such a generic fallback data type when there is no adequate primitive defined in the grammar of the syntax itself. Such a catch-all primitive type is often indispensable.
A string may be defined as synonymous with generic or by a specific grammar:
"this is a string because it begins with a quote mark
or as a primitive with an explicit type annotation:
42`: string
The choice depends on the level of restrictiveness desired.
Synonyms: a composite or compound or complex data type.
A structure is a tao with at least one part which is a tree.
Synonyms: array, vector, sequence.
A list is a structure with no notes or ops, or with notes that are all spaces.
[parsley][sage][rosemary][thyme]
or equivalent:
[parsley]
[sage]
[rosemary]
[thyme]
The same as a JSON array:
["parsley", "sage", "rosemary", "thyme"]
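The list convention can be checked mechanically once a tao has been parsed. The sketch below assumes a hypothetical parsed form in which each part is a tuple ("tree", [parts]), ("note", text) or ("op", symbol); the function names are likewise illustrative, and whitespace is taken to mean whatever Python's str.isspace accepts.

```python
def is_list(parts):
    """True if the parts form a list: at least one tree, no ops,
    and no notes other than spaces."""
    has_tree = any(kind == "tree" for kind, _ in parts)
    only_trees_and_spaces = all(
        kind == "tree" or (kind == "note" and value.isspace())
        for kind, value in parts
    )
    return has_tree and only_trees_and_spaces


def list_items(parts):
    """Return the value (inner parts) of each tree in a list."""
    if not is_list(parts):
        raise ValueError("parts do not form a list")
    return [value for kind, value in parts if kind == "tree"]


# Hand-written parsed form of the multi-line example above.
herbs = [
    ("tree", [("note", "parsley")]),
    ("note", "\n"),
    ("tree", [("note", "sage")]),
    ("note", "\n"),
    ("tree", [("note", "rosemary")]),
    ("note", "\n"),
    ("tree", [("note", "thyme")]),
]

names = ["".join(text for _, text in item) for item in list_items(herbs)]
print(names)  # ['parsley', 'sage', 'rosemary', 'thyme']
```

Dropping the newline notes from herbs gives the single-line form, and the result is the same, which is the sense in which the two spellings are equivalent.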
See also a note for the extreme syntactic minimalist.
A pair is a structure with a single key followed by a single tree. The value of the tree is considered the value of the pair.
Synonyms: dict, object, record, struct, map, symbol table, hash table, keyed list, associative array.
A map is a structure with one or more pairs. This means that a single pair is the simplest dictionary.
songs [
  [
    title [Scarborough Fair / Canticle]
    length [3:10]
  ]
  [
    title [Patterns]
    length [2:45]
  ]
  [
    title [Cloudy]
    length [2:15]
  ]
]
Equivalent JSON object:
{
  "songs": [
    {
      "title": "Scarborough Fair / Canticle",
      "length": "3:10"
    },
    {
      "title": "Patterns",
      "length": "2:45"
    },
    {
      "title": "Cloudy",
      "length": "2:15"
    }
  ]
}
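The correspondence with JSON can be made concrete with a small converter. This is a sketch under several assumptions: parts are given in a hypothetical parsed form of ("tree", [parts]), ("note", text) and ("op", symbol) tuples; keys are notes with surrounding whitespace trimmed; notes that are only whitespace are ignored inside structures; every primitive becomes a string; and ops are not handled at all.

```python
import json


def to_python(parts):
    """Convert parsed parts to a str (primitive), list, or dict (map)."""
    if not any(kind == "tree" for kind, _ in parts):
        # Primitive; emptiness becomes the empty string.
        if any(kind != "note" for kind, _ in parts):
            raise ValueError("ops are not handled in this sketch")
        return "".join(text for _, text in parts)
    # Structure: ignore notes that are only whitespace.
    significant = [p for p in parts if not (p[0] == "note" and p[1].isspace())]
    if all(kind == "tree" for kind, _ in significant):
        return [to_python(inner) for _, inner in significant]
    # Map: keys and trees must strictly alternate.
    keys, trees = significant[0::2], significant[1::2]
    if (len(keys) != len(trees)
            or any(kind != "note" for kind, _ in keys)
            or any(kind != "tree" for kind, _ in trees)):
        raise ValueError("parts form neither a primitive, a list, nor a map")
    return {key.strip(): to_python(inner)
            for (_, key), (_, inner) in zip(keys, trees)}


def note(text):
    return ("note", text)


def tree(*parts):
    return ("tree", list(parts))


def song(title, length):
    return tree(note(" title "), tree(note(title)),
                note(" length "), tree(note(length)), note(" "))


# Hand-written parsed form of the songs example above.
songs = [
    note("songs "),
    tree(note(" "), song("Scarborough Fair / Canticle", "3:10"),
         note(" "), song("Patterns", "2:45"),
         note(" "), song("Cloudy", "2:15"), note(" ")),
]

print(json.dumps(to_python(songs), indent=2))
```

The printed JSON has the same content as the object shown above. Note that a Python dict silently keeps only the last value for a repeated key, so this sketch sidesteps questions of orderedness and duplicate keys rather than answering them.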
See also a note on orderedness, on concatenability, and on keys.
Different variants of the above definitions are possible, but these capture the general theme.
For example, if we introduce comments or type annotations, then the above definitions have to be adjusted to accommodate them.