Published close to 02.02.2020 02:02:20.20 CET.
This specification introduces an anonymous simple data interchange Format [2], with potential advantages over JSON, YAML, XML, S-expressions, TOML, HOCON, and others.
In describing the Format, JSON will be used as a reference. Reading the description at JSON.org is advised.
The Format is very simple. It is essentially text in which paired Brackets ([ and ]) define structural relationships (i.e. nesting). On top of that there is a generalized Escape mechanism which allows encoding Brackets as part of text as well as mixing in more sophisticated grammars.
The simplicity and the very small number of building blocks or syntactical elements make the Format easier to read and write than any other format, both for humans and machines. The Format is completely programming language-independent and it is trivial to write a parser for it.
There exists only one structural element (the Pair) on top of which others (e.g. conventional dictionaries or lists) can be built without the need for separate syntax (as for objects and arrays in JSON).
Conversely, there is only one non-structural element (the Fragment), on top of which others can be defined (strings, numbers, booleans, …).
You may want to look at some examples before proceeding.
The Grammar is (notation borrowed from JSON.org):
Document
    Fragments
    Pairs

Fragments
    Fragment
    Fragment Fragments

Fragment
    Escape
    Characters

Escape
    Begin Splice Escaped

Pairs
    Pair
    Pair Pairs

Pair
    Fragments Begin Document End

Begin
    '['

End
    ']'

Splice
    '\'

Block
    '>'

Characters
    Possibly empty sequence of characters excluding Brackets.

Whitespace
    Same as JSON's whitespace.

Escaped
    See The Escape mechanism.
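To illustrate how little machinery the Grammar needs, here is a minimal Python sketch of a parser for the Escape-free subset of the Format. The function name `parse` and the returned shape (a list of `(prefix, child)` Pairs plus any trailing text) are assumptions made for this illustration and are not part of the specification. Escapes are not handled, so every Bracket is treated as structural.

```python
def parse(text):
    """Parse Escape-free text into (pairs, trailing).

    Each Pair is (prefix, child): prefix is the text before '[' and
    child is the (pairs, trailing) result for the text inside the Brackets.
    This shape is an assumption of the sketch, not part of the Format.
    """
    stack = []            # enclosing documents: (pairs so far, pending prefix)
    pairs, buf = [], []   # state of the document currently being read
    for offset, ch in enumerate(text):
        if ch == "[":
            # Begin: remember the parent and the prefix, start a child document.
            stack.append((pairs, "".join(buf)))
            pairs, buf = [], []
        elif ch == "]":
            # End: close the child document and attach it to its parent.
            if not stack:
                raise ValueError(f"unexpected ']' at offset {offset}")
            child = (pairs, "".join(buf))
            pairs, prefix = stack.pop()
            pairs.append((prefix, child))
            buf = []
        else:
            buf.append(ch)
    if stack:
        raise ValueError("unclosed '['")
    return pairs, "".join(buf)
```

For instance, `parse("Name [x]")` yields one Pair whose prefix is `"Name "` and whose child is the leaf text `"x"`. The sketch keeps trailing text after the last Pair (typically Whitespace) instead of rejecting it, which is a leniency chosen here rather than something the Grammar states.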
The Grammar goes together with the very general and expressive Escape mechanism which, among other things, allows encoding Brackets as non-significant characters in Fragments.
An Escape sequence starts with Begin followed by Splice. The extent and meaning of the rest of the Escaped portion that follows varies depending on how it starts. Below is an incomplete semi-formal description of this:
if Escaped portion starts with: (Begin | End | Splice), here called Character
    then sequence terminates at: Character
    and Resolves to: Character
    example: [\[ Resolves to [

else if Escaped portion starts with: Whitespace Characters End, where End is the Terminator
    then sequence terminates at: Terminator
    and Resolves to: portion between Splice and Terminator [4]
    example: [\ text ] Resolves to text, including the Whitespace around it

else if Escaped portion starts with: Block Fragments End Line A, where Fragments is the Terminator
    then sequence terminates at: first occurrence of Line B followed by the Resolved Terminator
    and Resolves to: portion between Line A and Line B
    example: See Heredoc Example

else if Escaped portion starts with: Fragments End, where Fragments is the Terminator
    then sequence terminates at: first occurrence of the Resolved Terminator
    and Resolves to: portion between End and the Resolved Terminator
    example: [\EOT]textEOT Resolves to text
The general idea is that a Splice may be followed by a definition of a new grammar which supersedes the previous grammar. The definition should specify the grammar’s Terminator which, when encountered, restores the previous grammar.
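The inline cases described above (everything except the Block form) can be sketched in Python as follows. This is only an illustration: the function name `resolve_inline_escape`, the `(resolved, next_index)` return value, and the choice to read the Terminator literally (without Resolving Escapes nested inside it) are assumptions, not part of the specification.

```python
JSON_WHITESPACE = " \t\n\r"

def resolve_inline_escape(text, i):
    """Resolve the Escape that starts at text[i].

    Returns (resolved_text, index_just_past_the_escape).
    The Block ('>') form is deliberately left out of this sketch.
    """
    if text[i:i + 2] != "[\\":
        raise ValueError("an Escape starts with Begin followed by Splice")
    first = text[i + 2:i + 3]
    if first == "":
        raise ValueError("Escape ends prematurely")
    if first in "[]\\":
        # A single escaped Begin, End, or Splice character.
        return first, i + 3
    if first == ">":
        raise NotImplementedError("Block form is not covered by this sketch")
    end = text.find("]", i + 2)
    if end == -1:
        raise ValueError("missing End")
    if first in JSON_WHITESPACE:
        # Whitespace form: keep everything between Splice and End.
        return text[i + 2:end], end + 1
    # Terminator form: the text before End names the Terminator (assumed
    # literal here); the content runs up to its next occurrence.
    terminator = text[i + 2:end]
    stop = text.find(terminator, end + 1)
    if stop == -1:
        raise ValueError("Terminator not found")
    return text[end + 1:stop], stop + len(terminator)
```

Returning the index just past the Escape lets a caller continue scanning the rest of the text with the ordinary Grammar, which mirrors the idea that the Terminator restores the previous grammar.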
The following Escape sequence:
[\>EOT]
text
EOT
Resolves to:
text
This functions similarly to the mechanism of here documents available in some programming languages and formats.
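A minimal Python sketch of Resolving this heredoc-like form is given below. It assumes that a Line is a single "\n" character and that the Terminator is read literally. The specification leaves both points open, and the function name `resolve_heredoc` is hypothetical.

```python
def resolve_heredoc(text, i):
    """Resolve a Block Escape such as '[\\>EOT]' starting at text[i].

    Returns (resolved_text, index_just_past_the_terminator).
    Assumes a Line is '\n' and the Terminator contains no Escapes.
    """
    if text[i:i + 3] != "[\\>":
        raise ValueError("expected Begin, Splice, Block")
    end = text.find("]", i + 3)
    if end == -1:
        raise ValueError("missing End")
    terminator = text[i + 3:end]
    if text[end + 1:end + 2] != "\n":
        raise ValueError("expected a line break after End")
    # The closing line break must be directly followed by the Terminator.
    stop = text.find("\n" + terminator, end + 1)
    if stop == -1:
        raise ValueError("Terminator not found")
    # Keep only what lies strictly between the two line breaks.
    return text[end + 2:stop], stop + 1 + len(terminator)
```

Because the content is located by searching for the Terminator and not by matching Brackets, it may contain unbalanced [ and ] characters without any further escaping.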
Before further processing, a Document encoded in the Format should undergo Resolving, which transforms all Escapes into their Resolved forms (as defined above) and joins them with adjacent Fragments into Resolved Fragments.
A List of values may be encoded as a sequence of Pairs where the Fragments are empty or consist entirely of Whitespace:
[parsley][sage][rosemary][thyme]
or equivalent:
[parsley] [sage] [rosemary] [thyme]
This would be equivalent to the following JSON array:
["parsley", "sage", "rosemary", "thyme"]
A Dictionary may be encoded as a sequence of Pairs where Fragments are non-empty and non-Whitespace:
Name [Parsley, Sage, Rosemary and Thyme]
Artist [Simon & Garfunkel]
Release date [October 10, 1966]
Label [Columbia]
Equivalent JSON object:
{
  "Name": "Parsley, Sage, Rosemary and Thyme",
  "Artist": "Simon & Garfunkel",
  "Release date": "October 10, 1966",
  "Label": "Columbia"
}
Note that this encoding uses a convention where leading and trailing Whitespace is not part of the Dictionary’s keys. To have it included, it should be escaped:
[\ padded key ] [value]
Equivalent JSON:
{
  " padded key ": "value"
}
Structural elements may be nested freely:
songs [
  [
    title [Scarborough Fair / Canticle]
    length [3:10]
  ]
  [
    title [Patterns]
    length [2:45]
  ]
  [
    title [Cloudy]
    length [2:15]
  ]
]
Equivalent JSON:
{
  "songs": [
    {
      "title": "Scarborough Fair / Canticle",
      "length": "3:10"
    },
    {
      "title": "Patterns",
      "length": "2:45"
    },
    {
      "title": "Cloudy",
      "length": "2:15"
    }
  ]
}
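The List and Dictionary conventions used in these examples can be sketched in Python as a conversion to JSON-compatible values. This is an illustration under stated assumptions: Escapes are not handled, the names `parse` and `to_value` are hypothetical, and the treatment of mixed documents (an error) and of duplicate keys (the last one wins) is a choice made here, since the specification does not address either.

```python
import json

JSON_WHITESPACE = " \t\n\r"

def parse(text):
    """Parse Escape-free text into (pairs, trailing); a Pair is (prefix, child)."""
    stack, pairs, buf = [], [], []
    for ch in text:
        if ch == "[":
            stack.append((pairs, "".join(buf)))
            pairs, buf = [], []
        elif ch == "]":
            if not stack:
                raise ValueError("unexpected ']'")
            child = (pairs, "".join(buf))
            pairs, prefix = stack.pop()
            pairs.append((prefix, child))
            buf = []
        else:
            buf.append(ch)
    if stack:
        raise ValueError("unclosed '['")
    return pairs, "".join(buf)

def to_value(document):
    """Apply the List/Dictionary conventions to a parsed document."""
    pairs, trailing = document
    if not pairs:
        return trailing                      # leaf: plain text
    # Leading and trailing Whitespace is not part of a key.
    keys = [prefix.strip(JSON_WHITESPACE) for prefix, _ in pairs]
    values = [to_value(child) for _, child in pairs]
    if all(key == "" for key in keys):
        return values                        # List: every prefix is blank
    if all(keys):
        return dict(zip(keys, values))       # Dictionary: every prefix is a key
    raise ValueError("mixed blank and non-blank prefixes (assumed invalid)")

print(json.dumps(to_value(parse("[parsley] [sage]"))))  # prints ["parsley", "sage"]
```

All leaves come out as strings. Interpreting a leaf such as 3:10 as a number or a duration would be a further convention layered on top of the Fragment.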
Last updated 2020-02-02 at 03:00 CET.
[1] Note: this is not finished or properly formalized but captures the essential idea and should suffice for preliminary assessment and implementation.
[2] In this document a word that is Capitalized and italicized introduces a definition. References to this definition are capitalized but not italicized.
[3] Note: This part of the specification is incomplete and unstable.
[4] Portion between X and Y means that X and Y are themselves not included.