@ruv

ruv/01-terms.md Secret

Last active February 11, 2023 20:53
Term definitions for discussions regarding recognizers

Additional terms and notations

tuple: a logical ordered union of several elements; when a tuple is placed into the data stack, the rightmost element in writing is the topmost on the stack, and floating-point numbers are placed into the floating-point stack.
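The stack-placement rule for tuples can be sketched with a small model. This is an illustrative Python model, not Forth code: the two lists and the `push_tuple` helper are hypothetical names, and using Python's `float` type to decide which stack an element goes to is an assumption of the sketch.

```python
# Hypothetical model: two Python lists stand in for the Forth stacks.
# The end of each list is the top of that stack.
data_stack = []
float_stack = []


def push_tuple(elements):
    """Place a tuple on the stacks, left to right.

    The rightmost written element ends up topmost on its stack;
    floating-point elements go to the floating-point stack.
    """
    for element in elements:
        if isinstance(element, float):
            float_stack.append(element)
        else:
            data_stack.append(element)


# The tuple ( 10 2.5 20 ): 20 is written rightmost, so it is topmost
# on the data stack, while 2.5 lands on the floating-point stack.
push_tuple((10, 2.5, 20))
```

After the call, `data_stack` holds 10 below 20, and `float_stack` holds only 2.5.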

lexeme: a syntactic unit of a program (of Forth source code); unless otherwise noted, it is a sequence of non-blank characters delimited by blanks.

to recognize a lexeme: to determine the interpretation semantics and the compilation semantics for the lexeme in the current dynamic context.

to interpret a lexeme: to perform the interpretation semantics for the lexeme in the current dynamic context.

to compile a lexeme: to perform the compilation semantics for the lexeme in the current dynamic context.

to translate a lexeme: to interpret the lexeme if interpreting, or to compile the lexeme if compiling.

dynamic context of a lexeme: information that is available at the time the lexeme is translated.

unqualified token: a tuple of data objects that determines the interpretation semantics and the compilation semantics for a lexeme in its dynamic context.

token: unqualified token (a synonym, when it is clear from context).

to interpret a token: to perform the interpretation semantics that are determined by the token.

to compile a token: to perform the compilation semantics that are determined by the token.

to translate a token: to interpret the token if interpreting, or to compile the token if compiling.

token translator: a Forth definition that translates a token; also, depending on context, an execution token for this Forth definition.

resolver: a Forth definition that tries to recognize a lexeme, producing a tuple of a token and its token translator.

token descriptor object: an implementation-dependent data object (a set of information) that describes how to interpret and how to compile a token.

token descriptor: a value that identifies a token descriptor object; also, less formally and depending on context, a Forth definition that just returns this value, or a token descriptor object itself.

fully qualified token: a tuple of a token and its token descriptor.

recognizer: a Forth definition that tries to recognize a lexeme, producing a fully qualified token.

simple recognizer: a recognizer that can produce only one and the same token descriptor.

compound recognizer: a recognizer that can produce different token descriptors.

perceptor: a recognizer that is currently used by the Forth text interpreter to translate a lexeme.

current recognizer: the perceptor (an informal synonym).

default perceptor: the perceptor before it was changed by a program (or after reverting these changes).
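How the terms above fit together can be sketched in a small model. This is an illustrative Python model under stated assumptions, not Forth code and not any proposal's API: the names `TokenDescriptor`, `NUMBER`, `recognize_number` and `translate` are hypothetical, a Python `tuple` stands in for the token, and returning `None` for an unrecognized lexeme is an assumption (the definitions above do not say how failure is reported).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TokenDescriptor:
    """Hypothetical token descriptor object: describes how to interpret
    and how to compile a token."""
    name: str
    on_interpret: Callable[[tuple, list, list], None]
    on_compile: Callable[[tuple, list, list], None]


def interpret_number(token, stack, code):
    # Interpretation semantics of a number: place it on the data stack.
    stack.append(token[0])


def compile_number(token, stack, code):
    # Compilation semantics of a number: append a literal to the
    # definition being compiled (modelled here as a plain list).
    code.append(("literal", token[0]))


NUMBER = TokenDescriptor("number", interpret_number, compile_number)


def recognize_number(lexeme):
    """A simple recognizer: it can produce only the NUMBER descriptor.

    Returns a fully qualified token (token, descriptor), or None when
    the lexeme is not recognized (an assumption of this sketch).
    """
    try:
        value = int(lexeme)
    except ValueError:
        return None
    return (value,), NUMBER


def translate(qualified_token, compiling, stack, code):
    """Translate a token: interpret it if interpreting, compile it if compiling."""
    token, descriptor = qualified_token
    if compiling:
        descriptor.on_compile(token, stack, code)
    else:
        descriptor.on_interpret(token, stack, code)


stack, code = [], []
translate(recognize_number("42"), False, stack, code)  # interpreting
translate(recognize_number("7"), True, stack, code)    # compiling
```

In this model, a compound recognizer would be a function that can return more than one kind of descriptor, and the perceptor would be whichever recognizer the text interpreter currently calls for each lexeme.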

Comparison of terminology to some past proposals

Comparison to Recognizer API v1 (orig), Recognizer API v4 (orig) and Recognizer API v1 rephrase 2020

Some term names

v1 v4 v1 rephrase 2020 suggestion for v5
string; sub-string string; substring of the input buffer string lexeme
parsed data parsed data parsed data token; unqualified token
set of data handling words; data handling set; method table data type information; set of data handling words; data handling set; method set method table token descriptor object
information token data type id; data type information; recognizer type recognizer information token token descriptor
? data type word token descriptor (informal); token descriptor word (a Forth definition that just returns a token descriptor)
information token together with parsed data data type id together with parsed data recognizer information token together with parsed data fully qualified token
to analyze string; to parse text to analyse string; to analyze sub-string to handle the passed string to recognize a lexeme
parsing word; parsing method string parsing word; parsing word; recognizing word; recognizer text parsing word; parsing word; recognizer recognizer
recognizer recognizer concept ?
simple recognizer
compound recognizer
recognizer sequence
recognizer stack; current recognizer stack a system recognizer sequence id recognize-order; current recognizer-order perceptor
default perceptor
to handle data in the Forth context; to handle the result just like an ordinary dictionary search: interpret, compile; to perform the data processing to handle data; to handle data in the Forth context; to process data in the text interpreter; to perform the various semantics of the data: interpret, compile and postpone; to perform data processing to perform the data processing of the interpreter to translate a token
token translator
to perform the interpretation semantics of data -//- ? to interpret a token
to perform the compilation semantics of data -//- ? to compile a token
to perform the postponing semantics of data ? ? to postpone a token
to perform the postponing semantics of data to reproduce a token
processing action; method to perform some semantics of the data handling method; data processing method data processing method
interpret action interpretation action ? token interpreter
compile action compilation action ? token compiler
postpone action ? token postponer
postpone action token reproducer

Some term definitions

recognizer

version definition
v1 recognizer: a combination of a parsing word and the set of data handling words to deal with the data
v1 recognizer: a combination of a text parsing word that returns information tokens together with parsed data if successful.
v4 recognizer: a string parsing word that returns a data type id together with the parsed data if successful.
v1 rephrase 2020 recognizer: a combination of a text parsing word that returns recognizer information tokens together with parsed data if successful.
suggestion for v5 recognizer: a Forth definition that tries to recognize a lexeme, producing a fully qualified token.

token descriptor

version definition
v1 information token: A single cell number. It identifies the datatype and a method table to perform the data processing of the interpreter.
v4 data type id: A cell sized number. It identifies the data type and a method set to perform the data processing in the text interpreter.
v1 rephrase 2020 recognizer information token: an implementation-dependent single-cell value that identifies the data type and a method table to perform the data processing of the interpreter.
suggestion for v5 token descriptor object: an implementation-dependent data object (a set of information) that describes how to interpret and how to compile a token; token descriptor: a value that identifies a token descriptor object.
ruv commented Jun 25, 2020

Short URL: https://git.io/JfhaI
Short URL to the comparison file: https://git.io/JfjoJ

The latest version of this term definitions file is in the dedicated repository: terms-and-datatypes.md (some rationales in Issue#2).

Perhaps we don't need both terms, "tuple" and "token". It seems the term "tuple" is enough.
