Introduction
This is tinyscript, a scripting language designed for very tiny machines. The initial target is boards using the Parallax Propeller, which has 32KB of RAM, but the code is written in ANSI C so it should work on any platform (e.g. testing is done on x86-64 Linux).
On the Propeller, the interpreter code needs about 3K of memory in CMM mode or 5K in LMM. On x86-64 the interpreter code is 6K. The size of the workspace you give to the interpreter is up to you, although in practice it would not be very useful to use less than 2K of RAM. The processor stack is used as well, so it will need some space.
tinyscript is copyright 2016-2021 Total Spectrum Software Inc. and released under the MIT license. See the COPYING file for details.
The Language
The scripting language itself is pretty minimalistic. The grammar for it looks like:
<program> ::= <stmt> | <stmt><sep><program>
<stmt> ::= <vardecl> | <arrdecl> | <funcdecl>
| <assignment> | <ifstmt>
| <whilestmt> | <funccall>
| <printstmt> | <returnstmt>
The statements in a program are separated by newlines or ';'.
Either variables or functions may be declared.
<vardecl> ::= "var" <assignment>
<arrdecl> ::= "array" <symbol> "(" <number> ")" | "array" <symbol>
<funcdecl> ::= "func" <symbol> "(" [<varlist>] ")" <string>
<assignment> ::= <symbol> "=" <expr>
<varlist> ::= <symbol> [ "," <symbol> ]*
Variables must always be given a value when declared (unless they are arrays). All non-array variables simply hold 32-bit quantities, normally interpreted as integers. The symbol in an assignment outside of a vardecl must already have been declared.
Arrays are simple one-dimensional arrays. Support for arrays does add a little bit of code, so they are optional (included if ARRAY_SUPPORT is defined in tinyscript.h). If the array declaration includes a size, then a new (uninitialized) array is created. If it does not include a size, then it must match one of the enclosing function's parameters, in which case that parameter is checked and must be an array.
Array indices start at 0. Array index -1 is special and holds the length of the array.
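One way to picture the index -1 convention is a buffer whose first cell stores the length, with the visible base pointer starting one cell later. The following is a minimal sketch in C under that assumption; the names `Val32`, `make_array` and `free_array` are hypothetical, and this is not necessarily how the interpreter lays out arrays internally.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 32-bit cell type, standing in for a tinyscript value. */
typedef int32_t Val32;

/* Allocate n cells plus one hidden cell for the length.
   The returned pointer is the element base, so base[-1] is the length. */
Val32 *make_array(int32_t n) {
    Val32 *raw;
    if (n < 0) return NULL;
    raw = malloc(((size_t)n + 1) * sizeof *raw);
    if (!raw) return NULL;
    raw[0] = n;        /* hidden length cell */
    return raw + 1;    /* indices 0..n-1 follow it */
}

/* Release an array created by make_array. */
void free_array(Val32 *base) {
    if (base) free(base - 1);
}
```

With this layout, reading `a[-1]` costs the same as reading any other element, which is presumably why such a convention is attractive on a very small machine.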
A function's body is a string. When the function is called, the string is interpreted as a script (so at that time it is parsed using the language grammar). A function that is never called is never parsed, so its body need not contain legal code.
Strings may be enclosed either in double quotes or between { and }. The latter case is more useful for functions and similar code uses, since the brackets nest. Also note that it is legal for newlines to appear in {} strings, but not in strings enclosed by ".
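Because the brackets nest, finding the end of a {} string means counting depth rather than stopping at the first `}`. The sketch below illustrates that idea in C; `matching_brace` is a hypothetical helper written for this example, not a function from the interpreter.

```c
/* Given s pointing at '{', return the index of the matching '}',
   or -1 if s does not start with '{' or the braces are unbalanced. */
long matching_brace(const char *s) {
    long depth = 0;
    long i;

    if (s[0] != '{') return -1;
    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == '{') {
            depth++;
        } else if (s[i] == '}') {
            depth--;
            if (depth == 0) return i;   /* closed the outermost brace */
        }
    }
    return -1;   /* ran out of input before the string was closed */
}
```

For the input `{a{b}c}d` the outer string ends at index 6, so the inner `{b}` is carried along as part of the body.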
<ifstmt> ::= "if" <expr> <string> [<elsepart>]
<elsepart> ::= "else" <string> | "elseif" <expr> <string> [<elsepart>]
<whilestmt> ::= "while" <expr> <string> [<elsepart>]
As with functions, the strings in if and while statements are parsed and interpreted on an as-needed basis. Any non-zero expression is treated as true, and zero is treated as false. As a quirk of the implementation, it is permitted to add an "else" clause to a while statement; any such clause will always be executed after the loop exits.
<returnstmt> ::= "return" <expr>
Return statements are used to terminate a function and return a value to its caller.
<printstmt> ::= "print" <printitem> [ "," <printitem>]*
<printitem> ::= <string> | <expr>
Expressions are built up from symbols, numbers (decimal or hex integers), and operators. The operators have precedence levels 1-4. Level 0 of expressions is the most basic, consisting of numbers or variables optionally preceded by a unary operator:
<expr0> ::= <symbol> | <number>
| <unaryop><expr0>
| "(" <expr> ")"
| <builtincall>
<funccall> ::= <symbol> "(" [<exprlist>] ")"
<exprlist> ::= <expr> ["," <expr>]*
<number> ::= <digit>+ | "0x"<digit>+ | "'"<asciicharsequence>"'"
<asciicharsequence> ::= <printableasciichar> | "\'" | "\\" | "\n" | "\t" | "\r"
<printableasciichar> ::= ' ' to '~' excluding ' and \
<expr1> ::= <expr0> [<binop1> <expr0>]*
<binop1> ::= "*" | "/" | "%"
<expr2> ::= <expr1> [<binop2> <expr2>]*
<binop2> ::= "+" | "-"
<expr3> ::= <expr2> [<binop3><expr3>]*
<binop3> ::= "&" | "|" | "^" | "<<" | ">>"
<expr4> ::= <expr3> [<binop4><expr4>]*
<binop4> ::= "=" | "<>" | ">" | "<" | ">=" | "<="
<unaryop> ::= <binop1> | <binop2> | <binop3> | <binop4>
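The `<number>` rule above covers three literal forms: decimal, `0x` hex, and a quoted character. As an illustration only, here is a small C sketch that recognizes those forms. `parse_number` is a hypothetical helper, it assumes hex literals accept the digits a-f in either case, it handles a single character (or escape) between the quotes, and it does not check for overflow.

```c
#include <ctype.h>

/* Parse a number literal at s: decimal, "0x" hex, or a quoted character
   such as 'a' or '\n'. Stores the value in *out and returns the number
   of characters consumed, or 0 if s does not start with a number. */
int parse_number(const char *s, long *out) {
    int i = 0;
    long v = 0;

    if (s[0] == '\'') {
        int c = (unsigned char)s[1];
        i = 2;
        if (c == '\\') {                 /* escape sequence */
            switch (s[2]) {
            case 'n':  c = '\n'; break;
            case 't':  c = '\t'; break;
            case 'r':  c = '\r'; break;
            case '\'': c = '\''; break;
            case '\\': c = '\\'; break;
            default:   return 0;
            }
            i = 3;
        } else if (c == '\0' || c == '\'') {
            return 0;                    /* empty or unterminated literal */
        }
        if (s[i] != '\'') return 0;      /* missing closing quote */
        *out = c;
        return i + 1;
    }
    if (s[0] == '0' && s[1] == 'x' && isxdigit((unsigned char)s[2])) {
        for (i = 2; isxdigit((unsigned char)s[i]); i++) {
            int d = (unsigned char)s[i];
            d = isdigit(d) ? d - '0' : tolower(d) - 'a' + 10;
            v = v * 16 + d;
        }
        *out = v;
        return i;
    }
    if (!isdigit((unsigned char)s[0])) return 0;
    for (i = 0; isdigit((unsigned char)s[i]); i++)
        v = v * 10 + (s[i] - '0');
    *out = v;
    return i;
}
```

Returning the consumed length lets a caller continue scanning right after the literal, which suits a parser that works directly on the script string.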
Builtin functions are defined by the runtime, as are operators. The ones listed above are merely the ones defined by default. Operators may use any of the characters `!=<>+-*/&|^%`. Any string of the characters `!=<>&|^` is processed together, but the operator characters `+-*/` may only appear on their own.
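The grouping rule can be sketched as a function that measures how long an operator token is. This is an illustration in C, not the interpreter's tokenizer: `operator_length` is a hypothetical name, and treating `%` as a single-character token is an assumption, since the text does not place it in either group.

```c
#include <string.h>

/* Return the length of the operator token starting at s, or 0 if s does
   not start with an operator character. Characters from "!=<>&|^" are
   grouped into one token; '+', '-', '*', '/' stand alone. Treating '%'
   as a single-character token is an assumption of this sketch. */
size_t operator_length(const char *s) {
    static const char grouped[] = "!=<>&|^";
    static const char single[] = "+-*/%";
    size_t n = 0;

    if (*s == '\0') return 0;
    if (strchr(single, *s)) return 1;
    while (s[n] != '\0' && strchr(grouped, s[n])) n++;
    return n;
}
```

Under this rule `<>` and `<=` are each one two-character operator, while `=-1` splits into `=` followed by a unary `-`.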
Note that any operator may be used as a unary operator, and in this case `<op>x` is interpreted as `0 <op> x` for any operator `<op>`. This is useful for `+` and `-`, less so for other operators.
`%` is the modulo operator, as in C.
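The unary rule described above amounts to reusing the binary evaluator with a left operand of 0. A minimal C sketch of that idea follows; `apply_binary` and `apply_unary` are hypothetical names and only a few operators are covered.

```c
#include <stdint.h>

/* Evaluate a small subset of binary operators on 32-bit values. */
int32_t apply_binary(char op, int32_t a, int32_t b) {
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '&': return a & b;
    default:  return 0;   /* operators outside this sketch */
    }
}

/* A unary use of <op> is evaluated as 0 <op> x. */
int32_t apply_unary(char op, int32_t x) {
    return apply_binary(op, 0, x);
}
```

This shows why the rule is mostly useful for `+` and `-`: a unary `*` or `&` evaluates to 0 whatever the operand is.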
Variable Scope
Variables are dynamically scoped. For example, in:
var x=2
func printx() {
  print x
}
func myfunc() {
  var x=3
  printx()
}
invoking `myfunc` will cause 3 to be printed, not 2 as in statically scoped languages.
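Dynamic scoping falls out naturally when variables live on a single stack that is searched from the newest binding backwards. The C sketch below mirrors the example above under that assumption. All names in it are hypothetical, and it is not the interpreter's actual data structure.

```c
#include <string.h>

#define MAX_SYMS 16

struct sym { const char *name; int value; };

static struct sym symtab[MAX_SYMS];
static int nsyms = 0;

/* Push a new binding; later bindings shadow earlier ones.
   Returns 1 on success, 0 when the table is full. */
int define_var(const char *name, int value) {
    if (nsyms >= MAX_SYMS) return 0;
    symtab[nsyms].name = name;
    symtab[nsyms].value = value;
    nsyms++;
    return 1;
}

/* Search from the most recent binding backwards.
   Returns 1 and stores the value in *out if the name is found. */
int lookup_var(const char *name, int *out) {
    int i;
    for (i = nsyms - 1; i >= 0; i--) {
        if (strcmp(symtab[i].name, name) == 0) {
            *out = symtab[i].value;
            return 1;
        }
    }
    return 0;
}

/* Mirrors printx(): reads whichever x is visible at call time. */
int read_x(void) {
    int v = 0;
    lookup_var("x", &v);
    return v;
}

/* Mirrors myfunc(): declares a local x, calls read_x(), drops its locals. */
int call_myfunc(void) {
    int mark = nsyms;      /* remember where this call's locals start */
    int result;
    define_var("x", 3);
    result = read_x();
    nsyms = mark;          /* discard locals on return */
    return result;
}
```

After `define_var("x", 2)`, a direct `read_x()` sees 2, while `call_myfunc()` returns 3 because the callee finds the caller's newer binding first.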
Interface to C
Environment Requirements
The interpreter is quite self-contained. The functions needed to interface with it are `outchar` (called to print a single character) and `memcpy`. `outchar` is the only one of these that is non-standard. It takes a single int as parameter and prints it as a character. This is the function the interpreter uses for output, e.g. in the `print` statement.
Optionally you can provide a function TinyScript_Stop() to check whether a running script should stop. To do this, edit the tinyscript.h file to remove the default definition (0) for TinyScript_Stop().
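On a hosted system, one possible way to supply such a function is to set a flag from a signal handler and report it when asked. The sketch below assumes that `TinyScript_Stop` takes no arguments and returns nonzero when the script should stop (the default definition of 0 suggests this, but the signature is an assumption). `install_stop_handler` is a hypothetical helper.

```c
#include <signal.h>

/* Set asynchronously by the handler, read by the interpreter's check. */
static volatile sig_atomic_t stop_requested = 0;

static void on_interrupt(int sig) {
    (void)sig;
    stop_requested = 1;
}

/* Call once at startup, before running any script. */
void install_stop_handler(void) {
    signal(SIGINT, on_interrupt);
}

/* Assumed signature: returns nonzero when the script should stop. */
int TinyScript_Stop(void) {
    return stop_requested != 0;
}
```

An embedded target without signals could set the same flag from a button interrupt or a serial receive handler instead.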
Configuration
Language configuration options are in the form of defines at the top of tinyscript.h:
VERBOSE_ERRORS - gives better error messages (costs a tiny bit of space)
SMALL_PTRS - use 16 bits for pointers (for very small machines)
ARRAY_SUPPORT - include support for integer arrays
The demo app main.c has some configuration options in the Makefile:
READLINE - use the GNU readline library for entering text
LINENOISE - use the linenoise.c library for entering text
Standard Library
There is an optional standard library in tinyscript_lib.{c,h} that adds `strlen` as a standard requirement and requires the user to define two functions: `ts_malloc` and `ts_free`. These can be wrappers for `malloc`/`free` or perhaps `pvPortMalloc`/`vPortFree` on FreeRTOS systems.
Application Usage
As mentioned above, the function `outchar` must be defined by the application to allow for printing. The following definition will work for standard C:
#include <stdio.h>
void outchar(int c) { putchar(c); }
Embedded systems may want to provide a definition that uses the serial port or an attached display.
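As one illustration of an alternative definition, `outchar` can collect characters in a fixed buffer, which is handy for tests or for a display that is updated a line at a time. This is a sketch only; the buffer size and the helpers `captured_output` and `clear_output` are hypothetical.

```c
#include <stddef.h>

#define OUT_CAPACITY 128

static char out_buf[OUT_CAPACITY];
static size_t out_len = 0;

/* Same signature as in the text: receives one character as an int. */
void outchar(int c) {
    if (out_len < OUT_CAPACITY - 1) {
        out_buf[out_len++] = (char)c;
        out_buf[out_len] = '\0';   /* keep the buffer a valid C string */
    }
    /* characters beyond the capacity are dropped in this sketch */
}

/* Hypothetical helpers for reading back and clearing captured output. */
const char *captured_output(void) { return out_buf; }
void clear_output(void) { out_len = 0; out_buf[0] = '\0'; }
```

A serial-port version would replace the buffer write with the platform's transmit call and could drop the two helpers.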
For the optional standard library, `ts_malloc` and `ts_free` must be defined by the application. The following definitions will work for standard C:
#include <stdlib.h>
void *ts_malloc(Val size) {
    return malloc(size);
}
void ts_free(void *pointer) {
    free(pointer);
}
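On a memory-constrained target it can help to know whether every allocation is eventually freed. The variant below wraps `malloc`/`free` with a counter. It is a sketch under assumptions: the `Val` typedef here only stands in for the real definition from the tinyscript headers (drop it when including them), `ts_live_allocations` is a hypothetical helper, and rejecting non-positive sizes is a choice made for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumption: Val is an integer type wide enough to hold a size. The real
   definition comes from the tinyscript headers; remove this typedef when
   those headers are included. */
typedef intptr_t Val;

static long live_allocations = 0;

void *ts_malloc(Val size) {
    void *p;
    if (size <= 0) return NULL;      /* reject empty or negative requests */
    p = malloc((size_t)size);
    if (p) live_allocations++;
    return p;
}

void ts_free(void *pointer) {
    if (pointer) {
        live_allocations--;
        free(pointer);
    }
}

/* Hypothetical helper: number of blocks allocated but not yet freed. */
long ts_live_allocations(void) { return live_allocations; }
```

Checking that the counter returns to zero after a script finishes is a cheap way to spot leaks during development.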
The application must initialize the interpreter with `TinyScript_Init` before making any other calls. `TinyScript_Init` takes two parameters: the base of a memory region the interpreter can use, and the size of that region. It returns `TS_ERR_OK` on success, or an error on failure. It is recommended to provide at least 2K of space to the interpreter.
If `TinyScript_Init` succeeds, the application may then define builtin
symbols with `TinyScript_Define(name, BUILTIN,