BASIC token
A BASIC token is a single-byte representation of a BASIC keyword. Whenever the user creates or edits a BASIC line, any keywords are replaced by their respective tokens; conversely, when the user LISTs the BASIC program, the tokens are displayed as the keywords they represent, in "plain text". Tokens only ever exist in the program storage area in memory; they are not visible to the user. Because each token is a single byte, the tokenized code requires less storage than the original text.
BASIC tokens are not to be confused with BASIC keyword abbreviations.
Tokens are distinguished from "plain" PETSCII characters by the fact that token codes are always greater than or equal to 128/$80; i.e. the most significant bit of the byte representing a token is always set. This allows the interpreter to easily distinguish between tokens and other text at runtime. Note that only keywords are tokenized in MS BASICs; variable names, numeric and string constants, line numbers and other items remain in their original text format. Tokenization takes place when a line is entered or modified: the user's text is initially placed in a temporary area in memory, a buffer, and a routine known as the "cruncher" copies it into the program storage area, replacing keywords with tokens as it goes.
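The replacement step described above can be sketched in Python. This is a simplified illustration, not the ROM routine: the function name `crunch` and the small `KEYWORDS` subset are assumptions made for the example, operator tokens and the special handling of DATA statements are left out, and input is assumed to be upper-case ASCII in the range where it coincides with PETSCII.

```python
# Illustrative subset of the BASIC V2 keyword table, listed in token order.
# Order matters in this sketch: the first keyword that matches wins, so
# PRINT# must come before PRINT and GOTO before TO.
KEYWORDS = [
    ("FOR", 0x81),
    ("GOTO", 0x89),
    ("IF", 0x8B),
    ("REM", 0x8F),
    ("PRINT#", 0x98),
    ("PRINT", 0x99),
    ("TO", 0xA4),
    ("THEN", 0xA7),
]


def crunch(text):
    """Copy text to a byte string, replacing keywords outside quotes by tokens."""
    out = bytearray()
    i = 0
    in_quotes = False
    while i < len(text):
        if text[i] == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            match = next(((w, t) for w, t in KEYWORDS if text.startswith(w, i)), None)
            if match:
                word, token = match
                out.append(token)
                i += len(word)
                if word == "REM":
                    # Everything after REM is a comment and is copied verbatim.
                    out += text[i:].encode("ascii")
                    break
                continue
        # Plain character: copied unchanged, always below 128/$80 for ASCII input.
        out.append(ord(text[i]))
        i += 1
    return bytes(out)


print(crunch('PRINT "HELLO WORLD"').hex(" "))
```

Text inside quotes is never tokenized in this sketch, which is why a string such as "PRINT" survives unchanged.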
Commodore BASIC V2 has 68 keywords and 8 operators, which are assigned token codes in the range from 128/$80 through 203/$CB. Since the code 255/$FF is reserved for the "pi" character, this leaves 51 unused token codes in the range 204–254/$CC–$FE. Because the BASIC system makes ample use of vectors, third-party BASIC expansions may use these codes for additional BASIC commands.
Example
If you type NEW, and then enter the BASIC program line
10 PRINT "HELLO WORLD"
the BASIC program memory (starting from 2049/$0801) will contain:
Location | Data | Description |
---|---|---|
2049/$0801 | 21/$15 8/$08 | Pointer to beginning of "next" BASIC line, in low-byte/high-byte order |
2051/$0803 | 10/$0A 0/$00 | BASIC line number "10", in low-byte/high-byte order |
2053/$0805 | 153/$99 | The token for the PRINT keyword |
2054/$0806 | 32/$20 34/$22 | SPACE and quote characters following PRINT |
2056/$0808 ... 2066/$0812 | 72/$48 ... 68/$44 | PETSCII codes for the "HELLO WORLD" text |
2067/$0813 | 34/$22 | Quote at end of PRINTed text. |
2068/$0814 | 0/$00 | Zero-byte marking the end of the BASIC line. |
2069/$0815 | 0/$00 0/$00 | Two zero-bytes in place of the pointer to next BASIC line indicates the end of the program. |
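The layout in the table can be reproduced with a short Python sketch. The helper name `build_program` is hypothetical; the start address 2049/$0801 and the byte values are taken from the example above.

```python
import struct

BASIC_START = 0x0801  # start of BASIC program memory, as in the example above


def build_program(lines, start=BASIC_START):
    """Build the memory image for a list of (line_number, tokenized_body) pairs."""
    image = bytearray()
    addr = start
    for number, body in lines:
        # Each line: 2-byte link pointer, 2-byte line number, body, zero terminator.
        next_addr = addr + 2 + 2 + len(body) + 1
        image += struct.pack("<HH", next_addr, number)  # both in low/high byte order
        image += body + b"\x00"
        addr = next_addr
    image += b"\x00\x00"  # a zero link pointer marks the end of the program
    return bytes(image)


# Tokenized body of: PRINT "HELLO WORLD"
body = bytes([0x99, 0x20, 0x22]) + b"HELLO WORLD" + b'"'
image = build_program([(10, body)])
print(image.hex(" "))
```

The first two bytes of `image` come out as $15 $08, the link pointer to 2069/$0815 shown in the first table row.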
This token system saves memory and slightly speeds things up, both when the program runs and when it is saved to or loaded from tape or disk.
Token-to-keyword conversion table
BASIC 2.0
The following table lists all the reserved keywords and symbols, according to their associated token code:
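Token code | Keyword |
---|---|
128/$80 | END |
129/$81 | FOR |
130/$82 | NEXT |
131/$83 | DATA |
132/$84 | INPUT# |
133/$85 | INPUT |
134/$86 | DIM |
135/$87 | READ |
136/$88 | LET |
137/$89 | GOTO |
138/$8A | RUN |
139/$8B | IF |
140/$8C | RESTORE |
141/$8D | GOSUB |
142/$8E | RETURN |
143/$8F | REM |
144/$90 | STOP |
145/$91 | ON |
146/$92 | WAIT |
147/$93 | LOAD |
148/$94 | SAVE |
149/$95 | VERIFY |
150/$96 | DEF |
151/$97 | POKE |
152/$98 | PRINT# |
153/$99 | PRINT |
154/$9A | CONT |
155/$9B | LIST |
156/$9C | CLR |
157/$9D | CMD |
158/$9E | SYS |
159/$9F | OPEN |
160/$A0 | CLOSE |
161/$A1 | GET |
162/$A2 | NEW |
163/$A3 | TAB( |
164/$A4 | TO |
165/$A5 | FN |
166/$A6 | SPC( |
167/$A7 | THEN |
168/$A8 | NOT |
169/$A9 | STEP |
170/$AA | + |
171/$AB | - |
172/$AC | * |
173/$AD | / |
174/$AE | ^ |
175/$AF | AND |
176/$B0 | OR |
177/$B1 | > |
178/$B2 | = |
179/$B3 | < |
180/$B4 | SGN |
181/$B5 | INT |
182/$B6 | ABS |
183/$B7 | USR |
184/$B8 | FRE |
185/$B9 | POS |
186/$BA | SQR |
187/$BB | RND |
188/$BC | LOG |
189/$BD | EXP |
190/$BE | COS |
191/$BF | SIN |
192/$C0 | TAN |
193/$C1 | ATN |
194/$C2 | PEEK |
195/$C3 | LEN |
196/$C4 | STR$ |
197/$C5 | VAL |
198/$C6 | ASC |
199/$C7 | CHR$ |
200/$C8 | LEFT$ |
201/$C9 | RIGHT$ |
202/$CA | MID$ |
203/$CB | GO |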
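The token-to-keyword mapping for BASIC 2.0 can also be written as a list in token order, which makes the reverse conversion performed by LIST easy to sketch. The following Python example is only an illustration: the function name `detokenize` is an assumption, and bytes inside quotes are shown through Python's `chr` rather than as real PETSCII glyphs.

```python
FIRST_TOKEN = 0x80  # 128, the token code of END

# BASIC 2.0 keywords in token order, 128/$80 (END) through 203/$CB (GO).
KEYWORDS = (
    "END FOR NEXT DATA INPUT# INPUT DIM READ LET GOTO RUN IF RESTORE GOSUB "
    "RETURN REM STOP ON WAIT LOAD SAVE VERIFY DEF POKE PRINT# PRINT CONT LIST "
    "CLR CMD SYS OPEN CLOSE GET NEW "
    "TAB( TO FN SPC( THEN NOT STEP "
    "+ - * / ^ AND OR > = < "
    "SGN INT ABS USR FRE POS SQR RND LOG EXP COS SIN TAN ATN PEEK LEN STR$ "
    "VAL ASC CHR$ LEFT$ RIGHT$ MID$ GO"
).split()


def detokenize(body):
    """Expand the tokens in one tokenized line body back into keyword text."""
    out = []
    in_quotes = False
    for b in body:
        if b == 0x22:  # a quote character toggles string mode
            in_quotes = not in_quotes
        if not in_quotes and FIRST_TOKEN <= b < FIRST_TOKEN + len(KEYWORDS):
            out.append(KEYWORDS[b - FIRST_TOKEN])
        else:
            out.append(chr(b))  # plain character, or any byte inside a string
    return "".join(out)


print(detokenize(bytes([0x99, 0x20, 0x22]) + b"HELLO WORLD" + b'"'))
```

Bytes of 128 or above that appear inside quotes are left alone here, since they stand for characters there rather than keywords.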
Notice that the tokens break down into four "groups", which mirrors four of the "classes" of keywords/symbols.
- Commands are in the range 128–162/$80–$A2
- Various "bywords" that form part of the syntax of the keywords above fall in the 163–169/$A3–$A9 range
- Arithmetic and logic operators have token codes 170–179/$AA–$B3
- Functions are in the range 180–202/$B4–$CA
In the BASIC ROM, the system variables TIME, TIME$ and STATUS are handled as exceptions in the routines for "normal" variables.
GO (token code 203/$CB) breaks this "rule": it is actually handled as a command, despite its location "behind" the functions in the token system.
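The grouping by numeric range, including the GO exception, can be expressed as a small range check. This Python sketch uses a hypothetical function name, `token_class`, and applies only to bytes outside quoted strings in BASIC V2 program text.

```python
def token_class(code):
    """Classify one byte of tokenized BASIC V2 text by its numeric range."""
    if not 0 <= code <= 0xFF:
        raise ValueError("not a byte value")
    if code < 0x80:
        return "plain character"  # high bit clear: not a token
    if code <= 0xA2:
        return "command"          # 128-162
    if code <= 0xA9:
        return "byword"           # 163-169
    if code <= 0xB3:
        return "operator"         # 170-179
    if code <= 0xCA:
        return "function"         # 180-202
    if code == 0xCB:
        return "command"          # GO sits behind the functions
    if code == 0xFF:
        return "pi"               # reserved for the pi character
    return "unused"               # 204-254, free in BASIC V2


for code in (0x99, 0xA4, 0xAA, 0xB4, 0xCB, 0xCC):
    print(f"{code}/${code:02X}: {token_class(code)}")
```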
Other Commodore BASICs
Other Commodore BASICs define additional tokens. BASIC 3.5, found in the Plus/4 and Commodore 16, and BASIC 7.0, found in the Commodore 128, fill most of the available space. These two BASICs recognize many of the same keywords. BASIC 7.0 repurposes the 206 token; instead of RLUM, it uses 206 as a shift code to make several two-byte tokens. It also uses the 254 code, unused in BASIC 3.5, to make more two-byte tokens.
BASIC 4.0, found in some PETs, provides some additional keywords, some of which occur also in BASIC 3.5 and 7.0.
Token code | BASIC 3.5 | BASIC 7.0 | BASIC 4.0 |
---|---|---|---|
204/$CC | RGR | RGR | CONCAT |
205/$CD | RCLR | RCLR | DOPEN |
206/$CE | RLUM | shift | DCLOSE |
206/$CE 2/$02 | | POT | |
206/$CE 3/$03 | | BUMP | |
206/$CE 4/$04 | | PEN | |
206/$CE 5/$05 | | RSPPOS | |
206/$CE 6/$06 | | RSPRITE | |
206/$CE 7/$07 | | RSPCOLOR | |
206/$CE 8/$08 | | XOR | |
206/$CE 9/$09 | | RWINDOW | |
206/$CE 10/$0A | | POINTER | |
207/$CF | JOY | JOY | RECORD |
208/$D0 | RDOT | RDOT | HEADER |
209/$D1 | DEC | DEC | COLLECT |
210/$D2 | HEX$ | HEX$ | BACKUP |
211/$D3 | ERR$ | ERR$ | COPY |
212/$D4 | INSTR | INSTR | APPEND |
213/$D5 | ELSE | ELSE | DSAVE |
214/$D6 | RESUME | RESUME | DLOAD |
215/$D7 | TRAP | TRAP | CATALOG |
216/$D8 | TRON | TRON | RENAME |
217/$D9 | TROFF | TROFF | SCRATCH |
218/$DA | SOUND | SOUND | DIRECTORY |
219/$DB | VOL | VOL | |
220/$DC | AUTO | AUTO | |
221/$DD | PUDEF | PUDEF | |
222/$DE | GRAPHIC | GRAPHIC | |
223/$DF | PAINT | PAINT | |
224/$E0 | CHAR | CHAR | |
225/$E1 | BOX | BOX | |
226/$E2 | CIRCLE | CIRCLE | |
227/$E3 | GSHAPE | GSHAPE | |
228/$E4 | SSHAPE | SSHAPE | |
229/$E5 | DRAW | DRAW | |
230/$E6 | LOCATE | LOCATE | |
231/$E7 | COLOR | COLOR | |
232/$E8 | SCNCLR | SCNCLR | |
233/$E9 | SCALE | SCALE | |
234/$EA | HELP | HELP | |
235/$EB | DO | DO | |
236/$EC | LOOP | LOOP | |
237/$ED | EXIT | EXIT | |
238/$EE | DIRECTORY | DIRECTORY | |
239/$EF | DSAVE | DSAVE | |
240/$F0 | DLOAD | DLOAD | |
241/$F1 | HEADER | HEADER | |
242/$F2 | SCRATCH | SCRATCH | |
243/$F3 | COLLECT | COLLECT | |
244/$F4 | COPY | COPY | |
245/$F5 | RENAME | RENAME | |
246/$F6 | BACKUP | BACKUP | |
247/$F7 | DELETE | DELETE | |
248/$F8 | RENUMBER | RENUMBER | |
249/$F9 | KEY | KEY | |
250/$FA | MONITOR | MONITOR | |
251/$FB | USING | USING | |
252/$FC | UNTIL | UNTIL | |
253/$FD | WHILE | WHILE | |
254/$FE | none | shift | |
254/$FE 2/$02 | | BANK | |
254/$FE 3/$03 | | FILTER | |
254/$FE 4/$04 | | PLAY | |
254/$FE 5/$05 | | TEMPO | |
254/$FE 6/$06 | | MOVSPR | |
254/$FE 7/$07 | | SPRITE | |
254/$FE 8/$08 | | SPRCOLOR | |
254/$FE 9/$09 | | RREG | |
254/$FE 10/$0A | | ENVELOPE | |
254/$FE 11/$0B | | SLEEP | |
254/$FE 12/$0C | | CATALOG | |
254/$FE 13/$0D | | DOPEN | |
254/$FE 14/$0E | | APPEND | |
254/$FE 15/$0F | | DCLOSE | |
254/$FE 16/$10 | | BSAVE | |
254/$FE 17/$11 | | BLOAD | |
254/$FE 18/$12 | | RECORD | |
254/$FE 19/$13 | | CONCAT | |
254/$FE 20/$14 | | DVERIFY | |
254/$FE 21/$15 | | DCLEAR | |
254/$FE 22/$16 | | SPRSAV | |
254/$FE 23/$17 | | COLLISION | |
254/$FE 24/$18 | | BEGIN | |
254/$FE 25/$19 | | BEND | |
254/$FE 26/$1A | | WINDOW | |
254/$FE 27/$1B | | BOOT | |
254/$FE 28/$1C | | WIDTH | |
254/$FE 29/$1D | | SPRDEF | |
254/$FE 30/$1E | | QUIT | |
254/$FE 31/$1F | | STASH | |
254/$FE 32/$20 | | none | |
254/$FE 33/$21 | | FETCH | |
254/$FE 34/$22 | | none | |
254/$FE 35/$23 | | SWAP | |
254/$FE 36/$24 | | OFF | |
254/$FE 37/$25 | | FAST | |
254/$FE 38/$26 | | SLOW | |
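The two-byte tokens of BASIC 7.0 mean that a decoder has to look at the byte after a shift code (206/$CE or 254/$FE) before it can name the keyword. A minimal Python sketch of that lookup follows, using only a handful of entries from the table above; the function name `decode_tokens` and the "?" placeholder for entries missing from these partial tables are assumptions of the example.

```python
# Small excerpts of the BASIC 7.0 columns in the table above.
SINGLE = {0xCC: "RGR", 0xCD: "RCLR", 0xCF: "JOY", 0xEB: "DO", 0xEC: "LOOP"}
SHIFT_CE = {0x02: "POT", 0x03: "BUMP", 0x08: "XOR"}
SHIFT_FE = {0x02: "BANK", 0x0B: "SLEEP", 0x25: "FAST", 0x26: "SLOW"}


def decode_tokens(data):
    """Name each token in a byte string made up of BASIC 7.0 tokens only."""
    names = []
    i = 0
    while i < len(data):
        b = data[i]
        if b in (0xCE, 0xFE) and i + 1 < len(data):
            # Shift code: the following byte selects the keyword.
            table = SHIFT_CE if b == 0xCE else SHIFT_FE
            names.append(table.get(data[i + 1], "?"))
            i += 2
        else:
            names.append(SINGLE.get(b, "?"))
            i += 1
    return names


print(decode_tokens(bytes([0xFE, 0x25, 0xCE, 0x08, 0xEB])))
```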
All of the additional keywords in BASIC 4.0 also occur in BASIC 7.0, but with different token codes. Some also occur in BASIC 3.5.
Keyword | BASIC 4.0 | BASIC 3.5 | BASIC 7.0 |
---|---|---|---|
CONCAT | 204/$CC | | 254/$FE 19/$13 |
DOPEN | 205/$CD | | 254/$FE 13/$0D |
DCLOSE | 206/$CE | | 254/$FE 15/$0F |
RECORD | 207/$CF | | 254/$FE 18/$12 |
HEADER | 208/$D0 | 241/$F1 | 241/$F1 |
COLLECT | 209/$D1 | 243/$F3 | 243/$F3 |
BACKUP | 210/$D2 | 246/$F6 | 246/$F6 |
COPY | 211/$D3 | 244/$F4 | 244/$F4 |
APPEND | 212/$D4 | | 254/$FE 14/$0E |
DSAVE | 213/$D5 | 239/$EF | 239/$EF |
DLOAD | 214/$D6 | 240/$F0 | 240/$F0 |
CATALOG | 215/$D7 | | 254/$FE 12/$0C |
RENAME | 216/$D8 | 245/$F5 | 245/$F5 |
SCRATCH | 217/$D9 | 242/$F2 | 242/$F2 |
DIRECTORY | 218/$DA | 238/$EE | 238/$EE |