Efficient BASIC coding for the ZX Spectrum (II)

[Click here to read this in English ]

Éste es el segundo de una serie de artículos que explican los fundamentos de la (in)eficiencia de los programas en BASIC puro para el ZX Spectrum:
I. Sobre los números de línea
II. Sobre las variables
III. Sobre las expresiones
IV. Funcionalidades diversas y medida del tiempo
V. Operaciones en la pantalla basadas en caracteres

En la entrega anterior de esta serie de artículos explicábamos que el intérprete de BASIC de la ROM del ZX Spectrum no dispone de una tabla de acceso aleatorio para las líneas de programa, lo que incrementa el tiempo de ejecución a la hora de buscar una concreta, pasando de coste computacional constante a lineal.

En esta segunda parte descubriremos un problema análogo con las variables:

El intérprete no dispone de ninguna información que relacione en tiempo constante el nombre de una variable con la dirección de memoria donde se halla

Hay que reconocer que esto hubiera sido complicado de incluir en aquella época, pues hubiera implicado implementar una estructura de datos más compleja que una simple tabla, debido a que la clave de búsqueda es una cadena de texto (el nombre de la variable) y no un número. A no ser que se hubiera usado un hash, no hubiera tenido coste de O(1) en ningún caso, por lo que tampoco habría merecido mucho la pena en una máquina tan limitada como el ZX.

En cualquier caso, la consecuencia de esta carencia es que encontrar una variable en memoria tiene un coste en el peor caso de O(n*m), donde n es el número de variables existentes y m la longitud del nombre más largo de variable (numérica) definida en el programa. Indicamos numérica porque las variables de texto, las tablas y las de bucle FOR no pueden tener más de una letra en su nombre.

Este proceso de búsqueda es igual al de los números de línea: el intérprete empieza con un puntero a la primera variable en memoria, comprueba si es la que busca recorriendo las letras de su nombre, y, en caso contrario, incrementa el puntero en base al espacio ocupado en memoria por la variable para pasar a comprobar la siguiente.

Por tanto, aquí podemos recomendar, para incrementar la velocidad de ejecución del código, lo siguiente:

Aquellas variables que se usen más a menudo o en zonas que requieran un tiempo de ejecución crítico a) deben crearse antes en el programa y b) deben tener nombres más cortos.

En otras palabras: es conveniente que exista una parte del programa que se encargue de dar valor a todas las variables, en orden de uso decreciente, y que se ejecute al comenzar (debería estar situada al final del programa, por lo que explicamos en la primera entrega de esta serie). También, deberíamos usar nombres cortos en el caso de variables numéricas escalares de uso frecuente. De esa manera las más utilizadas ocuparán lugares iniciales en la zona de variables y se encontrarán más rápido.

A este respecto, la utilidad de transformación de código de ZX-Basicus tiene una opción (--shortenv) que acorta automáticamente los nombres de todas las variables numéricas de nombres largos que pueda. Su contrapartida es que los nombres acortados ya no serán tan comprensibles, por lo que es conveniente usarla sobre un código ya terminado para producir el optimizado.

El programador de juegos BASIC @igNaCoBo me ha sugerido hacer una medida de tiempos incurridos por el intérprete al buscar variables, pues había detectado que a partir de un número determinado de ellas es más conveniente utilizar la memoria directamente para, al menos, las variables de tamaño byte (mediante POKE / PEEK) en lugar de crear esas variables. Para hacer estas medidas se puede partir del siguiente programa:

1  LET a = 1000 :  POKE 23672 , 0 :  POKE 23673 , 0 :  POKE 23674 , 0
2  LET a = a – 1 :  IF a > 0 THEN  GOTO 2
3  LET t =  PEEK 23672 + 256 *  PEEK 23673 + 65536 *  PEEK 23674 :  LET v =  PEEK 23641 + 256 *  PEEK 23642 –  (  PEEK 23627 + 256 *  PEEK 23628 )  :  PRINT “vars size: “ ; v ; “… time: “ ;  ( t * 0.02 / 1000 )  ; ” secs” :  LPRINT v ,  ( t * 0.02 / 1000 )  :  STOP 
10  LET z = 0
11  LET y = 0
12  LET x = 0
13  LET w = 0
14  LET v = 0
15  LET u = 0
16  LET t = 0
17  LET s = 0
18  LET r = 0
19  LET q = 0
20  LET p = 0
21  LET o = 0
22  LET n = 0
23  LET m = 0
24  LET l = 0
25  LET k = 0
26  LET j = 0
27  LET i = 0
28  LET h = 0
29  LET g = 0
30  LET f = 0
31  LET e = 0
32  LET d = 0
33  LET c = 0
34  LET b = 0
35  LET a = 0
36  GOTO 1
37  REM Execute with RUN X, being X a line from 10 to 35 (i.e., 26 tests). The largest X, the faster the execution. The number of vars before the one tested will be (35 – X).

En las líneas 1 a 3 se hace un bucle del que se mide el tiempo total (usando la variable del sistema FRAMES); dividiéndolo por el número de iteraciones del bucle se obtiene el tiempo medio de cada iteración; lo más importante de dicho tiempo es el dedicado a acceder a la variable a en la línea 2, dos veces para lectura y una para escritura.

Queremos saber cuánto influye en dichos accesos el que haya variables creadas en memoria antes de crear la variable a. Para ello están el resto de líneas del programa. Haciendo RUN con un número de línea a partir de 10 se crean un número de variables previas; tantas menos cuanto mayor sea la línea donde comienza la ejecución. Así, RUN 10 creará 25 variables previas a la variable a, mientras que RUN 35 no creará ninguna. Con paciencia se pueden obtener los datos para dibujar la progresión de tiempos incurridos por tener esas variables.

Este programa puede modificarse para usar variables de texto y variables de tabla simplemente cambiando el nombre de las mismas en el primer caso y además usando DIM en vez de LET en el segundo.

Además, podemos plantear el siguiente programa para medir los tiempos incurridos al buscar variables en memoria cuando antes de las mismas hay una que tiene nombre largo:

1  LET a = 1000 :  POKE 23672 , 0 :  POKE 23673 , 0 :  POKE 23674 , 0
2  LET a = a – 1 :  IF a > 0 THEN  GOTO 2
3  LET a =  PEEK 23672 + 256 *  PEEK 23673 + 65536 *  PEEK 23674 :  PRINT  ( a * 0.02 / 1000 )  ; ” secs” :  LPRINT  ( a * 0.02 / 1000 )  :  STOP 
4  LET aa = 0 :  GOTO 1
5  LET aaa = 0 :  GOTO 1
6  LET aaaa = 0 :  GOTO 1
7  LET aaaaa = 0 :  GOTO 1
8  LET aaaaaa = 0 :  GOTO 1
9  LET aaaaaaa = 0 :  GOTO 1
10  LET aaaaaaaa = 0 :  GOTO 1
11  LET aaaaaaaaa = 0 :  GOTO 1
12  LET aaaaaaaaaa = 0 :  GOTO 1
13  LET aaaaaaaaaaa = 0 :  GOTO 1
14  LET aaaaaaaaaaaa = 0 :  GOTO 1
15  LET aaaaaaaaaaaaa = 0 :  GOTO 1
16  LET aaaaaaaaaaaaaa = 0 :  GOTO 1
17  LET aaaaaaaaaaaaaaa = 0 :  GOTO 1
18  LET aaaaaaaaaaaaaaaa = 0 :  GOTO 1
19  LET aaaaaaaaaaaaaaaaa = 0 :  GOTO 1
20  LET aaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
21  LET aaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
22  LET aaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
23  LET aaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
24  LET aaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
25  LET aaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
26  LET aaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
27  LET aaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
28  LET aaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
29  LET aaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
30  LET aaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
31  LET aaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
32  LET aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
33  LET aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
34  REM Execute with RUN X, being X a line from 4 to 33 (i.e., 30 tests). The largest X, the longest the variable before A. The length of the variable before A will be (X – 2)

La estructura es muy similar, con una parte inicial para hacer las iteraciones y otra que, según por donde comience a ejecutar el programa, crea una sola variable numérica de tamaño distinto; esta variable quedará situada en memoria antes de la que se usa en el bucle, por lo que someterá a esta última a un retraso que dependerá de la longitud de su nombre (el intérprete ha de leer el nombre entero de una variable antes de poder saltarla para buscar en la siguiente, ya que no guarda información de la longitud de dicho nombre).

En definitiva, con estos programas y algo de paciencia (más un procesado simple de los datos para visualizarlos) se obtiene esta gráfica, que explicamos a continuación:

Si consideramos el eje horizontal como indicativo del número de variables que existen en memoria antes de una dada, y el eje vertical como el tiempo medio (esperado) en que se incurre para poder acceder a esa dada, vemos que, en el caso de que las variables que hay que saltar sean numéricas de una sola letra (cuadrados azules), el intérprete gasta 248 microsegundos por cada variable de ese tipo que tenga que saltarse. En el caso de que las variables a saltar sean de texto, tarda 219 microsegundos por cada una. En caso de que sean de tabla (numérica), tarda lo mismo (independientemente del tamaño de la tabla). Los dos últimos valores son iguales por la forma en que se guardan estos dos tipos de variable en memoria: el cómputo para poder saltarlas es idéntico, mientras que en las numéricas difiere. Estos valores de tiempo parecen pequeños, pero si se multiplica por 10 (o sea, si tenemos 10 variables en el programa) ya pasan al ámbito de los milisegundos, y el ZX tarda 20 milisegundos en volver a refrescar la pantalla, por lo que consultar la variable número 11 nos reduciría los frames por segundo, y eso sin contar el resto del programa.

En la gráfica también está marcado lo que tardaría el mismo bucle de las primeras líneas de programa si en lugar de usar la variable a se accede directamente a memoria mediante POKE / PEEK (sólo válido si la variable es de tamaño byte). Las medidas de tiempo en este caso se han hecho con el siguiente programa:

10  POKE 16384 , 255 :  POKE 23672 , 0 :  POKE 23673 , 0 :  POKE 23674 , 0
20  POKE 16384 ,  PEEK 16384 – 1 :  IF  PEEK 16384 <> 0 THEN  GOTO 20
30  LET t =  PEEK 23672 + 256 *  PEEK 23673 + 65536 *  PEEK 23674 :  PRINT “time: “ ;  ( t * 0.02 / 255 )  ; ” secs”

En la gráfica de arriba, el resultado de este manejo de variables mediante accesos directos a memoria es la línea negra discontinua. De esta forma, se observa que tener más de 12 ó 13 variables de cualquier tipo (aproximadamente) es lo máximo que deberíamos usar si quisiéramos ser óptimamente eficientes; el resto deberían ser accedidas en memoria directamente. Esto no siempre puede hacerse (¡no todas las variables son de tamaño ni tipo byte!), pero en algunos casos puede ser importante para acelerar el código.

Nótese que la variable de tipo byte está situada en la pantalla, es decir, en memoria compartida con la ULA, lo que podría tener retardos adicionales. Se puede situar más allá de la memoria compartida simplemente cambiando su dirección en el código de arriba, pero los tiempos obtenidos no varían. Esto se debe a que el acceso a la variable en sí desde el código máquina del intérprete es una parte despreciable del tiempo total.

El efecto de la memoria compartida con la ULA se puede observar si todo el programa se desplaza a zonas de memoria no compartida, no sólo la dirección de la variable. Esto puede hacerse insertando líneas REM al principio, de forma que el bucle principal alcance posiciones de memoria mayores. Con una cuidadosa medida de lo que ocupa cada línea (véase la primera entrada de esta serie) puede controlarse exactamente en qué posición de memoria acaba el bucle principal (y también puede cambiarse la dirección de comienzo de programa, como también se explica en aquella entrada, para que el intérprete ignore todas las líneas REM a la hora de ejecutar el bucle). Haciendo todo esto hemos obtenido la línea magenta discontinua de la figura de arriba: se observa que la compartición de memoria con la ULA añade 78.4 microsegundos de media en cada iteración del bucle de este programa.

Finalmente, en la gráfica se incluye también el problema de usar variables numéricas de nombres largos. Si ahora interpretamos el eje horizontal como la longitud de dichos nombres, observamos (estrellas amarillas) qué tiempo se gasta en saltar una sola variable cuyo nombre tenga esa longitud. Se ve que por cada carácter de más se incurre en 26 microsegundos extra.

Hay una reflexión adicional que se puede hacer en este artículo: escribir código en Sinclair BASIC es algo penoso hoy en día, entre otras cosas porque numerar cada línea manualmente implica que, para optimizar posteriormente el código moviendo trozos de un lugar a otro (como explicábamos en el artículo anterior), se debe llevar un registro de las líneas donde está cada rutina con el fin de actualizarlas en cada GO SUB, GO TO, RESTORE, etc.

En este sentido se podría pensar que una buena práctica de programación sería definir variables que contengan las líneas donde se sitúan cada una de las rutinas o bucles que haya en el programa, y usar esas variables en GO TOs, GO SUBs y demás. De esa forma, cuando movamos una rutina de sitio, simplemente podemos cambiar el valor de la variable que indica su posición y el programa seguirá funcionando (nótese que esta estrategia de programación no es necesaria si usamos la utilidad --move de ZX-Basicus, que ya se encarga de actualizar automáticamente todas las referencias a las líneas movidas). Esto es análogo al uso de constantes en un lenguaje moderno: se definen para no tener que repetir el mismo valor en distintos lugares y poder modificar el programa más fácilmente.

El problema de utilizar variables para contener líneas a la que saltar es que, aunque el código queda ciertamente más legible y mantenible, e incluso más corto en memoria si el nombre de la variable es menor de lo que ocupa el literal numérico, hay que tener cuidado con la pérdida de eficiencia: es siempre mejor no tener que evaluar una variable, sino consultar el valor numérico de la línea. Este coste de evaluación puede ser grande, por lo que, en definitiva, parece más conveniente usar números de líneas sólo en forma de literales numéricos y referirse a las mismas con esos literales, confiando en una herramienta automática como la de ZX-Basicus para mover trozos de código cuando haga falta.

En cualquier caso, si optamos por usar variables para almacenar números de línea, al menos en un estadio preliminar de nuestro programa, ZX-Basicus ofrece la opción de transformación --subs1var. Ésta se encarga de buscar todos los usos de una variable numérica que aparezca como argumento único de una sentencia (p.ej., en GO TO, GO SUB, etc.) y sustituirla por su valor numérico, siempre que éste sólo se haya asignado una vez en el programa, es decir, siempre que efectivamente esa variable se esté usando como constante. Una vez hecho eso se puede utilizar la opción de transformación --delunusedv para eliminar las asignaciones de valores a variables que ya no se usen más, y habremos así transformado un programa con saltos a variables en un programa con saltos a números de línea literales, es decir, habremos construido una versión de nuestro programa más eficiente en ejecución sin perder la original, que era mejor en legibilidad y mantenibilidad.

Otra reflexión sobre la creación de variables:

Toda variable nueva se crea al final de la lista existente de variables, pero, aunque el intérprete tiene un puntero a dicho lugar para poder alcanzarlo en O(1), tiene que mover todo el contenido posterior de la memoria BASIC hacia arriba (espacios de trabajo sobre todo) para dejar hueco, lo que es O(n), siendo n el número de bytes que hay desde el final actual de las variables hasta el STKEND.

Estos movimientos de memoria pueden ser costosos; por tanto, es conveniente crear el mínimo número de variables durante la ejecución de partes de código que requieran especial velocidad. Una buena práctica es, como se ha explicado antes, definir todas las variables que usará el programa durante la inicialización del mismo, que se ejecuta al comenzar, una sola vez. Esto es bueno no sólo por rapidez de ejecución, sino por claridad del código: tener un catálogo fácilmente localizable de las variables que usa el programa en un lenguaje que no tiene ámbitos es imprescindible para no cometer errores que luego resultan muy difíciles de detectar.

ZX-Basicus también ayuda a optimizar en este sentido porque permite obtener un catálogo de todas las variables usadas en el programa mediante la utilidad de análisis (-a), que produce muchísima información sobre el mismo. Asimismo, permite detectar y borrar variables que no se usan nunca mediante la opción de transformación --delunusedv, mencionada anteriormente.

Quedan por mencionar otras cuatro particularidades relacionadas con las variables de programa.

La primera tiene que ver con los bucles FOR. Las variables de iteración de dichos bucles son tratadas como un tipo distinto al de las variables numéricas escalares, pero ambas son intercambiables: cuando un FOR usa una variable que ya existía antes como numérica escalar, no crea una variable nueva sino que reutiliza la existente cambiándola a tipo “FOR“; por otra parte, cuando un LET o un READ o un INPUT asigna un valor a una variable numérica escalar que existía anteriormente con tipo “FOR“, ésta última también se reutiliza. Por tanto, las variables de iteración de sentencias FOR no sufren tanto por ser creadas al entrar en el FOR si anterioremente existieran aunque no como tipo FOR. En otras palabras: es bueno añadir al catálogo inicial de variables las que usemos en los FOR, aunque no se creen en ese momento inicial como de tipo “FOR“, sino con LET.

Otra particularidad es que las variables numéricas de tabla (creadas con DIM) pueden tener el mismo nombre que las escalares, y nunca existe colisión entre ellas en memoria; sin embargo, las de texto creadas con LET, READ o INPUT no pueden existir con el mismo nombre que las de texto creadas con DIM, por lo que cualquiera de estas sentencias implica que el intérprete tiene que borrar la variable previamente existente, o ampliar o reducir su actual tamaño, si no es el deseado, lo que de nuevo implica mover una zona de memoria potencialmente grande. La conclusión es que se deben usar nombres diferentes para las variables de texto simples y las de tabla (DIM), a pesar de que esto pueda resultar difícil por la escasez de letras disponibles para ambas.

La tercera particularidad: Los parámetros de las funciones de usuario (DEF FN) son variables que sólo existen mientras se evalúan dichas funciones, “desapareciendo” luego (no implican movimiento de memoria para hacerles sitio ni para desaparecer, ya que los argumentos se copian en unos recipientes especiales previamente reservados en la misma sentencia DEF FN). Esto es, de hecho, lo más parecido a tener un ámbito local en el lenguaje, pero en un sólo nivel de la pila de llamadas. Cuando la función se evalúa, la expresión del cuerpo de la función usará los argumentos correspondientes en lugar de variables de programa que puedan existir con el mismo nombre; recurrirá a estas últimas en caso de que no coincidan con ninguno de los parámetros. Esto no tiene especiales consecuencias en cuanto a la eficiencia del código (aunque es importante recordar que, a causa de ello, no podemos tener funciones de usuario recursivas). Sí que puede causar problemas si usamos nombres de parámetros iguales a los que usa el programa para sus variables y eso nos causa confusiones. Los nombres de parámetros de DEF FN deben ser de una sola letra, por lo que ese riesgo no es menor.

Y por último: Siendo Sinclair BASIC un lenguaje “case-insensitive“, es conveniente seguir reglas fijas para escribir variables (todo en minúsculas o todo en mayúsculas); de lo contrario, las posibilidades de recurrir a una variable pre-existente creyendo que no ha sido usada antes se incrementan bastante, llevando a problemas graves y dificilísimos de identificar.

.oOo.

[Click here to read this in Spanish ]

This is the second in a series of posts that explain the foundations of the (in)efficiency of pure BASIC programs written for the ZX Spectrum:
I. On line numbers
II. On variables
III. On expressions
IV. Some statements and time measurement
V. Screen operations based on characters

In the last post we explained that the Sinclair BASIC interpreter in the ZX Spectrum ROM has no table to map program line numbers into memory addresses, which produces inefficiency when referring to lines during execution.

In this second part of the series, we will deal with a similar problem with variables:

The interpreter has no information that maps, in constant time, names of variables to the memory addresses where they are stored

It is fair to notice that including such mapping in the original ZX would be unrealistic at the time: it involves a data structure far more complicated than a simple table due to the fact that searches must be done on text string keys (variable names). Unless a hash table is implemented, that search cannot be done in constant time (O(1)).

Anyway, the consequence of this is that finding a variable in memory has an O(n*m) worst-case cost, where n is the number of existing variables and m the length of the longest variable name. Such longest name must belong to a numerical, scalar variable, since strings, char arrays, numerical arrays and FOR variables cannot have more than one letter. All in all, the procedure used by the interpreter to seek for a variable in memory is to start pointing to the first one, checking its name (potentially, all its letters), and, if it is not the one that is being looked for, adding its length in memory to point to the next one and repeating.

Therefore, the first recommendation is:

Those variables that are more frequently accessed or that are in regions of code with critical time requirements should be a) created before others and b) have shorter names.

This can be achieved by having part of the program devoted to create all variables, executing it only once at start and placing it at the end of the listing due to the problems discussed in the previous post. The creation of variables should be done from most frequently used to least, and using shorter names for the former.

In this aspect, the ZX-Basicus tool has an option (--shortenv) that automatically shortens all scalar variable names. Alas, the resulting names will not be as meaningful as the original ones!

The programmer of BASIC games @igNaCoBo has suggested to measure the time involved in searching for variables. He has detected that, from a given number of variables up, it is more efficient to not use variables but direct accesses to memory (through POKE / PEEK). In order to get the time measurements, the following program comes in handy:

1  LET a = 1000 :  POKE 23672 , 0 :  POKE 23673 , 0 :  POKE 23674 , 0
2  LET a = a – 1 :  IF a > 0 THEN  GOTO 2
3  LET t =  PEEK 23672 + 256 *  PEEK 23673 + 65536 *  PEEK 23674 :  LET v =  PEEK 23641 + 256 *  PEEK 23642 –  (  PEEK 23627 + 256 *  PEEK 23628 )  :  PRINT “vars size: “ ; v ; “… time: “ ;  ( t * 0.02 / 1000 )  ; ” secs” :  LPRINT v ,  ( t * 0.02 / 1000 )  :  STOP 
10  LET z = 0
11  LET y = 0
12  LET x = 0
13  LET w = 0
14  LET v = 0
15  LET u = 0
16  LET t = 0
17  LET s = 0
18  LET r = 0
19  LET q = 0
20  LET p = 0
21  LET o = 0
22  LET n = 0
23  LET m = 0
24  LET l = 0
25  LET k = 0
26  LET j = 0
27  LET i = 0
28  LET h = 0
29  LET g = 0
30  LET f = 0
31  LET e = 0
32  LET d = 0
33  LET c = 0
34  LET b = 0
35  LET a = 0
36  GOTO 1
37  REM Execute with RUN X, being X a line from 10 to 35 (i.e., 26 tests). The largest X, the faster the execution. The number of vars before the one tested will be (35 – X).

In lines 1 to 3 there is a loop; we measure the total time spent in the loop (using the FRAMES system variable) and then divide it by the number of iterations, thus getting the average time per each one. The most important part of that time is the one spent in accessing variable a in line 2, twice for reading and 1 for writing.

We wish to know the influence on those accesses of the fact that there exist other variables previously created in memory. The rest of lines in the program prepare those previous variables; depending on the starting line of execution, more or less of them are created. Thus, RUN 10 creates 25 variables before variable a, and RUN 35 does not create any one else. With some patience we can gather data to draw the progression of times.

That program can be slightly modified to use string variables instead, and also numberic tables: just change their names and/or use DIM instead of LET.

In addition, we can propose the following program to measure the times involved in looking for variables when there is a previous one created with a long name:

1  LET a = 1000 :  POKE 23672 , 0 :  POKE 23673 , 0 :  POKE 23674 , 0
2  LET a = a – 1 :  IF a > 0 THEN  GOTO 2
3  LET a =  PEEK 23672 + 256 *  PEEK 23673 + 65536 *  PEEK 23674 :  PRINT  ( a * 0.02 / 1000 )  ; ” secs” :  LPRINT  ( a * 0.02 / 1000 )  :  STOP 
4  LET aa = 0 :  GOTO 1
5  LET aaa = 0 :  GOTO 1
6  LET aaaa = 0 :  GOTO 1
7  LET aaaaa = 0 :  GOTO 1
8  LET aaaaaa = 0 :  GOTO 1
9  LET aaaaaaa = 0 :  GOTO 1
10  LET aaaaaaaa = 0 :  GOTO 1
11  LET aaaaaaaaa = 0 :  GOTO 1
12  LET aaaaaaaaaa = 0 :  GOTO 1
13  LET aaaaaaaaaaa = 0 :  GOTO 1
14  LET aaaaaaaaaaaa = 0 :  GOTO 1
15  LET aaaaaaaaaaaaa = 0 :  GOTO 1
16  LET aaaaaaaaaaaaaa = 0 :  GOTO 1
17  LET aaaaaaaaaaaaaaa = 0 :  GOTO 1
18  LET aaaaaaaaaaaaaaaa = 0 :  GOTO 1
19  LET aaaaaaaaaaaaaaaaa = 0 :  GOTO 1
20  LET aaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
21  LET aaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
22  LET aaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
23  LET aaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
24  LET aaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
25  LET aaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
26  LET aaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
27  LET aaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
28  LET aaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
29  LET aaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
30  LET aaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
31  LET aaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
32  LET aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
33  LET aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0 :  GOTO 1
34  REM Execute with RUN X, being X a line from 4 to 33 (i.e., 30 tests). The largest X, the longest the variable before A. The length of the variable before A will be (X – 2)

The structure is quite similar to the program above. It creates a single numeric variable with a name that is longer as the entry point in the program is also higher. The accesses in the loop to the iteration variable will take a time that depends on the length of that long named variable (the BASIC interpret has to read the entire name of a long named variable in order to skip it in its search for others, since it does not stores the length of that long name).

All in all, with these programs and again some patience (and some processing of the data to visualize them) we get this graph, that is explained below:

Considering the horizontal axis as the number of variables that are before another one, and the vertical axis as the expected time to access the latter variable after skipping all the rest, we see that if the skipped variables are one-lettered numeric ones (blue squares), the interpret takes 248 microseconds per variable. In the case that the skipped variables are of type string, it takes 219 microseconds per each one, the same as if they are numeric tables (unregarding the table size). These two latter values are the same due to the way these types are stored in memory, different from the way a numeric var is stored. The times recorded here seem short, but if they are multiplied by just 10 variables that may exist in a program, they go to milliseconds, and the ZX has 20 milliseconds to complete a display frame, reducing the time to do the rest of computations in a relevant amount.

In the graph you can also see the time spent in accessing a byte numeric variable directly on memory through POKE / PEEK. For that case, the following program has been used:

10  POKE 16384 , 255 :  POKE 23672 , 0 :  POKE 23673 , 0 :  POKE 23674 , 0
20  POKE 16384 ,  PEEK 16384 – 1 :  IF  PEEK 16384 <> 0 THEN  GOTO 20
30  LET t =  PEEK 23672 + 256 *  PEEK 23673 + 65536 *  PEEK 23674 :  PRINT “time: “ ;  ( t * 0.02 / 255 )  ; ” secs”

In the graph above, the corresponding result is the dashed black line. You can see how having more than 12 or 13 variables of any type is the maximum if we wish to execute optimally; above that it is more efficient to use direct memory accesses (as long as our variables are of byte type!). The obtained acceleration may be quite important.

Notice that the variable is stored in the screen, that is, within the memory shared with the ULA, which could produce additional delays due to memory contention. It could be placed beyond that memory simply by modifying its address in the BASIC code, but the resulting times are the same. This is due to the fact that the actual access to the variable in memory by the machine code of the interpreter is a negligible part of the total time.

The effect of contended memory can be observed if the program itself is moved to higher memory regions, not just the accessed variable. This can be done by inserting lines with REM at the beginning, thus the main loop reaches higher memory addresses. With a careful measurement of the size of each of such lines (see the first post in this series) you can control the exact position in memory of the main loop of the program (also, you can modify the start address of the program for the interpreter to ignore the previous lines when executing the loop). Doing all of that you can obtain the magenta dashed line in the above figure: there it is shown that memory contention adds 78.4 microsegundos on average at each iteration of the main loop of our program.

Finally, the graph also shows the problem of using long names for the numeric variables. For seeing that, the horizontal axis has to be interpreted as the length of the name of a variable of that type that is to be skipped to reach another one in memory. It is shown that for each character in the name of the long named variable, the interpret spends 26 microseconds.

A first additional comment on this has to do with the numbering of lines: The original Sinclair BASIC language makes programming very tedious today, particularly because the programmer must explicitly write, and therefore maintain, all the numbers of lines. If, during coding, we re-place parts of the code (as we explained in the previous post), all the affected lines must be renumbered. Although ZX-Basicus has an automatic option to do this (--move), if we wish to do it manually it requires to keep a list of line numbers and their meaning.

Regarding this, it could be useful to define variables that contain the line numbers of significant routines or loops, and use them in the corresponding GO TO, GO SUB, etc. In this way, moving a piece of code amounts to just changing the value of the involved variables. This is analogous to the modern use of constants to refer to information that does not vary along the code or during execution.

The issue with this solution is that, although the code gets more maintainable and easier to read, and even shorter in memory if the variable names of the involved lines are shorter than their numeric literals, you must be careful with efficiency: any reference to those variables (a GO TO, GO SUB, etc.) will involve the evaluation of an expression (the one containing the variable name), which in turn involves consulting the variable in memory. All in all, it seems more effective to use the automatic tool of ZX-Basicus to move pieces of code and renumber automatically, and do not use variables for holding line numbers.

Nevertheless, if your code has already this use of variables for line numbers, ZX-Basicus helps in getting rid of them through the option --subs1var. It looks automatically for all the uses of any numerical variable as an argument of a statement that requires a line number, and substitutes them by the line number itself, as a numeric literal, as long as the variable has been assigned only once, i.e., it is a “constant” in the program. Once this is done, another option (--delunusedv) can be used to delete any statement using those variables, since they are no longer used.

The second additional comment in this post has to do with the creation of variables:

Each variable that is created anew is stored at the end of the existing variable area in RAM. However, although there exist a pointer to that place that the interpreter uses for that, the creation involves to move all the memory content above it in order to leave enough room for the new variable. That content may be potentially large (work spaces, mostly), which makes the operation O(n), being n the number of bytes from the end of the variable area to the STKEND system variable.

Therefore, it is convenient to create the minimum number of variables during the execution of the parts of the program that requiere some speed. A good practice is, as explained before, to define all the variables once at the start of the program instead of creating them by demand. This is not only good for improving the execution speed, but for the clarity and maintainability of the code: having a catalog of the existing variables in a language that has no scopes (all are global) becomes essential to avoid errors that can be very difficult to detect, find and fix.

ZX-Basicus can also help in this matter, since it can produce a list with all existing variables in a program (through the analysis tool, -a). Moreover, it can detect and delete automatically all unused variables through the optimization option --delunusedv commented before.

To end this post we will discuss four particularities related to program variables.

The first one has to do with FOR loops. Their iteration variables are considered of a type different from scalar, numeric variables, but both are exchangeable: when a FOR statement uses an already existing variable (existing as a scalar, not as a FOR), it does not create a new one but re-uses it by changing its type to FOR-type. Also, when LET, READ or INPUT assign a value to a scalar numeric value that already existed with FOR-type, the existing one is also re-used. The conclusion is that the iteration variables in FOR loops do not suffer much from the issue in the creation of variables explained before, since they can be created at start, as also explained, as simple scalars, with LET.

The second particularity is that array numerical variables (created with DIM) can have the same names as scalar ones without producing name collisions. However, string variables created with LET, READ or INPUT cannot co-exist with char arrays created with DIM if they have the same name. The conclusion is that any of these statements will create a new string variable, with all its consequences. Again, they should be created at start. This is specially problematic since string and char array variables can only have one-lettered names, which limits their number.

The third particularity is that the parameters of user functions (DEF FN) are variables that only exist while the function is being evaluated, “dissapearing” right after that. They do not require the conventional creation procedure, since there exist placeholders for them reserved when the program was pre-processed (before execution). This is the closest feature to having local scopes, but only permits one depth level in the function calls (there is no possible recursivity). We have to pay attention to the use of variables in the program with the same names as the parameters of user functions, since the latter will hide the former silently during execution and that can produce errors.

The last particularity is that the Sinclair BASIC language is case-insensitive, i.e., there is no distinction between variable names in lower and upper cases. It is convenient, thus, to use only one of these cases along the program; otherwise, names collisions, many of them potentially unnoticed, my produce errors difficult to debug and fix.

4 thoughts on “Efficient BASIC coding for the ZX Spectrum (II)”