Racket's match shows strange movements

Hello.

I'm trying translating the WhiteSpace language onto Racket, because it was originally written in Haskell, and that looks relatively easy to be written in Racket.
As you know, Haskell uses a bunch of Pattern Matching; therefore, I have to rely on the match of Racket, especially in the parsing process.

Original Source Code in Haskell

What I tried translating is like this:

#! /usr/bin/env racket
#lang racket

(require srfi/1 2htdp/batch-io)

(define (vm prog stack cstack heap pc)
  (let ((instr (list-ref prog pc)))
    ((doInstr prog stack cstack heap (+ pc 1)) instr)))

;; Running individual instructions

(define (doInstr prog stack cs heap pc)
  (lambda (instr)
    (case (car instr)
      ((Push) (let ((n (cadr instr)))
                (vm prog (cons n stack) cs heap pc)))
      ((Dup) (let ((n (car stack)))
               (vm prog (cons n stack) cs heap pc)))
      ((Ref) (let ((i (cadr instr)))
               (vm prog
                  (cons (list-ref stack i) stack)
                  cs heap pc)))
      ((Slide) (let ((i (cadr instr)) (n (car stack)))
                 (vm prog
                    (cons n (drop i (cdr stack)))
                    cs heap pc)))
      ((Swap) (let ((n (car stack)) (m (cadr stack)))
                (vm prog
                   (cons m (cons n (cddr stack)))
                   cs heap pc)))
      ((Discard) (vm prog (cdr stack) cs heap pc))
      ((Infix) (let ((op (cadr instr))
                     (y (car stack))
                     (x (cadr stack)))
                 (vm prog
                    (cons ((case op
                             ((Plus) +)
                             ((Minul) -)
                             ((Times) *)
                             ((Divide) /)
                             ((Modulo) modulo))
                           x y) (cddr stack))
                    cs heap pc)))
      ((OutputChar) (let ((n (car stack)))
                      (write-char (integer->char n))
                      (flush-output)
                      (vm prog (cdr stack) cs heap pc)))
      ((ReadChar) (let ((loc (car stack))
                        (ch (read-char)))
                    (let ((hp
                           (store (char->integer ch) loc heap)))
                      (vm prog (cdr stack) cs hp pc))))
      ((ReadNum) (let ((loc (car stack))
                       (ch (read-line)))
                   (let ((num (string->number ch)))
                     (let ((hp (store num loc heap)))
                       (vm prog (cdr stack) cs hp pc)))))
      ((OutputNum) (let ((n (car stack)))
                     (write-string (number->string n))
                     (flush-output)
                     (vm prog (cdr stack) cs heap pc)))
      ((Label) (let ((_ (cadr instr)))
                 (vm prog stack cs heap pc)))
      ((Call) (let ((l (cadr instr)))
                (let ((loc (findLabel l prog)))
                  (vm prog stack (cons pc cs) heap loc))))
      ((Jump) (let ((l (cadr instr)))
                (let ((loc (findLabel l prog)))
                  (vm prog stack cs heap loc))))
      ((If) (let ((t (cadr instr))
                  (l (caddr instr))
                  (n (car stack)))
              (if ((case t
                     ((Zero) zero?)
                     ((Negative) negative?)) n)
                 (let ((loc (findLabel l prog)))
                   (vm prog (cdr stack) cs heap loc))
                 (vm prog (cdr stack) cs heap pc))))
      ((Return) (let ((c (car cs)))
                  (vm prog stack (cdr cs) heap c)))
      ((Store) (let ((n (car stack)) (loc (cadr stack)))
                 (let ((hp (store n loc heap)))
                   (vm prog stack cs hp pc))))
      ((Retrieve) (let ((loc (car stack)))
                    (let ((val (retrieve loc heap)))
                      (vm prog (cons val stack) cs heap pc))))
      ((End) #f)
      (else (error "Can't do " instr)))))

;; Digging out labels from wherever they are

(define (findLabel l p)
  (let loop ((m l) (p p) (i 0))
    (if (null? p)
       (error "Undefined label" l)
       (let ((l (car p)))
         (if (eqv? l m)
            i
            (loop m (cdr p) (+ i 1)))))))

;; Heap management

(define (retrieve x heap) (list-ref heap x))

(define (store x n Heap)
  (cond ((zero? n) (if (null? Heap)
                      `(,x)
                      (cons x (cdr Heap))))
       ((null? Heap) (append (make-list n 0) `(,x)))
       (else
        (let-values (((head tail) (split-at Heap n)))
          (append head `(,x) (cdr tail))))))

(define *A* #\space)
(define *B* #\tab)
(define *C* #\newline)

(define (execute fname)
  (let ((prog (read-file fname)))
    (let ((tokens (tokenize prog)))
      (let ((runtime (parse tokens)))
        (vm runtime '() '() '() 0)))))

(define (tokenize str)
  (let loop ((ls (string->list str)) (acc '()))
    (if (null? ls)
       (reverse acc)
       (let ((x (car ls)))
         (loop (cdr ls) (if (or (char=? x *A*)
                               (char=? x *B*)
                               (char=? x *C*))
                           (cons x acc)
                           acc))))))

(define (parse ls)
  (let loop ((ls ls) (acc '()))
    (if (null? ls)
       (reverse acc)
       (match ls
         ((list *A* *A* xs ...)
          (let-values (((num rest) (parseNumber xs)))
            (loop rest (cons `(Push ,num) acc))))
         ((list *A* *C* *A* xs ...)
          (loop xs (cons '(Dup) acc)))
         ((list *A* *B* *A* xs ...)
          (let-values (((num rest) (parseNumber xs)))
            (loop rest (cons `(Ref ,num) acc))))
         ((list *A* *B* *C* xs ...)
          (let-values (((num rest) (parseNumber xs)))
            (loop rest (cons `(Slide ,num) acc))))
         ((list *A* *C* *B* xs ...)
          (loop xs (cons '(Swap) acc)))
         ((list *A* *C* *C* xs ...)
          (loop xs (cons '(Discard) acc)))
         ((list *B* *A* *A* *A* xs ...)
          (loop xs (cons '(Infix Plus) acc)))
         ((list *B* *A* *A* *B* xs ...)
          (loop xs (cons '(Infix Minus) acc)))
         ((list *B* *A* *A* *C* xs ...)
          (loop xs (cons '(Infix Times) acc)))
         ((list *B* *A* *B* *A* xs ...)
          (loop xs (cons '(Infix Divide) acc)))
         ((list *B* *A* *B* *B* xs ...)
          (loop xs (cons '(infix Modulo) acc)))
         ((list *B* *B* *A* xs ...)
          (loop xs (cons '(Store) acc)))
         ((list *B* *B* *B* xs ...)
          (loop xs (cons '(Retrieve) acc)))
         ((list *C* *A* *A* xs ...)
          (let-values (((string rest) (parseString xs)))
            (loop rest (cons `(Label ,string) acc))))
         ((list *C* *A* *B* xs ...)
          (let-values (((string rest) (parseString xs)))
            (loop rest (cons `(Call ,string) acc))))
         ((list *C* *A* *C* xs ...)
          (let-values (((string rest) (parseString xs)))
            (loop rest (cons `(Jump ,string) acc))))
         ((list *C* *B* *A* xs ...)
          (let-values (((string rest) (parseString xs)))
            (loop rest (cons `(If Zero ,string) acc))))
         ((list *C* *B* *B* xs ...)
          (let-values (((string rest) (parseString xs)))
            (loop rest (cons `(If Negative string) acc))))
         ((list *C* *B* *C* xs ...)
          (let-values (((string rest) (parseString xs)))
            (loop rest (cons '(Return) acc))))
         ((list *C* *C* *C* xs ...)
          (loop xs (cons '(End) acc)))
         ((list *B* *C* *A* *A* xs ...)
          (loop xs (cons '(OutputChar) acc)))
         ((list *B* *C* *A* *B* xs ...)
          (loop xs (cons '(OutputNum) acc)))
         ((list *B* *C* *B* *A* xs ...)
          (loop xs (cons '(ReadChar) acc)))
         ((list *B* *C* *B* *B* xs ...)
          (loop xs (cons '(ReadNum) acc)))
         (else (error "Unrecognised input"))))))

(define (parseNumber ts)
  (let loop ((ts ts) (acc '()))
    (let ((x (car ts)) (rest (cdr ts)))
      (if (char=? x *C*)
         (values (makeNumber acc) rest)
         (loop rest (cons x acc))))))

(define (parseString ts)
  (let loop ((ts ts) (acc '()))
    (let ((x (car ts)) (rest (cdr ts)))
      (if (char=? x *C*)
         (values (makeString acc) rest)
         (loop rest (cons x acc))))))

(define (makeNumber t)
  (let ((sign (last t)) (ls (reverse (take t (- (length t) 1)))))
    (if (null? ls)
       0
       (let ((num (string->number
                   (list->string
                    (map (lambda (x)
                           (cond ((char=? x *A*) #\0)
                                ((char=? x *B*) #\1))) ls)) 2)))
         (if (char=? sign *A*)
            num
            (- num))))))

(define makeString list->string)

(define (main args)
  (if (not (= (length args) 1))
     (usage)
     (execute (car args))))

(define (usage)
  (display "wspace with Racket 0.1 (c) 2021 Cametan\n")
  (display "-------------------------------\n")
  (display "Usage: wspace [file]\n"))

;(main (vector->list (current-command-line-arguments)))

I try running hello world on the program I wrote; however I noticed the pattern matching in the parsing process does not work properly.
To be specific, with the debugger, the pattern of #\space, #\tab, and #\newline only matches in the first clause, or (list A A xs ...), where A is #\space.
(This makes the accumulator, or acc, be a list, fully consisting of list of (Push n))
As a result, the program raises an Error.
Did I make any mistakes to use match of Racket?

Thanks.

1 Like

When you write a pattern (list *A* *A* xs ...) you are binding to new variables *A* and xs.

What you most likely want for patterns that need to match existing variables is the (== expr) pattern. So the pattern you wrote (list *A* *A* xs ...) should be re-written to (list (== *A*) (== *A*) xs ...).

1 Like

Oh, I see.
O.K. The variables used in match does not refer to the variables outside.
In fact, they are new variables defined.

Hmmm... Haskell codes used A, B, and C, so I adjusted; however it was a bad idea.

Anyway, thanks. Your advise is very helpful.

1 Like

Ummmm.... Strange.

I rewrote like:

(list #\space #\space xs ...)

; but this gives me:

Unrecognised input

instead.
Constants, or Characters like #\space, #\tab, or #\newline can not be used in match......?
The reference manual said nothing about it, and basically almost all examples are related to lists, consisting of numbers.

Such a mysterious behavior of match......
I'm afraid of using this.

Something may be off with your input. match does work with literal values:

(match (list #\space #\space 1 2 3 4)
  [(list #\space #\space xs ...) 'here]
  [_ 'there])
;; => 'here

(match (list #\space 1 2 3 4)
  [(list #\space #\space xs ...) 'here]
  [_ 'there])
;; => 'there

With that large match there may be a case that you are missing or the input is not quite what you are expecting. One thing you may want to try is adding the value of ls to the error message for "Unrecognised input" to see what is failing.

The Fly said "Be Afraid, Be Very Afraid.".

Anyway.
According to the specification of the "White Space" language, there might be something missed around the combination of #\space, #\tab, and #\newline.
Yes, it is obvious.

The tokenizer I translated into Racket gives back a list like:

(#\space #\space #\space #\space #\newline #\space #\space #\space #\tab #\space #\space #\tab #\space #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\newline #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\newline #\space #\space #\space #\tab #\tab #\space #\tab #\tab #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\tab #\newline #\space #\space #\space #\tab #\tab #\space #\tab #\tab #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\space #\newline #\space #\space #\space #\tab #\tab #\space #\tab #\tab #\tab #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\newline #\space #\space #\space #\tab #\space #\tab #\tab #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\tab #\space #\newline #\space #\space #\space #\tab #\space #\space #\space #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\tab #\tab #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\space #\space #\newline #\space #\space #\space #\tab #\tab #\space #\tab #\tab #\tab #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\space #\tab #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\space #\newline #\space #\space #\space #\tab #\tab #\space #\tab #\tab #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\tab #\newline #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\tab #\space #\space #\newline #\space #\space #\space #\tab #\space #\space #\space #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\tab #\space #\tab #\newline #\space #\space #\space #\tab #\tab #\space #\tab #\tab #\tab #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\tab #\tab #\space #\newline #\space #\space #\space #\tab #\tab #\space #\space #\tab #\tab #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\tab #\tab #\tab #\newline #\space #\space #\space #\tab #\space #\space #\space #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\space #\space #\space #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\space #\space #\tab #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\space #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\space #\tab #\space #\newline #\space #\space #\space #\tab #\tab #\space #\space #\space #\space #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\space #\tab #\tab #\newline #\space #\space #\space #\tab #\tab #\space #\space #\space #\tab #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\space #\space #\newline #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\space #\tab #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\tab #\space #\newline #\space #\space #\space #\tab #\space #\space #\space #\space #\tab #\newline #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\tab #\tab #\newline #\space #\space #\space #\space #\newline #\tab #\tab #\space #\space #\space #\space #\space #\newline #\newline #\space #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\tab #\space #\space #\tab #\space #\tab #\tab #\tab #\space #\tab #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\newline #\newline #\space #\tab #\space #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\space #\tab #\tab #\space #\space #\space #\tab #\tab #\space #\tab #\space #\space #\tab #\space #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\newline #\newline #\newline #\newline #\newline #\space #\space #\space #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\space #\newline #\tab #\space #\space #\space #\newline #\tab #\newline #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\tab #\space #\space #\tab #\space #\tab #\tab #\tab #\space #\tab #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\newline #\space #\newline #\space #\tab #\tab #\tab #\space #\newline #\space #\newline #\tab #\space #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\tab #\space #\space #\tab #\space #\tab #\tab #\tab #\space #\tab #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\space #\tab #\tab #\tab #\tab #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\space #\newline #\tab #\newline #\space #\space #\space #\space #\space #\tab #\newline #\tab #\space #\space #\space #\newline #\space #\newline #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\tab #\space #\space #\tab #\space #\tab #\tab #\tab #\space #\tab #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\newline #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\tab #\space #\space #\tab #\space #\tab #\tab #\tab #\space #\tab #\space #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\space #\tab #\tab #\tab #\tab #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\space #\newline #\space #\newline #\newline #\space #\newline #\newline #\newline #\tab #\newline #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\space #\newline #\space #\newline #\space #\space #\newline #\space #\tab #\newline #\tab #\space #\tab #\tab #\tab #\space #\newline #\space #\space #\space #\space #\tab #\space #\tab #\space #\newline #\tab #\space #\space #\tab #\newline #\tab #\space #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\space #\space #\tab #\space #\tab #\tab #\tab #\tab #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\space #\newline #\space #\newline #\newline #\space #\space #\space #\tab #\newline #\tab #\space #\space #\space #\newline #\space #\newline #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\space #\newline #\newline #\space #\space #\space #\tab #\tab #\tab #\space #\space #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\space #\space #\space #\space #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\space #\space #\tab #\space #\tab #\tab #\tab #\tab #\tab #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\space #\newline #\space #\newline #\newline #\space #\space #\space #\tab #\newline #\tab #\space #\space #\space #\space #\space #\space #\space #\newline #\tab #\tab #\space #\newline #\tab #\newline #\newline #\space #\space #\space #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\tab #\space #\tab #\tab #\space #\tab #\tab #\space #\space #\space #\tab #\tab #\space #\tab #\space #\space #\tab #\space #\tab #\tab #\space #\tab #\tab #\tab #\space #\space #\tab #\tab #\space #\space #\tab #\space #\tab #\newline #\space #\space #\space #\tab #\space #\tab #\space #\newline #\space #\space #\space #\tab #\tab #\space #\tab #\newline #\tab #\newline #\space #\space #\tab #\newline #\space #\space #\newline #\tab)

about "Hello, World" program, accoding to the example, which the original Haskell source code package has.
And, yes, as you have shown, match works well on the Racket interpreter.

> (define code (tokenize (read-file "hworld.ws")))
> (match code
    ((list #\space #\space xs ...) 'here)
    (_ 'there))
'here
>

This is reasonable, because both the car and the cadr of code, tokenizing "Hello World" program written in the Whitespace language, are #\space's.
However, I believe that the parse program I translated has as same structure as you had shown, except its size.
The debugger said nothing, that means, when the parser eats the list shown above, it immediately goes to the else clause, even though the 1st and the 2nd element of the list are #\space's. Nothing happens. No process. No Stack. No Variables.
Weird.

Anyway, I try testing around the code for a while.

Thank you!

I took a stab at running the the whitespace program you provided through the interpreter after changing the else clause of the match to (xs (error 'parse "Unrecognised input: ~s" xs)) and the error I get is Unrecognized input: (#\newline #\tab). My interpreter (adapted from the one you posted) may have diverged from yours, but it appears to make it to the end of the input before erroring for me.

1 Like

Thanks, you gave me a big-big clue of "What is wrong".

Actually I wonderd, because though the program should end with the End command, or #\newline #\newline #newline, or the Return command, or #\newline #\tab #\newline, the tokenized one does not have them at the end.
The problem may occur at "read-file" of 2htdp/batch-io. It may skip some white-space stuffs at the end of file.
Therefore, I have to make a read-file stuff myself.

Anyway I really-really appreciate. Thanks!

1 Like

Finally, by using examples of the original source code, the racket code has passed the debugging process.

Here is the Whitespace language implementation with Racket:

#! /usr/bin/env racket
#lang racket

(require srfi/1 srfi/13 srfi/48 2htdp/batch-io)

;SRFI 1: List Library
;https://srfi.schemers.org/srfi-1/srfi-1.html
;SRFI 13: String Libraries
;https://srfi.schemers.org/srfi-13/srfi-13.html
;SRFI 48: Intermediate Format Strings 
;https://srfi.schemers.org/srfi-48/srfi-48.html
;Batch Input/Output: "batch-io.rkt"
;https://docs.racket-lang.org/teachpack/2htdpbatch-io.html

;;; Stack machine for running whitespace programs 

(define (vm prog stack cstack heap pc)
  (let ((instr (list-ref prog pc)))
    ((doInstr prog stack cstack heap (+ pc 1)) instr)))

;; Running individual instructions

(define (doInstr prog stack cs heap pc)
  (lambda (instr)
    (case (car instr)
      ((Push) (let ((n (cadr instr)))
                (vm prog (cons n stack) cs heap pc)))
      ((Dup) (let ((n (car stack)))
               (vm prog (cons n stack) cs heap pc)))
      ((Ref) (let ((i (cadr instr)))
               (vm prog
                  (cons (list-ref stack i) stack)
                  cs heap pc)))
      ((Slide) (let ((i (cadr instr)) (n (car stack)))
                 (vm prog
                    (cons n (drop (cdr stack) i))
                    cs heap pc)))
      ((Swap) (let ((n (car stack)) (m (cadr stack)))
                (vm prog
                   (cons m (cons n (cddr stack)))
                   cs heap pc)))
      ((Discard) (vm prog (cdr stack) cs heap pc))
      ((Infix) (let ((op (cadr instr))
                     (y (car stack))
                     (x (cadr stack)))
                 (vm prog
                    (cons ((case op
                             ((Plus) +)
                             ((Minus) -)
                             ((Times) *)
                             ((Divide) /)
                             ((Modulo) modulo))
                           x y) (cddr stack))
                    cs heap pc)))
      ((OutputChar) (let ((n (car stack)))
                      (write-char (integer->char n))
                      (flush-output)
                      (vm prog (cdr stack) cs heap pc)))
      ((ReadChar) (let ((loc (car stack))
                        (ch (read-char)))
                    (let ((hp
                           (store (char->integer ch) loc heap)))
                      (vm prog (cdr stack) cs hp pc))))
      ((ReadNum) (let ((loc (car stack))
                       (ch (read-line)))
                   (let ((num (string->number ch)))
                     (let ((hp (store num loc heap)))
                       (vm prog (cdr stack) cs hp pc)))))
      ((OutputNum) (let ((n (car stack)))
                     (write-string (number->string n))
                     (flush-output)
                     (vm prog (cdr stack) cs heap pc)))
      ((Label) (let ((_ (cadr instr)))
                 (vm prog stack cs heap pc)))
      ((Call) (let ((l (cadr instr)))
                (let ((loc (findLabel l prog)))
                  (vm prog stack (cons pc cs) heap loc))))
      ((Jump) (let ((l (cadr instr)))
                (let ((loc (findLabel l prog)))
                  (vm prog stack cs heap loc))))
      ((If) (let ((t (cadr instr))
                  (l (caddr instr))
                  (n (car stack)))
              (if ((case t
                     ((Zero) zero?)
                     ((Negative) negative?)) n)
                 (let ((loc (findLabel l prog)))
                   (vm prog (cdr stack) cs heap loc))
                 (vm prog (cdr stack) cs heap pc))))
      ((Return) (let ((c (car cs)))
                  (vm prog stack (cdr cs) heap c)))
      ((Store) (let ((n (car stack)) (loc (cadr stack)))
                 (let ((hp (store n loc heap)))
                   (vm prog (cddr stack) cs hp pc))))
      ((Retrieve) (let ((loc (car stack)))
                    (let ((val (retrieve loc heap)))
                      (vm prog (cons val (cdr stack)) cs heap pc))))
      ((End) (format #t "Done.~%Stack size ~a~%Heap size ~a~%"
                    (length cs) (length heap)))
      (else (error "Can't do " (car instr))))))

;; Digging out labels from wherever they are

(define (findLabel l p)
  (let loop ((p p) (i 0))
    (if (null? p)
       (error "Undefined label" l)
       (let ((m (car p)))
         (if (and (eq? (car m) 'Label) (eq? (cadr m) l))
            i
            (loop (cdr p) (+ i 1)))))))

;; Heap management

(define (retrieve x heap) (list-ref heap x))

;(define (store x n Heap)
;  (match (list x n Heap)
;    ((list x 0 (cons h hs)) (cons x hs))
;    ((list x n (cons h hs)) (let ((hp (store x (- n 1) hs)))
;                              (cons h hp)))
;    ((list x 0 '()) (cons x '()))
;    ((list x n '()) (let ((hp (store x (- n 1) '())))
;                      (cons 0 hp)))))

(define (store x n Heap)
  (let ((h (length Heap)))
    (if (< n h)
       (let-values (((head tail) (split-at Heap n)))
         (append head `(,x) (cdr tail)))
       (append Heap (make-list (- n h) 0) `(,x)))))
       
;input to the whitespace VM.
;For convinience, three input characters
;#\space, #\tab, #\newline
;
;Numbers are binary (#\space = 0, #\tab = 1, #\newline = terminator)
;Strings are sequences of binary characters, terminated by #\newline .
;
;We have:
;
;* Stack instructions (Preceded by #\space)
;      Push (Integer)     #\space
;      Dup      #\newline #\space
;      Swap    #\newline  #\tab
;      Discart #\newline #\newline
;
;* Arithmetic (Preceded by #\tab #\space)
;      Plus   #\space #\space
;      Minus  #\space #\tab
;      Times  #\space #\newline
;      Divide #\tab   #\space
;      Modulo #\tab   #\tab
;
;* Heap access (Preceded by #\tab #\tab)
;      Store     #\space
;      Retrieve #\tab
;
;* Control (Preceded by #\newline)
;      Label Symbol  #\space #\space
;      Call Label    #\space #\tab
;      Jump Label    #\space #\newline
;      If Zero Label #\tab #\space
;      If Neg Label  #\tab #\tab
;      Return         #\tab #\newline
;      End             #\newline #\newline
;
;* IO instructions (Preceded by #\tab #\newline)
;      OutputChar #\space #\space
;      OutputNum  #\space #\tab
;      ReadChar   #\tab #\space
;      ReadNum    #\tab #\tab

(define (execute fname)
  ;; "read-file" of 2htdp/batch-io drops a "\n" at the end of text file when it eats the file 
  (let ((prog (string-append (read-file fname) "\n")))
    (let ((tokens (tokenize prog)))
      (let ((runtime (parse tokens)))
        (vm runtime '() '() '() 0)))))

(define (tokenize str)
  (filter (lambda (x)
            (or (char=? x #\space)
               (char=? x #\tab)
               (char=? x #\newline))) (string->list str)))

(define (parse ls)
  (let loop ((ls ls) (acc '()))
    (if (null? ls)
       (reverse acc)
       (match ls
         ((list #\space #\space xs ...)
          (let-values (((num rest) (parseNumber xs)))
            (loop rest (cons `(Push ,num) acc))))
         ((list #\space #\newline #\space xs ...)
          (loop xs (cons '(Dup) acc)))
         ((list #\space #\tab #\space xs ...)
          (let-values (((num rest) (parseNumber xs)))
            (loop rest (cons `(Ref ,num) acc))))
         ((list #\space #\tab #\newline xs ...)
          (let-values (((num rest) (parseNumber xs)))
            (loop rest (cons `(Slide ,num) acc))))
         ((list #\space #\newline #\tab xs ...)
          (loop xs (cons '(Swap) acc)))
         ((list #\space #\newline #\newline xs ...)
          (loop xs (cons '(Discard) acc)))
         ((list #\tab #\space #\space #\space xs ...)
          (loop xs (cons '(Infix Plus) acc)))
         ((list #\tab #\space #\space #\tab xs ...)
          (loop xs (cons '(Infix Minus) acc)))
         ((list #\tab #\space #\space #\newline xs ...)
          (loop xs (cons '(Infix Times) acc)))
         ((list #\tab #\space #\tab #\space xs ...)
          (loop xs (cons '(Infix Divide) acc)))
         ((list #\tab #\space #\tab #\tab xs ...)
          (loop xs (cons '(infix Modulo) acc)))
         ((list #\tab #\tab #\space xs ...)
          (loop xs (cons '(Store) acc)))
         ((list #\tab #\tab #\tab xs ...)
          (loop xs (cons '(Retrieve) acc)))
         ((list #\newline #\space #\space xs ...)
          (let-values (((symbol rest) (parseSymbol xs)))
            (loop rest (cons `(Label ,symbol) acc))))
         ((list #\newline #\space #\tab xs ...)
          (let-values (((symbol rest) (parseSymbol xs)))
            (loop rest (cons `(Call ,symbol) acc))))
         ((list #\newline #\space #\newline xs ...)
          (let-values (((symbol rest) (parseSymbol xs)))
            (loop rest (cons `(Jump ,symbol) acc))))
         ((list #\newline #\tab #\space xs ...)
          (let-values (((symbol rest) (parseSymbol xs)))
            (loop rest (cons `(If Zero ,symbol) acc))))
         ((list #\newline #\tab #\tab xs ...)
          (let-values (((symbol rest) (parseSymbol xs)))
            (loop rest (cons `(If Negative ,symbol) acc))))
         ((list #\newline #\tab #\newline xs ...)
          (loop xs (cons '(Return) acc)))
         ((list #\newline #\newline #\newline xs ...)
          (loop xs (cons '(End) acc)))
         ((list #\tab #\newline #\space #\space xs ...)
          (loop xs (cons '(OutputChar) acc)))
         ((list #\tab #\newline #\space #\tab xs ...)
          (loop xs (cons '(OutputNum) acc)))
         ((list #\tab #\newline #\tab #\space xs ...)
          (loop xs (cons '(ReadChar) acc)))
         ((list #\tab #\newline #\tab #\tab xs ...)
          (loop xs (cons '(ReadNum) acc)))
         (else (error "Unrecognized input"))))))

(define (parseNumber ts)
  (let loop ((ts ts) (acc '()))
    (let ((x (car ts)) (rest (cdr ts)))
      (if (char=? x #\newline)
         (values (makeNumber acc) rest)
         (loop rest (cons x acc))))))

(define (parseSymbol ts)
  (let loop ((ts ts) (acc '()))
    (let ((x (car ts)) (rest (cdr ts)))
      (if (char=? x #\newline)
         (values (makeSymbol acc) rest)
         (loop rest (cons x acc))))))

(define (makeNumber t)
  (let ((sign (last t)) (ls (reverse (take t (- (length t) 1)))))
    (if (null? ls)
       0
       (let ((num (string->number
                   (list->string
                    (map (lambda (x)
                           (cond ((char=? x #\space) #\0)
                                ((char=? x #\tab) #\1))) ls)) 2)))
         (if (char=? sign #\space)
            num
            (- num))))))

(define (makeSymbol ls)
  (let loop ((s (string-map (lambda (x)
                              (cond ((char=? x #\space) #\0)
                                   ((char=? x #\tab) #\1)))
                           (list->string (reverse ls))))
             (acc '()))
    (if (string-null? s)
       (string->symbol (list->string (reverse acc)))
       (loop (string-drop s 8)
            (cons (integer->char (string->number (string-take s 8) 2))
                 acc)))))

(define (main args)
  (if (not (= (length args) 1))
     (usage)
     (execute (car args))))

(define (usage)
  (display "wspace with Racket 0.1 (c) 2021 Cametan\n")
  (display "-------------------------------\n")
  (display "Usage: wspace [file]\n"))

(main (vector->list (current-command-line-arguments)))

Thanks!

3 Likes

Congrats on getting it working! That's always very satisfying.

If I may ask, is there a specific reason you're using the various srfi/N modules? I didn't read the code in detail but everything I noticed on a quick skim is provided by default in #lang racket.

2 Likes

Relatedly, main could be simplified with command-line.

1 Like

Thanks for reading the code and asking.

is there a specific reason you're using the various srfi/N modules?

Ouch, actually I did not need SRFI-1!
My fingers were light...... Racket has split-at....Wow.

SRFI/13

I needed to use two functions, string-drop and string-take here:

(define (makeSymbol ls)
  (let loop ((s (string-map (lambda (x)
                              (cond ((char=? x #\space) #\0)
                                   ((char=? x #\tab) #\1)))
                           (list->string (reverse ls))))
             (acc '()))
    (if (string-null? s)
       (string->symbol (list->string (reverse acc)))
       (loop (string-drop s 8)
            (cons (integer->char (string->number (string-take s 8) 2))
                 acc)))))

We need this function to make LABELs to GO TO.... Yes, of unstructured languages.
In Whitespace language, the data are expressed as 0(#\space) and 1(#\tab), or binary.
At first, I did not understand the binary series following "Label"s. When the program first passed some tests, "Label" took along a bunch of "#\space"s and "#\tab"s.
I was surprised like "What the hell...?", and I could slowly see they should be strings, but vaguely.
So I checked the length of it, and found out each length of them are divisible by 8; thus it implied the binary might be ASCII code.
Therefore, I've got to write a recursive function, taking 8 characters consisting of #\space and #\tab, each at once, converting them into an ASCII character.
That was why I needed SRFI-13's string-take and string-drop.

SRFI-48

This is just my favor. I'm fond of Common Lisp Style format......
Purely Racket users may love using printf instead...

Ah, yes, you are right!

Thanks!

That definitely works, although I note that makeSymbol spends a lot of time creating lists and then consing and/or reversing, and creating strings then splitting them up and putting them back together, as well as doing a lot of manual looping. It's also vulnerable to invalid input -- it will blow up with bad messaging if given a list that is not a multiple of 8. Here's another version that avoids all that and uses only basic Racket:

#lang racket

(require srfi/13)

; The original version                                                                         
(define (makeSymbol ls)
  (let loop ((s (string-map (lambda (x)
                              (cond ((char=? x #\space) #\0)
                                   ((char=? x #\tab) #\1)))
                           (list->string (reverse ls))))
             (acc '()))
    (if (string-null? s)
       (string->symbol (list->string (reverse acc)))
       (loop (string-drop s 8)
            (cons (integer->char (string->number (string-take s 8) 2))
                 acc)))))

; A suggested new version using only basic Racket
(define/contract (makeSymbol* ls)
  (->i ([ls (non-empty-listof (or/c #\space #\tab))])
       #:pre (ls) (zero? (modulo (length ls) 8))
       [result symbol?])

  ; The contract guarantees that our list contains only tabs and spaces, that                  
  ; the list is not empty, and that it contains a multiple of 8 elements.                      

  (define (list->binary group) (map (λ (x) (if (char=? x #\space) #\0 #\1)) group))
  (string->symbol
   (list->string
    (for/list ([next-group (in-slice 8 ls)])
      (integer->char
       ((curryr string->number 2)
        (list->string
         (list->binary next-group))))))))

(define chars '(#\space  ; 0                                                                   
                #\tab    ; 1                                                                   
                #\space  ; 0                                                                   
                #\space  ; 0                                                                   
                #\space  ; 0                                                                   
                #\space  ; 0                                                                   
                #\space  ; 0                                                                   
                #\tab    ; 1  =>  01000001 = 65 = A                                            
                #\space  ; 0                                                                   
                #\tab    ; 1                                                                   
                #\space  ; 0                                                                   
                #\space  ; 0                                                                   
                #\space  ; 0                                                                   
                #\space  ; 0                                                                   
                #\tab    ; 1                                                                   
                #\space  ; 0  =>  01000010 = 66 = B                                            
                ))

(makeSymbol  chars)  ; The original version yields:  'B\202
(makeSymbol* chars)  ; The new version yields:       'AB

I'm not sure which of those results is correct based on the order that the input is going to come out of the file, so it might be necessary to reorder the list before passing it to makeSymbol*. Still, something to think about.

EDIT:
Here's another version that unwinds some of the deeply nested code in order to make it easier to read:

(define/contract (makeSymbol2 ls)
  (->i ([ls (non-empty-listof (or/c #\space #\tab))])
       #:pre (ls) (zero? (modulo (length ls) 8))
       [result symbol?])
  (define (list->binary group) (map (λ (x) (if (char=? x #\space) #\0 #\1)) group))
  (define chars-list                                                                               
    (for/list ([next-group (in-slice 8 ls)])
      (let* ([next-group (list->binary next-group)]
             [next-group (list->string next-group)]
             [next-group (string->number next-group 2)])
	  (integer->char next-group))))
  (string->symbol (list->string chars-list)))
2 Likes

Yes, actually I was too stuck with the original Haskell code to understand how the original author meant; thus there is a bunch of room to be optimized for Racket, I believe.

The original Haskell code there is:

makeString :: [Token] -> String
makeString [] = ""
makeString (t:ts) = (show t)++(makeString ts)

and at first, I thought this one was enough.

(define makeString list->string)

What I got then was, as I said before, sequences with a bunch of #\newline's and #\tab's. Haha.

I do not know whether makeString of Haskell gives back a proper, meaningful string or not, because it is basically an internal representation for the Whitespace virtual machine, but not for output; thus I am not sure if it should be readable for us or not.
That means, the strings made with makeString are the names for "Label"s to jump to, and that implies, in Lisp languages, to use Symbols is better and more efficient way to compare "Label"s, as you know.
That is why I changed the result of the function from "strings" into "symbols".

Anyway, what I did was basically to know what the original author meant with the bunch of #\space's and #\tab's there.
And thanks to help letting me to know more efficient way to convert them in Racket.

I'm not sure which of those results is correct based on the order that the input is going to come out of the file, so it might be necessary to reorder the list before passing it to makeSymbol*. Still, something to think about.

Yes, you are right. The original Haskell code passes the "reversed" one which consing makes. Surprisingly speaking, the original author likes to process anything in the reversed order......
Therefore, in the original Haskell version, the byte code passed here is 10000010 which is A, 01000010 which is B, and so on.
Would you please give me a break?

You may notice this in makeNumber process much more apparently to be cleared.

makeNumber :: Num x => [Token] -> x
makeNumber t
   | (last t) == A = makeNumber' (init t) 1
   | otherwise = -(makeNumber' (init t) 1)
  where
     makeNumber' [] pow = 0
     makeNumber' (A:rest) pow = (makeNumber' rest (pow*2))
     makeNumber' (B:rest) pow = pow + (makeNumber' rest (pow*2))

When I at first saw this, I did not understand how important the last element of a list is. This implies the last element controls if the number is negative or positive.
As you know, if we assumed that the list is something reversed, that would make sense. Yes, it must be a reversed signed binary number.
Anyway, the original Haskell code must be born in the World of Mirror Image......