Syntax of `id` in grammar for FEP

If I look at the documentation for a linklet grammar here: 14.14 Linklets and the Core Compiler

it says:

The grammar of an defn-or-expr is similar to the expander’s grammar of fully expanded expressions (see Fully Expanded Programs) ...

In Fully Expanded Programs, though, I feel like the syntax for an id is missing. I understand this is a tiny detail, but for a long time I thought ids couldn't start with a number, until I saw that I could write 1/xpto as an id. What's possible? Is any sequence of graphical chars up to a space or paren an id?

Section 1.2.2 of the Reference (1.2 Syntax Model) says that "an identifier is represented as a syntax object containing a symbol." The rules for symbols are given in Section 3.6 of the Racket Guide (3.6 Symbols). I agree that the documentation should be clearer about this.

In linklets, an id is a symbol. Any string can be a symbol, since you can use | to escape arbitrary characters. That symbol is transmitted as-is through the linklet process. Here's an example of running a program that uses a single space as an identifier.

[samth@huor:~/.../racket/benchmarks/shootout (show-pass) plt] PLT_LINKLET_SHOW=1 r -e '(define | | 1)'
;; linklet ---------------------
(linklet
  ((.top-level-bind! .top-level-require!)
    (.mpi-vector .syntax-literals)
    (.namespace .phase .self .inspector .bulk-binding-registry
      .set-transformer!))
  (\x20;) (define-values (\x20;) 1)
  (if #f (begin (set! \x20; #f)) (void))
  (begin
    (.top-level-bind! (unsafe-vector*-ref .syntax-literals 0)
      (unsafe-vector*-ref .mpi-vector 0) 0 .phase .namespace
      '\x20; #f '#f)))
;; schemified ---------------------
(lambda (instance-variable-reference .top-level-bind!1
         .top-level-require!2 .mpi-vector3 .syntax-literals4
         .namespace5 .phase6 .self7 .inspector8
         .bulk-binding-registry9 .set-transformer!10 \x20;11)
  (define \x20; 1)
  (variable-set!/define \x20;11 \x20; '#f)
  (let ([app_12 (unsafe-vector*-ref .syntax-literals4 0)])
    (#%app .top-level-bind!1 app_12
      (unsafe-vector*-ref .mpi-vector3 0) 0 .phase6 .namespace5
      '\x20; #f '#f)))
;; compiled ---------------------
done

Thanks - that's interesting and scary at the same time. :slight_smile:
I don't think I knew about the power of |.

So either you have anything within |, or any sequence of characters except space and parens can be an id. Is that it?

It's even better, since the characters can be Unicode!

Welcome to DrRacket, version 8.5 [cs].
Language: racket, with debugging; memory limit: 4096 MB.

(define (😀😆😂 😉) (+ 😉 8))
(😀😆😂 5)
13

I think I would just say that it can be any symbol, and symbols can have arbitrary strings as their content. The use of | is part of the printer, not the symbol itself.
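A minimal sketch of that point, using only `string->symbol` and `symbol->string` from `racket/base` (the name `space-sym` is made up for illustration):

```racket
#lang racket/base

;; Build a symbol whose name is a single space, without using | at all.
(define space-sym (string->symbol " "))

(symbol? space-sym)                   ; #t
;; The reader notation | | denotes the same interned symbol.
(eq? space-sym '| |)                  ; #t
;; The bars are notation only; they are not part of the symbol's name.
(symbol->string '|hello world|)       ; "hello world"
;; Even the empty string is a valid symbol name.
(string-length (symbol->string '||))  ; 0
```

The bars show up again only when the symbol is printed in a form that has to be readable.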

Not quite so simple.

'foo

means
(quote foo)

And I suspect 3.5 is a number, not a symbol.

But |'foo| and |3.5| are probably symbols.

Can anyone confirm?

And how does one write | as a one-character symbol? ||| ?

-- hendrik

Outside |...|, you can use backslash to escape, so

|foo)

can be produced with:

\||foo)|
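Both escapes can be checked by turning the quoted symbols back into strings. This is a small sketch that relies on the default reader settings:

```racket
#lang racket/base

;; A backslash escapes the next character, so \| is the one-character symbol |.
(symbol->string '\|)          ; "|"

;; \| contributes a literal |, then |foo)| contributes foo) verbatim.
(symbol->string '\||foo)|)    ; "|foo)"

;; A backslash can also escape a space outside of bars.
(symbol->string 'a\ b)        ; "a b"
```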

You mean "they are probably identifiers", not symbols.

They are only symbols if you do '|'foo| or '|3.5|.

Welcome to Racket v8.5 [cs].
> |'foo|
'foo: undefined;
 cannot reference an identifier before its definition
  in module: top-level
 [,bt for context]
> |3.5|
3.5: undefined;
 cannot reference an identifier before its definition
  in module: top-level
 [,bt for context]
> (define |3.5| 3.5)
> |3.5|
3.5
> (symbol? |3.5|)
#f
> '|3.5|
'|3.5|
> (symbol? '|3.5|)
#t
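The same distinction shows up if you call the reader directly and look at the datum before anything is evaluated. A rough sketch, where `read-from-string` is a helper defined here and not a library function:

```racket
#lang racket/base

;; Hypothetical helper: read a single datum from a string.
(define (read-from-string s)
  (read (open-input-string s)))

(number? (read-from-string "3.5"))          ; #t, plain 3.5 reads as a number
(symbol? (read-from-string "|3.5|"))        ; #t, bars force a symbol
(symbol? (read-from-string "1/xpto"))       ; #t, not a valid number, so a symbol
(equal? (read-from-string "'foo") ''foo)    ; #t, 'foo reads as the list (quote foo)
(symbol? (read-from-string "|'foo|"))       ; #t, one symbol whose name starts with '
```

Whether such a symbol then acts as an identifier depends on where the datum ends up: evaluated as an expression it is a variable reference, while under `quote` it stays a symbol.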

This is horrendous, even if very general.

I think it can be a useful feature, especially considering that other kinds of #langs may have very different requirements for what is allowed as a valid identifier. It is useful to have an underlying implementation language that can use pretty much everything.

(Imagine having to mangle/encode/decode your DSL identifiers to/from a very specific, narrow set of allowed identifiers in the implementation language. Personally, I would find that more annoying.)

Another case where it was useful for me is the syntax class used for name mapping in define-attributes: there, an empty symbol is used for the case when you want to "drop" the name, which makes the implementation simpler.

(define-attributes ([l]) vec3- (x y z)) ;; x y z is bound to (vec3-x l) (vec3-y l) (vec3-z l)

Looking a bit closer at this, what sort of syntax is that \x20;? I assume something starting with \x and ending with ; has some special meaning, although I can't find anything in the docs.

I guess that's a matter of preference. 🙂 For someone who works on C/C++ compilers, name mangling/demangling is just accepted, so I guess my views here are biased.

I believe this is from Chez Scheme. You can search for “\x” in https://cisco.github.io/ChezScheme/csug9.5/csug9_5.pdf (esp. the ones under the section 7.9 and print-extended-identifiers)
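Assuming the notation is the R6RS-style inline hex escape (`\x`, then hex digits for a code point, then `;`), here is a rough sketch of decoding it by hand. `decode-hex-escapes` is a made-up helper for illustration, not something Racket or Chez Scheme provides:

```racket
#lang racket/base

;; Hypothetical helper: replace each \x<hex>; escape in a string with the
;; character whose code point is <hex>. Assumes well-formed escapes only.
(define (decode-hex-escapes str)
  (regexp-replace* #px"\\\\x([0-9a-fA-F]+);"
                   str
                   (lambda (whole hex)
                     (string (integer->char (string->number hex 16))))))

(decode-hex-escapes "\\x20;")     ; " ", since hex 20 is the space character
(decode-hex-escapes "a\\x62;c")   ; "abc", since hex 62 is b
```

Under that assumption, the `\x20;` in the printed linklet is just the space-named identifier written in a form that can be read back.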

Oh, interesting. You seem to be definitely right. Thanks.