DSL coloring question

I'm trying to create a DSL (for AP Comp Sci Principles' pseudocode) and would like a few more categories for the syntax colorer. Is the set of categories fixed, so that it can't be changed? If so, is there a way to load color preferences into DrRacket when you specify the language? Currently, Symbol and Keyword get the same color, as do String, Text, and Constant. If I could change those, I might be able to get by with the provided color types. (Although I'm wondering if the docs are out of date, since they list only 'string and 'constant, but not 'text, as possible values for the token type.)

I did find this ticket.

So I guess my question is now: is there a way to load preferences with a #lang declaration?

See "1 Tool support for #lang-based Languages" for the general mechanism by which a lang supplies a "get-info" procedure.

The first subsection is about syntax coloring. You'll need to chase a link to the start-colorer method of color:text<%>.

I've only used these via get-info in a tool (Emacs), not implemented one for a lang. But I think the hand-wavy summary is that you can wrap and reuse the lexer you probably already wrote for your lang, so that it supplies the coloring tokens to tools.
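A minimal sketch of that mechanism, assuming a hand-written reader module. `my-color-lexer` is a hypothetical toy that stands in for the lang's real lexer; the `get-info` shape (five arguments, returning a procedure of a key and a default) and the five values a color lexer returns follow my reading of the docs:

```racket
#lang racket/base
(provide get-info)

;; Hypothetical toy lexer: treats every character as its own token.
;; A color lexer takes an input port and returns five values:
;; lexeme, token type, paren shape (or #f), start position, end position.
(define (my-color-lexer in)
  (define-values (_l1 _c1 start) (port-next-location in))
  (define c (read-char in))
  (define-values (_l2 _c2 end) (port-next-location in))
  (cond
    [(eof-object? c)      (values c 'eof #f #f #f)]
    [(char-whitespace? c) (values (string c) 'white-space #f start end)]
    [(char-numeric? c)    (values (string c) 'constant #f start end)]
    [else                 (values (string c) 'symbol #f start end)]))

;; Tools ask the lang for information by key; answer 'color-lexer
;; and fall back to the supplied default for everything else.
(define (get-info in mod-path line col pos)
  (lambda (key default)
    (case key
      [(color-lexer) my-color-lexer]
      [else default])))
```

A real lang would return its actual lexer (or a thin wrapper around it) for the `'color-lexer` key instead of the toy above.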

Yeah. I've set up the coloring, but I wanted more colors than are available, since the default color scheme gives several token types the same color. Is there any way to load new color preferences based on the #lang chosen, or do I have to just tell my students to go in and change some colors?

My understanding is that the set of symbols (for the token "type") returned by the lang's lexer is open -- it's not limited to those for the #lang racket lexer, for example.

Furthermore, instead of returning a symbol, the lexer can return a hash-table with a 'type mapping... as well as any other mappings it wants to. (Some of these extra mappings are motivated by #lang rhombus, and I know some of them are ad hoc. I don't know if/when some might get "promoted" to be documented as "official" or "recommended" to use for other langs with similar characteristics?)
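To make the hash-table form concrete, here is a small sketch that wraps an existing five-value lexer. Only the 'type key comes from the description above; the 'apcsp-role key, its value, and the stub lexer are made up for illustration, and I'm assuming tools ignore keys they don't recognize:

```racket
#lang racket/base

;; Hypothetical stub lexer: reads the rest of the port as a single 'symbol token.
(define (stub-lexer in)
  (define c (read-char in))
  (if (eof-object? c)
      (values c 'eof #f #f #f)
      (values (string c) 'symbol #f 1 2)))

;; Wrap a lexer so a plain symbol type becomes an immutable hash.
;; 'type carries the original symbol; 'apcsp-role is an invented extra key.
(define (with-hash-types lexer)
  (lambda (in)
    (define-values (lexeme type paren start end) (lexer in))
    (values lexeme
            (if (symbol? type)
                (hash 'type type 'apcsp-role 'unclassified)
                type)
            paren start end)))

(define hash-lexer (with-hash-types stub-lexer))
```

A tool that only understands symbols can still recover one with `(hash-ref type 'type)`.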

As for what token types get mapped to what colors, that's left up to each tool, such as DrRacket.

p.s. To discover more, I suppose you could look at what rhombus does for its color-lexer.

That chain seems to start here: rhombus/rhombus-lib/rhombus/private/core.rkt at eb600b73f8004f591a49ba25511fcdba568d126c · racket/rhombus · GitHub

and continue here: rhombus/rhombus-lib/rhombus/private/syntax-color.rkt at eb600b73f8004f591a49ba25511fcdba568d126c · racket/rhombus · GitHub

You could get a sense for its variety of token types (and maybe some other interesting details?).

Then, you could open DrRacket and discover what kind of UI is provided for configuring them to be various colors.


Apologies if none of this is exactly on point. Hopefully someone else can chime in.

This is helpful. I was able to get a sufficient number of distinct token types, but without a way to provide a default color scheme, several of them end up the same color. I suppose I could create my own color scheme and provide instructions for how to use it.

It's an interesting problem.

On the one hand, as you say, a lang may want distinct colors for distinct token types.

On the other hand, the lang probably shouldn't supply an exact color, because doing so shades off (pun) into the responsibility of themes. If a lang supplies some literal color, it could look bad or even be unreadable in certain themes. There needs to be some level of indirection.


Spitballing, two ideas:

  1. Simplest: Tools like DrRacket and Racket Mode for Emacs could let users/themes pick some distinct color for "all other" token types. Although this might help the novel tokens stand out from "typical" ones -- a sort of, "hey, configure me!" signal -- the novel ones would default to all looking like each other. So not ideal.

  2. Building on a concept from e.g. Emacs, where a "face" (~= font + attributes) can "inherit" plus modify some other face: There could be some little DSL for saying things like, "The foo token should look like the symbol (or string or whatever) token, except bold (or italic, underlined, lighter, darker, whatever)".

    But... this could be complicated to implement reliably -- especially in a "portable" way that could work well across diverse tools? And I worry even done well, such a DSL could turn out to be insufficient for langs with many distinct tokens?
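To make the second idea a bit more concrete, here is a purely hypothetical sketch. Nothing like this exists in DrRacket or Racket Mode as far as I know, and every name in it (`theme`, `derived`, `resolve-style`, the attribute keys) is invented. The theme owns the real colors; the lang only says "like this standard token, except...":

```racket
#lang racket/base

;; Owned by the theme: styles for the standard token types.
(define theme
  (hash 'symbol (hash 'color "black" 'bold? #f)
        'string (hash 'color "green" 'bold? #f)))

;; Owned by the lang: each novel token names a base token plus overrides.
(define derived
  (hash 'procedure-name (list 'symbol (hash 'bold? #t))
        'display-text   (list 'string (hash 'italic? #t))))

;; Resolve a token type to a concrete style by starting from the base
;; style and applying the overrides. Unknown types get an empty style.
;; (No cycle detection: a derived token that inherits from itself would loop.)
(define (resolve-style type)
  (cond
    [(hash-ref theme type #f) => values]
    [(hash-ref derived type #f)
     => (lambda (spec)
          (for/fold ([style (resolve-style (car spec))])
                    ([(k v) (in-hash (cadr spec))])
            (hash-set style k v)))]
    [else (hash)]))
```

With this, `(resolve-style 'procedure-name)` takes its color from whatever the theme uses for `'symbol` and only adds bold, so the lang never names a literal color.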

Probably there's some third, better idea?