Racket and LLMs

I was using ChatGPT recently to help me generate a TSX parser using the parser-tools package and it failed spectacularly: it just couldn't figure out that define-tokens defines new functions of the form token-FOO.
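For anyone who hasn't used parser-tools, here is a minimal sketch of the behavior I mean. It assumes the parser-tools-lib package is installed; the group names and the token names NUM, ID, PLUS and EOF are made up for illustration and have nothing to do with my actual TSX grammar.

```racket
#lang racket/base
(require parser-tools/lex)

;; Hypothetical token groups, for illustration only.
;; define-tokens creates one constructor per name: token-NUM, token-ID.
(define-tokens value-tokens (NUM ID))
;; define-empty-tokens creates zero-argument constructors: token-PLUS, token-EOF.
(define-empty-tokens op-tokens (PLUS EOF))

;; token-NUM was never written out by hand; define-tokens generated it.
(define num-tok (token-NUM 42))

(token-name num-tok)      ; 'NUM
(token-value num-tok)     ; 42
(token-name (token-PLUS)) ; 'PLUS
```

Nothing in the source text spells out token-NUM, which is presumably why a model that has not seen much parser-tools code fails to guess that it exists.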

Seeing more and more LLM use across the board, I'm starting to get worried that smaller languages like Racket will all but disappear in the near future, because they don't play well with LLMs.

This article (F# being the endangered language there) echoes that sentiment.

  1. Is something being done on that front already?
  2. Is this even a concern for the developers of Racket?

Personally, I just started getting into Racket and it feels like a well-designed language for a change :smiley:

It'd be a shame to see it disappear and not grow more popular.

Hi, @dhamidi!

Interesting topic.

I am aware of some experiments on that front; see, for example, the recent llm lang.

I am not a great developer, so my opinion probably doesn't matter. Lots of work at my current job is being re-evaluated under the guise of "agentic workflows", and it is somewhat scary because it is highlighting how much of the work we do is in fact just nonsense, and how much room for boring old automation there still is. But the new shiny makes things look different to the powers that be.

However, and I think this is important for my own sanity, programming is a tool created by humans to solve human problems.

The only reason I program is that it makes me feel good in an odd sort of way--which most people in creative work would understand. It's not always great, but the drive to do interesting things never goes away, even when the going gets tough.

If someone came to me today and said, "you never have to program again, it's the machine's job now," I would still program, just not for money.

My point being: who cares? Not in a snide way, mind you, but in the sense that the language finds you. If you happen to find yourself on a track which never crossed paths with Racket, that is indeed a shame. But, if your path never intersected this one, there is no loss, either. There is no sunk cost, and you weren't deprived of anything, really.

I don't think it makes sense to measure the success of the language in terms of the raw numbers of people adopting it at any given moment. It is a healthy indicator, but even a rotten tree may house many birds before it perishes.

Anyways, I'd love to hear other Racketeers' opinions on the matter, too. From my meager vantage, it seems like the space is healthy and thriving, and as long as we have all these smart people being curious, I doubt the language will die, irrespective of the tides of the world.

I would bet good money that Racket's ecosystem, although not massive, is probably more often than not on the cutting edge of programming language research, whichever corner of that space you happen to inhabit.


P.S. Something that comes to mind is that the probability of an entity surviving into the future is correlated with how long it has already survived in the past. In this sense, autopoiesis is more important than mere growth. Racket, being the language for creating languages, definitely has an autopoietic function which is hard to disrupt, even with something as "cataclysmic" as the current pace of machine learning adoption.

I could very well see this happening--that is, everything becomes C# or Java or Python (especially Python) or JS because the models are all trained on that and that's most likely to give a developer the most leverage.

I think that all we can do about that is to try to get more repos on GH so that the models can see more Racket code. Once the LLMs see more Racket, we'll see better support for it. But it's also very likely that, in companies that aren't first and foremost concerned about technical issues (that is, most of them), this will force an even stronger move to "everything must be OOP or everything must be JS." And that's a pity because I think OOP (at least the way that C# and Java do it) is more of an anti-pattern than a good path forward.

General:

Arjun Guha @ Northeastern (https://www.khoury.northeastern.edu/home/arjunguha/main/home/) has been investigating the use of LLMs for “minority languages”. You may wish to check out his work.

Specific:

When you use a miniDSL or just macros, you need to realize that your code is essentially in a minority language within a minority language. Suppose you use some builder pattern (“fluent API”) to create a DSL within C#. Then write some code in it and also ask the LLM to write code in this embedded DSL. How well would that go?

We have used LLMs to solve some of the silly LeetCode problems in Racket. A lot of the code solves the problem

- in awful ways
- in old-style Scheme that we used to write 20 years ago
- with a comment for every line of code, e.g. (set! x (add1 x)) ;; add 1 to x

Some of the code is plain wrong. Some of the code is half wrong. All of this happens in the “core” language.
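To make the style point concrete, here is a small sketch. It is not taken from the LLM output described above, and the function names are made up. It only contrasts a mutation-heavy loop of the older kind with a comprehension that is more typical of current Racket.

```racket
#lang racket/base

;; Older, imperative style: a named let that mutates an accumulator with set!.
(define (sum-squares/old lst)
  (let ((total 0))
    (let loop ((l lst))
      (if (null? l)
          total
          (begin
            (set! total (+ total (* (car l) (car l))))
            (loop (cdr l)))))))

;; Style more common in current Racket: a for/sum comprehension, no mutation.
(define (sum-squares lst)
  (for/sum ([x (in-list lst)])
    (* x x)))

(sum-squares/old '(1 2 3)) ; 14
(sum-squares '(1 2 3))     ; 14
```

Both versions compute the same result. The second is shorter and leaves less room for the half-wrong variants mentioned above.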

Back to general:

Unless LLMs improve for mainstream PLs (C#/Java/JS/TS) so that they don’t produce (1) buggy code that causes GH churn or (2) code with large security problems (see on-line articles, say CACM Jan 2025), we will need to continue to conduct research on PL to help inexperienced (who shouldn’t use LLMs right now) and experienced developers with the use of LLMs to write code. What this looks like is an open question.

— Matthias

I think you are missing the fact that you can teach the LLM how define-tokens works.

Paste the relevant paragraph from the documentation.
Let it explain a few simple examples - and correct it if it misses something.
Then let it generate a new parser.
Complain about any mistakes.

The prompt "explain" followed by, say, a syntax-parser macro gives surprisingly good results.

I wonder if the recent chain-of-thought results from DeepSeek would be beneficial along this trajectory. It is basically what you're saying, in any case.

@EmEf thank you for your thoughtful reply! It did put me at ease :slight_smile:

@soegaard instructing the LLM was indeed something I missed! :bulb:

After some back-and-forth, I managed to get useable (i.e. they don't crash immediately) bindings for Tree Sitter in about 20 minutes.

This made the task small enough for me to actually end up with useable tree-sitter bindings as opposed to not getting anywhere because manually translating the header file would have taken too much time.

For the curious, here's the chat transcript: ChatGPT - Racket Foreign Library Summary

It's fun to see the transcript. Getting it to read the relevant documentation seemed to work well.

Have you tried getting it to generate documentation in the form of Scribble?

It would be interesting to see both a "Reference" and a "Tutorial" for the bindings it produced.

@EmEf -- I would be interested in hearing how Scheme programmers have changed the kind of code they write over the last 20 years. (I ask because I worry that my own style is 20 years out of date.) I don't suppose there's something I could read on that?