Making sense of the documentation on continuations

Hi,

The documentation on continuations talks about extending the current continuation. This sounded very mysterious to me until I realized that it must be what most programming languages refer to as the call stack (I assume there are reasons for calling it something different). Extending the current continuation, I surmise, is basically like pushing things onto the call stack so that when the current function returns, it "returns" to these other places first. Extending the current continuation is not unique to first-class continuations, I think; it happens every time you call a normal function.

Doing an abort is like erasing the stack down to the nearest prompt, or whichever prompt applies. And none of this seems to matter unless you return from the function that captured the continuation. If you keep invoking continuations and don't return, then the behavior is the same between call/cc and composable continuations.
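The contrast described above can be put into code. The following is a minimal sketch (names k1/k2 are mine, and call-with-continuation-prompt is used to delimit the composable capture so it works in a module body):

```racket
#lang racket

;; A call/cc continuation is abortive: invoking it discards the
;; current context and reinstates the captured one.
(define k1 #f)
(+ 1 (call/cc (lambda (k) (set! k1 k) 0)))   ; => 1
;; At a REPL, (k1 5) would deliver 6 straight to the top level,
;; throwing away whatever surrounded the call.

;; A composable continuation extends the current continuation:
;; invoking it behaves like an ordinary function call and returns.
(define k2 #f)
(call-with-continuation-prompt
 (lambda ()
   (+ 1 (call-with-composable-continuation
         (lambda (k) (set! k2 k) 0)))))      ; => 1
(+ 100 (k2 5))   ; => 106, because k2 acts like (lambda (v) (+ 1 v))
```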

Am I on the right track?

Thanks..
Neal

On Feb 5, 2026, at 6:03 PM, Neal notifications@racket.discoursemail.com asked whether he was "on the right track" with respect to continuations and extensions.

No.

— Continuations are a concept, a specification if you so wish. The definition says: "the rest of the computation from a certain point."
— The word has a second meaning with many flavors in the docs: a data representation of the concept: abortive (call/cc), imperative (shift, reset), functional (control, prompt).
— Like all kinds of data, it comes with various operations. For example, functional continuations look like functions and act like them. When a program composes two of them (in the sense of f ∘ g from math, i.e., (f (g x))), the underlying implementation “appends” the data structures (because they are not closures).

— A call stack is part of an implementation. It is an old-fashioned word that “old” people w/o understanding of modern PL concepts use, especially when they understand PL ideas only in terms of compilers.

As someone without a formal PL background, I found the “Evaluation Model” chapter in the Racket Reference helpful for understanding the concept of continuations.

I also like the box diagrams in many of Matthew's papers and talks. I haven't re-watched it recently, but I think the talk Delimited and Composable Continuations in PLT Scheme - Microsoft Research and the underlying paper might help in understanding how the zoo of racket/control fits together.

5 Likes

I'm going to check out those resources. Thanks.

Are delimited and composable continuations the same, or are they different, but overlapping, concepts? The composable continuation seems to behave more like a regular function, in that it returns back to the point where it was invoked. But that in and of itself doesn't really tell you how and why you would compose them. Are we literally talking about:

(compose continuation1 continuation2)

That looks intriguing to me, but I haven't wrapped my head around what that would do and what it's good for. I've inserted comments in the excerpt from the documentation that follows:

For example, in

(+ 1 (+ 1 (+ 1 0)))

at the point where 0 is evaluated, the expression context includes three nested addition expressions. We can grab that context by changing 0 to grab the continuation before returning 0:

(define saved-k #f)
(define (save-comp!)
  (call-with-composable-continuation
   (lambda (k) ; k is the captured continuation
     (set! saved-k k)
     0)))

(+ 1 (+ 1 (+ 1 (save-comp!))))
3

This example is very contrived. It seems like most of the examples of what continuations are used for are either very contrived and trivial, or else they deal with problems, like exceptions, that are highly idiosyncratic and hard (at least for me) to generalize from.

Also, a lot of the examples use set! to save the continuation. We all know that we should use set! sparingly. So I'm wondering whether that's done because it's really the state of the art, or if it's more of a convenience for the sake of documentation examples.
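As one data point on the set! question: the shift/reset operators from racket/control hand you the captured continuation as an ordinary argument, so no mutation is needed just to get hold of it. A minimal sketch:

```racket
#lang racket
(require racket/control)

;; reset delimits the capture; shift binds the delimited
;; continuation (+ 1 (+ 1 (+ 1 _))) to k directly, with no set!.
;; (k 0) => 3, and (k (k 0)) => 6, which becomes the value of
;; the whole reset expression.
(reset (+ 1 (+ 1 (+ 1 (shift k (k (k 0)))))))   ; => 6
```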

I could fill a much larger post with all of the questions I could ask, but I'll leave it here.

Thanks,
Neal

The continuation saved in saved-k encapsulates the program context (+ 1 (+ 1 (+ 1 ?))), where ? represents a place to plug in a result value—because that was the expression context when save-comp! was called. The continuation is encapsulated so that it behaves like the function (lambda (v) (+ 1 (+ 1 (+ 1 v)))):

(saved-k 0)
3
(saved-k 10)
13
(saved-k (saved-k 0))
6

EmEf wrote:

The word has a second meaning with many flavors in the docs: a data representation of the concept: abortive (call/cc), imperative (shift, reset), functional (control, prompt).

I didn't know that there are entirely different paradigms for using continuations, and I would definitely like to look into that.

A call stack is part of an implementation. It is an old-fashioned word that “old” people w/o understanding of modern PL concepts use, especially when they understand PL ideas only in terms of compilers.

Call stacks are not just an abstraction; they're built into the hardware. x86 has call and ret instructions for maintaining the call stack. The first programming language I really loved was C, and its concepts are deeply embedded in my brain. But it's not a terrible worldview, because it's basically a thin layer on top of machine language, and the machine is ground truth.

Neal

Is it, though? I think that a tremendous amount of the work in the last hundred years has been on moving programming from "here's a machine, how do I poke it to get it to do something" to "here's a description of a language that is independent of a particular machine; a program written in this language has a particular meaning and behavior, and if it behaves differently on these two machines, then the machine is not correctly evaluating the program." I think Herb Simon's "Sciences of the Artificial" has a lot to say about the difference between these two.

With that said, I think it's important to acknowledge that many classrooms, teachers, and textbooks use "call stack" not as a description of a machine but as part of a language specification, albeit one that breaks pretty badly when applied naively to languages with closures. Cue discussions of the definition of the term "notional machine"....

That actually wasn't the best argument I could have made. Just because the machine has instructions for something doesn't mean you have to use them. Evidently Racket uses the heap for what you would usually use the stack for, so you don't have to worry as much about stack overflow.
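A small illustration of that point (function name and depth are mine, chosen arbitrarily): deep non-tail recursion needs a frame per pending call, which would likely overflow a fixed-size C stack at this depth, but Racket grows its continuation on the heap as needed.

```racket
#lang racket

;; Non-tail recursion: the add1 is pending after each recursive
;; call, so a million frames must be kept live at the deepest
;; point. Racket handles this without a machine-stack overflow.
(define (count-up n)
  (if (zero? n)
      0
      (add1 (count-up (sub1 n)))))

(count-up 1000000)   ; => 1000000
```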

What I was trying to do, anyway, was understand what "extending the current continuation" means in the documentation. The documentation for Racket is very extensive and, on the whole, excellent. But some parts can be terse and jargony and deficient in examples.

As to your point about whether the machine matters when you're talking about things like software or programming languages in the abstract: if you're Alan Turing, real hardware didn't matter much when you were working on things like the halting problem, but it mattered a lot when you were trying to crack the Enigma code in WWII.

Neal

To agree and give an "and also":

In my very early days C was a reasonably useful model of, and thin layer over, "machine ground truth". But less so today.

For one thing, C is a fatter layer over machine code, due to more aggressive optimizing compilers.

At the same time, what's under the machine code has grown more complicated, e.g. instruction pipelining and branch prediction.

More than ever, machine code is really "just" bytecode for a microcode interpreter, and C is waving fondly at machine ground truth as it goes below the horizon. :slight_smile:

4 Likes

p.s. Not to belabor this, but:

I wrote some 6502 assembly. I seem to recall it lacked MUL and DIV opcodes; you had to code that yourself.

Later, on 8086, I seem to recall that it offered multiply/divide opcodes --- but those were just implemented in microcode... not the arithmetic hardware.

Perhaps I'm misremembering, but I think the basic idea is that systems have layers, nearly all of which are "just code", written in various languages.


Hey if we're lucky, someday maybe some group of visionaries will build a programming language environment as a kind of "tower of languages". :face_with_tongue:

3 Likes

p.s. Not to belabor this, but:

I wrote some 6502 assembly. I seem to recall it lacked MUL and DIV opcodes; you had to code that yourself.

Later, on 8086, I seem to recall that it offered multiply/divide opcodes --- but those were just implemented in microcode... not the arithmetic hardware.

Perhaps I'm misremembering, but I think the basic idea is that systems have layers, nearly all of which are "just code", written in various languages.

There's "just code" and there's "just code". The big difference is that the code in some layers is really hard to change.


Hey if we're lucky, someday maybe some group of visionaries will build a programming language environment as a kind of "tower of languages". :face_with_tongue:

See the so-called "language-oriented programming" championed by Racket.

-- hendrik

Of course, there are properties of the machine that can't be ignored at any level. There are the limitations of the von Neumann Architecture and the absence of true nondeterminism. If you could run all code paths non-deterministically at the same time for free, we probably wouldn't need continuations.

Neal

The Matthew Flatt video was very helpful. I was able to solve the continuation issue I was struggling with. Thanks.

Absolutely! I claim you've essentially just identified the Church/Turing thesis. Specifically, it turns out that von Neumann machines can compute the same things as the Lambda Calculus, Turing machines, and (Gödel's terminology) recursive functions. In my opinion, it is indeed mysterious that all of these are so perfectly aligned, and it's what makes it useful to construct abstract machines and formal semantics; it turns out that the languages you can describe formally with these systems are exactly the ones that can be implemented and run on von Neumann machines. So, the limitations that you say can't be ignored at any level also can't be ignored by the languages we describe with our formal tools.

1 Like

Except that the hardware stack is much more efficient than putting everything on the heap. There has been a lot of work done figuring out how the hardware stack can safely be used as a stack and when you really need the garbage-collected heap. This is nontrivial.

-- hendrik

1 Like

I didn't state that very clearly. I was thinking of constraints like, e.g., the fact that up until a few years ago, persistent storage meant hard disks, which had really poor random access performance. That expectation of slow seeking had a big effect on the design of relational database systems, for instance. Now we have SSDs and large amounts of DRAM available.

So I was thinking of constraints imposed by hardware, like the expectation of slow disks, that may have bubbled up to the design of programming languages. But maybe it's not obvious that these are artifacts of hardware, because we take them for granted. I don't know all that much about Programming Languages as a field of study, so I'm just musing.

1 Like