committed 10:55PM - 28 Jun 21 UTC
Summary:
Syntax arming and disarming were part of a design to allow sandboxing
untrusted code without unduly constraining the sandboxed code. This
commit replaces that approach with a simpler one. The trade-off is
that some advanced macro-implementation tools, including
`syntax-local-value` and `local-expand`, cannot be referenced directly
within a sandbox (i.e., in a context where the code inspector is
different than the original one).
Long version:
Unsafe operations --- or operations that would otherwise break
invariants within a trusted library --- can be protected by not
exporting them or by using `protect-out` on an export. Protected
bindings cannot be referenced directly in sandboxed programs. It's
common, however, for macros to expand to (suitably guarded) uses of
unsafe or unexported bindings. Sandboxed programs need to be able to
use those macros, even though expansions refer to bindings that cannot
be referenced directly by the sandboxed program.
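A minimal sketch of that pattern follows. The module name `trusted`, the function `unchecked-ref`, and the macro `safe-ref` are hypothetical names chosen for illustration; `unchecked-ref` merely stands in for a genuinely unsafe operation.

```racket
#lang racket/base

;; Hypothetical trusted library: the raw operation is a protected
;; export, while the guarded macro is an ordinary export.
(module trusted racket/base
  (provide safe-ref
           (protect-out unchecked-ref))

  ;; Stand-in for an unsafe operation (illustration only).
  (define (unchecked-ref vec idx)
    (vector-ref vec idx))

  ;; The expansion refers to `unchecked-ref`, but only after checking
  ;; the arguments.
  (define-syntax-rule (safe-ref vec-expr idx-expr)
    (let ([vec vec-expr] [idx idx-expr])
      (if (and (vector? vec)
               (exact-nonnegative-integer? idx)
               (< idx (vector-length vec)))
          (unchecked-ref vec idx)
          (error 'safe-ref "expected a vector and an in-range index")))))

(require 'trusted)

;; A client uses the macro; it never names `unchecked-ref` itself.
(define guarded-result (safe-ref (vector 'a 'b 'c) 1))
```

This file runs under the original code inspector, so both exports are reachable. The point of the sketch is the shape: a client under a weaker inspector would be expected to use `safe-ref` but not to reference `unchecked-ref` directly.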
To make that combination work, quoted syntax objects that appear in a
macro retain a right to access the same bindings that would be allowed
in the enclosing module. This system of protected bindings and
quoted-syntax access is working, as far as we can tell, EXCEPT for
some cases when untrusted code uses `expand`, `local-expand`,
`syntax-local-value`, and variants of those functions. Some trusted
macros and tools need to force expansions and rearrange the result, as
in the `class` macro or the errortrace library, which is why things
like `local-expand` exist. The danger is that portions of the
expansion can be extracted, with permissions intact on the extracted
part, and then abused by untrusted programs.
Syntax "arming" was an attempt to close that hole by distinguishing
trusted and untrusted uses of expanded code, revoking permissions on
an identifier by tainting it when it is extracted from an expansion by
an untrusted party. The tricky part has been drawing the line between
trusted and untrusted uses through a mixture of inferred and explicit
boundaries. Explicit boundaries usually involve `syntax-protect`.
Experience has shown that it's difficult to remember to use
`syntax-protect` consistently enough, and we have not been able to
enlarge the role of inference enough to close the hole.
This commit tries a simpler approach, which is to discard the complex
arming system and instead just protect operations like `local-expand`
that force macro expansions. Protecting those operations makes a sandbox
less flexible: it prevents running an untrusted module in a sandbox
when the module implements its own macros with `local-expand`,
`syntax-local-value`, and some related expansion-time operations. Most
macros do not need those facilities, so most modules would still work
in a sandbox. Meanwhile, sandboxed modules can still use macros that
use those facilities and that are implemented in trusted modules.
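As a rough illustration of the last point, the sketch below puts a `local-expand`-based macro in a module that plays the role of the trusted library. The names `trusted` and `expansion-of` are hypothetical. The client only uses the macro and never references `local-expand` itself.

```racket
#lang racket/base

;; Hypothetical trusted module: it references `local-expand` directly.
(module trusted racket/base
  (require (for-syntax racket/base))
  (provide expansion-of)

  ;; Expands its argument fully and returns the expansion as a quoted
  ;; datum, so that no syntax object escapes to the client.
  (define-syntax (expansion-of stx)
    (syntax-case stx ()
      [(_ expr)
       (let ([expanded (local-expand #'expr 'expression '())])
         #`(quote #,(syntax->datum expanded)))])))

(require 'trusted)

;; Client code: uses the macro without naming `local-expand`.
(define client-result (expansion-of (or 1 2)))
```

Everything here runs under the original code inspector. In a real sandbox, only the client part would be the untrusted code.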
A slightly different approach is used for the `expand` family of
functions, which are not used by macros, but are instead for debugging
and exploration. Each of those functions now takes an additional
inspector argument that defaults to `(current-inspector)`; if that
inspector is not the original one, then the result of expansion is
preemptively tainted, so identifiers in the expansion cannot be used.
(Syntax objects that are included in syntax-error exceptions are also
still tainted that way, as before.) So, `expand` can still be used in
a sandbox to explore expansions without necessarily accessing bindings
that are referenced in the expansion.
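A small sketch of the tainting behavior, assuming a Racket build that includes this change (the second argument to `expand` does not exist in earlier versions). Both inspectors are passed explicitly so the example does not rely on the default; the sample form and the variable names are illustrative.

```racket
#lang racket/base

;; A namespace to expand in; a module body has an empty one by default.
(define ns (make-base-namespace))
(define form '(let ([x 1]) x))

(define-values (plain tainted)
  (parameterize ([current-namespace ns])
    (values
     ;; Started normally, this is the original code inspector.
     (expand form (current-code-inspector))
     ;; A fresh sub-inspector stands in for a sandbox's inspector.
     (expand form (make-inspector)))))

(define plain-tainted? (syntax-tainted? plain))
(define sandbox-tainted? (syntax-tainted? tainted))
```

Under those assumptions, `sandbox-tainted?` should be true and `plain-tainted?` false. The tainted result can still be inspected, for example with `syntax->datum`.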
This change does not entirely remove the burden for implementing
modules that are intended to be trusted. Instead of requiring a
careful use of `syntax-protect` on macro expansions, the problem is
narrowed to using `local-expand` and related functions correctly, as
well as protecting any exports that would expose the capabilities of
`local-expand`. Non-protected exports that expose `local-expand` are
rare, in contrast to macros that lack `syntax-protect`. Another
benefit is that tools and macros that manipulate expansions no longer
need to use `syntax-disarm` and `syntax-rearm`, which was painful and
error-prone.
The functions `syntax-protect`, `syntax-arm`, `syntax-disarm`, and
`syntax-rearm` are still available, but they now just return the given
syntax object unmodified. The `'taint-mode` and `'certify-mode`
syntax-object properties are no longer specifically recognized by the
expander.
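A quick way to observe this, again assuming a Racket build that includes this change; the variable names are illustrative:

```racket
#lang racket/base

(define stx (datum->syntax #f '(f x)))

;; Each call is expected to hand back the very same syntax object.
(define protected (syntax-protect stx))
(define armed     (syntax-arm stx))
(define disarmed  (syntax-disarm armed #f))
(define rearmed   (syntax-rearm disarmed stx))

(define all-unchanged?
  (and (eq? stx protected)
       (eq? stx armed)
       (eq? stx disarmed)
       (eq? stx rearmed)))
```

Existing code that calls these functions should therefore keep compiling and running, with the calls acting as no-ops.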
This change is joint work with @michaelballantyne, who identified problems
with the current system and moved the discussion in this direction,
plus @rmculpepper, @samth, and @mfelleisen.