Partial evaluation in guile › 钻进兔子洞 down the rabbit hole [#272]

到这里一切都还很简单:我们有一个 终止 ( terminates ) 的简单、 良类型 ( well-typed ) 的例子。但要成为真实世界的编译器的一部分,一个部分求值器需要能够应对真实世界的代码:可变数据的访问器,访问可变绑定(词法和全局的),可能无限的递归,未绑定变量,以及 贫瘠类型 ( poorly-typed ) 的程序。此外,一个真实世界的内联器还需要运行得快,并避免产生膨胀的留存代码。
Up to here things are easy: we have a simple, well-typed example that terminates. But to be part of a real-world compiler, a partial evaluator needs to handle real-world code: accessors for mutable data, access to mutable bindings (lexical and global), indefinite recursion, unbound variables, and poorly-typed programs. In addition, a real-world inliner needs to run quickly and avoid producing bloated residual code.

我应该花点时间指出,静态类型的函数式语言只要简单定义一下,就能避免许多这些问题。编译器专家们倾向于使用早期绑定就不奇怪了。Scheme 确实通过对词法作用域的使用展现了一定程度的早期绑定,但它并不是一个纯函数式语言。在开发 peval 时,是我第一次希望 Scheme 中有不可变的 ( pairs ) ,就像 Racket 和 R6RS 推广的那样。
I should take a moment and note that statically-typed, functional languages can avoid a number of these problems, simply by defining them away. It is no wonder that compiler people tend towards early binding. Scheme does exhibit a fair amount of early binding through its use of lexical scope, but it is not a pure functional language. Working on this peval was the first time that I wished for immutable pairs in Scheme, as promoted by Racket and R6RS.

无论如何,在你的语言中拥有可变性并不是那么的糟糕。你确实错过了一些优化机会,但这没关系。更糟的是生产环境中 peval 在一个表达式花费太多时间。
Anyway, having mutability in your language isn't so bad. You do miss some optimization opportunities, but that is OK. What is not OK in a production peval is spending too much time on an expression.

Guile 的解决方案,遵循 Waddell 和 Dybvig 卓越的 Fast and Effective Procedure Inlining 中的方法,只需计算通过内联器的次数。每次内联尝试都会得到一个新的计数器,在内联尝试中的任何工作都会减少计数器的值。当计数器归零时,内联尝试被中止,取而代之的是留存一个调用。由于程序中的 调用点 ( call sites ) 数量是固定的,且每个调用点将完成的工作量是有限的,因此最终的算法在源程序大小为 N 时是 O(N)。
Guile's solution, following Waddell and Dybvig's excellent Fast and Effective Procedure Inlining, is to simply count the number of times through the inliner. Each inlining attempt gets a fresh counter, and any work performed within an inlining attempt decrements the counter. When the counter reaches zero, the inlining attempt is aborted, and a call is residualized instead. Since the number of call sites in the program is fixed, and there is a maximum amount of work that will be done at each call site, the resulting algorithm is O(N) in the size of the source program.

为了让定义在他们被使用的上下文中被处理,Guile 的部分求值器还使用了 Waddell 和 Dybvig 的 按需 ( on-demand ) 在线 ( online ) 策略。例如,当 (cons 1 2) 被作为条件测试处理时,可能会被化简为 #t。如果在处理 let 的 body 后,一个绑定没有被引用,那么他会被作为 效应 ( effect ) 处理。诸如此类。
Guile's partial evaluator also uses the on-demand, online strategy of Waddell and Dybvig, to allow definitions to be processed in their use contexts. For example, (cons 1 2) may be reduced to #t when processed as a test, in a conditional. If, after processing the body of a let, a binding is unreferenced, then it is processed for effect. Et cetera.

工作量 ( effort ) 计数器设置后,Guile 只是简单地尝试内联程序中的每一个调用点,因为如果不奏效它自己会退出。这听起来有些疯狂,但正如 Waddell 和 Dybvig 展示的,这是有效的。工作量计数器也用于限制代码增长,尽管这方法有些粗糙。无论如何,当我优化 Guile 使用的 psyntax 扩展器时,代码的增长量不到百分之一,这对我来说是个胜利。
With the effort counter in place, Guile simply tries to inline every call site in the program, knowing that it will bail out if things don't work. It sounds a little crazy, but it works, as Waddell and Dybvig show. The effort counter also serves to limit code growth, though it is a bit crude. In any case I got less than a percent of code growth when optimizing the psyntax expander that Guile uses, which is a win in my book.