Translation. Zipper in scheme [jsr-000O]
- December 18, 2024
-
Oleg Kiselyov with contributions from Jinser Kafka
- Original
Translation. Zipper in scheme [jsr-000O]
- December 18, 2024
- Oleg Kiselyov with contributions from Jinser Kafka
- Original
This article is translated from Oleg Kiselyov's article titled "zipper in scheme".
本文翻译自 Oleg Kiselyov 题为《zipper in scheme》的文章
Title added by translator. 标题由译者添加。
tips: Click on the translation to expand and view the original text. 点击译文可展开查看原文。
1. 前言 [#271]
- December 18, 2024
- Oleg Kiselyov
1. 前言 [#271]
- December 18, 2024
- Oleg Kiselyov
zipper 是一种非常方便的数据结构,它允许我们替换复杂数据结构(例如树或 term)深处的项,而无需任何可变性。结果数据结构将尽可能多地与旧结构共享其组件 [见附录]。旧的数据结构仍然可用(这便于我们撤销该操作)。本质上,zipper 是一个“可更新”但纯函数式的数据结构游标。
Zipper is a very handy data structure that lets us replace an item deep in a complex data structure, e.g., a tree or a term, without any mutation. The resulting data structure will share as much of its components with the old structure as possible [see addendum]. The old data structure is still available (which can be handy if we wish to 'undo' the operation later on). Zipper is essentially an `updateable' and yet pure functional cursor into a data structure.
有用的参考:
Useful references:
http://www.nist.gov/dads/HTML/zipper.html
http://citeseer.ist.psu.edu/hinze01web.html
zipper 是一种由定界延续具体化而来的数据结构。不知何故,这个想法在 zipper 领域并不常见。因为 scheme 有一等支持的定界延续,我们可以更容易地派生和使用 zipper。
Zipper is a _delimited continuation_ reified as a data structure. Somehow that idea is not commonly discussed in the zipper literature. Because Scheme has first-class and delimited continuations, we can derive and use zipper far more easily.
本文将给出 zipper 的派生及其使用示例:交换两棵树的两个分支。该使用示例是遗传编程中典型的交叉操作。
Given below is a derivation of zipper and an example of its use: swapping out of two branches of a two trees. The latter is a typical cross-over operation in genetic programming.
[email protected] (Noel Welsh) 在消息中写
news:<[email protected]>...
我想以纯函数式的方式进行交叉,我设想的算法是:
[email protected] (Noel Welsh) wrote in message
news:<[email protected]>...
I would like to do the crossover in a purely functional manner. The
algorithm I envisage is:
- Count the number of nodes for each of the two trees
- Choose an random integer between 0 ... (number of nodes - 1)
This is the node where crossover will take place.
正如上文指出的,我们不需要通过计数来从树中选择随机节点。选择节点后,我们可以使用
eq?
测试来向下拉开树中的该节点。然而,在下文中,为了简单起见,我们跳过随机选择,按深度优先顺序和节点索引选择节点。
As pointed out earlier, we don't need counting to select a random node from a tree. After we selected the node, we can zip down to that node in the tree using the eq? test. In the following however, we skip the random selection for simplicity and thus we shall be selecting nodes by their index in the depth-first order.
2. 派生 [#272]
- December 18, 2024
- Oleg Kiselyov
2. 派生 [#272]
- December 18, 2024
- Oleg Kiselyov
为了派生 zipper,我们首先编写更熟悉的深度优先遍历例程:
To derive zipper, we first write the familiar depth-first traversal routine:
Welcome to Scheme 48 1.1
> ,open srfi-9
> ,open escapes signals
> ,load /usr/local/lib/scheme48/misc/shift-reset.scm
; deterministic, left-to-right map
(define (map* f l)
(if (null? l) l
(cons (f (car l)) (map* f (cdr l)))))
(define (depth-first handle tree)
(cond
((null? tree) tree)
((handle tree) => (lambda (new-tree) new-tree))
; the node was not handled -- descend
((not (pair? tree)) tree) ; an atom
(else
(cons (car tree) ; node name
(map* (lambda (kid) (depth-first handle kid)) (cdr tree))))))
函数
handle
,作为 depth-first
的第一个参数,接受一个 node
并应该返回一个 node
或 #f
。在第一种情况下,返回的 node
替换结果树中的现有节点。如果 handle
返回 #f
,则表明它已拒绝处理该节点,因此我们保留该节点,在可能的情况下继续处理该节点的子节点。
The function handle, the first-argument of depth-first, receives a node and should yield either a node or #f. In the first case, the return node replaces the existing node in the result tree. If the handle returned #f, it has declined to handle that node, so we keep that node and descend into it, if possible.
为了解其工作原理,我们定义两个示例树并打印出它们的节点:
To see how this works, we define two sample trees and print out their nodes:
(define tree1 '(a (b) (c (d 1 2)) e))
(define tree2 '(z (u) (v (w 10 12)) y))
(depth-first (lambda (node) (display node) (newline) #f) tree1)
==> prints
(a (b) (c (d 1 2)) e)
(b)
(c (d 1 2))
(d 1 2)
1
2
e
==> yields
'(a (b) (c (d 1 2)) e)
我们现在可以定义数据类型 zipper:
We can now define the zipper data structure:
(define-record-type zzipper
(zipper curr-node k)
zipper?
(curr-node z-curr-node)
(k z-k))
它包含两个字段:树的当前节点和延续。延续应该接收一个
node
或 #f
。在前一种情况,接收到的节点将替换现有节点。在后一种情况,我们保留现有节点并继续遍历。这个延续返回一个新的 zipper
或一课树(如果遍历完成)。可以看出,zipper 在某种意义上是函数 handle
的“逆”。
It contains two fields: the current node of a tree, and the continuation. The continuation should receive a node or #f. In the former case, the received node will replace the existing node. In the latter case, we keep the existing node and continue the traversal. The continuation returns either a new zipper, or a tree (if the traversal is finished). One can see that zipper is in a sense an 'inverse' of the function handle.
(define (zip-tree tree)
(reset (depth-first (lambda (tree) (shift f (zipper tree f))) tree)))
正如所承诺的,zipper 确实是定界延续的一种表现形式。
As promised, zipper is indeed a manifestation of a delimited continuation.
我们应该指出,record
zipper
和构造函数 zip-tree
都是 generic 的。它们本身既不依赖树的表示,也不依赖遍历策略。有关树数据结构及其遍历的所有信息都被封装在一个函数 depth-first
中。我们可以通过改变深度优先策略从深度优先策略切换到广度优先策略,或者从嵌套列表切换到树的嵌套向量实现。无论是 zipper
、zip-tree
还是任何使用 zipper
的代码(见下文)都不需要任何修改。我们的 zipper 的这一特性与 Gerard Huet 的 zipper 派生形成了鲜明的对比。在 Gerard Huet 的形式中,zipper 确实取决于数据类型的具体实现:zipper 是从数据类型 derived(双关语)的(译者按:派生和求导)。不同的数据类型(以及抽象数据类型的不同实现)将具有不同的对应 zipper 结构。在我们的形式中,zipper 是遍历函数的 generic
derivation(双关语)(译者按:派生和导数)。zipper 是遍历函数的导数——就只是机械的导数。因此,shift/reset 可以被视为遍历函数的导数运算符。
We should point out that both the zipper record and the constructor function zip-tree are _generic_. They by themselves depend neither on the representation of the tree nor on the traversal strategy. All the information about the tree data structure and its traversal is encapsulated in one single function depth-first. We can switch from depth-first to breadth-first strategy or from a nested list to a nested vector realization of trees just by changing depth-first. Neither zipper, nor zip-tree, nor any code that uses zipper (see below) will require any modifications. This property of our zipper is in a marked contrast with Gerard Huet's derivation of zipper. In Gerard Huet's formulation, zipper does depend on the concrete realization of the data type: zipper is derived (pun intended) from the data type. Different data types (and different realizations of an abstract data type) will have different corresponding zipper structures. In our formulation, zipper is a _generic_ derivation (pun intended) on the traversal function. Zipper is a derivative of the traversal function -- mechanical derivative at that. So, shift/reset can be considered traversal function derivative operators.
3. 使用 [#273]
- December 18, 2024
- Oleg Kiselyov
3. 使用 [#273]
- December 18, 2024
- Oleg Kiselyov
我们现在可以用一种不同的方式打印树:
We can now print out the tree in a different way:
(define (print-tree tree)
(do ((cursor (zip-tree tree) ((z-k cursor) #f)))
((not (zipper? cursor)))
(display (z-curr-node cursor))
(newline)))
我们使用 zipper(即
cursor
)逐个节点地检查整个树。从某种意义上说,我们翻转了 depth-first
的操作。
we use zipper, which is a cursor, to examine all of the tree, node by node. In a sense, we have inverted the operation of depth-first.
(print-tree tree1)
; prints as before
(print-tree tree2)
(z (u) (v (w 10 12)) y)
(u)
(v (w 10 12))
(w 10 12)
10
12
y
引入一些有用的函数
We introduce a few helpful functions
(define (zip-all-the-way-up zipper)
(if (zipper? zipper) (zip-all-the-way-up ((z-k zipper) (z-curr-node zipper)))
zipper))
(define (locate-nth-node n tree)
(do ((i 0 (+ 1 i)) (cursor (zip-tree tree) ((z-k cursor) #f)))
((and (= i n)
(if (zipper? cursor) #t
(error "too few nodes"))) cursor)
))
我们已准备好做一些事:
And we are ready for some action:
; replace the 3-d node of tree1 with 'xxx
(let ((desired-node (locate-nth-node 3 tree1)))
(display "Replacing the node: ")
(display (z-curr-node desired-node))
(newline)
(zip-all-the-way-up ((z-k desired-node) 'xxx)))
==> prints
Replacing the node: (d 1 2)
==> yieds
'(a (b) (c xxx) e)
它确实替换了,不是吗?
It did replace it, didn't it?
; cross-over of the 3d node of tree1 and 1st node of tree2
(let* ((desired-node1 (locate-nth-node 3 tree1))
(_ (begin
(display "Cross-over the node1: ")
(display (z-curr-node desired-node1))
(newline)))
(desired-node2 (locate-nth-node 1 tree2))
(_ (begin
(display "Cross-over the node2: ")
(display (z-curr-node desired-node2))
(newline)))
(new-tree1
(zip-all-the-way-up ((z-k desired-node1)
(z-curr-node desired-node2))))
(new-tree2
(zip-all-the-way-up ((z-k desired-node2)
(z-curr-node desired-node1))))
)
(display "new tree1: ") (display new-tree1) (newline)
(display "new tree2: ") (display new-tree2) (newline)
)
==> prints
Cross-over the node1: (d 1 2)
Cross-over the node2: (u)
new tree1: (a (b) (c (u)) e)
new tree2: (z (d 1 2) (v (w 10 12)) y)
嗯,看来可行...
Well, it seems to work...
如果我们交换
tree1
的第3个节点和 tree2
的第5个节点,我们得到
If we swap the 3d node of tree1 and the 5th node of tree2, we get
Cross-over the node1: (d 1 2)
Cross-over the node2: 12
new tree1: (a (b) (c 12) e)
new tree2: (z (u) (v (w 10 (d 1 2))) y)
4. 总结 [#274]
- December 18, 2024
- Oleg Kiselyov
4. 总结 [#274]
- December 18, 2024
- Oleg Kiselyov
总而言之,定界延续非常有用。可以在任何 R5RS scheme 系统中进行模拟;但如果它本身受支持,性能会更好。Scheme48 本身就支持定界延续(Martin Gasbichler and Michael Sperber, ICFP 2002)。如果你最喜欢的 Scheme 系统默认没有提供它们,请向实现者投诉。支持哪个特定的定界延续运算符(shift、control、shift0、splitter、cupto 等)并不重要——所有这些的表达性都是相同的:
To conclude, delimited continuations are quite useful. They can be emulated in any R5RS Scheme system; yet it is better for performance if they are supported natively. Scheme48 does support delimited continuations natively (Martin Gasbichler and Michael Sperber, ICFP 2002). If your favorite Scheme system does not offer them by default, please complain to the implementors. It doesn't matter which particular delimited continuation operator (shift, control, shift0, splitter, cupto, etc) is supported -- all of them are equally expressible:
- Chung-chieh Shan, Scheme2004 workshop
- http://www.eecs.harvard.edu/~ccshan/recur/
5. 附录 [#275]
- December 18, 2024
- Oleg Kiselyov
5. 附录 [#275]
- December 18, 2024
- Oleg Kiselyov
附录,2006 年 6 月 7 日 [受到 Andrew Wilcox 问题的启发] 更准确地说,zipper 与底层枚举器一样保留了共享。以下是最大共享保留枚举器。这两个函数应该取代本文中的函数。
Addendum, June 7, 2006 [inspired by a question from Andrew Wilcox] To be more precise, the zipper preserves sharing as much as the underlying enumerator does. The following is the maximal sharing preserving enumerator. Those two functions should replace the ones in the article.
; deterministic, left-to-right map
; It preserves sharing as much as possible: that is, if given the pair
; l == (cons h t), (and (eq? h (f h)) (eq? t (map* f t))) holds, then
; (eq? (map* f l) l) holds as well.
(define (map* f l)
(if (null? l) l
(let ((h (car l)) (t (cdr l)))
(let ((h1 (f h)) (t1 (map* f t)))
(if (and (eq? h1 h) (eq? t1 t)) l
(cons h1 t1))))))
(define (depth-first handle tree)
(cond
((null? tree) tree)
((handle tree) => (lambda (new-tree) new-tree))
; the node was not handled -- descend
((not (pair? tree)) tree) ; an atom
(else
(let ((kids1
(map* (lambda (kid) (depth-first handle kid)) (cdr tree))))
(if (eq? kids1 (cdr tree)) tree
(cons (car tree) ; node name
kids1))))))
为了测试新的
depth-first
确实保留了共享,我们求值
To test that the new depth-first indeed preserves sharing, we evaluate
(eq? tree1
(depth-first (lambda (node) (display node) (newline) #f) tree1))
以深度优先顺序打印所有节点后,给出结果
#t
。在这种情况下,深度优先(depth-first
)返回的树确实是原始树。
which, after printing all nodes in depth-first order, gives the result #t. The tree returned by depth-first in this case is indeed the original tree as it is.
zipper 代码无需更改,按原样工作,具有相同的结果。 为了测试共享是否保留,我们首先通过替换
tree2
中的第 6 个节点(即 y
)来生成一棵树:
The zipper code needs no changes, and it works as it was, with the same results. To test the sharing preservation, we first produce a tree by replacing the 6th node (which is y) in tree2:
(define tree2*
(let ((desired-node (locate-nth-node 6 tree2)))
(display "Replacing the node: ")
(display (z-curr-node desired-node))
(newline)
(zip-all-the-way-up ((z-k desired-node) 'newy))))
这是结果:
(z (u) (v (w 10 12)) newy)
here's the result: (z (u) (v (w 10 12)) newy)
现在,我们编写一个函数,它接受两棵树,同步遍历它们并打印出节点以及它们是否共享:
Now, we write a function that takes two trees, traverses them in lockstep and prints out the nodes and if they are shared:
(define (tree-compare-sharing t1 t2)
(do ((cursor1 (zip-tree t1) ((z-k cursor1) #f))
(cursor2 (zip-tree t2) ((z-k cursor2) #f)))
((cond
((and (zipper? cursor1) (zipper? cursor2)) #f)
((zipper? cursor1) (display "t2 finished early") #t)
((zipper? cursor2) (display "t1 finished early") #t)
(else #t)))
(let ((n1 (z-curr-node cursor1)) (n2 (z-curr-node cursor2)))
(cond
((eq? n1 n2) (display "shared node: ") (display n1))
(else (display "t1 node: ") (display n1) (newline)
(display "t2 node: ") (display n2)))
(newline))))
(tree-compare-sharing tree2 tree2*)
===>
t1 node: (z (u) (v (w 10 12)) y)
t2 node: (z (u) (v (w 10 12)) newy)
shared node: (u)
shared node: (v (w 10 12))
shared node: (w 10 12)
shared node: 10
shared node: 12
t1 node: y
t2 node: newy