Serialized linklets in the sense of setting current-compile-target-machine to #f? It uses racket/fasl on a linklet S-expression. The racket/fasl output is a kind of preorder traversal of the S-expression, where a designated integer marks each supported kind of node/value that can appear within the S-expression. But racket/fasl also views the S-expression as a kind of DAG in the sense that symbols, strings, and some other kinds of values can appear multiple times in the S-expression tree; to record that sharing, racket/fasl writes a “remember this subtree” integer plus a unique index at the first instance of the shared value, then a “reference subtree” integer with the same index for later references.
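To make the "remember"/"reference" idea concrete, here is a minimal sketch in Python of a preorder encoder that emits a flat token stream. It is only an illustration of the strategy described above: the tag integers, the `Symbol` class, and the `encode` function are all hypothetical and do not reflect the actual racket/fasl byte format or tag values.

```python
from collections import Counter

# Hypothetical tag integers; these are NOT the values racket/fasl uses.
TAG_LIST, TAG_SYMBOL, TAG_STRING, TAG_INT, TAG_DEF, TAG_REF = range(6)


class Symbol:
    """Stand-in for a symbol; instances are compared by identity here."""

    def __init__(self, name):
        self.name = name


def _key(v):
    # Symbols are shared by identity, strings by value.
    return ("sym", id(v)) if isinstance(v, Symbol) else ("str", v)


def encode(sexp):
    """Preorder-encode nested lists into a flat token list with def/ref sharing."""
    # Pass 1: count how often each shareable leaf occurs.
    counts = Counter()

    def scan(v):
        if isinstance(v, list):
            for x in v:
                scan(x)
        elif isinstance(v, (Symbol, str)):
            counts[_key(v)] += 1

    scan(sexp)

    # Pass 2: emit tokens; shared leaves get a "remember" tag and index at
    # their first occurrence and a "reference" tag and index afterwards.
    out, index = [], {}

    def emit(v):
        if isinstance(v, list):
            out.extend([TAG_LIST, len(v)])
            for x in v:
                emit(x)
        elif isinstance(v, (Symbol, str)):
            k = _key(v)
            if k in index:
                out.extend([TAG_REF, index[k]])
                return
            if counts[k] > 1:
                index[k] = len(index)
                out.extend([TAG_DEF, index[k]])
            tag = TAG_SYMBOL if isinstance(v, Symbol) else TAG_STRING
            out.extend([tag, v.name if isinstance(v, Symbol) else v])
        elif isinstance(v, int):
            out.extend([TAG_INT, v])
        else:
            raise TypeError(f"unsupported value: {v!r}")

    emit(sexp)
    return out


x = Symbol("x")
tokens = encode([Symbol("define"), x, [Symbol("+"), x, 1]])
```

In this sketch the second occurrence of `x` is written as `TAG_REF, 0` rather than as a fresh symbol, which is the sharing being described.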
For a JSON encoding instead of a byte-stream encoding, I imagine it would be more sensible to represent an S-expression tree as a JSON tree, plus a similar “remember” and “reference” strategy for encoding sharing.
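One way such a JSON encoding might look, sketched in Python under stated assumptions: the `{"def": …}` and `{"ref": …}` node shapes, the `Symbol` class, and the `to_json`/`from_json` names are invented for illustration and are not an existing Racket or JSON convention. For simplicity this version gives every symbol a "def" index at its first occurrence, shared or not.

```python
import json


class Symbol:
    """Stand-in for an uninterned symbol: identity matters, not the name."""

    def __init__(self, name):
        self.name = name


def to_json(sexp):
    """Map nested lists, Symbols, strings, and ints to a JSON-ready tree."""
    seen = {}  # id(symbol) -> index assigned at its first occurrence

    def walk(v):
        if isinstance(v, list):
            return [walk(x) for x in v]
        if isinstance(v, Symbol):
            if id(v) in seen:
                return {"ref": seen[id(v)]}  # later occurrence
            seen[id(v)] = len(seen)
            return {"def": seen[id(v)], "symbol": v.name}  # first occurrence
        if isinstance(v, str):
            return {"string": v}
        if isinstance(v, int):
            return v
        raise TypeError(f"unsupported value: {v!r}")

    return walk(sexp)


def from_json(tree):
    """Rebuild the tree, restoring identity between def and ref nodes."""
    table = {}  # index -> reconstructed Symbol

    def walk(n):
        if isinstance(n, list):
            return [walk(x) for x in n]
        if isinstance(n, dict):
            if "ref" in n:
                return table[n["ref"]]
            if "symbol" in n:
                sym = table[n["def"]] = Symbol(n["symbol"])
                return sym
            return n["string"]
        return n

    return walk(tree)


g = Symbol("g1")
text = json.dumps(to_json([g, [g, "s", 2]]))
back = from_json(json.loads(text))
```

After the round trip, `back[0]` and `back[1][0]` are the same object. Without the def/ref nodes they would decode as two unrelated symbols that merely share a name.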
@notjack it would be pretty easy to write an sexp->jsexpr function that worked a lot like s-exp->fasl
depending on what your goals are, you might not even need to care about the sharing @mflatt mentions
but if you want to round-trip faithfully/use the output again in compilation you do
@mflatt Is the sharing stuff part of the semantics of reading and writing linklets, or is it just a space-saving optimization? Also I’m curious about how it might work when current-compile-target-machine is #t too (but I am more interested in the case where it’s #f).
@notjack yes, it’s part of the semantics
in the sense that if you don’t do it, it won’t work
Well, reading must intern symbols, strings, etc., so it’s actually a space-saving optimization in this context
to be more specific, if you don’t do it, compiling, writing, reading and then evaluating won’t give the same answer as just evaluating
@mflatt it’s a space-saving optimization over using write and read, but wouldn’t it give the wrong answer compared to a version of racket/fasl that didn’t do it?
@samth Honestly my primary goal here is daydreaming; I’m not building anything at present. But what I’m daydreaming about is a distributed racket compiler for the package server that writes and reads linklets to and from a distributed content-addressed storage system and generally tries to be Really Smart (TM) about caching.
(maybe that kind of daydreaming is a side effect of working for google)
@samth Oh, you’re right. For gensyms and such, sharing must be recorded. I was thinking only of values that can/should be interned.
ah so linklets definitely allow gensyms in places? I’m surprised any sort of generativity is allowed in a serialized format like that
@notjack I’m not sure why you’d want something different from fasled form in that case, especially since racket/fasl goes out of its way to produce a deterministic result.
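A deterministic serialization matters for the content-addressed idea because equal inputs then hash to the same key. A rough Python sketch of that property, with canonical JSON standing in for the fasl bytes; `canonical_bytes` and `ContentStore` are hypothetical names, and the in-memory dict is only a placeholder for a real distributed store:

```python
import hashlib
import json


def canonical_bytes(tree):
    """Serialize deterministically: sorted keys, fixed separators, UTF-8."""
    return json.dumps(tree, sort_keys=True, separators=(",", ":")).encode("utf-8")


class ContentStore:
    """Toy content-addressed store keyed by the SHA-256 of the stored bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)  # identical content is stored once
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self):
        return len(self._blobs)


store = ContentStore()
k1 = store.put(canonical_bytes({"a": 1, "b": [2, 3]}))
k2 = store.put(canonical_bytes({"b": [2, 3], "a": 1}))
```

Here `k1` and `k2` are equal and the store holds a single blob. A non-deterministic encoder would produce different keys for the same logical content and defeat the cache.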
It’s easy for gensyms to go wrong (such as not being deterministic), so they should be avoided. But they’re not prohibited.
@mflatt Less something “different from”, more something “in addition to”. I’m imagining http content negotiation could be used to get compiled linklets in formats other than the fasl one so non-racket things (including humans) can analyze linklets. It also would help me personally to get a better understanding of the underlying data model of linklets and what data they can and can’t encode.
a JSON format that’s 1:1 with the racket format would make it much easier for me to understand the racket format, for example
Oh also there might be more efficient linklet encodings for a system that builds everything in the package catalog, since a system like that would get a lot more mileage out of maximizing cacheability and structure sharing across many linklets