mflatt
2019-2-14 13:17:04

Serialized linklets in the sense of setting current-compile-target-machine to #f? It uses racket/fasl on a linklet S-expression. The racket/fasl output is a kind of preorder traversal of the S-expression, where each supported kind of node/value that can appear within the S-expression has a designated integer. But racket/fasl also views the S-expression as a kind of DAG, in the sense that symbols, strings, and some other kinds of values can appear multiple times in the S-expression tree; to record that sharing, racket/fasl writes a “remember this subtree” integer plus a unique index at the first instance of the shared value, then a “reference subtree” integer with the same index for later references.
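
For anyone following along, a minimal round trip through the racket/fasl entry points looks like this (the linklet-shaped S-expression is just an illustrative placeholder; the kind tags and remember/reference indices described above live inside the byte string and aren't exposed):

```racket
#lang racket
(require racket/fasl)

;; A placeholder linklet-shaped S-expression; any quotable value works here.
(define expr '(linklet () (x) (define-values (x) "hello") x))

;; Encode to a byte string and decode it back; the result is equal? to expr.
(define bstr (s-exp->fasl expr))
(fasl->s-exp bstr)
```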


mflatt
2019-2-14 13:19:02

For a JSON encoding instead of a byte-stream encoding, I imagine it would be more sensible to represent an S-expression tree as a JSON tree, plus a similar “remember” and “reference” strategy to encode sharing.
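
As a rough illustration (all of the key names here are invented, not an existing format), '(f x x) might come out as a jsexpr, Racket's representation of a JSON tree, along these lines, with the shared symbol written once and referenced by index afterward:

```racket
;; Hypothetical encoding of '(f x x): the second x is only a back-reference.
(hasheq 'list (list (hasheq 'sym "f")
                    (hasheq 'sym "x" 'remember 0)
                    (hasheq 'ref 0)))
```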


samth
2019-2-14 14:47:01

@notjack it would be pretty easy to write an sexp->jsexpr function that worked a lot like s-exp->fasl
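
Sketching that out (purely as a toy, with the same invented key names as above; it remembers every symbol and string whether or not it is actually shared, whereas a real encoder like s-exp->fasl would be smarter about what to track):

```racket
#lang racket

;; Toy sexp->jsexpr (not an existing library function): tag each kind of
;; node, and record eq?-based sharing of symbols and strings with the
;; invented "remember"/"ref" keys, loosely following the remember/reference
;; strategy described above for racket/fasl.
(define (sexp->jsexpr v)
  (define seen (make-hasheq))  ; value -> index assigned at first encounter
  (define counter 0)
  (define (loop v)
    (cond
      [(hash-ref seen v #f)
       => (lambda (i) (hasheq 'ref i))]
      [(or (symbol? v) (string? v))
       (let ([i counter])
         (set! counter (add1 counter))
         (hash-set! seen v i)
         (if (symbol? v)
             (hasheq 'sym (symbol->string v) 'remember i)
             (hasheq 'str v 'remember i)))]
      [(exact-integer? v) (hasheq 'int v)]
      [(list? v)          (hasheq 'list (map loop v))]
      [else (error 'sexp->jsexpr "unsupported value: ~e" v)]))
  (loop v))

;; For example, (require json) and then (write-json (sexp->jsexpr '(f x x)))
;; prints JSON in which the second x appears only as a {"ref": ...} node.
```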


samth
2019-2-14 14:48:06

depending on what your goals are, you might not even need to care about the sharing @mflatt mentions


samth
2019-2-14 14:48:26

but if you want to round-trip faithfully or use the output again in compilation, you do


notjack
2019-2-14 15:41:02

@mflatt Is the sharing stuff part of the semantics of reading and writing linklets, or is it just a space-saving optimization? Also I’m curious about how it might work when current-compile-target-machine is #t too (but I am more interested in the case where it’s #f).


samth
2019-2-14 15:41:26

@notjack yes, it’s part of the semantics


samth
2019-2-14 15:41:40

in the sense that if you don’t do it, it won’t work


mflatt
2019-2-14 15:42:08

Well, reading must intern symbols, strings, etc., so it’s actually a space-saving optimization in this context
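
For instance, this is easy to check at a REPL; since the reader interns symbols, encoding each occurrence separately would cost bytes but not change the result:

```racket
;; Interned symbols come back eq? no matter how many times they are read,
;; so recording their sharing in the encoding only saves space.
(eq? (read (open-input-string "foo"))
     (read (open-input-string "foo")))   ; => #t

;; datum-intern-literal does similar interning for literal strings, numbers,
;; and a few other kinds of values as they are loaded.
(eq? (datum-intern-literal "hello")
     (datum-intern-literal "hello"))     ; => #t
```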


samth
2019-2-14 15:42:30

to be more specific, if you don’t do it, compiling, writing, reading and then evaluating won’t give the same answer as just evaluating


samth
2019-2-14 15:43:26

@mflatt it’s a space-saving optimization compared to using write and read, but wouldn’t a version of racket/fasl that didn’t do it give the wrong answer?


notjack
2019-2-14 15:44:03

@samth Honestly my primary goal here is daydreaming; I’m not building anything at present. But what I’m daydreaming about is a distributed racket compiler for the package server that writes and reads linklets to and from a distributed content-addressed storage system and generally tries to be Really Smart (TM) about caching.


notjack
2019-2-14 15:45:16

(maybe that kind of daydreaming is a side effect of working for google)


mflatt
2019-2-14 15:45:58

@samth Oh, you’re right. For gensyms and such, sharing must be recorded. I was thinking only of values that can/should be interned.
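
A tiny example of the gensym case (assuming, as the discussion here implies, that racket/fasl recreates uninterned symbols and preserves their sharing within a single encoded value):

```racket
#lang racket
(require racket/fasl)

;; One uninterned symbol used in two positions of the same S-expression.
(define g (gensym 'tmp))
(define expr (list g g))

;; Round-tripping keeps the two positions eq? to each other, though they are
;; a fresh uninterned symbol rather than eq? to the original g.
(define expr2 (fasl->s-exp (s-exp->fasl expr)))
(eq? (car expr2) (cadr expr2))  ; => #t
(eq? (car expr2) g)             ; => #f

;; An encoder that wrote each occurrence independently would produce two
;; distinct symbols here instead, which is the "wrong answer" described above.
```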


notjack
2019-2-14 15:46:40

ah so linklets definitely allow gensyms in places? I’m surprised any sort of generativity is allowed in a serialized format like that


mflatt
2019-2-14 15:47:00

@notjack I’m not sure why you’d want something different from the fasled form in that case, especially since racket/fasl goes out of its way to produce a deterministic result.


mflatt
2019-2-14 15:47:46

It’s easy for gensyms to go wrong (such as not being deterministic), so they should be avoided. But they’re not prohibited.


notjack
2019-2-14 15:50:16

@mflatt Less something “different from”, more something “in addition to”. I’m imagining http content negotiation could be used to get compiled linklets in formats other than the fasl one so non-racket things (including humans) can analyze linklets. It also would help me personally to get a better understanding of the underlying data model of linklets and what data they can and can’t encode.


notjack
2019-2-14 15:51:02

a JSON format that’s 1:1 with the racket format would make it much easier for me to understand the racket format, for example


notjack
2019-2-14 15:52:50

Oh also there might be more efficient linklet encodings for a system that builds everything in the package catalog, since a system like that would get a lot more mileage out of maximizing cacheability and structure sharing across many linklets