2021-3-8 21:00:51

idle thought of the day: I think syntax->s-expression might be a less confusing name than syntax->datum

2021-3-8 21:24:51

That would be clearer.

2021-3-8 21:31:31

except that it doesn’t need to be an S-expression?

(struct s (x)) (s-x (syntax->datum (datum->syntax #f (s 1))))

2021-3-8 21:50:35

I didn’t know that structs could be used in syntax objects.

2021-3-8 21:57:49

I think 'foo counts as an s-expression to many people, so I’d be fine with that

2021-3-8 21:59:13

@kellysmith12.21 Anything can be stuffed into a syntax object. It’s just that the expander only looks deeper inside lists. And if you shove something into a syntax object that isn’t… serializable, I think? then you get “3-d syntax” which can’t be compiled to bytecode and therefore can only exist in the intermediate stages of macro-expanding a module.
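
A minimal sketch of the "anything can be stuffed into a syntax object" point, using only `datum->syntax`, `syntax->list`, `syntax-e`, and `syntax->datum` from `racket/base`. The struct `s` mirrors the earlier snippet; the other names (`stx`, `parts`, `wrapped`, `round-tripped`) are made up for illustration. It only shows that the list is converted element by element while the struct instance is wrapped as-is; it does not try to demonstrate the bytecode restriction on 3-D syntax.

```racket
#lang racket/base

;; An opaque (non-prefab) struct: datum->syntax wraps an instance whole.
(struct s (x))

;; A list datum mixing a symbol, a struct instance, and a string.
(define stx (datum->syntax #f (list 'f (s 1) "str")))

;; The list structure is converted element by element...
(define parts (syntax->list stx))

;; ...but the struct instance is simply wrapped, so it comes back intact.
(define wrapped (syntax-e (cadr parts)))
(define round-tripped (cadr (syntax->datum stx)))

(printf "elements: ~a\n" (length parts))
(printf "struct survives syntax-e: ~a\n" (s? wrapped))
(printf "field after syntax->datum: ~a\n" (s-x round-tripped))
```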

2021-3-8 22:00:28

In Punctaffy, where I’m using other data structures to represent pieces of code, I refer to Racket syntax as s-expr-stx to distinguish it from syntax based on other representations. For example, we could imagine hygienic reader macros that transform string-stx or input-port-stx, which would similarly associate locations, scope sets, and syntax properties with parts of the text.

2021-3-8 22:01:04

(In Punctaffy, I’m using representations that are in general more structured than s-expressions are.)

2021-3-8 22:05:09

That makes sense. There could probably be a similar thing in a Honu-like system to distinguish the post-read but pre-enforestation token stream syntax objects

2021-3-8 22:07:13

Some part of the expander must “look into” vectors, prefab structs, hashes, and boxes too; I think these are normalized into immutable versions. The most stable way I’ve found to preserve a reference in 3D syntax is to put it into a lambda that returns it.
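
A rough sketch of the traversal and the lambda trick, observed through `datum->syntax` rather than the expander itself. The structs `p` and `o` and every other name here are hypothetical. It checks which containers get their contents converted to syntax, and that a mutable vector loses its identity when passed directly but keeps it when hidden behind a thunk. Whether the converted containers are immutable is not asserted here.

```racket
#lang racket/base

(struct p (x) #:prefab) ; prefab struct: its field is converted
(struct o (x))          ; opaque struct: wrapped whole, field untouched

(define vec-e (syntax-e (datum->syntax #f (vector 'a 'b))))
(define box-e (syntax-e (datum->syntax #f (box 'a))))
(define pre-e (syntax-e (datum->syntax #f (p 'a))))
(define opq-e (syntax-e (datum->syntax #f (o 'a))))

(printf "vector element is syntax: ~a\n" (syntax? (vector-ref vec-e 0)))
(printf "box content is syntax: ~a\n" (syntax? (unbox box-e)))
(printf "prefab field is syntax: ~a\n" (syntax? (p-x pre-e)))
(printf "opaque field is syntax: ~a\n" (syntax? (o-x opq-e)))

;; Preserving a reference: a vector passed directly is rebuilt,
;; while a procedure is wrapped as-is and can hand the original back.
(define mv (vector 1 2 3))
(define direct (syntax-e (datum->syntax #f mv)))
(define via-thunk ((syntax-e (datum->syntax #f (lambda () mv)))))

(printf "direct keeps identity: ~a\n" (eq? mv direct))
(printf "thunk keeps identity: ~a\n" (eq? mv via-thunk))
```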

2021-3-8 22:09:19

Anyway, all this is to say most things are s-expressions, and syntax objects feel to me like a variation of s-expressions that often warrants using the term “s-expression” to explain it.

2021-3-8 22:11:30

I usually describe them as “like s-expressions but with metadata for tracking things like scope, source locations, etc.” since “s-expression” almost always means code-as-plain-data to people.
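
To make the "s-expression plus metadata" description concrete, here is a small sketch. The file name, the source-location numbers, and the `'note` property key are invented for the example. It attaches a source location and a syntax property to a datum, reads them back, and shows that `syntax->datum` returns only the plain data.

```racket
#lang racket/base

;; Source location as a list: source, line, column, position, span.
(define stx (datum->syntax #f '(+ 1 2) (list "example.rkt" 3 0 42 7)))

;; Syntax properties are another kind of metadata carried alongside the datum.
(define tagged (syntax-property stx 'note "hand-written"))

(printf "source: ~a, line: ~a, column: ~a\n"
        (syntax-source tagged) (syntax-line tagged) (syntax-column tagged))
(printf "property: ~a\n" (syntax-property tagged 'note))

;; Stripping the metadata leaves the plain s-expression.
(define plain (syntax->datum tagged))
(printf "datum: ~s\n" plain)
```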

2021-3-8 22:13:08

I think I like the idea of there being s-expression syntax objects and s-expression datums (things that don’t bother carrying the metadata). So there could be a syntax->datum for Racket’s primary syntax representation (s-expressions), and other representations would have things like input-port-syntax->datum.

2021-3-8 22:15:35

But… that doesn’t mean it’s less confusing, I dunno. “Datum” is a term that is given meaning in relation to Racket syntax, but it probably isn’t one people would ever use that way if they weren’t in the context of Racket.

2021-3-8 22:17:13

Yeah I think “datum” is terminology that could easily be done without

2021-3-8 22:22:34

Come to think of it, the representations I’ve used so far are more like “hyperbracketed (s-expression syntax objects)” than "(hyperbracketed s-expression) syntax objects," so I might not actually need to distinguish “syntax objects” from “s-expression syntax objects” for this purpose. But I do anticipate "(hyperbracketed s-expression) syntax objects" coming up someday.

2021-3-8 22:24:08

hmm… how about changing syntax->datum to s-expression-remove-marginalia or something? :smile:

2021-3-8 22:26:51

“marginalia” :laughing:

2021-3-9 01:54:47

I’ve been thinking, syntax objects built from s-exprs are great for manipulating user-facing syntax, but they’re not as good for a compiler, which would benefit from something similar, but more structured.

2021-3-9 01:55:51

A fully expanded program looks very structured to me

2021-3-9 02:20:21

Note that serializable here isn’t in the sense of prop:serializable

2021-3-9 02:21:45

There are lots of considerations for a compiler IR, but syntax objects aren’t that

2021-3-9 02:24:13

Would it be possible to have a library for building/using compiler IRs for use in macros, or are IRs too domain specific to abstract over like that?

2021-3-9 02:51:22

what is it in the sense of?

2021-3-9 02:52:25

I don’t think there’s anything else that corresponds to “can be serialized in bytecode” other than that itself

2021-3-9 02:53:15

is there some list of bytecode-serializable types in the docs somewhere?

2021-3-9 02:53:27

No, I don’t think so

2021-3-9 02:54:36

@kellysmith12.21 Yeah, I think there could be use cases for that. Lots of complex macros “fully” expand code to some alternate set of core forms and then process those forms somehow.

2021-3-9 02:54:49

It might just be “things traversed by datum->syntax” plus symbols and anything with a read syntax

2021-3-9 02:55:07

anything with a read syntax?

2021-3-9 02:55:25

oh you mean like

2021-3-9 02:55:31

anything with a reader notation for it

2021-3-9 02:55:32

Strings, booleans, regexps, etc

2021-3-9 02:55:37

not read-syntax

2021-3-9 02:56:31

Note that uninterned symbols are a special case which is why I listed symbols separately

2021-3-9 02:57:23

:thought_balloon: reader notation for “module path + symbol naming a deserializer combined with bytes for some prop:serializable object”

2021-3-9 02:59:03

I don’t think there are any macros that really use an IR in that sense without being a full compiler sort of glued on to the macro system (like the JavaScript package)

2021-3-9 03:00:14

I figured it could be useful in cases like match, which is effectively a tiny compiler, or when embedding a DSL or building a #lang.

2021-3-9 03:46:58

match does have an IR in some sense, but the syntax-object-ness doesn’t come up much there