
Hi,
Would anyone be interested in an informal Racket meetup in London?
Any preference for date/time?
For a location I’m thinking about ‘The Microsoft Reactor London – 70 Wilson St, London EC2A 2DB’ as it appears to be free and relatively central.
https://developer.microsoft.com/en-us/reactor/Form/5
Kind regards,
Stephen

@spdegabrielle Yeah why not! I live in the north of France so I can easily get there (if the train tickets are not so expensive at the time). I don’t know about the date. How many people are you planning to host?

Depends on how many are interested.

@mflatt @samth I am trying to get CI done for cross compilation/testing. I have a script doing this automatically for arch X (where X is a variable). I am setting up jobs and wondering which archs to test. s390x seems to be one supported arch. Which archs are officially supported? x86_64, arm, s390x and I think ppc?

All architectures are supposed to be supported for traditional Racket, but that seems like a fine set if you add i386. All of those have JIT support except s390x.

OK, so for s390x I need --disable-jit, for ppc I need --disable-places --disable-futures. I will give these a try. Thanks.

And for completeness, CS is only supported in x86_64 for now?

Just recalled that mips is supported in traditional racket too, right?

Yes, mips. You don’t have to use --disable-jit or other flags; the configure script is supposed to figure out which things are supported. CS should work for i386. (It should also work for arm and ppc, eventually, but there are currently no pre-built Chez Scheme boot files.)

Here’s a question about Typed Racket and calling functions generated by a macro.
I’ve got a macro, currently in untyped Racket, that “compiles” a little DSL that lets you define nodes in a graph as illustrated by tagclass below.
The node class being defined is called needs and it takes one argument, need-type. Attributes of the node, like name, are defined in terms of expressions that can include that argument. The applies-to clause adds further arguments, indicating what nodes of this class should link to and how.

The current version of the macro generates functions to evaluate those expressions. The number of arguments varies according to the node class, of course. The line marked “THIS LINE HERE” calls those functions. The function shown below is called by a function called make-node to make a new node of a given class. You pass make-node the arguments for the node class (which vary by node class) like this: (make-node g 'needs 'source). To make, say, a node class without arguments, you would write (make-node g 'plus).
My question: How can I call these generated functions so that Typed Racket won’t complain?
I expect complaints because of the varying function signatures in the generated code. Better yet, is there a (not too hard) way to make Typed Racket complain if I try to instantiate a node class with inappropriate arguments?

OK. I will trigger the initial builds and then we’ll see what to add. Thanks for the information.

@bkovitz I think the answer to your question depends on what f can be in apply-f if it’s not a procedure. Can it be anything?

Currently it can be anything. But since I’m now overhauling the whole system as I move it to Typed Racket, I could change that or anything else.

I’ll give two answers: one that just focuses on the specific sort of thing you do in apply-f and one that takes the rest of the context you gave into account.

The problem with functions like apply-f in Typed Racket is that (procedure? f) succeeding tells the typechecker very little about f. It knows it’s a function, but it has no idea how that function can be called. It could demand no arguments, it could demand exactly 12 arguments, it could demand keyword arguments. TR just can’t know. So even if (procedure? f) produces #t, absent other information, the typechecker won’t let you call f at all, under any circumstances.
But this is only true if the typechecker doesn’t know anything about f already. If f’s type is not Any, but is instead something like (U (-> String Integer) Integer), then if (procedure? f) returns #t, TR can be certain that f isn’t an Integer, so it knows it must have type (-> String Integer). In that case, you can invoke the function if you take that branch.
So: procedure? is useless on something that you don’t know very much about statically. But it could be useful if you can give it a more constrained type.
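
To make the second case concrete, here is a minimal sketch of the narrowing described above. The body of apply-f is an assumption for illustration (the original apply-f isn’t shown in this log); only the union type is taken from the message.

```racket
#lang typed/racket

;; f is either a String -> Integer function or a plain Integer.
(: apply-f (-> (U (-> String Integer) Integer) Integer))
(define (apply-f f)
  (if (procedure? f)
      (f "hello")   ; in this branch f : (-> String Integer)
      f))           ; in this branch f : Integer

;; If the argument type were Any instead, (procedure? f) would only
;; narrow f to Procedure, and the call (f "hello") would be rejected.
```

With this type, (apply-f string-length) and (apply-f 7) should both typecheck, since the union covers each argument.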

Now, the bigger-picture answer. It seems like it might be easier to make all this machinery work if you restructure things in a more significant way besides just changing something local to apply-class-attr. I am not totally sure I understand what evaluating a tagclass form does (does it insert an entry in a global hash table? does it evaluate to some value? does it define some variable?), but your description of make-node would seem to imply that whatever tagclass does, the information is eventually looked up symbolically, perhaps being stored in a hash table somewhere.
That seems hard to statically type. Typed Racket doesn’t support heterogeneous hash tables (i.e. hash tables used as records instead of as dictionaries), so you can’t have a hash table where the values associated with specific keys have different types. This means your hash table would have to have some very vague type like (HashTable Symbol Tagclass-Info), and it wouldn’t be able to encapsulate how certain keys in the hash table are associated with tagclasses that take different types of arguments.
A better approach, more conducive to static typing, would probably be to skip the hash table and the symbolic lookup, and instead make the tagclass form produce a value or definition with a rich type. So, for example, you could make (tagclass (needs [need-type : Foo]) ....) actually expand to (define (needs [need-type : Foo]) : Tagclass ....), and you could pass the needs function directly to make-node, instead of passing the symbol 'needs. If you need more information than just a function, you could make needs a struct that combines a function with some extra information.
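
A rough sketch of what such an expansion could look like, written by hand rather than generated by a macro. The node-attrs struct, its fields, and the example-tags list are hypothetical names for illustration, not part of the actual DSL.

```racket
#lang typed/racket

;; Hypothetical record of what one instantiation of a node class carries.
(struct node-attrs ([kind : Symbol]
                    [name : String]
                    [args : (Listof Any)])
  #:transparent)

;; A tag constructor that takes one argument. Calling it with zero or
;; two arguments is now a compile-time type error.
(: needs (-> Symbol node-attrs))
(define (needs need-type)
  (node-attrs 'needs (format "needs-~a" need-type) (list need-type)))

;; A node class without arguments is a zero-argument constructor.
(: plus (-> node-attrs))
(define (plus)
  (node-attrs 'plus "+" '()))

;; Fully applied constructors all have one type, so they fit in a
;; homogeneous list with no symbolic lookup involved.
(: example-tags (Listof node-attrs))
(define example-tags (list (needs 'source) (needs 'result) (plus)))
```

The point of the sketch is only that each constructor gets its own distinct function type, which a single hash table type could not express.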

Your guess is right: the node classes are stored as structs and looked up in a hash table somewhere.

I think you’ll have a much easier time if you avoid the indirection through the hash table and instead define variables (which can each have distinct types) and refer to those directly. Is there a reason you need the indirection?

No particular reason; it’s just what I hacked out as a semi-beginning Racket coder. Well, one reason: I’ve got code like this, which decides at run-time which class of node to make. Sometimes I want to say, “Of these tag classes, which could validly be instantiated and linked to these nodes?”

I’m trying to understand, but I’m missing some context: does that code select a class dynamically from the hash table, or does it just iterate through the predefined list problem-tags? If it’s the latter, I think you could just define problem-tags to be (list (needs 'source) (needs 'result)). If it’s the former, it’s much more interesting.

I definitely like the idea of generating typed functions inside the macro. It’s a little scary, since as you rightly just guessed, some of the main code does indirection on node classes, so that would need to be rewritten from scratch. But I don’t think there’s too much of that code. And really, I’d like to move that code inside the DSL; I just haven’t figured out a way yet.

If you legitimately use the indirection for some purpose, I think there are ways of encoding that in a way that TR would be happy about. But if you don’t need the indirection, then it’s probably much easier to just get rid of it.

Here’s the function that looks through problem-tags. It’s the second way, the non-interesting way, which is likely a good thing.

OK, this might be the idea that I needed to hear: just get rid of the indirection. I’ve been taking the need for indirection so much for granted, I’ve never really examined whether I can just do without it.

I’ve been trying to write some code for “auto-tagging”—generically saying “If any tags could apply to these nodes, build them and link them.” That sounds like it needs indirection, but now I wonder if the macro could just generate code to do that.

Well, here’s another way to think about things: when you are choosing a specific tag, whether via the indirection or not, you know in your head exactly what arguments it needs to be supplied with. For example, when you write (make-node g 'needs 'source), you know that 'needs is a class that takes one argument, so you give it one.
If you were instead iterating through the hash table of tags without a specific tag already in mind, now you have a problem, even in untyped Racket: you don’t know how to actually build the node class because you don’t know anything about which arguments should be supplied. For any particular key, value pair, the value might need two arguments, four, or none at all. And since your code is just looping through all of them, it can’t even know how many arguments to provide, much less which arguments they should be.
TR is not quite as “smart” as you are, since you know that (hash-ref all-the-tags 'needs) will give you something that needs one argument (a symbol), but TR doesn’t. On the other hand, TR’s “stupidity” is actually somewhat insightful here, from another point of view: it rightfully points out that, when you view the hash table as a dictionary mapping arbitrary keys to arbitrary values, the values are pretty useless. You have no way of using them without some kind of extra, prior knowledge.
The “auto-tagging” function you describe would presumably have to dig through the hash table of tags to find “applicable” tags, but if some of those tags take arguments, your “auto-tagger” would be screwed, even in untyped Racket. On the other hand, your problem-tags list provides precisely the right information: it couples tags with their arguments, making it possible for some tool to understand how to get the actual node class from the tag.
This suggests a slightly different way of viewing things: tags that take arguments aren’t really tags at all, they’re tag constructors, aka functions that produce tags. So if you built a hash table full of tags, without the tag constructors (or, perhaps more usefully, with the tag constructors already applied to produce tags), you could do interesting things with it. But as you are right now, your hash table contains a jumble of tags and tag constructors, and that just isn’t very helpful.

I agree: the “stupidity” would be useful. I’ve had a couple long, painful debugging sessions due to passing wrong arguments to tag constructors. Node classes inherit from other node classes, as in OO, and currently there’s nothing to ensure that “super-class” constructors get called with the correct arguments.
BTW, what sort of thing did you have in mind as an “interesting thing” that I could do if I structured the hash table differently?

One other thing: the main idea here is to make it easy to try out different ideas for tagging, responding to tags, etc. There’s no one exact thing, like find-node-with-problems, that I’m trying to implement. I’m hoping to be able to just say, in the DSL, “Here are some node classes, let’s see what happens!”, rather than hard-coding too much in plain Racket. IOW, this program overall is a tool for experimenting with graphs whose contents tell how to modify themselves: a wild, mostly unexplored domain, which I’m writing this program to explore.

Well, the “auto-tagging” application is interesting, and that’s one that could be solved in a simple enough way by what I just described: build a list or hash table of tags that have already been fully-applied to arguments, which is what your problem-tags list is already morally doing. If you were doing something even more interesting, though, where you had a scheme by which arguments to tags could somehow be inferred, at runtime, and your code was selecting tag constructors and somehow automatically figuring out how to plug their arguments, you’d need to do something fancier. But it’s always possible to get fancier in some direction or another, so that isn’t saying much. :)

I do understand not wanting to hardcode too much. I think skipping the indirection doesn’t cost you anything there, though, since my above point was basically that it wasn’t actually getting you any expressive power over just having direct variable references.

Ah, now I see: so the “tags” that you have in mind wouldn’t be actual nodes in the graph, but complete sets of attributes that could be put into new nodes?
The need for “tags” that aren’t really nodes turned into a headache in the first version of the program. I often want to ask the question “Could a tag of this class apply to these nodes?” The current system, where attributes are stored in the nodeclass struct as functions if you need to call them to get the real attribute value, led to … you guessed it, some painful debugging sessions.

I see… I’m admittedly a little confused about the nomenclature, so I might be mixing concepts up. (Is a “tag” different from a “tag class”, and is that different from a “node class”? Is a “node” different from a “tag”? I don’t actually really know.)

Yes: a “tag” is just a kind of “node” (one that’s meant to describe other nodes); a “tag class” is just a kind of “node class”, and the “classes” are unchanging, abstract definitions whereas actual “tags” and “nodes” are elements of the graph, which can be removed or have their attributes modified as the program runs.

Excellent point that indirection actually hasn’t been helping. Now that I think about it, the main value of the indirection has been convenience in writing code to add nodes to graphs. Here’s a crude example:

What is a “node class”, and how is a “tag class” different from a generic “node class”?

A “node class” is just a definition of a category of node: what attributes it contains, what “ports” it has, and how it’s supposed to link to other nodes. The difference between a “node class” and a “tag class” is mostly their purpose: the latter describe the former. For example, you might have two nodes of class number, like (number 4) and (number 17), and they might be “tagged” by a node of class greater-than, indicating that the program “noticed” that relationship, and possibly triggering actions based on that relationship.
There is one “real” difference between a tag and a node, but it’s not a big deal: a tag gets automatically removed when the nodes that it tags are removed. For example, if (number 4) got removed, then the greater-than tag would also get removed, but if the greater-than got removed, the (number 4) would stay in the graph.

Okay… so do you have a nodeclass form that is basically the same as a tagclass form except that the latter generates “tag nodes” instead of ordinary nodes?

Hmm, there is one other place that uses indirection, and it’s actually important, even though its only function is convenience. There’s a little DSL that lets you tell how to make a graph without writing lots of explicit calls to make-node. The code is an interpreter for lists that look like this:

Re nodeclass/tagclass: yep. nodeclass-body is the macro that does all the work. (I think this is a bit ugly. As I’m moving to Typed Racket, I’m currently rewriting this so there is only one main macro, farg-model-spec, which does everything.)

It seems like you could make that DSL work without the indirection by just doing something like
    `(:in slipnet
       (:in (equation)
         (:let ([15 ,(number 15)]
                [+ ,(+)]
                [9 ,(number 9)]
                [6 ,(number 6)])
           (:edge 15 source + result)
           (:edge 9 result + operands))))
though I’m not sure, since I don’t totally understand all the features of the DSL. If the quoting/unquoting became too unwieldy, it might be easier to just do it with a macro, or set of macros.

And yeah, that makes sense.

I think what I said above is still true, though maybe now I can state it more precisely and not confuse terms. When you define a node class or tag class with arguments, you aren’t defining a node or a tag. You’re defining a node or tag constructor, a function that evaluates to a node. In fact, you might even be able to get away with skipping argument support in nodes/tags entirely and just writing:
    (define (needs need-type)
      (tagclass needs ....))
I.e. literally just writing functions that produce tags/nodes when invoked.

About ,(number 15), that brings up a source of messiness in the program. Maybe you know of a way to avoid this. When you make a new node, you get two values back: the new node’s id, and the updated graph. So, (number 15) would have to return two values, messing up the use of unquoting. Much of the code right now is long let*-values statements like the one a few code snippets above.

Yeah, I think viewing the class definitions as constructor definitions makes a lot of sense.

I see, yes. It does seem like maybe it would be better to do this with a macro rather than a runtime interpreter. It’s what macros are for, after all. :) Beside the obvious advantages of being able to do the work at compile-time and being able to reuse all of Racket’s macro-writing machinery, you get the very potent advantage of not having to reimplement scope yourself, since the macro system will handle that for you.

It seems like you could probably define :in, :let, and :edge as macros. Or maybe :edge wouldn’t be its own macro but would be part of the syntax of :let, I don’t know.

Good point about not reimplementing scope! That was a bit of a nuisance, though I was able to exploit it to do stuff like make :in cause all the new nodes created inside it link automatically as “members” of the outer node.

Yes, I don’t know the details about all the things your DSL does, but I’d bet it’d be possible to replicate all the functionality and convenience via a macro or set of cooperating macros, and that’d definitely play nicer with Typed Racket. But it might involve some more advanced macro machinery to do everything the same way.

Regarding :edge, the current version is pretty ugly. In an old version of that DSL, written in Clojure, I had it look up the “port classes” (oh no, another kind of class! nah, it’s really not so bad) in the node-class definitions, and infer which ports were appropriate to link by. So, you could just say (+ -> 15) and it would infer that the edge must go from port (+ result) to port (15 source).

So, that’s yet another place where code needs to “reflect” on the class definitions—a sort of indirection.

The more I’ve worked on this, and the more I’ve learned about syntax/parse, the more I’ve come to think that it would be a big improvement to add more to the DSL. That could simplify the plain-Racket code quite a bit. It may be a little beyond my current syntax/parse-fu, but I’m learning…

Is that a reflection on the class definitions, or a reflection on the nodes themselves? It sounded to me from what you said that the “classes” are just the syntax, they don’t exist at runtime (and at runtime there are only nodes).

It’s a reflection on the class definitions. At run-time, in the current implementation, the classes still exist as structs that can be queried for various things, like “What port-labels do you link by?”, “Does this class inherit from you?”, etc.

Right now I’m thinking that it would be better to just make the big macro generate functions to answer those questions.

I see. Yes, I think that’s right. So a class definition does exist as a separate thing from an “instantiation” of the class?

Yes.

Yes, it seems like then it would make sense to define variables bound to the class definitions, then make the macro expand to functions that inspect the class definitions at runtime to do the appropriate things. The macro can probably handle things like binding in :let at compile-time, though (although it’s certainly possible that :let could have extra runtime effects, too).

In fact, the nodes themselves (the “instantiations”) are not represented in one place, like as a struct. There is a big struct for the graph as a whole, which contains hash tables for node ids, (,node-id ,port-label) -> incident ports, and looking up a node’s attributes by the node’s id and the attribute name.

@soegaard2 i made a little dataflow language similar to what you were asking about: https://pkgd.racket-lang.org/pkgn/package/datacell
(I literally wrote it this morning, no guarantees that it’s actually good :P)

@florence cool!

The example looks just right.

I can’t imagine where I got that example from…. :stuck_out_tongue:

And you even check that there are no cycles between cells.

What’s the idea behind the continuation marks?

the idea is to do the cycle detection dynamically: if we ever try to evaluate a cell while evaluating that same cell, we ran across a cycle

to tell if we are currently evaluating some cell we can look at the continuation marks and see if the stack is marked by that cell’s identity
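
For anyone following along, here is a small sketch of that idea. It is not datacell’s actual code; the cell struct, make-cell, cell-value, and evaluating-key are names made up for illustration, and only the fieldless identity struct comes from the discussion.

```racket
#lang racket/base

;; Fieldless struct: every instance is a fresh value, compared with eq?.
(struct identity ())

;; A cell pairs an identity with a thunk that computes its value.
(struct cell (id thunk))
(define (make-cell thunk) (cell (identity) thunk))

;; Key under which the identities of in-progress cells are recorded.
(define evaluating-key (make-continuation-mark-key 'evaluating))

(define (cell-value c)
  (define in-progress
    (continuation-mark-set->list (current-continuation-marks)
                                 evaluating-key))
  ;; Seeing our own identity on the stack means we depend on ourselves.
  (when (memq (cell-id c) in-progress)
    (error 'cell-value "cycle detected"))
  ;; Binding the result keeps with-continuation-mark out of tail
  ;; position, so a nested evaluation adds a mark instead of replacing
  ;; the one installed by its caller.
  (let ([v (with-continuation-mark evaluating-key (cell-id c)
             ((cell-thunk c)))])
    v))

;; An acyclic pair of cells.
(define one (make-cell (lambda () 1)))
(define two (make-cell (lambda () (+ 1 (cell-value one)))))

;; Two cells that depend on each other.
(define a (make-cell (lambda () (+ 1 (cell-value b)))))
(define b (make-cell (lambda () (* 2 (cell-value a)))))
```

Evaluating (cell-value two) works normally, while (cell-value a) raises the "cycle detected" error once evaluation comes back around to a.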

Oooh! That’s clever.

I think promises do basically the same thing

I think I got confused by the name identity.

I was thinking of the identity function.

oooh… yeah no I just wanted a separate notion of eq in case the cells get chaperoned some day

because like contracts are good, right?

But rereading, I see identity is bound two lines above its use…

right, it’s just a fieldless struct stored in the cell

that’s used to track eq?-ness since the cell identity is generated with each cell

(also i was wrong, looks like promises use semaphores and thread-eqness to track re-entrance, so that it’s thread safe. which i didn’t even think of… meh sounds like a later problem)

I like your implementation. It’s a nice example of something that’s hard to do without macros.

So I had a first pass that didn’t need macros actually

Basically cells were built out of constants and functions could be lifted to work on cells, but that felt kinda ugly to me (and also lifting involved make-keyword-procedure which always feels gross to me)

It also meant that cells’ values had to be extracted by function, which meant that it was possible to accidentally lose track of a dependency, which isn’t great

(so… thank you for giving me an excuse to avoid my real work/Agda for a morning and work in racket for a while :P)

:wink:

@lexi.lambda Thanks for your time and advice. The last time you helped me out, you helped me get the first version of the macro for this DSL working when I was completely stumped about how to use syntax/parse. Thinking about what you said today, it’s hit me that the node-constructor function shouldn’t take the graph as an argument nor return an updated graph. It should just return a hash table of the node’s attributes except for the node’s id. That can then get passed to the function that makes the new node (i.e. puts it in the graph) and assigns it an id. Failure to keep those ideas separate has probably cost the program some unnecessary complexity and bug-proneness up to this point. In other words, just change from (make-node g 'number 17) to (make-node g (number 17)). I’m still not sure how inference of the right port-labels for an edge could be done, but that can wait.
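
A minimal sketch of that separation, assuming a toy graph representation. The graph struct, empty-graph, and the attribute keys are hypothetical, not the real program’s data structures.

```racket
#lang racket/base

;; Hypothetical graph: the next free id plus a table from id to attributes.
(struct graph (next-id nodes) #:transparent)
(define empty-graph (graph 0 (hash)))

;; Node constructor: knows nothing about graphs or ids, it just
;; returns the node's attributes.
(define (number n)
  (hash 'class 'number 'value n))

;; make-node puts the attributes into the graph and assigns an id.
;; It returns two values: the new node's id and the updated graph.
(define (make-node g attrs)
  (define id (graph-next-id g))
  (values id
          (graph (add1 id) (hash-set (graph-nodes g) id attrs))))

(define-values (id17 g1) (make-node empty-graph (number 17)))
```

Since (number 17) now returns a single value, it can also be used with unquote in the quasiquoted graph DSL without the two-values problem.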

@florence oh hey your identity struct is the same as my generative-token struct: https://docs.racket-lang.org/rebellion/Generative_Tokens.html

There is something similar too in reactor using signal-name, although that’s using uninterned symbols since I also want the actual name there:
https://github.com/florence/reactor/blob/master/data.rkt#L46

Does anyone know why the variables defined in the outer begin don’t get defined? The functions defined in the inner begin do get defined.

Most likely they are defined - but maybe in a different scope than you are expecting?

Any idea how I could track down that scope?

The context is missing. Is this the body of a define-syntax?

Here’s a little test that invokes the macro. The last line gets hey: unbound identifier in: hey. (number 17) works fine.

Yes.

Even this gets hey: unbound identifier in: hey.

The hey in (define hey 'h) needs to have the same scope as stx.

Wrap your #` with a with-syntax.

Ohhhhh…

(with-syntax ([hey (format-id stx "hey")]) …your syntax…)

Is with-syntax the idiomatic way to do it with syntax/parse or does syntax/parse have its own natural way, maybe involving a keyword like #:with?

So, another way to fix this is to have things like hey and ht/class->is-a be supplied by the user/invoker of the macro. So maybe the spec macro takes these from the user. When a macro defines things, and starts introducing names, that’s often friendlier. It’s usually OK to bend the rules and form identifiers using format-id that add to a name supplied by the macro user. Much like how struct and define-struct take the struct name from you, then make accessors from that base name.
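
A sketch of that rule of thumb: the user supplies a base name and the macro derives its identifiers from it. define-spec and the -classes suffix are made up for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base racket/syntax syntax/parse))

;; (define-spec name) defines name-classes. The derived identifier takes
;; its lexical context from the user-supplied name, so it is visible
;; wherever the user wrote that name.
(define-syntax (define-spec stx)
  (syntax-parse stx
    [(_ name:id)
     #:with table-id (format-id #'name "~a-classes" #'name)
     #'(define table-id (make-hash))]))

(define-spec farg)
(hash-set! farg-classes 'number 'some-class-info)
```

This is the same convention struct uses for accessors: nothing appears in scope that doesn’t visibly start with a name the user chose.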

@bkovitz There’s a #:with keyword, and I usually write syntax/parse macros using define-simple-macro, #:with, and #:do instead of directly calling syntax-parse or using with-syntax

Indeed I don’t like defining a bunch of magic variables. This was a quick test to see if I could generate the hash table. On the other hand, I also don’t want to burden the user (me) with having to supply a list of all the various things like this that are going to be defined.

(define-simple-macro (dtest)
  #:with hey (some expr returning a syntax object ...)
  (begin (define hey 'h)))
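
One way to fill in that placeholder, as a sketch: use format-id with this-syntax (the form being matched) as the lexical context, so hey ends up in the scope of the macro use. Using this-syntax here is my assumption about what fits; inside a plain define-syntax you would pass stx instead.

```racket
#lang racket/base
(require syntax/parse/define
         (for-syntax racket/base racket/syntax syntax/parse))

;; hey gets the lexical context of the (dtest) form itself, so the
;; definition is visible to code at the macro's use site.
(define-simple-macro (dtest)
  #:with hey (format-id this-syntax "hey")
  (begin (define hey 'h)))

(dtest)
```

After (dtest), a reference to hey in the same module should evaluate to 'h instead of raising an unbound identifier error.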

@bkovitz Fair enough. I just learned that rule of thumb, and so for me the lazy thing to do is follow that rule of thumb, rather than worry too much about scope if possible. :smile:

Oh, I like that rule of thumb, though! I’ll keep it in mind as I try to finish this thing today.

I’m all for simple rules of thumb and standard programming idioms so you don’t have to consider lots of alternatives, you can just code!

I really wish the standard racket libraries had a nice wrapper around format-id that took care of the whole 'sub-range-binders thing

@notjack Indeed I wrote my own wrapper like that once. Today I totally forgot about the need for format-id.

The with-syntax and syntax (aka #') forms work everywhere. They’re independent of syntax-parse, syntax-case etc. They even work in Scheme.

There’s a couple of wrappers in the package catalog too I think

It’s good advice not to introduce a new identifier into an outer scope though.

Very much not in the standard library (yet), but I have an implementation of such a function here: https://github.com/lexi-lambda/mini-ml/blob/e8e0b62d294293bce343c77ac21c1ecb69d6f6a9/mini-ml-lib/mini-ml/private/util/syntax/misc.rkt#L143-L151

Maybe it should be put somewhere.

Doesn’t #:source (of format-id) handle sub-ranges?

@soegaard2 Thanks. I’m just seeking one of those standard, idiomatic things so the code is easy to read (and write). I’m going to try it with #:with and if that doesn’t work, go back to with-syntax. (A little grepping shows that I’ve been doing with-syntax + format-id throughout other code. Weird that I forgot about it today!)

Ah, maybe it would be better to have the spec macro return that hash table, or some struct that contains all the stuff that it needs to generate, apart from the various functions (named by the user) inside the spec.

@soegaard2 I don’t think so, I think that’s just the source location of the whole identifier. Not for any specific portions of the identifier.

I see. Why not extend format-id then? No need to have a wrapper if we can avoid it.

a list of everyone’s implementations of this would be a compelling argument for a racket/syntax PR :p

It worked with #:with and format-id. Hooray! Thanks for saving me an hour of stupidly trying stuff until it hit me. :wink:

@soegaard2 That would also be fine with me, and probably won’t break anything

@greg This requires passing an additional argument to the function, but at least there’s no mess of “magic” or “implicit” identifiers being defined.

@lexi.lambda Put in format-id using a new keyword.

Hmm, (define sp (spec . . .)) can’t define functions in addition to returning a FARGishSpec struct. Looks like it’ll have to be (define-fargish-spec specname (nodeclass blah blah . . .) . . .).

@soegaard2 Doing it in format-id would be more work, since you’d have to parse the pattern to figure out where the insertions begin and end. format-id currently just ultimately turns into format, but making it do the smarter thing would require doing more itself.

Still, it’s probably the right thing to do.

something like this? (format-id #'foo "~a-~a" #'foo #'bar #:sub-range-binders? #t)

Now it works without #:with or format-id. Hooray! Thanks again, this time for guiding me along a saner design path.

I wonder if it would cause any actual backwards-compatibility problems to make #:sub-range-binders? default to #t. It would probably make some existing programs magically more cooperative with Check Syntax. But maybe that’s too magical and not worth it.

I need to read up on sub-range-binders.

@soegaard2 sub-range-binders is what lets posn-x have different binding arrows for the posn and x parts.

Maybe it can default to #f initially and then change to defaulting to #t after a release or two.

And it’s what lets DrRacket’s “renaming” tool work on struct fields

if you rename the x field of a posn struct to foo, drracket will rename posn-x to posn-foo
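
For reference, here is roughly what attaching the property by hand looks like, adapted from memory of the example in the Check Syntax documentation, so treat the exact vector layout (new identifier, start, span, then original identifier, start, span) as an assumption to double-check. define/hyphen is just an illustrative macro name.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; (define/hyphen posn x 3) defines posn-x and records which character
;; ranges of posn-x came from posn and which from x.
(define-syntax (define/hyphen stx)
  (syntax-parse stx
    [(_ id1:id id2:id rhs:expr)
     (define first-part (symbol->string (syntax-e #'id1)))
     (define second-part (symbol->string (syntax-e #'id2)))
     (define first-len (string-length first-part))
     (define second-len (string-length second-part))
     (define hyphenated-id
       (datum->syntax #'id1
                      (string->symbol
                       (string-append first-part "-" second-part))))
     (syntax-property
      #`(define #,hyphenated-id rhs)
      'sub-range-binders
      (list
       ;; characters [0, first-len) of the new id come from id1
       (vector (syntax-local-introduce hyphenated-id) 0 first-len
               (syntax-local-introduce #'id1) 0 first-len)
       ;; the characters after the hyphen come from id2
       (vector (syntax-local-introduce hyphenated-id)
               (+ first-len 1) second-len
               (syntax-local-introduce #'id2) 0 second-len)))]))

(define/hyphen posn x 3)
```

The offset bookkeeping is the part a format-id wrapper would have to compute from the format string, which is why doing it automatically takes more than a call to format.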

Yes. I just can’t remember the details. For example, is it possible for format-id to infer from "~a-~a" what to do? In other situations it might be "~a/~a" or something different.

I begin to see what you meant with “it is more work in format-id”.

For simple cases it might not be too bad though.

in my utils package I’ll probably just replace the whole format-id shebang with something where format strings are actual structured data and not just strings, so it’s much less magic how format-id works

It’s not too hard. It just involves manually expanding the format string.

what should (format-id #'foo "~a-~a" #'bar "not an identifier" #:sub-range-binders? #t) do?

Non-identifier arguments just don’t cause any entries in the 'sub-range-binders value.

which means (format-id #'foo "~a-~a" #'bar (expression-that-should-return-identifier-but-doesnt) #:sub-range-binders? #t) will be a silent bug

that will pretty much never get caught, since this is the kind of thing that people don’t test for (and it’s hard to test for)

I can’t imagine very many cases where a sub-range-binder is actually useful but someone would somehow accidentally return a stringified version.

If you ever see that happen in the wild, let me know. :)

it’s very common that I use format-id with map, and I’ve been known to get myself confused over what I’m mapping over or whether I’m getting a syntax object or a symbol as an argument

that would be a case where a lenient format-id would go unnoticed by me

….when does format-id get used with arguments that aren’t identifiers anyway?

I guess I could theoretically see people using syntax->datum instead of syntax->list and dropping properties. But I don’t really care enough to worry about it.

What’s far more likely is that people won’t specify #:props, the resulting syntax object won’t be syntax-original?, and Check Syntax will ignore it anyway.

oh yeah I didn’t even realize that would be necessary

so (format-id #'foo "~a-~a" #'foo #'bar #:sub-range-binders? #t) is broken on its own, it needs a #:props argument with…. what?

just copying the props of #'foo or #'bar doesn’t seem right

Good question. Syntax properties are goofy. I don’t think they’re actually a very good solution to very many things, but that’s not a fight for anytime soon.

But think about it another way: you’re copying lexical context from #'foo or #'bar. That doesn’t really seem right, either.

No the lexical context copying does seem right, since that’s how I introduce the identifier into the same scope as either #'foo or #'bar (like struct field accessors)

If a macro generates (struct foo (x)), and foo has totally different lexical context from x, what lexical context should foo-x be in? Who knows?

for field accessors it should be the context of the field name

Then you should copy the properties from the field name.

I don’t think that follows

IME, when using format-id, you usually want the lctx, srcloc, and props arguments to all be the same.
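
A minimal sketch of that convention, assuming a made-up define-getter macro (not something from this conversation): the field name is treated as the “primary” identifier, so it is passed as the lexical context, as #:source, and as #:props. Whether that is the right choice for every macro is exactly what is being debated here.

```racket
#lang racket/base
(require (for-syntax racket/base racket/syntax syntax/parse))

;; Hypothetical macro for illustration: (define-getter point x) defines point-x.
(define-syntax (define-getter stx)
  (syntax-parse stx
    [(_ prefix:id field:id)
     ;; `field` is the "primary" identifier here: it supplies the lexical
     ;; context, the source location, and the syntax properties.
     (define/with-syntax getter
       (format-id #'field "~a-~a" #'prefix #'field
                  #:source #'field
                  #:props #'field))
     #'(define (getter table) (hash-ref table 'field))]))

(define-getter point x)
(point-x (hash 'x 3 'y 4)) ; looks up the key 'x in the table
```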

yeah and that’s what I do by default but I don’t feel like I have a good enough argument for why I do that, and I don’t know if it actually works the way I want it to all the time. It just seems like the simplest thing to do that isn’t obviously broken somehow.

Think of it this way. When you write format-id, you’re usually generating an identifier derived from one or more other identifiers. Sometimes, it isn’t clear which of the identifiers is the “primary” identifier. That case is hard. But when you can designate one of them as a “primary” identifier, then you’re basically morally saying that the resulting identifier is the same identifier as the original one, the user just wrote it in an abbreviated form. So everything about the original identifier should be copied onto the new one; they’re the “same” identifier.

That’s definitely not what I want to say though, because sometimes I’m deriving multiple different identifiers from the same one. Like struct lenses. I want to say that these identifiers are all derived from this other one, and should have the same scope, but I don’t want to say that they’re the same because I don’t think of them all as the same identifier at all. Like, what if the field name already had a sub range binders property?

use case: macro for nested struct definitions that did something magic to generate field names by stringing together lots of other field names

Sure, that case is trickier. But keep in mind that just because the binding site has a sub-range-binders property doesn’t mean its uses do. It’s actually extremely unlikely for an identifier with a sub-range-binders property on it to ever show up in the input to a macro.

(there’s a couple of interface description languages that do this, most notably google’s protobuffer language)

In any case, I think if you’re doing something that fancy, you’re probably going to get weird behavior, anyway, with any automatic scheme.

So you might just have to be careful and do it by hand.

I think I want a list of what common syntax properties there are and to understand how they’re affected by copying onto derived ids like this

I only know of the ones used by check syntax

Yeah, me too. I think the fact that all syntax properties are automatically copied by the expander was probably a mistake, but it’s also unclear what a better solution is. So I don’t know. I don’t think syntax properties are “the right thing”, I think they’re sort of a hack, so I try not to worry about them that much.

They seem alright for use-site metadata. Anything related to the binding of the id though…. not so much. Like type information.

I think they work okay for things like 'disappeared-use, but not for lots of other kinds of metadata, like 'paren-shape.

Since it really doesn’t make very much sense for 'paren-shape to get copied onto the expansion of a macro.

Yeah 'paren-shape definitely doesn’t make sense

But that’s just the problem: syntax properties are used for a half dozen different things, but they’re a one-size-fits-all solution. Which doesn’t work.

But, to be fair, I doubt most of those use cases were known when syntax properties were first added.

Yup

I’m gonna make a gist listing syntax properties

@soegaard2 & @daniel Sorry, I was out sick for the past month, so no, they aren’t. :disappointed:

No worries.


looking at that list, 'mouse-over-tooltips jumps out at me as something I would plausibly want on a struct field name, an accessor, and a setter, and I’d want them all different

Is this just something that hasn’t been migrated to Typed Racket yet, or am I misusing ~or*?

A lot of syntax-parse internal functions just haven’t been put into Typed Racket yet

I see. Further REPLing suggests that the problem has something to do with literals.

Any suggestion for how to get around this?

I’m surprised that this doesn’t work. It makes me suspect that I’m probably making some elementary error rather than pushing syntax-parse beyond what’s been migrated to Typed Racket. Does anything look wrong there?

I hate to tell you this bad news, but no amount of require/typed will fix this. The normalize-context identifier produced by the syntax-parse macro doesn’t have a type, and require/typed (and unsafe-require/typed too) produces a new binding, which the output of syntax-parse will never refer to because it’s not in scope.
The only thing that might fix this is a super unsafe, undocumented, internal, super not-meant-to-be-used-by-users thing called #lang s-exp typed-racket/base-env/extra-env-lang.
It’s weird and dangerous, more dangerous than unsafe-require/typed/provide. You can think of it sort of like compile-time mutation on Typed Racket’s internal type table. It doesn’t just define a new identifier that might be unsafe to use like unsafe-require/typed/provide would; it mutates the table for existing identifiers without creating a new binding. So code X using it can make code Y unsafe even if Y has nothing to do with it.
So maybe someone should use that to provide types for internal syntax-parse functions, but it should be part of the official typed-racket-more package, reviewed for type safety, and it will have to update whenever the internal implementation details of syntax-parse update, not just when the interface updates.

OK, that does indeed sound scary, especially for a Typed Racket and syntax-parse newbie.

I’m now seeing if defining syntax-classes will work. That’s how, purely by chance, I’ve been doing it in other code so far.

Well, this works!

So define-syntax-class just avoids using the normalize-context identifier here?

Wait, when I try that I get a similar error about es-add-thing instead.

@bkovitz I think you probably want to use untyped Racket at phase 1 (that is, for all your compile-time code) and just use Typed Racket at phase 0.

I don’t think it’s theoretically impossible to use TR for phase 1 code, but I don’t think it’s easy, either.

Ohhhhhh, that might explain it. The error messages are all from me typing into the REPL. The code that worked was in a source file that I ran. The source file has #lang typed/racket, so the REPL does too. But as you say, phase 1 should be run with untyped Racket.

Yes… I think #lang typed/racket provides ordinary racket/base at phase 1?

If you write your code in the REPL under a begin-for-syntax, you might have better luck.

Yes. Or at least I’m pretty sure that’s what’s happening.

And thanks for the REPL suggestion! I’d been grooving on how easy it is to try even syntax-parse stuff at the REPL. OK, it’s not quite as easy as I imagined, but that’s still fine for quickly trying stuff out to see if I’m doing it right.

If you want a REPL that’s consistently at compile-time, you can start one with (begin-for-syntax (read-eval-print-loop)) or something like that:
> (begin-for-syntax (read-eval-print-loop))
> (define-syntax-class nodeclass-head
    #:datum-literals [nodeclass tagclass]
    #:attributes [tagclass?]
    (pattern nodeclass
      #:with tagclass? #'#f)
    (pattern tagclass
      #:with tagclass? #'#t))
> (syntax-parse #'nodeclass [x:nodeclass-head #'x.tagclass?])
.#<syntax:interactions from an unsaved editor::156 #f>

Wow, I had no idea that would work.

Excellent!

It seems to be maybe less than ideal in DrRacket.

Indeed when I’m experimenting like this, I usually try many, many variations until I get it right.

Ah, I see. It’s got that “eof” box. @alexknauth, are you running with something like XREPL?

No, I’m running it in DrRacket, I’ve just gotten used to working with that “eof” box in the way

I see. Life is filled with trade-offs. :wink:

Anyway, thanks. Now I’m no longer making workarounds for things that aren’t problems, and now I’ve got a way to experiment with syntax-parse in the REPL.

@alexknauth wait, what

that works?

Switching between sleep and alarm-evt, if I had a nickel for every time I mixed up seconds and milliseconds, I’d have either 100 or 100,000 nickels.

@greg I had 100000 nickels but then I divided by 1000

@greg @samth fun fact: inside Google, functions that accept / return a number of seconds / milliseconds / days / etc. are required by policy to include the unit in the function name and tools check for this automatically during code reviews

What do you do for function arguments (as in @greg’s examples)?

Either include the unit in the function name or (better) accept a structured type like Duration instead of a plain number

so like, either (sleep-seconds 200) or (sleep (seconds 200))

The former approach tends to get used when you can’t use a structured type easily, like for a command line flag.

For my own function definitions I tend to do a keyword argument that says the units. e.g. For a “Racket2” I’d suggest (sleep #:seconds n) and (alarm-evt #:milliseconds n). In my own code I tend to abbreviate #:msec. Although, real talk: If I haven’t burned my hand on the stove recently, I sometimes still define them ambiguously. hash tag do as I say not as I do. ¯_(ツ)_/¯

Hello, I’m making some autocomplete plugins for DrRacket: https://github.com/yjqww6/drcomplete

That works pretty well in a pinch. I assume it makes it difficult to document such a function though.

Cool!

How can you do a topological sort on things defined within a syntax-parse macro? The topological sort itself hasn’t been an obstacle for me (there’s a library function that does it, for one thing). The difficulty has been pulling the syntax objects out of pattern variables to pass to the topological-sort function. I’d need to make a closure or something that accepts one pattern variable and returns its “parent” attribute(s). But is that even possible?

@bkovitz I’d expect something like (topological-sort (syntax->list #'(foo ...)))

but I suspect you’re having issues because you want to do something like foo.attr?

Yes, exactly.

I think a reasonable way to do that would be to make a syntax class that does the sorting

Hmm. So how could a syntax class do that?

Just a sec, I’ll paste some code…

(define-syntax-class foo
  #:attributes (attr1 attr2)
  (pattern ...))
(define-syntax-class foos
  #:attributes ([sorted 1] [sorted.attr1 1] [sorted.attr2 1])
  (pattern (unsorted:foo ...)
    #:with (sorted:foo ...) (topological-sort-ids #'(unsorted ...))))

beware, I have only written this code, not compiled it

in particular I’m not sure if that #:with (sorted:foo ...) part will correctly expose sorted.attr1 and sorted.attr2

So how do you write the function, passed to the topo-sort function, that accepts an element and returns its neighbors?

hmm. I wouldn’t topo-sort that way, I’d use a partial order (a function of two args that returns (or/c '< '> '= '≠))

is there an implementation of topological sort somewhere in the stdlibs?

This is way more than the minimal code to illustrate the problem, but hopefully the comments make the relevant bits salient.

Yes, there’s a topo sort right here: https://docs.racket-lang.org/mischief/sort.html

if it were me I’d probably make my own implementation that accepted a partial order, since that way it’s impossible to accidentally have cycles

The main problem, though, is just how to extract info from the syntax objects to process by arbitrary Racket code, as suggested here: https://docs.racket-lang.org/syntax/varied-meanings.html?q=~optional#%28part._.Non-syntax-valued_.Attributes%29 I figure this must be a pretty fundamental technique. I’ve been experimenting with functions like stx->list, but I haven’t found how to pass data between the syntax-parse world and the Racket world yet.

ah, right - I think the issue there is that an expression like #'(foo …) throws away all the syntax attributes of foo, since those aren’t really something connected to the foo identifier

I hazily remember implementing a topo sort via partial order once, a long time ago. I’ll think about applying it here. I do want to raise an error if there is a cycle.
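
For reference, a rough sketch of the other style discussed above: a depth-first topological sort driven by a “parents of this element” function, which raises an error when it finds a cycle. This is not the mischief library’s implementation and not the partial-order version; topological-sort/parents and the sample table are made up for illustration.

```racket
#lang racket/base

;; Hypothetical helper: orders names so that every parent comes before
;; its children, and raises an error if the parent links form a cycle.
(define (topological-sort/parents names parents-of)
  (define state (make-hasheq)) ; name -> 'visiting or 'done
  (define sorted '())          ; accumulated in reverse order
  (define (visit! name)
    (case (hash-ref state name #f)
      [(done) (void)]
      [(visiting)
       (error 'topological-sort/parents "inheritance cycle at ~a" name)]
      [else
       (hash-set! state name 'visiting)
       (for-each visit! (parents-of name))
       (hash-set! state name 'done)
       (set! sorted (cons name sorted))]))
  (for-each visit! names)
  (reverse sorted))

;; Made-up inheritance table: each name maps to its list of parents.
(define parents
  (hasheq 'number '() 'tag '() 'greater-than '(tag) 'needs '(tag)))

(topological-sort/parents '(needs greater-than tag number)
                          (lambda (name) (hash-ref parents name '())))
```

The 'visiting mark is what catches cycles: reaching a name that is still being visited means it is its own ancestor.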

the syntax/parse library just binds related identifiers like foo.attr using with-syntax

I was wondering how attributes were connected to pattern variables. :wink:

but the (syntax ...) form and with-syntax don’t know anything at all about that connection

One idea I’ve been exploring, so far without success, is to make a list in syntax-parse that places each name next to its parents, like this:

so, what’s a nodeclass anyway?

A nodeclass is a definition of a kind of node that can go into the graph that this program operates on. A nodeclass specifies values for attributes and rules for what kind of linking is allowed. Here are a couple illustrations. (A “tagclass” is just a nodeclass for nodes that describe other nodes.)

The reason for the topological sort is to implement inheritance for nodeclasses.

@bkovitz You might have more luck making actual structs for your nodeclass and tagclass concepts and defining functions that construct and manipulate them. Then you could have macros that create those structs at compile time. That would be easier to do complex transformations on than raw syntax objects.

Hmm, that’s what the current version (the one I’m now rewriting) does, actually. (It keeps the nodeclass structs around during run-time and consults them for various things, which is probably not what you have in mind.) I was wondering if people do complex transformations right within syntax-parse, or, if not, how they generate code that does non-trivial rewriting of parts of the program like what happens in inheritance.

I think there’s a wide variety of ways people do it. Macrology is far from a refined science. For especially complex compile-time transformations, I personally would create structs holding the pieces of syntax I’m transforming so I can implement the transformations with functions that are documented, tested, and provided from their own module that I import at compile time when I need it.

Hmm, so are you thinking to structure the program something like this? 1. syntax-parse to read the source code and generate structs. 2. Racket code to operate on the structs to do stuff like inheritance. 3. Another round of syntax-parse to read the structs and generate code.

I’m honestly not sure

I suspect a good place to unify the structs and syntax-parse would be syntax classes though

since you can put the struct instances in a syntax class’s attributes
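
A rough sketch of that idea, assuming a made-up nodeclass-info struct and a toy nodeclass-edges macro (neither is from the real DSL): the syntax class uses #:attr to bind an attribute to an ordinary compile-time struct, and the macro reads the structs back with (attribute ...) so plain Racket functions can work on them before any code is generated.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

(begin-for-syntax
  ;; Hypothetical compile-time record for one nodeclass declaration.
  (struct nodeclass-info (name parents) #:transparent)

  (define-syntax-class nodeclass-decl
    #:attributes (info)
    (pattern (name:id parent:id ...)
      ;; #:attr binds an attribute to an arbitrary Racket value,
      ;; not only to a syntax object.
      #:attr info (nodeclass-info
                   (syntax-e #'name)
                   (map syntax-e (syntax->list #'(parent ...)))))))

;; Toy macro: expands to a quoted list of (name parent ...) entries.
(define-syntax (nodeclass-edges stx)
  (syntax-parse stx
    [(_ decl:nodeclass-decl ...)
     ;; decl.info has ellipsis depth 1, so this is a list of structs.
     (define infos (attribute decl.info))
     (with-syntax ([(edge ...)
                    (for/list ([i (in-list infos)])
                      (cons (nodeclass-info-name i)
                            (nodeclass-info-parents i)))])
       #''(edge ...))]))

(nodeclass-edges (number) (greater-than tag) (tag))
```

Between reading infos and building the output, the macro could hand the structs to any compile-time function, such as a topological sort over the parent lists.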

I keep hearing how syntax-parse makes it easy to whip out DSLs. I figure it’s a matter of seeing how it’s done on a variety of non-trivial examples.

Yeah there is not a lot of guidance for complex DSLs

Yeah, I’ve been having some success using syntax classes as “subroutines” that return “multiple values” through their attributes.

Oh well, I was hoping that this was a simple DSL!

I suppose that inheritance adds some non-trivial complexity. It’s given me a headache in each version of this DSL that I’ve implemented. (Some were in Clojure before I switched to Racket.)

hmm, maybe the nodeclass and tagclass things don’t need to be macros at all

what is your program using them for?

To unravel the mysteries of the mind. More precisely, to simulate human analogy-making in solving an arithmetic puzzle. But I’d like the same DSL to easily implement other, related computer models—called “FARG models” (for Fluid Analogy Research Group).


Yes.

A better answer to your question is: to automate various kinds of matching of subgraphs and editing of the graph. For example, if the program “notices” that there’s a (number 4) node and a (number 17) node, it might “tag” them with a greater-than tag, which in turn could help trigger some other kind of match, eventually leading to solving the puzzle.

huh, that kinda reminds me of RDF and OWL

anyways, so this program is working with a graph and the program is allowed to add edges it thinks are interesting / useful to the graph - when is the program done? what’s the success state?

FARG models try to simulate thought at a “subcognitive” level: psychological pressures pushing the system to explore this way or that way, somewhat different than the formal-logic style of semantic-web stuff. But in practice, sometimes the difference is not all that great. Anyway…

Yeah it sounds much less directed

(not a bad thing!)

The program has to detect when it has solved the puzzle. Some versions just have some code that checks if there is a solution after every timestep. I think it’s better to have detecting the solution be yet another thought process to simulate. Probably not too important, though. My main concern right now is just to get the DSL into good-enough shape that it’s easy to try ideas.

what’s the puzzle?

If the nodeclasses weren’t defined by macros, how might that work? (Tagclasses are nodeclasses with a tag? attribute set to #t.)

I’m guessing the program is supposed to solve the puzzle in a very generic way that involves guessing at graphs of concepts of stuff, instead of some sort of algorithmic solution that’s very specialized to the kind of puzzle being solved

In this FARG model, the puzzle is to figure out how to add, subtract, and/or multiply some given numbers to produce a target number. For example, given 6, 3, 4, and 5, make 32. Yes, the program is supposed to follow human-like heuristics, guessing at what looks promising, applying mathematical insight, maybe gaining some mathematical insight during the process.

Something that I want to add pretty soon is some sort of logarithmic representation of the number line, to judge, say, that 32 is “much bigger” than the other numbers, which are all “pretty close together”. This needs to be easy to add in the DSL. But right now, the main thing is just to get the linking right, so that code can look around to see what tags apply to what nodes, search its memory for similar situations, make adjustments to those memories to guide the current search—that sort of thing.