ebirman77
2021-1-11 15:34:35

Hi all, I’m cross-posting from the Racket channel at freenode. This is, I’m afraid, sort of off-topic, so please tell me and excuse me if I am out of place. I am following along with @popa.bogdanp’s great screencast tutorial: https://www.youtube.com/watch?v=DS_0-lqiSVs and was wondering if someone could help me set up an environment with Docker Compose or Podman (I am actually using the Podman compatibility layer) using koyo-shorty as an example app: https://github.com/Bogdanp/koyo-shorty. So, first question: 1) Should I build my own (multi-stage?) container images for Postgres and Racket/Koyo, or should I pull a ready-made container image? I am quite new to containerization and I actually have to learn this DevOpsy stuff. The goal of my exercise is to build a fictional staging-like environment that will be subjected to testing. I’ve read https://defn.io/2020/06/28/racket-deployment/ and found it interesting, and might try that in the future, but this time I am required to at least use Docker/Podman, a CM tool like Ansible or Chef, and some testing framework like serverspec or similar. I am finding it too much for my brain to tackle at once. So, a cursory look at hub.docker.com brought racket/racket and racket/racket-ci to my attention. Which one should I use? Or neither of those?


jjsimpso
2021-1-11 16:21:32

I have a custom language (magic, https://github.com/jjsimpso/magic) that compiles very slowly. Short files compile/expand fairly quickly, but a 1000 line file takes 30 seconds or so and a 3000 line file takes upwards of 30 minutes. Any tips on profiling this? DrRacket’s macro stepper doesn’t seem to like the larger files. When I click "Macro Stepper" on the 1000 line file it doesn’t do anything.


jjsimpso
2021-1-11 16:23:26

I’m using brag to implement the parser and reader. One thought is that the parser is slow, but it could also be my macros for expanding magic.


jjsimpso
2021-1-11 16:26:06

It looks like the macro stepper did eventually throw an error:


jjsimpso
2021-1-11 16:26:12

file-position: setting position allowed for file-stream and string ports only
  port: #<input-port:/home/jonathan/git/magic/tests/images.rkt>
  position: 449
  #<void>: 51:22
  /home/jonathan/.racket/snapshot/pkgs/brag-lib/brag/codegen/runtime.rkt: 69:2


jjsimpso
2021-1-11 16:28:23

The code runs correctly outside of the macro stepper though, so I’m not sure what the problem is.


soegaard2
2021-1-11 16:30:03

Well, that sounds like a good point of attack. Time only the parsing part for files of different sizes.

It sounds as if something makes the time grow quadratically in the number of lines. One possibility is that there is a loop (or matching process) over all lines, which contains a sub-loop or sub-match over the remaining lines. If the problem is in the macro layer, then look out for … patterns.
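
A minimal sketch of that kind of measurement, assuming nothing about the real magic parser: `fake-parse` is a hypothetical stand-in that is deliberately quadratic, and `time-ms` is an illustrative helper, not part of any library. If doubling the input size roughly quadruples the time, the growth is likely quadratic.

```racket
#lang racket/base

;; Run thunk once and return the elapsed wall-clock time in milliseconds.
(define (time-ms thunk)
  (define start (current-inexact-milliseconds))
  (thunk)
  (- (current-inexact-milliseconds) start))

;; Hypothetical stand-in for the real parser: for each line it does work
;; proportional to the number of lines before it, so the total is quadratic.
(define (fake-parse lines)
  (for*/sum ([i (in-range lines)]
             [j (in-range i)])
    1))

;; Time the stand-in on inputs that double in size.
(for ([n (in-list '(500 1000 2000))])
  (printf "~a lines: ~a ms\n" n (time-ms (lambda () (fake-parse n)))))
```

In practice the thunk would call the real reader or parser on files of increasing length instead of `fake-parse`.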


jjsimpso
2021-1-11 16:32:36

Any hints or sources of info on how I can actually profile this? Will I need to instrument the code? Profiling in racket-mode when I run the file doesn’t appear to profile the macro expansion or parser.


soegaard2
2021-1-11 16:33:32

If the problem is in the parser, look out for any right recursive rules.


soegaard2
2021-1-11 16:35:02

For profiling: you can temporarily make your module-begin expand to #'42. Then use time raco compile file.rkt to see how long it takes to parse the file.


jjsimpso
2021-1-11 16:40:07

I like the module-begin idea. I’ll try that.


hoshom
2021-1-11 16:51:31

jjsimpso
2021-1-11 16:52:14

With the simple module-begin, it does appear that almost all the time is spent in the parser.


soegaard2
2021-1-11 16:53:32

That means, we need to look at this file?

https://github.com/jjsimpso/magic/blob/master/parser.rkt


soegaard2
2021-1-11 16:57:57

Btw - I don’t know if it is useful in this situation, but the parsers have a debug clause that can be used to save extra information.



jjsimpso
2021-1-11 17:06:46

Parsing isn’t my area of expertise, but I don’t think any of my rules are recursive. The only rules that expand to anything complicated are at the top of the grammar:

magic : EOL* (query | named-query)+
query : line (level+ (line | clear-line))* /EOL*
level : /">"
line : offset /HWS type /HWS test (/HWS message?)? /EOL*
clear-line : offset /HWS "clear" (/HWS test)? /EOL*
named-query : name-line (level+ (line | clear-line))* /EOL*
name-line : offset /HWS name-type /HWS MAGIC-NAME (/HWS message?)? /EOL*


jjsimpso
2021-1-11 17:07:47

Basically, a magic file is a sequence of lines and lines expand to their component parts and aren’t recursive.


jjsimpso
2021-1-11 17:22:45

@hoshom good to know about the macro profiler even if it seems the macro expansion may not be the problem here.


soegaard2
2021-1-11 17:28:47

I mostly have experience with using the parser-tools/parser directly. I have used ragg once, but never brag.

Although your brag grammar doesn’t contain any recursive rules (as far as I can tell from the snippet above), it is worth looking at the grammar that brag produces for parser-tools. Maybe the translation is not one-to-one (for example, how are cuts handled)?


ben.knoble
2021-1-11 17:38:20

I was curious about possible issues with the EOL*’s colliding, but I imagine your long files don’t have lots of blank lines? And a short file with many blank lines is still fast?


jjsimpso
2021-1-11 17:45:10

I’ll do some tests with blank lines. I’ll also see if I can figure out how to look at the grammar brag is producing.


soegaard2
2021-1-11 17:46:51

Wait - where is HWS defined? It looks like it is commented out?


soegaard2
2021-1-11 17:48:29

Ah - it’s a token.


jjsimpso
2021-1-11 17:48:31

It is a token.


jjsimpso
2021-1-11 17:48:36

Yep :slightly_smiling_face:


greg
2021-1-11 17:53:14

I have very little hands-on mileage with parsing, and none with brag and that grammar. But from my experience making a markdown parser that at times was horribly slow: :smile: I might look first at choices? Like the topmost choice between query | named-query? Could it be exploring query more than it needs to before realizing that fails and backtracking to try named-query? Stuff like that. Not necessarily quadratic, but I might start there? idk


greg
2021-1-11 17:55:03

Also I might have guessed that a named-query would be a special case of a query and it would make sense to try that first? Not sure, just thinking out loud.


greg
2021-1-11 18:00:07

Like I believe other people already said, I think one obvious “hand-wavy first theory of the bug” is that it’s scanning to the end of the file before failing and backtracking.


soegaard2
2021-1-11 18:14:37

Planet Scheme is moving, so I have a redirect in place. Anyone care to test if it actually works?

http://planet.scheme.org


kellysmith12.21
2021-1-11 18:17:15

@soegaard2 This is where the link takes me.


soegaard2
2021-1-11 18:18:02

Oh! Wait. I should have given you the old link … http://scheme.dk/planet


kellysmith12.21
2021-1-11 18:18:21

Takes me to the same place.


soegaard2
2021-1-11 18:18:27

Great.


gknauth
2021-1-11 18:37:56

Same here.


popa.bogdanp
2021-1-11 18:59:33

Regarding 1): the technique described in the article you linked is useful even when you use Docker to deploy things. What I generally do is use multi-stage builds to create minimal containers: first build a Racket distribution inside an image built on top of racket/racket, then copy that distribution into a debian image. You can find an example of this process here: https://github.com/MarcKaufmann/congame/blob/e31e5adf8379de3fd2caac9a94fababf469b5d03/Dockerfile


popa.bogdanp
2021-1-11 19:00:05

Re. racket/racket vs racket/racket-ci, you want the former; the latter is used for Racket’s own CI.


popa.bogdanp
2021-1-11 19:01:10

Re. pulling images vs making your own, it depends on what you’re comfortable with. When I use Postgres with Docker, I tend to just use the official image from Docker Hub.


ebirman77
2021-1-11 19:06:04

Thank you Bogdan, I’ll start with the official Postgres image as you suggest and copy (steal) from your Dockerfile in order to multi-stage build a new koyo-shorty Docker image. Once I get shorty running, I can then think of writing a docker-compose YAML file to bring up and connect both containers.


soegaard2
2021-1-11 19:40:10

@jjsimpso This looks suspicious: https://github.com/jjsimpso/magic/blob/master/reader.rkt#L113 If all strings are small, it might be ok, but otherwise the idiom is to accumulate a list of characters (in reverse order), and then use (string-append* (reverse accumulated)) when the end is reached.
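
A minimal sketch of the two approaches, not taken from the magic reader: `read-word-slow` and `read-word-fast` are hypothetical names, and "word" here just means characters up to the next whitespace. Because this sketch accumulates characters rather than strings, it finishes with `list->string`; `string-append*` would be the counterpart when the accumulated pieces are strings.

```racket
#lang racket/base

;; Repeated string-append: every step copies the whole accumulated string,
;; so the total work grows quadratically with the length of the word.
(define (read-word-slow in)
  (let loop ([acc ""])
    (define c (peek-char in))
    (if (or (eof-object? c) (char-whitespace? c))
        acc
        (loop (string-append acc (string (read-char in)))))))

;; Accumulate characters in reverse order and build the string once at the end.
(define (read-word-fast in)
  (let loop ([acc '()])
    (define c (peek-char in))
    (if (or (eof-object? c) (char-whitespace? c))
        (list->string (reverse acc))
        (loop (cons (read-char in) acc)))))

(printf "~s\n" (read-word-fast (open-input-string "hello world")))
```

For short strings the difference is negligible, which matches the observation that the strings in question should be small.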


kellysmith12.21
2021-1-11 20:11:39

Earlier, I asked why a complex number with 0.0 for the imaginary part is not considered real?. @sanchom gave the explanation (https://racket.slack.com/archives/C06V96CKX/p1610335952466600) that the imaginary part must be an exact zero for a complex number to be real?. That makes sense, but I do find it unintuitive in the context of (= 0 0.0) returning #true.

After reading the docs, I find it a little confusing that numerical comparisons coerce arguments into exact numbers, which runs counter to the usual rule of numerical procedures propagating inexactness. I’d like to understand why comparisons work that way, what’re the reasons behind it?


jjsimpso
2021-1-11 20:16:31

I’m glad you noticed that. There was one other place where I used string-append thoughtlessly and generated way too many allocations. The strings should be short, but I’ll rewrite it so that it doesn’t call string-append on every loop iteration.


jjsimpso
2021-1-11 20:17:12

I hope it is this simple, otherwise I’ll need to look further into the grammar.


me1890
2021-1-11 20:20:39

= is used to refer to a semantic comparison. This includes IEEE 754 floating-point special equality (such as all zeros being equal and no NaNs being equal).


me1890
2021-1-11 20:20:55

If you want to know if numbers are actually the same you should use eq?


jjsimpso
2021-1-11 20:28:53

The string-append change did not make a noticeable difference, but nevertheless I think it is a good change. So I appreciate that.


samth
2021-1-11 20:32:36

I don’t think I’d describe it as coercing to exact in equality


jjsimpso
2021-1-11 20:32:37

@greg would the general idea be to put most likely choices first? I’ve very little experience with parsers myself. named-queries in my language are relatively rare but I wonder if there is a large cost whenever one is encountered.


samth
2021-1-11 20:33:28

Floating point numbers denote particular rationals


samth
2021-1-11 20:33:41

And = compares rationals
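
A small sketch of the behaviour under discussion, using only core numeric procedures. The variable names are illustrative. The point it tries to show is that = compares the exact values the numbers denote, while real? looks at whether the imaginary part is an exact zero.

```racket
#lang racket/base

;; 0.0 denotes exactly zero, so it is = to the exact 0.
(define zero-equal? (= 0 0.0))

;; 0.1 denotes a binary fraction that is close to, but not exactly, 1/10.
(define tenth-equal? (= 1/10 0.1))
(define tenth-exact (inexact->exact 0.1))

;; An exact zero imaginary part collapses to a real number;
;; an inexact zero imaginary part does not.
(define exact-imag-real? (real? 1+0i))
(define inexact-imag-real? (real? 1.0+0.0i))

(printf "(= 0 0.0): ~a\n" zero-equal?)
(printf "(= 1/10 0.1): ~a\n" tenth-equal?)
(printf "(inexact->exact 0.1): ~a\n" tenth-exact)
(printf "(real? 1+0i): ~a\n" exact-imag-real?)
(printf "(real? 1.0+0.0i): ~a\n" inexact-imag-real?)
```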


jjsimpso
2021-1-11 20:34:12

actually, maybe they aren’t as rare as i think. the queries themselves can also be quite a few lines, so i agree that this is a good place to start.


kellysmith12.21
2021-1-11 20:34:18

This is from the docs on =:
> An inexact number is numerically equal to an exact number when the exact coercion of the inexact number is the exact number.


samth
2021-1-11 20:35:12

Sure, but for all numbers which are =, the same is true in the reverse direction


ebirman77
2021-1-11 20:35:19

Should I use cs-full as well?


samth
2021-1-11 20:35:45

Coercion to inexact is lossy though, so you can’t specify the function that way


kellysmith12.21
2021-1-11 20:43:42

ah


kellysmith12.21
2021-1-11 20:44:04

Sorry, floats are confusing sometimes.


notjack
2021-1-11 20:45:56

that they are


me1890
2021-1-11 20:47:23

i wrote a library for parsing floats a while ago, so i learned a lot about them.


me1890
2021-1-11 20:47:37

probably more than i will ever need


jjsimpso
2021-1-11 21:29:15

As @ben.knoble suggested, there does appear to be a problem with parsing the EOL token. A file with a small query followed by 500 blank lines takes 4 minutes to compile. But 500 blank lines before the first query are parsed quickly; the magic rule can probably throw them away quickly.


jjsimpso
2021-1-11 21:35:00

Removing the redundant EOL* rule at the end of my query rule speeds up my blank line test file but doesn’t totally fix the problem. This does appear to be the correct path. I can probably tweak the rules to fix this.


jjsimpso
2021-1-11 21:52:39

I’ve made a significant improvement by tweaking the EOL parsing. I’m not sure that I’m all the way there since ideally I’d like to parse files with tens of thousands of lines quickly, but this is a huge improvement. Thanks everyone for your help! I’ll probably post my commit here later in case anyone is curious.


soegaard2
2021-1-11 21:53:58

I wonder whether there are online tools that can analyze such parser rules, and give some advice on what to avoid?


jjsimpso
2021-1-11 22:04:24

That would be very helpful. I’m just happy that I don’t have to abandon brag. I was afraid I’d need to rewrite the parser using just parser-tools. By adjusting the rules I’ve reduced the large file’s compile time from 30 min to 1 min.

Next, I think I will modify the lexer to collapse consecutive EOL tokens into one. That will enable me to simplify the grammar further and hopefully get things to a reasonable speed.
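
One way such a lexer change could look, as a sketch only: tokens are plain symbols here for illustration, whereas a real brag lexer produces token structures, and `collapse-eols`, `list->token-thunk`, and `drain` are hypothetical helpers. The wrapper sits between the lexer and the parser and swallows every EOL that directly follows another EOL.

```racket
#lang racket/base

;; Wrap a token-producing thunk so that a run of consecutive 'EOL tokens
;; is delivered to the parser as a single 'EOL.
(define (collapse-eols next-token)
  (define last-was-eol? #f)
  (define (next)
    (define tok (next-token))
    (cond
      [(and (eq? tok 'EOL) last-was-eol?) (next)] ; skip a repeated EOL
      [else (set! last-was-eol? (eq? tok 'EOL))
            tok]))
  next)

;; Hypothetical helper: a thunk that yields the list's elements, then 'EOF forever.
(define (list->token-thunk toks)
  (lambda ()
    (cond
      [(null? toks) 'EOF]
      [else (define t (car toks))
            (set! toks (cdr toks))
            t])))

;; Hypothetical helper: pull tokens until 'EOF and return them as a list.
(define (drain next)
  (let loop ([acc '()])
    (define t (next))
    (if (eq? t 'EOF)
        (reverse acc)
        (loop (cons t acc)))))

(printf "~s\n"
        (drain (collapse-eols (list->token-thunk '(A EOL EOL EOL B EOL C)))))
```

With the runs collapsed, the grammar would only need to accept a single EOL where it currently accepts EOL*.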


popa.bogdanp
2021-1-11 22:21:32

Yes, otherwise your build will spend a lot of time downloading dependencies.


notjack
2021-1-11 22:26:32

I’ve made a lot of progress on my refactoring tool, which I’m calling resyntax. I’m looking for ideas of refactoring rules to implement so I can further test the tool. Anyone interested (especially @sorawee, @laurent.orseau, @rokitna, @soegaard2, and @kellysmith12.21) is invited to share their ideas in https://github.com/jackfirth/resyntax/issues/8


soegaard2
2021-1-11 22:30:54

How about one that rewrites define-struct into struct ?


notjack
2021-1-11 22:31:28

already implemented :)


soegaard2
2021-1-11 22:31:55

Where are the rules?


laurent.orseau
2021-1-11 22:32:40

Could look for all mentions of deprecated in the docs?



soegaard2
2021-1-11 22:33:08

I am blind, they are in … yes exactly.


soegaard2
2021-1-11 22:35:32

(cons a (cons b xs)) -> (list* a b xs) Might be controversial?


soegaard2
2021-1-11 22:38:22

(cons a (list b c …)) -> (list a b c …)


soegaard2
2021-1-11 22:39:25

(or a) -> a


laurent.orseau
2021-1-11 22:39:44

@notjack may be hard to do but this would be awesome: make it possible to apply refactoring rules to rhs of refactoring rules. Then you wouldn’t have to worry about proposing rules that rely on other rules, since you can force them (in some cases) to be self contained.


laurent.orseau
2021-1-11 22:41:21

(cond [test expr] [else expr]) -> (if test expr expr) but may also be a matter of taste?


soegaard2
2021-1-11 22:42:21

If the first clause is short, it’s possible to write it on two lines. Otherwise I prefer the if.


laurent.orseau
2021-1-11 22:42:22

Nested ifs to cond


laurent.orseau
2021-1-11 22:43:17

let () to block (needs to add a require though)


laurent.orseau
2021-1-11 22:43:34

If+begin to cond


soegaard2
2021-1-11 22:43:36

How about rules converting a program in a teaching language into one in standard Racket? I’m thinking of the removal of local.


laurent.orseau
2021-1-11 22:44:36

define foo lambda to define (foo …)


soegaard2
2021-1-11 22:44:59

Apropos, cond and begin. Maybe rules for converting an old-style Scheme program into Racket. For example (cond [expr (let () . body)]) -> (cond [expr . body])


laurent.orseau
2021-1-11 22:45:00

Port rnrs scheme to racket :)


laurent.orseau
2021-1-11 22:45:51

Let loop to for (see also ryanc ideas in the package mentioned by samth)


laurent.orseau
2021-1-11 22:46:26

(+ (+ … )) to (+ …)


soegaard2
2021-1-11 22:47:25

That example works, but, say (+ a (+ b c) d) -> (+ a b c d) doesn’t if there are floats involved.
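
A small sketch of why flattening can change a floating-point result. The values are chosen for illustration: near 1e16 adjacent doubles are 2 apart, so adding 1.0 twice in sequence can be lost to rounding, while adding 2.0 in one step is not. The sketch assumes that the multi-argument + adds its arguments from left to right.

```racket
#lang racket/base

(define a 1e16)
(define b 1.0)
(define c 1.0)
(define d 0.0)

;; Nested form: b and c are summed first, then added to a.
(define nested (+ a (+ b c) d))

;; Flattened form: b and c are added to a one at a time.
(define flat (+ a b c d))

(printf "nested: ~a\n" nested)
(printf "flat:   ~a\n" flat)
(printf "equal?  ~a\n" (= nested flat))
```

With exact integers or rationals the two forms always agree, which is why the rewrite is only safe when the operands are known to be exact.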


notjack
2021-1-11 22:47:27

I want to avoid adding rules without evidence they actually occur “in the wild”, so to speak


soegaard2
2021-1-11 22:47:53

A reasonable rule.


jjsimpso
2021-1-11 22:48:15

There is a vulkan api as well (https://docs.racket-lang.org/vulkan/index.html) but I’m not sure how complete it is.


laurent.orseau
2021-1-11 22:48:23

I can write a package with only bad style if you want?


soegaard2
2021-1-11 22:49:04

Let’s look at some old code as a concrete example: https://github.com/soegaard/little-helper/blob/master/lexer.rkt


laurent.orseau
2021-1-11 22:49:09

Another use case: conventions. Define your own conventions and refactor inconsistent code to follow them


soegaard2
2021-1-11 22:50:12

I notice that case-lambda is used to handle default arguments. Presumably the code was written before define supported default arguments.


soegaard2
2021-1-11 22:51:09

(if token (begin (f token) (for-each-token f count-lines?)) (error "internal error: token expected after skipping")))))]))


notjack
2021-1-11 22:51:17

I’d like to avoid defining conventions that are idiosyncratic to a codebase. However, I do want to give #lang implementations and library modules the ability to define conventions specific to that language or library. Then if someone really wants to make conventions for their codebase, they can define #lang mycodebase and use that instead of #lang racket.


soegaard2
2021-1-11 22:51:32

Anything with (if e (begin ...) ...) would be better as a cond.


notjack
2021-1-11 22:52:03

(if e (begin (define a 1) a) …) actually doesn’t compile, because begin in an expression context can’t include definitions


notjack
2021-1-11 22:52:21

(if e (let () …) …) would be a good candidate though


notjack
2021-1-11 22:52:56

That case-lambda one is good


soegaard2
2021-1-11 22:53:42

Ah, the example was: (if e (begin e1 e2 e3) (begin e4 e5 e6)) where e1, e2, e4, e5 are expressions with side effects.


notjack
2021-1-11 22:54:22

Oh yes that’s definitely a good one


soegaard2
2021-1-11 22:55:42

A shame (eqv? (peek-char) #\#) can’t be rewritten to (char=? (peek-char) #\#), since peek-char can return a non-char.


soegaard2
2021-1-11 22:56:38

How about (lambda . more) to (λ . more) ?


notjack
2021-1-11 22:56:51

if it were up to me I’d just rewrite all usages of the type-specific equality procedures to equal? to discourage their use, and leave it up to the optimizer to figure out when that’s appropriate


notjack
2021-1-11 22:57:12

but, that definitely doesn’t satisfy the non-controversial requirement


notjack
2021-1-11 22:57:57

Rewriting lambda to λ would be good in my opinion, though that one might be controversial to some


soegaard2
2021-1-11 23:01:03

Yeah, you are right.


notjack
2021-1-11 23:02:43

> may be hard to do but this would be awesome: make it possible to apply refactoring rules to rhs of refactoring rules. Then you wouldn’t have to worry about proposing rules that rely on other rules, since you can force them (in some cases) to be self contained.
@laurent.orseau this is absolutely possible :) refactoring rules are roughly functions from syntax to syntax, so they’re composable
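
To illustrate the "functions from syntax to syntax" idea, here is a sketch written with plain syntax/parse. It is not resyntax's actual API: `simplify-single-or`, `cons-onto-list`, and `apply-rules` are hypothetical names, and matching is by symbol (datum literals) rather than by binding, which a real tool would likely need to do more carefully.

```racket
#lang racket/base
(require syntax/parse)

;; Rule: (or a) -> a. Anything else is returned unchanged.
(define (simplify-single-or stx)
  (syntax-parse stx
    #:datum-literals (or)
    [(or a) #'a]
    [_ stx]))

;; Rule: (cons a (list b ...)) -> (list a b ...).
(define (cons-onto-list stx)
  (syntax-parse stx
    #:datum-literals (cons list)
    [(cons a (list b ...)) #'(list a b ...)]
    [_ stx]))

;; Because each rule maps syntax to syntax, rules compose like ordinary functions.
(define (apply-rules stx)
  (cons-onto-list (simplify-single-or stx)))

(printf "~s\n" (syntax->datum (apply-rules #'(or (cons 1 (list 2 3))))))
```

Applying the rules only at the top of a form, as here, is the simplest case; running them over every subform or until a fixpoint is where the performance and broken-output concerns come in.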


notjack
2021-1-11 23:03:03

an early version of resyntax just repeatedly applied all rules until the code reached a fixpoint


notjack
2021-1-11 23:04:06

I got rid of that both for performance reasons and because it didn’t handle broken rule-generated code very well


notjack
2021-1-11 23:04:46

those problems are both fixable but they’re more work and I wanted to just iterate on the basic functions of the tool first


notjack
2021-1-11 23:05:31

Anyway I have to go do my actual job now. I cordially invite you all to leave comments and/or open issues with the ideas you’ve come up with.


cowbs
2021-1-11 23:16:36

Is it documented somewhere which hashing algorithm Racket uses for its hash tables? We have an external tool we’re trying to achieve parity with Racket’s results. Thanks!


mflatt
2021-1-11 23:18:16

No, it’s not documented, and it sometimes changes. When you say “parity”, you mean that you want to produce the same hash code? Or just something with similar characteristics?


cowbs
2021-1-11 23:19:00

Ideally the same hash code so that for the same data sets we write out our tables in the same order.


cowbs
2021-1-11 23:20:26

Is the implementation on the github repo? That would be enough for our needs.


blerner
2021-1-11 23:22:27

@notjack two questions: can your library work with *SL languages instead of Racket? And, can it work with snips in the source and not just plain-text? I wonder if this could be part of a linter for HtDP programs… :thinking_face: if this is out-of-scope for you, nvm…


mflatt
2021-1-11 23:27:07

cowbs
2021-1-11 23:27:23

tyvm!





notjack
2021-1-12 01:27:34

@blerner it definitely only works for textual programs. As for the student languages, I’m not sure. My intention is that it eventually works for any #lang


jjsimpso
2021-1-12 01:53:38

Here is my initial fix for this, but I still need to tweak the reader some more: https://github.com/jjsimpso/magic/commit/ea49f62b71a2c105f3723367a7ed39c0bba1a815



sorawee
2021-1-12 03:37:53

Magic :slightly_smiling_face:



greg
2021-1-12 03:39:57

That explains why my search of Racket docs didn’t find anything. :smile:


sorawee
2021-1-12 03:40:12

This is potentially unsafe. Consider:

(cond [#t (begin (define x 1) x)] [else 2])


greg
2021-1-12 03:40:21

through the magic of GitHub search I was starting to get an inkling, but thanks for the direct link! :slightly_smiling_face:


notjack
2021-1-12 03:40:40

It’s very magic


greg
2021-1-12 03:41:34

Somehow that stretch of Slack discussion slipped by me, it was only a few days ago. Derp.


sorawee
2021-1-12 03:43:20

greg
2021-1-12 03:43:43

I feel like I need to start adding undocumented modules like this to my own packages. #%do-not-use #%magic #%sekret-modyule and so on.


kellysmith12.21
2021-1-12 03:44:29

Scribble’s defstruct* form automagically indexes the predicate and accessor procedures for a struct type, linking them to the blue box where the type is described. Is there a way to do something similar, using a defform?


sorawee
2021-1-12 03:44:38

I thought the convention is to use private directory


samth
2021-1-12 03:46:20

@ryanc’s paper on the macro stepper is probably useful for understanding it


greg
2021-1-12 03:46:23

Good point. So (submod "foo.rkt" private private private #%sekret-modyule).


kellysmith12.21
2021-1-12 03:49:37

@greg you’re on to something — actual snippet from my project: (module+ secret-provide (provide (for-syntax lookup-contract)))


greg
2021-1-12 03:49:50

Mainly it’s just I was amused, I was reading @notjack’s nicely written code using all these clean rebellion abstractions and my eye snagged on (dynamic-require ''#%expobs 'current-expand-observe) and I’m like, what even am I ….


notjack
2021-1-12 03:55:23

Trust your instincts. I’m still deeply suspicious of it.


greg
2021-1-12 03:56:17

@notjack Some possible ideas, if you haven’t seen this: https://github.com/rmculpepper/sexp-rewrite/blob/master/racket-rewrites.el


notjack
2021-1-12 05:34:03

@soegaard2 generated this diff in your little-helper project :grin:



sorawee
2021-1-12 05:39:40

Here’s an idea: change error to raise-argument-error when there’s enough information.


notjack
2021-1-12 05:41:10

I like that. Might be tricky to do in a semantics-preserving way, but maybe small error message format changes are okay for refactoring rules.


kellysmith12.21
2021-1-12 05:45:01

Are there any rules for replacing legacy contracts?


notjack
2021-1-12 05:45:59

Yeah, those were some of the first ones I added


notjack
2021-1-12 05:46:07

false/c -> #f, etc.


kellysmith12.21
2021-1-12 05:46:30

What about ->d -> ->i?


notjack
2021-1-12 05:48:08

Haven’t done that one because I’m not actually sure there is an automated migration from the former to the latter. I only skimmed the docs, but it looked like preconditions have different semantics between the two systems.


kellysmith12.21
2021-1-12 05:55:08

From what I can tell, automated migration should be safe. The main difference between ->d and ->i is that the former allows the dependent parts to violate the argument contracts within the ->d form, which is never good.


soegaard2
2021-1-12 06:54:58

I am impressed. Works pretty well.