As an aside, @sorawee has just taught me how to write tests for whether macros evaluate expressions in tail position, which is awesome
:slightly_smiling_face:
See also “Test for tail position” section in https://srfi.schemers.org/srfi-157/srfi-157.html
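The gist of the trick (a hedged sketch in the SRFI's spirit, not its exact code; `my-when` is a hypothetical macro under test): if the macro keeps its body in tail position, a deep self-call runs in constant space, otherwise it piles up stack. Capping memory with a custodian turns that into a pass/fail test.

```racket
#lang racket

;; Hypothetical macro under test: does my-when keep its body in tail position?
(define-syntax-rule (my-when test body)
  (if test body (void)))

;; If the body is in tail position, this loop runs in constant space.
;; If the expansion wrapped the body in a non-tail context, each call
;; would consume stack and a memory-limited custodian would kill it.
(define (loop n)
  (my-when #t
           (if (zero? n)
               'done
               (loop (sub1 n)))))

(loop 10000000) ; finishes only if tail calls are preserved
```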
@yilin.wei10 The default lexer runs before the parser/expander, so it has no extra information. However the syntax colorer runs later, so in principle you can make a plug-in that saves type information for later use.
@jjsimpso has joined the channel
@yilin.wei10 Basically saying what @soegaard2 did, using more words :simple_smile: In programming editors people talk about “syntax highlighting” (which mostly corresponds to the kind of tokenization done by a lexer) and/or “semantic highlighting” (which corresponds more to what you could get from running drracket/check-syntax and/or doing other analysis of fully-expanded code).
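For the semantic layer, drracket/check-syntax can be driven programmatically; a minimal sketch (assuming an `example.rkt` file on disk; `show-content` is the documented entry point, though the exact annotations you get back vary by program):

```racket
#lang racket
(require drracket/check-syntax)

;; Expand "example.rkt" and collect the annotations DrRacket would use
;; for semantic highlighting: binding arrows, documentation links,
;; mouse-over text, etc. Each annotation is a vector whose first
;; element names the syncheck callback it corresponds to.
(for ([annotation (show-content "example.rkt")])
  (println annotation))
```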
Thinking of them as two layers or passes is probably good. That is, a lexer can probably work fast enough to keep up with a user typing quickly, especially if re-lexing after changes is handled smartly.
Whereas there is no way in heck that an analysis of fully-expanded code is going to work that fast. It needs to be something that runs “lazily”, comes in later and updates highlighting to be “richer” or “better”. AFAICT.
In most programming editors the “lexer” is really a pile of regular expressions, which works well enough but fails on a lot of corner cases.
I think DrRacket is fairly unusual in using a “real” lexer, as well as one that each #lang can provide.
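That lang-supplied lexer can also be driven outside DrRacket via syntax-color/module-lexer, which reads the #lang line and dispatches to whatever color lexer that language names. A rough sketch (the exact token types depend on the lang's lexer):

```racket
#lang racket
(require syntax-color/module-lexer)

;; Tokenize a program string using the lexer its #lang line selects.
;; module-lexer consumes the "#lang racket" line itself, then
;; delegates to that language's color lexer for the rest, threading
;; the lexer mode through each call.
(define in (open-input-string "#lang racket\n(define x 42)"))
(let loop ([mode #f])
  (define-values (lexeme type paren start end backup new-mode)
    (module-lexer in 0 mode))
  (unless (eq? type 'eof)
    (printf "~a ~s [~a,~a)\n" type lexeme start end)
    (loop new-mode)))
```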
The other thing about lexing (even using regexps) is that you need that amount of information to do indentation (but not the full semantic analysis). AFAIK.
In most cases indentation needs to know about “open” and “close” tokens for “expressions” or “blocks” or whatever the lang uses.
Offside-rule langs like Python or Haskell either have explicit “indent”/“outdent” tokens (Python) or treat the indentation as a shorthand for curly braces (Haskell). I think.
(Of course for those langs auto-indentation needs to work a little differently.)
I’m blabbing on about this b/c I’ve been looking at it lately for Racket Mode on Emacs, and trying also to think about non-s-expression langs.
Yes; I’m an emacs guy so I’m familiar with syntax highlighting (font-lock) and got ridiculously excited because I wondered whether racket had a way to “transform” syntactic objects.
Does racket-mode on emacs use racket to get that information?
It’s tricky because in Emacs “syntax” means classifying single characters, so that’s really more like “lexing” and you have to hope the important tokens are single chars. :smile:
Racket Mode uses a pile o’ regexps, for which the number of failing edge cases is smaller over time but definitely still non-zero. :slightly_smiling_face:
However I have a branch where I’ve been working on a racket-hash-lang-mode that instead uses the “real” lang lexer.
Does it translate the regexps from the color-lexer or are they built in?
Ah that would be pretty cool!
No the regexps are hand crafted.
I’ve just rolled an extra major mode per language at the moment (extremely new to Racket, so Pollen is the only one that’s any different…)
It also wants to use drracket:indentation, but that’s currently designed to assume the racket/gui framework and is way too heavy.
So I’ve also been sketching out a simpler “token-map” interface, that wouldn’t presume DrRacket and racket/gui.
Oh no way? Racket has indentation for langs? That’s super cool.
I’d be happy to hack on that if you’ve got a branch. Not familiar with racket, but pretty familiar with elisp and emacs.
Yes! A #lang can supply that. However I’ve found relatively few examples of langs that use it so far. The Scribble one is the main example, plus some 3rd-party langs that simply re-provide that.
So I’m hoping if I propose a new protocol, it wouldn’t be too disruptive coughs.
Haha, racket is a research language right :wink:?
The overview page you might want to see is https://docs.racket-lang.org/tools/lang-languages-customization.html
Thank you very much!
I’m surprised that this function doesn’t already exist:
> (adjacent-group-by char-whitespace? (string->list "abc def  ghi"))
(list (list #\a #\b #\c) (list #\space) (list #\d #\e #\f) (list #\space #\space) (list #\g #\h #\i))
> (adjacent-group-by abs '(1 -1 2 1 3 -3))
(list (list 1 -1) (list 2) (list 1) (list 3 -3))
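It doesn’t seem to be in racket/list (group-by exists there, but it isn’t adjacency-preserving), so here’s one possible definition matching the examples above (a sketch; the name and argument order are just what the examples imply):

```racket
#lang racket

;; Group consecutive elements whose keys (under `key`) are equal?.
;; Unlike group-by, non-adjacent elements with the same key stay in
;; separate groups.
(define (adjacent-group-by key lst)
  (if (null? lst)
      '()
      (let loop ([xs (cdr lst)]
                 [cur (list (car lst))]      ; current group, reversed
                 [k (key (car lst))]         ; key of the current group
                 [acc '()])                  ; finished groups, reversed
        (cond
          [(null? xs)
           (reverse (cons (reverse cur) acc))]
          [(equal? (key (car xs)) k)
           (loop (cdr xs) (cons (car xs) cur) k acc)]
          [else
           (loop (cdr xs) (list (car xs)) (key (car xs))
                 (cons (reverse cur) acc))]))))
```

With this definition, `(adjacent-group-by abs '(1 -1 2 1 3 -3))` produces `'((1 -1) (2) (1) (3 -3))`, matching the REPL transcript above.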