laurent.orseau
2020-3-9 08:44:30

@robby The culprit seems to be classify-position in racket/share/pkgs/gui-lib/framework/private/racket.rkt. If the cursor is at the end of a single-line comment, right before the #\newline, it is classified as 'whitespace and not as 'comment.

I have a simple semi-fix, which consists in checking if • the current character is #\newlineclassify-position returns 'comment at the current cursor position –1 • the previous two characters are not \|# It’s only a semi-fix because it will fail on the following example when the cursor is at the end of the line: ; not a nested comment \|#

However, I expect this case to be rather rare, and overall I think this solution is strictly better than the current state.

How does this sound?


laurent.orseau
2020-3-9 08:46:26

classify-position returns 'comment for both line-comments and nested-comments. Ideally, it would a different symbol for each, and would also tell whether the cursor is in a sexp-comment, but it doesn’t.

Also, the rest of the code often relies on a eq? 'comment ... test, so differentiating between the two comments would require several changes and may be a little risky to change.


laurent.orseau
2020-3-9 08:48:37

On a more positive note, it was nicely simple to test all this using Quickscript! :slightly_smiling_face: https://gist.github.com/Metaxal/6a25ca345e829a65a0bcfa3db3d5a82f


laurent.orseau
2020-3-9 09:21:30

Instead of modifying classify-position, it may be safer to modify in-position? in framework/private/racket.rkt instead, since it already attempts to fix such cases (but misses this one)


laurent.orseau
2020-3-9 09:32:20

in-position? will also fail on "abc"\|"def" where | is the cursor: This is classified as in the string, but is actually in-between two strings.


robby
2020-3-9 10:32:31

Changing the API to have two comment characters is backwards incompatible and probably best decoupled from the bug fix


robby
2020-3-9 10:33:49

But classify position should be using the results of the colorer — does it really itself know what is a comment and what is whitespace?


robby
2020-3-9 10:35:00

In other words, is racket-lexer returning the right thing on this example?


laurent.orseau
2020-3-9 10:54:49
  1. I’m not clear about what you mean re. backward-incompatibility. I wasn’t proposing to change the API, but just performing additional tests to return the correct value. However, this test relies on racket-specific characters such as #\|, so may not work for non-racket languages. (Is that what you mean?)
  2. The racket lexer doesn’t return enough information. For example, it doesn’t know the difference between line comment, nested-comments and sexp-comments, which is annoying. It only returns 'comment. However, if you add text at the end of a line-comment, it’s still inside the comment, but at the end of a nested comment it’s outside

  3. I’m currently working on a better bug fix, that may be used for fixing other bugs in other places (while being backward-compatible). This involves an additional classify-position* (or whatever name) that returns: • the result of classify-position when the cursor is not of the start position of the current token • (list (classify-position (- pos 1)) (classify-position pos)) when pos is the start of the current token. Then fix in-comment/string? to return true also for (line-comment whitespace) .


laurent.orseau
2020-3-9 11:00:27

Though currently this is still relying on racket-specific conditions.


laurent.orseau
2020-3-9 11:04:29

Maybe the lexers should know not only the type (comment, string), but also the subtype (line-comment, here-string, etc.)?


laurent.orseau
2020-3-9 13:24:54

The lexers should probably have returned struct objects, or dictionaries instead of values. It would have been much easier to extend them while keeping backward compatibility


laurent.orseau
2020-3-9 14:14:11

Given how the lexers are implemented, I don’t see a good backward compatible solution. Now attempting a bad one (still bwd compat): Is there a way to know if the current lexer is a racket one? I could then add a test on the lexer type that would be specific to racket only and thus not break other lexers


spdegabrielle
2020-3-9 21:46:54

@laurent.orseau @robby I don’t think PLT-Scheme had structs or dicts in 1995 :smiley:

;; originally by Dan Grossman
;; 6/30/95

https://github.com/racket/gui/blob/master/gui-lib/framework/private/racket.rkt\|https://github.com/racket/gui/blob/master/gui-lib/framework/private/racket.rkt


robby
2020-3-9 22:34:31

We had structs; maybe substructs. I forget. But we could certainly have encoded the idea if had had the foresight to do so! :)