


Am I missing something or is the order of xy and yx different here?

Also from dc.rkt:

Do you observe a strange behaviour or is it only the docs?

If the order is swapped in the docs, then the behaviour is as observed.
I noticed because of this issue: https://github.com/sicp-lang/sicp/issues/35

The cairo docs explain the meaning:

Maybe those equations should be added to the docs too.

Are there pre-built variants of Racket w/ debug symbols turned on anywhere? I’m trying to debug a segfault on Linux but I don’t have a dedicated Linux machine so I’d like to avoid building it myself. The segfault is reproducible with both BC 7.6 and 7.7.

The failure is to do with use of continuations in the web-server and possibly raco exe. Under load I can reliably get it to crash with
SIGSEGV MAPERR si_code 1 fault on addr 0x280
Aborted (core dumped)
but the GDB trace isn’t very useful:
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fdaa22627bb in ?? ()
[Current thread is 1 (LWP 109)]
(gdb) bt
#0 0x00007fdaa22627bb in ?? ()
#1 0x0000000000010402 in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb)

Not sure if that’s helpful, but have you tried running as racket -l errortrace -t <myprog.rkt>? I don’t know if errortrace gives meaningful info on a segfault though

I don’t think I can, because this is a compiled app (with raco exe)

can you embed errortrace with ++lib? I don’t know if that will make it use it though

oops, I was passing the wrong executable to gdb (the resulting exe instead of lib/plt/racket3m). I get a better trace now:

Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7fdaa2227300 (LWP 109))]
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007fdaa224d535 in __GI_abort () at abort.c:79
#2 0x000000000067afa9 in fault_handler ()
#3 <signal handler called>
#4 0x00000000006458bf in ?? ()
#5 0x0000000000645ba0 in ?? ()
#6 0x000000000067e6f9 in ?? ()
#7 0x0000000000681e56 in ?? ()
#8 0x0000000000686e84 in ?? ()
#9 0x000000000047a757 in ?? ()
#10 0x000000000048dbc0 in ?? ()
#11 0x000000000048fdf3 in ?? ()
#12 0x00000000004653ce in scheme_do_eval ()
#13 0x000000000047ace2 in ?? ()
#14 0x000000000047f6af in scheme_force_value_same_mark ()
#15 0x000000000046685a in _scheme_apply_from_native ()
#16 0x00007fdaa1ca61a1 in ?? ()
#17 0x002e646572697571 in ?? ()
#18 0x00007fda902395f3 in ?? ()
#19 0x00007fda933d5f48 in ?? ()
#20 0xfffffffffffff410 in ?? ()
#21 0x00007fff18e13d30 in ?? ()
#22 0x0000000000000002 in ?? ()
#23 0x0000000000003f82 in ?? ()
#24 0x0000000000000000 in ?? ()

I can try

Nope, that doesn’t seem to have made a difference

Ha

Does it still fail with -j? That might have a better stack trace.

Is there a way to pass -j all the way down into raco exe?

I can’t get it to fail reliably unless I go through raco exe. I was only ever able to get racket app.rkt to fail once, on macOS, and I’m not sure it was the same issue.

I just built an exe using <https://plt.eecs.northwestern.edu/snapshots/current/installers/racket-test-7.7.0.4-x86_64-linux-wheezy.sh>
and I haven’t been able to get it to crash.

PLTNOMZJIT=1

Still crashes

Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f9d088fc300 (LWP 694))]
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f9d08922535 in __GI_abort () at abort.c:79
#2 0x000000000067afa9 in fault_handler ()
#3 <signal handler called>
#4 0x00000000006458bf in ?? ()
#5 0x0000000000645c30 in ?? ()
#6 0x000000000067e6f9 in ?? ()
#7 0x0000000000681e56 in ?? ()
#8 0x0000000000686e84 in ?? ()
#9 0x000000000047a757 in ?? ()
#10 0x000000000048dbc0 in ?? ()
#11 0x000000000048fdf3 in ?? ()
#12 0x00000000004653ce in scheme_do_eval ()
#13 0x000000000046726a in ?? ()
#14 0x0000000000464916 in scheme_do_eval ()
#15 0x0000000000465d4e in scheme_do_eval ()
#16 0x000000000048a0d9 in scheme_finish_apply_for_prompt ()
#17 0x000000000048a286 in scheme_apply_for_prompt ()
#18 0x0000000000000000 in ?? ()

I wonder what those unknown frames are

can you go into frame 12 and figure out what’s being evaluated?

I don’t think I can w/o debug symbols, but my gdb-fu is weak.

info locals complains that no symtable is available

Running the app from within gdb doesn’t reproduce the issue either…

Wait.

The mere act of installing gdb seems to make it impossible to reproduce…

I had been dumping cores inside one docker container and debugging them in another before, but now that I’ve installed gdb in the original container, I can no longer reproduce the issue.

is there a way to capture the mouse position when a racket/gui window doesn’t have focus?
(class canvas%
  (inherit refresh)
  (define x 0)
  (define/override (on-event a-mouse-event)
    (set! x (/ (send a-mouse-event get-x) 1))
    (refresh))
  (super-new))
I suspect it is possible because I can get a negative mouse x-position by moving the mouse quickly.

I don’t know about the case where the window doesn’t have focus, but you will get negative values if you move left of or above the window.

I think you still need window focus.

By window you mean frame% or a window-area<%>?

If the frame does not have the focus, then that will first depend on the window manager. If the frame has the focus but not the widget, that will depend on racket, and on the hierarchy of widgets

This is a very long shot, but since you consistently get an error with address 0x280, and since the stack trace suggests that the crash is in C code, I grepped the disassembly of Racket3m on 64-bit Linux to look for a dereference of a pointer with offset 0x280 (on the theory that it’s likely a dereference of a NULL pointer). The only good match I found is related to the sync_box field of a thread record, which at least sounds relevant to your application. But the only way I see for that to go wrong is for scheme_current_thread to be NULL when scheme_get_thread_sync is called from nack_guard_evt_is_ready in struct.c. Overall, this doesn’t look promising, but if adding an assertion there for a build is easy, it might be worth a try.
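
To spell out the reasoning behind grepping for offset 0x280: if a struct pointer is NULL, reading one of its fields touches the address equal to that field’s offset, so a fault on a small address like 0x280 points at a field 0x280 bytes into some record. A minimal sketch of that idea in C, assuming a made-up record type (thread_record and field_address are hypothetical names, and the padding is chosen only to put sync_box at 0x280; this is not Racket’s real thread layout):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a thread record. The padding is chosen only so
   that sync_box lands at offset 0x280; it is not the real layout. */
struct thread_record {
  char other_fields[0x280];
  void *sync_box;
};

/* Address that would be touched when reading base->sync_box. With base == 0
   (a NULL pointer) this is just the field offset. */
uintptr_t field_address(uintptr_t base) {
  return base + offsetof(struct thread_record, sync_box);
}
```

Here field_address(0) comes out as 0x280, which is the shape of fault the grep was looking for. It only shows why the offset is a useful search key; it says nothing about which field was actually involved.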

it could be either. I’m willing to try anything.

I think it has to do with the window manager.

I’m experimenting to see if I can make a version of xeyes, but just with racket/gui, not installing X

I don’t think events will be delivered anywhere, but you might be able to poll with get-current-mouse-state.

Good tip. get-current-mouse-state gets me the global coordinates, but sadly it only works if the canvas gets some sort of focus. It was a fun idea but no matter.

I had in mind starting a thread or timer that polls get-current-mouse-state. That way, focus shouldn’t be relevant.

I’ll try that.

I’m looking for something fun to do after I was gazumped with urls in comments wish.

Sometimes I want a place to talk about some package I’m making and whether some feature would be a good idea or not. So I created the #api-design channel. Feel free to join if you’re interested in improving libraries, languages, or frameworks.

@laurent.orseau might be interested :slightly_smiling_face:

Got the idea from talking to him :slightly_smiling_face:

Thanks! I ended up setting up a VM to test this (compiling in Docker on Mac is way too slow) and I’m not able to reproduce the problem on Racket master. I’m compiling the v7.6 tag now to see if I can reproduce it that way

Core was generated by `/home/parallels/Desktop/projection-bias-experiment/koyo-experiment/./dist/bin/k'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f906b356300 (LWP 2925))]
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007f906a5d8801 in __GI_abort () at abort.c:79
#2 0x000056531aecdf0d in fault_handler (sn=11, si=0x56531bf7b370, ctx=0x56531bf7b240) at ./sighand.c:103
#3 <signal handler called>
#4 0x000056531ae9551a in prepare_thread_for_GC (t=0x7f90588a8bb0) at ./../src/thread.c:9209
#5 0x000056531ae9581b in get_ready_for_GC () at ./../src/thread.c:9280
#6 0x000056531aedc161 in garbage_collect (gc=0x56531bf9afc0, force_full=0, no_full=0, switching_master=0, lmi=0x0) at ./newgc.c:5583
#7 0x000056531aecd72c in collect_now (gc=0x56531bf9afc0, major=0, nomajor=0) at ./newgc.c:875
#8 0x000056531aece3e9 in gc_if_needed_account_alloc_size (gc=0x56531bf9afc0, allocate_size=2064) at ./newgc.c:1312
#9 0x000056531aeceba9 in allocate_medium (request_size_bytes=2048, type=2) at ./newgc.c:1510
#10 0x000056531aecf21c in GC_malloc_allow_interior (s=2048) at ./newgc.c:1765
#11 0x000056531abae1ce in copy_in_mark_stack (p=0x7f90588a8bb0, cont_mark_stack_copied=0x0, cms=5551, base_cms=5551, copied_offset=5551, _sub_conts=0x7ffecdd61d18, clear_caches=1)
at ./../src/fun.c:4812
#12 0x000056531abb26f6 in restore_continuation (cont=0x7f906021fef0, p=0x7f90588a8bb0, for_prompt=0, result=0x7f9060272110, resume=0x7f906027c3a8, empty_to_next_mc=1,
prompt_tag=0x7f905a9b0620, common_dw=0x0, common_next_meta=0, shortcut_prompt=0x0, clear_cm_caches=1, do_reset_cjs=1, cm_cont=0x7f906027b758, extra_marks=0x0)
at ./../src/fun.c:5926
#13 0x000056531abb350b in internal_call_cc (argc=3, argv=0x7f90577702a0) at ./../src/fun.c:6180
#14 0x000056531ab80901 in scheme_do_eval (obj=0x56531bf87090, num_rands=3, rands=0x0, get_value=-1) at ./../src/eval.c:2255
#15 0x000056531ab9eb07 in force_values (obj=0x4, multi_ok=1) at ./../src/fun.c:1426
#16 0x000056531ab9ed32 in scheme_force_value_same_mark (obj=0x4) at ./../src/fun.c:1472
#17 0x000056531ab7bcf7 in _scheme_apply_from_native_fast (rator=0x56531bf87230, argc=2, argv=0x7f90577702b8) at ./../src/schnapp.inc:39
#18 0x000056531ab7bf0d in _scheme_apply_from_native (rator=0x56531bf87230, argc=2, argv=0x7f90577702b8) at ./../src/schnapp.inc:80
#19 0x000056531ac37e95 in x_ts__scheme_apply_from_native (rator=0x56531bf87230, argc=2, argv=0x7f90577702b8) at ./../src/jitcall.c:291
#20 0x00007f906b1a61bc in ?? ()
#21 0x000000000000fffd in ?? ()
#22 0x00007f905740c205 in ?? ()
#23 0x0000000000000000 in ?? ()
took me forever but I finally have this. This is on the v7.6 branch. I’ll try to repro on master again tomorrow morning. I think the issue might still be present, just harder to reproduce, because I have been able to do it with one of the published 7.7 releases.

Unless you’re using places (and maybe even if you are), I would encourage you to try reproducing under rr: http://rr-project.org

also, I don’t think continuation stuff has changed much since 7.6, so debugging there might be sufficient

OK, thanks. I’ll give rr a try tomorrow

can you go to frame 4 there and list the code?

(gdb) frame 4
#4 0x000056531ae9551a in prepare_thread_for_GC (t=0x7f90588a8bb0) at ./../src/thread.c:9209
9209 seg[stackpos].key = NULL;
(gdb) list
9204 int stackpos;
9205 segpos = ((intptr_t)pos >> SCHEME_LOG_MARK_SEGMENT_SIZE);
9206 seg = p->cont_mark_stack_segments[segpos];
9207 if (seg) {
9208 stackpos = ((intptr_t)pos & SCHEME_MARK_SEGMENT_MASK);
9209 seg[stackpos].key = NULL;
9210 seg[stackpos].val = NULL;
9211 seg[stackpos].cache = NULL;
9212 }
9213 }

and can you print seg and stackpos?

and also segpos

(gdb) print seg
$1 = (Scheme_Cont_Mark *) 0x200
(gdb) print stackpos
$2 = 0
(gdb) print segpos
$3 = 86

SIGSEGV MAPERR si_code 1 fault on addr 0x200
Aborted (core dumped)
was the fault, so seg looks like it’s invalid?

what’s p?

and also p->cont_mark_stack_segments

seg looks wrong, but subscripting a valid pointer with 86 shouldn’t get 0x200

(gdb) p p
$7 = (Scheme_Thread *) 0x7f90588a8bb0
(gdb) p p->cont_mark_stack_segments
$8 = (struct Scheme_Cont_Mark **) 0x7f905ff74498
(gdb) p segpos
$9 = 86
(gdb) p pos
$10 = 5504

ah, pos looks bad too

maybe, at least

5504 seems pretty large

(gdb) p p->cont_mark_stack_bottom
$11 = 5551
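
For reference, the index arithmetic from the thread.c listing can be checked against the values gdb printed. This is a minimal sketch, assuming a segment holds 64 entries (log size 6); that number is inferred from 5504 >> 6 == 86 in this session, not read from the Racket headers, and the macro and function names below are hypothetical stand-ins for SCHEME_LOG_MARK_SEGMENT_SIZE and SCHEME_MARK_SEGMENT_MASK:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: inferred from the gdb session (5504 >> 6 == 86), not taken
   from the Racket headers. */
#define LOG_MARK_SEGMENT_SIZE 6
#define MARK_SEGMENT_MASK ((1 << LOG_MARK_SEGMENT_SIZE) - 1)

/* Which segment a mark-stack position falls in (segpos in the listing). */
intptr_t segment_index(intptr_t pos) {
  return pos >> LOG_MARK_SEGMENT_SIZE;
}

/* Which slot inside that segment (stackpos in the listing). */
intptr_t slot_index(intptr_t pos) {
  return pos & MARK_SEGMENT_MASK;
}

/* Re-check the numbers printed by gdb above. */
void check_session_values(void) {
  assert(segment_index(5504) == 86); /* matches segpos */
  assert(slot_index(5504) == 0);     /* matches stackpos */
  assert(segment_index(5551) == 86); /* cont_mark_stack_bottom, same segment */
  assert(slot_index(5551) == 47);
}
```

Under that assumption, pos = 5504 is simply slot 0 of segment 86, the same segment that holds position 5551, so the index math agrees with what gdb shows. The value that stands out is the segment pointer itself (0x200) read from slot 86 of cont_mark_stack_segments.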

this app makes pretty heavy use of continuations

ah, ok

(but I’ve encountered this issue with another app that doesn’t use ’em as much)

Thanks! It looks like a problem with a GC being triggered while a thread’s mark stack is being updated, and the half-updated mark stack confuses get_ready_for_GC.

Pinned this message to keep the channel discoverable over time. (If that was inappropriate or there’s a better way to do that, unpinning the message is fine with me.)

I’ve pushed a potential repair, but I wasn’t able to construct an example that crashes before the change, so I’m not sure that the change will fix anything.

Awesome! I cherry-picked the change on top of v7.6 and haven’t been able to reproduce the crash. I’ll put together a minimal example that can reproduce the crash tomorrow morning and share. “Minimal” might not be the right word to use, but I think I know what the steps might be to reproduce just using libraries in the main distribution. Thanks again!
