laurent.orseau
2020-9-18 07:35:04

About parallelism: I often have to run 10–15 instances of the same program, just with different inputs/outputs (no interaction). My current setup is just to run racket several times, but that’s not memory efficient. Places won’t give me the same speed benefit due to GC sharing. Is there an intermediate option? Ideally, it would be sharing everything that’s immutable and persistent I guess, so that the GC is not involved.


alexharsanyi
2020-9-18 08:08:44

How much memory are your programs using, and how much do you hope to save by sharing the immutable parts?


laurent.orseau
2020-9-18 08:15:00

For both: as much as possible :slightly_smiling_face: (seriously) Memory usage of each program can grow unbounded, and a large fraction of that can’t be shared. But racket’s VM is a few hundred MB, and if I can save 9–14× that much, it could be used by the programs themselves instead.


alexharsanyi
2020-9-18 08:19:22

If this is really a “no expense spared” operation to save as much memory as possible, you should re-write your program in C++ :grin:


alexharsanyi
2020-9-18 08:27:44

… but if you want to stay with Racket, the OS will already share the data pages for the executable, so, if you run the task as separate processes, most of the separate data will be the data that these programs process (i.e. your data). So, basically, unless you have some concrete numbers you are aiming for, I think what you are doing is already close to the optimal


alexharsanyi
2020-9-18 08:29:18

… and the Racket VM is present only once in physical memory and just mapped 10–15 times, once for each process.


laurent.orseau
2020-9-18 08:32:59

Thanks Alex. (I already did the C++ thing too :wink: and yes, I’d rather stick with racket) That was my uneducated guess. Do you know if the virtual memory counts the VM for each process? (I guess so). I’m using ulimit -v to avoid crashing my computer, but if this also counts the VM (and other things?) it may be off by some unknown amount


alexharsanyi
2020-9-18 09:00:03

I don’t know about Linux, but the Windows process monitor only reports the private data of an application (i.e. not the executable it is running)


laurent.orseau
2020-9-18 09:20:44

yeah, things are substantially different on Linux I think


samth
2020-9-18 13:01:40

I don’t understand why you think places won’t help


laurent.orseau
2020-9-18 13:02:45

I’ve read several times that places stop helping above 8 jobs


samth
2020-9-18 13:03:34

if you’re sharing so little that you can run multiple processes then I think places should be fine


samth
2020-9-18 13:03:40

have you measured?


laurent.orseau
2020-9-18 13:04:25

I haven’t even tried yet


laurent.orseau
2020-9-18 13:04:38

But if I’m told it should be fine, I will :slightly_smiling_face:


laurent.orseau
2020-9-18 13:06:08

So if very little is shared, can I expect it to scale to 64 cores?


samth
2020-9-18 13:06:30

I don’t know, but we should try it out and fix things if it doesn’t


laurent.orseau
2020-9-18 13:07:22

:thumbsup:


laurent.orseau
2020-9-18 13:09:55

Then I’ll add this to my todolist and will come back when I have some numbers


mflatt
2020-9-18 13:10:18

BC doesn’t scale nicely past 8 or so places because OS page management (to write [un]protect pages for the GC’s write barrier) tends to be a bottleneck. CS doesn’t scale even that well, yet. Since places use independent copies of Racket modules, I doubt that you’ll save much memory by using places instead of processes. But it can depend on the program, so it’s worth a try, and I’d be happy to be wrong!
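
A minimal sketch of what "trying it" could look like: one place per input, no communication beyond sending the input in and the result out. The `run-job` function and the input list are hypothetical stand-ins for the real program; timing and memory measurement are left out.

```racket
#lang racket/base
(require racket/place)

;; Hypothetical per-job work: sum the integers below n.
;; A real program would read its input and write its output here.
(define (run-job n)
  (for/sum ([i (in-range n)]) i))

;; Start one place per input, send each place its input,
;; then collect one result from each place in order.
(define (run-jobs inputs)
  (define places
    (for/list ([n (in-list inputs)])
      (define p
        (place ch
          ;; The body can only refer to `ch` and module-level bindings.
          (define n (place-channel-get ch))
          (place-channel-put ch (run-job n))))
      (place-channel-put p n)
      p))
  (for/list ([p (in-list places)])
    (place-channel-get p)))

;; Places are created from the `main` submodule, not at module top level,
;; because each new place instantiates the enclosing module again.
(module+ main
  (for ([result (in-list (run-jobs '(1000 2000 3000)))])
    (printf "~a\n" result)))
```

Saved to a file and run with `racket`, this starts three places. Comparing wall-clock time and total memory against three separate `racket` processes running the same jobs would give the numbers asked for above.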


laurent.orseau
2020-9-18 13:14:22

Wouldn’t it be possible to not copy some modules that are provably immutable?


samth
2020-9-18 13:16:02

Yes, and that could be true already for modules that are cross-phase persistent, although I don’t think it is. But marking modules as immutable is hard, and so very few are cross-phase persistent and there’s not a more general category.


laurent.orseau
2020-9-18 13:30:43

Could modules provide hints to the compiler?


samth
2020-9-18 13:33:36

I suppose you could have unsafe hints, but that seems pretty risky


samth
2020-9-18 13:34:12

the problem is that “has mutable state in any module it depends on” is the fundamental question but that’s almost all modules


laurent.orseau
2020-9-18 14:14:25

Do you happen to have an example of such a mutating module that would be widely shared across Racket, off the top of your head?


badkins
2020-9-18 14:14:29

@laurent.orseau if each Racket process is multi-threaded, I feel like it’s an efficient use of memory, i.e. each core can be kept very busy with a single Racket process.


laurent.orseau
2020-9-18 17:26:15

@badkins I’m not entirely following your thoughts, sorry. I don’t see the link between the cores being busy and the efficient use of memory


badkins
2020-9-18 17:51:47

I’m referring to the amount of work that can be done using a given amount of RAM.


badkins
2020-9-18 17:52:33

If a single Racket process, or place, is blocked on I/O, or other events, then the RAM associated with that process/place isn’t being used efficiently IMO.


shu--hung
2020-9-18 21:17:02

Is anyone using scribble/jfp for JFP papers? The URI for the jfp class file has changed, which breaks scribble/jfp. Even more problematic is that jfp.cls loads the color package without providing a way to configure its options, which conflicts with the default scribble environment. Any suggestions for approaches to fix scribble/jfp?


notjack
2020-9-18 21:27:41

beware that “module contains a struct definition” counts as “contains mutable state and shouldn’t be shared” because of generativity


notjack
2020-9-18 21:27:58

and that’s the main obstacle to module sharing


notjack
2020-9-18 21:28:46

since it rules out tons of useful library modules, which transitively rules out the 99% of modules that depend on those
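
For readers unfamiliar with the term, generativity can be illustrated with a small sketch. The helper `make-point-type` is hypothetical; the point is only that every evaluation of `make-struct-type` creates a new type, even when the name and field count are identical.

```racket
#lang racket/base

;; Hypothetical helper: each call evaluates make-struct-type again,
;; so each call returns the constructor and predicate of a fresh type.
(define (make-point-type)
  (define-values (type make-point point? point-ref point-set!)
    (make-struct-type 'point #f 2 0))
  (values make-point point?))

(define-values (make-a a?) (make-point-type))
(define-values (make-b b?) (make-point-type))

(define pa (make-a 1 2))

(a? pa) ; #t: pa is an instance of the first type
(b? pa) ; #f: same name and shape, but a distinct type
```

The same thing presumably happens when a module containing a struct definition is instantiated more than once: each instantiation would get its own type, which is why such a definition can look like per-instance state.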


samth
2020-9-18 21:31:55

That’s actually not an obstacle, either currently with cross-phase persistent modules or to cross-place sharing


notjack
2020-9-18 21:32:42

wait really? why on earth did I think it was


samth
2020-9-18 21:33:45

It seems very plausible that it would be


samth
2020-9-18 21:33:49

But it isn’t


notjack
2020-9-18 21:35:57

interesting, the grammar for cross phase persistent modules does allow you to make structs, but only using make-struct-type and not the struct form


notjack
2020-9-18 21:37:14

is that some sort of fundamental limitation or would it just be a lot of work to implement


samth
2020-9-18 21:37:45

Macros are also not allowed, so requiring the struct definition is not allowed


notjack
2020-9-18 21:39:03

hmm


samth
2020-9-18 21:39:59

Or, that’s not quite right, but certainly neither the definition nor the expansion of struct would be allowed


joseph.beck
2020-9-18 22:25:13

@joseph.beck has joined the channel


samth
2020-9-18 23:42:30

Due to license issues, the only fix for the former issue is to change the URL in the source


hj93
2020-9-19 00:26:06

Hello, if you are interested, you should check out the linear algebra interpreter I am developing. It’s not finished, but you can play around with some of it. https://github.com/Jobhdez/Linear-algebra-interpreter/blob/master/interp-linear.rkt


anything
2020-9-19 02:07:03

I refactored the code. Now the web server serves just a plain string and doesn’t run any update procedure. The update procedure is a separate program run by cron, and all it does is download the data from the source; at the end of the download, it atomically updates the database (read by the web server). The whole thing is a lot simpler now, and it should be very easy to spot the problem, or the problem will go away completely. I’m probably all set now. Thanks so much for your technical and moral support. I appreciate it.