Racket Slack Archive

laurent.orseau

2020-9-18 07:35:04

About parallelism: I often have to run 10–15 instances of the same program, just with different inputs/outputs (no interaction). My current setup is just to run racket several times, but that’s not memory efficient. Places won’t give me the same speed benefit due to GC sharing. Is there an intermediate option? Ideally, it would be sharing everything that’s immutable and persistent I guess, so that the GC is not involved.

alexharsanyi

2020-9-18 08:08:44

How much memory are your programs using, and how much do you hope to save by sharing the immutable parts?

laurent.orseau

2020-9-18 08:15:00

For both: as much as possible :slightly_smiling_face: (seriously) Memory usage of each program can grow unbounded, and a large fraction of that can’t be shared. But racket’s VM is a few hundred MB, and if I can save 9–14× that much, it could be used by the programs themselves instead.

alexharsanyi

2020-9-18 08:19:22

If this is really a “no expense spared” operation to save as much memory as possible, you should re-write your program in C++ :grin:

alexharsanyi

2020-9-18 08:27:44

… but if you want to stay with Racket, the OS will already share the pages of the executable, so if you run the tasks as separate processes, most of the per-process memory will be the data that these programs process (i.e. your data). So, basically, unless you have some concrete numbers you are aiming for, I think what you are doing is already close to optimal.

alexharsanyi

2020-9-18 08:29:18

… and the Racket VM is present only once in physical memory and just mapped 10–15 times, once for each process.

laurent.orseau

2020-9-18 08:32:59

Thanks Alex. (I already did the C++ thing too :wink: and yes, I’d rather stick with racket) That was my uneducated guess. Do you know if the virtual memory counts the VM for each process? (I guess so). I’m using ulimit -v to avoid crashing my computer, but if this also counts the VM (and other things?) it may be off by some unknown amount

alexharsanyi

2020-9-18 09:00:03

I don’t know about Linux, but the Windows process monitor only reports the private data of an application (i.e. not the executable it is running)

laurent.orseau

2020-9-18 09:20:44

yeah, things are substantially different on linux I think

samth

2020-9-18 13:01:40

I don’t understand why you think places won’t help

laurent.orseau

2020-9-18 13:02:45

I’ve read several times that places stop helping above 8 jobs

samth

2020-9-18 13:03:34

if you’re sharing so little that you can run multiple processes then I think places should be fine

samth

2020-9-18 13:03:40

have you measured?

laurent.orseau

2020-9-18 13:04:25

I haven’t even tried yet

laurent.orseau

2020-9-18 13:04:38

But if I’m told it should be fine, I will :slightly_smiling_face:

laurent.orseau

2020-9-18 13:06:08

So if very little is shared, can I expect it to scale to 64 cores?

samth

2020-9-18 13:06:30

I don’t know, but we should try it out and fix things if it doesn’t

laurent.orseau

2020-9-18 13:07:22

:thumbsup:

laurent.orseau

2020-9-18 13:09:55

Then I’ll add this to my to-do list and will come back when I have some numbers

mflatt

2020-9-18 13:10:18

BC doesn’t scale nicely past 8 or so places because OS page management (to write [un]protect pages for the GC’s write barrier) tends to be a bottleneck. CS doesn’t scale even that well, yet. Since places use independent copies of Racket modules, I doubt that you’ll save much memory by using places instead of processes. But it can depend on the program, so it’s worth a try, and I’d be happy to be wrong!
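
For anyone who wants to measure this, here is a minimal sketch of the places approach, assuming each job is independent and only a small input and a small result cross the place channel. `run-jobs` and the toy summing workload are hypothetical stand-ins for the real program; `place`, `place-channel-put`, and `place-channel-get` come from `racket/place`.

```racket
#lang racket/base
(require racket/place)

;; Hypothetical driver: start one place per input, then collect one
;; result from each place, in the same order as the inputs.
(define (run-jobs inputs)
  (define workers
    (for/list ([in (in-list inputs)])
      (define p
        (place ch
          ;; Toy workload (an assumption): sum the integers below n.
          (define n (place-channel-get ch))
          (place-channel-put ch (for/sum ([i (in-range n)]) i))))
      ;; Send this place its input; the message is copied, not shared.
      (place-channel-put p in)
      p))
  (for/list ([p (in-list workers)])
    (place-channel-get p)))

;; Places are started from a main submodule rather than the module body,
;; because the enclosing module is instantiated again in each new place.
(module+ main
  (run-jobs '(10 100 1000)))
```

Timing this against the same jobs run as separate `racket` processes, at several worker counts, would give the numbers asked for above.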

laurent.orseau

2020-9-18 13:14:22

Wouldn’t it be possible to not copy some modules that are provably immutable?

samth

2020-9-18 13:16:02

Yes, and that could be true already for modules that are cross-phase persistent, although I don’t think it is. But marking modules as immutable is hard, and so very few are cross-phase persistent and there’s not a more general category.

laurent.orseau

2020-9-18 13:30:43

Could modules provide hints to the compiler?

samth

2020-9-18 13:33:36

I suppose you could have unsafe hints, but that seems pretty risky

samth

2020-9-18 13:34:12

the problem is that “has mutable state in any module it depends on” is the fundamental question but that’s almost all modules

laurent.orseau

2020-9-18 14:14:25

Do you happen to have an example of such a mutating module that would be widely shared across Racket, off the top of your head?

badkins

2020-9-18 14:14:29

@laurent.orseau if each Racket process is multi-threaded, I feel like it’s an efficient use of memory i.e. each core can be kept very busy with a single Racket process.

laurent.orseau

2020-9-18 17:26:15

@badkins I’m not entirely following your thoughts, sorry. I don’t see the link between the cores being busy and the efficient use of memory

badkins

2020-9-18 17:51:47

I’m referring to the amount of work that can be done using a given amount of RAM.

badkins

2020-9-18 17:52:33

If a single Racket process, or place, is blocked on I/O, or other events, then the RAM associated with that process/place isn’t being used efficiently IMO.

shu--hung

2020-9-18 21:17:02

Is anyone using scribble/jfp for JFP papers? The URI for the jfp class file has changed, which breaks scribble/jfp. Even more problematic, jfp.cls loads the color package without providing a way to configure its options, which conflicts with the default Scribble environment. Any suggestions for how to fix scribble/jfp?

notjack

2020-9-18 21:27:41

beware that “module contains a struct definition” counts as “contains mutable state and shouldn’t be shared” because of generativity

notjack

2020-9-18 21:27:58

and that’s the main obstacle to module sharing

notjack

2020-9-18 21:28:46

since it rules out tons of useful library modules, which transitively rules out the 99% of modules that depend on those
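
For readers unfamiliar with the generativity mentioned here: each evaluation of a struct definition creates a fresh structure type, so two evaluations of the same definition yield incompatible types. A small sketch, where `make-point-type` is a hypothetical helper introduced only for illustration:

```racket
#lang racket/base

;; Hypothetical helper: every call evaluates the struct definition again,
;; so every call creates a new, distinct structure type.
(define (make-point-type)
  (struct point (x y))
  (values point point?))

(define-values (make-a a?) (make-point-type))
(define-values (make-b b?) (make-point-type))

(a? (make-a 1 2)) ; => #t, the instance belongs to the first type
(b? (make-a 1 2)) ; => #f, the second type does not recognize it
```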

samth

2020-9-18 21:31:55

That’s actually not an obstacle, either to cross-phase persistent modules as they exist today or to cross-place sharing

notjack

2020-9-18 21:32:42

wait really? why on earth did I think it was

samth

2020-9-18 21:33:45

It seems very plausible that it would be

samth

2020-9-18 21:33:49

But it isn’t

notjack

2020-9-18 21:35:57

interesting, the grammar for cross phase persistent modules does allow you to make structs, but only using make-struct-type and not the struct form
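
A rough sketch of what that grammar permits, assuming a module written directly against `'#%kernel`. The module name `point` and its exports are hypothetical; the restricted forms (`#%declare`, `#%provide`, `define-values`, `make-struct-type`) are the ones the cross-phase persistent grammar allows.

```racket
(module point '#%kernel
  ;; Opt in to cross-phase persistence; the body must then stay within
  ;; the restricted grammar (no macros, no struct form).
  (#%declare #:cross-phase-persistent)
  (#%provide make-point point? point-ref)
  ;; make-struct-type returns five values: the type descriptor,
  ;; constructor, predicate, generic accessor, and generic mutator.
  ;; Here: a type named point, no supertype, 2 fields, 0 auto fields.
  (define-values (struct:point make-point point? point-ref point-set!)
    (make-struct-type 'point #f 2 0)))
```

With this sketch, a field is read positionally, as in `(point-ref p 0)`, since the per-field accessors that the `struct` form would generate are not available.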

notjack

2020-9-18 21:37:14

is that some sort of fundamental limitation, or would it just be a lot of work to implement?

samth

2020-9-18 21:37:45

Macros are also not allowed, so requiring the struct definition is not allowed

notjack

2020-9-18 21:39:03

hmm

samth

2020-9-18 21:39:59

Or, that’s not quite right, but certainly neither the definition nor the expansion of struct would be allowed

samth

2020-9-18 23:42:30

Due to license issues, the only fix for the former issue is to change the URL in the source

hj93

2020-9-19 00:26:06

Hello, if you are interested, you should check out the linear algebra interpreter I am developing. It’s not finished, but you can play around with some of it. https://github.com/Jobhdez/Linear-algebra-interpreter/blob/master/interp-linear.rkt

anything

2020-9-19 02:07:03

I refactored the code. Now the web server serves just a plain string and doesn’t run any update procedure. The update procedure is a separate program run by cron; all it does is download the data from the source, and at the end of the download it atomically updates the database (read by the web server). The whole thing is a lot simpler now, and it should be very easy to spot the problem, or the problem will go away completely. I’m probably all set now. Thanks so much for your technical and moral support. I appreciate it.