
Is there an easy way to background a shelled-out process in Racket? Currently, I’m using system, but adding a & to the end of the command does not behave the same as in Bash: Racket still blocks, waiting for completion.

subprocess?

Hmmm, I was hoping you wouldn’t say that :stuck_out_tongue: I’m trying to avoid having to close ports and so forth.

there may be a better option — I’ve just used subprocess before and it worked well

(I may have used process — can’t remember…)

My use case is I have a bunch of commands to run in a loop, and they’re all independent. Moreover, I don’t care about the outputs; I’m running them for their side effects (creating files on disk). I wrote a for loop which calls system, and it worked fine, but it was a waste of time waiting for the things to go serially. Then I wrapped the thing in (thread (λ () ___)). It works, but I was looking for a more principled solution.

Of course, subprocess would work just as well, but then I’d have to juggle the many ports I don’t care about.
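(For reference, the port juggling being avoided would look roughly like this; a sketch, with /bin/ls -l standing in as a placeholder command:)

(define-values (proc p-out p-in p-err)
  (subprocess #f #f #f "/bin/ls" "-l"))
;; We don’t care about any of these pipes, but we still have to close them.
(close-input-port p-out)
(close-output-port p-in)
(close-input-port p-err)
(subprocess-wait proc)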

Ideally, there’d be a for/parallel, which uses a thread pool and all.

But I couldn’t find such a thing in the stdlib.

would a for/thread be a sensible for-form?

it would have to do something with (maybe throw away…) the thread descriptors… that seems… not ideal

I’m running this in DrRacket, so even throwing away the thread descriptors works, because the main thread sticks around anyway. But, to make for/thread (or for/parallel, as I called it) also work on the command line, it’d map thread-wait over the thread descriptors. The whole for form would block. While I’m here wishing things would exist, for/parallel would have a thread pool to avoid starvation.

would for/thread want to use exit-handler to update what should happen on exit, then? (I have no idea if that’s a sensible suggestion, BTW — we’re way beyond my area of experience at this point, lol)

I guess the whole for/thread form could avoid blocking if it used exit-handler. But I wasn’t even this ambitious. I’d be happy if the for form itself blocked, but iterations ran in parallel.

In fact, I’d say blocking the whole form would be even more useful, because then it’d be easy to construct parallel pipelines with the necessary contention points. For example, if I have a directory full of PDFs I want to process,¹ I can sequence a bunch of for/parallel forms and it’d never try to process a file before it’s ready.
¹ Oh, look at that! A directory full of PDFs is exactly what I have here :slightly_smiling_face:

@leafac a thread pool probably wouldn’t do what you think it does most of the time

If you want, you can create a new custodian and use it when creating your processes, then call custodian-shutdown-all to close any dangling ports all at once.
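Something like this, as a sketch (the command strings are just placeholders):

;; Create the subprocesses under a fresh custodian; the pipes they get are
;; registered with it, so one shutdown call closes them all.
(define cust (make-custodian))
(define procs
  (parameterize ([current-custodian cust])
    (for/list ([cmd '("cmd-a" "cmd-b" "cmd-c")])
      (process cmd))))
;; Wait for each subprocess (the fifth element is its control procedure).
(for ([p (in-list procs)])
  ((list-ref p 4) 'wait))
;; Close all of the dangling pipe ports at once.
(custodian-shutdown-all cust)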

But, as @notjack says, I don’t think a thread pool makes sense in Racket (since Racket threads are green threads that all run on the same OS thread).

Also the Racket process isn’t the one doing the work in this case. Racket has no idea how many OS threads or how much memory the bash subprocesses are using and logically it shouldn’t, so there isn’t really a sensible way for Racket to properly manage the resources consumed by those subprocesses.

Racket’s the client, not the server. Client load control is usually done via throttling / rate limiting.

so instead of “run these tasks on X threads”, think of it as “make no more than X requests per second”

Right… which is why, if you truly want to run a bunch of subprocesses in parallel, Racket’s threads are good enough. They’ll create a bunch of subprocesses concurrently, and the subprocesses will actually run in parallel (being, well, separate processes).

You could just do (apply sync (for/list ([i (in-range 10)]) (thread (lambda () (system "blah"))))) to run a bunch of blah commands in parallel and wait for them all to finish.

a throttling package would be a useful thing to have

Err… no, (apply sync ....) isn’t right, since that would wait for any one to finish. You would want (for-each sync ....).
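So the earlier snippet, corrected, would be something like this (same placeholder blah command):

;; for-each sync waits for every thread; apply sync would return after any one.
(for-each sync
          (for/list ([i (in-range 10)])
            (thread (lambda () (system "blah")))))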

You could implement throttling pretty trivially with a single semaphore.

hey lexi didn’t you write a for/async macro and stick it in some package?

no

must be thinking of something else then

I’m probably misunderstanding something about thread pools. My idea was that if I had to run 1000 processes, I might not want to run them all simultaneously, but only have, say, 100 active at a time. How would this go wrong?
How is (for-each sync ___) different than (map thread-wait ___)?

@leafac It isn’t really, except that for-each discards the results.

@leafac because each process could spawn any number of its own threads

you don’t want to only have 100 processes active at one time, you want 100 OS threads active at one time

so limiting the processes may indirectly work, or it might not

If you wanted to throttle to n things running at a time, you could just create a semaphore with (make-semaphore n) and wrap each system call with call-with-semaphore.

(let ([s (make-semaphore 5)])
  (for-each thread-wait
            (for/list ([i (in-range 10)])
              (thread (thunk (call-with-semaphore
                              s (thunk (system (~a "sleep 1 && echo " i)))))))))

Oh, right. I had (void (map ___)). That was naïve :stuck_out_tongue:
In any case, sync is equivalent to thread-wait in my use case, as I understand from reading the documentation. Is this right?

yes, that’s right

Yes, IIRC sync and thread-wait do the same thing on threads.

:+1: Thanks.

I’m learning so much from this conversation. I just read about thunk; it’s great!

Alexis, you nailed it! That’s the implementation for the feature I wanted to find in the stdlib.

Personally, I think it’s small enough and easy enough to build out of existing, composable pieces that there doesn’t need to be a separate abstraction for it. But YMMV.

I understand @notjack’s point about OS threads, but, in my use case, it doesn’t matter.

in the stdlib I’d want something where doing each for loop iteration concurrently and throttling are separate concerns

Isn’t that what the semaphore and the use of thread are? Separate concerns?

ugh for/async in racket/future is not well named :/

no I mean

there’s more ways to throttle than just “x concurrent tasks”

How about a for/parallel, with an optional #:pool-size argument?

for/thread / for/future / for/place would be the names I’d want

@leafac The thing here is that you aren’t making a pool of workers and distributing work over them. That isn’t really what you want from Racket’s threading model. Threads are cheap; feel free to make a million of them. Rather, you want to limit the number of processes you actually spawn at a time.

A “thread pool” generally means you have a pool of workers, which isn’t the case here.

yes, I wouldn’t want any for forms to have any knowledge of pooling tasks or other kinds of resource control

there’s too many different ways to do it

@notjack, you’re great, but I remain unconvinced that you do not overengineer these things. :)

@lexi.lambda maybe I’m just planning for a different future :p

I’m very much thinking with web servers in mind right now

from left field: would it be for/threads (i.e. plural)?

XD

¯\_(ツ)_/¯

My intent is: (1) given (for ___ (system ___)), I can make this run faster just by replacing for with for/<your-favorite-name-here>; and (2) this doesn’t freeze my DrRacket because it tried to spawn a thousand subprocesses. If threads are cheap, then something else is making DrRacket hang. Do you know what it could be?

I guess a for/thread(s) could be useful, but it really would just be a tiny abbreviation for wrapping the body of a use of for/list in (thread (thunk ....))…
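A minimal sketch of what such an abbreviation might look like (hypothetical name and behavior; assumes #lang racket, which provides thunk and ~a):

;; Hypothetical for/thread: like for/list, but runs each body in its own
;; thread and returns the list of thread descriptors.
(define-syntax-rule (for/thread clauses body ...)
  (for/list clauses
    (thread (thunk body ...))))

;; Usage: spawn the commands concurrently, then wait for all of them.
(for-each thread-wait
          (for/thread ([i (in-range 10)])
            (system (~a "echo " i))))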

arguably, there’s a lot of tiny abbreviations in the for/* forms

Everything is just for/fold at the end of the day. :)

(…which in turn is just named let, which is just recursion…)

a fold a day keeps the object oriented programmers away

please don’t call it for/threads… call it ~parallel-for~ (edit: maybe thread-comprehension?) at least, or make a map-reduce package

Moreover, consider the case (1) above. I don’t want to think about threads or thunks, I just want my typesetting tasks to go faster :slightly_smiling_face:

it’s not parallel though

@ben but it isn’t parallel

async-for?

what’s wrong with for/threads

out of curiosity

@leafac you have to think about threads and the difference between IO and CPU tasks to do this without getting weird unexpected results though

because for/X means X is the accumulator; with for/threads the threads aren’t really the accumulator

Hmm, this is similar in spirit to & in Bash. And I don’t think about IO or CPU when using that.

@leafac I think Racket’s threading model is relatively simple compared to most languages, and I think it’s worth learning. So much of the benefit of FP is building big things out of small, composable things. I think big abstractions (looking at you, LOOP) are usually only desirable when either incredibly common or too hard to build out of smaller pieces.

& in bash is basically thread in Racket.
@leafac in Bash nearly all tasks ever are IO tasks, so bash can provide a simpler model of parallelism without breaking too many unstated assumptions

a general-purpose form for Racket would not have that luxury

@ben that is not strictly the case (regarding accumulators)

@ben I wish that were the case, too, but that’s already sort of been broken in the stdlib.

there are lots in the standard lib that break that convention

part of the problem is that what Racket calls threads are green threads, which are completely and entirely different from OS threads, making "thread" a highly overloaded term

I really like Racket’s concurrency model. I wish the parallelism story were better, but I think the concurrency model is pretty easy to understand and work with, more so than most other languages I’ve used.

yeah racket concurrency is really pleasant to work with, and I can generally get a feel for how I would do complex things with it

Oh, I find my for → for/<make-this-faster> argument so good. It’s a shame no one else here is falling for it :stuck_out_tongue:
Of course, your concerns make sense when writing serious programs. But I’m writing a script to typeset my Redex models. This is not rocket science :stuck_out_tongue:

true. I think a for/<make-this-faster> form would be really great for something like Rash or one of those other “racket-but-for-shell-script” hashlangs

maybe even it should be the default in that sort of context

@leafac I think I would agree with you if Racket threads ran in parallel, but they don’t. So using threads without understanding what they’re doing would likely be confusing, and making that too accessible might be misleading.

Having a parallel-map sort of feature is a great one. But Racket can’t really have that (barring futures, which I am convinced are not currently especially useful outside of extraordinarily specific scenarios).

Of course, Chez changes things.

If you had a for/make-this-faster, you would get confused people writing (for/make-this-faster ([i (in-range 100)]) (some-cpu-bound-racket-computation!)) complaining on the mailing list.

actually with real threads, that would be faster but IO bound tasks would be slower

I don’t understand Racket’s parallelism/concurrency models. I don’t even understand why Racket threads don’t run in parallel.¹ But I added the (for-each thread-wait (for/list ___ (thread (thunk ___)))) thing and it did make things go faster.² That’s more than I could’ve asked for :slightly_smiling_face:
¹ Yes, I’ll read the documentation and learn about this, now that I’m curious. ² At the cost of making DrRacket hang for approximately 10 seconds.

@notjack That’s essentially my point.

@lexi.lambda oops I thought you meant in the context of real threads run on chez or something

No, I was talking about a theoretical for/async-y thing using green threads.

gotcha

DrRacket should show me a progress bar when working through the for/make-this-faster :stuck_out_tongue:

@leafac As for why Racket threads don’t run in parallel, I think a large part of it is that Racket was not designed to be made parallel, and making it parallel turned out to be unreasonably hard. :)

There’s a talk mflatt gave a while back (at Mozilla, IIRC?) about some of the history around trying to make Racket parallel.

It has some sort of Global Interpreter Lock (GIL)? I remember that from my days of Ruby…

@leafac racket doesn’t have a GIL


@notjack racket might as well have a GIL

:+1: I’ll watch this later, thanks.

@lexi.lambda how so?

I’m being a little facetious, but I’m saying Racket’s current runtime is not really able to run in parallel, whether it has an actual GIL or not, so the result from the user’s POV is more or less identical

what I’m really saying is that responding to people who ask “does Racket have a GIL?” with “no” is technically correct, the best kind of correct

but what they’re really asking is “can I run my Racket in parallel?”

and the answer is “not in the way you’re probably hoping for”

I think no parallelism with IO in terms of green threads is a really different model than no parallelism with IO in terms of OS threads, and the latter is why a GIL is so painful - in the former a GIL isn’t nearly as big an issue and users won’t think of those two cases as the same thing

you can have the latter with a GIL

you have both with and without a GIL - what I mean is that it’s only the latter where a GIL is a huge problem for almost all users. In the former, a GIL is a problem much less often to the point where most users, even those doing complex things, wouldn’t need to care

hmmm, I think I said that wrong

I’m not sure I understand why the latter is made any more complicated by a GIL

y’know I’m not sure what my point was anymore

alright

I think a Scribble book about general concurrency and parallelism concepts and where Racket’s model fits in would be a great thing to have in the docs

(require ffi/winapi) (win64? ) What argument is (win64?) wanting? I would figure it was (= (system-type 'word) 64)

@mtelesha in the docs, it looks like win64? isn’t a procedure: http://docs.racket-lang.org/foreign/winapi.html?q=win64%3F#%28def._%28%28lib._ffi%2Fwinapi..rkt%29._win64~3f%29%29

@lexi.lambda, @notjack: Just got a chance to get back to this. Thanks a lot for the conversation. I certainly learned a lot from you :slightly_smiling_face:

:D

@jaz I understand it’s not a procedure, but I don’t understand what argument (win64? ) is looking for.

@mtelesha what @jaz is saying is that win64? is not a function, so it does not accept any arguments. it is a value. win64? is either #t or #f. I don’t think your question makes very much sense.
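For example (a minimal sketch), you’d just use the value directly:

(require ffi/winapi)

;; win64? is a boolean constant, not a procedure, so there is nothing to call.
(if win64?
    (displayln "64-bit Windows FFI conventions")
    (displayln "32-bit (or non-Windows) conventions"))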

@lexi.lambda Well, I should say I expected (win64?) to return #t or #f. I am answering from the point of stupid. How are you supposed to use (win64?)?

don’t put any parentheses around it. just write win64?.

> win64?
#f

I was correct, I was from the point of stupid :slightly_smiling_face:

hi folks… maybe especially @stamourv — what are the lunch options near the venue on Saturday?

@apg walk west to University Ave (just called “the ave” if you ask for directions) and take your pick… there is a ton. Probably stuff on campus too, but it won’t be half as good.

what do you like?

@zenspider ah. thanks! I wasn’t sure if there was a recommended campus spot, or if people would just self-organize and find something.

I’m vegan, so just trying to plan ahead.

there’s many options for you… go north on the ave for explicitly vegan options. Pizza∏ and others nearby…

there’s at least 2 other vegans attending

Pizzaπ isn’t as good as it used to be.

and is fairly far from the venue, no?

it’s a short walk

maps say 22 minutes from Mary Gates Hall.

(doable, sure)

there’s a korean place closer to our venue that I love… don’t know the name, but it’s about 42nd just off the ave

@zenspider @ben thanks for the info. :stuck_out_tongue:

Araya’s Place is also popular w/ my vegan friends

and closer

I guess if I play my cards right, I might even be able to go to Full Tilt for ice cream.

Araya’s is good food.

apg: Food will be provided, including vegan options.

Oh! Even better! Thanks, @stamourv !