aymano.osman
2019-9-12 13:50:32

looking forward to it


zacromero3
2019-9-12 16:08:16

Hello everyone, I have a quick question about web-server performance. I’m pretty new to using the web server, so I may be missing something obvious. The problem is to make a server that receives a request with URLs in its query parameters; it should then perform a GET on each of those URLs and aggregate the results. The trick is that, no matter what, a response should be sent within 500ms.


zacromero3
2019-9-12 16:08:42

Here is what I came up with
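Roughly, the shape is something like this (simplified sketch; `fetch` and the output format are stand-ins, not my exact code):
```
#lang racket
;; Sketch: GET each URL passed in the query string, wait at most 500ms,
;; and report whatever finished in time.
(require web-server/servlet
         web-server/servlet-env
         net/url)

;; GET one URL and return its body as a string, or #f if anything fails.
(define (fetch u)
  (with-handlers ([exn:fail? (λ (_) #f)])
    (port->string (get-pure-port (string->url u)))))

(define (handle req)
  (define urls (filter string? (map cdr (url-query (request-uri req)))))
  (define results (make-vector (length urls) #f))
  ;; One thread per URL, then wait at most 500ms for all of them to finish.
  (define workers
    (for/list ([u (in-list urls)] [i (in-naturals)])
      (thread (λ () (vector-set! results i (fetch u))))))
  (sync/timeout 0.5 (thread (λ () (for-each thread-wait workers))))
  (response/xexpr
   `(html (body (p ,(format "~a of ~a fetches completed within 500ms"
                            (vector-count string? results)
                            (length urls)))))))

(serve/servlet handle #:servlet-path "/" #:command-line? #t)
```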


zacromero3
2019-9-12 16:10:32

And the performance behavior is that when hitting the server with lots of requests, it misses its 500ms deadline every 7 seconds or so. Is this due to garbage collection?


soegaard2
2019-9-12 16:13:03

@zacromero3 To test the theory that the garbage collector runs every 7 seconds, you can make the garbage collector log when it is activated (can’t remember any details on how to do it).
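From memory (so the details may be off), something like starting the server with PLTSTDERR="debug@GC", or installing a log receiver for the 'GC topic in the program itself:
```
;; Sketch (assumes #lang racket, placed near the top of the server module):
;; print every message the collector logs (topic 'GC, level 'debug).
(define gc-receiver (make-log-receiver (current-logger) 'debug 'GC))
(thread
 (λ ()
   (let loop ()
     (define msg (sync gc-receiver))       ; #(level message data topic)
     (eprintf "~a\n" (vector-ref msg 1))   ; the formatted GC message
     (loop))))
```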


zacromero3
2019-9-12 16:17:15

Good idea! It looks like Racket logs that info automatically and I just have to set the log level to debug. I’ll look into this.


samth
2019-9-12 16:26:08

@zacromero3 that’s also my guess. You might consider using the incremental collector, which would probably avoid those spikes.


zacromero3
2019-9-12 17:00:13

Thanks. I tried this, and it does seem to lower the maximum response times when the server is performing simple calculations and not making requests. Strangely, though, the incremental collector seems to throw the original code above into some kind of feedback loop.


samth
2019-9-12 17:20:42

yikes! did you include calls to collect-garbage when you used the incremental collector?


zacromero3
2019-9-12 18:24:43

I wasn’t. It looks like calling it on every request breaks things altogether. Are there standard ways to integrate calls to collect-garbage into an app? I tried putting it in a thread and calling it every 50ms, but that didn’t seem to help.


chris613
2019-9-12 19:27:20

So in megaparsack there is parse-string, but is there an easy way to parse from a file without simply reading the whole file into a string?


chris613
2019-9-12 19:27:59

The parse works on syntax-boxes, but I’ve no idea what one of those would be :slightly_smiling_face:


samth
2019-9-12 19:45:37

@zacromero3 roughly, you want to call (collect-garbage 'incremental) every time you finish processing a request
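Something like this (`build-response` standing in for whatever the handler already does):
```
;; Sketch: hint the incremental collector after each request instead of
;; forcing a full major collection. `build-response` is a placeholder.
(define (handle req)
  (define resp (build-response req))
  (collect-garbage 'incremental)  ; a hint to do incremental work, not a full GC
  resp)
```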


samth
2019-9-12 19:46:08

If that breaks everything, that’s a bug and should be reported (maybe @mflatt will have other thoughts), and I would be interested in the output of GC logging.


zacromero3
2019-9-12 20:45:10

@samth Oh, I see. I was calling (collect-garbage) every time. Calling (collect-garbage 'incremental) after processing a request did greatly improve the results, giving far fewer spikes. Thanks!


soegaard2
2019-9-12 20:57:52

The distance between the spikes is exactly 60 seconds.


zacromero3
2019-9-12 22:10:38

That’s a good catch! After digging into this more, these peaks have to do with the #:initial-connection-timeout setting, whose default is 60 seconds. Setting it to 120 seconds on serve makes the spikes occur every 120 seconds. It doesn’t make much sense why it’s behaving this way, though.
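Roughly, the relevant call looks like this (sketch with a placeholder handler, not my exact code):
```
#lang racket
;; Sketch: bumping the connection timeout on the low-level `serve` API.
(require web-server/web-server
         web-server/http
         (prefix-in lift: web-server/dispatchers/dispatch-lift))

;; Placeholder request handler.
(define (handle req)
  (response/xexpr '(html (body "ok"))))

(serve #:dispatch (lift:make handle)
       #:port 8080
       #:initial-connection-timeout 120) ; default is 60 seconds
(do-not-return)
```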


soegaard2
2019-9-12 22:39:22

Maybe the web-server calls collect-garbage?


mflatt
2019-9-13 00:44:57

@mflatt has joined the channel


samth
2019-9-13 00:47:29

@zacromero3 have you turned on GC logging? My guess is those are major collections triggered by something


lexi.lambda
2019-9-13 03:19:26

@chris613 There isn’t, sorry. I meant to add it but never got around to it. The current strategy is a little embarrassingly inefficient, as it represents the entire file in memory using a linked list of characters. It’s a bit disastrous if you want to do anything high-performance, but I originally wrote it because I wanted a parser combinator library for parsing #langs, and the bottleneck is definitely not in the parsing there.
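For now the workaround is just the obvious one, something like this (`my-parser/p` being whatever parser you already have):
```
;; Workaround sketch: slurp the file and hand it to parse-string.
;; `my-parser/p` is a placeholder for an existing megaparsack parser.
(require megaparsack megaparsack/text racket/file)

(parse-result!
 (parse-string my-parser/p (file->string "input.rkt") "input.rkt"))
```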


lexi.lambda
2019-9-13 03:22:23

It would be very valuable to have a streaming interface to Megaparsack, but it currently doesn’t exist. It could probably be done on top of Racket ports by using peeking whenever it’s possible to backtrack. Parsec-style parsers in general tend to have the flaw that it’s easy to accidentally keep the whole stream in memory due to the backtracking semantics, but if you’re very careful about how you use try (try/p in Megaparsack), it’s possible to write efficient streaming parsers.
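To illustrate the try/p point, a tiny made-up example:
```
#lang racket
(require megaparsack megaparsack/text)

;; Without try/p, or/p will not consider the second branch once the first
;; one has consumed input; wrapping the first branch in try/p allows
;; backtracking, at the cost of retaining the consumed input until that
;; branch either succeeds or fails.
(define keyword/p
  (or/p (try/p (string/p "lets"))
        (string/p "let")))

(parse-result! (parse-string keyword/p "let")) ; => "let"
```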