
Working on section 4.2 now.

@pravnar @rjnw how are we doing for benchmarks/plotting?

I coded up the sampling+plotting infrastructure for the Gibbs benchmarks. These Gibbs samplers have knobs that need to be tuned (per benchmark) in order to zoom in on interesting parts of the plots. I am going to experiment with various knob settings now (starting with GmmGibbs), which involves repeatedly invoking a ‘tune->sample->plot’ pipeline.
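For anyone curious, the ‘tune->sample->plot’ loop can be sketched roughly like this. This is just an illustrative driver in Python; the knob names and stage functions are invented stand-ins, not our actual infrastructure:

```python
# Hypothetical driver for the tune -> sample -> plot loop described above;
# the knob names ("sweeps", "trials", "stepSeconds") and the stage
# functions are invented stand-ins, not the project's real code.

def tune(knobs, **overrides):
    """Return a new knob setting with some values overridden."""
    return {**knobs, **overrides}

def pipeline(knobs, sample, plot):
    """One iteration: sample under `knobs`, then plot the samples."""
    samples = sample(knobs)
    plot(samples)
    return samples

defaults = {"sweeps": 500, "trials": 10, "stepSeconds": 1.0}
samples = pipeline(tune(defaults, sweeps=1000),
                   sample=lambda k: ["..."] * k["sweeps"],
                   plot=lambda s: print(len(s), "samples plotted"))
```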

I implemented the same knob stuff for our benchmark, and I’ll try to have the remaining three running ASAP. I already added the times for clinical trial and linear regression to the output folder. Those times are in microseconds

I think we should format all times in milliseconds, does that seem reasonable? or is it better to use microseconds instead of fractions of milliseconds?

Well, mine is even in fractions of microseconds; I time in nanoseconds.

Sure

But I don’t think the scale matters, and ms is pretty standard
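Since the raw timings come in different units (nanoseconds from one runner, microseconds in the checked-in output), a tiny helper could normalize everything to milliseconds before plotting. A minimal sketch; the unit tags and helper name are made up for illustration:

```python
# Hypothetical helper: normalize raw benchmark timings to milliseconds.
# The unit tags mirror the chat (ns from one runner, us in the checked-in
# output); nothing here is the project's actual code.

UNIT_TO_MS = {
    "ns": 1e-6,   # nanoseconds  -> milliseconds
    "us": 1e-3,   # microseconds -> milliseconds
    "ms": 1.0,
    "s": 1e3,
}

def to_ms(value, unit):
    """Convert a timing `value` expressed in `unit` to milliseconds."""
    return value * UNIT_TO_MS[unit]

print(to_ms(1_500_000, "ns"))  # 1.5
print(to_ms(2500, "us"))       # 2.5
```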

@pravnar you can probably catch me and @ccshan after my lecture

Cool! Thanks

I’ve just pushed a decent version of section 4.2.

Is now a decent time to edit section 5?

Sounds good, @carette

@rjnw @pravnar what units are the checked-in benchmarks in?

It sounds like the rkt output is all in microseconds

I was talking to Praveen; it seems he needs to change some things in the GmmGibbs benchmark

So we are working on that.

how long does it take to run a benchmark, just ballpark?

hmm, the clinicalTrial total was around 1 min

but the gmm ones maybe longer

ok

I am not sure about haskell though

and are those numbers indeed in microseconds?

for now yes

ok

I am using get_clock for getting process time

with nanosecond precision

I added that to the readme
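For comparison, the same kind of per-process CPU timing with nanosecond precision can be sketched in Python's standard library. This is only an illustration of the idea, not the actual get_clock call the benchmark uses:

```python
# Sketch of process-CPU-time measurement with nanosecond precision,
# analogous to (but not the same as) the get_clock call mentioned above.
import time

def time_process_ns(thunk):
    """Run `thunk`; return (result, elapsed process CPU time in ns)."""
    start = time.process_time_ns()
    result = thunk()
    elapsed = time.process_time_ns() - start
    return result, elapsed

result, ns = time_process_ns(lambda: sum(range(100_000)))
print(ns, "ns =", ns / 1e3, "us")
```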

@carette, we hope you’ll be done with the histogram section soon, and then one helpful thing is if you could shorten the simplification section by 1.5 pages, such as by removing some intermediate steps in the longer equational derivations that include scary integral signs.

for people not in Lindley, I wrote an Intro

@ccshan I am as done as I can be - someone else needs to read it and give feedback before I can meaningfully improve it more.

I can indeed work on shortening the simplification section. 1.5 pages is a tall order! I’ll see what I can do.

Ok let me read it over…
In the context of Section 4, the index variable of the sum that we want to turn into a histogram is not i but j. (See for example the displayed equation above Section 4.1 and the line above that.) Switching to i in Section 4.2 is jarring, so i should be renamed to j.

Can do.

(I will be a little delayed, my furnace went out, and it’s cold out here. Repairman is here, but…)

Will do that now-ish, in fact, as that is important to get right. Shortening can be done by many people.

Another thing: the word “piecewise” is undefined.

It is only used in the text, as a word which I thought was well understood. I guess I could spell it piece-wise?

I did not mean ‘piecewise’ in the sense of Maple, but in the sense used in mathematics.

It is a bad sign that you use “piecewises” as a noun. But to take a more concrete example, I have trouble understanding the sentence “The first rewrite takes all piecewise-defined functions which do not depend on the summation variable i, and translates them to a Fanout.”
What “first rewrite”? (Maybe collect all the rewrite rules into a figure? Currently the enumerate items 3 and 4 are especially ugly.)
What functions? (There are a lot of functions around, so maybe say what piecewise-defined functions there are in the simple example above Section 4.1?)
If “this is implemented via term rewriting” then I expect rewriting rules. I see equations and I’m not sure how to interpret them as term rewrites — for example, enumerate item 1 seems to contain a where clause that invokes “histogram” as a function.

"[0..n-1]" is a Haskellism to be avoided

There is already a symbol for natural numbers I prefer; it’s \iplus

The right hand side for the Split needs a right paren?

Could you perhaps give a go at dealing with some of that? I’ve started the shortening process of the simplification section.

Also, my LaTeX-fu is weaker than yours, so it would take me a very long time to put the ‘rewrites’ into a proper figure. [I do agree I should have used some arrow rather than =, I insufficiently translated from the markdown]

I’m still in Section 2 (after helping Praveen with coercion problems that would have entertained you), so please do your best to address the above issues by revising the text. It’s not just about the LaTeX. Feel free to stick a bunch of $$s into a \begin{figure} or \begin{figure*} and tell me to line them up.

it’s barely colder in Hamilton than in Bloomington :wink:

A tiny bit colder in Waterloo than Hamilton, but surprisingly similar to Bloomington.

Ok, I’ve shortened section 3. I think by about a page. I will now go back to 4.2 and try to deal with as many of the above as I can.

@ccshan What’s the status of section 2?

@rjnw What’s the benchmark status? I see that your commit messages suggest things are working

yes I am running gmmgibbs right now

just debugging the printer

will push the results in a few minutes

great!

@samth Still writing, active progress on Section 2.

I have improved 4.2 (piecewise gone, a few other things). I’ll continue on this tomorrow morning.

I wrote Section 2. The paper is connected now! Lots of smoothing may commence.

woo

@ccshan if you can fix the citations in section 1, that would help with editing for space

ok i’ll do that

predicting the impact of medical treatment~\cite{?}
??!

is the clinical trials model not based on anything real?

probably not; it looks pretty contrived

ok

but I’m sure I can find something

maybe some better example is needed then

also there’s a … in the intro which I couldn’t fill in (in the summary of sections)

but maybe that should all move into 2?

Maybe it all should. I’m just collecting a bunch of stuff

section 2 looks quite nice tho

I think that’s it for me tonight, and I’ll edit/help integrate benchmark data tomorrow

@rjnw hopefully you’ll have plottable data by tomorrow morning?

oh yeah

my computer just ran out of RAM because I had the output file open; I had to restart :stuck_out_tongue:

@rjnw are you planning on running benchmarks to produce gmmData tonight?

I did already

I ask because Ken and I changed the model a little :confused:

when?

the commit you made today?

I think I incorporated that

No, earlier tonight. A change that I haven’t pushed yet

well hopefully it won’t be a change in arguments and inputs

then I don’t have to do much

There’s juuust one more argument :slightly_smiling_face:

And it’s only used in one place (standard deviation of the normal) so it shouldn’t be a difficult change

are you going to make any changes to naive bayes and lda?

I wish I knew

How about mauve?


@rjnw I don’t wanna stop you from getting your code to produce data for all the benchmarks. But fyi there is still some model/knob tuning I will have to do for Naive Bayes and LDA before we can use our runners once again to produce the final data to make pretty figures.

I’m currently tuning the GmmGibbs sampler and comparing it to JAGS.

Citations added

So… Praveen and I are thinking of going home, so we want to make sure that nobody is blocked on us. I understand that @carette is improving 4.2, @samth is smoothing text and directing benchmarking, @pravnar is scripting plots and adjusting knobs (currently for GMM), and @ccshan is reading and smoothing text. @rjnw, we’re not sure what you’re running. It’s certainly important that you be ready to run benchmarks, but the knobs for how many seconds, how many sweeps, etc. will have to be finalized with the help of preliminary plots. That said, are you blocked, and how might we help?

I have my own knobs which can be modified easily, similar to yours

I copied the semantics of gibbsSweep into my own util. So in the end I will use the same settings as yours

right now I am running the benchmarks with some default knobs I saw in gibbs, checking the time, and making sure everything runs

So have you found knob settings that produce interesting curves (like, with a diagonal part and not just jumping around the same accuracy)?

I haven’t plotted the curves

It sounds like the results from your current runs will at least serve as a first cut in plotting the curves and possibly adjusting the knobs

I am just running them and throwing the output into the output folder in the format we discussed

but I have some memory leaks to figure out as it uses my whole RAM

So, if you’re waiting, maybe it’s useful for you to look over the paper, especially sections 5, 1, 2.

I will

Hopefully a last resort for working around a memory leak is to restart the process more frequently
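That last resort could be sketched like this: run the benchmark as a child process and restart it every few trials, so the OS reclaims whatever leaked. A rough Python sketch; the command line and the `--trials` flag are placeholders, not the real runner's interface:

```python
# Hypothetical workaround for a leaky benchmark: restart the child
# process every batch of trials so the OS reclaims leaked memory.
# The command and the --trials flag below are placeholders.
import subprocess
import sys

def run_in_batches(cmd, total_trials, batch_size):
    """Run `cmd` repeatedly, `batch_size` trials per process invocation."""
    done = 0
    while done < total_trials:
        n = min(batch_size, total_trials - done)
        subprocess.run(cmd + ["--trials", str(n)], check=True)
        done += n

# e.g. 10 trials, restarting the (here trivial) process every 2:
run_in_batches([sys.executable, "-c", "import sys; sys.exit(0)"], 10, 2)
print("all batches done")
```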

A preliminary comparison

haha that is there


Wow ok

@pravnar is there a script and documentation somewhere to make these plots?

Already I’m wanting some kind of error bar or shading representation of how wide the spread is among the 10 trials per backend
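One simple way to get that spread: collapse the 10 trials per backend into a mean and sample standard deviation, which plots.r (or matplotlib) could then draw as error bars or shading. A hedged sketch with invented data and field names:

```python
# Hypothetical aggregation of per-backend trial accuracies into
# mean +/- sample standard deviation, suitable for error bars or
# shading. The backend names and numbers below are invented.
from statistics import mean, stdev

def spread_by_backend(trials):
    """trials: dict of backend name -> list of accuracies (one per trial)."""
    return {b: (mean(xs), stdev(xs)) for b, xs in trials.items()}

demo = {"hakaru": [0.91, 0.93, 0.92], "jags": [0.88, 0.90, 0.86]}
for backend, (m, s) in spread_by_backend(demo).items():
    print(f"{backend}: {m:.3f} +/- {s:.3f}")
```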

Yeah there is a script called “plots.r” in /output/

@rjnw I will be pushing some updates to that script. My idea is that we run accuracy computers (such as runners/hk/GmmGibbs/Accuracy.hs) to produce data that this script can handle.

And Praveen I think you have some updated hssrc etc files to push?

Yeah

Thanks for the reminder. I will push those too

So eventually Rajan should use the new typechecker

Which means pulling from hakaru repo, building it, and updating his maple archive, yes?

I do that from time to time

whenever there are new changes in hakaru repo

Here’s another quick plot. The only change here is that we take many more snapshots per trial. In other words, I reduced the “stepSeconds” parameter.
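In other words, "stepSeconds" acts as a snapshot interval: keep sweeping, but record an accuracy snapshot whenever that much time has elapsed, so a smaller value gives more points per trial. A toy sketch of that loop; `sweep`, `accuracy`, and the knob names here are guesses, not the real runner:

```python
# Toy sketch of a snapshot loop driven by a stepSeconds-style knob:
# keep sweeping, and record a snapshot each time step_seconds elapses.
# sweep() and accuracy() are stand-ins for the real sampler's stages.
import time

def run_with_snapshots(sweep, accuracy, total_seconds, step_seconds):
    """Return a list of (elapsed_seconds, accuracy) snapshots."""
    snapshots = []
    start = time.monotonic()
    next_snap = step_seconds
    while (elapsed := time.monotonic() - start) < total_seconds:
        sweep()
        if elapsed >= next_snap:
            snapshots.append((elapsed, accuracy()))
            next_snap += step_seconds
    return snapshots

# Smaller step_seconds => more snapshots per trial:
snaps = run_with_snapshots(lambda: None, lambda: 0.9,
                           total_seconds=0.05, step_seconds=0.01)
print(len(snaps), "snapshots")
```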

oooh aaah
@pravnar commented on @pravnar’s file https://racket.slack.com/files/U7Z0QHKJR/F8188QHV0/output.pdf: This is for the gmm benchmark, just 10 trials per backend