
@rjnw \ks{Just to be sure, are all these numbers (5 means and 5 standard errors) in milliseconds (not seconds)? So with full optimization we can run 3921 sweeps per second?}

There are now just two unaddressed PLDI comments — one for @ccshan and another for me to try to address. Since it concerns signposting throughout (because we’re not reorganizing this paper at this point!), what I’ll do is start reading the paper from the start, tweaking things as I go. I expect this to go quickly and smoothly for the first 4 sections, as they’ve been gone over quite a lot. I’ll then slow down through sections 5 and 6.

I’ll commit+push often, to minimize merge hell. But if someone is working on a particular section, please say so; no need to cause unneeded merges.

I’m thinking we should remove the radar chart. If we had been able to rank 2–3 other systems on that chart along with the current system, it would make sense. On its own, it seems gratuitous.

@ccshan these numbers are ms per sweep, they are calculated by just doing one update though

wait let me check again

seconds per sweep

so with full optimizations it is around 2.5 seconds per 10 sweeps, which is about the same as in fig 13

@rjnw I’ve edited 5.1 quite a bit (and still at it). Please read asap to make sure I have not changed the meaning of things in bad ways.

I did, looks good to me. I will keep an eye on the changes.

That was 5 intro. I just, just checked in 5.1.

roger

So one thing about loop fusion I wanted to mention but I don’t think I did: fusing loops doesn’t guarantee a performance benefit, since it sometimes disturbs locality of reference. But in our case most of the loops in Hakaru iterate over a specific array. If you see an opportunity to add that somewhere, it might be helpful.

when we fuse loops we match not only the bounds but also how those bounds were evaluated, i.e. whether they use the same array’s size.
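
[A minimal sketch of that situation in plain Haskell (illustrative only; this is not Hakaru’s IR or the actual fusion pass): two loops whose bounds both come from the same array’s length, fused into a single traversal, so locality improves rather than degrades.]

```haskell
import qualified Data.Vector.Unboxed as V

-- Unfused: two loops over the same array xs; both bounds are
-- "length xs", i.e. they were evaluated from the same array.
sumAndSumSq :: V.Vector Double -> (Double, Double)
sumAndSumSq xs = (V.sum xs, V.sum (V.map (\x -> x * x) xs))

-- Fused: a single traversal advancing both accumulators. Safe for
-- locality of reference precisely because both loops walk xs.
sumAndSumSqFused :: V.Vector Double -> (Double, Double)
sumAndSumSqFused xs = V.foldl' step (0, 0) xs
  where step (s, sq) x = (s + x, sq + x * x)
```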

I’ve added this information to what I’m about to push.

How much time does hk-maple take for something like our GmmGibbs or ClinicalTrial example? It runs much faster on karst than on my machine: ClinicalTrial takes 33 seconds using karst and has been running for the last 15 minutes

on my machine

I wonder if it has to do with mismatched versions of our Maple code and Maple itself. I know it can be slow, but not 15 minutes.

And remember that there’s Maple 2017.1 and 2017.2, not just 2017.

hmm

I checked the option to update when I installed

let me try again

You can get precise version info in Maple by saying version();

But also I don’t know when ppaml.mla was last updated on karst or on your machine

I was just about to ask about ppaml.mla versions!

is that different than running update archive?

No, it’s the same

I did that, but not much difference; still running after a minute

and version(); gives an error in Maple

That’s surprising. Can you supply a screenshot or photo?

hmm not this time,
```
> version();
User Interface: 1265877   Kernel: 1265877   Library: 1265877
```

karst is 1194701

I’ve finished a pass on 5.3 (which finished 5). Onto 6!

My version(); is 1247392 and it also is taking at least 100s to simplify ClinicalTrial.hk. Maybe @carette has access to older Maple versions to try?

Hmm, not easily. My Maple 18 on my laptop doesn’t start anymore because of some missing Java libraries.

And Maple 18 on my PC got wiped when I had to reinstall the OS because of, well, Windows 10 being Windows 10.

And I never actually got Maple 2016.

So Jacques, since you’re on Section 6, let me encourage you to revise it a lot — reorder as you wish — to highlight the punch line of “faster and more accurate” front and center. And you can at least leave room for us to say something about how long simplification takes on karst (we can describe both the Maple version and the shared hardware on which Maple runs over ssh).

I do have Maple 2017.1 though.

Do we have the equivalent of ClinicalTrial.hk as a Maple test?

Is that 1247392? It could be worth trying in case it finishes surprisingly within 1 minute

That would certainly make my life simpler as far as testing/debugging this.

No

Yes, that version.

@ccshan can you also try GmmGibbs? It takes around 8 minutes using karst for me

Where is this ClinicalTrial.hk file exactly? AFAIK we had 3 repos in use for this paper at some point…


@ccshan I’ll do a first pass on section 6 for wording, then see about bigger changes.

I’d suggest the opposite order but whatever helps

You can give file names from the testcode/hksrc directory directly to hk-maple with nothing else on the command line

I tend to read+edit. And section 6 is short. I want to get it all in my head first. Then I’ll see about structural changes.

ok

Then I’ll see about ClinicalTrial.hk, unless that’s blocking?

ClinicalTrial.hk was an example; I want more GmmGibbs timings with Maple

That’s what I want to compare to Jags in terms of startup time

Right

I’m building the most recent hk-maple now.

My GmmGibbs.hk simplification is taking at least 14 minutes

I’m a little sad to see that a few tests from NewSLOTest.mpl fail. Though none in PlateT.

I didn’t think anything changed in the Maple code in the last 3 months. I think the last changes were in September!

So, my cpu usage is not up. Maybe there’s a bug in communication with Maple?

I don’t understand your statement + question. CPU usage of what? Communication in what context? Local or to karst? Of what call exactly?

CPU usage on my laptop, where I’m timing hk-maple. Communication between the Haskell side and the Maple side. Local. hk-maple GmmGibbs.hk

@rjnw the caption to fig:gmm50 says “each on the line point is for accuracy”, and I don’t understand what that means.

So the communication bug would be that what is sent to Maple is different than it used to be?

That is possible. A few changes have been made to the Haskell side of things.

Or Haskell doesn’t wait for the result from Maple in the same way anymore, or…

“Each point on the line is at every 10 sweeps.”

Try a slightly older version? Say from when we were working on the PLDI version?

@rjnw Thanks.

slightly older version of Maple?

slightly older version of Hakaru, the Haskell side.

Maybe go back to hakaru commit 23e5a8e7f4c67b3c704fe00aefdeea1450007662 ?

okay let me try

so I have never had hk-maple finish when using local maple

I wait for 10–20 minutes

Do we have ‘microbenchmarks’ that are known to usually finish almost immediately?

I’ve used hk-maple on various tests (mostly of simplify) with no problems, locally. But not in a few months.

This finishes immediately for me:
```
p <~ beta(1,1)
weight(p^4*real2prob(1-p), return p)
```

[ClinicalTrials also uses disintegration — could something have changed there?]

hk-maple -c Simplify /tmp/x.hk

@ccshan this one works

@carette You don’t need \noindent after \end{enumerate} as long as you don’t put a paragraph break there

So for GmmGibbs.hk I think the next step is to take hk-maple --debug output and feed it into Maple directly, then debug/trace that in Maple.

I’m a little paranoid about \noindent, so I tend to put it in, just in case someone puts in an extra blank line.

I gave the output of hk-maple --debug directly to Maple. It uses a timelimit of 90; is that 90 seconds?

Yes

What’s supposed to happen is you get a timeout error after 90 seconds; then you add --timelimit=600 or whatever to the command line.

But if I don’t give that when using hk-maple, it doesn’t error out, which is weird; giving the debug output to Maple directly did error out

it does time out when I don’t give the 600 timelimit when using karst

So that seems like debugging progress. Maybe the way timeouts happen has changed… Maybe Maple no longer exits on timeout, so maybe the Haskell side is waiting for Maple to finish computing and the Maple side is waiting for more commands from the Haskell side

anyways now running with 600 timelimit directly using maple

If increasing --timelimit eliminates the hang, that’s definitely a bug in what I’d call Haskell-Maple communication, but it can be worked around by increasing --timelimit

okay it gave some output and quit, I don’t see any error in there

How long did it take?

roughly

hmm maple says 332

Is that 5m32s? I don’t know how Maple says these things.

I guess that is seconds

It would be bad if Maple simplified GmmGibbs.hk to “332” I guess

```
> quit
memory used=13283.2MB, alloc=711.8MB, time=332.40
```

oh no this is after the simplified output

Oh I see, that’s seconds. But yeah, it seems that timeout behavior has changed and you should work around it for now by increasing --timelimit when invoking hk-maple

Also maybe put “Maple 2017.2” instead of “Maple 2017” in the paper — right @carette?

Right.

@carette I just pushed this:
In other words, this list captures just enough information from the context to enable primitive distributions to generate integrals in direct style, instead of continuation-passing style.\ks{I’m not sure what this sentence means. I suggest omitting this sentence until we have a version of $\integrate$ that doesn’t use this list argument and accordingly passes continuations. I don’t have such a version.}

I’ve been avoiding the name “Hakaru”, because it’s never introduced in the paper, and introducing it properly would be slightly deanonymizing and slightly unscientific (because science is all about objective propositions right?). This means using the phrase “our (probabilistic) language” more. Also the legend in the evaluation can omit the name “Hakaru” and instead say “New LLVM backend”, “Old Haskell backend \citep{zinkov-composing}”, “JAGS”.

Sure.

I’m happy to defer on this since I’m not doing anything but I find that names are often helpful for this sort of thing

And I think the FAQ says it’s ok wrt double blind

But it’s going to be 99% obvious who wrote this paper; I think this level of anonymization serves no real purpose. And as @samth says, it’s permitted.

Anyways, I’m finishing a first pass on the paper before anything else.

Note that I will definitely disappear for a few hours soon - and may not have more than a couple of hours later in the evening to dedicate to polishing.

Just a curious question: we cite zinkov-composing, so if someone reads that they will find out everything. Is that okay with double blind and everything?

also I changed the plot to what ken suggested

What do you mean “find out everything”? You mean they will realize that the authors of this paper are Zinkov and Shan?

@ccshan I ran hk-maple with --timelimit=600 but it didn’t help; it still got stuck and I killed it after 18 minutes. Now running with --debug

I mean Hakaru and your name basically

So any opinions on whether we keep or toss that radar chart?

@rjnw that’s fine, you can’t tell that’s by us

I’ve put in some text for 3 of the 4 contributions that were still ’to do’s (for @rjnw).

I think something might be wrong with communication between hakaru and maple on my machine

the same thing that Maple does in 5 minutes when given the input directly, Hakaru gets stuck waiting for Maple on

Maybe it’s fine to give a ballpark figure based on invoking Maple directly.

@ccshan Do you intend to write something about “Perhaps draw analogy to optimizing compilers.” from the comments on PLDI reviews? That’s the last thing in there.

As to the comments in the paper itself, there’s mine about the radar chart, and all the rest are from @ccshan in section 6. I must admit to being at a bit of a loss as to how to reorganize it properly, as I’m simply not sure what comparisons will be in the final paper — that keeps changing!

By “what comparisons” do you mean what measurements?

I’ve got another 20 minutes now, and then I’ll probably be away for a couple of hours.

Yes. What measurements of what.

And even how to really evaluate those results. I’m not really sure what constitutes a good accuracy improvement between one system and the next.

Maybe the best use of your 20 minutes is to draft a response to the previous review.

@carette are you talking about the naive bayes comparison?

or gmmgibbs

Are we going to do that? The paper is a lot better, but the evaluation section is still rather sub-optimal.

For naive Bayes I still don’t know what to say. If you look at the measurements, we have per-update time for Mallet, accuracy and per-sweep time for JAGS, and accuracy and per-sweep time for the LLVM and Haskell backends

@rjnw both. And the stuff in section 6.3. I have a really hard time interpreting what it all really says.

For 6.3 I am going to switch back to a table of measurements, that bar chart does not help

essentially they are timing measurements made by “turning off” some optimizations

so as to show that our optimizations actually do something in terms of performance improvements

Remember, I do symbolic computation and code generation (really high level — generating Haskell is great, anything lower level is a pain) and math. I am a rank amateur at the rest.

I agree that the table was better than the bar chart. Easier to interpret, somewhat.

That’s why you give good feedback

The problem is that there was no baseline. Does it make sense to run things without any optimizations at all? Is it possible?

I can do that, give me a few minutes

With the total absence of @pravnar, tiny dips from @samth, and in-and-out from @ccshan, shall I guess that they are all working on other ICFP papers?

[To be fair, I was going to also, but things kind of imploded a couple of weeks back, making it clear that there was no way to get something in good shape by the deadline]

I’m on paternity leave


@samth Indeed, I actually knew that, forgot. Good of you to mostly stick to it :sunglasses:

There is some issue with running without any optimizations. I think there might be a bug somewhere, because it is saying it’s 250 times faster than full optimization

Very cute!

@rjnw Hopefully that is indeed a bug…

I ran it again; now it’s around 1000 times faster

Are your flags reversed somehow?

If all of them are, that would mean a rather different interpretation of 6.3!

not really; I mean I am using the Hakaru code without summary (which is slow) to run with my compiler optimizations turned off

I have written the benchmark separately for each of them

manually tweaking things to turn them off

so most probably I am doing something wrong here

because it’s saying 3ms to run 10000 updates for GmmGibbs with 50 classes and 10000 points

which is equivalent to doing nothing in a loop of size 10000

That is really weird.

Where is hk-maple available? I thought it was in the main hakaru repo, but stack build didn’t give anything named hk-maple

It should be the main hakaru repo…

I do see it in the cabal file. Odd.

Ah, not under dist/, under .stack.

Oh weird. On my MacBook Air (not a fast computer), using Maple 2017.1, hk-maple ClinicalTrial.hk works successfully (locally) in 11 seconds!

latest ppaml.mla from master, right?

Yes.

version 1247392?

Yes.

[But remember that some external dll was broken on the Mac, and so it didn’t end up using that path. Which is likely why there are some failures, now that I think of it]

so if I run ClinicalTrial directly in Maple using the debug output of hk-maple, it takes 6.7 seconds

(Separately, Jacques, the most helpful thing you can do might be to continue revising Section 5, if you see more room for improvement there. Let me know if you tackle that.)

On a decent computer, that is not so far from what I get.

yeah, but Haskell gets stuck; it never shows it got anything from Maple in debug

I get the behavior Rajan just described

(I can try. It’s decent now. Though I will be starting my ~2 hours offline any minute now)

There might be some communication changes in maple, which Ken mentioned before

Weird. I just pulled from master and compiled everything from scratch.

I would have the same communication changes…

which I guess may only affect Linux and/or Maple 2017.2 ?

could be

But Jacques and I are using the same maple version. I’m on Linux though. I’m rebuilding hakaru to see if it affects it.

What is the formula for calculating percentage slowdown?

that should be a good column in the table for optimizations right?

[Have to go. Back some time after 9:00]

@samth ping!

I don’t know what formula you’re talking about.

Pong but not at a computer until maybe 9

calculating percentage slowdown between two different runtimes. At full opt we are 0.2 seconds, with no histogram 460, so what’s the % slowdown?

or should I say percent improvement

I think slowdown, but not percentage

Percentages are often confusing

and should I use fast/slow - 1?

Back (at least mostly). Early, apparently.

I’m in the middle of improving 5.1

@rjnw Did you ever figure out why the no optimization version seemed to be faster?

@ccshan Ok, I won’t touch that.

No not yet

Worrisome…

Does it make sense for me to edit section 6? Some of the English recently added has some grammar mistakes. But I don’t want to edit if you’re going to check something in at any time, @rjnw.

the naive bayes benchmark part needs a lot of work

I am adding numbers so you can go ahead with the writing part

Ok, will do.

I’m now relatively available for a while

is there something I should work on editing?

one thing I noticed is that in the table of no-opt slowdowns, 0.8x should probably be 80x or something like that

Is Section 6 an instance of Brooks’s law?

(I’m on Section 5.2)

Section 6 does seem to suffer from too many cooks.

Anyways, I’ve done some (significant?) edits to it.

oh, i see it should be more like 1.8x

I agree with @samth that there seems to be an issue with that 0.8x.

it should just be 1.8

I did slow/full - 1

if you don’t have the -1 then full is 1x slower than itself

also slowdown itself does not give a good picture

1.8x is the conventional presentation of “slowdown” for those numbers

it should be speedup with the opposite ratio, but then the table becomes weird

I think adding 1 to those numbers gets the right result
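
[For the record, the convention being settled on here, with the 1.8x example:]

$$\mathrm{slowdown} = \frac{t_{\mathrm{variant}}}{t_{\mathrm{full}}}\,, \qquad \frac{t_{\mathrm{variant}}}{t_{\mathrm{full}}} - 1 = 0.8 \;\Rightarrow\; \mathrm{slowdown} = 1.8\times$$

[so by this convention the full-opt row is 1.0x, not 0x.]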

(I’m keeping an eye on this conversation and it seems that it’s not interfering with Jacques’s work yet)

I am saying, if you add 1, then do you add 1 to the last row as well?

which makes it 1x slower than itself

I think those slowdown numbers are all dubious.

I agree, they don’t make sense

0.003183/0.000499 ~ 6.379

that is the standard error

those numbers are the standard error

I have pushed what I think is the correct table

Oops!

Misspelled something?

1.8x it is indeed.

@samth ‘pushed’?

git push

I know. I normally get an email when anyone pushes. And I didn’t.

But that’s apparently an email server problem; git pull got your push.

Is there a particular section that no one else is looking at?

and/or is there something I can do to help NB appear in the paper?

Sam, you can look at 5.3 (but perhaps read the beginning of Sections 5.0, 5.1, and 5.2 first, for context)

But that’s less urgent than NB

Right now, I am not actively working on any section. Perhaps :white_frowning_face: ?

ok, or i can work on editing some totally other section if someone else wants 5.3

I am at a bit of a loss on what I can really do to improve things.

Anyone need any Maple (or Maple/Haskell connections) debugged?

For NB, for example, what could I do to help?

Jacques raises a good issue about our final contribution item, “compared to…popular probabilistic programming systems”. We only show JAGS. I’m sure we’re much faster than, say, WebPPL, but that’s because Zinkov showed that even the old Haskell backend was faster than WebPPL.

What do we have the time to reasonably show in this paper?

We should just cite Rob showing that

Even if Zinkov showed the old Haskell backend was faster, we ought to show it again to claim it as a real contribution.

I think the problem is claiming it as a contribution of this work.

is the problem just the plural there?

or that we claim to do the comparison?

My question was just about the plural.

if we rephrase to “are faster and more accurate than manually specialized-and-tuned code, popular probabilistic programming systems, and a previous backend that emits Haskell”

is that reasonable?

since if Zinkov shows that old-Hakaru is faster than webppl, then by the transitive property …

It’s not false.

So just drop “Our benchmarks show that”?

“reasonable” is a different bar :wink:

no just drop the “compared to”

I don’t understand why this would be unreasonable

or perhaps, I think that truth is a sufficient condition for reasonablness

It’s getting late, I get pickier then…

We need to cite Zinkov in the right place(s), but I think we can defensibly claim that.

Who can do that please?

@samth Want to do the honours?

doing it now


Great. Perhaps label that ‘Ours’ ?

I am also waiting on haskell backend timing

it’s been running for a few days now

it’s almost finished; we will get better JAGS timings as well

So use the same label as the legend in the GmmGibbs plot.

okay

Also, error bars (standard error)

@ccshan it’s not possible in Racket. I am going to do this in TikZ; maybe you can help with it?

Ok, so is this the only tikz plot in this paper?

I suggest you make a table for now then. Thanks for the plot though.

(I’ll take your table and make it a plot with error bars)

I just pushed something with a missing citation to webppl

@ccshan done

I added haskell as a placeholder for now

Imma use the word prepone!!!

So I think the citation should be

```
@misc{dippl,
  title = {{The Design and Implementation of Probabilistic Programming Languages}},
  author = {Goodman, Noah D and Stuhlm\"{u}ller, Andreas},
  year = {2014},
  howpublished = {\url{http://dippl.org}},
  note = {Accessed: 2018-3-16}
}
```

?

That’s how they ask to be cited.

done

And I’ve added that (with key webppl) to the .bib file.

So, other than the numbers/table from @rjnw are we waiting for any more data?

no we are just waiting for better numbers

which I am hoping will finish soon

Ok, is there anything left to write, other than that ?

also I was looking at the no-optimization number; still no luck there

no

@ccshan are you still polishing section 5?

Yes, almost done with 5.2

@samth I think another set of eyes on section 6 (or the early parts of 5) would be good.

I am now a bit too familiar with the content, and am starting to have a harder time seeing things to improve.

so there is no difference between no histogram and nothing

I will add that to the table

So histogram makes no difference? That seems surprising.

no, when histogram is off, nothing else makes any difference

Oh! That’s pretty big!

When only histogram is off it’s 460 ms; when histogram is off and all our optimizations are off it’s 470

which is sort of what I expected

I think the difference is bigger in naive bayes though

Eagerly awaiting good numbers then.

I’m on to 5.3…

As we should be able to say something about that.

@rjnw What does “all the required functions for probability distributions” refer to?

categorical and others

So, not “prog”, but the primitives?

yes

And things like logsumexp etc. Got it.

yes

Ok so I’m going to read Section 6 in its current state

Someone could spell-check…

Ok, I’m on it.

why is there an extra significant figure in the first row of fig 14?

my mistake probably

I will fix it

Apparently I can’t do the spell check. Neither aspell nor ispell is on this machine, and my installation of brew is broken. At least, that’s what it tells me.

Trying to fix that right now doesn’t seem wise.

i’ll do it

i made a few edits in 5.3

done

Boy oh boy @ccshan that thumb of yours is getting some serious exercise! :sunglasses:

Ok, @rjnw just checked in some new numbers. Is there some text that we should add to comment on that?

Admittedly not my usual finger.

Also, are we still awaiting more numbers?

no

there are more JAGS runs going, but they are just for reducing standard error, which even now is not so bad

What about startup/simplification times?

oh yeah, I am running that manually though

and my hakaru is not working with maple

do we call Maple twice, in simplification and summary?

The last table in 6.1 - did @ccshan say he was going to turn that into something else?

because right now what I have is just the first time Maple is called, which I got from running hk-maple using --debug

@carette Yes and that’s still the plan

Right now, it is odd, as it uses the Mallet numbers as a header line.

that table is there just for data

@ccshan is making a bar chart

I am starting to seriously fade. It’s been a long week, and I don’t know if there really is a lot I can contribute anymore.

Seems like that bar chart is really the main thing left to do. And perhaps one more pass on section 6. The rest seems really solid.

You can help Rajan with Maple(/Haskell)

I’m on Section 6 including bar chart. It needs revision.

I can just use the numbers with karst?

yall figure out what numbers to use, i’m reading

Sure - @rjnw what can I do to help on that front?

I need to get the time it takes to do simplification

Of which problem?

GmmGibbs

And have you been doing it on your machine or karst?

What is the input file you use for GmmGibbs?

So for actually running it and doing timing benchmarks for sweeps I just used karst


@rjnw Is it true that “The Haskell backend takes 500 second to do 100 sweeps” (GmmGibbs)? Figure 13 seems to show that 10 sweeps take ~33 seconds, so I’d expect 100 sweeps to take ~330 seconds, not 500.

@rjnw Figure 14 is for NB, right? And do you have standard error for those accuracies?

let me check

I seem to have my machine set up to ssh into karst (as user ppaml), but I don’t seem to have a password stored anywhere…

@rjnw Also “we reach 40% accuracy where JAGS never exceeds 35%” seems a bit overselling because Figure 13 depicts JAGS reaching ~36%

@carette Send me a public key and I’ll get it installed. That’s probably what you were doing before, not password.

@carette (Also you may have a karst account yourself.)

I think I wrote the 35% part when I was cutting off gmm gibbs where we finish

I think I do. And I think I installed that key.

@rjnw Ok so if you could look at the numbers now, what’s JAGS’s maximum accuracy and what’s ours?

Anyways, I’ll email you that right now.

haskell backend takes around 400 seconds to do 100 sweeps

Oh @rjnw did you figure out the “no optimizations” benchmark while I wasn’t paying attention?

@rjnw Ok I’ll change 500s to 400s and 35% to 36%

yeah I added those numbers in the table

@carette Installed. (Your previously installed key is a different one ending in jpxrxh4= carette@carette-notebook)

Yeah, that was likely a DSA key instead of RSA

@rjnw I’m still waiting on “Figure 14 is for NB, right? And do you have standard error for those accuracies?”

I was just double checking the jags maximum accuracy for gibbs

Hmm, it still asks for a password. I tried as ‘carette’ and as ‘jcarette’.

I will calculate now

@carette You’re doing ssh ppaml@karst, right?

okay, I don’t know; JAGS is still running

Didn’t have the ppaml. Let me try again.

@rjnw So you’ll be updating the Jags column (both means and std errors) in Figure 14?

aha!

yes

@rjnw Ok great, would you please confirm it’s NB?

also for karst I don’t use the ppaml karst, I use mine

NB? figure 14?

Then the ppaml karst probably needs a maple update-archive

it is naive bayes

@rjnw Thanks

@rjnw So, the table you left for me… Is it per-sweep or per-update times? I assume the first numeric column is mean and the second is std err (not std dev), but what are the units?

per update

I was not able to do a full sweep in Mallet

ms

yes

Got it

Std err, I assume.

What module do I need to load to have access to stack on karst?

Std err in ms, I assume.

As I don’t seem to have/see it. Nor hk-maple.

Maybe we use cabal on karst?

I do have maple 2017 loaded (though I had to change that since it defaulted to 2016)

@ccshan yes

should I update the accuracy values for fig14 in tex?

I got the stderr

I tried to use cabal, but it complained that various things were not installed.

@rjnw Yes please go ahead (I haven’t touched it yet)

okay

and ghc --version is 7.6.3 !!!

@carette Let me dig out an old ppaml-l message that might help…

@carette Are you using the ppaml account or your own karst account?

I did ssh ppaml@karst.

‘who am i’ says ppaml.

@carette Try module load gmp ghc mpc mpfr

That loads 7.10.1.


so @ccshan we don’t have many data points for JAGS or rkt for naive Bayes full sweeps. We have 10 for rkt and 3 for JAGS

Well the beauty is you can compute std err with whatever you have

I pushed that

I am just saying due to small number of points our stderr is 0.0001 and less

*0.001 or less

I don’t understand; if you have a small number of points then you should have a bigger stderr, because stderr divides by sqrt(n)
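
[For reference, with sample standard deviation $s$ over $n$ points:]

$$\mathrm{SE} = \frac{s}{\sqrt{n}}$$

[so fewer points normally means a larger SE, unless $s$ itself happens to come out tiny.]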

That still only gets me 7.10. The paper says we use 8.0.2 (which is actually kind of old too).

well the values are almost identical

@carette If you really want 8.* then you can build your own

The point is to be able to help you guys in a timely manner, to produce reliable data.

If @carette is also running on karst then there is no difference right?

@rjnw Great but I don’t understand “our stderr is 0.001 or less” — your stderr is a number you know for sure, not a number you are trying to infer or predict.

0.001 or less for all the 5 rows

@carette Great so thanks for doing your best with that.

@rjnw it depends on what Haskell you use on karst!

If you do ghc --version, what do you get?

@rjnw Ok but why blame the small number of data points for how small the stderr is? If you run more data points, the stderr will probably get even smaller, no?

I don’t know, sometimes you can get lucky with small numbers, especially if that number is 3

@carette @rjnw If you use hk-maple on your local machine and have it ssh to karst then ghc on karst doesn’t matter because karst is only used for its maple installation

yes I do what @ccshan mentioned

@rjnw Well, the stderr is valid as long as you don’t keep running the same experiments hoping to get lucky.

I am fine with having really small stderr

just saying that they are really small

going back to getting startup time, what should I do?

Ok, I got that working, I think. At least ClinicalTrial takes 13s instead of 11s, so it seems it is indeed ‘calling out’.

I will just do it with karst ssh then?

for fig 14, shouldn’t the caption say something about NB?

@rjnw Is it easy for you to (pull then) add the number of trials to the NB table(s)?

and will the timing table become a figure?

@samth Yes and there’s a lot more to fix…


But the data is basically there so I’m happy

ok good

Look we’re >10x faster than JAGS and 9x as fast as Mallet on NB

I think I have to call it, but we seem to be in good shape to get over the line

Yeah

If nobody’s blocking on me then I think it might be time for the obligatory Steak and Shake run on deadline night

(That’s why we didn’t get in last time)

So what exactly are you expecting from me? I believe I am trying to get the time for disintegrate+simplify+summary for GmmGibbs on karst.

Is that right?

Are you running haskell on karst as well?

No, on my machine.

well then that’s no different than what I have!

If we are okay with that, I can do it on the same machine as every other benchmark

except simplification on karst

I am fine with that!!

The big picture is we’re trying to measure time from knowing the GMM model to starting to compute with numbers. For JAGS that number is “whopping 250 seconds” so what’s ours?

8 minutes

Then perhaps we should tone down that ‘whopping’ just a tad.

I think we need to be more clear about startup time

because right now we are very brief about it

I might remove “whopping” but note that the JAGS time is per-data (and increases with data size) whereas our time is per-model.

Though ours is still a ‘compile’ step while theirs is runtime.

I drew a picture a while back that I posted

I don’t know if we can include a graphic like that, but we should clarify those things

our models have a multi-minute compile step, and then a few second startup time

I offer to revise Section 6. I just want the numbers.

in both cases, the times are independent of the data

I mean if someone else wants to revise Section 6 that’s fine, but I think it needs a rewrite and nobody seems to be doing that

I’m running it now with a time limit of 10 minutes.

for JAGS, the compile time is 0 but the startup time is multi-minute and scales with data size

@samth What do you mean by “both cases”?

both the compile step and startup time

I see

I am massively fading now. A rewrite from me would worsen things considerably.

I will run hk-maple a couple of times and get the startup time in terms of mean and stderr

also I believe the JAGS startup time for NB was even worse

2500seconds

I think

~21–22k seconds

I will have concrete values of startup times in a few minutes

“a few” meaning 41 minutes? :thinking_face:

well it takes around 8 minutes for one gmmgibbs startup

I already have jags

8 minutes is good… as my run on karst just succeeded, in 9m51s.

Beautiful result, if a little bit slow to get. One-time cost though. [Except for people like us who do testing.]

Anyways, I have to call myself done. I can’t do any more productive work tonight. But the paper is in excellent shape!

with karst there is a lot of variation

just now I also got 9m11s

~ std err ~

good night jacques

good night.

So @rjnw you’re going to put what you know about startup times in the manuscript, right?

yes

I added the raw startup timings

what kind of figure should we do?

I have three things: time for simplify, llvm-startup, jags-startup

First of all, 32776 seconds is 9 hours! Are you missing a division?

oops

So I understand that the “jags-startup” time is per data, and “simplify” is our per-model time, and “llvm-startup” is our per-data time. Right?

yes

Ok so I guess you’re about to change the numbers?

yes I am doing that now

I’m thinking that a table might be best because these numbers are hard to compare. Basically we’re saying if you run the same model 3 times on different data then we’re faster at startup

And remind me, is this Gmm or NB?

Gmm

and also we are not fast enough to cover 300s when actually running it

I don’t understand that. What does “cover” mean??

umm I don’t know what word to use. We do 100 sweeps in ~30 seconds and JAGS does them in ~70 s

if this difference were greater than 300 s then we could say overall we are faster, but it is not

300s is the difference between 545s and 222s. I see.

But again, our 545s is per model and their 222s is per data (and grows with data size). I’ll write about this.
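
[Back-of-envelope, treating our 545 s as a one-time per-model cost and using the per-run numbers above (JAGS: 222 s startup + ~70 s per 100 sweeps; ours: ~30 s per 100 sweeps), over $k$ runs of the same model:]

$$545 + 30k < (222 + 70)\,k \iff k > \frac{545}{262} \approx 2.1$$

[i.e. we come out ahead from the third run onward, matching the “same model 3 times” estimate above.]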

Did you say you have startup timings for NB?

only for jags

let me run hk-maple once to get an approximate value

Ok I understand it’s going to be like jags 40 minutes, hakaru 10 minutes

not really, JAGS is 3 hrs

Oh ok

And hakaru like 10 minutes right?

it’s running

but should be in same ballpark

as 10 minutes, not 3 hrs, I hope.

jags is even higher

it’s around 22k seconds

That’s 6 hours

yeah

no wonder it’s been running for almost 2 days

That’s higher than Zinkov reported — any guess why?

don’t know, maybe my machine is slower

(it could just be you have a slower machine)

I know mine is slower for single threaded applications

That may be the last number I want. I’m starving so I’m going to eat and read hardcopy.

hmm the first run for NB gave 2minutes

^hk-maple timings

I will make these tables like fig15

I’ve had the experience before where GMM simplifies slower than NB.

yeah, second time 2:20 minutes

That’s not bad…

yeah, the other number I need is runtime startup for NB. I think that should be enough, right?

I think the table should have two columns for times, one for per-model and one for per-data startup time (and we can just put down 0 as the per-model startup time of JAGS)
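
[Sketch of that layout with the GMM numbers from above; the per-data entry for us is the LLVM startup, which is a few seconds:]

```
         per-model startup    per-data startup
  JAGS          0 s               ~222 s
  ours        ~545 s           a few seconds
```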

Yeah

okay

Yay thanks

after that I might go to sleep; I have been up since 5:30 am

Yeah, do that.

One question, in Fig 13 you put a mark per 10 sweeps, but of course the sweeps take slightly different times per trial. How do you deal with that?

the time is also mean per sweep

Meaning you put the marks at exactly regular intervals?

Ok

yeah the snapshots are at every 2 sweeps

or maybe 1

But just to make sure I understand, the mark might be placed at a time where no actual sweep finished, right?

yes

Got it, thanks

just curious

it is placed at the mean time for a sweep across all trials
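
[If I follow, mark $k$ sits at the mean finish time of sweep $k$ over the $T$ trials:]

$$t_k = \frac{1}{T}\sum_{j=1}^{T} t_{j,k}$$

[where $t_{j,k}$ is when trial $j$ finished sweep $k$.]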

So if for some weird reason the first sweep takes longer than the second sweep in every trial then that would show up

yeah but I didn’t see that in practice

Got it

@ccshan I pushed the changes for making tables for different timings.

Good night!