Racket Slack Archive

ccshan

2018-7-9 14:01:41

@ccshan commented on @rjnw’s file NaiveBayesGibbs-Accuracy.pdf (https://racket.slack.com/files/U6602H150/FBLR8DAJF/naivebayesgibbs-accuracy.pdf): Good morning… Is there some reason you haven’t put this plot version in the git repository?

rjnw

2018-7-9 14:03:12

The repository still has the one with y-axis 0–100. What should I do with the legend for 45–85%?

ccshan

2018-7-9 14:04:17

Yeah, well, can you move the legend up?

ccshan

2018-7-9 14:04:26

(to the middle vertically)

ccshan

2018-7-9 14:07:42

Given that the AugurV2 curve doesn’t move, I prefer this version over the one in the git repository even if the legend stays in the lower-right corner translucent. Similarly for NaiveBayesGibbs-Likelihood. But this issue is minor.

ccshan

2018-7-9 14:08:44

It’s more important to have PSI plots. But I know you have to go soon!

rjnw

2018-7-9 15:22:00

I am still here for a couple hours, I will see if I can start the psi plots while I get ready.

rjnw

2018-7-9 15:22:06

Can psi take command line arguments?

ccshan

2018-7-9 15:23:19

I don’t see a way…

ccshan

2018-7-9 15:23:42

(and that might mean generating psi code with different n on the fly, I understand)
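A minimal sketch of what generating the sources on the fly could look like, assuming the model is kept as a text template with a size placeholder. The template body below is a stand-in comment, not real PSI code, and the `@N@` placeholder and `generate_sources` helper are hypothetical names for illustration; only the `ctN.psi` file naming follows the file names used later in this conversation.

```python
from pathlib import Path
import tempfile

# Placeholder template: NOT real PSI code. Swap in the actual model source
# and keep "@N@" wherever the problem size n should appear.
TEMPLATE = "// hypothetical model body, problem size n = @N@\n"


def generate_sources(out_dir, prefix, sizes, template=TEMPLATE):
    """Write one source file per size, e.g. ct10.psi, ct20.psi, ..."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for n in sizes:
        path = out_dir / f"{prefix}{n}.psi"
        # Plain string replacement avoids clashing with braces in model code.
        path.write_text(template.replace("@N@", str(n)))
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for p in generate_sources(tmp, "ct", range(10, 101, 10)):
            print(p.name)
```

Each generated file can then be passed to the tool as its only argument, so no command-line parameter for n is needed.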

rjnw

2018-7-9 15:24:03

I think so, what is the range you want for clinicalTrial?

ccshan

2018-7-9 15:24:30

It seems that [10,20..100] would stress it enough?

ccshan

2018-7-9 15:24:47

(I mean, it seems that it gets pretty bogged down at n=100 already)

rjnw

2018-7-9 15:30:19

How many times should I run it? and is it okay if I just give you the raw data?

samth

2018-7-9 15:31:33

@rjnw we can definitely turn raw data into plots

ccshan

2018-7-9 15:32:22

Sure, why don’t you start with one run and give me the raw data, then if you have more time then do more runs and plot them with error bars? I say this because I can throw together a plot in tikz but I don’t have a quick way at hand to plot the error bars the way you do them nicely.

rjnw

2018-7-9 15:41:32

Okay, it’s running now; let’s see how long it takes for one run.

rjnw

2018-7-9 16:27:25

I also pushed new naivebayes plots with suggested axis and legend position

samth

2018-7-9 16:32:44

looks good to me

ccshan

2018-7-9 16:44:40

@rjnw Table 1

ccshan

2018-7-9 16:45:04

needs LDA startup times for LLVM-backend and AugurV2, no?

rjnw

2018-7-9 16:45:29

Oh yeah forgot, doing it now.

rjnw

2018-7-9 16:48:54

Update on psi clinical trial: the running time roughly doubles with every +10 in n, with 787s for n=80, and n=90 currently running. I am not running n=100. I think I can run it two more times while I pack.

rjnw

2018-7-9 16:49:12

what about psi linear regression?

ccshan

2018-7-9 16:50:32

I’m not sure you need to run psi clinical trial more than once. I’d rather you run psi linear regression once, for n=10,20,..,90

rjnw

2018-7-9 16:58:59

clinicaltrial
../../testcode/psisrc/ct10.psi 1.61s user 0.02s system 99% cpu 1.632 total
../../testcode/psisrc/ct20.psi $bin $i 5.82s user 0.04s system 99% cpu 5.867 total
../../testcode/psisrc/ct30.psi $bin $i 15.44s user 0.11s system 99% cpu 15.571 total
../../testcode/psisrc/ct40.psi $bin $i 50.39s user 0.30s system 99% cpu 50.741 total
../../testcode/psisrc/ct50.psi $bin $i 101.33s user 0.34s system 99% cpu 1:41.79 total
../../testcode/psisrc/ct60.psi $bin $i 224.49s user 0.78s system 99% cpu 3:45.58 total
../../testcode/psisrc/ct80.psi $bin $i 787.87s user 2.01s system 99% cpu 13:10.96 total
../../testcode/psisrc/ct90.psi $bin $i 1434.59s user 3.01s system 99% cpu 23:59.44 total

ccshan

2018-7-9 16:59:53

Cool, what about 70?

rjnw

2018-7-9 17:02:54

linear regression psi lr
../../testcode/psisrc/lr10.psi $bin $i 0.41s user 0.00s system 99% cpu 0.413 total
../../testcode/psisrc/lr20.psi $bin $i 0.63s user 0.02s system 99% cpu 0.653 total
../../testcode/psisrc/lr30.psi $bin $i 0.91s user 0.01s system 99% cpu 0.923 total
../../testcode/psisrc/lr40.psi $bin $i 1.24s user 0.02s system 99% cpu 1.258 total
../../testcode/psisrc/lr50.psi $bin $i 1.59s user 0.03s system 99% cpu 1.618 total
../../testcode/psisrc/lr60.psi $bin $i 1.87s user 0.02s system 99% cpu 1.891 total
../../testcode/psisrc/lr70.psi $bin $i 2.20s user 0.04s system 99% cpu 2.237 total
../../testcode/psisrc/lr80.psi $bin $i 2.49s user 0.05s system 99% cpu 2.544 total
../../testcode/psisrc/lr90.psi $bin $i 2.87s user 0.03s system 99% cpu 2.905 total

rjnw

2018-7-9 17:08:03

missed 70 for ct, running now.

rjnw

2018-7-9 17:14:04

table 1 per model time is the time for hk-maple right?

ccshan

2018-7-9 17:14:16

yes

ccshan

2018-7-9 17:20:00

Plotting psi times…

rjnw

2018-7-9 17:22:47

Sorry, I ran 70 but forgot to add time. Doing it again

rjnw

2018-7-9 17:33:29

~/w/h/r/psi > time ../../other/psi/psi ../../testcode/psisrc/ct70.psi
452.29user 1.32system 7:34.23elapsed 99%CPU (0avgtext+0avgdata 3279712maxresident)k
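As a quick check of the "2x every +10" observation, the sketch below computes the ratio between consecutive clinical-trial timings. The numbers are the user-CPU seconds reported in this conversation (including the n=70 run just above); the `step_ratios` helper is a hypothetical name for illustration. For n from 50 to 90 the ratios come out between roughly 1.7 and 2.2, while the smaller sizes grow faster per step.

```python
# User-CPU seconds reported in the chat for ctN.psi, keyed by n.
ct_user_seconds = {
    10: 1.61, 20: 5.82, 30: 15.44, 40: 50.39, 50: 101.33,
    60: 224.49, 70: 452.29, 80: 787.87, 90: 1434.59,
}


def step_ratios(times):
    """Return {n: time(n) / time(previous n)} for consecutive sizes."""
    sizes = sorted(times)
    return {b: times[b] / times[a] for a, b in zip(sizes, sizes[1:])}


for n, ratio in step_ratios(ct_user_seconds).items():
    print(f"n={n:3d}: {ratio:.2f}x the previous step")
```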

rjnw

2018-7-9 17:53:28

I pushed the LDA numbers for Table 1. Is there anything else? I am going to log out soon.

ccshan

2018-7-9 17:56:36

Thanks! I take it you’re not up for running MALLET LDA :wink:

rjnw

2018-7-9 17:57:12

I forgot how I did it last time.

ccshan

2018-7-9 18:00:32

It would be totally new because it’s LDA, not NB.

ccshan

2018-7-9 18:00:42

http://mallet.cs.umass.edu/topics.php

rjnw

2018-7-9 18:03:04

importing data in mallet is completely different, for 20newsgroup we just gave it raw data. For KOS I don’t know how to do that.

rjnw

2018-7-9 18:03:20

Also 20newsgroup their website itself had examples.

ccshan

2018-7-9 18:05:10

Hey, sorry I didn’t notice, it seems LR is much faster than CT with PSI?

rjnw

2018-7-9 18:05:18

Yeah

rjnw

2018-7-9 18:05:27

It doesn’t scale that badly

ccshan

2018-7-9 18:06:22

Do you think you can run larger n, like 100,200 up to 1000?

rjnw

2018-7-9 18:08:36

running now.

ccshan

2018-7-9 18:08:47

Thanks and bon voyage…

rjnw

2018-7-9 18:09:09

Thank you!

ccshan

2018-7-9 18:19:50

Oh would you please report your PSI and D versions?

ccshan

2018-7-9 18:20:03

(for PSI, commit hash)

rjnw

2018-7-9 18:28:02

I think I did already

rjnw

2018-7-9 18:28:09

it’s in other/psi along with augur

rjnw

2018-7-9 18:35:51

psi lr
../../testcode/psisrc/lr100.psi $bin $i 3.29s user 0.03s system 99% cpu 3.323 total
../../testcode/psisrc/lr10.psi $bin $i 0.40s user 0.01s system 99% cpu 0.409 total
../../testcode/psisrc/lr200.psi $bin $i 8.80s user 0.13s system 99% cpu 8.938 total
../../testcode/psisrc/lr20.psi $bin $i 0.63s user 0.00s system 99% cpu 0.635 total
../../testcode/psisrc/lr300.psi $bin $i 20.02s user 0.20s system 99% cpu 20.244 total
../../testcode/psisrc/lr30.psi $bin $i 0.93s user 0.01s system 99% cpu 0.938 total
../../testcode/psisrc/lr400.psi $bin $i 41.24s user 0.39s system 99% cpu 41.676 total
../../testcode/psisrc/lr40.psi $bin $i 1.26s user 0.01s system 99% cpu 1.273 total
../../testcode/psisrc/lr500.psi $bin $i 79.87s user 0.51s system 99% cpu 1:20.66 total
../../testcode/psisrc/lr50.psi $bin $i 1.57s user 0.02s system 99% cpu 1.592 total
../../testcode/psisrc/lr600.psi $bin $i 136.46s user 0.85s system 99% cpu 2:17.45 total
../../testcode/psisrc/lr60.psi $bin $i 1.79s user 0.05s system 99% cpu 1.841 total
../../testcode/psisrc/lr700.psi $bin $i 220.28s user 1.04s system 99% cpu 3:41.78 total
../../testcode/psisrc/lr70.psi $bin $i 2.14s user 0.05s system 99% cpu 2.186 total
../../testcode/psisrc/lr800.psi $bin $i 347.42s user 2.00s system 99% cpu 5:49.86 total
../../testcode/psisrc/lr80.psi $bin $i 2.43s user 0.04s system 99% cpu 2.485 total
../../testcode/psisrc/lr900.psi $bin $i 498.65s user 2.39s system 99% cpu 8:21.68 total
../../testcode/psisrc/lr90.psi $bin $i 2.80s user 0.06s system 99% cpu 2.860 total
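One way to turn raw timing lines like these into plottable numbers is sketched below. It extracts n and the user-CPU seconds from each line, then fits a straight line to log(time) against log(n) to estimate how the running time scales. The regular expression and the helper names `parse_user_times` and `loglog_slope` are assumptions for illustration; the input lines are the n=100 to n=900 runs copied from the message above. For these nine points the fitted exponent should fall between 2 and 3, so linear regression in PSI looks polynomial over this range, unlike the clinical-trial timings.

```python
import math
import re

# The n=100..900 timing lines from the chat, one run per line.
RAW = """\
../../testcode/psisrc/lr100.psi $bin $i 3.29s user 0.03s system 99% cpu 3.323 total
../../testcode/psisrc/lr200.psi $bin $i 8.80s user 0.13s system 99% cpu 8.938 total
../../testcode/psisrc/lr300.psi $bin $i 20.02s user 0.20s system 99% cpu 20.244 total
../../testcode/psisrc/lr400.psi $bin $i 41.24s user 0.39s system 99% cpu 41.676 total
../../testcode/psisrc/lr500.psi $bin $i 79.87s user 0.51s system 99% cpu 1:20.66 total
../../testcode/psisrc/lr600.psi $bin $i 136.46s user 0.85s system 99% cpu 2:17.45 total
../../testcode/psisrc/lr700.psi $bin $i 220.28s user 1.04s system 99% cpu 3:41.78 total
../../testcode/psisrc/lr800.psi $bin $i 347.42s user 2.00s system 99% cpu 5:49.86 total
../../testcode/psisrc/lr900.psi $bin $i 498.65s user 2.39s system 99% cpu 8:21.68 total
"""

# Matches e.g. "lr300.psi ... 20.02s user": model prefix, size n, user seconds.
LINE = re.compile(r"(?P<name>[a-z]+)(?P<n>\d+)\.psi\b.*?(?P<user>[\d.]+)s user")


def parse_user_times(text):
    """Map problem size n -> user CPU seconds for each timing line."""
    return {int(m["n"]): float(m["user"]) for m in LINE.finditer(text)}


def loglog_slope(times):
    """Least-squares slope of log(time) against log(n)."""
    xs = [math.log(n) for n in times]
    ys = [math.log(t) for t in times.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


times = parse_user_times(RAW)
print(f"estimated exponent: {loglog_slope(times):.2f}")
```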

rjnw

2018-7-9 18:45:28

logging off my machine, I still have slack on my phone if something comes up.

ccshan

2018-7-9 18:45:43

Cool, thanks!

samth

2018-7-9 19:24:41

@ccshan when did you become an economist?

samth

2018-7-9 19:25:36

https://hsm.stackexchange.com/questions/140/why-is-price-on-the-vertical-axis-and-quantity-on-the-horizontal-axis

ccshan

2018-7-9 19:26:13

I’m happy to switch it back. I thought it’s PLDI-ish advice to plot things so that “up is better”.

samth

2018-7-9 20:25:50

well, I think when I’ve said things like that I’ve mostly meant that you want the graph to go a particular way to indicate success

samth

2018-7-9 20:26:01

I think the dependent variable should always be on the y axis