Racket Slack Archive

pocmatos

2019-11-20 15:07:15

OK - let's do the following. I just wanted to discuss a few details on how to get all the components of the future architecture speaking: Actions - DB - Dashboard. I know @notjack has a lot of experience with this so let's go ahead with the meeting as is. I will schedule another meeting and make sure the time suits all of us. In any case, I will take notes and write down what we discussed.

pocmatos

2019-11-20 15:08:22

There are two services I would like to integrate with the CI we are doing - LGTM and OSS-fuzz. LGTM is a simple push of a button while for OSS-fuzz we need to send a request to Google to be accepted.

pocmatos

2019-11-20 15:08:32

Any issues with me going ahead and doing both of these?

pocmatos

2019-11-20 15:10:52

on lgtm, I submitted it a few weeks ago so we already have some results: https://lgtm.com/projects/g/racket/racket/

pocmatos

2019-11-20 15:11:03

but having this in CI would be great.

pocmatos

2019-11-20 15:12:00

Also: https://github.com/racket/racket/wiki/Continuous-Integration

pocmatos

2019-11-20 15:12:05

^^^ in very early stages.

samth

2019-11-20 15:25:15

@pocmatos I’m in favor of all of those things

pocmatos

2019-11-20 15:42:55

@samth from lgtm: "A request to install LGTM.com has been submitted on the @racket account."

pocmatos

2019-11-20 15:43:05

I assume someone is going to get an email they need to approve.

samth

2019-11-20 15:50:25

@pocmatos I have not gotten an email yet

samth

2019-11-20 15:51:43

ah, they didn’t email but i found it on github

pocmatos

2019-11-20 15:58:12

Thanks - first github workflow is ongoing : https://github.com/racket/racket/commit/6e63f6a99fe46317f5a9538d85a4f8ad1d87e6e1/checks?check_suite_id=320543270

pocmatos

2019-11-20 15:58:17

Yay!

samth

2019-11-20 16:48:24

Another next step is to add something to notify #notifications here wrt the GH Actions

samth

2019-11-20 16:50:28

Another question is if we can cache the results of building Chez Scheme so that we don’t re-do that work when things don’t change.

notjack

2019-11-20 17:27:03

Caching will be tricky

samth

2019-11-20 17:28:27

https://help.github.com/en/actions/automating-your-workflow-with-github-actions/caching-dependencies-to-speed-up-workflows

samth

2019-11-20 17:40:36

also, can we check out with --depth=1?

pocmatos

2019-11-20 20:32:51

@samth https://github.com/google/oss-fuzz/pull/3054

pocmatos

2019-11-20 20:33:59

We cannot check out with --depth=1. I have tried that in gitlab but was surprised when it failed. :slightly_smiling_face:

samth

2019-11-20 20:34:14

We already do that in Travis

pocmatos

2019-11-20 20:34:38

Ah - yes, in travis works because everything is a single job.

samth

2019-11-20 20:35:09

I’m not sure why that’s related

pocmatos

2019-11-20 20:35:18

When you do it in parallel, you push X, checkout X and build X. Someone in the meantime pushes Y.

pocmatos

2019-11-20 20:35:27

It checks out Y and builds Y.

pocmatos

2019-11-20 20:35:43

When you start to run tests in X, X does not exist if you do --depth=1.

pocmatos

2019-11-20 20:35:46

and the build fails.

pocmatos

2019-11-20 20:36:08

Because you do a checkout per job.

samth

2019-11-20 20:36:13

pocmatos

2019-11-20 20:36:23

You could do --depth=5.

samth

2019-11-20 20:36:27

so we’d need to do a “checkout to this commit”

pocmatos

2019-11-20 20:36:42

And hope 5 is large enough. :slightly_smiling_face:

samth

2019-11-20 20:36:45

5 isn’t always enough because you could push arbitrarily many commits

pocmatos

2019-11-20 20:37:36

right … we don’t know the upper bound that’s safe but it’s not very high for racket. It also depends on how long CI runs and how fast we push commits. :slightly_smiling_face:
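A minimal sketch of the race described above, assuming a linear history and one fresh clone per job. The function names and commit labels are made up for illustration; nothing here calls git itself.

```python
def shallow_clone(history, depth):
    """Return the commits visible to a clone of the branch tip.

    `history` is ordered oldest to newest; depth=None models a full clone.
    """
    if depth is None:
        return list(history)
    return history[-depth:]


def job_can_checkout(history, depth, wanted):
    """True if a job that clones right now can still check out `wanted`."""
    return wanted in shallow_clone(history, depth)


# The build job starts while X is the branch tip.
history = ["A", "B", "X"]
print(job_can_checkout(history, depth=1, wanted="X"))    # True

# Before the test job for X starts, someone pushes Y.
history.append("Y")
print(job_can_checkout(history, depth=1, wanted="X"))    # False
print(job_can_checkout(history, depth=100, wanted="X"))  # True
```

A larger depth only makes the failure less likely: it still breaks once more commits than the depth land between two jobs of the same run.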

pocmatos

2019-11-20 20:38:03

I was intrigued by the idea of caching Chez. What did you have in mind?

notjack

2019-11-20 20:38:03

what’s the default depth? pull everything since the beginning of time?

pocmatos

2019-11-20 20:38:13

@notjack right.

samth

2019-11-20 20:38:15

yes

samth

2019-11-20 20:38:41

I mean cache everything built in ChezScheme until the commit changes

notjack

2019-11-20 20:38:53

so since racket/racket has 40k+ commits, even --depth=100 would be two orders of magnitude better

samth

2019-11-20 20:38:57

yes

samth

2019-11-20 20:39:17

also the most recent N commits have much less data

samth

2019-11-20 20:39:25

since the repositories split

pocmatos

2019-11-20 20:40:12

I mean - if the CI takes 1 hour, how many commits are there per hour in Racket? OK, sometimes Matthew pushes 3 or 4 one after the other but I would say even --depth=10 would be safe.

pocmatos

2019-11-20 20:40:51

@samth i like the caching idea. Not sure exactly how to do it yet but it could speed up the build slightly.

notjack

2019-11-20 20:40:58

it probably would be. would it be hard to debug problems caused by choosing too low a depth?

pocmatos

2019-11-20 20:41:13

no … checkout would say commit doesn’t exist.

pocmatos

2019-11-20 20:41:28

Actually first time I saw it, I was left scratching my head…

samth

2019-11-20 20:41:36

let’s go with 100 and we’ll see if it ever fails

pocmatos

2019-11-20 20:41:41

Sure.

notjack

2019-11-20 20:42:10

So, the caching

notjack

2019-11-20 20:42:40

racket uses a fork of chezscheme maintained at https://github.com/racket/chezscheme right?

samth

2019-11-20 20:42:47

yes

notjack

2019-11-20 20:43:18

is that what gets built during the build of racket/racket? so the build crosses a repository boundary?

samth

2019-11-20 20:43:32

I just tested: all commits: 13.4 100: 4.2 10: 4.1 1: 4.0

samth

2019-11-20 20:43:43

so basically no win from going less than 100
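To make the "no win" concrete, here is the arithmetic on the figures quoted above. The message does not state the unit being measured, so the sketch only computes relative savings against the full clone.

```python
def saving(value, full):
    """Fraction saved relative to the full-history figure."""
    return 1 - value / full


# Figures as reported in the message above (unit not stated).
measurements = {"all commits": 13.4, "depth 100": 4.2, "depth 10": 4.1, "depth 1": 4.0}

full = measurements["all commits"]
for label, value in measurements.items():
    print(f"{label:12s} {value:5.1f}  saving vs full: {saving(value, full):.0%}")
```

Depth 100 already saves about 69% and depth 1 about 70%, so going below 100 buys roughly one percentage point.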

notjack

2019-11-20 20:44:04

interesting

pocmatos

2019-11-20 20:45:40

Unfortunately yes, the build crosses a repo boundary. I had several thoughts about this while doing gitlab but I think with actions we are in a better position. The issue here is that chez can change and break racket build with no changes to the racket repo.

pocmatos

2019-11-20 20:46:30

So I think we should somehow trigger the CS build and test jobs for pushes to racket/ChezScheme.

pocmatos

2019-11-20 20:47:01

This might need an action on the ChezScheme repo but it shouldn’t be impossible to achieve (I hope) - given the rough edges still in gha.

samth

2019-11-20 20:48:53

There’s a github action you can use, actions/cache, to do caching

samth

2019-11-20 20:50:04

https://help.github.com/en/actions/automating-your-workflow-with-github-actions/events-that-trigger-workflows#external-events-repository_dispatch

notjack

2019-11-20 20:50:18

The racket/chezscheme repository seems to have some git submodules too - does changing the version imported with a submodule require a commit?

samth

2019-11-20 20:50:36

we can write a github action on the racket/ChezScheme repo to trigger that on the racket/racket repo
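A rough sketch of what such a trigger could send, assuming the GitHub REST endpoint `POST /repos/{owner}/{repo}/dispatches` behind the `repository_dispatch` event linked above. The event name, payload shape and token are placeholders, not anything agreed in this thread, and the request is only built, not sent. The endpoint may have required a preview `Accept` header at the time of this conversation.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, event_type, payload, token):
    """Build (but do not send) a repository_dispatch request for owner/repo."""
    body = json.dumps({"event_type": event_type, "client_payload": payload})
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/dispatches",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


request = build_dispatch_request(
    owner="racket",
    repo="racket",
    event_type="chezscheme-push",            # hypothetical event name
    payload={"sha": "<ChezScheme commit>"},  # hypothetical payload shape
    token="<token with repo access>",        # placeholder; read from a secret in practice
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

A workflow in racket/racket would then listen for `repository_dispatch` and run the CS build and test jobs.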

samth

2019-11-20 20:50:46

@notjack yes

notjack

2019-11-20 20:51:13

that’s good, then we at least don’t have to worry about triggering builds of racket/chezscheme whenever those dependencies get updated too

notjack

2019-11-20 20:52:01

Could we use a git submodule in racket/racket to import chezscheme and use that for building? It would require periodically keeping things in sync, which isn’t great, but would eliminate the need to do this cross-repo event triggering.

notjack

2019-11-20 20:52:39

(this kind of dependency-graph-based CI triggering and caching is what the system at my day job does and it gets real complicated real fast)

samth

2019-11-20 20:52:48

@notjack I would be in favor of that but traditionally @mflatt has not been

samth

2019-11-20 20:53:18

also you can build Racket with an external Chez Scheme, which is probably how we’d do caching
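One way to picture "cache everything built in Chez Scheme until the commit changes" is a cache keyed on the platform and the Chez Scheme commit. This is an in-memory sketch under that assumption; the key format and helper names are invented, and a real setup would store build artifacts through something like actions/cache rather than a dict.

```python
def chez_cache_key(platform, chez_commit):
    """Key that changes only when the platform or the Chez Scheme commit changes."""
    return f"chez-{platform}-{chez_commit}"


def get_chez_build(cache, platform, chez_commit, build):
    """Return a cached Chez Scheme build, calling `build` only on a cache miss."""
    key = chez_cache_key(platform, chez_commit)
    if key not in cache:
        cache[key] = build(chez_commit)
    return cache[key]


cache = {}
builds = []


def fake_build(commit):
    builds.append(commit)  # record each (expensive) build that actually ran
    return f"artifacts-for-{commit}"


get_chez_build(cache, "linux-x86_64", "abc123", fake_build)
get_chez_build(cache, "linux-x86_64", "abc123", fake_build)  # cache hit, no rebuild
get_chez_build(cache, "linux-x86_64", "def456", fake_build)  # new commit, rebuild
print(builds)  # ['abc123', 'def456']
```

The Racket build would then be pointed at the cached external Chez Scheme instead of rebuilding it.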

notjack

2019-11-20 20:54:00

drawback: we won’t find out if a commit to racket/chezscheme breaks racket/racket until we attempt to update the commit used to import the submodule

popa.bogdanp

2019-11-20 21:01:39

@popa.bogdanp has joined the channel

popa.bogdanp

2019-11-20 21:09:15

Hey folks! I haven’t been keeping up with the discussion and work around this, but I wanted to point out re. this comment that my setup-racket action is able to install snapshot builds of racket on all 3 platforms (though the implementation is a bit hacky), in case that might help. On Linux, installing a snapshot using this action takes less than 20 seconds so it should be much faster than building from source.

notjack

2019-11-20 21:14:56

welcome @popa.bogdanp :wave: good job on the racket setup action btw

popa.bogdanp

2019-11-20 21:17:04

Thanks! I mostly just ripped the guts out of the official setup-python action and based it on that :smile:

pocmatos

2019-11-20 21:17:38

Thanks for that. I might be using that very soon to speed up our PR workflow.

pocmatos

2019-11-20 21:17:49

By that - I mean your action.

pocmatos

2019-11-20 21:18:21

@samth To enable notifications to slack I need some sort of incoming webhook secret. Are you the admin for these things here on slack?

pocmatos

2019-11-20 21:45:55

I have been thinking for a while and I might be missing the right technology. How can we go about testing Racket on FreeBSD? It doesn’t run on docker, and there’s no github runner support for that OS, so we need to virtualize. The closest I got was to use vagrant to test racket manually but I know of no good way to script this atm. Any suggestions and PRs demoing this would be great! :slightly_smiling_face:

samth

2019-11-20 21:46:34

yes, I’ll add the secret on github

pocmatos

2019-11-20 21:47:01

Once you add the secret, can you send me the secret variable name so I can create the workflow? Thanks.

samth

2019-11-20 21:53:18

there’s now a SLACK_WEBHOOK_URL secret

notjack

2019-11-20 21:57:38

@pocmatos based on https://wiki.freebsd.org/Docker and https://reviews.freebsd.org/D21570, it looks like the freebsd folks are actively trying to improve the docker<->freebsd situation

notjack

2019-11-20 21:58:14

asking the freebsd-virtualization mailing list what to do might be a good starting point

pocmatos

2019-11-20 22:02:53

Sure!

pocmatos

2019-11-20 22:02:59

@samth thanks.

pocmatos

2019-11-20 22:03:24

Shall we use the #notifications channel? failures only or everything?

samth

2019-11-20 22:03:57

yes (you have to use #notifications). and that channel currently has everything for all the other CI systems

samth

2019-11-20 22:04:06

except DrDr, we should fix that sometime :slightly_smiling_face:
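A small sketch of the notification step, assuming a standard Slack incoming webhook that accepts a JSON body with a `text` field. The status strings, message format and fallback URL are illustrative only; the request is built but not posted.

```python
import json
import os
import urllib.request


def should_notify(status, failures_only):
    """Decide whether a run is reported; failures_only=False reports everything."""
    return status != "success" or not failures_only


def build_slack_notification(webhook_url, workflow, status, run_url):
    """Build (but do not send) a Slack incoming-webhook request."""
    text = f"{workflow}: {status} ({run_url})"
    return urllib.request.Request(
        url=webhook_url,
        data=json.dumps({"text": text}).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# In a workflow the URL would come from the SLACK_WEBHOOK_URL secret.
webhook_url = os.environ.get("SLACK_WEBHOOK_URL", "https://example.invalid/webhook")

if should_notify("failure", failures_only=False):
    notification = build_slack_notification(
        webhook_url, "CI", "failure", "https://example.invalid/run/1"
    )
    # To actually post it: urllib.request.urlopen(notification)
```

With `failures_only=False` this matches the "everything" behaviour the channel already has for the other CI systems.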

samth

2019-11-20 22:04:19

anyone in this channel interested in DrDr should let me know …

pocmatos

2019-11-20 22:04:53

Is there a DrDr notifications system? Can you add me to the cc of that?

samth

2019-11-20 22:05:44

@pocmatos currently, DrDr notifies the responsible person or people for each file that fails

samth

2019-11-20 22:08:08

here’s the body of my most recent drdr email:

> DrDr has finished building push #53213 after 3.02h.
> <http://drdr.racket-lang.org/53213/>
> A file you are responsible for has a condition that may need inspecting.
> stderr:
> <http://drdr.racket-lang.org/53213/racket/share/pkgs/typed-racket-test/historical-counterexamples.rkt>
> <http://drdr.racket-lang.org/53213/cs/racket/share/pkgs/typed-racket-test/historical-counterexamples.rkt>

I also get the emails for files with no one responsible; that looks like:

> DrDr has finished building push #53213 after 3.02h.
> <http://drdr.racket-lang.org/53213/>
> A file you are responsible for has a condition that may need inspecting.
> stderr:
> <http://drdr.racket-lang.org/53213/racket/share/pkgs/aws/aws/sigv4.rkt>
> <http://drdr.racket-lang.org/53213/pkg-src/build/make>
> <http://drdr.racket-lang.org/53213/cs/racket/share/pkgs/aws/aws/sigv4.rkt>
> unclean:
> <http://drdr.racket-lang.org/53213/racket/share/pkgs/aws/aws/sigv4.rkt>
> <http://drdr.racket-lang.org/53213/cs/racket/share/pkgs/aws/aws/sigv4.rkt>

notjack

2019-11-20 22:09:43

How much overlap is there between what DrDr does and what the package build server does? I’ve always been confused why there’s two of these systems.

pocmatos

2019-11-20 22:10:09

I thought DrDr was the package build server…

notjack

2019-11-20 22:10:18

I’m… not actually sure

pocmatos

2019-11-20 22:10:46

Me neither, but I always assumed that.

samth

2019-11-20 22:10:53

no, they’re totally different

pocmatos

2019-11-20 22:11:02

Whoops - bad assumptions.

samth

2019-11-20 22:11:11

@pocmatos maybe it’s worth writing these things down on the wiki … :slightly_smiling_face:

samth

2019-11-20 22:11:31

http://drdr.racket-lang.org is a CI system for, roughly, the “main-distribution”

pocmatos

2019-11-20 22:11:33

@samth good idea. You explain it to me and I will try and put them down. :slightly_smiling_face:

pocmatos

2019-11-20 22:11:53

So, would that be something we could replace with GHA in the long term?

samth

2019-11-20 22:13:05

it works by checking out the latest racket/racket, building it and all packages in main-distribution and main-distribution-test, and then executing every racket file in every package either with raco test or racket depending on configuration.

samth

2019-11-20 22:13:25

it also now does this with a racketcs build, similarly executing everything
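As a rough picture of the "run every file" step, the sketch below lists one command per `.rkt` file, choosing between `raco test` and `racket`. The `raco_test_files` set is a hypothetical stand-in for DrDr's real per-file configuration, which is not described here, and nothing is executed.

```python
import tempfile
from pathlib import Path


def command_for(path, use_raco_test):
    """Pick the command line for one file: `raco test <file>` or `racket <file>`."""
    return ["raco", "test", str(path)] if use_raco_test else ["racket", str(path)]


def plan_run(root, raco_test_files):
    """List one command per .rkt file under root, in sorted path order."""
    return [
        command_for(path, path.name in raco_test_files)
        for path in sorted(Path(root).rglob("*.rkt"))
    ]


# Demonstration on a throwaway directory with two empty modules.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "a.rkt").write_text("#lang racket/base\n")
    (Path(root) / "b.rkt").write_text("#lang racket/base\n")
    for command in plan_run(root, raco_test_files={"a.rkt"}):
        print(command[:-1])  # the command without the temporary path
```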

samth

2019-11-20 22:14:13

it runs on a single, bespoke, Linux server (located at IU). the configuration is a combination of the racket/drdr repository and a lot of state on that machine

samth

2019-11-20 22:14:23

basically nothing is containerized/protected

notjack

2019-11-20 22:14:54

or hermetic / easily reproducible?

samth

2019-11-20 22:15:11

we have a very full history of runs of the system, so you can go back in time, plus there’s logging/charting of timing results

samth

2019-11-20 22:15:27

every 100 builds is saved and downloadable

samth

2019-11-20 22:16:00

more info at http://drdr.racket-lang.org/help

pocmatos

2019-11-20 22:16:06

OK. I think with GHA + dashboard we could have that implemented.

samth

2019-11-20 22:16:12

and yes, things are not always easily reproducible

pocmatos

2019-11-20 22:16:21

Thanks.

samth

2019-11-20 22:16:42

by far the biggest challenge with doing that somewhere else is that it’s 3 hours of wall-clock time on a 12-core machine per run

samth

2019-11-20 22:16:54

~8 hours compute time

samth

2019-11-20 22:17:11

plus we’re storing a lot of data

pocmatos

2019-11-20 22:17:24

I have a 40 core machine I have been using for racket gitlab. Soon GHA.

pocmatos

2019-11-20 22:17:49

It should be speedier there - it has 2 Xeons 20cores each.

samth

2019-11-20 22:18:30

yes, although less than you’d hope

samth

2019-11-20 22:18:59

there are a number of individual tests that take 20+ minutes
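A back-of-the-envelope bound shows why more cores help less than hoped. It assumes the ideal case where tests parallelize perfectly but a single test cannot be split; the figures are the ones quoted above (about 8 hours of compute, slowest tests around 20 minutes).

```python
def wall_time_lower_bound(total_compute_hours, cores, longest_test_hours):
    """Idealized lower bound on wall-clock time for indivisible, independent tests."""
    return max(total_compute_hours / cores, longest_test_hours)


compute = 8.0      # "~8 hours compute time"
longest = 20 / 60  # "20+ minutes" for the slowest individual tests

for cores in (12, 40):
    bound = wall_time_lower_bound(compute, cores, longest)
    print(f"{cores} cores: at least {bound * 60:.0f} minutes")
```

On 40 cores the bound is set by the slowest test (20 minutes), not by the core count. The observed 3 hours on 12 cores is also far above the ideal 40 minutes, which suggests the run is not close to perfectly parallel to begin with.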

notjack

2019-11-20 22:19:54

cpu-heavy tests or io-heavy tests?

samth

2019-11-20 22:24:36

both: http://drdr.racket-lang.org/53213/pkgs/racket-test/tests/pkg/test.rkt (io) http://drdr.racket-lang.org/53213/racket/share/pkgs/drracket-test/tests/drracket/language-test.rkt (gui concurrency) http://drdr.racket-lang.org/53213/racket/share/pkgs/typed-racket-test/run.rkt (cpu)

notjack

2019-11-20 22:27:23

hmms

samth

2019-11-20 22:28:31

In contrast, http://pkg-build.racket-lang.org builds all the packages in a VM using https://github.com/racket/pkg-build

samth

2019-11-20 22:28:55

it builds each package using the current release, and rebuilds each package when the package has changed since the previous run

samth

2019-11-20 22:29:02

it runs once every 24 hours

samth

2019-11-20 22:29:15

it’s used to populate http://docs.racket-lang.org

samth

2019-11-20 22:29:52

similarly, https://plt.eecs.northwestern.edu/pkg-build/ does the same every 24 hours with the most recent snapshot, and builds every package (since the snapshot changed)

samth

2019-11-20 22:30:54

sometimes that machine does something different, and builds https://plt.eecs.northwestern.edu/release-pkg-build/ using the most recent release candidate

samth

2019-11-20 22:31:18

also every day, both of the sites listed at http://snapshot.racket-lang.org build Racket on a wide variety of platforms

samth

2019-11-20 22:32:01

(which is also a form of CI, since almost any kind of build error in any main-distribution package will cause them to fail)

samth

2019-11-20 22:32:26

also, during the release process, there are regular builds on a variety of platforms which you can get from http://pre-release.racket-lang.org

samth

2019-11-20 22:32:59

those (snapshots and pre-release) are built with https://github.com/racket/distro-build/

notjack

2019-11-20 22:34:42

So what are our goals for CI in racket? So far I’m hearing:

• Ensure that Racket works on the operating systems and architectures we claim to support
• Get CI feedback faster (both by running things faster and by running them when dependencies change)
• Make it easier to get historical CI data about Racket

@pocmatos Did I miss any? Which of those is most important?

samth

2019-11-20 22:35:34

Reduce the amount of CI stuff we have to maintain

pocmatos

2019-11-20 22:35:50

@samth +1

notjack

2019-11-20 22:35:53

also, do we want to focus only on CI for the main distribution, or on the wider racket ecosystem?

samth

2019-11-20 22:36:06

Overall increase the quality of Racket software

samth

2019-11-20 22:36:40

I think doing both is good but we should start with racket/racket, follow up with racket/* and continue to the whole ecosystem

pocmatos

2019-11-20 22:36:58

With regard to your point 3, it’s not only about getting historical data but about understanding the evolution of racket as a piece of software. Currently all the benchmarking is done locally and it’s not straightforward to reproduce.

notjack

2019-11-20 22:38:13

if we had to pick between increasing CI performance/coverage, and reducing the maintenance burden of CI, which should we focus on?

pocmatos

2019-11-20 22:39:50

I would prioritize CI maintenance over CI performance to start with.

notjack

2019-11-20 22:40:23

I think that’s a good direction to go in

pocmatos

2019-11-20 22:40:44

Although coverage is high up there, not worried too much about performance except for PRs. We can reduce coverage there in order to improve performance. Have a push workflow that has more coverage and decent wall time, and a nightly scheduled run with full coverage.
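The three tiers proposed above could be sketched as a mapping from trigger event to test suites. The event names (`pull_request`, `push`, `schedule`) are GitHub Actions triggers; the suite names are hypothetical, since the thread does not say which tests belong to which tier.

```python
# Hypothetical suite names; each tier adds coverage on top of the previous one.
TIERS = {
    "pull_request": ["smoke"],               # fast feedback, reduced coverage
    "push": ["smoke", "core"],               # more coverage, decent wall time
    "schedule": ["smoke", "core", "full"],   # nightly run with full coverage
}


def suites_for(event):
    """Return the test suites to run for a GitHub Actions trigger event name."""
    try:
        return TIERS[event]
    except KeyError:
        raise ValueError(f"no tier configured for event {event!r}") from None


for event in TIERS:
    print(event, "->", ", ".join(suites_for(event)))
```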

notjack

2019-11-20 22:41:33

would we be okay decreasing coverage in the short term if it meant reducing maintenance burden?

pocmatos

2019-11-20 22:42:15

One thing that is quite worrying for me, from a maintainership point of view, is the enormous number of OSes, architectures, VMs, GCs and build configurations we say we support, most of which are not tested.

notjack

2019-11-20 22:42:31

yeah that spooks me too

pocmatos

2019-11-20 22:42:36

Like - an extreme example - the support for QNX.

pocmatos

2019-11-20 22:43:05

I didn’t even know what it was - it’s a commercial OS. We have code ifdefing for this thing but probably nobody cares.

pocmatos

2019-11-20 22:43:15

However, Matthew is extremely reluctant to remove the code.

notjack

2019-11-20 22:43:40

@pocmatos there’s that spreadsheet we made a while ago describing all the combinations and support status, mayhaps that ought to be shared more widely or moved to the wiki

pocmatos

2019-11-20 22:43:53

Maybe I should make a better case for it, though. How can we keep code in the tree that’s just impossible to test? We don’t even have evidence anyone uses it.

pocmatos

2019-11-20 22:44:16

@notjack i started trying to move that to the wiki? Did you miss the url i sent earlier?

notjack

2019-11-20 22:44:24

oh! I didn’t notice

pocmatos

2019-11-20 22:45:02

https://github.com/racket/racket/wiki/Continuous-Integration

pocmatos

2019-11-20 22:45:59

The QNX thing for example: https://github.com/racket/racket/issues/2906#issuecomment-553410336

notjack

2019-11-20 22:54:27

I think it’s reasonable to delete operating-system-specific and architecture-specific code that we 1) can’t find known users for and 2) can’t test. It will bitrot and stop working over time anyway. I’m all but certain that racket-on-QNX is currently broken, just because we’d have absolutely no way of knowing if it broke and no way to test for it.

notjack

2019-11-20 22:55:12

it’s a nontrivial maintenance burden

pocmatos

2019-11-20 22:58:17

I will try to discuss this with Matthew further since it’s also code that has been in our codebase since the days of PLT Scheme, with no testing whatsoever. And you are guaranteed to be correct - Racket QNX definitely doesn’t work but since it’s a commercial OS, which we can’t test, we’ll never know.

pocmatos

2019-11-20 22:58:31

It’s time for me to leave. Talk to you tomorrow/later.

notjack

2019-11-20 22:58:39

:wave:

samth

2019-11-21 00:39:33

I think Matthew’s feeling is that we learned something back then about how to make Racket work on qnx and we shouldn’t throw away that knowledge

samth

2019-11-21 00:39:53

I don’t think he’s under the impression that it currently works out of the box there

samth

2019-11-21 00:40:36

I think also the maintenance burden is low for that

sorawee

2019-11-21 02:18:10

@sorawee has joined the channel

sorawee

2019-11-21 02:25:34

Just want to chime in to say that: while I think that we should take advantage of any features available in GitHub Actions, I disagree with pushing users to use it for the following reasons:

• Some users might not want to use GitHub (trust, etc.)
• Some users might not be able to use GitHub (company uses GitLab, etc.)

I’m not saying that we must right now support multiple platforms. I think it makes perfect sense to focus on GitHub Actions for now. However, I don’t like how https://github.com/racket/racket/wiki/Continuous-Integration states that:

> We are committed to using GitHub Actions for CI as it provides an unparalleled level of integration with our current workflow.

which, as I understand, says that we won’t ever support other platforms.

pocmatos

2019-11-21 06:50:52

Good morning (on my side of the world at least)!

pocmatos

2019-11-21 06:53:39

@sorawee thanks for your input, however I don’t really understand what you mean. CI is not used by the users. It’s a workflow service on the developers side. The developers cannot and should not have to maintain multiple CI systems, which is why we are trying to consolidate. What I meant with what I wrote in the wiki is that our efforts are at the moment to implement CI using GitHub Actions and in order to consolidate, ditch other CI platforms.

notjack

2019-11-21 06:57:09

Also when we’re talking about “CI” we mean specifically CI for the main distribution. The package build server will not and should not move to github actions.

sorawee

2019-11-21 06:58:17

Ah, I see now. I totally misunderstood and thought that this is travis-racket-like thing.

sorawee

2019-11-21 06:58:31

My apologies

pocmatos

2019-11-21 06:58:54

@sorawee ah, Greg’s project? No - this has nothing to do with it. :slightly_smiling_face:

notjack

2019-11-21 07:01:10

Yup, CI for authors of Racket packages will not change :simple_smile: