Racket Slack Archive

arifshaikh.astro

2021-10-28 14:23:13

Does anyone know of any utility in Racket that maps a value in 0-1 to an (r, g, b) for a given color map?

laurent.orseau

2021-10-28 14:28:31

If you have 3 values, each in [0, 1], probably you just need to multiply by 256 (and maybe take the exact-floor of this).

If you have a single value in [0, 1], then I’m not sure there’s a standard representation for this? Though I guess it could be r/256 + g/256² + b/256³ , or the reverse order of this

laurent.orseau

2021-10-28 14:30:07

Or instead of 0–1 (which I interpreted as [0, 1]), maybe you mean a color index for plot, and you want its r, g, b value?

laurent.orseau

2021-10-28 14:31:02

If the latter, you probably want https://docs.racket-lang.org/plot/utils.html#%28def._%28%28lib._plot%2Futils..rkt%29._-~3epen-color%29%29

sorawee

2021-10-28 14:31:06

I thought @laurent.orseau was gonna offer advice on space-filling curves :stuck_out_tongue:

greg

2021-10-28 14:32:07

Oh! I see. Thanks! The resulting GUI is busy but manageable for a modest number of locally installed packages.

badkins

2021-10-28 14:32:29

@soegaard2 what is the relationship, if any, between math/matrix and flomat ?

laurent.orseau

2021-10-28 14:33:08

Oh yeah I can do that too :smile: Or maybe a discrete probability distribution over all the numbers that can be represented as finite binary numbers in [0, 1]? (and I guess you could well define the latter based on space filling curves!)

greg

2021-10-28 14:33:08

I looked again at my hasty/abandoned query, and realized the problem was a cycle due to base depending on racket-lib and vice versa.

greg

2021-10-28 14:34:09

(For my own immediate purpose it would be fine to handle this by just excluding base, I’m interested in things above that level.)

badkins

2021-10-28 14:39:08

My choice of Racket as my company’s primary language was based on a long term view, so I might as well take a long term view in the data science space :) I’m very much a newbie though, so I think I’ll get experience with the Python ecosystem to get the lay of the land, and then try and identify the best way to help move Racket forward in this area.

soegaard2

2021-10-28 14:39:08

None (besides the author :slightly_smiling_face: ).

Since math/matrix is implemented in Racket (using arrays from math/array), it handles matrices over integers (bignums), rationals, and floating-point numbers. All algorithms are implemented in Racket.

The matrices in flomat are basically represented as vectors of floating-point numbers laid out in the way BLAS/LAPACK expects. Almost all functions simply call the corresponding BLAS/LAPACK function.
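
A minimal sketch of the exact-arithmetic side of that comparison, using only math/matrix. The 2x2 matrix is made up for illustration; the point is that rational entries stay exact, whereas a BLAS/LAPACK-backed representation would hold them as doubles.

```racket
#lang racket/base
(require math/matrix)

;; Integer entries go in; the inverse comes back as exact rationals.
(define A (matrix [[1 2] [3 4]]))
(define A-inv (matrix-inverse A))

;; Row-major list of the inverse's entries.
(define inv-entries (matrix->list A-inv))

;; Multiplying back gives the identity with no rounding error.
(define product-entries (matrix->list (matrix* A A-inv)))
```

Here `inv-entries` contains fractions such as 3/2 rather than 1.5, which is the behaviour described for math/matrix above.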

arifshaikh.astro

2021-10-28 14:40:27

The problem I am trying to address is the following: I have a 2d scatter plot and now I want to use a colorbar created using a linear-gradient%. How do I now assign colors from this linear-gradient% to the scattered points depending on their value from a 3rd list. I was thinking on normalizing the values in the 3rd list to [0-1] and then using a function which gives a (r, g, b) for a value from the linear-gradient% . Let me know if there is a better way to do this.

badkins

2021-10-28 14:40:53

I realize I’m at the stage where I very much don’t know what I don’t know, but it seems to me that for data science & machine learning, floating point matrices would be sufficient, no?

soegaard2

2021-10-28 14:42:02

Yes. I believe so.

laurent.orseau

2021-10-28 14:43:20

If x is your value in [0, 1], how about r = floor(x*256) g = floor((1-x)*256) b = 127

laurent.orseau

2021-10-28 14:45:28

Hm, maybe your problem is the other way round. I’m a little confused.

laurent.orseau

2021-10-28 14:45:43

What’s the 3rd list?

arifshaikh.astro

2021-10-28 14:48:58

For example, I want to show z coordinate as a color in my scatter plot of the (x, y) coordinates given a list of X, Y, Z

laurent.orseau

2021-10-28 14:49:37

Ok. Then you should skip the linear-gradient% (from which it’s hard to query values) and use a formula like the one I suggested above

laurent.orseau

2021-10-28 14:50:22

Or: r = g = b = floor(x*200) if you want grayscale

laurent.orseau

2021-10-28 14:50:38

(while avoiding the whites, since your background is likely white)
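
The formula-based approach suggested above could be sketched as follows. The names `normalize` and `value->rgb` are hypothetical helpers, not part of plot or racket/draw. One assumption differs from the formula as written: the result is clamped to 255, because floor(x*256) yields 256 when x is exactly 1, which is outside the byte range.

```racket
#lang racket/base
(require racket/math)

;; Rescale a list of reals to [0, 1]; a constant list maps to all zeros.
(define (normalize zs)
  (define lo (apply min zs))
  (define hi (apply max zs))
  (define span (- hi lo))
  (for/list ([z (in-list zs)])
    (if (zero? span) 0 (/ (- z lo) span))))

;; Map x in [0, 1] to a list (r g b) of integers in 0..255:
;; red grows with x, green shrinks with x, blue is fixed at 127.
(define (value->rgb x)
  (define (byte v) (max 0 (min 255 (exact-floor (* 256 v)))))
  (list (byte x) (byte (- 1 x)) 127))
```

With a list of z values, `(map value->rgb (normalize zs))` gives one (r g b) triple per point; each triple can then be turned into a color object with `(make-object color% r g b)`.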

arifshaikh.astro

2021-10-28 14:53:36

hmmm, let me see how this looks.

sorawee

2021-10-28 15:01:34

Alternatively, if you want rainbow color, you can use:

(define (convert x)
  (define c (hsv->color (hsv x 1 1)))
  (values (send c red) (send c green) (send c blue)))

sorawee

2021-10-28 15:01:42

Here’s an example

sorawee

2021-10-28 15:02:03

sorawee

2021-10-28 15:04:25

Terrible in practice since 0 and 1 have similar color. Perhaps you might want just a half of it
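
A rough sketch of the "just a half of it" idea. To stay self-contained it uses a hand-rolled hue-to-RGB conversion at full saturation and value rather than the `hsv->color` call above, so `hue->rgb` and `half-rainbow` are hypothetical names. Only hues from 0 to 1/2 are used, so the two ends of the scale (red and cyan) no longer look alike.

```racket
#lang racket/base
(require racket/math)

;; Convert a hue h (0 = red, 1/3 = green, 2/3 = blue) at full
;; saturation and value to a list (r g b) of integers in 0..255.
(define (hue->rgb h)
  (define h6 (* 6 h))
  (define sector (exact-floor h6))
  (define f (- h6 sector)) ; position inside the current sector
  (define-values (r g b)
    (case (modulo sector 6)
      [(0) (values 1 f 0)]
      [(1) (values (- 1 f) 1 0)]
      [(2) (values 0 1 f)]
      [(3) (values 0 (- 1 f) 1)]
      [(4) (values f 0 1)]
      [else (values 1 0 (- 1 f))]))
  (map (lambda (c) (exact-round (* 255 c))) (list r g b)))

;; Map x in [0, 1] onto the first half of the hue circle only.
(define (half-rainbow x)
  (hue->rgb (* 1/2 x)))
```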

arifshaikh.astro

2021-10-28 15:06:55

Thanks. But yeah, I need a color-gradient that shows the min and max values nicely. Rainbow colors won’t be appropriate.

laurent.orseau

2021-10-28 15:09:29

Ah, too bad, I had just made the function manually:

#lang racket
(require pict racket/draw)
(apply hc-append
       (for/list ([x (in-range 0 1 1/100)])
         (define r (exact-floor (* 255 x)))
         (define g (exact-floor (* 255 (abs (* 2 (- x 1/2))))))
         (define b (exact-floor (* 255 (cond [(< x 1/3) (* 3 x)]
                                             [(< x 2/3) (- 1 (* 3 (- x 1/3)))]
                                             [else (* 3 (- x 2/3))]))))
         (filled-rectangle 2 10 #:color (make-object color% r g b) #:draw-border? #false)))

laurent.orseau

2021-10-28 15:09:46

laurent.orseau

2021-10-28 15:11:29

Or if you replace x with (- 1 x) for red:

badkins

2021-10-28 15:22:43

Is it possible to call Python code directly from Racket? If so, would you mind posting a simple example?

laurent.orseau

2021-10-28 15:23:37

More colors on the same span:

#lang racket
(require pict racket/draw)
(apply hc-append
       (for/list ([x (in-range 0 1 1/400)])
         (define r (exact-floor (* 255 (- 1 x))))
         (define g (exact-floor (* 255 (- 1 (abs (* 2 (- x 1/2)))))))
         (define b (exact-floor (* 255 (cond [(< x 1/4) (* 4 x)]
                                             [(< x 1/2) (- 1 (* 4 (- x 1/4)))]
                                             [(< x 3/4) (* 4 (- x 1/2))]
                                             [else (- 1 (* 4 (- x 3/4)))]))))
         (filled-rectangle 2 10 #:color (make-object color% r g b) #:draw-border? #false)))

laurent.orseau

2021-10-28 15:25:05

apart from system or process and friends?

soegaard2

2021-10-28 15:27:14

There was a #lang python at some point - but numpy and friends are not implemented in Python. They are wrappers over C code (mostly).

capfredf

2021-10-28 15:39:18

@mflatt I got an error when using pkg-build with docker:

bin/racket: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29` not found (required by bin/racket)

The docker image, I think, is https://hub.docker.com/r/racket/pkg-build, which is based on 16.04. Looks like we need to use a newer version of Ubuntu

badkins

2021-10-28 15:43:54

Yes, apart from those. I’m familiar with using those as a workaround, but I wasn’t sure if there was a more direct way.

mflatt

2021-10-28 15:48:28

If I understand, you built Racket on a newer Linux installation, and so it doesn’t run on older installations like the Docker image suggested by the pkg-build docs. You could either create a different Docker image to use (probably starting with a Dockerfile in the pkg-build package) or build a Racket distribution through a Docker container running an older Linux (like “racket/distro-build:unix-installer-test”).

capfredf

2021-10-28 15:51:09

Ah, yes, the Racket was built on 20.04

capfredf

2021-10-28 15:51:12

Thank you very much

laurent.orseau

2021-10-28 16:07:28

If you have lots of short calls to python, one (non-trivial) solution is to set up a server in python and communicate via channels. That would avoid the cost of starting a python program each time, but of course it’s substantially heavier to set up

ben.knoble

2021-10-28 16:10:46

https://github.com/Aeva/snake-oil

seanbunderwood

2021-10-28 16:55:05

Dates and strings are also in R/Julia/Python data tables. I’d say without them you won’t attract any data scientists, because time series and categorical data are also huge use cases. Text manipulation, too. And not having a unified library and data structure that can handle all of it would be a big inconvenience.

seanbunderwood

2021-10-28 16:59:45

That’s part of why I was pushing for building on top of Arrow. Just handling numbers is a decent 80% solution that probably covers most scientific computing, but the other 20% is still critical for users more in the data science space, and I think it’s probably better to take the up-front hit on supporting that stuff from the ground up than it is to go for the quick easy gains but risk getting painted into a corner.

seanbunderwood

2021-10-28 17:01:04

Or, to put it more pithily, the world seems to have largely moved on from Fortran, and now wants Pandas.

thechairman

2021-10-28 17:11:11

there’s the sawzall package that handles a lot of that kind of mixed tabular data

thechairman

2021-10-28 17:11:28

very reminiscent of the tidyverse R stuff

badkins

2021-10-28 17:24:27

@seanbunderwood I hear what you’re saying re: data tables, but can Python data tables be used directly for linear algebra / general matrix stuff? Or do people copy data out of data tables into something to do linear algebra on?

badkins

2021-10-28 17:25:46

I think the first step for me personally is to simply get started with some basic data science stuff to see what exists, and what’s missing. Then I’ll solicit input from others about possible options for moving forward.

badkins

2021-10-28 17:35:24

From this page (https://www.machinelearningplus.com/data-manipulation/101-python-datatable-exercises-pydatatable/) it looks like copying is probably done w/ Python data tables:

import datatable as dt
df = dt.fread('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv')
# to pandas df
pd_df = df.to_pandas()
# to numpy arrays
np_arrays = df.to_numpy()
# to dictionary
dic = df.to_dict()
# to list
list_ = df[:,"indus"].to_list()
# to tuple
tuples_ = df[:,"indus"].to_tuples()
# to csv
df.to_csv("BostonHousing.csv")

badkins

2021-10-28 17:43:56

Looking forward to @hazel’s talk at RacketCon !

badkins

2021-10-28 17:47:14

Just noticed that @alexharsanyi , from this thread, created the data-frame package that sawzall is built upon. Looks like there are some nice building blocks already in Racket.

seanbunderwood

2021-10-28 17:59:06

There is not necessarily any copying of the actual data, though you can if you want to. Typically what happens, though, is that you’re just getting two different objects that point to the same underlying data.

thechairman

2021-10-28 17:59:54

yeah, the blocks are mostly all there, they just need some more glue imo

thechairman

2021-10-28 18:00:25

the ideal would be a nice self-contained system like R tidyverse or F#’s FsLab

seanbunderwood

2021-10-28 18:01:31

That is specifically how it works with functions like to_pandas or to_numpy. Going to a list of dicts or tuples is a bigger operation because you’re moving the data to a fundamentally different layout in memory.

badkins

2021-10-28 18:07:22

@seanbunderwood just so I understand, by “There is not necessarily any copying of the actual data”, are you saying that, in Python, one can load a datatable with a mix of numbers, dates, strings, etc., and then perform linear algebra on a subset of that datatable, without copying data? That sounds like magic to me :)

badkins

2021-10-28 18:09:52

Maybe @alexharsanyi and @soegaard2 can comment on the possibility of doing something similar from data-frame to flomat . I guess at a minimum, data-frame would need to store columns contiguously.

badkins

2021-10-28 18:10:58

Seems like it would be necessary to either not have a header row in the datatable, or have the header row be a separate datastructure from the main array.

ben.knoble

2021-10-28 18:14:09

Feedback welcome: https://pkgd.racket-lang.org/pkgn/package/scribble-lp2-manual

greg

2021-10-28 18:28:44

Why does the racket-doc package depend on the drracket package?

greg

2021-10-28 18:29:07

That seems counterintuitive.

greg

2021-10-28 18:29:16

https://github.com/racket/racket/blob/master/pkgs/racket-doc/info.rkt#L38

greg

2021-10-28 18:29:25

https://pkgs.racket-lang.org/package/racket-doc

greg

2021-10-28 18:35:26

It looks like that happened about 5 years ago to update the style guide in https://github.com/racket/racket/commit/5f5fc0935d88abf389e465c7e34d61e351dd2e8a#diff-0802933065577058b09c5a205086d9284177f05251729e55c8c9e4212517c0cb

greg

2021-10-28 18:36:19

I guess one reason I ask is that drracket is a somewhat “heavy” dependency, for example it pulls in the gui-lib package.

mflatt

2021-10-28 18:44:11

The racket-doc package also directly depends on gui, though. I vaguely remember being concerned about the drracket dependency, but discovering that it didn’t much matter. Then again, I may misremember.

sorawee

2021-10-28 18:44:40

See also: https://github.com/racket/racket/pull/3215 which removes the DrRacket dep

samth

2021-10-28 18:59:46

I would like to merge that PR, or something like it, but there are two problems:

1. There’s no way to do indirect references to identifiers. Creating one seems hard.
2. @mflatt didn’t like the scribble/docnames approach, which moves some information that should be in eg the drracket package into racket-doc. I think if (1) was fixed, though, we could persuade him.

soegaard2

2021-10-28 19:05:58

The representation used in flomat is:

; BLAS/LAPACK represents matrices as one-dimensional arrays
; of numbers (S=single, D=double, X=complex or Z=double complex).
; This library uses arrays of doubles.
(define _flomat (_cpointer 'flomat))

; The array is wrapped in a struct, which besides
; a pointer to the array, holds the number of
; rows and columns. Future extension could be to
; allow different types of numbers, or perhaps
; choose specialized operations for triangular matrices.
...
; m = rows, n = cols, a = mxn array of doubles
; lda = leading dimension of a (see below)
(struct flomat (m n a lda)
  #:methods gen:custom-write
  [(define write-proc flomat-print)]
  #:methods gen:equal+hash
  [(define equal-proc
     (λ (A B rec)
       (and (= (flomat-m A) (flomat-m B))
            (= (flomat-n A) (flomat-n B))
            (or (equal? (flomat-a A) (flomat-a B))
                (flomat= A B epsilon)))))
   (define hash-proc
     ; TODO: Avoid allocation in hash-proc.
     (λ (A rec)
       (define-param (m n) A)
       (rec (cons m (cons n (flomat->vector A))))))
   (define hash2-proc
     (λ (A rec)
       (define-param (m n) A)
       (rec (cons n (cons m (flomat->vector A))))))])

greg

2021-10-28 19:25:44

Oh I wasn’t aware of the context and the PR. I’ll check that out.

badkins

2021-10-28 19:26:17

@soegaard2 is (flomat-a) a vector ? If so, what is the type of the elements of the vector? I expect BLAS/LAPACK are not expecting boxed values, so it must be in an IEEE 64-bit floating point format, right?

soegaard2

2021-10-28 19:28:05

It’s a pointer to a piece of memory with floating points.

This allocates memory for an mxn matrix.

(define (alloc-flomat m n)
  (if (or (= m 0) (= n 0))
      #f ; ~ NULL
      (cast (malloc (* m n) _double 'atomic)
            _pointer _flomat)))
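
As a small illustration of what such a block of memory holds, the sketch below allocates room for m*n doubles with the same `malloc` call and reads and writes them with `ptr-set!` and `ptr-ref` from ffi/unsafe. The dimensions and values are arbitrary, and this is not flomat's own accessor code; it only shows that the elements are raw 64-bit doubles rather than boxed Racket values.

```racket
#lang racket/base
(require ffi/unsafe)

(define m 3)
(define n 2)

;; Raw block of m*n unboxed doubles, allocated as in alloc-flomat.
(define block (malloc (* m n) _double 'atomic))

;; Fill the block with 0.0, 1.0, ..., 5.0 by element index.
(for ([k (in-range (* m n))])
  (ptr-set! block _double k (exact->inexact k)))

;; Read back the element at index 2.
(define third (ptr-ref block _double 2))
```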

greg

2021-10-28 19:29:31

> The racket-doc package also directly depends on gui, though. Well, derp. I see that now. Although that does makes me wonder why it would depend on gui as opposed to just gui-doc, I’ll see if I can figure out why.

badkins

2021-10-28 19:29:32

Oh! Wow. Ok, so to be able to convert from a data-frame to flomat seamlessly, it seems data-frame would have to use the same type of memory layout, and that seems very unlikely :)

notjack

2021-10-28 19:30:00

The next Rhombus meeting is happening today, details available here: https://github.com/racket/rhombus-brainstorming/discussions/180

greg

2021-10-28 19:30:18

(Maybe for reasons similar to what Sam just said. I’ll marinade myself in the PR…)

soegaard2

2021-10-28 19:30:20

Yes.

badkins

2021-10-28 19:31:03

I have no idea how accepted the Arrow memory layout is, but hypothetically, I wonder how hard it would be to convert flomat to use that since you’re managing memory so directly already. Maybe it’s already close to it.

badkins

2021-10-28 19:35:29

I think I now understand a little more clearly what @seanbunderwood was trying to convey :) It does seem like a data frame that uses the same memory layout as flomat would be very handy to avoid having to copy tons of data. My main goal would be to be able to have zero-copy within Racket code, but if we could get interop with other languages also, that would be a win.

greg

2021-10-28 19:36:21

Also to be fair I’m not sure that gui-lib is even all that “heavy” per se. I think I’m sensitized to it because, if someone does end up installing it on a headless server, it sets the stage for various awkward scenarios.

soegaard2

2021-10-28 19:36:21

The lda can be used to make “gaps” between rows.

soegaard2

2021-10-28 19:36:46

It’s normally used to make a submatrix that shares the memory with a larger matrix.

soegaard2

2021-10-28 19:37:31

Potentially one could store other information in such gaps - but I think it would be a hassle to work with.

badkins

2021-10-28 19:39:59

Yeah, I was just (naively?) thinking that if you had a data frame like:

2021-10-24T12:00:00 12.34 32.54
2021-10-24T12:00:01 67.43 28.98
2021-10-24T12:00:02 42.97 31.21

you would be able to “convert” the 3x2 number portion to a flomat array w/o copying.

badkins

2021-10-28 19:41:06

Because they would both be viewing memory as: 12.34 67.43 42.97 32.54 28.98 31.21
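
The layout in that last line (one column after another) and the role of lda mentioned earlier can be sketched with a plain flvector. This is an illustration under assumptions, not flomat's implementation: the `view` struct and `view-ref` are hypothetical, and storage is assumed to be column-major with element (i, j) at index offset + i + j*lda.

```racket
#lang racket/base
(require racket/flonum)

;; A hypothetical view onto column-major storage:
;; m rows, n columns, lda = distance between the starts of two columns.
(struct view (data offset m n lda))

;; Element (i, j) lives at offset + i + j*lda.
(define (view-ref v i j)
  (flvector-ref (view-data v)
                (+ (view-offset v) i (* j (view-lda v)))))

;; The two numeric columns from the example, stored back to back.
(define storage (flvector 12.34 67.43 42.97 32.54 28.98 31.21))

;; All three rows and both columns.
(define whole (view storage 0 3 2 3))

;; Rows 1 and 2 only: same storage, same lda, shifted offset, no copy.
(define bottom (view storage 1 2 2 3))
```

Because `bottom` keeps lda at 3 while holding only two rows, it skips one element per column, which is how a submatrix can share memory with the larger matrix.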

badkins

2021-10-28 19:42:55

I suppose that would bring up memory ownership issues, but in a worst case, a memcpy of the contiguous data would probably be much quicker than a loop.

seanbunderwood

2021-10-28 20:30:04

Also, a memcpy in order to get set up for a matrix multiply may not be a big deal in practice. Multiplying matrices is an embarrassingly small percentage of where working data scientists actually spend their time. Maybe 90% of it is data cleaning, prep, EDA, stuff like that. And that’s a spot where the bottleneck is how fast humans can work, not CPUs or memory controllers.

seanbunderwood

2021-10-28 20:33:15

Like, I think that’s where Breeze failed on the JVM. They focused too much on making the math efficient & powerful, but the ergonomics are weak, so, when I’m stuck using it, I end up feeling like it’s an uphill struggle to do my actual job.

notjack

2021-10-28 21:22:32

Meeting notes here: https://github.com/racket/rhombus-brainstorming/discussions/180#discussioncomment-1555545

alexharsanyi

2021-10-28 22:03:18

You might also want to look at this: https://docs.racket-lang.org/colormaps/index.html

laurent.orseau

2021-10-28 22:16:24

Is it possible to typeset something like

> ;;; some comment

with racketblock and (code:comment _content_)?

@racketblock[
(code:comment ";;; something")
]

gives me:

; ;;; something

spdegabrielle

2021-10-28 22:24:40

Thanks for the notes. Much appreciated.

mflatt

2021-10-28 22:47:38

I don’t think racketblock offers a way to do what you want with code:comment. You’d have to escape at the point where you want a comment and generate the output including a leading ;.

laurent.orseau

2021-10-28 22:55:08

I see, thank you

gknauth

2021-10-29 02:09:23

iowa?

joel

2021-10-29 02:10:53

Feed…corn… maybe Iowa was a bit too lateral

gknauth

2021-10-29 02:12:06

Got it. This is why I asked:

joel

2021-10-29 02:20:10

I ended up going in a different direction: https://docs.racket-lang.org/splitflap/index.html

joel

2021-10-29 02:20:39

Anyways I’m a Minnesota guy. Couldn’t bring myself to name it Iowa

a11ce

2021-10-29 04:01:58

how do i get drracket to log stuff? i tried set PLTSTDERR info; drracket but nothing is printing

notjack

2021-10-29 04:20:40

DrRacket has a log viewer you can use