arifshaikh.astro
2021-10-28 14:23:13

Does anyone know of any utility in Racket that maps a value in 0-1 to an (r, g, b) for a given color map?


laurent.orseau
2021-10-28 14:28:31

If you have 3 values, each in [0, 1], probably you just need to multiply by 255 (and maybe take the exact-floor of this).

If you have a single value in [0, 1], then I’m not sure there’s a standard representation for this? Though I guess it could be r/256 + g/256² + b/256³ , or the reverse order of this


laurent.orseau
2021-10-28 14:30:07

Or instead of 0–1 (which I interpreted as [0, 1]), maybe you mean a color index for plot, and you want its r, g, b value?



sorawee
2021-10-28 14:31:06

I thought @laurent.orseau was gonna offer advice on space-filling curves :stuck_out_tongue:


greg
2021-10-28 14:32:07

Oh! I see. Thanks! The resulting GUI is busy but manageable for a modest number of locally installed packages.


badkins
2021-10-28 14:32:29

@soegaard2 what is the relationship, if any, between math/matrix and flomat ?


laurent.orseau
2021-10-28 14:33:08

Oh yeah I can do that too :smile: Or maybe a discrete probability distribution over all the numbers that can be represented as finite binary numbers in [0, 1]? (and I guess you could well define the latter based on space filling curves!)


greg
2021-10-28 14:33:08

I looked again at my hasty/abandoned query, and realized the problem was a cycle due to base depending on racket-lib and vice versa.


greg
2021-10-28 14:34:09

(For my own immediate purpose it would be fine to handle this by just excluding base, I’m interested in things above that level.)


badkins
2021-10-28 14:39:08

My choice of Racket as my company’s primary language was based on a long term view, so I might as well take a long term view in the data science space :) I’m very much a newbie though, so I think I’ll get experience with the Python ecosystem to get the lay of the land, and then try and identify the best way to help move Racket forward in this area.


soegaard2
2021-10-28 14:39:08

None (besides the author :slightly_smiling_face: ).

Since math/matrix is implemented in Racket (using arrays from math/array) it handles matrices over integers (bignums), rationals, and floating-point numbers. All algorithms are implemented in Racket.

The matrices in flomat are basically represented as vectors of floating-point numbers laid out in the way BLAS/LAPACK expects. Almost all functions simply call the corresponding BLAS/LAPACK function.


arifshaikh.astro
2021-10-28 14:40:27

The problem I am trying to address is the following: I have a 2d scatter plot and now I want to use a colorbar created using a linear-gradient%. How do I now assign colors from this linear-gradient% to the scattered points depending on their value from a 3rd list? I was thinking of normalizing the values in the 3rd list to [0, 1] and then using a function which gives an (r, g, b) for a value from the linear-gradient%. Let me know if there is a better way to do this.


badkins
2021-10-28 14:40:53

I realize I’m at the stage where I very much don’t know what I don’t know, but it seems to me that for data science & machine learning, floating point matrices would be sufficient, no?


soegaard2
2021-10-28 14:42:02

Yes. I believe so.


laurent.orseau
2021-10-28 14:43:20

If x is your value in [0, 1], how about r = floor(x*255), g = floor((1-x)*255), b = 127
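
A minimal Racket sketch of this formula. The name value->rgb is made up for illustration, and scaling by 255 with clamping is an assumption so that the endpoints stay inside the 0..255 byte range.

```racket
#lang racket/base
(require racket/math) ; for exact-floor

;; value->rgb is a hypothetical helper, not part of any library.
;; x is expected in [0, 1]; values outside that range are clamped first.
(define (value->rgb x)
  (define t (max 0 (min 1 x)))
  (values (exact-floor (* 255 t))        ; red grows with x
          (exact-floor (* 255 (- 1 t)))  ; green shrinks with x
          127))                          ; blue is held constant
```

The three results can then be handed to something like (make-object color% r g b) from racket/draw.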


laurent.orseau
2021-10-28 14:45:28

Hm, maybe your problem is the other way round. I’m a little confused.


laurent.orseau
2021-10-28 14:45:43

What’s the 3rd list?


arifshaikh.astro
2021-10-28 14:48:58

For example, I want to show z coordinate as a color in my scatter plot of the (x, y) coordinates given a list of X, Y, Z


laurent.orseau
2021-10-28 14:49:37

Ok. Then you should skip the linear-gradient% (from which it’s hard to query values) and use a formula like the one I suggested above
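
A rough sketch of that approach, assuming the Z values arrive as a non-empty list of reals. Both normalize and z->colors are hypothetical names; the ramp is the red/green formula from earlier in the thread, scaled by 255.

```racket
#lang racket/base
(require racket/math) ; for exact-floor

;; normalize (hypothetical): rescale a non-empty list of reals to [0, 1].
(define (normalize zs)
  (define lo (apply min zs))
  (define hi (apply max zs))
  (define span (- hi lo))
  (for/list ([z (in-list zs)])
    (if (zero? span)
        0                        ; all values equal: use the low end
        (/ (- z lo) span))))

;; z->colors (hypothetical): one (list r g b) per point.
(define (z->colors zs)
  (for/list ([t (in-list (normalize zs))])
    (list (exact-floor (* 255 t))
          (exact-floor (* 255 (- 1 t)))
          127)))
```

For example, (z->colors '(10 20 30)) gives '((0 255 127) (127 127 127) (255 0 127)). As far as I know, plot's points renderer takes a #:color argument that accepts such an RGB list, but drawing one renderer per point may be slow for large data sets.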


laurent.orseau
2021-10-28 14:50:22

Or: r = g = b = floor(x*200) if you want grayscale


laurent.orseau
2021-10-28 14:50:38

(while avoiding the whites, since your background is likely white)


arifshaikh.astro
2021-10-28 14:53:36

hmmm, let me see how this looks.


sorawee
2021-10-28 15:01:34

Alternatively, if you want rainbow color, you can use:

(define (convert x)
  (define c (hsv->color (hsv x 1 1)))
  (values (send c red) (send c green) (send c blue)))


sorawee
2021-10-28 15:01:42

Here’s an example


sorawee
2021-10-28 15:04:25

Terrible in practice since 0 and 1 have similar color. Perhaps you might want just a half of it
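
One way to sketch the "just a half of it" idea. To keep it self-contained, this does the HSV-to-RGB conversion by hand rather than calling hsv->color, so hsv->rgb and half-rainbow are hypothetical helpers, and restricting the hue to [0, 1/2] (red through cyan) is just one possible choice of "half".

```racket
#lang racket/base
(require racket/math) ; for exact-floor and exact-round

;; hsv->rgb (hypothetical): h, s, v in [0, 1]; returns three bytes.
(define (hsv->rgb h s v)
  (define h6 (* 6 (- h (floor h))))   ; hue sector position in [0, 6)
  (define i (exact-floor h6))         ; sector index 0..5
  (define f (- h6 i))                 ; position inside the sector
  (define p (* v (- 1 s)))
  (define q (* v (- 1 (* s f))))
  (define t (* v (- 1 (* s (- 1 f)))))
  (define-values (r g b)
    (case i
      [(0) (values v t p)]
      [(1) (values q v p)]
      [(2) (values p v t)]
      [(3) (values p q v)]
      [(4) (values t p v)]
      [else (values v p q)]))
  (values (exact-round (* 255 r))
          (exact-round (* 255 g))
          (exact-round (* 255 b))))

;; half-rainbow (hypothetical): x in [0, 1] mapped to hues 0..1/2,
;; so the two ends of the scale get clearly different colors.
(define (half-rainbow x)
  (hsv->rgb (* 1/2 x) 1 1))
```

With this, x = 0 comes out red (255 0 0) and x = 1 comes out cyan (0 255 255).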


arifshaikh.astro
2021-10-28 15:06:55

Thanks. But yeah, I need a color-gradient that shows the min and max values nicely. Rainbow colors won’t be appropriate.


laurent.orseau
2021-10-28 15:09:29

Ah, too bad, I had just made the function manually:

#lang racket
(require pict racket/draw)
(apply hc-append
       (for/list ([x (in-range 0 1 1/100)])
         (define r (exact-floor (* 255 x)))
         (define g (exact-floor (* 255 (abs (* 2 (- x 1/2))))))
         (define b (exact-floor (* 255 (cond [(< x 1/3) (* 3 x)]
                                             [(< x 2/3) (- 1 (* 3 (- x 1/3)))]
                                             [else (* 3 (- x 2/3))]))))
         (filled-rectangle 2 10
                           #:color (make-object color% r g b)
                           #:draw-border? #false)))


laurent.orseau
2021-10-28 15:11:29

Or if you replace x with (- 1 x) for red:


badkins
2021-10-28 15:22:43

Is it possible to call Python code directly from Racket? If so, would you mind posting a simple example?


laurent.orseau
2021-10-28 15:23:37

More colors on the same span:

#lang racket
(require pict racket/draw)
(apply hc-append
       (for/list ([x (in-range 0 1 1/400)])
         (define r (exact-floor (* 255 (- 1 x))))
         (define g (exact-floor (* 255 (- 1 (abs (* 2 (- x 1/2)))))))
         (define b (exact-floor (* 255 (cond [(< x 1/4) (* 4 x)]
                                             [(< x 1/2) (- 1 (* 4 (- x 1/4)))]
                                             [(< x 3/4) (* 4 (- x 1/2))]
                                             [else (- 1 (* 4 (- x 3/4)))]))))
         (filled-rectangle 2 10
                           #:color (make-object color% r g b)
                           #:draw-border? #false)))


laurent.orseau
2021-10-28 15:25:05

apart from system or process and friends?


soegaard2
2021-10-28 15:27:14

There was a #lang python at some point, but numpy and friends are not implemented in Python. They are wrappers over C code (mostly).


capfredf
2021-10-28 15:39:18

@mflatt I got an error when using pkg-build with docker:

bin/racket: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29` not found (required by bin/racket)

The docker image, I think, is https://hub.docker.com/r/racket/pkg-build, which is based on 16.04. Looks like we need to use a newer version of Ubuntu.


badkins
2021-10-28 15:43:54

Yes, apart from those. I’m familiar with using those as a workaround, but I wasn’t sure if there was a more direct way.


mflatt
2021-10-28 15:48:28

If I understand, you built Racket on a newer Linux installation, and so it doesn’t run on older installations like the Docker image suggested by the pkg-build docs. You could either create a different Docker image to use (probably starting with a Dockerfile in the pkg-build package) or build a Racket distribution through a Docker container running an older Linux (like “racket/distro-build:unix-installer-test”).


capfredf
2021-10-28 15:51:09

Ah, yes, the Racket was built on 20.04


capfredf
2021-10-28 15:51:12

Thank you very much


laurent.orseau
2021-10-28 16:07:28

If you have lots of short calls to python, one (non-trivial) solution is to set up a server in python and communicate via channels. That would avoid the cost of starting a python program each time, but of course it’s substantially heavier to set up.
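
A small sketch of that idea, under several assumptions: the Python side is a made-up worker that reads one number per line and prints its square root, plain stdin/stdout pipes stand in for the channel, and start-worker, ask and stop-worker are hypothetical names. Only subprocess and the port functions come from Racket itself.

```racket
#lang racket/base

;; Hypothetical Python worker: one request per line, one reply per line.
(define worker-source
  (string-append
   "import sys, math\n"
   "while True:\n"
   "    line = sys.stdin.readline()\n"
   "    if not line:\n"
   "        break\n"
   "    print(math.sqrt(float(line)), flush=True)\n"))

;; Start the worker once. python-path is the path to a Python 3 executable.
;; Returns the process handle, a port to read replies, a port to send requests.
(define (start-worker python-path)
  (define-values (proc from-py to-py err-port)
    (subprocess #f #f 'stdout python-path "-c" worker-source))
  (values proc from-py to-py))

;; Send one number and block until the reply line arrives.
(define (ask from-py to-py x)
  (write-string (number->string (exact->inexact x)) to-py)
  (newline to-py)
  (flush-output to-py)
  (string->number (read-line from-py 'any)))

;; Closing the request port gives the worker EOF, which ends its loop.
(define (stop-worker proc from-py to-py)
  (close-output-port to-py)
  (subprocess-wait proc)
  (close-input-port from-py))
```

Usage would look roughly like (define-values (proc in out) (start-worker (find-executable-path "python3"))) followed by repeated (ask in out 9) calls, so the Python start-up cost is paid only once.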


seanbunderwood
2021-10-28 16:55:05

Dates and strings are also in R/julia/Python data tables. I’d say without them you won’t attract any data scientists, because time series and categorical data are also huge use cases. Text manipulation, too. And not having a unified library and data structure that can handle all of it would be a big inconvenience.


seanbunderwood
2021-10-28 16:59:45

That’s part of why I was pushing for building on top of Arrow. Just handling numbers is a decent 80% solution that probably covers most scientific computing, but the other 20% is still critical for users more in the data science space, and I think it’s probably better to take the up-front hit on supporting that stuff from the ground up than it is to go for the quick easy gains but risk getting painted into a corner.


seanbunderwood
2021-10-28 17:01:04

Or, to put it more pithily, the world seems to have largely moved on from Fortran, and now wants Pandas.


thechairman
2021-10-28 17:11:11

there’s the sawzall package that handles a lot of that kind of mixed tabular data


thechairman
2021-10-28 17:11:28

very reminiscent of the tidyverse R stuff


badkins
2021-10-28 17:24:27

@seanbunderwood I hear what you’re saying re: data tables, but can Python data tables be used directly for linear algebra / general matrix stuff? Or do people copy data out of data tables into something to do linear algebra on?


badkins
2021-10-28 17:25:46

I think the first step for me personally is to simply get started with some basic data science stuff to see what exists, and what’s missing. Then I’ll solicit input from others about possible options for moving forward.


badkins
2021-10-28 17:35:24

From this page (https://www.machinelearningplus.com/data-manipulation/101-python-datatable-exercises-pydatatable/) it looks like copying is probably done w/ Python data tables:

import datatable as dt
df = dt.fread('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv')
# to pandas df
pd_df = df.to_pandas()
# to numpy arrays
np_arrays = df.to_numpy()
# to dictionary
dic = df.to_dict()
# to list
list_ = df[:,"indus"].to_list()
# to tuple
tuples_ = df[:,"indus"].to_tuples()
# to csv
df.to_csv("BostonHousing.csv")


badkins
2021-10-28 17:43:56

Looking forward to @hazel’s talk at RacketCon !


badkins
2021-10-28 17:47:14

Just noticed that @alexharsanyi, from this thread, created the data-frame package that sawzall is built upon. Looks like there are some nice building blocks already in Racket.


seanbunderwood
2021-10-28 17:59:06

There is not necessarily any copying of the actual data, though you can if you want to. Typically what happens, though, is that you’re just getting two different objects that point to the same underlying data.


thechairman
2021-10-28 17:59:54

yeah, the blocks are mostly all there, they just need some more glue imo


thechairman
2021-10-28 18:00:25

the ideal would be a nice self-contained system like R tidyverse or F#’s FsLab


seanbunderwood
2021-10-28 18:01:31

That is specifically how it works with functions like to_pandas or to_numpy. Going to a list of dicts or tuples is a bigger operation because you’re moving the data to a fundamentally different layout in memory.


badkins
2021-10-28 18:07:22

@seanbunderwood just so I understand, by “There is not necessarily any copying of the actual data”, are you saying that, in Python, one can load a datatable with a mix of numbers, dates, strings, etc., and then perform linear algebra on a subset of that datatable, without copying data? That sounds like magic to me :)


badkins
2021-10-28 18:09:52

Maybe @alexharsanyi and @soegaard2 can comment on the possibility of doing something similar from data-frame to flomat . I guess at a minimum, data-frame would need to store columns contiguously.


badkins
2021-10-28 18:10:58

Seems like it would be necessary to either not have a header row in the datatable, or have the header row be a separate datastructure from the main array.


greg
2021-10-28 18:28:44

Why does the racket-doc package depend on the drracket package?


greg
2021-10-28 18:29:07

That seems counterintuitive.





greg
2021-10-28 18:36:19

I guess one reason I ask is that drracket is a somewhat “heavy” dependency, for example it pulls in the gui-lib package.


mflatt
2021-10-28 18:44:11

The racket-doc package also directly depends on gui, though. I vaguely remember being concerned about the drracket dependency, but discovering that it didn’t much matter. Then again, I may misremember.


sorawee
2021-10-28 18:44:40

See also: https://github.com/racket/racket/pull/3215 which removes the DrRacket dep


samth
2021-10-28 18:59:46

I would like to merge that PR, or something like it, but there are two problems:
1. There’s no way to do indirect references to identifiers. Creating one seems hard.
2. @mflatt didn’t like the scribble/docnames approach, which moves some information that should be in e.g. the drracket package into racket-doc.
I think if (1) was fixed, though, we could persuade him.


soegaard2
2021-10-28 19:05:58

The representation used in flomat is:

; BLAS/LAPACK represents matrices as one-dimensional arrays
; of numbers (S=single, D=double, X=complex or Z=double complex).
; This library uses arrays of doubles.

(define _flomat (_cpointer 'flomat))

; The array is wrapped in a struct, which besides
; a pointer to the array, holds the number of
; rows and columns. Future extension could be to
; allow different types of numbers, or perhaps
; choose specialized operations for triangular matrices.
...
; m = rows, n = cols, a = mxn array of doubles
; lda = leading dimension of a (see below)
(struct flomat (m n a lda)
  #:methods gen:custom-write
  [(define write-proc flomat-print)]
  #:methods gen:equal+hash
  [(define equal-proc
     (λ (A B rec)
       (and (= (flomat-m A) (flomat-m B))
            (= (flomat-n A) (flomat-n B))
            (or (equal? (flomat-a A) (flomat-a B))
                (flomat= A B epsilon)))))
   (define hash-proc
     ; TODO: Avoid allocation in hash-proc.
     (λ (A rec)
       (define-param (m n) A)
       (rec (cons m (cons n (flomat->vector A))))))
   (define hash2-proc
     (λ (A rec)
       (define-param (m n) A)
       (rec (cons n (cons m (flomat->vector A))))))])


greg
2021-10-28 19:25:44

Oh I wasn’t aware of the context and the PR. I’ll check that out.


badkins
2021-10-28 19:26:17

@soegaard2 is (flomat-a) a vector ? If so, what is the type of the elements of the vector? I expect BLAS/LAPACK are not expecting boxed values, so it must be in an IEEE 64-bit floating point format, right?


soegaard2
2021-10-28 19:28:05

It’s a pointer to a piece of memory with floating points.

This allocates memory for an mxn matrix.

(define (alloc-flomat m n)
  (if (or (= m 0) (= n 0))
      #f ; ~ NULL
      (cast (malloc (* m n) _double 'atomic)
            _pointer _flomat)))
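
To make the "piece of memory with floating points" concrete, here is a small sketch using ffi/unsafe directly. The helpers alloc-doubles, mat-ref and mat-set! are made up for illustration and are not flomat's API; the column-major offset i + j*lda is an assumption based on the BLAS/LAPACK convention.

```racket
#lang racket/base
(require ffi/unsafe)

;; Illustration only: room for m*n unboxed 64-bit doubles.
(define (alloc-doubles m n)
  (malloc (* m n) _double 'atomic))

;; Column-major addressing: element (i, j) sits at offset i + j*lda,
;; counted in doubles from the start of the block.
(define (mat-ref a lda i j)
  (ptr-ref a _double (+ i (* j lda))))

(define (mat-set! a lda i j x)
  ;; _double only accepts flonums, so convert first
  (ptr-set! a _double (+ i (* j lda)) (exact->inexact x)))
```

For a 2x3 matrix with lda = 2, (mat-set! a 2 1 2 5) writes the sixth double in the block, and (mat-ref a 2 1 2) reads it back as 5.0.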


greg
2021-10-28 19:29:31

> The racket-doc package also directly depends on gui, though.

Well, derp. I see that now. Although that does make me wonder why it would depend on gui as opposed to just gui-doc, I’ll see if I can figure out why.


badkins
2021-10-28 19:29:32

Oh! Wow. Ok, so to be able to convert from a data-frame to flomat seamlessly, it seems data-frame would have to use the same type of memory layout, and that seems very unlikely :)


notjack
2021-10-28 19:30:00

The next Rhombus meeting is happening today, details available here: https://github.com/racket/rhombus-brainstorming/discussions/180


greg
2021-10-28 19:30:18

(Maybe for reasons similar to what Sam just said. I’ll marinate myself in the PR…)


soegaard2
2021-10-28 19:30:20

Yes.


badkins
2021-10-28 19:31:03

I have no idea how accepted the Arrow memory layout is, but hypothetically, I wonder how hard it would be to convert flomat to use that since you’re managing memory so directly already. Maybe it’s already close to it.


badkins
2021-10-28 19:35:29

I think I now understand a little more clearly what @seanbunderwood was trying to convey :) It does seem like a data frame that uses the same memory layout as flomat would be very handy to avoid having to copy tons of data. My main goal would be to be able to have zero-copy within Racket code, but if we could get interop with other languages also, that would be a win.


greg
2021-10-28 19:36:21

Also to be fair I’m not sure that gui-lib is even all that “heavy” per se. I think I’m sensitized to it because, if someone does end up installing it on a headless server, it sets the stage for various awkward scenarios.


soegaard2
2021-10-28 19:36:21

The lda can be used to make “gaps” between rows.


soegaard2
2021-10-28 19:36:46

It’s normally used to make a submatrix that shares the memory with a larger matrix.
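
A sketch of how a leading dimension lets a submatrix share memory with its parent. This assumes column-major storage as BLAS/LAPACK expects, where lda is the distance between the starts of consecutive columns; the view struct and its helpers are hypothetical and use an flvector rather than flomat's raw pointer.

```racket
#lang racket/base
(require racket/flonum)

;; Hypothetical view: m rows, n columns, shared backing data,
;; offset of element (0, 0), and leading dimension lda.
(struct view (m n data offset lda))

(define (view-index v i j)
  (+ (view-offset v) i (* j (view-lda v))))

(define (view-ref v i j)
  (flvector-ref (view-data v) (view-index v i j)))

(define (view-set! v i j x)
  (flvector-set! (view-data v) (view-index v i j) x))

;; An m-by-n submatrix starting at (i0, j0): same data and same lda,
;; only the offset and the shape change, so nothing is copied.
(define (subview v i0 j0 m n)
  (view m n
        (view-data v)
        (view-index v i0 j0)
        (view-lda v)))
```

For a 4x3 parent stored in 12 slots, (subview parent 1 1 2 2) keeps lda = 4 and starts at offset 5, so writing through the subview changes the parent as well.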


soegaard2
2021-10-28 19:37:31

Potentially one could store other information in such gaps - but I think it would be a hassle to work with.


badkins
2021-10-28 19:39:59

Yeah, I was just (naively?) thinking that if you had a data frame like:

2021-10-24T12:00:00 12.34 32.54
2021-10-24T12:00:01 67.43 28.98
2021-10-24T12:00:02 42.97 31.21

you would be able to “convert” the 3x2 number portion to a flomat array w/o copying.


badkins
2021-10-28 19:41:06

Because they would both be viewing memory as: 12.34 67.43 42.97 32.54 28.98 31.21
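
To illustrate that layout, here is a sketch that packs numeric columns one after another into a single flvector. columns->column-major is a hypothetical helper working on plain lists, not on data-frame's actual storage, and it does copy, so it only shows the memory order both sides would need to agree on.

```racket
#lang racket/base
(require racket/flonum)

;; Hypothetical helper: cols is a non-empty list of equally long
;; lists of reals; the result holds column 0, then column 1, and so on.
(define (columns->column-major cols)
  (define m (length (car cols)))
  (define out (make-flvector (* m (length cols))))
  (for ([col (in-list cols)]
        [j (in-naturals)])
    (for ([x (in-list col)]
          [i (in-naturals)])
      ;; element (i, j) goes to offset i + j*m
      (flvector-set! out (+ i (* j m)) (real->double-flonum x))))
  out)
```

Called on '((12.34 67.43 42.97) (32.54 28.98 31.21)), it yields the six numbers in exactly the order listed above.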


badkins
2021-10-28 19:42:55

I suppose that would bring up memory ownership issues, but in a worst case, a memcpy of the contiguous data would probably be much quicker than a loop.


seanbunderwood
2021-10-28 20:30:04

Also, a memcpy in order to get set up for a matrix multiply may not be a big deal in practice. Multiplying matrices is an embarrassingly small percentage of where working data scientists actually spend their time. Maybe 90% of it is data cleaning, prep, EDA, stuff like that. And that’s a spot where the bottleneck is how fast humans can work, not CPUs or memory controllers.


seanbunderwood
2021-10-28 20:33:15

Like, I think that’s where Breeze failed on the JVM. They focused too much on making the math efficient & powerful, but the ergonomics are weak, so, when I’m stuck using it, I end up feeling like it’s an uphill struggle to do my actual job.



alexharsanyi
2021-10-28 22:03:18

You might also want to look at this: https://docs.racket-lang.org/colormaps/index.html


laurent.orseau
2021-10-28 22:16:24

Is it possible to typeset something like

> ;;; some comment

with racketblock and (code:comment _content_)?

@racketblock[ (code:comment ";;; something")

gives me:

; ;;; something


spdegabrielle
2021-10-28 22:24:40

Thanks for the notes. Much appreciated.


mflatt
2021-10-28 22:47:38

I don’t think racketblock offers a way to do what you want with code:comment. You’d have to escape at the point where you want a comment and generate the output including a leading ;.


laurent.orseau
2021-10-28 22:55:08

I see, thank you


gknauth
2021-10-29 02:09:23

iowa?


joel
2021-10-29 02:10:53

Feed…corn… maybe Iowa was a bit too lateral


gknauth
2021-10-29 02:12:06

Got it. This is why I asked:



joel
2021-10-29 02:20:39

Anyways I’m a Minnesota guy. Couldn’t bring myself to name it Iowa


a11ce
2021-10-29 04:01:58

how do i get drracket to log stuff? i tried set PLTSTDERR info; drracket but nothing is printing


notjack
2021-10-29 04:20:40

DrRacket has a log viewer you can use