Racket Slack Archive

soegaard2

2021-12-27 10:11:41

The difference between eq? and eqv? ought to be small.

Computing the hash for an element in a hasheq set is done using the adresss of the element only.

Computing the hash for a hash(equal) set is more involved. Two lists xs and ys with equal values needs to get the same hash value. And the only way to ensure that is to compute hash values for the individual elements and combine them.

That is computing a hash for eq takes time O(1) whereas computing a hash for equal takes time O(n) where n is the size of the data structure.

badkins

2021-12-27 15:46:59

Switching from eqv to eq dropped the time from 4.1 to 1.4 seconds, so it doesn’t seem small to me :)

soegaard2

2021-12-27 16:23:59

That’s indeed surprising.

sorawee

2021-12-27 16:28:15

From Discord

sorawee

2021-12-27 16:30:14

@badkins can you run my version (https://racket.discourse.group/t/racket-set-performance/479/8) which does double hashing, and see how long it takes compared to your latest version?

badkins

2021-12-27 16:32:23

Where is the code?

sorawee

2021-12-27 16:32:48

Well, your initial version, with the code that I wrote

sorawee

2021-12-27 16:33:15

I overrode the meaning of mutable-set, in-set, etc.

badkins

2021-12-27 16:33:23

Oh, that’s old news now. <https://github.com/lojic/LearningRacket/blob/master/advent-of-code–2021/solutions/day25/day25-set.rkt|This version> is the fast one now.

sorawee

2021-12-27 16:34:32

Yes, I understand that your latest version is fast. I want to know how fast is that version compared to yours on your machine

sorawee

2021-12-27 16:34:43

Because it also uses hasheq

badkins

2021-12-27 16:35:38

Gotcha. I’d have to find the correct version in the git history, and to be frank, I’ve reached my cap on time for this :)

badkins

2021-12-27 16:37:56

I did make a note to myself to revisit this later though. I’d like to see if I can help improve set / hash performance. It seems like it could get closer to Python, unless there’s just a fundamental limit with Chez Scheme that can’t match the C code that Python is using.

badkins

2021-12-27 16:39:01

Specifically with eqv? since eq? seems ok.

badkins

2021-12-27 16:40:21

Hmm… if Python tuples are immutable, maybe the hashing for them is more similar to a eq? ?

badkins

2021-12-27 16:41:10

Scratch that, I didn’t think it through - they could still be in different locations.

sorawee

2021-12-27 16:41:12

I’ve heard Python’s tuple is slower than list lol

sw5355700

2021-12-27 19:12:17

The expectation is that there would be a performance difference between the different hashing techniques and the difference in runtime performance wasn’t that significant. I don’t think its as surprising as it seems.

spdegabrielle

2021-12-27 19:43:12

I’ve seen is suggested before that docs should include performance characteristics of different operations. Is it worth collating the above and making a wiki page or a PR?

renatiko.04

2021-12-28 02:21:09

@renatiko.04 has joined the channel