
Argh, I somehow feel compelled to say that you can sort first and then do a linear pass to remove the duplicates. (But remove-duplicates has a fast implementation for eq? anyway iirc)
Yes, you’ve just learned nothing useful. You’re welcome.

Looks like sort followed by linear pass is twice as fast as sort followed by remove-duplicates.

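;; #1: sort, then let remove-duplicates do the dedup
(define (sort-and-dedup-1 lst)
  (remove-duplicates (sort lst symbol<?) eq?))

;; #4: sort, then one linear pass that skips any element eq? to the previous one
(define (sort-and-dedup-4 lst)
  (define (dedup lst prev)
    (if (null? lst)
        '()
        (let ([ s (car lst) ])
          (if (eq? s prev)
              (dedup (cdr lst) prev)
              (cons s (dedup (cdr lst) s))))))
  ;; #f is a safe initial prev: no symbol is eq? to it
  (dedup (sort lst symbol<?) #f))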

#2 used a let loop, but it was slower than #4 (learned that trick from Matthew Flatt!), and #3 was a let loop w/o the reverse, which was the quickest, but in reverse order.
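
Variants #2 and #3 aren't shown in the thread, so here is a rough sketch of what they might look like, assuming #2 accumulates in a named let and reverses at the end and #3 just skips the reverse. The function bodies are guesses based on the description above, not the original code.

```racket
#lang racket/base
;; Hypothetical reconstructions of #2 and #3; the originals are not in the thread.

;; #2 (assumed): accumulate with a named let, then reverse to restore ascending order.
(define (sort-and-dedup-2 lst)
  (let loop ([lst (sort lst symbol<?)] [prev #f] [acc '()])
    (cond
      [(null? lst) (reverse acc)]
      [(eq? (car lst) prev) (loop (cdr lst) prev acc)]
      [else (loop (cdr lst) (car lst) (cons (car lst) acc))])))

;; #3 (assumed): same loop without the reverse, so the result comes out descending.
(define (sort-and-dedup-3 lst)
  (let loop ([lst (sort lst symbol<?)] [prev #f] [acc '()])
    (cond
      [(null? lst) acc]
      [(eq? (car lst) prev) (loop (cdr lst) prev acc)]
      [else (loop (cdr lst) (car lst) (cons (car lst) acc))])))
```

With these definitions, `(sort-and-dedup-2 '(b a c a b))` gives `'(a b c)` and `(sort-and-dedup-3 '(b a c a b))` gives `'(c b a)`.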

% racket dedup.rkt
cpu time: 1825 real time: 1841 gc time: 772
cpu time: 925 real time: 940 gc time: 116
cpu time: 877 real time: 892 gc time: 38
cpu time: 892 real time: 907 gc time: 40
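
Output in this shape (cpu time / real time / gc time, in milliseconds) is what Racket's `time` form prints. A minimal harness sketch, assuming a small stand-in input; `make-test-symbols`, `sort-then-dedup` and the sizes are made up for illustration and are not the contents of dedup.rkt:

```racket
#lang racket/base
(require racket/list) ; for remove-duplicates

;; Hypothetical stand-in input: n symbols drawn from at most 1000 distinct names.
(define (make-test-symbols n)
  (for/list ([i (in-range n)])
    (string->symbol (number->string (random 1000)))))

(define (sort-then-dedup lst)
  (remove-duplicates (sort lst symbol<?) eq?))

(define test-symbols (make-test-symbols 100000))

;; time prints the cpu, real and gc times, then returns the value of its body.
(define result (time (sort-then-dedup test-symbols)))
```

Building the list outside of `time` keeps the cost of creating the symbols out of the measurement.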

The slowest part, by far, was creating the list of a million symbols :) But I used a random-string function I had lying around that uses crypto-random-bytes:

(require file/sha1
         racket/list
         racket/random)

;; Returns a random hex string of n characters (each byte yields two hex digits)
(define (random-string n)
  (let* ([ half-n (ceiling (/ n 2)) ]
         [ random-bytes (crypto-random-bytes half-n) ]
         [ str (bytes->hex-string random-bytes) ])
    (if (even? n)
        str
        (substring str 0 n))))

;; Returns a list of n symbols, each made from a 10-character random string
(define (random-symbols n)
  (let loop ([ n n ][ result '() ])
    (if (< n 1)
        result
        (loop (sub1 n) (cons (string->symbol (random-string 10)) result)))))

Hmm… I just realized my symbols are too long. With 10 random characters, there probably aren’t any duplicates :(
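
A quick back-of-the-envelope check of that, assuming each character is an independent, uniform hex digit (which is what bytes->hex-string over random bytes should give). The helper names are made up for this sketch, and the pair count is the usual birthday-problem approximation:

```racket
#lang racket/base
;; Hypothetical helpers for a rough estimate; not part of dedup.rkt.

;; Number of distinct hex strings of the given length.
(define (possible-symbols len)
  (expt 16 len))

;; Expected number of colliding pairs among n uniform draws: n(n-1) / (2k).
(define (expected-colliding-pairs n len)
  (exact->inexact (/ (* n (- n 1)) (* 2 (possible-symbols len)))))
```

For a million 10-character symbols, `(expected-colliding-pairs 1000000 10)` comes out to roughly 0.45, so most runs would contain no duplicates at all. With 3 characters there are only `(possible-symbols 3)` = 4096 possible symbols, so a multi-million element list is almost entirely duplicates.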

<sigh> with 4 million 3 character symbols, it’s a wash - I guess you can just ignore everything I said above :)

% racket dedup.rkt
cpu time: 1728 real time: 1776 gc time: 113
cpu time: 1778 real time: 1824 gc time: 231
cpu time: 1764 real time: 1810 gc time: 163
cpu time: 1725 real time: 1771 gc time: 117

Now I’m curious about the code for remove-duplicates - I suspect it’s using a hash, so the more dupes, the faster it is.
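
To illustrate the idea (this is only a sketch of a hash-based dedup, not the actual source of remove-duplicates, which I haven't checked): with an eq?-keyed hash table, each element costs one lookup, and only first occurrences pay for an insert and a cons. `hash-dedup` is a made-up name.

```racket
#lang racket/base
;; Sketch only: NOT the real implementation of remove-duplicates.
(define (hash-dedup lst)
  (define seen (make-hasheq)) ; mutable table keyed by eq?
  (let loop ([lst lst] [acc '()])
    (cond
      [(null? lst) (reverse acc)]
      ;; Already seen: skip it without allocating anything.
      [(hash-ref seen (car lst) #f) (loop (cdr lst) acc)]
      [else
       (hash-set! seen (car lst) #t)
       (loop (cdr lst) (cons (car lst) acc))])))
```

`(hash-dedup '(b a b c a))` returns `'(b a c)`, keeping the first occurrence of each symbol. If something like this is what's going on, a list with many dupes means a small table and few allocations, which would fit the guess above.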

https://discord.gg/6Zq8sH5, someone should post the Discord Racket link in the header of #general?

> #3 was let loop w/o the reverse which was the quickest, but in reverse order.
Well, you could have written symbol>? and sorted them using this comparator. The let loop reversing would put the list back in ascending order.

Will Discord do the same for Slack? :stuck_out_tongue: This is what’s called quid pro quo, right? lol

Or at least pin this message.

Seems the Slack link is not in the resources channel on Discord. I’ll pin it there.

What about remove-duplicates first, then sort?

@sorawee I didn’t find symbol>? when I looked. Apparently there is one in #lang mischief though.

you can create it yourself using symbol<?
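
Something along these lines, as a sketch: `symbol>?` here is a local helper built by swapping the arguments to symbol<?, and `sort-and-dedup-desc` is a made-up name for the "sort descending, let the loop reverse it" idea from earlier in the thread.

```racket
#lang racket/base
;; symbol>? is defined locally here; the thread reports not finding a built-in one.
(define (symbol>? a b)
  (symbol<? b a))

;; Sort descending, then accumulate with cons. Consing onto acc reverses the
;; order, so the result is ascending without a separate reverse pass.
(define (sort-and-dedup-desc lst)
  (let loop ([lst (sort lst symbol>?)] [prev #f] [acc '()])
    (cond
      [(null? lst) acc]
      [(eq? (car lst) prev) (loop (cdr lst) prev acc)]
      [else (loop (cdr lst) (car lst) (cons (car lst) acc))])))
```

`(sort-and-dedup-desc '(b a c a b))` returns `'(a b c)`.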

Sure, but the timings didn’t seem to warrant it.

Good catch @laurent.orseau

% racket dedup.rkt
cpu time: 1698 real time: 1745 gc time: 145
cpu time: 1740 real time: 1787 gc time: 237
cpu time: 1626 real time: 1673 gc time: 127
cpu time: 1711 real time: 1757 gc time: 213
cpu time: 78 real time: 80 gc time: 0

That’s with 3 character symbols, so lots of dupes.

Last one is:

(define (sort-and-dedup-5 lst)
  (sort (remove-duplicates lst eq?) symbol<?))

With 1M 5 char symbols (vs. 4M 3 char):

% racket dedup.rkt
cpu time: 1225 real time: 1241 gc time: 233
cpu time: 968 real time: 983 gc time: 85
cpu time: 1275 real time: 1290 gc time: 396
cpu time: 947 real time: 961 gc time: 47
cpu time: 893 real time: 902 gc time: 173

https://gist.github.com/lojic/0a096547ec502facd6f5920cdcb00124

set the channel topic: Racket — http://racket-lang.org — http://pasterack.org - Discord -https://discord.gg/6Zq8sH5 - Slack invite link: http://racket-slack.herokuapp.com — Archived at https://benknoble.github.io/racket-slack-archive/

i didnt know users could have permissions to set the header lmaoo

I always forget about pasterack! Pasterack is awesome!

The next Rhombus virtual discussion meeting is today at 1pm pacific time (3pm central time, 4pm eastern time). For today’s meeting, I’ve drafted a State of Rhombus document we can discuss: https://docs.google.com/document/d/10GTdmxo6Uty_-SQY8hrz5unCwtNi_YIsuI5yghmZ6hU/edit?usp=sharing
Zoom link: https://utah.zoom.us/j/96590513005 GitHub discussion thread: https://github.com/racket/rhombus-brainstorming/discussions/180#discussioncomment-1478832 Calendar event: https://calendar.google.com/event?action=TEMPLATE&tmeid=MG5jdmtoYmg2dXU1MG5vc2JlOWdjc2FkaWNfMjAyMTEwMTRUMjAwMDAwWiBqYWNraGZpcnRoQG0&tmsrc=jackhfirth%40gmail.com&scp=ALL

Is that in 1 hour’s time from now?

Yes

Meeting summary: we talked about the State of Rhombus document and agreed that it needs a few more concrete details about the next steps, especially with regard to our plan for Rhombus libraries. More information there would make it easier for people to find sections of the Rhombus project they can contribute to or take ownership of. I plan to add this information to the document and then we’ll review it again at the next meeting on October 28th.

@jeremiah.meert has joined the channel