
@huu2uan has joined the channel

Spot the error (to run it, save it as “most-common.rkt”)
#lang racket
;;; Problem
;;;   Given a text file and an integer k, print the k most
;;;   common words in the file (and the number of occurrences)
;;;   in decreasing frequency.
(define (solution file k)
  (define freqs (make-hash))
  (define (add word) (hash-update! freqs word add1 1))
  ; read and compute frequencies
  (with-input-from-file file
    (thunk
      (for ([line (in-port read-word)])
        (displayln line)
        (add freqs))))
  freqs)
(define (read-word in)
  (define (peek) (peek-char in))
  (define (skip) (read-char in))
  (define (read) (read-char in))
  (let loop ()
    (define c (peek))
    (cond
      [(eof-object? c) (read)]
      [(not (char-alphabetic? c)) (skip) (loop)]
      [else (list->string
             (for/list ([_ (in-naturals)]
                        #:break (not (char-alphabetic? (peek))))
               (read)))])))
(solution "most-common.rkt" 3)

The number k isn’t used yet.
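One way k could come into play, as a minimal sketch: sort the (word . count) pairs by count and keep the first k. The helper name `top-k` and the sample table are made up for illustration and are not part of the program above.

```racket
#lang racket
;; top-k : hash-of-counts, integer -> list of (word . count) pairs
;; Hypothetical helper: highest counts first, at most k entries.
(define (top-k freqs k)
  (define sorted (sort (hash->list freqs) > #:key cdr))
  (take sorted (min k (length sorted))))

;; Sample data, invented for this example.
(for ([entry (top-k (hash "the" 5 "a" 3 "of" 2 "is" 1) 3)])
  (printf "~a ~a\n" (car entry) (cdr entry)))
;; prints:
;; the 5
;; a 3
;; of 2
```

Words with equal counts come out in whatever order `hash->list` produces, so ties are not ordered in any particular way here.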

The problem is from: https://www.cs.tufts.edu/~nr/cs257/archive/don-knuth/pearls-2.pdf

@soegaard2 in-port here is super slow. I don’t know why.

In my case - I made a silly mistake.

(add freqs) should have been (add line)
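A sketch of the counting loop with that fix applied, reading from a port so it can be tried on a string. Two further changes are my own assumptions rather than anything stated in the thread: the default count is 0 (since `hash-update!` applies `add1` to the default, a default of 1 would record the first occurrence as 2), and the word reader checks for end of file before calling `char-alphabetic?`. The name `word-frequencies` is hypothetical.

```racket
#lang racket
;; read-word : input-port -> string or eof
;; Skips non-alphabetic characters, then collects one run of letters.
(define (read-word in)
  (let loop ()
    (define c (peek-char in))
    (cond
      [(eof-object? c) c]
      [(not (char-alphabetic? c)) (read-char in) (loop)]
      [else
       (list->string
        (let collect ()
          (define next (peek-char in))
          ;; stop at eof as well as at the first non-letter
          (if (and (char? next) (char-alphabetic? next))
              (cons (read-char in) (collect))
              '())))])))

;; word-frequencies : input-port -> mutable hash from word to count
(define (word-frequencies in)
  (define freqs (make-hash))
  (for ([word (in-port read-word in)])
    (hash-update! freqs word add1 0))   ; key is the word, not the table
  freqs)

(define freqs (word-frequencies (open-input-string "to be or not to be")))
(hash-ref freqs "to")   ; 2
(hash-ref freqs "not")  ; 1
```

To run it on a file, something like `(call-with-input-file "most-common.rkt" word-frequencies)` should work.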

It took a while before I noticed.

Ah, and it’s this that takes a lot of time

Makes sense.

The amount of slow down is surprising though.

I don’t get why hash-update! becomes the bottleneck.

It needs to compute a hash of a hash table, but the table isn’t that large.

Is there an existing function that works like uniq?


@soegaard2 remove-duplicates, I think
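A small sketch of how the two differ: `remove-duplicates` drops every repeated element and keeps the first occurrence, whereas Unix `uniq` only collapses adjacent repeats. The `uniq` helper below is hypothetical, written here just to show the contrast.

```racket
#lang racket
;; remove-duplicates keeps the first occurrence of each element.
(remove-duplicates '(1 1 2 2 1))   ; '(1 2)

;; Hypothetical helper mimicking Unix uniq: collapse adjacent repeats only.
(define (uniq xs)
  (reverse
   (for/fold ([acc '()]) ([x (in-list xs)])
     (if (and (pair? acc) (equal? (car acc) x))
         acc
         (cons x acc)))))

(uniq '(1 1 2 2 1))                ; '(1 2 1)
```

On a sorted list the two give the same result, which is the usual `sort | uniq` situation.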

I knew it!

Anyone know what I’ve got to do to get the racket package server to build documentation for my package, qualified-in? It’s been on the package server for a couple days. Locally, raco pkg install builds the documentation correctly (I can view it with raco docs qualified-in). https://pkgs.racket-lang.org/package/qualified-in

Seems like package server is dead

The last successful build is on August 14

The index of packages is rebuilt every few minutes, for example to point to a new version of the package (often that “pointing” is just registering the new checksum). But the packages themselves are built on the server to report whether they build, have dependency problems, etc, as well as link the documentation, only every few days.

CC: @jeapostrophe

@michaelmmacleod qualified-in is a great idea btw, excellent job