wanpeebaw
2020-7-21 16:59:02

(require srfi/13 srfi/14) (define (valid-dna? seq) (string-every (char-set #\A #\C #\G #\T) seq)) Do I need to manually extract (char-set #\A #\C #\G #\T) into a separate definition, as in the following version? Or will the compiler/optimizer recognize it as a constant and do it for me? Is there any performance difference between the two? (require srfi/13 srfi/14) (define (valid-dna? seq) (define nucleotides (char-set #\A #\C #\G #\T)) (string-every nucleotides seq))


sorawee
2020-7-21 17:10:38

You think that the second version is better (in terms of efficiency) somehow? Why?


wanpeebaw
2020-7-21 18:59:53

If valid-dna? is called multiple times, I thought (char-set #\A #\C #\G #\T) would be evaluated multiple times. The second one might be evaluated only once, at read/compile time, because it’s a constant definition.


sorawee
2020-7-21 19:01:43

Several things: 1. If you intend to avoid evaluating (char-set #\A #\C #\G #\T) multiple times when valid-dna? is invoked multiple times, then you need to lift it outside of valid-dna?. Your second code still has (char-set #\A #\C #\G #\T) inside the function, so it won’t make a difference.


sorawee
2020-7-21 19:03:31
2. Donald Knuth said “premature optimization is the root of all evil”. Given that (char-set #\A #\C #\G #\T) is a constant-time operation, whether you lift it or not probably won’t make a difference. Before doing any optimization, you should benchmark first to see where the bottleneck is. Optimizing the wrong spots won’t make your program faster.
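A rough way to follow the benchmarking advice above is to time both versions with Racket's `time` form. This is only a sketch: the names `valid-dna-inline?`, `valid-dna-lifted?`, `sample`, `iterations`, and `run-benchmark` are made up for illustration, and the input size and iteration count are arbitrary assumptions. No timings are shown because they depend on the machine and the Racket version.

```racket
#lang racket/base
(require srfi/13 srfi/14)

;; Version A: the char-set is constructed on every call.
(define (valid-dna-inline? seq)
  (string-every (char-set #\A #\C #\G #\T) seq))

;; Version B: the char-set is constructed once, when the module is instantiated.
(define nucleotides (char-set #\A #\C #\G #\T))
(define (valid-dna-lifted? seq)
  (string-every nucleotides seq))

;; Hypothetical workload: a 1000-character valid sequence, checked many times.
(define sample (make-string 1000 #\A))
(define iterations 100000)

;; Calls the given predicate on the sample `iterations` times.
(define (run-benchmark pred)
  (for ([i (in-range iterations)])
    (pred sample)))

;; `time` prints cpu, real, and gc milliseconds for each run.
(time (run-benchmark valid-dna-inline?))
(time (run-benchmark valid-dna-lifted?))
```

If the two timings are close, lifting the definition is a readability choice rather than a performance one.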

wanpeebaw
2020-7-21 19:05:52

Like this? (require srfi/13 srfi/14) (define nucleotides (char-set #\A #\C #\G #\T)) (define (valid-dna? seq) (string-every nucleotides seq))


sorawee
2020-7-21 19:06:06

Yes


sorawee
2020-7-21 19:07:37

If you want to do that for the sake of readability, go for it. If you want to do it to optimize the function, no, it won’t make your program faster.


wanpeebaw
2020-7-21 19:08:41

I’m not trying to optimize a program. I’m just trying to avoid redundant computation.


sorawee
2020-7-21 19:08:58

That’s exactly what optimization is…


soegaard2
2020-7-21 19:08:58

Alternative: (define valid-dna? (let ([nucleotides (char-set #\A #\C #\G #\T)]) (λ (seq) (string-every nucleotides seq))))


wanpeebaw
2020-7-21 19:13:12

I thought it should be done at compile time. (char-set #\A #\C #\G #\T) should be a constant no matter where I put it.


sorawee
2020-7-21 19:15:31

I don’t know where you got the idea about “done at compile-time”, but it’s wrong. char-set is a function. It needs to be run at run-time.
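One way to see that a function call inside a function body runs on every invocation is to count the calls. In the sketch below, `counted-char-set` and `build-count` are hypothetical helpers, not part of srfi/14. The counter is itself a side effect, so this illustrates the language semantics rather than what an optimizer might do with the pure original.

```racket
#lang racket/base
(require srfi/13 srfi/14)

;; Hypothetical wrapper: counts how often the char-set is constructed.
(define build-count 0)
(define (counted-char-set . chars)
  (set! build-count (add1 build-count))
  (apply char-set chars))

;; Same shape as the second version in the question: the definition
;; is internal to the function, so it is evaluated on each call.
(define (valid-dna? seq)
  (define nucleotides (counted-char-set #\A #\C #\G #\T))
  (string-every nucleotides seq))

(for-each valid-dna? '("ACGT" "GATTACA" "ACGU"))

(printf "char-set built ~a times\n" build-count) ; prints: char-set built 3 times
```

Moving the `nucleotides` definition to module level, or closing over it with `let`, would bring the count down to one.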


sorawee
2020-7-21 19:16:06

The optimizer might be able to partially evaluate your program at compile-time, but that’s not guaranteed.


badkins
2020-7-21 19:52:37

@wanpeebaw a literal will be evaluated at compile time, e.g.: (define (valid-dna? seq) (for/and ([ c (in-string seq) ]) (member c '(#\A #\C #\G #\T))))
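The difference between a quoted literal and a constructor call can be observed with `eq?`. This is a small sketch with made-up function names (`literal-bases`, `fresh-bases`): a quoted list is a single constant object built when the code is read, while `list` allocates a new list each time it is called.

```racket
#lang racket/base

;; The quoted list is a constant: every call returns the same object.
(define (literal-bases) '(#\A #\C #\G #\T))

;; `list` is an ordinary function: every call allocates a new list.
(define (fresh-bases) (list #\A #\C #\G #\T))

(eq? (literal-bases) (literal-bases)) ; #t, same constant each time
(eq? (fresh-bases) (fresh-bases))     ; #f, two separate allocations
```

This is why the `member` version above pays no construction cost per call, whereas `(char-set ...)` is a function call like `(list ...)`.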