@samth Do you (or anyone else, of course) have any textbook/common examples of situations in which you’d benefit from using in-value in a for loop? Now that you mention it, I think a fair number of the situations I’ve run into could simply use define in the body, or define-values if several bindings need to be made (to avoid a long series of defines in immediate succession). I initially thought I had just provided a particularly bad example, but after trying to think of a better one, it seems I could just use define for most if not all of them.
It’s most useful if you’re going to use the result in another sequence and in the body, or with #:when
Ahh, I think I understand why it would be useful with #:when - because the define in the body wouldn’t be visible to the #:when, I think.
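A minimal sketch of the #:when case, using made-up data rather than anything from the discussion: the value bound with in-value is a loop clause, so a later #:when guard can see it, whereas a define in the body would come too late. Note the use of for*/list, so that sq can refer to x.

```racket
#lang racket/base

;; Keep only the numbers whose square exceeds 4, pairing each with its square.
(define big-squares
  (for*/list ([x (in-list '(1 2 3 4))]
              [sq (in-value (* x x))]   ; bound once per x, as a loop clause
              #:when (> sq 4))          ; the guard can refer to sq
    (list x sq)))
```

Here big-squares ends up as '((3 9) (4 16)).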
The other part, using the result in another sequence as well as in the body, isn’t something I fully understand, to be honest. I’ll probably just need to run into the situation myself and end up asking here, at which point someone will say “in-value!” and I’ll kick myself, haha.
Thank you very much for your help. I was making heavy use of named let in most of my code, and realised I could/should probably replace them with for/… constructs if I can to be more idiomatic. Again, sorry if the first example was subpar.
@sydney.lambda Here’s a usage of in-value I found in some of my code:
(define-values (exported-variables exported-syntax) (module->exports mod))
(for*/set ([export-list (in-list (list exported-variables exported-syntax))]
           [phase-export-list (in-list export-list)]
           [ph (in-value (phase (first phase-export-list)))]
           [export (in-list (rest phase-export-list))])
  (define name (first export))
  (module-binding mod ph name))
A second example:
(make-constructor-style-printer
 (λ (_) type-name)
 (λ (this)
   (for*/list ([i (in-range size)]
               [kw (in-value (keyset-ref fields i))]
               [item (in-list (list (unquoted-printing-keyword kw)
                                    (accessor this i)))])
     item)))
Thanks @notjack :slightly_smiling_face: So, in the second example, it’s useful because you’re referencing “kw” in the binding form for “item”, right? It could be my eyes missing something, but is there a reason you can’t use define in the body for “ph” in the first example? Or is it just because it’s conceptually part of the other bindings, if you know what I mean?
Correct, that’s why I used it in the second example. In the first example I’m not actually sure why I used it. A define in the loop body would have worked too. I think I used it because I didn’t like the idea of calling (phase (first phase-export-list)) over and over. Using in-value means it’s only called once per phase-export-list, whereas doing it in the loop body would have called it once for every export in that list.
And for racket/base the export list is something like 1700+ items long
Ahh, I never thought of that! So it’s useful for efficiency too in that situation. Got it, thanks.
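To make the efficiency point concrete, here is a small sketch with a call counter. The helper expensive-tag and the groups data are hypothetical stand-ins, not the module->exports code above.

```racket
#lang racket/base

(define calls 0)

;; Hypothetical stand-in for a computation we would rather not repeat.
(define (expensive-tag group)
  (set! calls (add1 calls))
  (car group))

(define groups '((a 1 2 3) (b 4 5)))

(define tagged
  (for*/list ([group (in-list groups)]
              [tag (in-value (expensive-tag group))] ; runs once per group
              [item (in-list (cdr group))])
    (cons tag item)))
```

After this runs, calls is 2 (one per group). Moving the call into the loop body with define would run it once per item instead, five times for this data.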
Another 6502/assembly question, if that’s okay. I’m leaving the parser until last for my assembler and doing the “interesting” part first: the actual code generation. Therefore, I’m assuming I’ve been given a parse tree by the future parser. I just wanted to ask if my parse tree looks reasonable:
;assembly code
         LDX #LO $1234
         LDA #$00
         BEQ NONZERO
         LDX #$2A
NONZERO: LDA #$01
;"assumed" parse tree
(define the-program
  '([OPCODE (NAME . LDX)
            (ARG . (LOBYTE 4660))
            (MODE . IMM)]
    [OPCODE (NAME . LDA)
            (ARG . 0)
            (MODE . IMM)]
    [OPCODE (NAME . BEQ)
            (ARG . NONZERO)
            (MODE . REL)]
    [OPCODE (NAME . LDX)
            (ARG . 42)
            (MODE . IMM)]
    [LABEL (NAME . NONZERO)]
    [OPCODE (NAME . LDA)
            (ARG . 1)
            (MODE . IMM)]))
I decided to have the parser do the job of converting number literals in the code to decimal to make Racket/the assembler’s job easier (can just treat all numbers as decimal without converting) especially considering the use of # in the assembler syntax.
I did a little test in Python and I’m pretty sure I can have the parser work out the addressing modes by just using a regex on the following operand to check what the addressing mode is, and then including that mode with the arg’s value in the operand node.
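The regex idea might look roughly like this in Racket. This is only a sketch: IMM is the only tag here taken from the parse tree above, while ZP, ABS and UNKNOWN are hypothetical names chosen for illustration. A relative-mode branch cannot be recognised from the operand alone, since it depends on the mnemonic.

```racket
#lang racket/base

;; Guess an addressing mode from the operand text alone.
;; 'ZP, 'ABS and 'UNKNOWN are hypothetical tags, not from the discussion.
(define (operand-mode operand)
  (cond
    [(regexp-match? #px"^#" operand) 'IMM]                    ; #$2A, #LO $1234
    [(regexp-match? #px"^\\$[0-9A-Fa-f]{2}$" operand) 'ZP]    ; $00
    [(regexp-match? #px"^\\$[0-9A-Fa-f]{4}$" operand) 'ABS]   ; $1234
    [else 'UNKNOWN]))
```

For example, (operand-mode "#$2A") gives 'IMM and (operand-mode "$1234") gives 'ABS.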
Just to clarify, LO and HI mean grab the low or high byte of an address. It was a bit of an afterthought as I forgot about that feature, so I’m not sure if just wrapping the argument’s value in (LOBYTE n) or (HIBYTE n) is a good way to do it, but it seems to work okay.
Sorry for the wall of text, and I hope you guys don’t mind the assembler/6502 related questions I’ve been posting as of late. Thank you :slightly_smiling_face:
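One way the (LOBYTE n) / (HIBYTE n) wrappers could be consumed during code generation, as a sketch only: resolve-arg is a hypothetical helper name, and label arguments (symbols such as NONZERO) are simply passed through for a later pass.

```racket
#lang racket/base
(require racket/match)

;; Reduce an ARG value from the parse tree to a plain number where possible.
(define (resolve-arg arg)
  (match arg
    [(list 'LOBYTE n) (bitwise-and n #xFF)]                       ; low 8 bits
    [(list 'HIBYTE n) (bitwise-and (arithmetic-shift n -8) #xFF)] ; next 8 bits
    [_ arg]))                                                     ; number or label
```

With the tree above, (resolve-arg '(LOBYTE 4660)) gives 52 (#x34), and (resolve-arg '(HIBYTE 4660)) gives 18 (#x12).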
I think that’s a perfectly reasonable parse tree representation, and in particular I think it’s a good idea to use plain Racket numbers in the parse tree.
Question: why is NONZERO wrapped in parens sometimes but not others? Is that a typo, or does it mean something?
ahhh sorry, good catch, that’s a typo. I initially had all the arg values wrapped in a list due to sometimes having them wrapped in (LO n) but realised that was rather unnecessary.
Thank you very much for having a look over it for me. I’ve had this niggling doubt in the back of my mind whilst coding the assembler: “what if that’s an unrealistic parse tree.. I could have to rewrite even more than I usually do..”
So your parse tree is a bunch of lists in lists in lists, right? That means when you want to examine it you’ll call a bunch of first / second / third functions (or use pattern matching) on things, which can make it hard to tell what the code is doing and will mask bugs if you construct misshapen trees accidentally. You may find it more readable to make structs instead, so the number of fields and their names are known and documented.
Like (struct opcode (name arg mode) #:transparent)
and (struct label (name) #:transparent)
Funnily enough, I just now added this type: (struct src (tree labels))
but I hadn’t thought of having an opcode and label type specifically, thanks for the idea. That’s much better - self-documenting and prevents a class of errors as you mentioned.
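As a sketch of how the struct version might read in practice, here is a shortened program built from the suggested opcode and label structs, with a hypothetical describe function that pattern-matches on them. A node with the wrong number of fields fails at construction time rather than deep inside a chain of first / second calls.

```racket
#lang racket/base
(require racket/match)

(struct opcode (name arg mode) #:transparent)
(struct label (name) #:transparent)

;; A shortened version of the program from the parse tree discussion.
(define the-program
  (list (opcode 'LDX 42 'IMM)
        (label 'NONZERO)
        (opcode 'LDA 1 'IMM)))

;; Hypothetical helper: render a node as text by matching on its struct type.
(define (describe node)
  (match node
    [(opcode name arg mode) (format "~a ~a (~a)" name arg mode)]
    [(label name) (format "~a:" name)]))
```

Mapping describe over the-program gives '("LDX 42 (IMM)" "NONZERO:" "LDA 1 (IMM)").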
@sydney.lambda You can also use “smart constructors” with keyword arguments to make things clearer when dealing with structs that have many fields, or whose fields don’t have an obvious order:
(struct opcode (name arg mode)
  #:transparent
  #:omit-define-syntaxes ; leaves the name opcode free for the function below
  #:constructor-name constructor:opcode)
; Smart constructor - accepts keyword args instead of positional args
; Can also pick default values for fields not given, if sensible defaults exist
(define (opcode #:name name #:arg arg #:mode mode)
  (constructor:opcode name arg mode))
> (opcode #:name 'LDA #:arg 0 #:mode 'IMM)
(opcode 'LDA 0 'IMM)
I’ve been using those for ages, but didn’t realise you could give a name to them in the struct using #:constructor-name. That’s awesome, thank you!
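The "default values" remark can be sketched like this. The defaults chosen here (#f for a missing argument and an IMP mode tag for implied addressing) are assumptions for illustration, not something from the discussion; #:omit-define-syntaxes is used so that the name opcode is free to be defined as a function.

```racket
#lang racket/base

(struct opcode (name arg mode)
  #:transparent
  #:omit-define-syntaxes
  #:constructor-name constructor:opcode)

;; Keyword constructor with defaults. 'IMP and #f are hypothetical defaults
;; for instructions that take no operand.
(define (opcode #:name name #:arg [arg #f] #:mode [mode 'IMP])
  (constructor:opcode name arg mode))

(define nop (opcode #:name 'NOP))
(define lda (opcode #:mode 'IMM #:arg 0 #:name 'LDA)) ; keyword order is free
```

The accessors are unchanged, so (opcode-mode nop) gives 'IMP and (opcode-arg lda) gives 0.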
There’s a few packages that provide convenient macros to make this easier to do automatically, like kw-make-struct [1] and struct-plus-plus [2]
[1] https://docs.racket-lang.org/kw-make-struct/index.html [2] https://docs.racket-lang.org/struct-plus-plus/index.html
Thanks for those, sometimes when you stumble across a library you wonder how you managed before without it, haha.
Happy to help :simple_smile:
Coincidentally, today I’m working on docs for my own version of this kind of macro
@sydney.lambda Actually, mind if I use your opcode use case in the examples of my docs?
I’d be flattered, in fact! haha. Sure.
Mind if I pick your brains about something else regarding my 6502-suite? Edit: Actually, I really shouldn’t single people out asking them for help, sorry. I’ll just post it.
I had this code:
(define (emulate processor)
  (if (>= (Processor-PC processor)
          (vector-length (Processor-MEM processor)))
      processor
      (match processor
        [(struct* Processor ([A A] [X X] [Y Y]
                             [Z Z] [N N]
                             [MEM MEM]
                             [PC PC]))
         (case (vector-ref MEM PC)
           [(#xA9) (emulate (LDA-IMM processor))]
           [(#xA2) (emulate (LDX-IMM processor))]
           [(#xF0) (emulate (BEQ processor))])]
        ...)))
(define (LDA-IMM processor)
  (define PC (Processor-PC processor))
  (define MEM (Processor-MEM processor))
  (define RAND (vector-ref MEM (+ PC 1)))
  (Processor-copy processor
                  [A RAND]
                  [PC (+ PC 2)]
                  [N (twos-complement-negative? RAND)]
                  [Z (= RAND 0)]))
so each of the opcode instructions takes a processor and returns a new one. However, I didn’t like the fact that every case was wrapped in (emulate…), so I initially thought of binding the result of the match to new-processor or something and calling emulate on new-processor at the end. However, I had a lightbulb moment and realised, hey, why not cut out the middle man?:
(define (emulate processor)
  (if (>= (Processor-PC processor)
          (vector-length (Processor-MEM processor)))
      processor
      (match processor
        [(struct* Processor ([A A] [X X] [Y Y]
                             [Z Z] [N N]
                             [MEM MEM]
                             [PC PC]))
         (case (vector-ref MEM PC)
           [(#xA9) (LDA-IMM processor)]
           [(#xA2) (LDX-IMM processor)]
           [(#xF0) (BEQ processor)])])))
(define (LDA-IMM processor)
  (define PC (Processor-PC processor))
  (define MEM (Processor-MEM processor))
  (define RAND (vector-ref MEM (+ PC 1)))
  (emulate (Processor-copy processor
                           [A RAND]
                           [PC (+ PC 2)]
                           [N (twos-complement-negative? RAND)]
                           [Z (= RAND 0)])))
so, because emulate is always called on the result of the operator functions, just have those functions call emulate themselves with the result. Of course, you can’t normally make use of this because functions can return their results to a variety of functions, but I thought it was really elegant in this case as the code for emulate looks very clean (perhaps even pseudo-imperative, dare I say, which is ideal for an inherently-imperative use-case).
Do you think this style (sort of reminds me of continuation-passing style but with the continuation function being fixed) is alright to use in a case like this?
Happy to help. Now let me read it :p
The mutual recursion works pretty well here, I think. Personally I like the first version better though, because I can write tests for each instruction independently and the recursion is kept in a single isolated spot. But it’s a soft preference.
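To illustrate the testing point, here is a sketch of checking a single instruction in isolation. It assumes a cut-down Processor struct with only four fields and uses the standard struct-copy in place of the Processor-copy form from the code above.

```racket
#lang racket/base
(require rackunit)

;; Simplified stand-in for the Processor struct in the discussion.
(struct Processor (A PC Z MEM) #:transparent)

;; LDA immediate: load the byte after the opcode into A and advance PC by 2.
;; It returns a new processor and does not call emulate itself.
(define (LDA-IMM p)
  (define pc (Processor-PC p))
  (define rand (vector-ref (Processor-MEM p) (+ pc 1)))
  (struct-copy Processor p
               [A rand]
               [PC (+ pc 2)]
               [Z (= rand 0)]))

;; The instruction can be checked without running the emulator loop at all.
(define before (Processor 0 0 #f (vector #xA9 #x2A)))
(define after (LDA-IMM before))
(check-equal? (Processor-A after) #x2A)
(check-equal? (Processor-PC after) 2)
(check-false (Processor-Z after))
```

If LDA-IMM called emulate itself, the same test would also run whatever instructions follow in memory, which makes the result harder to pin down.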
Also, thanks for the use case! Added to docs:
You’re very welcome. I prefer having each field as a kwarg, it’s extremely easy to make mistakes with positionals especially when adding/removing fields at a later date. That would save me having to define them manually every time - very cool.
@sydney.lambda Looking closer, are you familiar with match-define? I think it can help you remove some of the rightward drift in emulate:
(define (emulate processor)
  (cond
    [(>= (Processor-PC processor)
         (vector-length (Processor-MEM processor)))
     processor]
    [else
     (match-define
       (struct* Processor ([A A] [X X] [Y Y] [Z Z] [N N] [MEM MEM] [PC PC]))
       processor)
     (case (vector-ref MEM PC)
       [(#xA9) (emulate (LDA-IMM processor))]
       [(#xA2) (emulate (LDX-IMM processor))]
       [(#xF0) (emulate (BEQ processor))])]))
I am, but somehow it never occurred to me that I should use it there. I had read about the struct* form earlier that day, and I guess my mind hadn’t yet made the connection that I could use it with match-define. Thanks!
Note that this requires you use cond instead of if because the branches of if don’t have anywhere to put definitions (I usually avoid if for this reason)
That’s another thing I’ve had to be aware of recently. I used to always use if, if (no pun intended) there were only two cases to the conditional. After realising that cond is much more versatile (for the reason mentioned, among others) I’ve started replacing them with cond even for two-case conditionals.
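A tiny sketch of the difference, using a made-up function: each cond clause has a body where definitions are allowed, while an if branch is a single expression and would need a let or begin wrapper to do the same.

```racket
#lang racket/base

;; Square of n/d, or 0 when d is zero. The else clause introduces a local
;; definition directly, which an if branch cannot do without extra wrapping.
(define (safe-ratio-squared n d)
  (cond
    [(zero? d) 0]
    [else
     (define q (/ n d))   ; definition inside the cond clause body
     (* q q)]))
```

For example, (safe-ratio-squared 6 3) gives 4 and (safe-ratio-squared 1 0) gives 0.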
I missed that one, of course :wink: thanks for pointing it out.
@notjack the bit/byte functions in rebellion are coming in very handy for my use-case by the way, so thanks again for making them and sending the link my way.
Ah right that was you! I’m glad they’re getting some use. Is your code on the package catalog or on github somewhere? It’s helpful for me to be able to compile and test your code if I’m considering changes to APIs you’re using.
Another option instead of a large case: use a “jump table”. Make a vector named handlers of length 256 filled with procedures, one for each opcode, that each take a processor. Then instead of (case (vector-ref MEM PC) ...) you can write ((vector-ref handlers (vector-ref MEM PC)) processor).
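A minimal sketch of such a jump table, assuming a cut-down three-field Processor struct and only one real handler. The names step and unknown-opcode are hypothetical; every slot without a handler raises an error instead of silently doing nothing.

```racket
#lang racket/base

;; Simplified stand-in for the Processor struct in the discussion.
(struct Processor (A PC MEM) #:transparent)

(define (LDA-IMM p)
  (define pc (Processor-PC p))
  (struct-copy Processor p
               [A (vector-ref (Processor-MEM p) (+ pc 1))]
               [PC (+ pc 2)]))

;; Default handler for opcodes that have not been implemented.
(define (unknown-opcode p)
  (error 'step "unknown opcode at PC ~a" (Processor-PC p)))

;; The jump table: one procedure per possible opcode byte.
(define handlers (make-vector 256 unknown-opcode))
(vector-set! handlers #xA9 LDA-IMM)

;; Look up the handler for the opcode at PC and apply it to the processor.
(define (step p)
  (define op (vector-ref (Processor-MEM p) (Processor-PC p)))
  ((vector-ref handlers op) p))
```

Adding an instruction then means writing its function and one vector-set! line, with no change to step.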
ah, if I’m understanding correctly I think that’s similar to what I did before:
(define OPCODE 0)
(define NAME 1)
(define MODE 2)
(define LENGTH 3)
(define OPS
  (let ([v (make-vector 256 (vector "" "" "" 0))])
    (for ([op (read-json (open-input-file "6502_instructions.json"))])
      (match-let ([(hash-table ('opcode opcode)
                               ('name name)
                               ('mode mode)
                               ('bytes length)) op])
        (vector-set! v
                     (hex-string->number opcode)
                     (vector opcode
                             (string-downcase name)
                             mode
                             (string->number length)))))
    v))
However, at the time I hadn’t thought to have every opcode function take a processor argument, so I couldn’t figure out how to work the functions into that method due to them requiring different arguments. Pretty sure I can do that now though, so thanks for reminding me @soegaard2 :slightly_smiling_face:
Whenever I’ve used that method, I’ve named the indexes of the vector using defines to make it clear what is being extracted from them. I think in the case of that one it would look like (C syntax, as Racket’s vector indexing is a little verbose and doesn’t illustrate the point as well): OPCODES[OP_NUM][NAME] or OPCODES[OP_NUM][LENGTH] rather than OPCODES[OP_NUM][1] (what is 1? magic numbers...)
I often wonder if it would make sense to define a “faux-dictionary” type, which has the characteristics of a vector but allows you to access it using dictionary syntax; the above would use the same syntax, but would automatically define the index names for you and perhaps do some magic to ensure that you can’t shadow the names, or that you can still use the names for other purposes.
You could make a helper function that encapsulates looking up the handler for an opcode and calling it on a processor
Yet another instance of me making things more complicated than they need to be, haha. Thanks. So the “magic numbers” would either be inside the function itself (so it’s obvious what it’s getting from the vector’s index via the function name) or the definition would be confined to said function.
Correct!
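A sketch of that idea for a table of rows shaped like #(opcode name mode length). The single row below is made up for illustration and the helper names op-name and op-length are hypothetical; the point is only that each index appears in exactly one place.

```racket
#lang racket/base

;; Tiny stand-in for the 256-entry table. The row contents are illustrative.
(define OPS (make-vector 256 #f))
(vector-set! OPS #xA9 (vector "A9" "lda" "IMM" 2))

;; Each helper hides one index, so callers never see a bare 1 or 3.
(define (op-name n) (vector-ref (vector-ref OPS n) 1))
(define (op-length n) (vector-ref (vector-ref OPS n) 3))
```

Callers then write (op-name #xA9) or (op-length #xA9), and a change to the row layout only touches these helpers.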
I’m tempted to type out the entire list of opcodes as a vector literal to save having to read from the json (or any external) file.
Wait, I should just run the program, have it print the literal, then copy it to the clipboard. I really do amaze myself with my silliness sometimes :stuck_out_tongue:
That’s a sign you’re improving your ideas :p
I’m proud of how my ideas have improved over the past few months, and that’s as self-complimentary as I ever get :p I attempted this project a few months back and could not wrap my head around how to go about it, and wound up feeling rather defeated and confused. Now I’m feeling like it’s enjoyably challenging, rather than frustratingly difficult.
I’m happy to hear that. Assembly and racket go together better than most would think, it’s always nice to see more projects in that area.
@notjack Thanks for the encouragement, I was starting to feel like a broken record asking all these assembly-related questions in here, haha. I personally find the most interesting topics/areas exist at the extreme ends of the spectrum - functional (and logic) languages on the one end, and assembly on the other. It’s a long way off, but my ultimate project I’ve had in mind for some time is a compiler from Scheme->6502.