sydney.lambda
2019-9-8 11:20:07

Could I get some help with a parser macro please? I’ve written a tentative one in Yacc:

(define test-parse
  (parser
    [tokens mnemonics atoms delimiters]
    [start Line*]
    [end newline]
    [error (λ (tok-ok? name val)
              (error (format "~a ~a" name val)))]
    [grammar
      (Line*
        [(Line Line*) (cons $1 $2)]
        [() '()])
      (Line
        [(Opcode-Statement) $1])
      (Opcode-Statement
        [(Immediate-Opcode Immediate-Operand) (Opcode $1 $2)])
      (Immediate-Opcodes
        [(ADC) 'ADC] [(AND) 'AND] [(CMP) 'CMP] [(CPX) 'CPX])
      (Immediate-Operand
        [(hashtag number) (Operand $2 'IMM)])]))

and I’d like to have a form like so:

(Immediate-Opcodes ADC AND CMP) ;and so on

which expands to the (Immediate-Op 'ADC …) form above. I came up with this:

(define-syntax (Immediate-Ops stx)
  (syntax-case stx ()
    [(Immediate-Opcodes ops ...)
     #`(Immediate-Opcode
        #,@(for/list ([op (syntax->list #'(ops ...))])
             #`((#,op) (quote #,op))))]))

Which - when inspected by inserting a quote after the syntax-quasiquote - seems to be identical to the original syntax I was trying to replace. However, it doesn’t work; I just get the error “parser-production-rhs: Immediate-Opcode is not declared as a terminal or non-terminal” which I take to mean my macro isn’t expanding correctly, so the grammar rule isn’t being inserted. Any ideas? I’m still having trouble switching my mind between compile-time/run-time sadly, so that may have something to do with it. Thanks :)


soegaard2
2019-9-8 11:24:16

Are you sure it is possible to use macros there? The form (parser ...) is not a function call. The parser macro takes all the clauses and expects them to be grammar, start, tokens, etc declarations.


soegaard2
2019-9-8 11:25:09

You can make a smaller example to test whether it is possible to have a macro call produce grammar-ids.
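
A minimal sketch of such a test, using a stand-in macro instead of parser itself. The names show-clauses, make-clause, generated and plain are hypothetical; the point is only that an outer macro receives its sub-forms as unexpanded syntax, so a macro call placed inside it is just more data unless the outer macro chooses to expand it.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Stand-in for `parser`: it inspects its clauses as raw syntax
;; and never asks the expander to expand them.
(define-syntax (show-clauses stx)
  (syntax-case stx ()
    [(_ clause ...) #'(quote (clause ...))]))

;; A macro we would like to use to generate one of the clauses.
(define-syntax-rule (make-clause x) (generated x))

;; The inner macro call arrives untouched: the outer macro sees
;; (make-clause a), not (generated a).
(define seen (show-clauses (make-clause a) (plain b)))
```

Here seen holds the list ((make-clause a) (plain b)), which is the same situation as a production-generating macro written inside a grammar clause.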


soegaard2
2019-9-8 11:26:37

I think you will need a higher level syntax that besides generating the grammar also expands into a use of parser.


sydney.lambda
2019-9-8 11:27:32

Ahh, I think I understand. I’d likely have to generate the entire grammar using a macro, and not individual productions?


soegaard2
2019-9-8 11:27:42

Yes, I believe so.
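
A rough sketch of what that higher-level form could look like. Everything here is an assumption for illustration: parser/opcodes is a made-up name, the token groups are guesses at definitions not shown in the thread, and the grammar is cut down to a single line. The macro takes one opcode shorthand plus ordinary parser clauses and expands into one complete parser form.

```racket
#lang racket/base
;; Hypothetical sketch: `parser/opcodes` is not part of parser-tools.
(require parser-tools/lex
         parser-tools/yacc
         (for-syntax racket/base))

;; Assumed token groups; the real definitions are not shown in the thread.
(define-empty-tokens mnemonics (ADC AND CMP))
(define-empty-tokens delimiters (hashtag newline))
(define-tokens atoms (number))

;; Takes one (non-terminal opcode ...) shorthand plus ordinary parser
;; clauses, and expands into a single `parser` form whose grammar holds
;; the generated production next to the hand-written ones.
(define-syntax (parser/opcodes stx)
  (syntax-case stx ()
    [(_ (nt op ...) [grammar-kw production ...] clause ...)
     #'(parser clause ...
               [grammar-kw production ...
                           (nt [(op) 'op] ...)])]))

(define test-parse
  (parser/opcodes
   (Immediate-Opcode ADC AND CMP)
   [grammar
    ;; hashtag is an empty token, so the number is $3
    (Line [(Immediate-Opcode hashtag number) (list $1 $3)])]
   [tokens mnemonics atoms delimiters]
   [start Line]
   [end newline]
   [error (λ (tok-ok? name val)
            (error 'test-parse "unexpected ~a ~a" name val))]))

;; Turns a list of tokens into the thunk the parser expects;
;; once the list runs out it keeps returning the end token.
(define (tokens->thunk toks)
  (λ ()
    (if (null? toks)
        (token-newline)
        (begin0 (car toks) (set! toks (cdr toks))))))
```

It can be tried with (test-parse (tokens->thunk (list (token-ADC) (token-hashtag) (token-number 5) (token-newline)))). Because parser only ever sees the fully assembled form, the generated production is present by the time it checks the grammar.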


soegaard2
2019-9-8 11:29:01

It would have been cool though, if the parser supported “grammar-expanders” like match supports custom match-expanders.


sydney.lambda
2019-9-8 11:42:40

I was thinking the same thing, but inspired by the lex-transformers from the other day. Thanks for your help :) Edit: It just occurred to me that I can run the macro, and then just paste it directly into the source. Still turned out to be useful after all.
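
The paste-it-in approach does not need a macro at all. A small sketch, with opcode-production as a hypothetical helper name, that builds the production as plain data so it can be printed and copied into the grammar:

```racket
#lang racket/base

;; Builds one grammar production as an S-expression.
;; name: the non-terminal, ops: a list of mnemonic symbols.
(define (opcode-production name ops)
  `(,name ,@(for/list ([op (in-list ops)])
              `[(,op) ',op])))

;; Print the result so it can be pasted into the parser source.
(write (opcode-production 'Immediate-Opcode '(ADC AND CMP)))
(newline)
```

This runs at ordinary run time, which sidesteps the compile-time question entirely, at the cost of regenerating and re-pasting whenever the opcode list changes.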


soegaard2
2019-9-8 12:41:40

Simple, but effective!


sydney.lambda
2019-9-8 13:46:51

Could I get some assistance in making my grammar less broken, please?

(define test-parse
  (parser
    [tokens mnemonics atoms delimiters indexes]
    [start Line*]
    [end newline]
    [grammar
      (Line*
        [(Line Line*) (cons $1 $2)]
        [() '()])
      (Line
        [(Opcode-Statement) $1])
      (Opcode-Statement
        [(Immediate-Opcode Immediate-Operand) (Opcode $1 $2)]
        [(Zero-Page-Opcode Zero-Page-Operand) (Opcode $1 $2)]
        [(Zero-Page-X-Opcode Zero-Page-X-Operand) (Opcode $1 $2)])
      (Immediate-Opcode
        [(ADC) 'ADC] [(AND) 'AND] [(CMP) 'CMP] [(CPX) 'CPX] [(CPY) 'CPY] [(EOR) 'EOR] [(LDA) 'LDA] [(LDX) 'LDX]
        [(LDY) 'LDY] [(ORA) 'ORA] [(PHA) 'PHA] [(PHP) 'PHP] [(SBC) 'SBC])
      (Zero-Page-Opcode
        [(ADC) 'ADC] [(AND) 'AND] [(ASL) 'ASL] [(BIT) 'BIT] [(CMP) 'CMP] [(CPX) 'CPX] [(CPY) 'CPY] [(DEC) 'DEC]
        [(EOR) 'EOR] [(INC) 'INC] [(LDA) 'LDA] [(LDX) 'LDX] [(LDY) 'LDY] [(LSR) 'LSR] [(ORA) 'ORA] [(ROL) 'ROL]
        [(ROR) 'ROR] [(SBC) 'SBC] [(STA) 'STA] [(STX) 'STX] [(STY) 'STY])
      (Zero-Page-X-Opcode
        [(ADC) 'ADC] [(AND) 'AND] [(ASL) 'ASL] [(CMP) 'CMP] [(DEC) 'DEC] [(EOR) 'EOR] [(INC) 'INC] [(LDA) 'LDA]
        [(LDY) 'LDY] [(LSR) 'LSR] [(ORA) 'ORA] [(ROL) 'ROL] [(ROR) 'ROR] [(SBC) 'SBC] [(STA) 'STA] [(STY) 'STY])
      (Immediate-Operand
        [(hashtag 8-bit-int) (Operand $2 'IMM)]
        [(hashtag 16-bit-int) (Operand $2 'IMM)])
      (Zero-Page-Operand
        [(8-bit-int) (Operand $1 'ZP)])
      (Zero-Page-X-Operand
        [(8-bit-int comma-x) (Operand $1 'ZPX)])]))

After I added the Zero-Page-X rules, I was hit with reduce/reduce errors. I assume it’s because both Zero-Page and Zero-Page-X related productions can start with both the same Opcode token and the same 8-bit-int Operand token, and so the parser doesn’t know that I want it to choose the longest one? That is, the way I’m thinking in my head is that, upon encountering a Zero-Page-Opcode followed by an 8-bit-int, we look ahead to see if there is a comma-x token, and that is what decides whether it’s Zero-Page-X or regular Zero-Page.

This is my first ever grammar, so apologies for newbie mistakes; I tried googling reduce/reduce errors but I can’t seem to relate them back to my use case. Thank you :)


soegaard2
2019-9-8 13:50:25

> After I added the Zero-Page-X rules, I was hit with reduce/reduce errors. I assume it’s because both Zero-Page and Zero-Page-X related productions can start with both the same Opcode token and the same 8-bit-int Operand token, and so the parser doesn’t know that I want it to choose the longest one?

Sounds correct.


soegaard2
2019-9-8 13:52:40

You could use ADC.x as the opcode name?


soegaard2
2019-9-8 13:55:26

Or postpone the decision to an extra pass after the parser.


sydney.lambda
2019-9-8 14:14:03

That’s one thing I’m struggling with in particular - responsibilities for each part of the assembler. I should probably just stop being a masochist and postpone working out specific modes until code generation or something as you said. I always feel guilty sidestepping something like this for whatever reason, haha.


soegaard2
2019-9-8 14:18:26

You could look at one of the popular 6502 assemblers, like Kick Assembler.


soegaard2
2019-9-8 14:20:13

In “LDA mumble,X” we know it is LDA x-indexed. If mumble is a constant, then the parser can determine whether we are looking at zero page or not.


soegaard2
2019-9-8 14:21:11

One option is to just parse it as “LDA x-indexed” and let the code generator determine whether it should emit a zero page LDA or a normal one.


soegaard2
2019-9-8 14:23:07

The manual of Kick Assembler (page 6) says:

> An argument is converted to its zeropage mode if possible. This means that lda $0030 will generate an lda command in its zeropage mode[1].
>
> [1] If the argument is unknown (eg. an unresolved label) in the first pass, the assembler will assume it’s a 16 bit value
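
A sketch of how that later pass might pick the mode, assuming the parser only records the operand value and whether it was x-indexed. The operand struct, resolve-mode and the mode symbols ABS and ABSX are hypothetical names, and the rule is loosely modelled on the quoted manual text rather than on Kick Assembler's actual implementation.

```racket
#lang racket/base

;; Hypothetical operand record produced by the parser:
;; value is an integer, or #f for a label that is not resolved yet.
(struct operand (value indexed?) #:transparent)

;; Chooses an addressing mode after parsing. Values that fit in one
;; byte get the zero page variant; unknown or larger values fall back
;; to the 16 bit (absolute) variant.
(define (resolve-mode op)
  (define v (operand-value op))
  (define zero-page? (and v (<= 0 v #xFF)))
  (cond
    [(and zero-page? (operand-indexed? op)) 'ZPX]
    [zero-page? 'ZP]
    [(operand-indexed? op) 'ABSX]
    [else 'ABS]))
```

For example, (resolve-mode (operand #x30 #t)) picks the x-indexed zero page mode. A real code generator would also need to check that the chosen mode exists for the given opcode, which this sketch leaves out.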


sydney.lambda
2019-9-8 14:24:26

I could be wrong, but I think that would only be an issue when distinguishing between Absolute and ZP modes. I’ll no doubt run into similar ambiguity there, but I’m not sure that’s what the parser is complaining about here.


soegaard2
2019-9-8 14:26:56

What happens if you remove: Zero-Page-Opcode ?


sydney.lambda
2019-9-8 14:30:53

It complains about running into an 8-bit-int token, which I believe is because it’s trying to parse as (Immediate-Opcode Immediate-Operand) and confused because there’s no # to signify immediate. However, with an opcode that supports Zero-Page-X and not Immediate, it accepts it as Zero-Page-X just fine.


soegaard2
2019-9-8 14:31:41

You have a point. In order to see exactly where the parser has a problem, you can add a debug clause: (debug filename).


sydney.lambda
2019-9-8 14:34:47

Thank you! That’s much more descriptive than the error argument.


sydney.lambda
2019-9-8 14:44:22

(define test-parse
  (parser
    [tokens mnemonics atoms delimiters indexes]
    [start Line*]
    [end newline]
    [error (λ (tok-ok? name val)
              (error (format "~a ~a" name val)))]
    [debug "debug.txt"]
    [grammar
      (Line*
        [(Line Line*) (cons $1 $2)]
        [() '()])
      (Line
        [(Opcode-Statement) $1])
      (Opcode-Statement
        [(Immediate-Opcode Immediate-Operand) (Opcode $1 $2)]
        [(Zero-Page-Opcode Zero-Page-Operand) (Opcode $1 $2)])
      (Index?
        [() ""]
        [(comma-x) "X"])
      (Immediate-Opcode
        [(ADC) 'ADC] [(AND) 'AND] [(CMP) 'CMP] [(CPX) 'CPX] [(CPY) 'CPY] [(EOR) 'EOR] [(LDA) 'LDA] [(LDX) 'LDX]
        [(LDY) 'LDY] [(ORA) 'ORA] [(PHA) 'PHA] [(PHP) 'PHP] [(SBC) 'SBC])
      (Zero-Page-Opcode
        [(ADC) 'ADC] [(AND) 'AND] [(ASL) 'ASL] [(BIT) 'BIT] [(CMP) 'CMP] [(CPX) 'CPX] [(CPY) 'CPY] [(DEC) 'DEC]
        [(EOR) 'EOR] [(INC) 'INC] [(LDA) 'LDA] [(LDX) 'LDX] [(LDY) 'LDY] [(LSR) 'LSR] [(ORA) 'ORA] [(ROL) 'ROL]
        [(ROR) 'ROR] [(SBC) 'SBC] [(STA) 'STA] [(STX) 'STX] [(STY) 'STY])
      (Zero-Page-X-Opcode
        [(ADC) 'ADC] [(AND) 'AND] [(ASL) 'ASL] [(CMP) 'CMP] [(DEC) 'DEC] [(EOR) 'EOR] [(INC) 'INC] [(LDA) 'LDA]
        [(LDY) 'LDY] [(LSR) 'LSR] [(ORA) 'ORA] [(ROL) 'ROL] [(ROR) 'ROR] [(SBC) 'SBC] [(STA) 'STA] [(STY) 'STY])
      (Immediate-Operand
        [(hashtag 8-bit-int) (Operand $2 'IMM)]
        [(hashtag 16-bit-int) (Operand $2 'IMM)])
      (Zero-Page-Operand
        [(8-bit-int Index?) (Operand $1 (string->symbol (string-append "ZP" $2)))])]))

Would you consider that cheating @soegaard2 ?
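
A stripped-down sketch of the same Index? trick that can be run in isolation. The token definitions, the two-opcode list, zp-parse and tokens->thunk are assumptions made for the example. The reasoning behind it: with separate Zero-Page-Opcode and Zero-Page-X-Opcode non-terminals, the parser has to reduce the mnemonic to one of them while its single token of lookahead is the 8-bit-int, before comma-x is visible. With one opcode non-terminal and an optional index, the decision moves to after the operand, where the next token settles it.

```racket
#lang racket/base
(require parser-tools/lex
         parser-tools/yacc)

;; Assumed token groups; the thread does not show the real ones.
(define-empty-tokens mnemonics (LDA STA))
(define-empty-tokens delimiters (newline))
(define-empty-tokens indexes (comma-x))
(define-tokens atoms (8-bit-int))

(define zp-parse
  (parser
   [tokens mnemonics atoms delimiters indexes]
   [start Statement]
   [end newline]
   [error (λ (tok-ok? name val)
            (error 'zp-parse "unexpected ~a ~a" name val))]
   [grammar
    (Statement [(Opcode 8-bit-int Index?) (list $1 $2 $3)])
    ;; One opcode non-terminal, so there is only one way to reduce LDA.
    (Opcode [(LDA) 'LDA] [(STA) 'STA])
    ;; The mode is decided here, when the token after the operand is known.
    (Index? [() 'ZP] [(comma-x) 'ZPX])]))

;; Turns a list of tokens into the thunk the parser expects;
;; once the list runs out it keeps returning the end token.
(define (tokens->thunk toks)
  (λ ()
    (if (null? toks)
        (token-newline)
        (begin0 (car toks) (set! toks (cdr toks))))))
```

Feeding it (list (token-LDA) (token-8-bit-int 16) (token-comma-x) (token-newline)) and the same list without the comma-x token exercises both branches of Index?.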


soegaard2
2019-9-8 14:45:59

Not at all. When there are reduce errors, you need to be creative.


sydney.lambda
2019-9-8 14:48:13

Awesome!! Thanks for taking the time to help me with this, I really appreciate it.

