chump

About

chump is a language I created to describe line assemblers and disassemblers in a single description.
It is used in KMD to describe different processors.
I'm not sure which direction to take this project, so please tell me what you think. Email me: cb at cs dot man dot ac dot uk
I will probably create a full multi-pass assembler and disassembler.
The reason for using this rather than GNU binutils is that a new processor description is fast to implement. It is basically made for people who want to design instruction sets.
Q: Why "chump"? A: Read it upside-down.

Download

chump is available in its source version from the archive.

Licence

chump code is available under the LGPL.

Updates

If you wish to be informed when chump changes, take a look at the freshmeat chump page.

Screen shots


[Screenshot: Example Operation]

ChangeLog

The change log is here

Requirements

chump requires the following libraries:
GLib - Provides many useful data types, macros, type conversions, string utilities and a lexical scanner.
GDK - A wrapper for low-level windowing functions.
GTK - An advanced widget set.
BFD - The Binary File Descriptor library. (BFD comes with GNU binutils)

Example Code

The system comes with a sample file, sample.chump, which describes the ARM32, MIPS32 and STUMP16 architectures. I am working on the 6809 as well.
It took me about 3 days to write the ARM description, about 1 day for MIPS and 1 hour for STUMP.
You can always get the latest version of the sample.chump from here

Below is a description of STUMP (a small 16-bit RISC processor) written in chump.

 (isa "STUMP16"  ; STUMP is a simple 16bit processor (C) Andrew Bardsley
                 ; Use this description to learn chump
                 ; This is not a good tutorial but you can get the basics
                 ; Firstly the basics:
                 ; The following is the correct syntax to describe a translation
                 ; (("Disassembled description")(Assembled description))
                 ; The disassembled description is simply a string or set of strings
                 ; The assembled description is a set of bits: I (always on), O (always off),
                 ;      X (don't care, but set as on), Z (don't care, but set as off)
                 ; Be careful, I and O are LETTERS. The parser will complain if it doesn't understand.
                 ; e.g. 1 :   (("R3")(OII)) - matches 011 to "R3"
                 ;                              and "R3" to 011
                 ; e.g. 2 :   (("BR")(OZX)) - matches 000, 001, 010 or 011 to "BR"
                 ;                              and "BR" to 001
                 ; e.g. 3 :   (("PC")(III))
                 ;            (("R7")(III))   - matches 111 to "PC", as it is first in the list
                 ;                              and "PC" or "R7" to 111
                 ; e.g. 4 :   (define "set" (("S")(I))      defines a rule called "set"
                 ;                          (("") (O)) )    this rule can now be used in all rules below
                 ;               (("ADD" set) (OI set))     we can now use the predefined rule in another rule
                 ;                                          remember to place the rule in both the binary and ascii sections
                 ; e.g. 5 :   (define "imm" (int 4 + 4))    "imm" is defined to be a 4-bit hex number. When DISASSEMBLING, 4 is added
                 ; e.g. 6 :   (define "imm" (relative 4))   "imm" is defined to be a 4-bit relative number, offset from the current position
                 ; e.g. 7 :   (("#" ("imm" (int 4))) (imm)) the imm rule is defined inside the rule. It is only valid in this rule
                 ;                                          and any previous definition is ignored in this rule
                 ; Take a look at the STUMP instruction set
                 ;          Instruction types
                 ;          1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
                 ;          5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
                 ;          --------------------------------
                 ; Type 1     OP |0|S| DST |SRCA |SRCB |SHIFT
                 ; Type 2     OP |1|S| DST |SRCA | IMMEDIATE
                 ; Cond Br  1|1|1|1| COND  |    OFFSET


                            
    (define     "reg"           (("R0")(OOO))   (("R1")(OOI))       ; Firstly we 'define' a description of all the registers
                                (("R2")(OIO))   (("R3")(OII))       ; R0 - 000, R1 - 001 ... PC - 111, R7 - 111
                                (("R4")(IOO))   (("R5")(IOI))       ; 111 is overloaded. When disassembling it will choose the first
                                (("R6")(IIO))   (("PC")(III))       ; one ("PC") but when assembling either is acceptable
                                                (("R7")(III)))
    (define "dst" reg)                                              ; DST is a register
    (define "srca" reg)                                             ; so are SrcA and SrcB
    (define "srcb" reg)
    (define "op"                (("ADD")(OOO))  (("ADC")(OOI))      ; These are the 6 OP codes
                                (("SUB")(OIO))  (("SBC")(OII))
                                (("AND")(IOO))  (("OR") (IOI)))
    (define "set"   (("S")(I))                                      ; If set bit is set then add an S onto the opcode
                    (("") (O)))                                     ; e.g. ADD -> ADDS

    (define "shift" (("")      (OO))                                ; Shift types
                    ((", ASR") (OI))
                    ((", ROR") (IO))
                    ((", RRC") (II)))

    (define "cond"  (("")   (OOOO))                                 ; Branch conditions
                    (("AL") (OOOO))
                    (("NV") (OOOI))
                    (("HI") (OOIO))
                    (("LS") (OOII))
                    (("CC") (OIOO))
                    (("CS") (OIOI))
                    (("NE") (OIIO))
                    (("EQ") (OIII))
                    (("VC") (IOOO))
                    (("VS") (IOOI))
                    (("PL") (IOIO))
                    (("MI") (IOII))
                    (("GE") (IIOO))
                    (("LT") (IIOI))
                    (("GT") (IIIO))
                    (("LE") (IIII)))

    (define "dir"   (("LD")(O))                                     ; The difference between an ST and an LD is in the S bit
                    (("ST")(I)))


    (("NOP") (OZZ Z O OOO ZZZ ZZZ ZZ))                               ; These are the descriptions of the instructions
    (("NOP") (IOZ Z O OOO ZZZ ZZZ ZZ))                               ; These two NOP descriptions overlap other instructions
    
    (("CMP" "\tf10" srca ", " srcb  shift )                         ; e.g. CMP  R3, R6, ASR
            (OIO O I OOO srca srcb shift))
            
    (("CMP" "\tf10" srca ", " ("imm" (int 5)) )                     ; e.g. CMP  R4, 12
            (OIO I I OOO srca imm))                                 ; notice the inline definition of "imm"
            
    (("MOV" set "\tf10" dst ", " ("imm" (int 5)))                   ; e.g. MOVS     R3, 12
            (OOOI set dst OOO imm))                                 ; immediate form is Type 2, so bit 12 is I
            
    (("MOV" set "\tf10" dst ", " ( "src" ((reg)(OOO reg))           ; e.g. MOV      R3, R5
                                         ((reg)(reg OOO))) shift)   ; note inline definition can also be translations
            (OOOO set dst src shift))
        

    ((op set "\tf10" dst ", " srca ", " srcb shift)                 ; e.g. ADD      R4, R7, R2
     (op O set dst srca srcb shift))

    ((op set "\tf10" dst ", " srca ", " ("imm" (int 5)) )           ; e.g. SUBS     R6, R2, C
     (op I set dst srca imm))
     
    (("B" cond "\tf10" ("offset" (relative 8 )))                    ; e.g. BNE      100
     (IIII cond offset))
                    
    ((dir "\tf10" dst ", [" srca ", " srcb shift "]")               ; e.g. LD       R4, [R3, R0]
     (IIO O dir dst srca srcb shift))

    ((dir "\tf10" dst ", [" srca ", " ("imm" (int 5)) "]")          ; e.g. LD       R4, [R3, 12]
     (IIO I dir dst srca imm))
 )
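
To make the I/O/X/Z rules from the comments above a little more concrete, here is a rough Python sketch of how a single field could be matched and assembled. This is not chump's actual code: the function names (matches, assemble, to_text, to_bits) are invented for this illustration, and only the "reg" rule is modelled.

```python
# Illustrative sketch only: these helper names are hypothetical and are
# not part of chump itself.

def matches(pattern: str, bits: str) -> bool:
    """Disassembly: I needs a 1, O needs a 0, X and Z accept either."""
    if len(pattern) != len(bits):
        return False
    return all(p in "XZ" or (p == "I") == (b == "1")
               for p, b in zip(pattern, bits))


def assemble(pattern: str) -> str:
    """Assembly: I and X are emitted as 1, O and Z as 0."""
    return "".join("1" if p in "IX" else "0" for p in pattern)


# The "reg" rule from the STUMP description, in the same order.
REG = [("R0", "OOO"), ("R1", "OOI"), ("R2", "OIO"), ("R3", "OII"),
       ("R4", "IOO"), ("R5", "IOI"), ("R6", "IIO"), ("PC", "III"),
       ("R7", "III")]


def to_text(rules, bits):
    """Disassemble a field: the first matching rule in the list wins."""
    for text, pattern in rules:
        if matches(pattern, bits):
            return text
    return None


def to_bits(rules, text):
    """Assemble a field: any rule with this text is acceptable."""
    for name, pattern in rules:
        if name == text:
            return assemble(pattern)
    return None


print(assemble("OZX"))        # 001
print(to_text(REG, "111"))    # PC
print(to_bits(REG, "R7"))     # 111
```

The ordered list mirrors the point made for the "reg" rule: both "PC" and "R7" assemble to 111, but 111 always disassembles to "PC" because that entry comes first.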