What would be cool for the Square Inch Contest? How about a homebrew CPU with TTL chips? This will be the smallest TTL homebrew CPU in the world! Like the old CPUs of the eighties, it has 40 connections, an 8 bit bus, and can address 64 kilobytes. But it also has some very unusual properties...
Interestingly, it occupies less area than an old 40-pin DIP processor.
BREAKING NEWS The Square Inch processor is working ! It drives the 6-digit multiplexed display !
This project is very much a race against the clock. The idea for this project occurred to me after the first week of September, leaving three weeks before the end of the Square Inch Contest. Since it involves processor design, hardware and pcb design, and software, this is hardly possible in three weeks.
Well let's start with the usual characteristics:
8 data lines
16 address lines
Single memory space for program and data
Memory Read and Memory Write lines
Reset and clock inputs
5 volt power lines
Indirect register addressing
Conditional jumps and branches
And now the less usual characteristics:
Two rows of pins, with 0.05 Inch (1.27 mm) pin distance
Microprogram is in FLASH memory and can be written with an RPi as programmer
Architectural registers are in RAM
There is NO ALU
Only 8 ICs
The registers are in RAM. That has been done before (see TMS9900). This will not give you a speed devil, but it is needed to fit the design in one square inch. Another thing left out of the CPU for this reason is...... the ALU. ( I do not intend to connect the square-inch 4 bit TTL ALU to this cpu ).
NO ALU... I could have programmed a small PIC or AVR as ALU, but that's cheating. Programming without an ALU is very doable, as I've shown in my project NeuronZoo. In the NeuronZoo project, the neurons are like software objects. The eight outgoing connections of a neuron, the axons, are like eight fields within the object structure that can contain pointers to other objects. The neurotransmitters are like processor registers that point to a certain object. Processing is done by following links in the structure and changing the pointers in the object structure. Different execution paths can be followed, depending on the existence of a link. It has much in common with LISP. Anyone with a theoretical computer science background can tell you that such a system is Turing complete, meaning that it can execute every possible computer program, if it is given enough memory and time. The NeuronZoo project demonstrates this by adding numbers and generating chess moves. [ End of NeuronZoo commercial ].
So the CPU can do "calculations" by following links. It can also use a byte as index in a table, or do a "calculated" jump in the microcode. This last feature can be used to interpret the instructions.
12-bit microcode program counter UPC. The lowest 4 bits increment at every cycle. The microcode can do a jump to another 12 bit address by writing 12 new bits to this register. A logic low on the RESET line will reset the microcode to address zero.
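The UPC behaviour described above can be modelled in a few lines of Python. This is only a sketch: the class name MicroPC and its methods are made up for illustration, and it assumes that the 4-bit counter simply wraps from 15 back to 0 without touching the upper 8 bits.

```python
class MicroPC:
    """Toy model of the 12-bit microcode program counter (UPC)."""

    def __init__(self):
        self.value = 0  # a low level on RESET forces address zero

    def reset(self):
        self.value = 0

    def tick(self):
        # Only the lowest 4 bits count; the upper 8 bits keep their value.
        # Assumption: the 4-bit counter wraps around from 15 to 0.
        low = (self.value + 1) & 0x00F
        self.value = (self.value & 0xFF0) | low

    def jump(self, target):
        # The microcode writes 12 new bits to the register.
        self.value = target & 0xFFF


upc = MicroPC()
upc.jump(0x04E)
upc.tick()          # 0x04F
upc.tick()          # wraps inside the same 16-byte block
print(hex(upc.value))
```

Under that wrap-around assumption, a microcode routine longer than 16 steps has to jump explicitly to a new 16-byte block.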
Two 8-bit address pointers H and L. These can supply the 16 bit address to the external address bus.
An 8-bit register B, to temporarily hold a value that is transferred from one place to another.
A HC139 decoder provides decoded read- and write signals to the registers and memory
An 8-bit flash memory provides the 8 microcode bits. There are 13 address lines, 12 come from the microcode program counter, and the 13th comes from the upper bit of the H register. So in each cycle, the executed microcode depends on the H7 bit, opening the door to conditional execution. For normal, unconditional instructions, the microcode for H7=1 is the same as for H7=0.
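As a sketch of how the 13-bit flash address is formed, the Python below combines the UPC with the H7 flag. It assumes that H7 drives the highest flash address line (A12); the text only says it is "the 13th" line, so the actual wiring may differ. The helper names are hypothetical.

```python
FLASH_SIZE = 1 << 13  # 13 address lines -> 8192 microcode bytes


def flash_address(upc, h):
    """Return the flash address for a given UPC and H register value.

    Assumption: bit 7 of H is wired to flash address line A12.
    """
    h7 = (h >> 7) & 1
    return (h7 << 12) | (upc & 0xFFF)


def store_unconditional(image, upc, byte):
    """Store a micro-instruction that must not depend on H7.

    The same byte goes into both halves of the flash image, so the
    CPU executes it whether H7 is 0 or 1.
    """
    image[flash_address(upc, 0x00)] = byte
    image[flash_address(upc, 0x80)] = byte


image = bytearray([0xFF] * FLASH_SIZE)   # erased flash reads as 0xFF
store_unconditional(image, 0x040, 0x10)
print(hex(flash_address(0x040, 0x80)))
```

A conditional micro-instruction would instead put different bytes in the two halves.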
Dedicated microcode bits are or'ed with A0, A1, A2 and A15. This provides a limited form of indexed addressing. Also, the microcode can enable or disable the output of the H-L register pair. This gives the microcode the possibility to directly address a few memory locations. A15 can be used to choose between external ROM and RAM.
The hardware registers UPC, H, L and B are only visible to the microcode and not to the instructions of the CPU. The instructions of...
The ZIP contains the current SW status:
- index.php and simas_n.js together are the assembler that runs in a browser. A live copy is at www.enscope.nl/simas_nac, but that may be a newer version
- test.txt is the application SW that multiplexes the LEDs
- mc_prog runs on the RPi and programs the microcode
- 1sq_prog runs on the RPi, programs the application Flash, and can singlestep the processor
The smallest homebrew CPU in the world is working now !
Last week, I spent a few evenings changing the layout, to correct the pinout of the flash chip. Not a really easy task, since this is a very crowded (only 2 sided) layout. One week ago, on Thursday evening, the new layout was sent to China, and I got the pcbs yesterday. It was assembled today (I had ordered a double set of components the first time, perhaps already having the feeling that the first time would not be right).
It is now running a simple program that fills 6 memory locations with the 7-segment values for the digits 1 to 6, and then writes each of them to the multiplexed display in a loop.
To show that there's nothing hidden, I disconnected the RPi, and I also show the back side of the pcb.
The display is a bit dim, that's because it is only on during 3 instructions and off during 4 instructions. But in reality it is looking better. Perhaps I'll also have to make the resistors a bit lower in value. You can see the program on the Browser-based JS assembler page.
Back side of the pcb:
I will soon update the schematics and gerbers, and describe the application pcb, the NAC, and the modifications.
After having the whole system debugged with the NAC pcb, having display multiplexing working, it was time to replace the NAC by the real thing, the 1x1" CPU.
After having corrected a few badly soldered pins, it was still impossible to program the microcode. When reading the flash, all locations returned 0x00. Strange, because unprogrammed locations normally return 0xFF.....
Back to the datasheet......
Well everything that CAN go wrong, WILL go wrong. So if you don't check the datasheet for the pinout.... The pinout for the TSOP is different from the DIP pinout ! Beginner's mistake, I assumed they would be the same without even thinking about it.
So this means redoing the pcb design and ordering new pcbs. Today, I will continue to program the 6-digit clock.
The NAC pcb is a bigger version of the square inch pcb. It is intended to make debugging easier. It has two unused footprints, to have room for extra ICs in case they are needed:
Debugging started with the NAC connected to the application pcb and the Raspberry Pi programmer.
The Raspberry Pi will have two python scripts, one for programming the microcode and one for programming the application code. The last one can also single-step the processor and display the micro-instructions and databus value at each step.
I corrected a few problems. At this moment, stepping through the microcode works, but only for about 6 micro-instructions after reset. After that, microcode reads as FF... have to investigate....
Next picture shows the square inch cpu on the white application pcb. The application pcb contains RAM (empty socket on picture), Flash-ROM, I/O and support functions like clock generation, reset circuit and connecting to the RPi programmer:
And here is the backside of the CPU, with the 'big' microcode Flash:
In this log I will explain more about the connections to the CPU.
To use the processor, you only need the databus, address bus and control bus to connect memory and I/O to the CPU. PROG/ must be high (inactive) and EN/ must be low (active). Address lines A3 up to A14 need an external 2K2 pull-down resistor.
VCC must be 5 volt. A suitable TTL clock signal must be applied to the CLK input. To start the CPU, the RST/ signal must be low during one or more clock cycles. After reset, the CPU will fetch its start address from the first ROM bytes at address 0x8000 (lsb) and 0x8001 (msb). Afterwards, instructions can be fetched from any position in the 64K address range.
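The reset fetch can be illustrated with a small Python sketch. The function name and the dictionary standing in for ROM are hypothetical; the byte values 0x10 and 0x80 are the ones this project keeps at 0x8000 and 0x8001.

```python
def reset_vector(rom_read):
    """Combine the two vector bytes into a 16-bit start address.

    rom_read is any callable that returns the byte at an address.
    """
    lsb = rom_read(0x8000)  # low byte comes first
    msb = rom_read(0x8001)  # high byte second
    return (msb << 8) | lsb


# Stand-in for the external ROM: only the two vector bytes are filled in.
rom = {0x8000: 0x10, 0x8001: 0x80}
start = reset_vector(rom.__getitem__)
print(hex(start))  # 0x8010
```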
When the processor makes the MR/ signal low, the selected memory or input device should place a byte on the databus. When the processor makes the MW/ signal low, the databus contents should be written to the selected memory or output location. An external demultiplexer will be needed to select RAM, ROM or I/O.
The microcode bus and Programming signals are needed to program or update the microcode (that is in the flash memory of the CPU). Normally, the microcode bus will output the microcode bits. This can be used for debugging, but it is not needed to connect this for normal operation.
To program a microcode memory location, the correct address must be set in the UPC register, and the flag bit (bit 7 of the H register) must be set correctly. The RPi does this by sending microcode and databus bits to the CPU. When the correct address has been set, the byte that must be programmed can be placed on the microcode bus, and a short active-low pulse on the PROG/ input will program the byte into the flash memory (the exact programming sequence is a little bit more complicated, refer to the datasheet of the flash device). To check the programmed byte, the EN/ input can be made low to read the microcode byte from the flash.
Note that the microcode is non-volatile, once programmed it will always stay in the CPU, also when there is no power.
The programming mode can also be helpful to program the external flash ROM that holds the user program for the CPU. By sending the correct microcode instructions, the address bus can be set to the address that you want to program, so no external multiplexer is needed to connect the programming address to the flash ROM. But of course it is also possible to have the external flash in a socket and program it by inserting it in a universal programmer.
To have a quick start, a very simple instruction set was chosen. It is based upon the zero-page addressing mode, and most operations work on 16-bit values. This makes it a 16 bit processor with an 8 bit bus. There is a single 16-bit accumulator, surprisingly called "A", and a 16-bit PC. A single instruction can load or save the 16 bits in A from or to the zero page. Also, the zero page values can be used as a pointer (as in the 6502). This makes indirect load or store possible. There is an increment-by-two that works with a table in external ROM. All instructions are two bytes long. The opcodes are simply the 8-bit start addresses in the microcode, 16 bytes apart, giving a maximum of 16 opcodes. This is not an efficient encoding, since only 4 of the 8 bits in the opcode are used.
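The opcode scheme can be sketched in Python as follows. The names SLOT_SIZE, decode and entry_points are invented for this example, and the sketch assumes that the opcode byte is used unchanged as the microcode start address, as the text describes.

```python
SLOT_SIZE = 16  # microcode bytes between two opcode entry points


def decode(instruction):
    """Split a 2-byte instruction into (microcode entry, operand)."""
    opcode, operand = instruction
    if opcode % SLOT_SIZE != 0:
        raise ValueError("opcode 0x%02x is not on a 16-byte boundary" % opcode)
    # The microcode start address is simply the opcode byte itself.
    return opcode, operand


def entry_points():
    """All microcode start addresses an 8-bit opcode can select."""
    return [slot * SLOT_SIZE for slot in range(256 // SLOT_SIZE)]


entry, operand = decode(bytes([0x40, 0x06]))
print(hex(entry), hex(operand))
print(len(entry_points()))  # 16
```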
Note that these instructions can be (almost) freely chosen, but that they must be supported/interpreted by the microcode program. To support this simple instruction set, the microprogram is less than 256 bytes. 4096 bytes are available, so a much more complex instruction set can be supported.
0000 - 3FFF RAM
4000 - 7FFF I/O
8000 - BFFF ROM
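The external address decoding for the memory map above could look like this in Python. It is a sketch only: the function name is hypothetical, and the text does not say what happens for addresses from 0xC000 upwards, so the sketch returns None there.

```python
def select_device(address):
    """Return which device the memory map assigns to a 16-bit address."""
    address &= 0xFFFF
    if address < 0x4000:
        return "RAM"
    if address < 0x8000:
        return "I/O"
    if address < 0xC000:
        return "ROM"
    return None  # 0xC000 - 0xFFFF is not assigned in the listed map


for addr in (0x0004, 0x4000, 0x8010, 0xC000):
    print(hex(addr), select_device(addr))
```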
'A' is a 16 bit accumulator in RAM at 0x0004 (lsb) and 0x0005 (msb).
PC is a 16 bit location in RAM at 0x0002 (lsb) and 0x0003 (msb). (Execution starts at 0x8010)
Temp is a 16 bit temp storage in RAM at 0x0006 (lsb) and 0x0007 (msb).
Reset vector: address 0x8000 in ROM contains fixed value 0x10, address 0x8001 contains 0x80, address 0x8002 contains 0x00.
The following opcodes are defined. All opcodes are followed by a single operand byte.
0x20 LDB AL,#I8   ; 8 bit immediate load AL, AH will be set to zero.
0x30 LDB AH,#I8   ; 8 bit immediate load AH
0x40 LDW A,Z      ; 16 bit load from a zero page location
0x50 STW Z,A      ; 16 bit store to a zpage location
0x60 LDW A,(Z)    ; 16 bit indirect load (address pointer in zero page)
0x70 STW (Z),A    ; 16 bit indirect store (pointer in zero page)
0xe0 LDB A,(Z)    ; 8 bit indirect load (address pointer in zero page)
0xd0 STB (Z),A    ; 8 bit indirect store (pointer in zero page)
0x80 BR label     ; replace lower 8 bits of PC
0x90 BRM label    ; replace lower 8 bits of PC if bit 7 of ACC is 1
0xa0 BRP label    ; replace lower 8 bits of PC if bit 7 of ACC is 0
0xb0 PAGE label   ; jp to another page, label is 16 bit (lower 8 bits
                  ; must be zero, so it's a 2 byte instruction)
0xc0 INCD A       ; 8-bit increment-double (increment-by-two) accumulator (needs table in
                  ; ROM at 0x8100)
                  ; The second instruction byte has msb of table address
                  ; The same table is used to increment the PC.
0xc0 DECD A       ; 8-bit decrement-by-two accumulator (needs table in ROM
                  ; at 0x8200)
                  ; The second instruction byte has msb of table address
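Since there is no ALU, INCD and DECD get their result from a lookup table. The Python sketch below shows one way such tables could be generated on the development machine and then used by a pure lookup. The function names are made up, and the wrap-around at the ends of the 8-bit range is an assumption, not something stated in the text.

```python
def build_incd_table():
    """256-byte table for ROM at 0x8100: entry n holds n + 2 (8-bit)."""
    # Assumption: values wrap around modulo 256.
    return bytes((n + 2) & 0xFF for n in range(256))


def build_decd_table():
    """256-byte table for ROM at 0x8200: entry n holds n - 2 (8-bit)."""
    return bytes((n - 2) & 0xFF for n in range(256))


def lookup(table, value):
    """What the CPU does: use the byte as an index, no arithmetic."""
    return table[value & 0xFF]


incd = build_incd_table()
decd = build_decd_table()
print(hex(lookup(incd, 0x10)))  # 0x12
print(hex(lookup(decd, 0x10)))  # 0xe
```

The additions happen only while the tables are generated; at run time the CPU just reads one byte from ROM.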
MICROCODE INSTRUCTIONS
bytecode
0x00 LD B,(HL+nn)   ; nn = 0-7 or 0x8000 - 0x8007
0x10 LD B,(nn)
0x20 LD L,(HL+nn)
0x30 LD L,(nn)
0x80 ST (HL+nn),B
0x90 ST (nn),B
0xa0 LD UPC,BL      ; UPC[0-7] <- B[0-7], UPC[8-11] <- L[0-3]
0xb0 LD UPC,B       ; UPC[0-7] <- B[0-7], UPC[8-11] <- 0
0xc8 LD H,L+nn      ; nn = 0-7 IR3 must be set to 'write' to ROM
0xd8 LD H,nn        ; nn = 0-7 IR3 must be set to 'write' to ROM
0x00 - 0x07 add this to bytecode for nn = 0-7
0x08 add this to bytecode for nn = 0x8000 - 0x8007
Microcode instructions can have an additional M or P, to make them conditional:
M (minus) -> execute when H = 1
P (plus) -> execute when H = 0
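The bytecode rules above can be captured in a small encoder. This Python sketch is not the project's assembler; the BASE dictionary and the encode function are hypothetical names, and only the instructions that take an nn value are covered.

```python
# Base bytecodes taken from the list above (instructions that take nn).
BASE = {
    "LD B,(HL+nn)": 0x00,
    "LD B,(nn)": 0x10,
    "LD L,(HL+nn)": 0x20,
    "LD L,(nn)": 0x30,
    "ST (HL+nn),B": 0x80,
    "ST (nn),B": 0x90,
    "LD H,L+nn": 0xC8,  # the 0x08 (IR3) bit is already part of the base
    "LD H,nn": 0xD8,
}


def encode(mnemonic, nn):
    """Return the microcode byte for a mnemonic and its nn value."""
    base = BASE[mnemonic]
    if 0 <= nn <= 7:
        return base | nn
    if 0x8000 <= nn <= 0x8007:
        if mnemonic.startswith("LD H"):
            raise ValueError("LD H forms only take nn = 0-7")
        return base | 0x08 | (nn & 0x07)
    raise ValueError("nn must be 0-7 or 0x8000-0x8007")


print(hex(encode("LD B,(HL+nn)", 3)))     # 0x3
print(hex(encode("ST (nn),B", 0x8002)))   # 0x9a
print(hex(encode("LD H,nn", 5)))          # 0xdd
```

The M and P variants do not change the byte itself; they decide in which half of the microcode flash (H7 = 1 or H7 = 0) the byte is placed.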
Note that the micro-instruction that writes to H will also write to the external bus, due to the simplicity of the decoder. To prevent writing rubbish to RAM, this micro-instruction has IR3=1 to generate an address in the 0x8000-0xFFFF range. The external address decoding should be such that RAM and I/O are in the 0x0000-0x7FFF range, so that they are not corrupted by this decoder effect.
Also note that the "+" in (HL+nn ) is actually a bitwise-OR and not an addition.
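The difference between OR and addition matters as soon as the low bits of HL are not zero. A short Python sketch (the function name is made up for illustration):

```python
def effective_address(hl, nn):
    """Address put on the bus for (HL+nn): a bitwise OR, not a sum."""
    if not (0 <= nn <= 7 or 0x8000 <= nn <= 0x8007):
        raise ValueError("nn must be 0-7 or 0x8000-0x8007")
    return (hl | nn) & 0xFFFF


# Low bits of HL are zero: OR and addition give the same result.
print(hex(effective_address(0x0008, 3)))   # 0xb

# Low bits of HL are already set: the results differ.
print(hex(effective_address(0x0006, 3)))   # 0x7, addition would give 0x9
```

So (HL+nn) only behaves like real indexing when HL points at an address whose lowest three bits are zero.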