On my quest to build a fast QR Code generator in Minecraft, I saw a video by DavidJR Redstone. He dropped his latest revision of his QR Code generator, but one notable feature was the use of a Trap Door ROM. Inspired, I set out to build by own version,
We can use the fact that walls will attach to trapdoors when powered - and propagate that signal down instantly - to create a sort of ROM. In this case, I stacked them two high - but DavidJR stacked around 16 high:
You can see how he uses both sides of the walls to propagate the signal. But there's an issue.
The left side is one "byte" of data, and the right side is the other byte. In my design, and how it seems to work in DavidJR's design, the LHS overrides the RHS in some places, and the RHS overrides the LHS. To get to the bottom of this, I did some investigation on the ORE server. A kind jetFEE42 led me to the ROM I was looking for. The solution to this is quite simple, yet smart.
DavidJR softpowers the block with the torch, so that only one line is selected at a time. In fact, DavidJR's ROM is brilliant. Since only observers can detect the trapdoor, a pulse is generated. This pulse is timed perfectly with the XOR that needs to be done with it's output and sent to the next module:
- 1 - the output from the ROM
- 2 - the output from the XOR goes to the next module
- 3 - at the same time, the output from the previous XOR is latched with a repeater lock.
This is such an ingenious design, as it allows for extremely fast pipelining. We can analyze the delay of the ROM and see that it is around 10-13 ticks to the XOR. This is breathtaking speed for a 256 byte ROM. One SGTMIGHTYMIKE noted that a single trapdoor design might be faster. However, there is always a tradeoff to be made: a larger design will be inherently slower due to the larger distance redstone needs to travel. The double trapdoor may be smaller, but needs more control circuitry. Another concern to keep in mind is the output of the ROM - whether double or single trapdoor, it will be a pulse from an observer. In DavidJr's case, it is fine, if not helpful.
Removing an extra ROM and adder
Another consideration to be made is that fact that the input to the ROM needs to go through an adder first. There are N copies of the ROM, and each has an assigned constant to which the input needs to be added to before the sum goes to the ROM. we can remove this additional 10-13 tick delay by having different ROMs. This will certainly be impractical to do by hand, so we should create a script to generate a file we can later insert using worldedit.
Those familiar with polynomial division in a galois field will be aware that the input to this addition must first go through another ROM to convert from a "regular" number to an exponent (this makes multiplication easier, as it is simply addition with exponentials).
Essentially, what I am saying is that for any given term in the generator polynomial, the multiplication of the generator polynomial by the lead term of the message polynomial is a function of the lead term of the message polynomial as an integer, and the output is an integer. This means that we need 1 ROM per term in the generator polynomial:
(lead term: integer) ---(generator term specialised multiplication)--> multiplied lead term: integer
(multiplied lead term: integer, message term: integer) ---(xor)--> next lead term: integer
All in all, this is to say that we can remove the integer -> exponent ROM, and instead specialize the generator term ROMs.
It is difficult to make further optimizations from an algorithmic standpoint, because the next lead term will always have a hard dependency on the current message term and the current lead term. It is not possible to know what the next input to the ROMs should be without doing the required calculation anyway.
We have already increased the speed of an individual division step by 3 times from the DavidJR design (at least, conceptually). What further improvements can be made? the XOR step is as fast as possible. We need to make the ROM as fast as possible. DavidJR used a 16 by 16 trapdoor ROM to achieve 256 bits. This is likely near as fast as possible for trapdoor ROMs.
SGTMIGHTYMIKE used pumpkins to make a single trapdoor design. Notably, pumpkins carry redstone power but do not update walls! The colored blocks represent trapdoors:
I inspected the timings of this ROM. It might be a tick or so faster than DavidJR's design, but there is always a plus and minus to the exact details. We want to take advantage of the instant update downwards of the walls, but not spend too much time getting the signal upwards. Another consideration is the output of the walls. It might be easier to locally disable observer's output. Consider this, we can disable observers output in the same tick that the walls update. On the other hand, we need to disable individual ranks of walls before the tick that they update. Maybe we can utilise this tick to parallelize a bit, but the issue also becomes that the bottom area of the ROM will be larger. Let us build a ROM to explore a bit.
With this design, it takes 5 ticks from an input at the to a given layer of observers activated:
If we want to now disable the observer's output, as we will eventually need to do, we might use a comparator on subtract mode. However, this presents an issue. A comparator cannot take an input from an observer; we will need a 2 tick repeater as such:
However, two solutions are possible that rely on the same idea: preventing the observer from detecting the walls updating. On option requires a second observer. We can power redstone dust to "disable" the second observer in the chain. It will no longer output. It will, of course, trigger other updates when we disable or enable this "redstone lock," but we are operating under the assumption that the ROM is perfectly synced and that we will take a snapshot of the output at the correct tick.
The other option is to have another pumpkin layer, but this is controlled not layer-by-layer, but instead rank-by-rank. at the bottom of each rank of walls - except the one we want to actually activate the observer, we can enable another layer of trapdoors.
I will look into the second option because it is faster - no second observer. However, we need to make sure that this bottom most layer updates before the rest. Otherwise, we will get a conflict. This means that we need to delay the vertical decoder to one tick after the horizontal decoder. Let us try to make a small and fast horizontal decoder. I found a design by Cinnimin_ which I adapted to the use case. Incidentally, this is also 5 ticks
We will need to delay the data pulse by one tick, so that the control pulse runs at 5 ticks and then the data pulse at 6. Using this wiring, we successfully transferred the data from the topmost layer to the bottom in just 6 ticks
After several hours of experimentation, this doesn't work. Both option 1 and 2 need the layer decoder update in order to work; we need a different solution.