(C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

Massively Parallel Turing Machine (MPTM)
February 2004 (original design from 1996)
Full Document first created Maui, Spring 2005
Edited/Updated Tokyo and Maui, November 2005



Dave Scruton, Fractallonomy, Inc.
155 Wailea Ike Place #39, Kihei HI 96753
fractal@maui.net (808)879-3880


(C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

MPTM: Massively Parallel Turing Machine

Created by Dave Scruton, 1985. Updates in 1994, 2006


1. Basic Concept/Applications:

The MPTM is a very simple machine built entirely of special memory, connecting logic and lasers/optics.

  • All computation is carried out by virtual 'particles' which inhabit the memory. Any possible logical sequence can be performed. Any other architecture can be mimicked literally using "particle-carpet microcode"

  • These particles can travel through the memory linearly and change course as they move.

  • As the particles travel through memory, they can interact with data already present in the memory (heretofore referred to as 'space' or 'memory space' and alter the contents of the space in a pre-programmed manner.

  • Each memory 'space' is linked logically to its neighbors, in a 2D, 3D , or n-dimensional lattice. The memory can be binary, trinary, or of any modulus.

  • Data is read and retrieved via optical links to the memory.

  • If memory exceeds the TB level, large floating-point numbers could be attached to particles, allowing for scientific applications like studying heat transfer, weather simulations, and protein-folding simulations.

  • Mechanical simulations could be carried out in a literal sense, with 3D parts spread out over the memory space, so virtual approaches real.

  • The memory of one chip can be optically linked to the memory on an adjacent chip, using on-chip lasers, prisms, and fiber optics. Neighboring chips can carry on optical 'conversations' with each other asynchronously, and particles traversing through memory on one chip can 'jump' across this optical connection to the next chip. The laser frequency can be used as a chip address, i.e. chip 'n' only looks at laser data at frequency 'f'; it ignores any other data meant for other chips, and no multiplexing is required. Each chip has a set of configuration bits which determine the laser frequency it is sensitive to. (see configuration below) As a result, arbitrarily large memory spaces of any dimension can be created for computation using the same chip.
    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    2. Particles and Space

    In the above diagram, a group of memory 'cells' is shown. Each memory cell represents a place where a 'particle' can be stored as it 'moves' through 'space'. Note that the cells are all linked together logically. The logic between cells should be able to transfer data, and also should be able to operate on the data in a variety of simple ways (invert, increment, decrement, bitshift, AND, OR, etc).

    As a result, the particles can travel along any path, altering the memory as they travel along. Multiple sets of particles can interact with each other, producing complex results from a simple architecture.

    This diagram shows a single chip, with its laser optics. There are multiple paths connecting the 'cells' in memory with the laser optics, so any particle can also travel from chip-to-chip via laser optics. This makes the computer easily expandable; all connections are optical.


    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    3. How Software is executed

    These are the steps involved in code execution:
  • Load data/executable memory from outside (if needed).
  • Place particles at their initial locations, load their starting states.
  • Start the particles up, they propagate through memory in some sprogrammed direction.
  • When the particles reach a certain point, (or a Data Wall) they are halted.
  • The data trail left by the particles is now available.

    Data Trail

    As a particle travels through space, it leaves behind a data trail , which consists of the contents of memory left behind after the particle's transit. If many particles are travelling along on parallel paths, the data trail can take on dimensions, becoming a data stream.

    Data Wall

    A Data Wall is a preset pattern of bits which occupy a space in memory. This bit pattern isn't a particle, and it doesn't ever move. It is deposited there by the fiber network and must be erased by same (maybe this will change). The Data Wall should be made contiguously through the memory, so as not to leave any open space, along a straight X,Y or Z path (Angular paths could leak).

    Cyber Real-Estate...

    Another good use for a Data Wall would be in creating an enclosure within memory, for storing global data, or for dedicated use, such as a telecommunications 'easement' in each memory block, reserved for future expansion or new media. At any rate, the Data Wall would form a rectangular enclosure like this:
    
       WWWWWWWWWWWWWWXXXXWWWWWWWWWWWWWWWWWW
       W........       XX         ........W
       W......         XXXXXXXXXXX <----------These X's form an empty
       W....       GGGGXX........X   .....W    box. This is the same
       W....       GGGGXX........X    ....W    as putting in power
       W..         GGGGXX........X      ..W    lines before houses...
       W GGGGXGGGGGGG  XX........X        W
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<---Data transfer line
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX     to get data in/out
       W   GGGGGGGG     XX GGGXGGGGGGXGGG W
       W                XXWWWWWWWWWWW     W
       W..     GGG      XXWXgggggggXW   ..W  <--Note this 2nd-level
       W....  GGGGGGG   XXWXgggggggXW ....W      enclosure of a Wall.
       W....  GGGGGGG   XXWWWWWWWWWWW ....W      This 'ggg' Data can only
       W......          XX          ......W      be accessed by a
       W........        XX        ........W      higher-priveleged 
       WWWWWWWWWWWWWWWXXXXWWWWWWWWWWWWWWWWW      or superuser process.
    
    This is a 2D depiction of an n-dimensional virtual memory array, which provides neighbor-to-neighbor communications in most or all of its built-in dimensions. A 2D map is used for simplicity.

    ..also note along the horizontal pair of 'XXX' lines, most of the space is taken up (it has 'G's on it) Note though, that there must be a space made for data transit (the single 'X's that break up the GGGG strings) so that others can access data off in the corners of the walled area. Without making 'beach access' available to all, much of the walled-off areas would be useless to all but the earliest adopters.

    For best results, the Data Wall should be made in a rectangular shape. All processing which occurs inside the Data Wall shall be trapped therein, and only fiber i/o is supported to that area. This allows for multiple processes on a single chip without any risk of cross-process corruption (at least directly by particle motion).

  • there is still a chance that bad data could be deposited in a chip-to-chip manner using the fiber network


    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    4. How Memory is Used

    Here is a simple application of the MPTM which will use an array of memory similar to an 8-bit shift buffer as 'space'.
    The 'particle' in this 'space' can move either left, or right. This is but one possible use of the memory; there are infinite ways of creating new 'particles' using the memory as an ever-changing data matrix.

    Can you guess what this example does?


    Output over time: (showing data AND control bits)
    
    
           time    0   1   2   3   4   5   6   7   8   9.....
       location    2   3   4   5   4   3   2   1   2   3
              D0   0   1   1   1   1   0   0   0   0   1
              D1   0   0   0   0   0   1   1   1   1   1
              D2   0   0   0   0   0   0   0   0   0   0
              D3   0   0   0   0   0   0   0   0   0   0
            
              C0   0   0   0   1   1   1   1   0   0   0
              C1   0   1   0   0   0   1   0   0   0   1
              C2   1   0   1   0   1   0   1   0   1   0
              C3   0   0   0   1   0   0   0   1   0   0
    
    fiber output:  0   -   1   -   1   -   2   -   2   -
    
    
    As you can see, the incremental value in D0-D3 comes out of the fiber array (from two different fiber ports, at addresses 2 and 4).

    This is a simple ramp wave generator. Data starts at address 2 (initial position), and gets transmitted; after that, it moves to the right and gets incremented. Each time it is changed, the Data is sent out a Fiber conduit, and the particle moves on to the next address. When the particle encounters the 'Reverse Direction' command, it changes from Left traversal to Right traversal (the 'Direction' bit gets inverted). Note with this design that the particle actually gets transmitted TWICE each time it gets incremented. This could be changed with a different algorithm, this one is shown for simplicity. There could be a special opcode which only increments left-travelling particles, and another which only outputs to fiber right-travelling particles. With these two new opcodes you could imagine a smaller program doing the same thing.

    Just to be safe, there are Data Walls present at each end of the particle's path, to make sure that the processing stays put within this area of memory.

    Also note that the control area is used bit-wise, i.e. one bit per control function. It could just as easily be used as a four-bit control value, resulting in 15 different opcodes (plus Data Wall).

    As an alternatitive, if there are fewer fiber conduits (say every 3rd or 4th location), areas with fiber access can be made to auto-transmit their contents. If this is the case, the particles need to be forced to travel through the auto-transmit space.

    Addressing

    Addressing needs to take into account three things:
    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    5. Bit Depth and Complexity

    The above example had only four bits, so the number of different types of operations is very limited. But if the memory was 32 bits deep, which is a common size now, the number of possible operations increases greatly. For example, different sets of bits could represent multiple numbers used concurrently in the logic, to maintain running sums, averages, etc.

    Opcodes could be represented numerically instead of bit-wise, so 8 bits of control could represent 255 different opcodes (plus the Data Wall).

    If 128 or 256 bits of memory is available, large floating-point numbers could be attached to particles, allowing for scientific applications like studying heat transfer, weather simulations, and protein-folding simulations.

    If a larger problem arises, all that is needed is more MPTM chips added to the array, and the processing capability goes up linearly WITH the problem size.

    It has not been proven, but this architecture may be able to handle NP-complete problems like the Travelling Salesman problem, and may be able to handle increasing complexity in a linear fashion, just by adding more MPTM chips based on problem size.


    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    6. Parallel Processing

    The number of parallel processes that may exist within the MPTM is limited solely by the amount of memory/space available. There needs to be some sort of outside control to regulate which processes go where, how much space they will occupy, and when they start/stop. This could be regulated by an external computer, which may be of any type of architecture (including MPTM!)


    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    7. Error handling

    Since the MPTM architecture is basically asynchronous, and there are no 'traffic cops' regulating how the particles travel through space, the Data Walls must be used extensively to make sure that particles stay where they belong and don't corrupt memory outside their intended space.

    Also, it is possible that data could be sent in via the optical ports and corrupt space that already is being used by active particles; this should be watched for. Maybe a Data Wall that regulates the direction of flow in and out of memory may be needed at each fiber inlet/outlet to prevent unwanted particles from entering a certain area of space during execution.


    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    8. Chip configuration

    Each chip in the MPTM needs a set of programmable EEPROM bits which determine basic chip function.

    Chip configuration must be performed by an external 'control computer', which is the same computer which is used primarily to load new programs into the MPTM and also to initiate processing and collect results. This control computer can be of any architecture, as long as it can send and receive the data appropriately.

    It is probable, with larger and larger chips, that only designating a small subset of the chip as a "special" area (see below) will become more popular than dedicating the entire chip to one purpose. Some of these modes look like toggles which may go on and off during processing, while others look like more permanent settings. I recommend the permanent settings to be one bit each, if possible. The toggles should use the fastest architecture (parallel or muxed) to store their information.

    Basic chip configuration:"special" chip modes


    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    9. When Particles Collide

    Collisions between particles are also possible, and could result in many possible outcomes: All three possibilities could have uses, for example reflection (particles bounce off one another) could be used in the simulation of atomic particles or gasses.
    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    10. Basic Input/Output (I/O)

    Communication between the MPTM supercomputer and the outside world is very simple albeit heavy on the number of external processors necessary. Please keep in mind, that in a worst-case situation, there may be tens of thousands of particles moving through space, each of which may require an input or output stream. This is probably going to be one of the first bottlenecks in this design, the ability to get data in and out of the computer in a timely manner.

    Basic memory considerations: There should be a temporary block of RAM for every 4-8 blocks of MPTM spatial RAM. The temporary block(s) get loaded via laser transfer from the external source(s), each of which can be asynchronous. When a suitable amount of data has been loaded for input/output, the transfer occurs.

    Input

    For data going out, the different data tracks (one per particle or maybe one per memory block)are loaded with the outgoing data for each respective track. At this point, the data tracks are MUX'ed together into the laser output beam (there are at least 6 per chip for redundancy) The output beam goes through one or more sets of prisms before being emitted from the chip proper (most likely we will be using many IR bands)

    Output

    Output is the reverse of input, the data gets emitted from a particular row/column in the particle space, and is bound for some location (IP address, maybe?). The first place data will end up is in a large collection buffer on the outside end of the laser communication network.

    IMPORTANT NOTE: Please Read
    The Output must be broadcast in a different set of color bands than those being used for input, to avoid cross-illumination. An alternative would be to use a different set optics for input AND output, to keep the two subsystems completely isolated. This sounds nice, but the fiber overhead would make this project prohibitively complex and expensive, so I am going with a common optical path for both input and output traffic.

    In the early stages of fabrication, I recommend using visible laser source, red colors for input and blue colors for output, and maybe greens for chip-to-chip communications. This not only makes debugging a snap, but provides an excellent demonstration tool so people used to traditional computer architectures can more easily grasp the data and processing flow of the MPTM system.

    Data Identification (optical laser data)

    This is an easy way to brand the data. Every millisecond or every second, whatever is easiest, the laser sends out Morse Code (on/off, dots and dashes) indicating the following: There is a lot of descriptive data here, but for most chip-to-chip communications, only a few fields matter, namely the layer/row/column data. Data feeds from external sources, such as from different IP addresses, should be color-tagged to indicate their foreign origin. This would make firewalls much easier to implement, as well, since we are looking for different colors as opposed to different data. If the optical data is NOT in the right colorspace but its IP address indicates a foreign source, then this data transmission should immediately be flagged down as a possible virus/worm/attacker and dealt with appropriately. One good place to put 'bad guys' would be a 'playpen' in an unused area of memory, with a double set of fences built around it (or maybe more fences still. A particle could be sent around and around in the space between the two fences, and flag down any data transmissions it may see trying to break through.

    Also note, the laser frequencies (local intra-chip ONLY) could be canned, based on the row/column/layer of each spatial memory location. Such a beam would have to be modified for chip-to-chip travel, so as not to get confused with either the destination chip's local traffic or some other 3rd chip's chip-to-chip traffic destined for the same chip. Solution: There needs to be an automated way to modify local chip-to-chip laser traffic's wavelength by shifting it based on its chip-to-chip destination.


    ** NOTE: the coordinate/addressing system mentioned above is an example, and is arbitrary. The initial MPTM chip will ideally be an 8x8 XY matrix of particle "spaces", with only one "layer". However, future chip designs may incorporate many layers of logic, so I am including the layer as an integral addressing feature. "Rackboox" is arbitrary, referring to the piece of hardware that the MPTM chip is physically mounted to. After that, the Internet-2 enhanced IP address is perfectly suitable for addressing at the higher level.
    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    Appendix: Applications:


    Most of these are very time-consuming using typical Von Neumann architectures, even with RISC-based computing. The MPTM architecture speeds up these tasks by orders of magnitude; it all depends on how much memory you want to buy.

    RGB Frame Buffer showing checkerboard


    This demonstrates how any image can be stored in the MPTM space array. You can process as many images as you want, you just need enough memory.
    --- = data wall
    -f- = fiber inlet/outlet in data wall
      or   
       |
       f
       |
    
    010 = RGB frame data in memory, one bit deep
    
                       -f-f-f-f-f-f-f-f-f
                      |  1 1 1 1 0 0 0 0  |
                      f  1 1 1 1 0 0 0 0  f  
                      |  1 1 1 1 0 0 0 0  |
                      f  1 1 1 1 0 0 0 0  f
                      |  0 0 0 0 1 1 1 1  |
                      f  0 0 0 0 1 1 1 1  f
                      |  0 0 0 0 1 1 1 1  |
                      f  0 0 0 0 1 1 1 1  f
                        -f-f-f-f-f-f-f-f-
    
    
    This is basically a static storage area, new data gets deposited and read via the fiber inlet/outlets

    Particle-Carpet Microcode

    If you record the states of the main microcode registers of any CPU, you get a "magic carpet" which contains a trail of computations. Such carpets could be laid into MPTM memory, and using an array of particles all travelling parallel to each other, 128/256/512/etc bits across, the particles could read the microcode and act based on its values. Given this situation, any architecture could be emulated perfectly with no additional software.

    Data Shunt for streaming media

    This takes advantage of the massive amount of memory used in this design, and of the dense ramification of fiber connections throughout the memory system.

    This example shows four-bits deep:

    --- = data wall
    -f- = fiber inlet/outlet
    010 = stream of data in memory
    
                       -f-f-f-f-f-f----f-f-f-f-f-f-f----f-f-f-
                           
                    ... 1 0 1 1 0 0 1 1 1 0 1 0 0 1 0 0 1 0...
                    ... 1 1 0 0 1 0 1 1 0 0 1 0 0 0 1 0 0 1...
                    ... 1 0 1 1 0 0 1 0 1 0 1 0 1 1 0 0 1 0...
                    ... 1 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 0 1...
    
                       -f-f-f-f-f-f----f-f-f-f-f-f-f----f-f-f-
    
    This represents four bits of streaming media (uncompressed),
    and can just as easily be 32 bits, to encase a pixel stream.
    It represents a video clip being streamed for consumption
    over a remote network.  
    

    All a web server has to do is receive data from this stream and pass it on. Each MPTM chip has a data MUX which can stream out optical data from any or all of the fiber inlet/outlets (optical ports) at the same time, so obtaining the data from memory can be done in parallel by a large number of processes.

    Heat Transfer using 3-bit data

    --- = data wall
    -f- = fiber inlet/outlet in data wall
      or   
       |
       f
       |
    
    numbers = single particle's "temperature"
    
                       -f-f-f-f-f-f-f-f-f
                      |  0 0 0 0 0 0 0 0  |
                      f  0 2 3 4 3 2 0 0  f  
                      |  0 3 5 6 5 3 0 0  |
                      f  0 4 6 7 6 4 0 0  f  
                      |  0 3 5 6 5 3 0 0  |
                      f  0 2 3 4 3 2 0 0  f  
                      |  0 0 0 0 0 0 0 0  |
                      f  0 0 0 0 0 0 0 0  f  
                        -f-f-f-f-f-f-f-f-
    
    
    This shows simple neighbor-to-neighbor heat transfer, with the temperature represented as a 3 bit number. Each particle is static, but transfers its data value to its neighbor with attenuation. This type of application can handle complex problems very rapidly; its only limitation is memory size.

    Data Copy (on/off chip)

    This shows a block data copy (or DMA from outside) which moves a chunk of data from one place to another; it also handles data going off-chip.

    Note that, since the 'neighbor' can be a remote chip or even a remote IP chip, this operation can be used by web servers to deliver data rapidly and in parallel.

    ddd = data wall
    -f- = fiber inlet/outlet in data wall
      or   
       |
       f
       |
    
    x x x ... = area to be copied
    y y y ... = destination of copy
    
    
    			Source Chip		 Neighboring Chip
                       -f-f-f-f-f-f-f-f-f	  	 -f-f-f-f-f-f-f-f-f
                      |  0 0 0 0 0 0 0 0  | 	|  0 0 0 0 0 0 0 0  |
                      f  0 0 0 0 0 0 0 0  f---------f  0 0 0 0 0 0 0 0  f  
                      |  0 x x x x 0 0 0  |		|  0 0 0 0 0 0 0 0  |
                      f  0 x x x x 0 0 d  f---------f  y y y y 0 0 0 0  f  
                      |  0 x x x x 0 0 d  |		|  y y y y 0 0 0 0  |
                      f  0 x x x x 0 0 d  f---------f  y y y y 0 0 0 0  f  
                      |  0 0 0 0 0 0 0 d  |		|  y y y y 0 0 0 0  |
                      f  0 0 0 0 0 0 0 0  f---------f  0 0 0 0 0 0 0 0  f  
                        -f-f-f-f-f-f-f-f-             -f-f-f-f-f-f-f-f-
    
    
    
    The fiber links between chips allow neighboring chips to share data rapidly. If data is going to a non-neighbor chip, then the data must be uploaded off-chip, and goes to one of the surface laser propagators. The data is coded by wavelength so it reaches the correct chip.

    Data that goes from chip-to-chip (the three rightmost columns, in the above case) is either relayed using the chip-to-chip fiber links (shown as dashed lines above) or using the chip-mounted laser system to 'beam' the data to an adjoining (or remote, as the case may be) chip.

    Note that if the neighbor has been remapped using a chip configuration control (see above), the data may go far away to a remote chip. Also, if the neighbor has been walled off for some reason (see Data Walls above) then this operation will fail.

    Data Copy-the Swarm

    Instead of having the data 'jump' from one place to another, the data can also act as particles and just move over to the new location, providing that there is an empty path wide enough for the data 'swarm' or 'cloud' to pass through.
    dddddd = data wall
    ====== = laser conduit/fiber optics
    -f-f-f = fiber inlet/outlet in data wall
    
      or   
       |
       f
       |
    
    x x x ... = area to be copied
    y y y ... = destination of copy
    
    
    			Source Chip		 Neighboring Chip
                       -f-f-f-f-f-f-f-f-f	  	 -f-f-f-f-f-f-f-f-f
                      |  0 0 0 0 0 0 0 0  | 	|  0 0 0 0 0 0 0 0  |
                      f  0 0 0 0 0 0 0 0  f=========f  0 0 0 0 0 0 0 0  f  
                      |  0 x x x x 0 0 0  |		|  0 0 0 0 0 0 0 0  |
                      f  0 x x x x 0 0 d  f=========f  y y y y 0 0 0 0  f  
                      |  0 x x x x 0 0 d  |		|  y y y y 0 0 0 0  |
                      f  0 x x x x 0 0 d  f=========f  y y y y 0 0 0 0  f  
                      |  0 0 0 0 0 0 0 d  |		|  y y y y 0 0 0 0  |
                      f  0 0 0 0 0 0 0 0  f=========f  0 0 0 0 0 0 0 0  f  
                        -f-f-f-f-f-f-f-f-             -f-f-f-f-f-f-f-f-
    
    
    
    In this case the x's just start to move to the right, when they get near the destination they move down by one slot. The data space in between the start and destination areas will be trashed as a result. However, this move can be done quite fast, if it is less than the size of one chip, and doesn't burden the offchip laser system at all (unless the neighbor is a remote chip!)

    Note that if the neighbor has been walled off for some reason (see Data Walls above) then this operation will fail.

    Maser (microwave emitting device)

    MASER: Create a MASER array that can send out columnar beams at multiple frequencies in multiple directions simultaneously from a single point source. May have LASER applications as well.

    This could be done using MPTM hardware "blocks" that may also suitable for standard computations, and any general-purpose application could be run on it.

    ------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------
    ...end of document...
    ------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------









    (C) Copyright 1989-2010 fractallonomy, inc. all rights reserved.

    Dave Scruton, Fractallonomy, Inc.
    155 Wailea Ike Place #39, Kihei HI 96753
    fractal@maui.net (808)879-3880