From: Bill Dally [billd@csl.stanford.edu]
Sent: Friday, October 12, 2001 7:33 AM
To: Ben Serebrin; Ian Buck
Cc: William R Mark; hanrahan@cs.stanford.edu; Ian Andrew Buck; billd@cva.stanford.edu
Subject: RE: Conditional code?

Ian,

We're all agreed that on the software side you should be able to express
any "C" control construct in a kernel - if, for, do, while, subroutine
calls. How to efficiently implement this is an interesting research
question. Instruction bandwidth is *very* expensive - it costs more to
deliver an instruction to an ALU than the ALU costs (even a 64-bit FP ALU).

Ben is looking at a bunch of alternatives ranging from full SIMD to full
MIMD. To evaluate them he needs some benchmarks. If all we have to do is
Bresenham, then SIMD+conditional streams will work just fine - we run a
lot of kernels just like this on Imagine with very high efficiency.
However, I had the impression from Pat that there were applications with
more complex conditional structures in the inner loop. If you know of such
programs, it would help a lot to put a good benchmark set together.

----Bill

PS. This comparison of control alternatives will also make a great ISCA
paper and perhaps a chapter of Ben's thesis.

At 12:41 AM 10/12/01 -0700, Ben Serebrin wrote:
>Ian,
>
>I think I understand from your mail that you mean each cluster should
>independently be able to do all these things that C can do. I agree that
>we have plenty of logic gates for MIMD properties, but as soon as you
>start spending gates on memory, you can hit troubles. As soon as we need
>to have separate instruction memories for each of N clusters, we have
>N*(instruction memory size) area used up for instruction memory. It's
>likely that programs are big, so we'd need to provide a lot of memory.
>
>Since streams benefit from data parallelism, it's likely that the work
>being performed on streams will often be the same or similar in each
>cluster, so complete independence of all the clusters may not be
>necessary.
>
>I'm working on a document now based on a discussion Bill Dally and I had
>at lunch today, looking at the various shapes MIMD can take. I'll send
>the completed version to you tomorrow, hopefully. But as a preview, the
>most interesting form to me is the following:
>
>As in Imagine, there is a large, shared instruction memory that broadcasts
>instructions to all clusters. Each cluster will have its own PC and its
>own small instruction memory. The PC can address either the current
>instruction word coming from the shared memory, or the PC can address the
>small cluster-local instruction memory. At a conditional branch, a
>cluster may branch into (or out of) its local memory.
>
>This solves the problem of having N large instruction memories, and takes
>advantage of the fact that any deviations from the main program are likely
>to be small variances followed by a merge back into the main program.
>This could, for example, allow a cluster that is rasterizing a large
>triangle to work continuously while other clusters fetch the next
>triangle. In current Imagine code, the clusters that are loading share
>instructions with the clusters that are still working, so the working
>clusters must stall each time the finished clusters require a data load.
>
>Synchronization issues need to be addressed. Also, if we keep the SRF
>form, the independently-running clusters will still need to coordinate
>their SRF accesses (since the SRF is ganged--there's only one address).
>An alternative to this may be N side-by-side RAMs with separate addresses
>that can act either as an SRF, or as an independent set of RAMs. There
>seems to be some problem with this last notion, but I can't see it yet.
>
>Ben
>
>
>On Thu, 11 Oct 2001, Ian Buck wrote:
>
> >
> > Basically, you should be able to do all that C can do: loops, data
> > conditionals, branches, perhaps even recursion. You should look at the line
> > rasterizer in the Brook document, has all three.
> > I think we've reached a
> > point in chip design that you're so limited by the pinout (pad ring
> > determining die size), the extra gates needed for MIMD should be free.
> >
> > http://graphics.stanford.edu/streamlang/brook_v0.1.pdf
> >
> > Ian.
> >
> > > -----Original Message-----
> > > From: Ben Serebrin [mailto:serebrin@Stanford.EDU]
> > > Sent: Thursday, October 11, 2001 4:01 PM
> > > To: William R Mark; hanrahan@cs.stanford.edu; Ian Andrew Buck
> > > Subject: Conditional code?
> > >
> > >
> > > Hi, all,
> > >
> > > Bill Dally and I were talking today about the degree of MIMDism in the
> > > SSS; there's an interesting spectrum between SIMD and MIMD, and some of
> > > the midpoints are rather interesting. I'm working on a memo comparing the
> > > architectures.
> > >
> > > One of the things that would beneficially drive my thinking is a few good
> > > examples of what kinds of conditional program code we might be looking at.
> > > Any suggestions?
> > >
> > > Thanks very much!
> > > Ben
> > >
> >
> >
>

--------------------------------------------------------------------------
Bill Dally              billd@csl.stanford.edu              (650)725-8945
Professor of Electrical Engineering and Computer Science FAX(650)725-6949
Computer Systems Laboratory, Stanford University
Gates Room 301
Stanford, CA 94305                          http://csl.stanford.edu/~billd
--------------------------------------------------------------------------
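
Ben's hybrid proposal in the thread above (one shared instruction memory broadcasting to all clusters, plus a small instruction memory and PC per cluster) can be illustrated with a rough sketch. Everything in it is an assumption made for illustration and does not come from the thread: the `Cluster` class, the `instruction_memory_words` helper, the instruction names, and the sizes (8 clusters, 2048 shared words, 64 local words) are all hypothetical. It shows the N*(instruction memory size) comparison and the per-cycle choice between the broadcast word and a local word. It does not model synchronization or SRF access, which the thread leaves open.

```python
from dataclasses import dataclass


def instruction_memory_words(num_clusters, shared_words, local_words):
    """Return (full_mimd, hybrid) instruction-storage totals, in words."""
    # Full MIMD: every cluster carries its own copy of the large memory.
    full_mimd = num_clusters * shared_words
    # Hybrid: one shared memory plus a small local memory per cluster.
    hybrid = shared_words + num_clusters * local_words
    return full_mimd, hybrid


@dataclass
class Cluster:
    local_mem: list          # small cluster-local instruction memory
    in_local: bool = False   # True while the PC addresses local memory
    local_pc: int = 0

    def branch_to_local(self, target=0):
        """Conditional branch taken into the local memory."""
        self.in_local = True
        self.local_pc = target

    def branch_to_shared(self):
        """Merge back onto the broadcast instruction stream."""
        self.in_local = False

    def fetch(self, broadcast_word):
        """Pick this cycle's instruction: local word or broadcast word."""
        if self.in_local:
            word = self.local_mem[self.local_pc]
            self.local_pc += 1
            return word
        return broadcast_word


# Hypothetical sizes, chosen only to make the arithmetic concrete.
full_mimd, hybrid = instruction_memory_words(8, 2048, 64)
print(full_mimd, hybrid)  # 16384 2560

# Cluster 0 keeps rasterizing from local memory while cluster 1 follows
# the broadcast stream and loads the next triangle.
clusters = [Cluster(local_mem=["raster_step", "raster_step"]) for _ in range(2)]
clusters[0].branch_to_local()
issued = [c.fetch("load_triangle") for c in clusters]
print(issued)  # ['raster_step', 'load_triangle']
```

In this toy model a cluster that runs past the end of its local memory simply raises an IndexError; a real design would need a policy for that case, as well as for when the shared sequencer may advance while some clusters are still in local code.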