From: Ben Serebrin [serebrin@Stanford.EDU]
Sent: Friday, October 12, 2001 12:42 AM
To: Ian Buck
Cc: William R Mark; hanrahan@cs.stanford.edu; Ian Andrew Buck; billd@cva.stanford.edu
Subject: RE: Conditional code?

Ian,

I think I understand from your mail that you mean each cluster should independently be able to do all the things that C can do. I agree that we have plenty of logic gates for MIMD properties, but as soon as you start spending gates on memory, you can run into trouble. As soon as we need separate instruction memories for each of N clusters, we have N*(instruction memory size) area used up for instruction memory. It's likely that programs are big, so we'd need to provide a lot of memory.

Since streams benefit from data parallelism, it's likely that the work being performed on streams will often be the same or similar in each cluster, so complete independence of all the clusters may not be necessary.

I'm working on a document now, based on a discussion Bill Dally and I had at lunch today, looking at the various shapes MIMD can take. I'll send the completed version to you tomorrow, hopefully. But as a preview, the most interesting form to me is the following:

As in Imagine, there is a large, shared instruction memory that broadcasts instructions to all clusters. Each cluster will have its own PC and its own small instruction memory. The PC can address either the current instruction word coming from the shared memory or the small cluster-local instruction memory. At a conditional branch, a cluster may branch into (or out of) its local memory.

This solves the problem of having N large instruction memories, and takes advantage of the fact that any deviations from the main program are likely to be small variances followed by a merge back into the main program. This could, for example, allow a cluster that is rasterizing a large triangle to work continuously while other clusters fetch the next triangle.
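
As a rough illustration of the scheme just described (a sketch under stated assumptions, not anything taken from the Imagine design): each cluster fetches either the word being broadcast from the shared memory or a word from its own small local memory, and a conditional branch switches between the two. The instruction encoding, the names Cluster, run, and the two word-count helpers, and all the sizes are hypothetical. The sketch also assumes that the shared sequencer never stalls, so a cluster that is off in its local memory simply misses the words broadcast in the meantime; the real synchronization policy is an open question.

```python
from dataclasses import dataclass, field

# Hypothetical instruction encoding, invented for this sketch:
#   ("op", name)           ordinary work, recorded in the cluster's trace
#   ("enter_local", addr)  conditional branch from the shared stream into local memory
#   ("return",)            branch from local memory back to the shared stream


@dataclass
class Cluster:
    local_mem: list          # small cluster-local instruction memory
    take_branch: bool        # stand-in for a data-dependent condition
    in_local: bool = False   # which memory the PC currently addresses
    local_pc: int = 0
    trace: list = field(default_factory=list)

    def step(self, broadcast_word):
        """Run one cycle, fetching from local memory or from the broadcast."""
        if self.in_local:
            word = self.local_mem[self.local_pc]
            self.local_pc += 1
        else:
            word = broadcast_word
        kind = word[0]
        if kind == "op":
            self.trace.append(word[1])
        elif kind == "enter_local" and not self.in_local:
            if self.take_branch:
                self.in_local = True
                self.local_pc = word[1]
        elif kind == "return":
            self.in_local = False


def run(shared_program, clusters):
    # Assumption: the broadcast advances one word per cycle and never stalls.
    for word in shared_program:
        for cluster in clusters:
            cluster.step(word)
    return [cluster.trace for cluster in clusters]


def words_full_mimd(n_clusters, program_words):
    # N clusters, each with a full private instruction memory.
    return n_clusters * program_words


def words_hybrid(n_clusters, program_words, local_words):
    # One shared instruction memory plus N small local memories.
    return program_words + n_clusters * local_words


shared = [("op", "setup"), ("enter_local", 0),
          ("op", "load_a"), ("op", "load_b"), ("op", "merge")]
local = [("op", "extra_raster"), ("return",)]

busy = Cluster(local_mem=local, take_branch=True)    # still rasterizing
done = Cluster(local_mem=local, take_branch=False)   # ready for the next triangle
print(run(shared, [busy, done]))
print(words_full_mimd(8, 4096), words_hybrid(8, 4096, 64))
```

In this run the busy cluster executes setup, extra_raster, merge while the other executes setup, load_a, load_b, merge, so the busy cluster keeps working during the loads and both rejoin at the merge. With the made-up sizes of 8 clusters, a 4096-word program, and 64-word local memories, full replication needs 32768 instruction words against 4608 for the shared-plus-local arrangement.
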
In current Imagine code, the clusters that are loading share instructions with the clusters that are still working, so the working clusters must stall each time the finished clusters require a data load.

Synchronization issues need to be addressed. Also, if we keep the SRF form, the independently-running clusters will still need to coordinate their SRF accesses (since the SRF is ganged--there's only one address). An alternative to this may be N side-by-side RAMs with separate addresses that can act either as an SRF or as an independent set of RAMs. There seems to be some problem with this last notion, but I can't see it yet.

Ben

On Thu, 11 Oct 2001, Ian Buck wrote:

>
> Basically, you should be able to do all that C can do: loops, data
> conditionals, branches, perhaps even recursion. You should look at the line
> rasterizer in the Brook document; it has all three. I think we've reached a
> point in chip design where you're so limited by the pinout (pad ring
> determining die size) that the extra gates needed for MIMD should be free.
>
> http://graphics.stanford.edu/streamlang/brook_v0.1.pdf
>
> Ian.
>
> > -----Original Message-----
> > From: Ben Serebrin [mailto:serebrin@Stanford.EDU]
> > Sent: Thursday, October 11, 2001 4:01 PM
> > To: William R Mark; hanrahan@cs.stanford.edu; Ian Andrew Buck
> > Subject: Conditional code?
> >
> >
> > Hi, all,
> >
> > Bill Dally and I were talking today about the degree of MIMDism in the
> > SSS; there's an interesting spectrum between SIMD and MIMD, and some of
> > the midpoints are rather interesting. I'm working on a memo comparing the
> > architectures.
> >
> > One of the things that would beneficially drive my thinking is a few good
> > examples of what kinds of conditional program code we might be looking at.
> > Any suggestions?
> >
> > Thanks very much!
> > Ben
> >
>