GRAMPS & Map-Reduce
PPL
9 April, 2009

* GRAMPS Background / Intro
* Map-Reduce Background / Intro
  - Graph
  - Example environments:
    - Google cluster
    - Many-core host
    - GPU and other accelerator chips
  - Example apps:
    - Google paper: word count, grep, URL frequency, reverse index, etc.
    - Phoenix: word count, string match, reverse index, linear regression, matrix multiply, k-means, PCA, image histogram
    - Machine learning algorithms
    - Etc.
* Extending GRAMPS for Map-Reduce
  - Queue Set Reprise
  - Add three things:
    - Dynamic subqueue creation
    - "Keyed" subqueue indexing
    - Instanced thread stages
  - Built image histogram as a testbed
    - Split: subdivide image into N pixel regions
    - Map: Per pixel in region, compute luma, emit (luma, 1)
    - Reduce / Combine: Per luma, sum all the values
  - Histogram Reduce (GRAMPSViz)
  - Histogram Combine (GRAMPSViz)
  - Looking at some other things:
    - "Filter" / Partial reduction shaders
    - Combine enables parallelism *within* a subqueue.
    - Revisiting push coalescing
    - An app *without* a partial reduction (Feedback: what about floating point math or some form of blending?)
  - Somewhere on the radar
    - Micro cores
    - Packet sizes / chunking
    - More detailed scheduling opportunities / needs
    - Opportunities when reduce is combine
    - Abstracting Map-Reduce API from app
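
The image histogram testbed (Split / Map / Reduce or Combine) can be sketched in plain Python. This is only an illustration of the data flow, not GRAMPS code: every function name here is hypothetical, the integer Rec. 601 luma weights are an assumption (the notes do not say which luma formula was used), and an ordinary dictionary keyed by luma stands in loosely for "keyed" subqueues.

```python
from collections import defaultdict

def luma(pixel):
    """Integer luma from an (r, g, b) pixel. Rec. 601 weights are an assumption."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000

def split(pixels, n):
    """Split: cut the flat pixel list into regions of at most n pixels."""
    return [pixels[i:i + n] for i in range(0, len(pixels), n)]

def map_region(region):
    """Map: per pixel in the region, emit (luma, 1)."""
    return [(luma(p), 1) for p in region]

def reduce_histogram(pairs):
    """Reduce: gather every value for a key first, then sum each key's list."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}

def combine_histogram(pairs):
    """Combine: fold each value into a running per-key total as it arrives."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def histogram(pixels, n, reducer):
    """Run Split, then Map over each region, then the chosen reducer."""
    pairs = [pair for region in split(pixels, n) for pair in map_region(region)]
    return reducer(pairs)

# Tiny made-up image: two black pixels, one white, two pure red.
image = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 0), (255, 0, 0)]
by_reduce = histogram(image, 2, reduce_histogram)
by_combine = histogram(image, 2, combine_histogram)
print(by_reduce == by_combine)
```

Both variants produce the same histogram here because integer addition is associative and commutative, so partial sums can be merged in any order. That property is what lets a combine step run in parallel within one key; floating-point sums are not exactly associative, which is one reason the feedback about floating point math matters.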