Previous | Next --- Slide 16 of 50
Back to Lecture Thumbnails
matteosantamaria

If I had to perform a scan-like operation in my code, I certainly wouldn't be clever enough to implement it this way. It goes to show that relying on library implementations can have huge performance benefits.

rtan21

Why do we drop/zero-out the last element? To me it looks like we are discarding the last part of the forward sweep that we did before the reverse sweep step. Is that out of necessity due to the nature of the algorithm?

abraoliv

I know that the professor mentioned what the dashed black line in the middle indicated, but can someone remind me? Also, I dont understand the significance of the dotted red lines and why they seem to point zeros to other zeros.

tonycai

The dotted black line indicates that the algorithm runs in two stages, so the span is 2*lg(n). I believe the dotted red just differentiates indicates that the element is moved to a different index in the next iteration (while the solid red line indicates the the element is sum of 2 elements from a previous iteration).

ecb11

This set of parallelized algorithms seems to be something that might span lengthily the development of parallel hardware (GPUs, etc.), and I'm curious if there's a field of research dedicated to parallel algorithms and if so where they hold the most potential (I'd imagine in large data set manipulations). This one in particular showcases that we'd still need accompanying large computing power (e.g. processors to justify O(NlogN) work over the original O(N)), but it could be worth it for big data applications (I'd imagine).

student1

The entire process behaves by gathering groups of information and deliver the information in an interleave manner such that the final result is exactly what a scan function do. However, my question is that why the last element is left out? Is this the intention?

juliewang

@student1 I think the top section is showing inclusive scan (last item included) while the bottom section is showing exclusive scan (last item is left out)

juliewang

Scratch that earlier comment, the 0 is added so that when you do the "down sweep" you can get 0, a0, a0-1, a0-2, ... a0-14. It's also noted on this slide that this is exclusive scan, we want to include everything up to but not including a[i]

jasonalouda

Are we expected in this class to be able to come up with such clever implementations of work-efficient parallel code?

paavni

If this talks about the exclusive scan, why do we need to calculate a8-15 in the first place. Isn't that sort of redundant as we are not using it anywhere.

leo

@paavni we want to calculate/scan the entire 16 elements. Having the scan be exclusive means not including the current element index in our computation. The 0th index includes elements up to and not including 0 (0 by default), 1st index includes elements up to and not including 1 (a_0 value), etc

jtguibas

@matteosantamaria I agree wholeheartedly with you. I would have never come up with this!

A bar cat

I am wondering what are case scenarios where we would use something like an exclusive scan? I guess I don't see why we need a whole algorithm that does this.

ccheng18

On the fourth line form the bottom, why do we assign the 7th index as 0. And what are the steps to even begin thinking about these kinds of solutions? Do they just come with experience?

Please log in to leave a comment.