I think this approach would definitely help speed up convolutional layers, but the risk I've seen in my ML classes is that you can implement your model incorrectly in a way that still lets backpropagation and optimization run. The code executes, but the model is mathematically wrong. In my experience these bugs usually show up as poor learning, but I can imagine small errors in something like this going undetected and just looking like subpar accuracy. When I did the extra credit, I found out how easy it is to get a simple matrix multiplication wrong when doing these kinds of spatial optimizations. Do current libraries that provide convolutions already implement these for us? I would definitely be cautious about implementing my own, since the errors are easy to introduce and hard to catch.
tennis_player
@derrick It seems that most ML developers just use libraries such as PyTorch or TensorFlow to write their models. That is quicker for the developers and avoids some of the issues you are hinting at. The people writing these functions are the engineers who actually work on the frameworks and constantly try to get the best possible performance out of each function. Then it's up to the model developer to string all the pieces together into a working DNN.
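If someone does end up writing their own version, one way to catch the silent errors described above is to compare it against a slow but obviously correct reference on random inputs. Here is a minimal sketch in plain Python. It assumes the optimization is something like im2col (unrolling patches so the convolution becomes a matrix product), which the thread doesn't actually specify, and the function names are made up for illustration.

```python
import random

def conv2d_direct(image, kernel):
    """Reference: plain sliding-window 2D cross-correlation, no padding, stride 1."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(H - kh + 1):
        row = []
        for j in range(W - kw + 1):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

def conv2d_im2col(image, kernel):
    """Assumed 'optimized' variant: unroll each patch into a row, then do one
    matrix-vector product against the flattened kernel."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = H - kh + 1, W - kw + 1
    flat_kernel = [kernel[di][dj] for di in range(kh) for dj in range(kw)]
    # One row per output position; same (di, dj) order as flat_kernel.
    patches = [
        [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
        for i in range(out_h)
        for j in range(out_w)
    ]
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel)) for patch in patches]
    # Fold the flat result back into an out_h x out_w grid.
    return [flat_out[r * out_w:(r + 1) * out_w] for r in range(out_h)]

def max_abs_diff(a, b):
    """Largest elementwise difference; also fails loudly on a shape mismatch."""
    assert len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Non-square shapes on purpose: they expose swapped row/column indices
# that a square test case would hide.
rng = random.Random(0)
image = random_matrix(6, 7, rng)
kernel = random_matrix(3, 2, rng)
diff = max_abs_diff(conv2d_direct(image, kernel), conv2d_im2col(image, kernel))
assert diff < 1e-9, f"implementations disagree by {diff}"
```

The same idea carries over to a real framework: run your custom layer and the library's built-in convolution on the same random tensors and check that the outputs (and ideally the gradients) match within a small tolerance.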