Villane wrote:
It would be interesting to use this in ScalaBox2D, except I think it's hard to find code to parallelize in a fine-grained way in Box2D. What do you think?
1) This Scala wrapper is freaking awesome. I've avoided OpenCL up until now because it was going to cost me more in development time than it would save me in compute time, but the wrapper is unbelievably straightforward.
2) We discussed parallelizing Box2D computations a while back, and I believe the way it worked out was as follows:
- Each island can be stepped independently without much trouble, so that's easy
- Each island can be solved independently, so that's also easy
- Even within an island, we may be able to get away with solving each (possibly conflicting) contact independently and simply ignoring the fact that it's happening concurrently: the solver is pretty much agnostic to the order of contact evaluation, and using a slightly stale value as an input is not going to make a major difference. This was never shown to lead to a speedup. I think I tested the behavior and it seemed okay, but I couldn't get any speed bump without some massive restructuring of the Java code, because of thread creation overhead.
- The old broadphase was pretty much impossible to parallelize
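For the island-level part, here is a minimal sketch of what I have in mind. It assumes a made-up Island class (not the real Box2D one) whose solve only touches its own bodies, and IslandStepper is a hypothetical name too. The thread pool is created once and reused on every step, which is the kind of restructuring that would avoid paying for thread creation each time:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class IslandStepper {
    /** Hypothetical stand-in for an island: positions and velocities along one axis. */
    static final class Island {
        final double[] pos;
        final double[] vel;

        Island(double[] pos, double[] vel) {
            this.pos = pos;
            this.vel = vel;
        }

        /** Touches only this island's own arrays, so islands never conflict. */
        void solve(double dt) {
            for (int i = 0; i < pos.length; i++) {
                pos[i] += vel[i] * dt;
            }
        }
    }

    // Created once and reused for every step, so no threads are spawned per step.
    private final ExecutorService pool;

    IslandStepper(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Solves all islands concurrently and returns once every island is done. */
    void step(List<Island> islands, double dt)
            throws InterruptedException, ExecutionException {
        List<Callable<Void>> tasks = new ArrayList<>();
        for (Island island : islands) {
            tasks.add(() -> {
                island.solve(dt);
                return null;
            });
        }
        // invokeAll blocks until all tasks finish; get() rethrows any task failure.
        for (Future<Void> done : pool.invokeAll(tasks)) {
            done.get();
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

You would build one IslandStepper up front, call step(islands, dt) once per world step, and call shutdown() when the world goes away. Whether this beats the single-threaded loop depends on how much work each island carries, and I have not measured it.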
This is probably my OpenCL ignorance shining through, but my main question is: how tiny does each computation have to be in order to get a benefit from pushing (many copies of) it through that pipeline? I've always been under the impression that graphics card processing is only useful for situations where you have a simple set of vector operations that needs to be applied thousands of times, without much actual logic going on. If that's right, we're not likely to see very much improvement in Box2D.
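To make that concrete, this is the shape of work I'm picturing as a good fit, written as plain Java over flat arrays. FlatIntegrate and its parameters are made up for illustration and are not from Box2D or the wrapper. Each index is independent and the loop body has no branching, so as I understand it the body is what would become the kernel, run once per element:

```java
final class FlatIntegrate {
    /**
     * Advances every body by one time step.
     * Element i reads and writes only index i, so iterations can run in any order.
     */
    static void integrate(float[] x, float[] y, float[] vx, float[] vy, float dt) {
        for (int i = 0; i < x.length; i++) {
            x[i] += vx[i] * dt;
            y[i] += vy[i] * dt;
        }
    }
}
```

The contact solver looks nothing like this, since each contact reads and writes bodies that it shares with other contacts, and that is where my doubt comes from.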
I wouldn't be too eager to introduce this dependency in ScalaBox2D, even if it did help. It ultimately relies on native code that can't run on a lot of machines, which means a whole lot of headaches: no unsigned applets, for one, plus a messier build, and probably even weirder stuff if you want to support machines without OpenCL abilities.