I disagree; AMD and nVidia appear to have optimized their cards differently (directly comparing Kepler and Southern Islands). AMD uses wider memory buses to drive higher resolutions more reliably, and their chips are somewhat larger and clocked lower than nVidia's. nVidia appears to have focused on a high-clocked chip, possibly to cut silicon usage and other production costs.
Compare the 7970 and 680.
Cores: 2048 vs 1536 (33.3% more on 7970)
Core clock: 925 vs 1058 (boost) (GTX 680 ~14% faster clocked)
Memory bus: 384-bit vs 256-bit (50% more on 7970)
Memory clock (effective): 5.5GHz vs 6GHz (GTX 680 ~9% faster)
Memory bandwidth: 264GB/s vs 192GB/s (7970 ~37.5% more)
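The derived figures above follow directly from the raw specs; a quick sanity check (bandwidth in GB/s is bus width in bits times effective memory rate in GT/s, divided by 8):

```python
# Sanity-check the derived percentages and bandwidth figures above.
specs = {
    "HD 7970": {"cores": 2048, "clock_ghz": 0.925, "bus_bits": 384, "mem_gtps": 5.5},
    "GTX 680": {"cores": 1536, "clock_ghz": 1.058, "bus_bits": 256, "mem_gtps": 6.0},
}

for name, s in specs.items():
    bw = s["bus_bits"] * s["mem_gtps"] / 8  # bits/transfer * GT/s / 8 = GB/s
    print(f"{name}: {bw:.0f} GB/s")

def advantage_pct(a, b):
    return (a / b - 1) * 100

print(f"Core count advantage (7970): {advantage_pct(2048, 1536):.1f}%")        # ~33.3%
print(f"Bandwidth advantage (7970): {advantage_pct(384*5.5, 256*6.0):.1f}%")   # 37.5%
```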
As far as raw performance (theoretical GFLOPS) goes, the 7970 beats the 680 by roughly 17% (comparing peaks at the 680's boost clock), yet in certain games it fell behind at lower resolutions. I can only assume the difference here is caused by a) drivers and b) the architecture itself, possibly the ROP count being the same but faster clocked on the 680. At higher resolutions the advantage of the 384-bit memory bus over the 256-bit one becomes apparent: the 7970 handles large amounts of VRAM traffic more effectively.
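For reference, the usual back-of-the-envelope peak assumes 2 FLOPs per core per cycle (one fused multiply-add); this is a rough theoretical ceiling, not measured game performance:

```python
# Rough single-precision peak: cores x 2 FLOPs/cycle (FMA) x clock in GHz.
def peak_gflops(cores, clock_ghz):
    return cores * 2 * clock_ghz

gflops_7970 = peak_gflops(2048, 0.925)   # ~3789 GFLOPS
gflops_680  = peak_gflops(1536, 1.058)   # ~3250 GFLOPS (boost clock)
print(f"7970 advantage: {(gflops_7970 / gflops_680 - 1) * 100:.1f}%")
```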
Overall I would consider the cards pretty much even in performance, trading blows with each other in different games at different settings.
But the nVidia card has consistently lower power consumption for similar performance, uses less silicon, and gets by with a more modest memory subsystem, so I would actually consider the GTX 680 the more efficient card overall: it uses less to deliver pretty much the same performance at 1920x1080.
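A crude perf-per-watt comparison makes the same point, using theoretical FP32 peaks and the published board-power ratings (250 W for the 7970, 195 W for the 680; actual gaming draw differs, so treat this as a rough illustration only):

```python
# Crude efficiency estimate: theoretical FP32 peak divided by board power (TDP).
# TDP figures are the published ratings, not measured gaming power draw.
cards = {
    "HD 7970": {"gflops": 2048 * 2 * 0.925, "tdp_w": 250},
    "GTX 680": {"gflops": 1536 * 2 * 1.058, "tdp_w": 195},
}
for name, c in cards.items():
    print(f"{name}: {c['gflops'] / c['tdp_w']:.1f} GFLOPS/W")
```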
As far as PhysX goes, although I don't completely agree with it being proprietary, I do believe that running it on a GPU is better latency-wise, and possibly better for the general speed of the operations. Although you are cutting into your graphics horsepower, you can run it out of VRAM and take advantage of the massive parallelization the GPU offers with its ~1.5k cores, compared to the 4 cores of an i5. It really comes down to the individual calculations: you would have to look at them case by case and decide which you can parallelize and which are better processed serially on the CPU. Another advantage is that you can do the physics calculations as part of the graphics work without having to exchange a lot of data between the GPU and CPU. Also, since the scene data (before rendering) already lives on the GPU, it is best edited on the GPU, so you don't need to transfer it from VRAM to the CPU and back.
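A toy example of why simple physics maps so well to many cores: each particle's update is independent, so thousands can be stepped in lockstep. The vectorized NumPy version below mimics the data-parallel style a GPU kernel would use, while the loop mimics serial per-particle CPU work. (This is a hypothetical sketch for illustration, not how PhysX itself is implemented.)

```python
import numpy as np

G = -9.81  # gravity, m/s^2

def step_serial(pos, vel, dt):
    """Serial style: update one particle at a time, like a plain CPU loop."""
    for i in range(len(pos)):
        vel[i, 1] += G * dt          # gravity acts on the y component
        pos[i] += vel[i] * dt        # Euler position update
    return pos, vel

def step_parallel(pos, vel, dt):
    """Data-parallel style: update every particle at once, like a GPU kernel."""
    vel[:, 1] += G * dt
    pos += vel * dt
    return pos, vel
```

Both produce identical results; the difference is that the second form exposes the per-particle independence, which is exactly what a GPU's thousands of cores can exploit.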
Note that there are obviously a lot of specifics I don't know about; the optimization of these systems usually comes down to trial and error, and I have not analyzed the process completely. But, as I said, there are very legitimate advantages to doing physics calculations on a GPU rather than a CPU.
One of the things that struck me about what nVidia is doing with its proprietary software is that it is integrated very deeply into the game engines themselves, which is a somewhat different approach to the problem than AMD's. I feel that both are legitimate, but AMD's approach gives the game developer more freedom, at the expense of a hugely increased workload on their (the game devs') part, while nVidia's approach lets the developer save a lot of time when making the game and allows nVidia to optimize the process at any later point, of course at the expense of more work on nVidia's part.