There is a general movement at AMD towards the enterprise side of things, and their consumer cards are pretty sweet deals for people who need raw computing power. All AMD CPUs are unlocked, and all AMD CPUs and chipsets have full hardware virtualization support. That was traditionally something for the enterprise server market, but as Red Hat has shown with nVidia cards running on Intel VT-d enabled platforms, there is also the opportunity for considerable graphics acceleration, at least in Linux, which scales well.
Scaling was never a feature of MS-Windows, so people had to buy ever more absurdly overpowered hardware to run pretty basic stuff with acceptable performance. DirectX was not the only problem, but DirectX was always a lie. Microsoft had to come up with a "direct" API to answer the Core APIs Apple got when Steve Jobs brought NeXTSTEP, built on BSD and Mach, back to Apple, where it became Mac OS X. That is a much more "direct" API than DirectX ever was, and everyone who has done multitrack audio production on both a PC and a Mac will tell you that DirectX never was "direct"; in fact, everyone used Steinberg's ASIO API for low-latency sound card access instead of DirectX. The same goes for video: DirectX is the most "indirect" graphics API in existence on any platform. It's simply a disgrace, but it was pushed anyway because Microsoft didn't allow any alternative on Windows.
Lately, AMD has been very busy bringing enterprise server technology to all of their products, and improving the performance of various subsystems along the way. An example is OpenCL. AMD GPUs have a lot of compute cores, and AMD graphics cards have a wide memory bus. AMD used this to let users of a scalable system like Linux integrate the load-balancing technology that has always been a focus of Linux into those AMD products, and that has worked: in Linux, using existing open source tools, you can use AMD GPUs as compute coprocessors. The problem is that this is a feature of Linux, and it requires some knowledge from the user to configure the load balancing. So AMD took that technology and made an API that basically recognizes running applications and activates profile-based load balancing for them, built on the standard open source Linux tools. That technology is also free for nVidia or Intel to use, but those manufacturers don't have as many compute cores on their graphics products and often have a narrower memory bus. Also, Intel doesn't believe in enabling hardware virtualization on consumer platforms, and nVidia doesn't help with development of the open source drivers that would enable this, sticking to CUDA for its own proprietary binary blobs. The logical next step for AMD would then be to provide an API that is basically a simple tool to manipulate the load balancing and direct hardware access, and would allow application developers to create their own system-wide performance profiles (of course not in Windows, where that is technically not possible, but in Linux it is normal). The benefit of Mantle in Windows is thus limited to circumventing the buggy and crappy DirectX, but the benefit of the Mantle API plus profile-based load balancing software optimized for AMD GPUs on a Linux operating system would be far higher.
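To make the profile-based load balancing idea concrete, here is a minimal sketch in plain Python. All names (the profile registry, the share values) are hypothetical illustrations, not any actual AMD or Linux API: a recognizer matches a running application to a profile, and the profile decides how work is split between CPU and GPU.

```python
# Hypothetical sketch of profile-based load balancing: recognize a
# running application and activate a profile that decides how its
# workload is split between CPU and GPU compute devices.
# All names are illustrative, not a real API.

from dataclasses import dataclass

@dataclass
class Profile:
    name: str
    gpu_share: float  # fraction of the workload sent to the GPU(s)

# Profile registry keyed on process name, like the app-recognition
# layer described above would maintain.
PROFILES = {
    "blender": Profile("3d-render", gpu_share=0.9),
    "ffmpeg": Profile("video-encode", gpu_share=0.6),
}
DEFAULT = Profile("generic", gpu_share=0.0)

def pick_profile(process_name: str) -> Profile:
    """Recognize a running application and return its profile."""
    return PROFILES.get(process_name, DEFAULT)

def split_work(process_name: str, total_items: int) -> tuple:
    """Return (gpu_items, cpu_items) for a workload of total_items."""
    p = pick_profile(process_name)
    gpu_items = int(total_items * p.gpu_share)
    return gpu_items, total_items - gpu_items

print(split_work("blender", 1000))  # recognized app: mostly GPU
print(split_work("vim", 1000))      # unknown app: CPU only
```

The point of the sketch is only the shape of the mechanism: recognition plus a per-application policy, instead of asking each user to hand-tune the balancing themselves.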
That would allow AMD not only to improve graphics performance enormously, but also to improve compute performance in general, and, on a Linux system, to provide a cost-effective way for users to scale up their system using GPUs. I'm running an array of 5 AMD GPUs right now in one of my AMD Opteron-based servers for an HPC test project, and I'm very likely going to add at least 5 extra cards to that machine. It scales very well for compute performance, and that's without AMD-optimized software, just the traditional Linux tools. I did trials with nVidia first, and then with Intel: Intel beats the AMD solution in raw power, but not in flexibility and overall HPC performance; nVidia didn't perform that well, because it's not flexible at all and the compute performance was not what I had expected.
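Scaling compute across an array of cards mostly comes down to partitioning a data-parallel job. A trivial sketch of the idea in plain Python (no GPU libraries; in a real setup each chunk would be dispatched to a device via something like OpenCL, but the partitioning logic is the same):

```python
# Sketch: static round-robin partitioning of a data-parallel job
# across N compute devices (e.g. an array of 5 GPUs). Pure Python
# stand-in for illustration; device names are hypothetical.

def partition(items: list, n_devices: int) -> list:
    """Split items into n_devices near-equal chunks, one per device."""
    chunks = [[] for _ in range(n_devices)]
    for i, item in enumerate(items):
        chunks[i % n_devices].append(item)  # round-robin assignment
    return chunks

data = list(range(10))
for dev, chunk in enumerate(partition(data, 5)):
    print(f"gpu{dev}: {chunk}")
```

Adding more cards then just means more chunks processed in parallel, which is why this kind of setup scales so well for embarrassingly parallel compute loads.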
As for ARM platforms, very little groundbreaking can be expected from nVidia and Qualcomm, for instance, but take a look at some other ARM licensees that are using the Mali-T624 and similar GPUs on their SoCs: these are DirectX 11 compatible! Yes, you read that right: DirectX 11 compatibility in a 17 USD SoC with up to 8 application processor cores and up to 8 GPU cores. See it as Wine built into the hardware. And all the leading development for these is open source. That means that soon these SoCs will be able to run whatever a user may want to throw at them, allowing users a seamless transition from the legacy platform and applications to the new ones on very cheap devices. These SoCs come to market in Q1 2014, so that's very near.
So of course AMD, which is not an ARM licensee, wants a piece of that, and they might just have the piece of the puzzle that makes it all possible: if they deliver the tools needed to profile load balancing on multiple platforms and to circumvent DirectX, they have a similar answer ready to go for the x86 platform. Full hardware virtualization on all of their x86 products means that people can virtualize ARM with great performance on their AMD machines and use ARM-focused applications, while at the same time running legacy DirectX stuff, running new Windows-based x86 stuff with almost legacy-Linux-grade performance, and running modern Linux stuff with incredible performance, much faster than ARM. ARM runs fast even though it is basically much less potent hardware, because it isn't held back by closed source, proprietary legacy crap the way x86 is.
nVidia doesn't follow the same path, because they are an ARM licensee, and they think they can persuade consumers to buy both their products for the x86 platform and their ARM products, setting up honey traps like rebranded open source streaming technology to justify that.
We'll see what approach has the most success in the future...