An idea I had about a CPU, hard to explain

For anyone interested in this subject, I recommend reading through the following blogs and articles. They will help you understand the key differences in modern x86 CPU architecture and the challenges that game developers using object-oriented design face when working with them:

  1. http://www.overclockers.com/bulldozer-architecture-explained/
  2. http://www.realworldtech.com/bulldozer/3/
  3. http://www.realworldtech.com/sandy-bridge/6/
  4. http://community.amd.com/community/a...he-new-flex-fp
  5. http://gamedevelopment.tutsplus.com/articles/what-is-data-oriented-game-engine-design--cms-21052
 

To my knowledge, the FX parts have 2/3/4 modules with 2 cores each, making up the 4xxx/6xxx/8xxx lineup, and each pair of cores shares an FPU and other resources.

So it's hypothetically possible for them to share the same load and act as one?

Ehh,... 

Read this http://www.reddit.com/r/buildapc/comments/1e8226/discussion_amds_module_architecture_the_fx_8350/

I think I get it. Thanks for the link

I don't know if anyone has posted this information as I have not read through the thread, but this is what you are describing.

 

http://wccftech.com/amd-invest-cpu-ipc-visc-soft-machines/

Skimmed through it. And yes, I think you're the only person that got it. The computer would use the virtual core, which uses multiple physical cores to function. That way, if an application only uses 4 cores, the CPU can present 4 virtual cores, each using 2 physical cores. I'm actually glad this is being made.

The problem is that that implementation would be rather difficult and would probably require a lot of finagling. I am not sure exactly what would need to be optimized for it, but I doubt that we would see this on today's platforms with just a BIOS update. It would likely require the motherboard, the CPU, the API, and the software to all be in coordination in order to pull it off, somewhat like HSA. AMD seems to be OK with attempting things like this, which I personally like to see. How well it will work, and whether or not it is even worth it versus standard multi-core optimization and HSA (which already has the advantage of being a real thing), remains to be seen. Definitely something to keep an eye on, and I am always excited about new technologies, so we will see.

This can make things easier for the consumer, because the CPU will always be able to use all of its resources if it has to. Optimization wouldn't be a bottleneck, because of the virtual cores. If an application is optimized for 2 cores and I have an 8-core, the CPU could make 2 virtual cores and use 4 physical cores in each virtual core.

I have to say I'm pretty hyped about this

What I am saying is that with this technology, I doubt that it would perform better than if the software were optimized for the number of cores that the CPU in question has. The technology itself will have overhead. Considering that most software devs have realized that multi-core optimization is becoming a must, there is now an arms race of sorts between multi-core optimization and the soft-core technology (or whatever it is called). If everyone uses multi-core optimization and it becomes easy enough to implement in future software, then the need for soft cores goes away. On the other hand, if soft cores become a standard, then software devs wouldn't need to bother with multi-core optimization (assuming the technology is effective enough to make that optimization unnecessary). What I am saying is that the future is up in the air right now. We will see how it goes.

One thing is for certain, though: something like soft cores would be much better than focusing on single-core performance. Now we just need to make sure that the various tasks being handled by the soft cores are divided up effectively to allow for proper multitasking.

The way I see it, no matter how the application is optimized, the soft-core-optimized CPU would adjust itself to however the application is optimized, so in the end soft-core CPUs would win the race. I'm just guessing, though.

That's not really how it works. It's an interesting idea if it could be implemented, but it would require a whole lot of re-engineering. Also, a logical core isn't exactly an emulated core. Linustechtips made a great video about this.

But basically, if you imagine your mouth to be a core and your hands to be threads, hyperthreading would basically be you being able to shove food into your mouth with two hands instead of one, allowing you to keep eating if your mouth finishes chewing before one hand can provide more food.

I don't think you understand. Your analogy pertains to hyper-threading: the mouth being the physical core and the hands being the threads. What I'm going for is having two mouths and one hand, i.e. two physical cores working ONE thread.

The link 1920.1080p.1280.720p posted describes this in good detail, if you read it: two physical cores emulate ONE virtual core, which works one thread.

Doesn't the cell processor work this way?

Processors can only process one thread per core (in Intel's case, with Hyper-Threading, it's two threads per core). Applications are optimized to use a certain number of threads. That means if the application you are using is optimized for 2 threads, then you're going to have 4 cores that aren't being used. When you're playing video games that are CPU-bound, that can be a problem, since the game won't use all your cores.

@OP: I kind of understand where you are coming from, but you would probably have to make your own instruction set for it (one that is also compatible with x86/x64, to compete with other systems and work in the same field).

One thing that I would cringe at is power usage, so you would have to optimize for that.

Also, you would have to think about thermals. Since you would be running so many cores for one or two threads at once, you would probably have to clock it around 1.5-2 GHz rather than 3-4 GHz, since it might melt otherwise.

Let's just get the following out of the way:

F!st p0st.

Cool. With that done, I wouldn't actually expect such a scheme to improve performance much if at all for single-threaded tasks. The reason for this is as follows:

In code, there are several different sorts of parallelism. What you first refer to is known as thread-level parallelism, or a CPU's ability to execute multiple independent threads at the same time. This is effectively what makes an 8-core an 8-core: its ability to process 8 different threads, or streams of instructions, at the same time. Moving on, there's another important form of parallelism I should mention: instruction-level parallelism. Inside each thread, your CPU is executing a sequence of instructions, some of which can actually be run at the same time as long as there isn't some sort of data dependency (i.e. as long as one instruction isn't going to modify another instruction's data). So, for example, if you're adding A to B, then adding B to C, you'd end up with a data dependency and the code could only be executed serially. On the other hand, if you're adding A to B, and C to D, the CPU could execute those at the same time.

Basically, what you're trying to do is have the CPU break down an individual thread and execute it across multiple cores. That idea does, in and of itself, make sense. If at all possible, why not try to allow more instructions to run at the same time? Because of scheduling. See, in normal code, there doesn't exist enough parallelism to really fill up even 2 Piledriver cores' worth of instructions. Also, when you factor in the overhead involved in running the scheduling for CPUs outside of hardware, things would get ugly very quickly. The problem with BD/Piledriver is that the CPU itself does a fairly poor job of feeding its execution resources, whether it's through caching problems or a less-than-optimal instruction scheduler. While this doesn't matter for raw throughput (and is thus the reason I'm rocking an FX 8350), it does come into play when trying to make individual threads faster. The 8350 simply can't extract enough parallelism on its own without a little help in the form of optimization.

There is, however, a project that can do something like what you propose. Look up something called OpenMP. It's a framework/set of compiler features that allows modern compilers (i.e. not MSVC) to automatically generate thread-parallel code with very little effort on the part of the programmer. The only problem is that, due to the overhead, it only starts to make sense when you're crunching lots of numbers (which is why it's used for machine-learning software like LibSVM).

Side note: the reason that Intel runs multiple threads on each of its CPU's cores is that the resources are otherwise wasted by poorly optimized code or lack of parallelism. In other words, because they make their cores so wide, they can fit an entire other thread's worth of instructions per core in just the unused resources.

Quick edit: I'd like to see a CPU with a dynamic number of threads all implemented in hardware. It'd have just one insanely wide core and the OS could scale up or down the number of threads based on load.

Great answer! I'll look into what you referred me to. And your idea of a really wide CPU core would also fit the bill.

 

Guys can you help me out? I'm new here and I'm not too sure what you are talking about. I searched it on google images and it didn't explain a whole lot to me. How do these get you better video game performance? I just want to play BF4 with FPS above 30. My resolution is 24"