Sunday, 3 February 2013


I've been looking at the specs mentioned by a hacker on NeoGAF. They seem to be quite underpowered. But everyone who knows about my Nintendoptimism knows what I'm about to say next. I'll look at this from both the CPU and GPU standpoints, starting with the CPU.


I will briefly list the CPU specs and then analyze them.
  • 1.2 GHz clock rate
  • Triple-core IBM PowerPC architecture on a 45 nm process
  • Out-of-order execution (OoOE)
  • 1.5 MB shared L3 cache
  • 256 KB L2 cache for cores 0 and 1, 1 MB L2 cache for core 2
OK. Now let me say this: raw CPU clock speed is fast going out of fashion, the same way raw polygon count no longer matters for GPUs. The route Nintendo has chosen this time around is an enlarged cache. What can this do for CPU performance?
To understand that, you will need to know more about OoOE (out-of-order execution).

According to Wikipedia:
In computer engineering, out-of-order execution (OoOE or OOE) is a paradigm used in most high-performance microprocessors to make use of instruction cycles that would otherwise be wasted by a certain type of costly delay. In this paradigm, a processor executes instructions in an order governed by the availability of input data, rather than by their original order in a program. In doing so, the processor can avoid being idle while data is retrieved for the next instruction in a program, processing instead the next instructions which are able to run immediately.

In basic terms, an out-of-order processor does not sit idle waiting for one instruction's data before starting the next; it executes whichever instructions already have their inputs ready. The order of execution is shown below, versus in-order execution.

In-order processors

  • Instruction fetch.
  • If input operands are available (in registers, for instance), the instruction is dispatched to the appropriate functional unit. If one or more operands are unavailable during the current clock cycle (generally because they are being fetched from memory), the processor stalls until they are available.
  • The instruction is executed by the appropriate functional unit.
  • The functional unit writes the results back to the register file.

Out-of-order processors

  • Instruction fetch.
  • Instruction dispatch to an instruction queue (also called an instruction buffer or reservation stations).
  • The instruction waits in the queue until its input operands are available. The instruction is then allowed to leave the queue ahead of earlier, older instructions.
  • The instruction is issued to the appropriate functional unit and executed by that unit.
  • The results are queued.
  • Only after all older instructions have had their results written back to the register file is this result written back to the register file. This is called the graduation or retire stage.

So if an in-order CPU has A, B, C, D, E to process, it will first process A and then try to process B. If the operands B needs are not yet available, the whole pipeline stalls.
An out-of-order CPU instead parks waiting instructions in a queue until their operands arrive, and keeps executing whatever is ready in the meantime. So if you were playing COD and one instruction was stuck waiting on data from memory, the out-of-order CPU would keep working through the other instructions, as many as its buffers can hold, instead of wasting cycles the way in-order execution does.
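To make the difference concrete, here is a toy sketch in Python. This is my own simplified model, not how real hardware works: each instruction gets a latency and a list of dependencies, and we compare when everything finishes under the two scheduling styles (ignoring issue width, buffer sizes, and so on).

```python
# Toy model: each instruction has a latency (cycles until its result is
# ready) and a list of instructions it depends on. We compute when each
# instruction finishes under in-order vs out-of-order scheduling.

def finish_times(deps, latency, in_order):
    done = {}          # instruction -> cycle its result is ready
    last_issue = 0     # cycle the previous instruction issued (in-order only)
    for instr in sorted(deps):  # program order: A, B, C, ...
        ready = max((done[d] for d in deps[instr]), default=0)
        # In-order: cannot issue before the previous instruction issued.
        issue = max(ready, last_issue) if in_order else ready
        done[instr] = issue + latency[instr]
        if in_order:
            last_issue = issue
    return done

# B depends on A (a slow load); C is independent; D depends on C.
deps    = {"A": [], "B": ["A"], "C": [], "D": ["C"]}
latency = {"A": 10, "B": 1, "C": 1, "D": 5}

print(max(finish_times(deps, latency, in_order=True).values()))   # 16
print(max(finish_times(deps, latency, in_order=False).values()))  # 11
```

In-order, C and D sit behind A's slow load even though they don't need its result; out-of-order, they run underneath it, so the whole batch finishes earlier.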

Unlike the in-order CPUs in the Xbox 360 and PS3, the Wii U's CPU executes out of order, and that gives it a big efficiency advantage against the likes of the Xbox 360. Developers will need time to get used to this, though.
Also note that the Wii U is reported to have two CPUs. The first is what I described above, while the second is an ARM processor set aside for the OS.


Everyone knows the Wii U GPU is beefy. It is reported to have 400 stream processors, versus 48 shader units in the Xbox 360. In addition, each stream processor can handle more advanced mathematical operations, so each one does more work per clock than its counterpart on the 360.
Where this gives Nintendo an advantage is that the GPU can share the CPU's burden. Because it can run general math, the GPU can be used to handle physics and AI workloads. The technique is known as compute shading; note that this is a DirectX 11-level feature.

What's more, the GPU has a tessellation unit. This is a newer stage in the rendering pipeline, sitting alongside geometry shading, and what it does is subdivide polygons into smaller triangles. This helps the GPU generate highly detailed models from low-polygon meshes. The Xbox 360 has no comparable programmable tessellation support. Geometry shaders were introduced in DirectX 10, and full hardware tessellation was emphasized in DirectX 11.

I believe that if we were to split the Wii U GPU in half, say 50% for compute shading and 50% for graphics, that would leave developers with about 200 stream processors for rendering, though I doubt compute will take that much. Each stream processor is estimated at 2 FLOPS per clock, which at 550 MHz is about 1.1 GFLOPS per processor. In all, the GPU comes to an estimated 440 GFLOPS, compared to the Xbox 360's 240 GFLOPS. (My GFLOPS figure follows AMD's method: 2 FLOPS × 400 SPUs × 550 MHz.)
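Here is that arithmetic written out as a small Python snippet. The shader counts and clocks are the rumoured figures above, not confirmed specs; the 360 needs a different per-clock factor because its 48 Xenos ALUs are 5-wide vector units (5 lanes × 2 ops for a multiply-add).

```python
# Peak GFLOPS = (FLOPS per ALU per clock) * (ALU count) * (clock in MHz) / 1000.
# The default of 2 FLOPS/clock assumes one fused multiply-add per cycle.

def gflops(alu_count, clock_mhz, flops_per_clock=2):
    return flops_per_clock * alu_count * clock_mhz / 1000.0

wii_u    = gflops(400, 550)                     # rumoured Wii U figures
xbox_360 = gflops(48, 500, flops_per_clock=10)  # Xenos: 5-wide ALUs * 2 (MADD)
print(wii_u, xbox_360)                          # 440.0 240.0
```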


Yes, the Wii U can run Unreal Engine 4. That's not new info. But why did Epic not port it directly?

I think it has to do with Shader Model 5.0. The Wii U is reported to support Shader Model 4.1. Consider this: UE3 was created for DirectX 9 and then upscaled to DirectX 10.1, but UE4 was built for DirectX 11. If it were downscaled to the Wii U, certain DirectX 11-exclusive features would be absent, and so it wouldn't be much different from UE3.
Having said this, I must add: DON'T THINK THAT UNREAL ENGINE 3 ON THE WII U IS ON THE LEVEL OF THE XBOX 360. IT'S DEVASTATINGLY MORE POWERFUL. You can point to the Wii U tech demo. No 360 game can do that, and Epic claims it was just the tip of the iceberg.
Versus the next Xbox and the PS4, yes, from a technical standpoint they will be more powerful. But from a visual standpoint, there won't be much of a difference.
So if the PS4 displays the stunning Elemental demo Epic showed us at E3 2012, then the Wii U should land somewhere around the Samaritan demo. The difference in visuals will not be very noticeable.

For the record: I'm an Android developer, so I talk to a few people about these matters. The Wii U is not Android, but it will support Unity, so I could port my games there once I get a license. But that's not relevant here. These specs are not confirmed; they are theoretical, based on information I have gathered from all over the Internet.

Tuesday, 13 December 2011

How Much Graphics Can the 3DS Pump Out?

Well now. This has been quite a debate for many. People have variously concluded that the 3DS is on par with either the PSP or the original Xbox. I stumbled across something on Wikipedia that seemed to confirm my calculations.
The GameCube pumped out about 40M polygons (triangles) per second; at least, that's what Resident Evil 4 was running at. I also calculate the Wii to pump out some 60M. This holds up, given that the Wii was scaled up from the GameCube by a factor of 1.5.

How did I get these figures? Easy. To get the number of polygons, divide the fill rate by 32, which assumes 32-pixel triangles (that is, each triangle is made up of 32 pixels on average). The Wii's fill rate is 1944 megapixels, which gives about 60.75M polygons. And see the funny thing here: multiply the GameCube's 40M by 1.5 and see how many polygons come out. About 60M.
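Here is that rule of thumb as a quick Python check. The 32-pixel average triangle is my assumption, and the fill-rate figures are the unofficial ones quoted above.

```python
PIXELS_PER_TRIANGLE = 32  # assumed average size of an on-screen triangle

def mtriangles_per_sec(fill_rate_mpixels):
    """Convert a fill rate in megapixels/sec into millions of triangles/sec."""
    return fill_rate_mpixels / PIXELS_PER_TRIANGLE

wii = mtriangles_per_sec(1944)  # Wii fill rate: 1944 megapixels/sec
print(wii)                      # 60.75
print(wii / 1.5)                # 40.5 -- right about the GameCube's ~40M
```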

I tried the same thing with the 3DS, and I arrived at some astonishing figures. The 3DS fill rate is currently put at about 1.6 billion pixels, or 1600 megapixels. I saw this on the Internet, but I also calculated it myself. Fill rate is the GPU clock rate multiplied by the number of pipelines. The 3DS has 4 pixel pipelines and 4 vertex pipelines, so 200 MHz × (4 pixel pipelines + 4 vertex pipelines) gives us 1600 megapixels. Divide this by 32 pixels and you arrive at 50M polygons. It may seem mythical, since I don't even own a 3DS and haven't studied computer science. But I checked it against another source.

On Wikipedia, the 3DS GPU was stated to pump out 40M triangles (polygons) per second at 100 MHz. I got some information that the 3DS GPU was downclocked from 200 MHz to 133 MHz. So I did some more calculations. Let's see how many polygons the GPU can pump out per MHz: 40M / 100 MHz gives us 400,000 polygons. Now multiply that by 133 MHz. That gives us 53.2M polygons.
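Both 3DS estimates, side by side in Python. The pipeline counts, clock speeds, and the 40M-at-100MHz baseline are all the unconfirmed figures quoted above.

```python
# Estimate 1: fill rate from clock * pipelines, then the 32-pixel rule.
fill_rate_mpix = 200 * (4 + 4)        # 200 MHz * 8 pipelines = 1600 Mpixels/s
print(fill_rate_mpix / 32)            # 50.0 M triangles/s

# Estimate 2: scale the 40M-at-100MHz baseline up to a 133 MHz clock.
triangles_per_mhz = 40e6 / 100        # 400,000 triangles per MHz
print(triangles_per_mhz * 133 / 1e6)  # 53.2 M triangles/s
```

The two independent estimates land within about 6% of each other, which is why I trust the ballpark.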

That means the 3DS stands firmly between the GameCube and the Wii in polygon count. But it will readily pump out better visuals than the Wii, because it has its own shader unit that helps it create shaders and textures approaching Xbox 360 caliber. Capcom have already stated that they can produce near-Resident Evil 5 visuals on the 3DS. That's quite some power packed into a little glossy box.