1.31 FPS tests and CPU vs GPU

lutorm


So to further investigate whether the FPS are limited by the CPU or the GPU, I did some tests changing the speed of the CPU and the GPU. Here's what I got underclocking the CPU on my system (CPU: Intel E8500 running at 4.18GHz, GPU: ATI 4870):

4.18GHz: 67 / 32 / 35 FPS for benchremagen/vehicles/antwerp

2.66GHz: 45 / 22 / 23

2.00GHz: 33 / 16 / 17

So the FPS depend strongly on the processor speed. If we calculate the FPS per GHz, we see:

4.18: 16.0 / 7.7 / 8.4

2.66: 16.9 / 8.3 / 8.6

2.00: 16.5 / 8.0 / 8.5

So FPS per GHz is essentially constant. That's exactly what you'd expect if you are totally CPU-limited.

For the GPU on the other hand, I got this when clocking down the card using ATI Tray Tools (now with high quality, everything on, for benchvehicles only):

750MHz: 19

600: 19

500: 19

400: 18

250: 18

200: 17

150: 15

100: 10

So for the GPU, we don't start seeing an effect on the frame rate until it's been slowed down to about 25% of its default speed! From this it's possible to make a simple estimate of how the per-frame work is split between the GPU and the CPU. If we assume the frame time is tCPU + tGPU, and that only tGPU scales with the GPU clock, then tCPU/tGPU = (GPUfreq1/GPUfreq2 - fps1/fps2) / (fps1/fps2 - 1). Using that formula and the numbers for 750 and 100 MHz we get tCPU/tGPU = 6.2, so for each frame the CPU works 6.2 times longer than the GPU. Or equivalently, the CPU takes 86% of the time to render a frame. And that's with all graphics effects turned on. The result for benchantwerp is practically identical.
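
In case anyone wants to plug their own numbers into that formula, here's the same estimate as a quick Python sketch (the only assumption, same as above, is that a frame costs tCPU + tGPU and only the GPU part scales with the GPU clock):

```python
# Split of per-frame work between CPU and GPU, estimated from FPS at two GPU
# clocks. Assumes frame time = t_cpu + t_gpu, with t_gpu scaling inversely
# with the GPU clock and t_cpu unaffected by it.

def cpu_gpu_ratio(fps_fast, fps_slow, clk_fast, clk_slow):
    """Return t_cpu / t_gpu at the fast clock (negative => fully GPU-limited)."""
    r = fps_fast / fps_slow   # frame-rate ratio
    g = clk_fast / clk_slow   # GPU clock ratio
    return (g - r) / (r - 1.0)

# benchvehicles numbers from above: 19 fps at 750 MHz, 10 fps at 100 MHz
ratio = cpu_gpu_ratio(19.0, 10.0, 750.0, 100.0)
print(f"t_CPU/t_GPU = {ratio:.1f}")                            # ~6.2
print(f"CPU share of frame time = {ratio / (ratio + 1):.0%}")  # ~86%
```

A negative result, like the binocular/SpeedTree case further down, just means tCPU is consistent with zero, i.e. completely GPU-limited.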

Clearly, at least for my system, these benchmarks are WAY CPU-limited. But let's look at some of the other tests and other settings.

With "best performance / no clutter" like CRS wants the benchmark results for, the tCPU/tGPU values are 3.5 for benchremagen, 10.8 for benchvehicles and 9.5 for benchantwerp. The most GPU-intensive scene is benchremagen, but they are all still CPU-limited. Adding all the extra vehicles adds more work for the CPU than the GPU.

Moving on to the shadows tests, we get 0.75 for shadows high, 0.80 for shadows low and 0.78 for shadows off. Here the scene loads the CPU and GPU pretty evenly. Not surprisingly, it's the same for the weather tests, since they use the same scene: 0.94 for clear and cloudy, and 0.69 for rain.

At this point I was wondering why the shadows test would load the video card so much more. The only thing that distinguishes that scene from the remagen one is the SpeedTrees! So let's try looking at the speedtrees through binoculars, a well-known fps killer. Using the same equation as above we actually get a negative tCPU/tGPU, because the frame rate dropped by 7.8x when the GPU frequency dropped by only 7.5x. That's not even supposed to be possible (the numerator goes negative when fps1/fps2 exceeds GPUfreq1/GPUfreq2, i.e. tCPU comes out below zero), but within the measurement errors it simply means that the frame rate is now totally, utterly limited by the video card!

The conclusion has to be that vehicles and buildings mostly load the CPU while the speedtrees mostly load the GPU, and whether the game is CPU or GPU limited depends entirely on what you look at.


Interesting, lut. Thanks for posting.

I would be interested in finding out at what point upgrading the graphics card becomes pointless in terms of performance.

i.e. # of pipes, RAM, GPU clock, bus, etc.

e.g. What is the difference going from 96 to 116 to 128 pipes at 512MB RAM? What about going from 512MB to 1GB with 128 pipes? And so on.

Interesting, lut. Thanks for posting.

I would be interested in finding out at what point upgrading the graphics card becomes pointless in terms of performance.

i.e. # of pipes, RAM, GPU clock, bus, etc.

e.g. What is the difference going from 96 to 116 to 128 pipes at 512MB RAM? What about going from 512MB to 1GB with 128 pipes? And so on.

Yeah, the memory question is an interesting one. Memory will only make a difference if you run out of texture memory, and given the comparatively low-res textures in use here, I suspect memory is a non-issue.

As far as performance goes, it scales pretty linearly with the number of shader cores ("streaming multiprocessors" in Nvidia-speak; ATI has its own name for the equivalent). GPU loads are easily parallelizable.

The bottom line is that it becomes pointless when you become CPU-limited. And if you take the SpeedTree example from my test above, it seems dubious you'll be CPU-limited in that scenario even with a 5890 or whatever. Whether it makes sense to buy a $500 video card so you can look at the trees through binoculars with high fps is another matter...

In a general bang-for-the-buck sense, I think a fast CPU is likely to matter much more in typical situations, even more so when you're online and communication, animation, ballistics, etc. for all the visible vehicles add to the CPU load seen in the offline test. I'd rather have high fps when 10 EI come running towards me with guns blazing and I need to hit them than when I'm looking for ETs 2km out through the binos...


I for one ain't upgrading; the only place left to go for me is from 116 pipes to 128 and from 512MB to 1GB, at the same clock speed. While the increase in RAM would be nice, I can't rightly justify the additional $600 it would cost.

So then the next question becomes: what are the performance increases to be had by upgrading the CPU? Dual-core vs. triple-core vs. quad-core at the same clock speed per core (let's say 2.66GHz, since that is what my dual-core is currently running at). Would I do better to go to a 2.9GHz dual-core or a 2.6GHz quad? Etc.

I for one ain't upgrading; the only place left to go for me is from 116 pipes to 128 and from 512MB to 1GB, at the same clock speed. While the increase in RAM would be nice, I can't rightly justify the additional $600 it would cost.

So then the next question becomes: what are the performance increases to be had by upgrading the CPU? Dual-core vs. triple-core vs. quad-core at the same clock speed per core (let's say 2.66GHz, since that is what my dual-core is currently running at). Would I do better to go to a 2.9GHz dual-core or a 2.6GHz quad? Etc.

Well, for the CPU, my understanding is that the game is still largely single-threaded, so more than two cores won't buy you much. You need the fastest single-core performance possible. I believe that means a Nehalem Core i7 920 and up, along with a new motherboard and RAM... and that's where I'm stuck, too.

Would I do better to go to a 2.9GHz dual-core or a 2.6GHz quad? Etc.

The game isn't optimised for multi-threading at all, so a high-clock dual-core would be more useful than a lower-clock quad.

Lut, is it possible you could run tests where you assign the game to one single core and put all the other processes on a separate core or cores? It would be interesting to see just what performance increase can be gained by dedicating a whole core to the game.


The reason I'm asking about adding cores vs. increasing clock speed is the new specs CRS gives for 1.31. The recommended spec is a 2.66GHz quad. Odd, since like you two, I have been led to believe the game is largely a single-core process.

Also, normally I have the game running on a dedicated core, and it typically runs at about 35-43% of the total capacity of the processor (with 50% being the most one core can account for).

The reason I'm asking about adding cores vs. increasing clock speed is the new specs CRS gives for 1.31. The recommended spec is a 2.66GHz quad. Odd, since like you two, I have been led to believe the game is largely a single-core process.

Also, normally I have the game running on a dedicated core, and it typically runs at about 35-43% of the total capacity of the processor (with 50% being the most one core can account for).

Yeah, I wondered about that, too. Maybe the new engine uses more threads? ww2.exe seems to run 6 threads, and I just looked at their CPU utilization with perfmon while running the different tests. Thread 0 sits at essentially 100% CPU; the other 5 are never above 4%. So no, unless this is also a debug thing, it doesn't seem like a quad core will help you in the least.
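
If you'd rather script that per-thread check than stare at perfmon, something like this rough Python sketch should work. The psutil package and the 5-second sample window are assumptions on my part; perfmon is what I actually used:

```python
# Rough per-thread CPU-usage check (sketch; assumes the psutil package is
# installed and that the game shows up as "ww2.exe"). Samples each thread's
# cumulative CPU time twice and prints the busiest threads first.
import time
import psutil

def thread_usage(name="ww2.exe", interval=5.0):
    proc = next(p for p in psutil.process_iter(["name"]) if p.info["name"] == name)
    before = {t.id: t.user_time + t.system_time for t in proc.threads()}
    time.sleep(interval)
    after = {t.id: t.user_time + t.system_time for t in proc.threads()}
    for tid in sorted(after, key=lambda i: after[i] - before.get(i, 0.0), reverse=True):
        busy = (after[tid] - before.get(tid, 0.0)) / interval
        print(f"thread {tid}: {busy:.0%} of one core")

thread_usage()
```

If the game really is single-threaded, you should see one thread near 100% of a core and the rest near zero, which matches what perfmon showed me.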

The game isn't optimised for multi-threading at all, so a high-clock dual-core would be more useful than a lower-clock quad.

Lut, is it possible you could run tests where you assign the game to one single core and put all the other processes on a separate core or cores? It would be interesting to see just what performance increase can be gained by dedicating a whole core to the game.

You can set process affinity with the Task Manager. If you restrict ww2 to run on, say, cpu0, then the OS should schedule most other tasks on the other core. I just tried that, and it seems to make no difference to performance. (The effect should be small anyway; about all it avoids is the process bouncing back and forth between cores and flushing their caches on each context switch.)
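
If you'd rather script the affinity change than click around in Task Manager, a sketch along these lines should do it (again assuming the psutil Python package, and that the game shows up as ww2.exe):

```python
# Pin the game to a single core programmatically (sketch using psutil;
# equivalent to Task Manager -> Set Affinity). Core numbering starts at 0.
import psutil

def pin_to_core(name="ww2.exe", core=0):
    for proc in psutil.process_iter(["name"]):
        if proc.info["name"] == name:
            proc.cpu_affinity([core])   # restrict this process to that one core
            print(f"pinned pid {proc.pid} to core {core}")

pin_to_core()
```

As far as I know this doesn't persist between runs, so you'd have to redo it each time you launch the game.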


It is also interesting to see the differences processors make with a good testing procedure.

For example, I have just upgraded my E8500 at 3.16GHz to an i7-920 at 2.66GHz.

I have recorded nearly identical figures to your 4GHz E8500 (I run a GTX 260).

I always believed that texture fill rate was king (paired with a decent processor, of course). This raises the question of what card is complete overkill and what's the best "sweet spot" for performance versus cost on the card.

I might look at cranking my 920 up to see what the results are.

Maybe it's the card holding me back now?


OK, I boosted my 2.66GHz i7-920 to 3GHz and it gave a 10% boost (or more) in all the benchmarks.

So it seems there is still a direct correlation with processor power, even using one of the latest CPUs with what is now a run-of-the-mill decent card.

It is also interesting to see the differences processors make with a good testing procedure.

For example, I have just upgraded my E8500 at 3.16GHz to an i7-920 at 2.66GHz.

I have recorded nearly identical figures to your 4GHz E8500 (I run a GTX 260).

Ah, cool. I'm not entirely surprised to see an E8500 @ 4.2GHz being equal to an i7 at 2.7GHz. That's about the sort of increase in performance per GHz (1.5x faster clock for clock) I'd expect.

And if the framerate went up when you overclocked it (doesn't the 920 do about 3GHz in "turbo" mode with only one core busy by default?), then you're still not completely GPU-limited.

Try the bino SpeedTree test and I bet you'll find the framerate totally independent of the processor speed.

Btw, I found this interesting snippet about nonintuitive stuff going on when mixing and matching processor/gpu: http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=3640&p=3


Yes, that's my point.

My processor is giving more and more performance, yet my card is not top of the range. So in simple terms, my belief that texture fill rate is king, as Doc said a long time ago, perhaps isn't true with today's systems.

For example, if I dropped down to a 9800GT from a GTX 260, we may well see identical frame rates. So how far off are we?

At what point do FPS drop using the same processor, i.e. with what graphics card?

I think this is important, as many people look towards GTX 290s and 5870s primarily for this game, but they may not get much extra out of them compared to a GTX 260, for example.

From this simple test, increasing the processor speed gives me increased FPS, which means my card is not maxed out.

So to further investigate whether the FPS are limited by the CPU or the GPU, I did some tests changing the speed of the CPU and the GPU. [...] The conclusion has to be that vehicles and buildings mostly load the CPU while the speedtrees mostly load the GPU, and whether the game is CPU or GPU limited depends entirely on what you look at.

Unless you used a low resolution, you're not properly isolating the GPU. Drop the resolution down to 800x600 and run the test again using both best performance and best visuals.

*edit* Sorry, I have this backwards. I mean crank up the resolution and use best visuals to test the GPU, then compare it to a CPU test at low res to get a better gauge.

Edited by madrebel

Unless you used a low resolution, you're not properly isolating the GPU. Drop the resolution down to 800x600 and run the test again using both best performance and best visuals.

*edit* Sorry, I have this backwards. I mean crank up the resolution and use best visuals to test the GPU, then compare it to a CPU test at low res to get a better gauge.

Well, true. But that won't tell you what the situation is at the resolution I'm actually running, which is what I'm interested in. There's no use knowing you're CPU-limited at 800x600 if you run at 1920x1200 and at that resolution your video card has run out of steam.


This test shows that nothing's really changed. This game has always been about CPU speed.

Unless you have a very low-end GPU, I bet the type of chipset your motherboard uses has more impact than a state-of-the-art GPU vs. a two-year-old GPU.

For this game a good dual-core with a high frequency is much better than a quad with a lower speed.

I run a now pretty old Intel Core 2 Duo E6600 + Nvidia 8800GTS, and the performance improvement in this game when I clocked the CPU from the stock 2.66 to 3.00GHz was very noticeable.

EDIT: I might also add that network quality seems very related to FPS. I'm cut off from the Cogentco network atm and can only reach the server via a secondary server and a 30-some-hop route. Along the way there's a router that's obviously overloaded and on average has 10-20% packet loss. This started this summer and I've noticed a big impact on the FPS in game. The packet loss took down my FPS a lot. I guess the packet loss loads the CPU and by doing that the FPS suffers.

Edited by lure


Good posts. I'm glad to know setting affinity for everything to isolate BGE to a lone core doesn't do a heck of a lot. But is there a simple way to permanently assign an application to use a certain core? I guess I'll google it.

nodaker

This started this summer and I've noticed a big impact on the FPS in game. The packet loss took down my FPS a lot. I guess the packet loss loads the CPU and by doing that the FPS suffers.

You can get rid of some of this overhead with a good network card. EVGA's Killer NIC will do it, but for less money you can pick up something like this:

http://www.newegg.com/Product/Product.aspx?Item=N82E16833106033

That NIC offloads packet-checksum duties from the CPU to the NIC, basically the same thing those stupid Killer "gaming" NICs do. In some situations it can lower latency and decrease CPU utilization.

Good posts. I'm glad to know setting affinity for everything to isolate BGE to a lone core doesn't do a heck of a lot. But is there a simple way to permanently assign an application to use a certain core? I guess I'll google it.

nodaker

There used to be a tool for XP that kind of worked. Vista and 7 have much better multi-core support, though, so it isn't really needed.


My rig...

Windows 7 Enterprise (64-bit)

i7-920 (stock, no O/C)

ATI 4870 X2

6GB DDR3 RAM

1680x1050x32

Benchantwerp: ~22 fps

ATI drivers 10.2 (cleaned out older drivers before installing)

My CPU runs at 2.66GHz (2.93GHz with turbo boost).

I've set my GFX card to application controlled; I don't 'force' any settings in the GFX software...

Edited by chimaera

My rig... [...] ATI drivers 9.12 (cleaned out older drivers before installing)

You should try the current driver, 10.2; I get better FPS with it. You're 3 driver releases behind.

http://game.amd.com/us-en/drivers_catalyst.aspx?p=win7/windows-7-64bit

Edited by zaltor


I'm finding it interesting that when I use the overlay from ATI Tray Tools, at the first spawn screen my GPU usage bounces between 11-22%. When I do a .benchantwerp and spawn in, it drops like a rock and sits at 0%, never budging. When I despawn to the unit selection screen, it returns to 11-22%.

I've tested it out with Mass Effect 2 and it seems to give solid readings (<100% when it's bumping into Vsync, spiking to 100% when the FPS drop below 60).

Just seems strange that in what should be a graphically intensive scene it's sitting with its thumb up its silicon bum.


Very interesting. I wonder if this is what has been killing me. I bought a new computer about 6-9 months ago and was expecting a huge improvement over my last rig, which was pretty old. Nope, low frames. Even lower than my last comp. I get 50-60 at the high end and drop down to the 20s in towns. Teens in camped depots. And when I am flying, forget getting anywhere near a town. I have followed all the recommendations about turning off AA, mipmapping, vsync, and so on. I even dropped my vis player limits. So many changes that playing infantry is terrible, because I can no longer tell inf from brush. Pissing me off... and these changes haven't done crap for my frames per second, except that they now let me fly with my TrackIR without lagging. Absolutely disheartening since I paid a pretty penny for this computer.

All along, it turns out it might be my quad-core processor. I need help finding a way to remedy this, since I will not be changing anything on my computer in the foreseeable future. How do I dedicate cores to tasks? How do I do so without affecting the graphics-intensive design work I do for my actual job when I am not playing? And what should I know that can help fix my fps problems? It seems I can turn my GPU settings back up, since the GPU doesn't seem to be the issue.

Windows Vista Professional 64-bit SP2

AMD Phenom II X4 Black Edition, 3GHz

ATI Radeon HD 4850 1024MB, 10.2 driver

8 gigs DDR2 (I can't remember what the front-side bus speed is.)

Sound Blaster X-Fi

Help!
