Power
Computers are getting hotter and consuming more energy as clock rates and die sizes go up. For example (and scarily), the new Alienware system is going to come with water cooling and an 800-watt PSU. That's a lot of juice, and that juice costs money. (For example, basilisk is going to cost about $18 per month to keep running 24/7 when I get my own apartment. pyrallis will cost a mere fraction of that, I'm sure. Maybe I'll be shutting basilisk down at night.)
This impacts software developers as well, especially for real-time simulations such as games. The traditional way to write a game loop is:
while running:
    receive input (input)
    update simulation (processing)
    render graphics (output)
The input/process/output operations constitute a frame. Most applications just burn through this loop as fast as possible, even when it isn't necessary. (Oddly enough, even with vsync enabled, waiting for the framebuffer swap still pegs the CPU. Maybe the drivers use a spin lock for latency reasons?) ZSNES, for example, pegs a CPU as long as it is running. For the sake of discussion, let's say a 600 MHz CPU can run ZSNES at full frame rate. On a 3 GHz CPU, then, 4/5 of the CPU's power is being totally wasted. In the past, it didn't really matter... processors consumed about the same amount of power under heavy load as when they were idling. The Pentium 4, however, can have a power draw differential on the order of tens of watts between full load and idle. Those extra 2400 million cycles per second are costing money! Is there a way we can modify the frame loop above to save CPU time, and thus save some money?
The emulators snes9x and nester both use minuscule amounts of CPU time, at least on my pretty fast Xeons. I took a look at their source to see how. nester's Win32 message loop does a nonblocking message read (PeekMessage) if the emulator says it should be working. If it says it doesn't need to work right away, the message loop does a blocking message read (GetMessage) instead. I assume there's some way for the emulator to say that it needs to start working again (every few milliseconds) so the loop can break out of the GetMessage call and get back to work. Either way, that small amount of idle time is the difference between 100% CPU load and 2%. snes9x uses a similar mechanism: it calls the Win32 multimedia function timeSetEvent, which invokes a callback from another thread every X milliseconds. That callback can increment the frame counter and send an event to the main loop to do some processing. Once again, the difference between this and the naive loop above is a nontrivial amount of processing time.
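For concreteness, here's a minimal sketch of that kind of hybrid loop. It's based on my reading of the behavior described above rather than on nester's actual source, and emulator_needs_work() and emulator_run_frame() are hypothetical stand-ins for the emulator's real interface:

#include <windows.h>

// Hypothetical stand-ins for the emulator's real interface.
static bool emulator_needs_work() { return true; }
static void emulator_run_frame()  { /* one frame of emulation */ }

void message_loop()
{
    MSG msg;
    bool running = true;
    while (running) {
        if (emulator_needs_work()) {
            // Busy: drain pending messages without blocking, then do a frame.
            while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) {
                if (msg.message == WM_QUIT) running = false;
                TranslateMessage(&msg);
                DispatchMessage(&msg);
            }
            emulator_run_frame();
        } else {
            // Idle: block in GetMessage until something arrives. A multimedia
            // timer (timeSetEvent) can PostMessage a wake-up every few
            // milliseconds so we drop back into the busy path in time.
            if (GetMessage(&msg, NULL, 0, 0) <= 0) {
                running = false;
            } else {
                TranslateMessage(&msg);
                DispatchMessage(&msg);
            }
        }
    }
}

The snes9x variant would instead have the timeSetEvent callback itself bump the frame counter and post an event to the main loop.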
How much processing time? Let's say that the difference between full and light load on a 3 GHz Pentium 4 is 20 watts. Let's also say that the user plays this game or emulator five hours one day. That's one hundred watt-hours (one tenth of a kilowatt-hour), which costs about a cent at typical residential rates. But add that up over the course of weeks or months, and multiply by all of the players, and the difference between two main loops translates into a real monetary difference. And as processors get hotter (and the discrepancy between full load and idle goes up), the cost of just burning processor time will get worse. Even now, running SETI@home or distributed.net isn't totally free.
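Spelled out (the ten-cents-per-kilowatt-hour rate is my ballpark assumption):

    20 W x 5 h = 100 Wh = 0.1 kWh
    0.1 kWh x $0.10/kWh = $0.01 per day, or about 30 cents a month of daily play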
Speaking of all of this, I'd like to measure the power draw of my computer under various load conditions. (Heavy processor usage, heavy 3D usage, heavy disk usage, and idle.) Anyone know how I would go about doing this? Are there cheap power meters I could buy and insert between my surge protector and the wall?
Note: One disadvantage of frame-throttling systems like the ones snes9x and nester use is that they can be less reliable and more "jittery" if another process decides it wants to hog the CPU. I'd like to investigate and see if I can come up with a good solution... (Ideally, you'd just want the framebuffer swap function to do a blocking wait for the vsync. Maybe it would be as simple as having the spin lock issue HLT instructions instead of NOPs?)
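One deadline-based throttle I'd try first (a sketch of my own, not something from either emulator; run_one_frame() is a hypothetical stand-in): schedule each frame against an absolute target time, so a frame that comes in late doesn't push every later frame back.

#include <windows.h>
#pragma comment(lib, "winmm")       // for timeBeginPeriod/timeGetTime

const DWORD FRAME_MS = 1000 / 60;   // ~16 ms per frame at 60 fps

static void run_one_frame() { /* hypothetical per-frame work */ }

void run_throttled(int frames)
{
    timeBeginPeriod(1);             // ask for 1 ms scheduler resolution
    DWORD next = timeGetTime();
    for (int i = 0; i < frames; ++i) {
        run_one_frame();
        next += FRAME_MS;
        DWORD now = timeGetTime();
        if ((int)(next - now) > 0)
            Sleep(next - now);      // sleep to the deadline, not a fixed amount
        else
            next = now;             // fell behind; resync instead of spiraling
    }
    timeEndPeriod(1);
}

It would still jitter if another process steals the CPU mid-frame, but at least the error wouldn't accumulate from frame to frame.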
p.s. An anecdote: at D&D a while back, Jess had her laptop running the D&D character generator program, where she was keeping track of her stats and such. This is an entirely event-based program with no simulation loop or anything, but it pegs the CPU while it runs. On a laptop, that 1) drains the battery fast and 2) causes the fans to run extra loud, annoying everybody else at the table. Now if only the programmers had decided to only do processing when a Windows message came in!
p.p.s. I still need to talk about my impressions of RSIGuard, but I guess I'll have to do it later, since it's yelling at me to take a break from typing. ;)
I was recently thinking about this problem as well when I was reading about the mini-itx cluster project (http://www.mini-itx.com/projects/cluster/). To quote it exactly, "At present, the idle power consumption is about 140 Watts (for 12 nodes) with peaks estimated at around 200 Watts. The machine runs cool and quiet."
Now, that's twelve 800 MHz processors. That's not too shabby at all. Perhaps many quieter, cooler, slower-running processors are the future, rather than one big, fast, power-guzzling processor.
In other news, AMD is releasing a new line of low-power, high-performance processors based on their Mobile Athlon architecture (for use in embedded and mini-itx computers). The one that caught my eye was the NX1500@6W. This processor runs, without a fan, at 1 GHz and consumes 6 W at average load and 9 W under heavy load. That's really freaking cool (no pun intended).
Consider, if you will, lab-grown diamonds in diamond-based processors. I think that's not far off, and it probably won't cost much overall after a year or two of being in circulation.
That new AMD processor does sound impressive. AMD is doing better and better. :) Plus, it's sweet that the new ATI card is relatively "green" too.
I was experimenting with eliminating that 100% load in VB programs, for fun, since almost every app written in VB out there that uses a loop to handle events pegs the CPU... it turns out calling Sleep(1) after rendering each frame knocks the CPU usage way down without significantly impacting responsiveness. At least, it did in my tests. It would be cool if Sleep were accurate enough to call it once with the amount of time remaining until the next frame. I'll have to see if timeSetEvent can do that.
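For what it's worth, here's the same experiment as a minimal C++/Win32 sketch (render_frame() is a hypothetical stand-in for the per-frame work):

#include <windows.h>

static void render_frame() { /* hypothetical per-frame work */ }

int main()
{
    for (int frame = 0; frame < 60 * 60; ++frame) {   // run for a while
        render_frame();
        Sleep(1);   // give up the rest of the timeslice each frame;
                    // CPU usage drops from ~100% to near idle
    }
    return 0;
}

The catch, as far as I can tell, is that Sleep's granularity is tied to the system timer (often 10-15 ms), so Sleep(1) may sleep much longer than a millisecond; winmm's timeBeginPeriod(1) can tighten that up, which is also why sleeping the exact time remaining until the next frame is tricky.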
I agree, it would be nice if Sleep were more accurate... If you have any luck with timeSetEvent (or even no luck), make sure to let me know. :)
The best part about using diamond as a semiconductor instead of silicon is that you could overclock your computer until your case melts and your house catches on fire without damaging the processor itself.
Yes, Sleep is phenomenally powerful. In Perl, sleep can be used to delay a CGI program's execution by up to three minutes. After the three minutes are up, a timeout request is dispatched from upstream, but in the meantime said CGI program can use cookies to transfer data back and forth with JavaScript. It'd be great to see a game use this technique sometime. Cookies aren't as slow as they would first appear, don't you think?
Where can I see an example of this technique?
I abandoned my own attempt a year ago, after determining that for the time being it simply wasn't worth the risk.
There is a client component, which I never finished, and a server component that needs some debugging but is otherwise sound. The server and the client communicate with one another via a simple I/O protocol.
This is the server script. I don't believe the delay technique was ever implemented in it; rather, I think I discovered the technique by accident while trying to find some way of holding up Tripod's server timeout facilities, which it would seem are even more stringent now than before. ...It's pretty terrible that an ISP can willingly host Al-Qaeda and yet be as anal as they are about things as harmless as an innovative scripting technique. :P
#!/usr/local/bin/perl
#
# Author name: Anthony Caudill
# Creation date:
#
# Description: CGI Cookie Server
#              Runs in a hidden frame, alongside a client script.

use CGI qw/:standard/;

print "Content-type: text/html\n\n";
# (a second print of some opening HTML followed here; the string was lost)

$ServerActiveFlag = 1;

while ($ServerActiveFlag == 1) {
    # Pull the latest service request out of the raw cookie string.
    $LatestTextValue = raw_cookie();
    ($LatestTextValue) = split(/;/, $LatestTextValue);
    ($LatestTextValue) = split(/undefined/, $LatestTextValue);
    ($LatestTextTitle, $IncomingService) = split(/ServiceBridge=/, $LatestTextValue);

    # Dispatch on the one-character service code ("eq" rather than "==",
    # since these are string comparisons). Most branch bodies were lost.
    if (chr($IncomingService) eq "R") {
        # read service (body not recovered)
    } elsif (chr($IncomingService) eq "W") {
        # write service (body not recovered)
    } elsif (chr($IncomingService) eq "D") {
        # "D" service (body not recovered)
    } elsif (chr($IncomingService) eq "E") {
        # end service: clear the request
        $IncomingService = "";
    }
}
http://lordgalbalan.tripod.com/cgi-bin/CGIServer.cgi
This is the technique itself. Pretty simple; you'd never expect it to work. But boy, does it ever.
#!/usr/bin/perl

print "Content-type: text/html\n\n";

$A = 0;
while ($A < 10) {
    sleep 10;
    $A = $A + 1;
}
http://lordgalbalan.tripod.com/cgi-bin/CGITest.cgi