Go to the new cbloom rants @ blogspot

06-03-11 | Amalgamate

So, here you go : amalgamate code & exe (105k)

The help is this :

amalgamate built May 19 2011, 18:01:28
args: amalgamate
usage : amalgamate [-options] <to> <from1> [from2...]
-q  : quiet
-v  : verbose
-c  : add source file's local dir to search paths
-p  : use pragma once [else treat all as once]
-r  : recurse from dirs [list but don't recurse]
-xS : extension filter for includes S=c;cpp;h;inl or whatever
-eS : extension filter for enum of from dir
-iS : add S to include path to amalgamate

from names can be files or dirs
use -i only for include dirs you want amalgamated (not system dirs)

What it does : files that are specified in the list of froms (and match the extension filter for enum of from dir), or are found via #include (and match the extension filter for includes), are concatted in order to the output file. #includes are only taken if they are in one of the -I listed search dirs.

-p (use pragma once) is important for me - some of my #includes I need to occur multiple times, and some not. Amalgamate tells the difference by looking for "pragma once" in the file. eg. stuff like :

#define XX stuff
#include "use_XX.inc"
#undef XX
#define XX stuff2
#include "use_XX.inc"

needs to include the .inc both times. But most headers should only be included once (and those have #pragma once in them).

So for example I made a cblib.h thusly :

amalgamate cblib.h c:\src\cblib c:\src\cblib\LF c:\src\cblib\external -Ic:\src -p -xh;inc;inl -eh

which seems to work. As another test I made an amalgamated version of the test app for rmse_hvs_t that I gave to Ratcliff. This was made with :

amalgamate amalgamate_rmse_hvs_t.cpp main_rmse_hvs_t.cpp rmse_hvs_t.cpp -I. -v -Ic:\src -p

and the output is here : amalgamate_rmse_hvs_t.zip (83k)

But for anything large (like cblib.cpp) this way of sticking files together just doesn't work. It should be obvious why now that we're thinking about it - C definitions last until end of file (or "translation unit" if you like), and many files have definitions or symbols of the same name that are not the same thing - sometimes just by accidental collision, but often quite intentionally!

The accidental ones are things like using "#define XX" in lots of files ; you can fix those by always using your file name in front of definitions that you want to only be in your file scope (or by being careful to #undef) - also local namespacing variables and etc. etc. So you can deal with that.

But non-coincidental collisions are quite common as well. For example I have things like :

replace_malloc.h :
  #define malloc my_malloc

replace_malloc.c :
  void * my_malloc(size_t size) { return malloc(size); }

It's very important that replace_malloc.c doesn't include replace_malloc.h , but when you amalgamate it might (depending on order).

Another nasty one is the common case where you are supposed to do some #define before including something. eg. something like :

#define NUM_SYMBOLS 256   /* hypothetical config value the header reads */
#include "Huffman.h"

that kind of thing is destroyed by amalgamate (only the first include will have effect, and later people who wanted different numbers don't get what they expected). Even windows.h with _WIN32_WINNT and WIN32_LEAN_AND_MEAN gets hosed by this.

You can also get very nasty bugs just by tacking C files together. For example in plain C you could have :

file1 :
  static int x;

file2 :
  int x = 7;

and in C that is not an error, but now two separate variables have become one when amalgamated. I'm sure there are tons of other evil hidden ways this can fuck you.

So I think it's basically a no-go for anything but tiny code bases, or if you very carefully write your code for amalgamation from the beginning (and always test the amalgamated build, since it can pick up hidden bugs).

06-03-11 | Recommend me a video game

I think I tried this post before, but let's try again :

No space marines. No WW2. How about just no marine/soldier theme in general. I'm not a big "violence in video games is bad" banger, I just think it's boring. And it's not my fantasy. I don't want to be a fucking soldier, shooting people is horrible, why do you want to pretend to do that?

Not gray or brown. Give me some damn color and beauty. I want to be excited to see the next level, and I want to be surprised and blown away and delighted when I do. There are endless possibilities for fantasy worlds, you can do better than warehouses.

Not primarily about combat ; especially not repetitive combat, like shoot these 20 guys, okay now shoot 20 more guys. Bleh, I'm bored.

Not "Don Bluth" semi-interactive , ala "Press A now" - you pressed A! good for you, monkey! (I'm looking at you, God of War). (actually I kind of liked those games when I was a kid even though it was uncool to like them; my favorite was "Cliff Hanger" which I just learned is actually a Lupin The Third movie!! crazy, when I watched Lupin a few years ago I didn't make the connection)

No giant inventory trees or management games or spread sheets; I do work when I'm working, I don't need to do work in my game.

No abstract puzzle games. I'll play chess or go or something if I want that.

Absolutely no frustrations. Long load times or annoying UI or one bad level that gives me instadeath for no reason - the game's going in the bin. If the game makes me scream at it or grind my teeth in frustration, I don't need that in my life.

My favorite games are generally "light RPG's" like Faery Tale, Zelda, Drakan, where I get to run around a big world, but without a bunch of fucking dialogs, and without managing a big group of characters and stats and such (I generally love true RPG's for about the first quarter of the game, but then you get too many people in your party and too many items and spells and it just becomes a huge pain in the ass).

06-03-11 | New race track near Olympia

The Ridge race track in Shelton is scheduled to open this fall. It certainly doesn't look like it from the video but they claim they are ahead of schedule, and already fully booked through 2012.

The Ridge has a decent road course, but it's rather short at 2.42 miles, only a small upgrade over the 2.25 miles of Pacific Raceways ; I also really don't like the long straight into a chicane that they are planning at The Ridge, that is the exact kind of feature that kills people. Hopefully PCA runs it without the chicane.

Portland International Raceway (PIR) is even shorter at 1.9 miles ; I found that running the 2.25 mile track at Pacific Raceways was unpleasantly short, it just feels a bit too much like being a hamster on a wheel or just driving around a roundabout, since you're turning the same direction pretty much the whole time; PIR and Pacific Raceways both suck because they have a drag strip sharing the straight with the road course ; drag strips have a variety of surface problems that make them incredibly dangerous to drive over, and there have been several nasty crashes over the years that happen right at the point where the road course comes onto the drag strip. So The Ridge is a big win in that it doesn't have that. Also since it's new it is presumably being designed with decent run-offs and barriers, which our old tracks don't have.

I just found out last week that in the same place (Shelton) you can rent out the airport to drive on; at first I thought you could rent part of the actual runway, which would be awesome, but in fact it's just a big parking lot that the airport owns, which is still better than nothing. Also if you look at Google satellite maps you might see a race track in Shelton already exists. That's the WA State Patrol training track. Bastards.

There's a huge boom of race track development up here right now.

Oregon Raceway Park opened last year, and looks awesome, though a bit too far away from Seattle. It's out in the barren grass land, away from all our deadly trees; I like tracks that are wide open like that because you can see way ahead to know if somebody has had an accident in front of you.

Bremerton has been trying to build a track for a while (currently the old airport there is used for car events, but it's hardly a track); they're hoping to get a Nascar stop, but I doubt it since they're out in the middle of nowhere. Boardman, OR is also dreaming of making this huge PNW Motorsports Park complex with multiple tracks, but they don't have funding yet and that seems like a pipe dream.

It's very hard to find places to drive fast up here. My understanding is that in Europe the track days are open in the sense that you just show up and pay a fee. In the US that does not exist at all because of liability shit. It's always through some kind of club, and they have to pretend that it's "education". (there's also obviously racing, but that's always through a club and then you have to have a race license and a car with cage and fire system and all that). So all the track days here are called "driver education" which is a bit confusing.

Whether or not the event is actually run as education depends on the group and how nitty they are. With some clubs it's a very thin facade of education and they immediately start doing donuts and racing each other. Others are a bit more careful to not violate their insurance policy terms. PCA for example is pretty nitty in the beginning, but once you get signed off to go solo it's basically racing.

The Proformance school at Pacific Raceways, for example, is super nitty and pedantic at first, you have to take the class, and the class is pretty terrible; they actually put up cones on the track to force you to drive "the line". Once you take the class then you can do open lapping after that which is okay.

A lot of the problem with these "racing education" classes is that they are just horrible teachers. They're super pedantic and just not very smart. They teach you what you're supposed to do without teaching you *why* you're supposed to do it, and they don't let you experiment and learn for yourself. It reminds me of the bad old days of being in primary education with small-minded teachers who are teaching you the exact machinery of how to do something instead of teaching you the fundamentals and letting you do it however you want. At the first "ground school" that I went to some guy describes early apex and late apex and then asks "what's the right way to corner?" , and here I am, still being optimistic and engaged, so I say "it depends, it depends on the track surface, and what corners are before and after the current corner", and he says "no, we always late apex". Oh, okay, my bad, I thought we were human beings who could think and discuss and be realistic and intelligent, in fact we are just supposed to repeat some rote nonsense that you read in a rule book once and treat like dogma. So the "education" is unfortunately just really depressing usually.

A couple of us did the Dirtfish Rally School out in Snoqualmie recently, and it definitely suffers from being overly pedantic. Their facility is amazing, it's literally like a video game level with rusted old warehouses and big gravel pits, and driving on gravel is really a shit-load of fun, I was jumping with joy every time I got out of the car. I like gravel track driving a lot better than driving on pavement because you get car rotation and so much more crazy weight transfer dynamics and fun stuff going on, all at slow, safe speeds. But I don't really recommend Dirtfish, it's too expensive for the amount of seat time you get, and they're just a bit too serious about doing things the right way. It would be worth it if doing the class qualified you for open lapping (for example the Proformance class is excruciating but the whole point of it is to qualify for open lapping) but of course Dirtfish won't let you do open lapping in their gravel pit.

06-02-11 | Shark 007 codecs are malware!

Beware. I had an old version of the Shark 007 codecs installed on my machine. I'm having some stupid video problem (*) ...

(* = my screen flashes once when I start playing a video and then again when I stop; I believe this is something in the GPU; I've tried to turn off all GPU acceleration for video playback, but it's hard to be sure I've got it all because video is such a cluster-fluck; furthermore the damn ATI catalyst has lots of "smart" override modes where it tries to turn on GPU acceleration even when you didn't ask for it, and do things like automatic interlace fixing and "smooth video", all of which I try to turn off but again it's hard to say for sure that I've actually de-fucked the driver (god damnit, give me a fucking video card driver that just does what the D3D calls tell it to do, don't fucking change the mip mode or anti-alias mode or any such shit); in the end I think I tracked it down (**) )

... so after trying various things I figured I would try changing to the newest version of the Shark codecs. Big mistake. It installs Bing and Ask toolbars without asking permission. Fucker. Nicely, Firefox blocks those addons now, but Firefox doesn't let you uninstall blocked addons which is a bit annoying (you can of course uninstall manually).

(** = I'm pretty sure the problem was the ATI "PowerPlay" clock changing feature. When the GPU load changes, it clocks up, and when it changes the clock rate your screen flashes. If I just put the GPU at max clock all the time, then I don't get any flashes. This sort of sucks, because I do want to use PowerPlay but I don't want any damn screen flash. The big problem with leaving the GPU at full clock for me is that it makes the machine hotter, which causes the fans to step up to their higher speeds, which makes them quite noisy; with GPU in power save mode the machine is very quiet indeed; I'm also not sure why the GPU has to clock up to play video, is it because some fucker is still using GPU acceleration? or would it do that even with just CPU playback?).

(BTW a lot of people have noticed a similar problem with Flash Player since v10 ; they turned on GPU acceleration for video by default, and now your screen pops when you start and stop flash videos. You can fix it by turning off hardware acceleration in the Flash setting).

Hey bone heads! You don't need to hardware accelerate something that plays back just fine on the CPU! Especially when "hardware accelerate" means "break" as it inevitably does because GPU drivers/users are just always fucked up.

Sometimes I wish you could still buy/use cards like the sweet old Matrox 2d-only cards that just never had any problems. I used to always buy those cards for my home machines back in the Voodoo days and they were solid as a rock.

On a related malware note, for a while now I've been doing "safe browsing" by running Firefox from my ram disk. That way any changes made to my profile by malignant sites just go away when I reboot. I started doing this because of Firefox's insistence on thrashing my disk for its SQL db, and discovered the safety as a side effect.

For example, when you stumble on an attack site, it can be dangerous to even close the window to that site (because they can run on-close triggers). It's better if you just don't click any popup or do anything at all. Instead I run a batch that does :

pskill -t firefox.exe
call dele -r s:\*
call setup_firefox_ramdisk.bat

(s: is my ramdisk that I run firefox from).

The other nice safety feature is that you can wipe cookies and all saved/cached state whenever you want. I used to try to browse with things like Flash completely disabled, no cookies, etc. but it's just impossible to use the modern internet that way. But I still don't want any website to remember my settings ever, so I start from a clean slate each time. (actually I start from whatever saved state that I want, so I saved the state with my google login saved). It's way better than actually browsing with no cookies or even just clearing your browser cache periodically.

On a more positive note, I found this and love it :

Hide Comments with AdBlock Plus

I'm now using these rules :


I really would like to hide comments for *every* site I ever go to. I find that hearing from the common man is absolutely toxic. Sometimes it is just infuriating and depressing how terrible they are, when they say things that are racist or ignorant or small minded or just petty bickering about Justin Bieber. But even when the comments aren't as obviously toxic, they are still very bad for the brain, because you are influenced by them whether you think so or not, and that influence is almost always negative.

05-31-11 | STB style code

I wrote a couple of LZP1 implementations (see previous) in "STB style" , that is, plain C, ANSI, single headers you can just include and use. It's sort of wonderfully simple and easy to use. Certainly I understand the benefit - if I'm grabbing somebody else's code to put in my project, I want it to be STB style, I don't want some huge damn library.

(for example I actually use the James Howse "lsqr.c" which is one file, I also use "divsufsort.c" which is a delightful single file, those are beautiful little pieces of code that do something difficult very well, but I would never use some beast like the GNU Triangulated Surface lib, or OpenCV or any of those big bloated libs)

But I just struggle to write code that way. Like even with something as simple as the LZP's , okay fine you write an ANSI version and it works. But it's not fast and it's not very friendly.

I want to add prefetching. Well, I have a module "mem.h" that does platform-independent prefetching, so I want to include that. I also want fast memsets and memcpys that I already wrote, so do I just copy all that code in? Yuck.

Then I want to support streaming in and out. Well I already have "CircularBuffer.h" that does that for me. Sure I could just rewrite that code again from scratch, but this is going backwards in programming style and efficiency, I'm duplicating and rewriting code and that makes unsafe buggy code.

And of course I want my assert. And if I'm going to actually make an EXE that's fast I want my async IO.

I just don't see how you can write good code this way. I can't do it; it totally goes against my style, and I find it very difficult and painful. I wish I could, it would make the code that I give away much more useful to the world.

At RAD we're trying to write code in a sort of hierarchy of levels. Something like :

very low level : includes absolutely nothing (not even stdlib)
low level : includes only low level (or lower) (can use stdlib)
              low level stuff should run on all platforms
medium level : includes only medium level (or lower)
               may run only on newer platforms
high level : do whatever you want (may be PC only)

This makes a lot of sense and serves us well, but I just have so much trouble with it.

Like, where do I put my assert? I like my assert to do some nice things for me, like log to file, check if a debugger is present and int 3 only if it is (otherwise do an interactive dialog). So that's got to be at least "medium level" - so now I'm writing some low level code and I can't use my assert!

Today I'm trying to make a low level logging facility that I can call from threads and it will stick the string into a lock-free queue to be flushed later. Well, I've already got a bunch of nice lock-free queues and stuff ready to go, that are safe and assert and have unit tests - but those live in my medium level lib, so I can't use them in the low level code that I want to log.

What happens to me is I wind up promoting all my code to the lowest level so that it can be accessible to the place that I want it.

I've always sort of struggled with separated libs in general. I know it's a nice idea in theory to build your game out of a few independent (or hierarchical) libs, but in practice I've always found that it creates more friction than it helps. I find it much easier to just throw all my code in a big bag and let each bit of code call any other bit of code.

05-30-11 | Wood Counter Tops

Wood ("butcher block") counter tops are fucking retarded. Some problems you might not have thought of :

They burn if you put hot pans on them. So you have to have hot pads and shit on your counter top all the time. Under normal use, that's all fine or whatever, but say you have some minor kitchen fire, you have a pan on the oven that catches fire, you pull it out - you can't just put it on the counter top, or the counter top might catch fire too.

Water ruins them. Water is quite common in a kitchen. In particular, you can't use a dish rack because it's too likely to get water on the counters. If you're a real big moron, you'll extend your butcher block counter tops right up and all around the sink. So now you have a sink, which is wet, and a wood counter, which can't get wet. The inevitable result is warping all around the sink.

They dry out and have to be oiled regularly, like once a month. Basically they're a giant pain in the ass. The reason people get them is for looks, because they look like a cool old rustic kitchen, but they are not actually functional. The proper materials for kitchen counters are stone or ceramic (I'm not convinced about the modern plastics like Corian, but they might be alright, I dunno).

For god's sake if you do feel the need to use wood for your counter top (presumably because you're a moron who cares more about looking like the photos in Dwell than how things function), don't run it straight up to the sink, at least put something else around the sink, and the stove.

Using wood on your counter top is almost as retarded as using leather in your car interior, which is ruined by sun (hmm what comes through car windows?), water, and similarly needs to be oiled regularly or it gets stiff and cracks. Leather is cold in winter and hot in summer and is heavy and smelly and expensive. It's monstrously retarded. We have better fucking materials now!

05-30-11 | Product Review : Honeywell 18155 SilentComfort


This thing is a badly designed piece of crap. The way it works is it sucks in air through a tiny "pre-filter" (at the bottom), pumps it through the fans, then out through the big HEPA filter. The result is that the pre-filter gets clogged with dust almost immediately, like within a few weeks. I wager that 99% of people running these machines have pre-filters full of crap. Once the pre-filter is full of dust, the machine gets much louder and pulls much less air. Not only that but it starts pulling dust into the fans, which then get noisy and are hard to clean.

So, you are forced to constantly change the pre-filter. And they don't actually make a pre-filter that's cut to the size you need, they just sell sheets of charcoal paper, so you have to cut it yourself and it's a big pain in the ass (even if they did, they would be like $20 a pop which is too much since you have to change them at least once a month). I'm not surprised that they rate the HEPA filter part (on the outbound air) of it as "lifetime" because of course it's the prefilter that actually does all the work.

A properly designed machine should have a large surface area paper filter at the air intake.

The other problem with this thing is that it doesn't have a low enough speed setting. Even on lowest setting it is very far from "silent". You hear a constant loud whooshing. It needs a setting that's about 50% the speed of the lowest.

While I'm at it, let's talk some more about dust and filters.

I noticed that my HTPC has been getting steadily louder recently. The problem of course is dust. Dust makes fans noisy, particularly the greasy crud stuck on the blades which really disturbs the air flow. It also reduces their efficiency, and dust inside the PC greatly impedes airflow and passive cooling, so it's important to clean it out regularly (once a year is probably enough).

It would be preferable really to have little paper filters on the PC air intakes that you could just replace, but PC fans are not designed for that so it's not a good idea to add. The biggest PITA is probably the power supply, because the fans are inside the PSU block, and so are some nasty capacitors.

Laptops are a much worse problem, they have tiny passages that can easily get clogged with dust and ruin their cooling. Unfortunately you can't really clean them without taking them apart. This should be done about once a year, or your laptop will be running louder and hotter than it needs to. I'm sure the insides of game consoles are totally clogged up too.

Some other random places that I clean that you might not be aware of :

The radiator and fan in the back of the fridge.

The fume hood above your stove. The grating for the hood gets filthy and clogged which ruins the suction. Soak in degreaser or just replace. The fan blades of the fume hood need cleaning too, though it's often hard to access.

Heater air vents. Obviously you replace the filter once a month, but you also need to take off all the output grills and clean inside them. I would really like to clean out the entire pipe from the air intake to the output points but I can't imagine how to do that. Basically forced air heaters are blowing giant clouds of filth and allergens into your air all the time and there's nothing you can do about it. One option is to put filters at the output grills, but most heat systems aren't designed for that much back pressure.

Inside your vacuum cleaner. Not only do you replace the paper bag, you take the bottom of the machine and clean out all the dust stuck in the piping. Your suction will be greatly improved.

The exhaust tube for your clothes dryer. Again you can't get to most of the tube, but most of the shit will be at the end near the dryer or the end going out of the house.

05-29-11 | Cars : BMW 1M and Cayman R

These are two very interesting cars getting a lot of recent press. I think they might be the two best cars in the whole world right now, so let's explore some details. I haven't actually driven either, since it's impossible to find either at a dealer still (maybe forever - both are semi-limited runs and are getting bought as fast as they are made), but I have driven a 135 and Cayman S.

The BMW 1M is basically a 135 with some tweaks. The Cayman R is basically a Cayman S with some tweaks. Both appear to perform much better than their base cars. If you look at lap times, you would conclude they are radically better. But that is a bit misleading.

I believe the tweaks to the 1M and the R are pretty similar, and both are what enthusiasts do to the base cars.

Neither one really has a modified engine at all. That is mildly disappointing. The Cayman R gets +10 hp from air flow mods (which enthusiast owners do to their Cayman S), and the 1M gets +30 hp from air & ecu mods (enthusiast 135 owners do air & ecu too, and +30 hp is no problem). Also, neither one is really lighter. If you weigh them in the no-radio, no-AC, lightweight seat spec then they seem a bit lighter, but of course you could do that to your S/135 if you wanted. You can rip out your AC and put Recaros in your current car; that's not a serious way to lighten a car, so they are a big fail in that respect (the actual structural weight savings on both cars is something like 30 pounds; it's also retarded how they do the lightening in these cars; they remove your AC and radio, which you want, but then they still put a cosmetic plastic piece over your engine; WTF, remove the useless cosmetic weight and leave the functional shit please; if you want to get serious about lightening, I could go for manual windows and manual door locks and trunk release and all that, you can get rid of all those motors).

(the lack of engine mod is particularly disappointing in the Cayman R, since Porsche has got a bunch of engines just sitting around that could have gone in there; they could have put the 3.6 version of the 9A1 in there instead of the 3.4 that the Cayman usually gets)

What has been done? Well the formula is exactly the same for both cars : LSD & Stiffer rear.

Porsches and BMW's have been set up for horrible understeer for the past many years. You may have seen me write about this exact issue with Porsches before. Most of the cars don't come with LSD; on Porsches you can option an LSD, but that is sort of a kludge because they aren't properly set up (you want different suspension setups for cars with LSD vs. not). Basically all they're doing with the R and the 1M is undoing the bad setup of the S and the 135. Particularly I think with the 135, it's a travesty that such a good engine has been sold with fucking run-flat tires and no LSD. So they're just fixing that sin.

Once you look at the 1M and R not as new special sporty models, but just as fixes to make these good cars the way they always should have been, you see the point of them. Anyway, some details :

BMW 1M :
N54 +30 hp from piston rings, air, ecu
LSD (viscous)
proper rubber
transmission oil cooler
M3 rear suspension and rear subframe (stiffer)
wider track
light flywheel
steering from E3
ecu (throttle response?)
lots of actual M3 parts

Cayman R
9A1 3.4L +10 hp (exhaust manifold, ecu, 7400 vs 7200 rev limit)
LSD (friction clutch pack)
lower / non-PASM / stiffer suspension
more negative camber (still not enough)
stiffer rear sway
not actually GT3 parts (not adjustable)

The differences between the 1M and 135 are a lot more significant than the differences between the Cayman R and S. In both cases the price premium ($5-10k or so) is so small that of course you should go ahead and buy the 1M/R.

I know a lot more about Porsches than I do about BMW's, and I can say that as usual Porsche have cheaped out and done the absolute minimum to hit their performance target. They could have easily grabbed the GT3 suspension bits, which is what you really want because they are adjustable in a wide range from good on the street to good with R-comp on the track, but no, that would have cut into their massive profit margin, so instead they give you new non-adjustable suspension bits, that are better than the S, but still not actually good enough for the serious enthusiast. You can just see it in every little aspect of modern Porsches; they don't give you the good brake cooling ducts, they don't give you the good shift cables, they don't give you the good throttle body, etc.

To be redundant, basically what a serious Cayman S owner does is they buy the GT3 front LCA's, GT3 sway bars, maybe a stiffer spring set, and a good aftermarket LSD. All that costs you a little bit more than the Cayman R, but then you have a much better car. So if you're going to do track mods to your car anyway, there's no point in starting with the R, because you're going to replace all its parts anyway; just start with the S.

It's sort of an odd fact that all modern Porsches (below the GT3) are shipped intentionally crippled. This not only makes them look bad in magazine tests, it makes them look bad in comparisons like "fastestlaps.com" that compare stock cars, and it is a real problem for people who want to race them in SCCA "stock" classes. It's strange and very annoying. It's very useful when a car manufacturer offers a "super sport" option pack (such as the Mustang Boss 302 Laguna Seca) - even if you don't buy that car, it gives you a set of example parts to copy, it shows you how to set up the car for maximum performance, and it gives the magazines a way to test the car in properly set up trim. And if it's an option pack (as opposed to a nominally different model like the R), then you're allowed to use it in SCCA stock racing.

One misgiving I have about the Cayman R that would give me pause before buying it new without driving is that it is lower and stiffer than the Cayman S, which is already quite low and quite stiff. It's sort of cheating to make a car that handles better by making it lower and stiffer. It's like making a compressor that gets better ratio by using more memory. It's not actually an improvement in the fundamentals, it's just dialing your trade-off slightly differently on the curve. The really impressive thing is to make a car that handles better without being lower or stiffer.

(the other problem with buying the Cayman R new is that Porsches are absurdly over-priced new, it's basically a $50k car being sold for $70k just because there are enough suckers and no competition)

Attempt to summarize my usual descent into rambling :

The Cayman R is a nice thing because it shows everyone what a properly set up Cayman S can do; the Cayman S has looked falsely bad in lots of tests (see for example the Fifth Gear test where Tiff complains of understeer) because of the way it's sold set up all wrong. So the R is good for magazines, but it really isn't that special a model, and if you wait until 2014 for the next gen 991 Cayman it will be better.

The 1M on the other hand is a very special model that will only be around this year, and gives you loads of goodies for the value; if you were considering a small BMW, the 1M is clearly the one to buy.

05-20-11 | LZP1 Variants

LZP = String match compression using some predictive context to reduce the set of strings to match

LZP1 = variant of LZP without any entropy coding

I've just done a bunch of LZP1 variants and I want to quickly describe them for my reference. In general LZP works thusly :

Make some context from previous bytes
Use context to look in a table to see a set of previously seen pointers in that context
  (often only one, but maybe more)

Encode a flag for whether any match, which one, and the length
If no match, send a literal

Typically the context is made by hashing some previous bytes, usually with some kind of shift-xor hash. As always, larger hashes generally mean more compression at the cost of more memory. I usually use a 15 bit hash, which means 64k memory use if the table stores 16 bit offsets rather than pointers.

Because there's no entropy coding in LZP1, literals are always sent in 8 bits.

Generally in LZP the hash table of strings is only updated at literal/match decision points - not for all bytes inside the match. This helps speed and doesn't hurt compression much at all.

Most LZP variants benefit slightly from "lazy parsing" (that is, when you find a match in the encoder, see if it's better to instead send a literal and take the match at the next byte) , but this hurts encoder speed.

LZP1a : Match/Literal flag is 1 bit (eight of them are sent in a byte). Single match option only. 4 bit match length, if match length is >= 16 then send full bytes for additional match length. This is the variant of LZP1 that I did for Clariion/Data General for the Pentium Pro.

LZP1b : Match/Literal is encoded as 0 = LL, 10 = LM, 11 = M (this is the ideal encoding if literals are twice as likely as matches) ; match length is encoded as 2 bits, then if it's >= 4 , 3 more bits, then 5 more bits, then 8 bits (and after that 8 more bits as needed). This variant of LZP1 was the one published back in 1995.

LZP1c : Hash table index is made from 10 bits of backwards hash and 5 bits of forward hash (on the byte to be compressed). Match/Literal is a single bit. If a match is made, a full byte is sent, containing the 5 bits of forward hash and 3 bits of length (4 bits of forward hash and 4 bits of length is another option, but is generally slightly worse). As usual if match length exceeds 3 bits, another 8 bits is sent. (this is a bit like LZRW3, except that we use some backward context to reduce the size of the forward hash that needs to be sent).

LZP1d : string table contains 2 pointers per hash (basically a hash with two "ways"). Encoder selects the best match of the two and sends a 4 bit match nibble consisting of 1 selection bit and 3 bits of length. Match flag is one bit. Hash way is the bottom bit of the position, except that when a match is made the matched-from pointer is not replaced. More hash "ways" provide more compression at the cost of more memory use and more encoder time (most LZP's are symmetric, encoder and decoder time is the same, but this one has a slower encoder) (nowadays this is called ROLZ).

LZP1e : literal/match is sent as run len; 4 bit nibble is divided as 0-4 = literal run length, 5-15 = match length. (literal run length can be zero, but match length is always >= 1, so if match length >= 11 additional bytes are sent). This variant benefits a lot from "Literal after match" - after a match a literal is always written without flagging it.

LZP1f is the same as LZP1c.

LZP1g : like LZP1a except maximum match length is 1, so you only flag literal/match, you don't send a length. This is "Predictor" or "Finnish" from the ancient days. Hash table stores chars instead of pointers or offsets.

Obviously there are a lot of ways that these could all be modified to get more compression (*), but it's rather pointless to go down that path because then you should just use entropy coding.

(* a few ways : combine the forward hash of lzp1c with the "ways" of lzp1d ; if the first hash fails to match escape down to a lower order hash (such as maybe just order-1 plus 2 bits of position) before outputting a literal ; output literals in 7 bits instead of 8 by using something like an MTF code ; write match lengths and flags with a tuned variable-bit code like lzp1b's ; etc. )

Side note : while writing this I stumbled on LZ4 . LZ4 is almost exactly "LZRW1". It uses a hash table (hashing the bytes to match, not the previous bytes like LZP does) to find matches, then sends the offset (it's a normal LZ77, not an LZP). It encodes as 4 bit literal run lens and 4 bit match lengths.

There is some weird/complex stuff in the LZ4 literal run len code which is designed to prevent it from getting super slow on random data - basically if it is sending tons of literals (more than 128) it starts stepping by multiple bytes in the encoder rather than stepping one byte at a time. If you never/rarely compress random data then it's probably better to remove all that because it does add a lot of complexity.

REVISED : Yann has clarified LZ4 is BSD so you can use it. Also, the code is PC only because he makes heavy use of unaligned dword access. It's a nice little simple coder, and the speed/compression tradeoff is good. It only works well on reasonably large data chunks though (at least 64k). If you don't care so much about encode time then something that spends more time on finding good matches would be a better choice. (like LZ4-HC, but it seems the LZ4-HC code is not in the free distribution).

He has a clever way of handling the decoder string copy issue where you can have overlap when the offset is less than the length :

    U32     dec[4]={0, 3, 2, 3};

    // copy repeated sequence
    cpy = op + length;
    if (op-ref < 4)
    {
        *op++ = *ref++;
        *op++ = *ref++;
        *op++ = *ref++;
        *op++ = *ref++;
        ref -= dec[op-ref];
    }
    while(op < cpy) { *(U32*)op=*(U32*)ref; op+=4; ref+=4; }
    op=cpy;     // correction

This is something I realized as well when doing my LZH decoder optimization for SPU : basically a string copy with length > offset is really a repeating pattern, repeating with period "offset". So offset=1 is AAAA, offset=2 is ABAB, offset=3 is ABCABC. What that means is once you have copied the pattern a few times the slow way (one byte at a time), then you can step back your source pointer by any multiple of the offset that you want. Your goal is to step it back enough so that the separation between dest and source is bigger than your copy quantum size. Though I should note that there are several faster ways to handle this issue (the key points are these : 1. you're already eating a branch to identify the overlap case, you may as well have custom code for it, and 2. the single repeating char situation (AAAA) is by far more likely than any other).

ADDENDUM : I just found the LZ4 guy's blog (Yann Collet, who also did the fast "LZP2"), there's some good stuff on there. One I like is his compressor ranking. He does the right thing (I wrote about here) which is to measure the total time to encode, transmit, decode, over a limited channel. Then you look at various channel speeds and you can see in what domain a compressor might be best. But he does it with nice graphs which is totally the win.

05-19-11 | Nathan Myhrvold

I was reading about the new cookbook "Modernist Cuisine" (which sounds pretty interesting, but DON'T BUY IT) and I was like hhrrmm why does this name sound familiar. And then I remembered ...

Myhrvold back in the day was one of the "gunslingers" at the top of Microsoft who was responsible for their highly immoral and sometimes illegal practices, which consisted of vaporware announcements, exclusivity deals, and general bullying to sabotage any attempts at free market competition in computers. (see here for example). It's hard for me to get back in that mindset now, Microsoft seems so inept these days, but let's not forget it was built on threats, lies, bullying, and stealing (I assume this was mostly Bill's doing).

Nowadays, Nathan Myhrvold is basically enemy #1 if you believe that most patents are evil.

His "Intellectual Ventures" is basically buying up every patent they think they can make money on, and then extracting license fees. He talks a lot of shit about funding invention, but that is an irrelevant facade to hide what they're really doing, which is buying up patents and then forcing people to license under threat of suit.

Even if you support patents you must see that innovation is being stifled because tech startups have to live in fear that the completely obvious algorithm they implemented was patented by someone, and was then bought by some fund that has a team of lawyers hunting for infringers.

For more info see : 1 , 2 , 3

The Coalition for Patent Fairness has been doing some decent work at making small steps towards reform. The CPF is big business and certainly not revolutionary anti-patentites, they just want it to be a bit harder for someone to patent something absurd like "one click ordering" and then to extract huge damages for it. And of course Myhrvold is against them.

05-17-11 | Dialing Out Understeer

Almost all modern cars are tuned for understeer from the factory. Partly because of lawyers, but partly because that's what consumers want. Certain Porsches (ever since the 964) and BMW's (ever since the E46 M3) have been badly tweaked for understeer.

You can mostly dial it back out. I will describe some ways, in order from most preferred to least. Obviously the exact details depend on your model, blah blah blah. I'm just learning all this stuff, I'm no expert, so this is sort of a log of my learnings so far.

In general when trying to get more oversteer you have to be aware of a few issues. Basically you're trying to increase grip in the front and decrease grip in the rear ; you don't want to go so far with decreasing grip in the rear that you decrease overall grip and get much slower. You also don't want to create lift-off oversteer or high speed mid-corner snap oversteer or any of those nasty gremlins.

1. Driving Technique. Go into corners fast, brake hard to load up the front, turn in with a bit of trail brake. This helps a lot. Now move on to :

2. Alignment (Camber & Toe). Basically more camber in front and less camber in rear will increase oversteer, because in a corner the tires are twisted sideways, so by giving the front the ideal camber they will have good contact while cornering, and the rear tires will be on edge and slip. Obviously there's a limit where more camber in front is too much, so the idea is to set the front tires to their ideal camber for maximum grip, and then set the rear somewhere lower to give them less grip. If you want a fun alignment you generally want zero front toe, and just enough rear toe to keep the car stable under braking and in high speed turns (zero rear toe is a bit too lively for anything but autocross use). Note that severe camber on your driven wheels can hurt straight line acceleration.

3. Higher rear tire pressure. This is sort of a temp/hack fix and real racers frown on it, but it is cheap and pretty harmless so it's easy to try. Many people get confused about how tire pressures affect handling, because if you search around the internet you will find some people saying "lower your pressure to decrease grip" and others saying "raise your pressure to decrease grip". The truth is *both* work. Basically there is an ideal pressure at which grip is maximum - changing pressure in either direction decreases grip. However, lower pressure also leads to tires moving on the rim, which is very bad, so if you want to tweak your grip it should always be done by raising pressures (as you lower tire pressure, you get more grip, more grip, then suddenly hit a point where the tires start rolling on the sidewall, which you don't want to get to). So set the front & rear to the ideal pressures, then raise the rear pressure a little to reduce grip in the rear. (for example E46 M3 is supposed to be good at 35F and 42R)

4. Sway bars or spring rates (stiffer in rear and softer in front). You get more oversteer from a car with a stiff rear. Basically stiffer = less grip, it means the whole end of the car will slide rather than the wheels acting independently and sticking (this is why Lotus and McLaren don't use sway bars at all). You don't want to overdo this as a way to get oversteer, because it makes the ride harder (a stiffer sway is just a form of stiffer spring), and it also just reduces overall grip (unless your sways were so severe that you were leaning and getting out of camber, in which case stiffer can mean more grip). But many OEM cars are shipped much stiffer in the front than the rear - they are dialed to have more grip in back and not enough in front, so you can undo that. Note that lots of "tuners" just mindlessly put bigger sways on front and back, when in fact the OEM front bar might be just fine, and putting on a stiffer front just makes things worse. BTW a lot of people make the mistake of just going stiff, thinking stiffer = better handling; in fact you want some weight transfer because weight transfer is what gives you control over the grip. The ideal car will either oversteer or understeer depending on what you do to it, and weight transfer lets you do that.

5. Narrower rear tires. Basically undoing the staggered setup a bit. This obviously decreases max grip in the rear. Many cars now are shipped on rear tires that are really too big. See below for more notes.

6. Wider front tires. Going wider in the front as part of a package of going narrower in the rear may make sense; for example a lot of cars now are sold on a 235/265 stagger, and they may do well on a 245/245 square setup. However, many people mistakenly think wider = faster, and will do something like change the 235/265 to 265/265. Wider is not always better, particularly in the front. It makes turning feel heavier and makes turning response not as sharp. It produces more tire scrubbing in low speed full lock turn. It takes the tires longer to heat up, so it can actually make you slower in autocross scenarios. It makes the tires heavier which makes you slower.

Beware copying racers' setups.

Racers run way more negative camber, like -3 to -5 degrees. That's partly because they are taking hard turns all the time, with few straights, but it's also because they are running true competition slicks, which are a very different tire compound and need/want the greater camber.

Racers run much wider tires. This is good for them for a few reasons. One is they never actually make very sharp turns (most race cars have horrible turning radii), they only ever turn the wheels slightly, so they don't need the nimbleness of narrow tires. The other is that they are going so fast that they can warm up the wide tires - under street or autocross type use very wide tires never come up to temp and thus can actually have less grip than narrow tires.

Also more generally, race cars are usually set up because of the weird specs and rules of their class, not because it's the best way for them to be set up.

A collection of quotes that I found informative :

"The way you take a corner in a 911 is brake in a straight line before
entering the corner and get your right foot on the throttle before
turning into the corner. Use light changes on the throttle to keep the
rear end stable and use weight transfer to control understeer/oversteer.
The front end will bite and turn in. At or before Apex, start rolling in
throttle. Most corners you can be at full throttle before corner exit.
There is not another car out there that can come off a corner as fast as
these cars, but a lot of cars that can enter faster. You do not want to
drive it just trying to push through a corner like a front engine
understeering car."

"I race a FWD 1999 Honda Civic in the SSC class in SCCA
races. The trick to get FWD cars to rotate is to pump up the rear tires.
The rules won't let us modify suspensions and, in the Civic, there's
very little camber available, so we run with cold air pressures of 33F
and 37R. In a Neon I used to race, I would run with over 40 psi in the

"When I got my 2004 M3 I played with air pressures and ended up setting
them at 35F and 42R. This was on a car with the factory alignment and
Michelin Pilot Sports. At the Summit Point track in WV, with these air
pressures, I got NO understeer in the M3. Mr. B"

"To use oversteer to rotate your car prior to the apex you turn in early
and trail brake hard. The heavy braking while turning shifts your weight
forward reducing traction in the rear which induces oversteer. You
better not plan on lifting to catch the car though or you will be in the
weeds. The right thing to do is to transition from trail braking to
fairly heavy throttle (depending on the corner) to shift the weight back
to the rear taking the car from oversteer to neutral at the apex.

Because you come into the corner early and fast and brake very late this
can be very fast but it's not for beginners. One false move and you are
in a world of hurt. 

If you really feel the need to change your car to help with this the
best way to help the weight transfer. If you have 2 way adjustable
shocks increase rebound in back allowing the rear to lift more under
braking or decrease compression in front allowing the it to dip more or

If you don't have adjustable shocks do something to increase turn in
like softening the front bar or stiffening the rear. The downside is
that this will also increase oversteer under acceleration at and after
the apex. "

"Same principles as the M3 game, Ron. For less understeer: more camber
in front, less camber in rear, higher pressure in rear, less pressure in
front. Anything to increase grip in front and reduce grip in the rear
will result in more neutral handling. Of course, you can go too far in
one direction and create an oversteering monster a la 930s, et al."

"My car is my daily driver, and I care how quiet and comfortable it is.
I think that all camber plates have monoball mountings, so there is NO
rubber bushing at the top of the strut."

"There is almorst no increase in noise & rattle with the Tarett camber
plates. removing the rubber makes a big difference in turn in and
getting the car to take a set."

"If I remember right when we built the Spec Boxster we were
after 3.5 degrees in the front. You will pick up lots of camber when you
lower the car. I thick we were at 1.9 degrees without any shims. After
that the rule of thumb was .1 degrees for every mm of shim. I thick we
went with 16mm of shim to get our 3.5 degrees."

"I'm an instructor with LCAs and only use 1.8 in front. It's enough to
really help the tire life. I'd be faster with more camber but I drive
the car a lot on regular roads and don't want to muck up the everyday

"Track driving is a different story with different settings. Don't fall
into the "Negative-camber-mania" accompanied by excessive lowering"

"Worst case is a lowered car on stock bars, which will have lots of body
roll and the camber will go positive (or less negative) with only a bit
of compression. Add that to the positive camber from body roll and your
outside front tire could go several degrees positive in a hard turn. "

"On my car--light weight with fairly stiff suspension and not overly
lowered--I couldn't even use -1.5° of static front camber at street
speeds. It was cornering so flat that it wasn't "using up" the camber it
had. Front end bite was actually much better with -.8° front camber. At
the track, I'd probably want that -1.5° or even more."

"collectively- we spend a lot of time tuning our cars for the "ultimate
set up" with high amounts of camber- stiffness in the sway bars etc.
this is a DRY set up and will in fact get you in trouble on a wet

"You'll find that you lose a lot of feel with the 26/24mm sways because
the car isn't rolling at all. That probably makes it faster, but IMHO
it's not as fun. And without body roll the car can seem a bit more
unpredictable /unstable."

"My understanding, and I'll confirm once I get lsd, is that you should
run without a rear sway if you don't have lsd, run with rear sway if you

" Take out that front sway and put the stock one back in. Then put the
rear sway on full stiff. I don't understand why so many people continue
to put stiffer front sways on the front of the Turbo's. Even on full
soft, it's got to be well stiffer than the stock unit. You just
increased your front spring rate (as springs and sways work together)
and added understeer. Especially H&R's which are pretty darn stiff.

"FWIW I ran the GT3 rear bar (full stiff) with an OEM M030 front on my
car when I was on the stock M030 dampers and H&R springs and the car
drove great compared to when the stock rear bar was on there. Helped the
car rotate, reduces the understeer some and was a cheap solution to
making the car handle better with what was on the car at the time.
Totally worth the time and money if you ask me"

" Steve and others, I've not had good results running less negative
camber up front. My car is AWD still, and I played around with a
different setting earlier this year with horrible "Put it on the
trailer" results... Setting the car up with as little as 1/4 less
negative camber up front than the rear made the car a handful to drive,
to the point that I was off course twice on highspeed corners... I am
having my best results running about .4-.5 degrees MORE negative camber
up front than in the rear, and tend to float around at -3.0 to -2.8
upfront and -2.5 to -2.2 in the rear. This is on my 18X9 and 18X12 CCWs
with NT01s on. I also run the front swaybar in the softest setting and
the rear in the middle of three settings... If I want more rear
rotation, I tend to go stiffer to the inside hole. 

I've also raised my rear ride height by about 3/8th inch and that has
helped with high speed braking a little bit... I think that additional
height may have put the rear wing in the air a little more to help with
some downforce out back... Can't wait to get the GT2 nose on the car to
see how it balances things out... " 

"My observation is that the rear sway setting affects your ride quality
quite a bit, more than the front. And the front setting affects steering

05-15-11 | SSD's

There's been some recent rumor-mongering about high failure rates of SSD's. I have no idea if it's true or not, but I will offer a few tips :

1. Never ever defrag an SSD. Your OS may have an automatic defrag, make sure it is off.

2. Make sure you have a new enough driver that supports TRIM. Make sure you aren't running your SSD in RAID or something like that which breaks TRIM support for some drivers/chipsets.

3. If you use an Intel SSD you can get the Intel SSD Toolbox ; if you have your computer set up correctly, it should not do anything, eg. if you run the "optimize" it should just finish immediately, but it will tell you if you've borked things up.

4. If you have one of the old SSD's with bad firmware or drivers (not Intel), then you may need to do more. Many of them didn't balance writes properly. Probably your best move is just to buy a new Intel and move your data (never try to save money on disks), but in the mean time, most of the manufacturers offer some kind of "optimize" tool which will go move all the sectors around. For example SuperTalent's Performance Refresh Tool is here .

5. Unnecessary writes *might* be killers for an SSD. One thing you can do is to check out the SMART info on your drive (such as CrystalDiskInfo, or the Intel SSD Optimizer works too), which will tell you the total amount of writes in its lifetime. So far my 128 G drive has seen 600 G of writes. If you see something like 10 TB of writes, that should be a red flag that data is getting rewritten over and over for no good reason, thrashing your drive. So then you might proceed to take some paranoid steps :

Disable virtual memory. Disable superfetch, indexing service, etc. Put firefox's SQL db files on a ram disk. Run "filemon" and watch for any file writes and see who's doing it and stop them. Now it's certainly true that *in theory* if the SSD's wear levelling is working correctly, then you should never have to worry about write endurance with a modern SSD - it's simply not possible to write enough data to overload it, even if you have a runaway app that just sits and writes data all the time, the SSD should not fail for a very long time (see here for example). But why stress the wear leveller when you don't need to? It's sort of like crashing your car into a wall because airbags are very good.

I'm really not a fan of write caching, because I've had too many incidents of crashes causing partial flushes of the cache to corrupt the file system. That may be less of an issue now that crashes are quite rare.

What would really be cool is a properly atomic file system, and then you could cache writes in atoms.

Normally when you talk about an atomic file system you mean the micro-operation is atomic, eg. a file delete or file rename or whatever will either commit correctly or not change anything. But it would also be nice to have atomicity at a higher level.

For example when you run a program installer, it should group all its operations as one "changelist" and then the whole changelist is either committed to the filesystem, or if there is an error somewhere along the line the whole thing fails to commit.

However, not everything should be changelisted. For example when you tell your code editor to "save all" it should save one by one, so that if it crashes during save you get as much as possible. Maybe the ideal thing would be to get a GUI option where you could see "this changelist failed to commit properly, do you want to abort it or commit the partial set?".

05-14-11 | Pots 3

Potting becomes a bit less fun as I start to take it more seriously. It's really nice to do something where you just feel like a noob and everything is learning and you have no expectations. If you fuck up, no big deal, you learned something, and you didn't expect to succeed anyway.

Doing something creative and being a beginner makes me aware of what a dick I've been to other people in similar situations. When people say "hey look what I made, do you like it?" , my usual response is "erm, yeah it's okay". I should be like "yeah, it's wonderful!". I'm always careful about showing too much approval because I think of my approval as a way of directing behavior, but that's just being a dick.

It's getting annoying to go into the studio, so I feel like I have to either get a wheel at home or quit.

Some pots, in roughly chronological order over the last two months or so. Notes under each image.

Oribe on the left, with grooves cut in to try to accentuate the glaze variation. Ox blood on the right, makes a really nice red. The foot of the little red bowl is unpleasantly rough, I need to burnish the feet on the rough stoneware pots.

On the left is yellow salt base over dakota red clay, then I turned it upside down and poured temmoku to make the drippy look; I used a wire rack for that which was not a good method, the glaze bunches up on the rack. On the right is a bowl that I threw way too thick so then I cut away lots of facets; glazed temmoku.

On the left is cobalt stain under clear glaze; the cobalt on white clay body is a great blue; my drawing is horrible but I'd like to try that color again in a non-hand-drawn way. On the right is a crappy cup that I tried the "rub ink into crackle" method; it's okay but it's a pain in the ass.

Some experiments with leaving portions of the piece unglazed. I sanded the bowl a bit, but it's still quite rough, a smooth burnished outside would be better. I was hoping the inside of the bowl would pop with color more, maybe I'll try this idea again.

Experiment with making glaze run. Trying to throw some classical vase shapes. Base dip in yellow salt (white clay body). Then I dipped the rim 4 times in black glaze, dip, wait a bit for it to dry, dip again. Before firing it was a clean line on the rim, the idea is to get it to run in the firing. Pretty successful I think; I really like the unexpected organic things that happen in the firing to lock fluid flow patterns into color, so I'm trying to find more ways to get that. It's crazy how much the pot shrinks between throwing and finished - it shrinks in bisque, then shrinks again in glaze firing ; I thought this pot was a nice big size when I threw it, but it came out awkwardly small.

This is some crap that didn't come out great. I do like the symmetrical shape on the right, might try that again, but taller, and better glaze.

Trying a band of unglazed clay; first iron oxide stain, then wax, then glazes.

I tried a funny technique on this one to try to get some irregular patterns; I dipped the pot in slip after it was bisqued, which you normally wouldn't do, because it doesn't stick well. Glazed in yellow salt.

This one I painted some wax around in a wavy band before glazing, then sand-papered off most of the wax, leaving a thin irregular bit of wax. Then glazed. At that time it looked like it was all covered, the spots only revealed themselves in the kiln. Shino glaze with the pour-on method to create irregularities in thickness.

05-13-11 | Steady State Fallacy

One of the big paradigm shifts in data compression was around the mid 90's, people stopped talking about the steady state.

All the early papers had proofs of "asymptotic optimality" , that given infinite data streams from a static finite model, the compressor would converge to the entropy. This was considered an important thing to prove and it was done for lots of algorithms (LZ7x,PPM,CONTEXT, etc).

But in the real world, that asymptote never arrives. Most data is simply too small, and even huge data sets usually don't have static generators, rather the generating model changes over the course of the data, or perhaps it is a static model but it has internal hidden switching states that are very hard for the compressor to model so in practice you get a better result by treating it as a non-static model. We now know that the performance during the "learning phase" is all that matters, since all data is always in transition (even if the majority of the model becomes stable, the leaves are always sparse - I wrote about this before, generally you want the model to keep growing so that the leaves are always right on the boundary of being too sparse to make a reliable model from).

So modern papers rarely bother to prove asymptotic optimality.

Today I realized that I make the same mistake in decision making in life.

In general when something does not go well, my solution is to just not go back. Say some friend is shitty to me, I just stop seeing them. Some restaurant offers me a horrible table even though I see better ones sitting empty, I just won't go to that restaurant again. Some car mechanic way over-charges me, I just won't go to that mechanic again. My landlord is a nutter, I'll just move out. The idea of this behavior is that by cutting out the bad agent eventually you get into a state with only good agents in your life.

But in reality that steady state never arrives. Unless you lock yourself in a cave, you are always getting into new situations, new agents are injected into your life, and the good agents drop out, so you are always in the transition phase.

Or you know, during unpleasant phases, you think "ok I'll just work a lot right now and life will suck but then it will be better after" or "ok I'm injured it sucks I'll just do a lot of rehab and it will be better after" or "it's gray winter and life will suck but it will be better after". But the after never actually comes. There's always some new reason why now is the "hard time".

05-13-11 | Avoiding Thread Switches

A very common threading model is to have a thread for each type of task. eg. maybe you have a Physics Thread, Ray Cast thread, AI decision thread, Render Thread, an IO thread, Prefetcher thread, etc. Each one services requests to do a specific type of task. This is good for instruction cache (if the threads get big batches of things to work on).

While this is conceptually simple (and can be easier to code if you use TLS, but that is an illusion, it's not actually simpler than fully reentrant code in the long term), if the tasks have dependencies on each other, it can create very complex flow with lots of thread switches. eg. thread A does something, thread B waits on that task, when it finishes thread B wakes up and does something, then thread A and C can go, etc. Lots of switching.

"Worklets" or mini work items which have dependencies and a work function pointer can make this a lot better. Basically rather than thread-switching away to do the work that depended on you, you do it immediately on your thread.
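A minimal sketch of what I mean by a worklet (hypothetical structure, in Python for brevity): a work item carries its function pointer and a count of unfinished dependencies, and whoever completes the last dependency runs the dependent right there on the current thread instead of waking another one.

```python
import threading

class Worklet:
    """A small work item that runs when all its dependencies have completed.

    Sketch only : assumes the whole dependency graph is wired up before any
    root worklet's run() is called."""
    def __init__(self, func, deps=()):
        self.func = func
        self.pending = len(deps)    # count of unfinished dependencies
        self.dependents = []        # worklets waiting on us
        self.lock = threading.Lock()
        for d in deps:
            d.dependents.append(self)

    def dep_done(self):
        with self.lock:
            self.pending -= 1
            ready = (self.pending == 0)
        if ready:
            self.run()

    def run(self):
        # runs on whatever thread called us -- no switch to a dedicated thread
        self.func()
        for d in self.dependents:
            d.dep_done()
```

Usage: build the graph, then run() the roots; everything downstream executes inline on the completing thread, in dependency order.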

I started thinking about this situation :

A very simple IO task goes something like this :

Prefetcher thread :

  issue open file A

IO thread :

  execute open file A

Prefetcher thread :

  get size of file A
  malloc buffer of size
  issue read file A into buffer
  issue close file A

IO thread :

  do read on file A
  do close file A

Prefetcher thread :

  register file A to prefetched list

lots of thread switching back and forth as they finish tasks that the next one is waiting on.

The obvious/hacky solution is to create larger IO thread work items, eg. instead of just having "open" and "read" you could make a single operation that does "open, malloc, read, close" to avoid so much thread switching.

But that's really just a band-aid for a general problem. And if you keep doing that you wind up turning all your different systems into "IO thread work items". (eg. you wind up creating a single work item that's "open, read, decompress, parse animation tree, instantiate character"). Yay you've reduced the thread switching by ruining task granularity.

The real solution is to be able to run any type of item on the thread and to immediately execute them. Instead of putting your thread to sleep and waking up another one that can now do work, you just grab his work and do it. So you might have something like :

Prefetcher thread :

  queue work items to prefetch file A
  work items depend on IO so I can't do anything and go to sleep

IO thread :

  execute open file A

  [check for pending prefetcher work items]
  do work item :

  get size of file A
  malloc buffer of size
  issue read file A into buffer
  issue close file A

  do IO thread work :

  do read on file A
  do close file A

  [check for pending prefetcher work items]
  do work item :

  register file A to prefetched list

so we stay on the IO thread and just pop off prefetcher work items that depended on us and were waiting for us to be able to run.
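Here's a runnable toy of that flow (all names hypothetical, and the per-worklet dependency bookkeeping is elided; the only point being demonstrated is which thread the dependent work actually runs on):

```python
import queue
import threading

io_q = queue.Queue()            # IO ops for the IO thread to execute
continuations = queue.Queue()   # prefetcher worklets that depend on those ops
ran_on = []                     # (task, thread name) pairs, for illustration

def io_thread_main():
    while True:
        op = io_q.get()
        if op is not None:
            ran_on.append((op, threading.current_thread().name))
        # instead of waking the prefetcher thread, pop off its pending
        # worklets and run them right here on the IO thread
        while not continuations.empty():
            ran_on.append((continuations.get(), threading.current_thread().name))
        if op is None:
            return

def prefetcher_main():
    # queue the IO ops plus the work that depends on them, then go to sleep
    # (real code would gate each worklet on its IO op actually completing)
    io_q.put("open A")
    continuations.put("get size / malloc / issue read A")
    io_q.put("read A")
    continuations.put("register A to prefetched list")
    io_q.put(None)              # shut down

io = threading.Thread(target=io_thread_main, name="io")
pf = threading.Thread(target=prefetcher_main, name="prefetch")
io.start(); pf.start()
pf.join(); io.join()
# every entry in ran_on executed on the "io" thread -- zero switches back
```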

More generally if you want to be super optimal there are complicated issues to consider :

i-cache thrashing vs. d-cache thrashing :

If we imagine the simple conceptual model that we have a data packet (or packets) and we want to do various types of work on it, you could prefer to follow one data packet through its chain of work, doing different types of work (thrashing i-cache) but working on the same data item, or you could try to do lots of the same type of work (good for i-cache) on lots of different data items.

Certainly in some cases (SPU and GPU) it is much better to keep i-cache coherent, do lots of the same type of work. But this brings up another issue :

Throughput vs. latency :

You can generally optimize for throughput (getting lots of items through with a minimum average time), or latency (minimizing the time for any one item to get from "issued" to "done"). To minimize latency you would prefer the "data coherent" model - that is, for a given data item, do all the tasks on it. For maximum throughput you generally prefer "task coherent" - that is, do all the data items for each type of task, then move on to the next task. This can however create huge latency before a single item gets out.


Let me say this in another way.

Say thread A is doing some task and when it finishes it will fire some Event (in Windows parlance). You want to do something when that Event fires.

One way to do this is to put your thread to sleep waiting on that Event. Then when the event fires, the kernel will check a list of threads waiting on that event and run them.

But sometimes what you would rather do is to enqueue a function pointer onto that Event. Then you'd like the Kernel to check for any functions to run when the Event is fired and run them immediately on the context of the firing thread.

I don't know of a way to do this in general on normal OS's.

Almost every OS, however, recognizes the value of this type of model, and provides it for the special case of IO, with some kind of IO completion callback mechanism. (for example, Windows has APC's, but you cannot control when an APC will be run; QueueUserAPC lets you queue one to another thread, but it only actually runs when that thread enters an alertable wait, which in practice mostly means on IO completion).

However, I've always found that writing IO code using IO completion callbacks is a huge pain in the ass, and is very unpopular for that reason.

05-08-11 | Torque vs Horsepower

This sometimes confuses me, and certainly confuses a lot of other people, so let's go through it a bit.

I'm also motivated by this page : Torque and Horsepower - A Primer in which Bruce says some things that are slightly imprecise in a scientific sense but are in fact correct. Then this A-hole Thomas Barber responds with a dickish pedantic correction which adds nothing to our understanding.

We're going to talk about car engines, the goal is to develop sort of an intuition of what the numbers mean. If you look on Wikipedia or whatever there will be some frequently copy-pasted story about James Watt and horses pulling things and it's all totally irrelevant. We're not using our car engine to power a generator or grind corn or whatever. We want acceleration.

The horizontal acceleration of the car is proportional to the angular acceleration of the wheels (by the circumference of the wheels). The angular acceleration of the wheels is proportional to the angular acceleration of the flywheel, modulo the gear ratio in the transmission. The angular acceleration of the flywheel is proportional to the torque of the engine, modulo moment of inertia.

For a fixed gear ratio :

torque (at the engine) ~= vehicle acceleration

(where ~= means proportional)

So if we all had no transmission, then all we would care about is torque and horsepower could go stuff itself.

But we do have transmissions, so how does that come into play?

To maximize vehicle acceleration you want to maximize torque at the wheels, which means you want to maximize

vehicle acceleration ~= torque (at the engine) * gear ratio

where gear ratio is higher in lower gears, that is, gear ratio is the number of times the engine turns for one turn of the wheels :

gear ratio = (engine rpm) / (wheel rpm)

which means we can write :

vehicle acceleration ~= torque (at the engine) * (engine rpm) / (wheel rpm)

thus at any given vehicle speed (eg. wheel rpm held constant), you maximize acceleration by maximizing [ torque (at the engine) * (engine rpm) ] . But this is just "horsepower" (or more generally we should just say "power"). That is :

horsepower ~= torque (at the engine) * (engine rpm)

vehicle acceleration ~= horsepower / (wheel rpm)

Note that we don't have to say that the power is measured at the engine, because due to conservation of energy the power production must be the same no matter how you measure it (unlike torque which is different at the crank and at the wheels). Power is of course the energy production per unit time, or if you like it's the rate that work can be done. Work is force over distance, so Power is just ~= Force * velocity. So if you like :

horsepower ~= torque (at the engine) * (engine rpm)

horsepower ~= torque (at the wheels) * (wheel rpm)

horsepower ~= vehicle acceleration * vehicle speed

(note this is only true assuming no dissipative forces; in the real world the power at the engine is greater than the power at the wheels, and that is greater than the power measured from motion)
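A quick numeric sanity check of those three equivalent forms (hypothetical numbers; the familiar 5252 in hp = torque * rpm / 5252 is just 33000/(2 pi)):

```python
import math

# hypothetical engine/gearing numbers, just to check the algebra
torque_engine = 300.0    # lb-ft at the crank
engine_rpm = 6000.0
ratio = 3.0              # overall gear ratio : engine turns per wheel turn
r_wheel = 1.0            # wheel radius, feet

def hp(torque_lbft, rpm):
    # horsepower from torque and rpm ; the familiar 5252 is 33000/(2*pi)
    return torque_lbft * rpm * 2.0 * math.pi / 33000.0

hp_engine = hp(torque_engine, engine_rpm)

torque_wheel = torque_engine * ratio     # gearing multiplies torque...
wheel_rpm = engine_rpm / ratio           # ...and divides rpm
hp_wheel = hp(torque_wheel, wheel_rpm)

v = wheel_rpm / 60.0 * 2.0 * math.pi * r_wheel   # road speed, ft/s
force = torque_wheel / r_wheel                   # driving force, lbf
hp_motion = force * v / 550.0                    # power = force * velocity

print(hp_engine, hp_wheel, hp_motion)   # all three agree
```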

Now, let's go back to this statement : "any given vehicle speed (eg. wheel rpm held constant), you maximize acceleration by maximizing horsepower". The only degree of freedom you have at constant speed is changing gear. So this just says you want to change gear to maximize horsepower. On most real world engines this means you should be in as low a gear as possible at all times. That is, when drag racing, shift at the red line.

The key thing that some people miss is you are trying to maximize *wheel torque* and in almost every real world engine, the effect of the gear ratio is much more important than the effect of the engine's torque curve. That is, staying in as low a gear as possible (high ratio) is much more important than being at the engine's peak torque.

Let's consider some examples to build our intuition.

The modern lineup of 911's essentially all have the same torque. The Carrera, the GT3, and even the RSR all have around 300 lb-ft of torque. But they have different red lines, 7200, 8400 and 9400.

If we pretend for the moment that the masses are the same, then if you were all cruising along side by side in 2nd gear together and floored it - they would accelerate exactly the same.

The GT3 and RSR would only have an advantage when the Carrera is going to hit red line and has to shift to 3rd, and they can stay in 2nd - then their acceleration will be better by the factor of gear ratios (something like 1.34 X on most 2nd-3rd gears).

Note the *huge* difference in acceleration due to gearing. Even if the upshift got you slightly more torque by putting you in the power band of the engine, the 1.34 X from gearing is way too big to beat.

(I should note that in the real world, not only are the RSR/R/Cup (racing) versions of the GT3 lighter, but they also have a higher final drive ratio and some different gearing, so they are actually faster in all gears. A good mod to the GT3 is to get the Cup gears)

Another example :

Engine A has 200 torques (constant over the rpm range) and revs to 4000 rpm. Engine B has 100 torques and revs to 8000 rpm. They have the exact same peak horsepower (800 torques*krpm) at the top of their rev range. How do they compare ?

Well first of all, we could just gear down Engine B by 2X so that for every two turns it made the output shaft only made one turn. Then the two engines would be exactly identical. So in that sense we should see that horsepower is really the rating of the potential of the engine, whereas torque tells you how well the engine is optimized for the gearing. The higher torque car is essentially steeper geared at the engine.

How do they compare on the same transmission? In 1st gear Car A would pull away with twice the acceleration of Car B. It would continue up to 4000 rpm then have to change gears. Car B would keep running in 1st gear up to 8000 rpm, during which time it would have more acceleration than car A (by the ratio of 1st to 2nd gear).

So which is actually faster to 100 mph ?

You can't answer that without knowing about the transmission. If gear changes took zero time (and there was no problem with traction loss under high acceleration), the faster car would be the higher torque car. In fact if gear changes took zero time you would want an infinite number of gears so that you could keep the car at max rpm at all times, not because you are trying to stay in the "power band" but simply because max rpm means you can use higher gearing to the wheels.

I wrote a little simulator. Using the real transmission ratios from a Porsche 993 :

Transmission Gear Ratios: 3.154, 2.150, 1.560, 1.242, 1.024, 0.820 
Rear Differential Gear Ratio: 3.444 
Rear Tire Size: 255/40/17  (78.64 inch circumference)
Weight : 3000 pounds

and 1/3 of a second to shift, I get :

200 torque, 4000 rpm redline :

time_to_100 = 15.937804

100 torque, 8000 rpm redline :

time_to_100 = 17.853252

higher torque is faster. But what if we can tweak our transmission for our engine? In particular I will make only the final drive ratio free and optimize that with the gear ratios left the same :

200 torque, 4000 rpm redline :

c_differential_ratio = 3.631966
time_to_100 = 15.734542

100 torque, 8000 rpm redline :

c_differential_ratio = 7.263932
time_to_100 = 15.734542

exact same times, as they should be, since the power output is the same, with double the gear ratio.
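Here's a sketch of that kind of simulator (my own reconstruction, not the original code: constant torque up to redline, fixed shift time, no aero drag, no traction limit, so the times come out close to but not exactly the numbers above):

```python
import math

# 993 gearing and the setup from the post
GEARS = [3.154, 2.150, 1.560, 1.242, 1.024, 0.820]
FINAL = 3.444
CIRC_FT = 78.64 / 12.0          # tire circumference in feet
WEIGHT_LB = 3000.0
SHIFT_S = 1.0 / 3.0             # time lost per shift, seconds
G = 32.174                      # ft/s^2 ; converts pounds to slugs

def time_to_100(torque_lbft, redline_rpm, final_drive=FINAL):
    mass = WEIGHT_LB / G                  # slugs
    radius = CIRC_FT / (2.0 * math.pi)    # feet
    target = 100.0 * 5280.0 / 3600.0      # 100 mph in ft/s
    v, t, gear, dt = 0.0, 0.0, 0, 1e-3
    while v < target:
        rpm = (v / CIRC_FT) * 60.0 * GEARS[gear] * final_drive
        if rpm >= redline_rpm and gear + 1 < len(GEARS):
            gear += 1
            t += SHIFT_S                  # coasting while shifting
            continue
        force = torque_lbft * GEARS[gear] * final_drive / radius  # lbf at the road
        v += (force / mass) * dt
        t += dt
    return t

print(time_to_100(200, 4000))                         # high torque : faster on stock gears
print(time_to_100(100, 8000))                         # same power, low torque : slower
print(time_to_100(100, 8000, final_drive=2 * FINAL))  # regear it : identical to the first
```

Doubling the final drive of the half-torque engine makes the two trajectories identical step for step, which is the "horsepower is the potential, torque tells you how it's geared" point.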

In the real world, almost every OEM transmission is geared too low for an enthusiast driver. OEMs offer transmissions that minimize the number of shifts, offer over-drive gears for quiet and economy, etc. If you have a choice you almost always want to gear up. This is one reason why in the real world torque is king ; low-torque high-power engines could be good if you had sufficiently high gearing, but that high gearing just doesn't exist (*), so the alternative is to boost your torque.

(* = drag racers build custom gear boxes to optimize their gearing ; there are also various practical reasons why the gear ratios in cars are limited to the typical range they are in ; you can't have too many teeth, because you want the gears to be reasonably small in size but also have a minimum thickness of teeth for strength, high gear ratios tend to produce a lot of whine that people don't like, etc. etc.)

One practical issue with this nowadays is that more and more sports cars use "transaxles". Older cars usually had the transmission up front and then a rear differential. It was easy to change the final drive ratio in the rear differential so all the old American muscle cars talk about running a 4.33 or whatever different ratios. Nowadays lots of cars have the transmission and rear differential together in the back to balance weight (from the Porsche 944 design). While that is mostly a cool thing, it makes changing the final drive much more expensive and much harder to find gears for. But it is still one of the best mods you can do for any serious driver.

(another reason that car gear ratios suck so bad is that the emphasis on 0-60 times means that you absolutely have to be able to reach 60 in 2nd gear. That means 1st and 2nd can't be too high ratio. Without that constraint you might actually want 2nd to max out at 50 mph or something. There are other stupid goals that muck up gearings, like trying to achieve a high top speed).

Let's look at a final interesting case. Drag racers often use a formula like :

speed at end of 1/4 mile :

MPH = 234 * (Horsepower / Pounds) ^ .3333

and it is amazingly accurate. And yet it doesn't contain anything about torque or gear ratios. (they of course also use much more complex calculators that take everything into account). How does this work ?

A properly set up drag car is essentially running at power peak the whole time. They start off the line at high revs, and then the transmission is custom geared to keep the engine in power band, so it's a reasonable approximation to assume constant power the entire time.

So if you have constant power, then :

  d/dt E = P

  d/dt ( 1/2 mv^2 ) = P

  integrate :

  1/2 mv^2 = P * t

  v^2 = 2 * (P/m) * t 

  a is not constant here (x = 1/2 a t^2 only holds for constant a),
  so get distance by integrating v :

  x = integral of v dt = integral of sqrt( 2 * (P/m) * t ) dt

  x = (2/3) * sqrt( 2 * (P/m) ) * t^(3/2)

  eliminate t using t = v^2 / ( 2 * (P/m) ) :

  x = v^3 / ( 3 * (P/m) )

  simplify :

  v = ( 3 * x * (P/m) ) ^(1/3)

which is the drag racer's formula. Speed is proportional to distance covered times power-to-weight to the one third power.
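As a check (my own sketch): numerically integrating constant-power motion, a = (P/m)/v, lands right on the closed form v = (3 x (P/m))^(1/3). Note the frictionless constant works out to roughly 281 mph * (hp/lb)^(1/3) rather than the empirical 234; real cars give the difference back to drivetrain losses, aero drag, and the launch.

```python
import math

P_HP, WEIGHT_LB = 400.0, 3000.0      # hypothetical drag car
P = P_HP * 550.0                     # power in ft*lbf/s
m = WEIGHT_LB / 32.174               # mass in slugs
QUARTER_FT = 1320.0                  # quarter mile in feet

# start just off the line, on the analytic curve, to dodge the v=0 singularity
x = 1.0
v = (3.0 * x * (P / m)) ** (1.0 / 3.0)
dt = 1e-4
while x < QUARTER_FT:
    v += (P / m) / v * dt            # a = (P/m) / v : constant power
    x += v * dt

v_closed = (3.0 * QUARTER_FT * (P / m)) ** (1.0 / 3.0)
print(v / 1.46667, v_closed / 1.46667)   # trap speed in mph ; both about 143-144
```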

If you're looking at "what is the time to reach X" (X being some distance or some mph), the only thing that matters is power-to-weight *assuming* the transmission has been optimized for the engine.

I think there's more to say about this, but I'm bored of this topic.


Currently the two figures that we get to describe a car's engine are Horsepower (at peak rpm) and Torque (at peak rpm) (we also get 0-60 and top speed which are super useless).

I propose that the two figures that we'd really like are : Horsepower/weight (at peak rpm) and Horsepower/weight (at min during 10-100 run).

Let me explain why :

(Power/weight) is the only way that power ever actually shows up in the equations of dynamics (in a frictionless world). 220 HP in a 2000 pound car is better than 300 HP in a 3000 pound car. So just show me power to weight. Now, in the real world, the equations of dynamics are somewhat more complicated, so let's address that. One issue is air drag. For fighting air, power (ignoring mass) is needed, so for top speed you would prefer a car with more power than just power to weight. However, for braking and turning, weight is more important. So I propose that it roughly evens out and in the end just showing power to weight is fine.

Now, what about this "Horsepower/weight (at min during 10-100 run)" ? Well let's back up a second. The two numbers that we currently get (Power and Torque both at their peak) give us some rough idea of how broad the power band of an engine is, because Power is almost always at peak near the max rpm, and Torque is usually at peak somewhere around the middle, so a higher torque number (power being equal) indicates a broader power band. But good gearing (or bad gearing) can either hide or exaggerate that problem. For example a tuned Honda VTEC might have a narrow power band that's all in the 7k - 10k RPM range, but with a "crossed" transmission you might be perfectly happy never dropping out of that rev range. Another car might have a wide power band, but really huge gear steps so that you do get a big power drop on shifts.

So what I propose is you run the cars from 10-100 mph, shifting at red line, and measure the *min* horsepower the engine puts out. This will tell you what you really want to know, which is: when doing normal upshifts do you drop out of the power band, and how bad is it? eg. what is the lowest power you will experience.

Of all the numbers that we actually get, quarter mile time is probably the best.

05-03-11 | Image Dehancement

So many of the real estate ads now feature images that look like this :

Vomit. Quit it. Turning the saturation and contrast up to 11 does not make things look better.

05-03-11 | Some Documentaries

Stuff not on Netflix :

"Synth Britannia" was amusing. Search Piratebay for "Synth Britannia" and you will find two different things - one is a collection of music videos, the other is the documentary. Get both. The first significant synth album was by Wendy (born Walter) Carlos. WTF !? And Annie Lennox actually looked great with long hair.

("Prog Britannia" is good too).

"Kochuu" - about Japanese architecture and its relationship to Sweden. Really beautifully made movie, captures the slow contemplative mood of the architecture in the filming itself. Very well done.

"Blowing up Paradise". God damn French.

05-03-11 | Pots 2

Finally got some more stuff out of the kiln. Some of this is months old because the firing is very much not FIFO.

Small bowl (B-mix body, lung chuan glaze, blue celadon drips) :

I like how the lung chuan gets pale blue where it's thick; I also think the drip-on spot technique was pretty successful so I'll play with that more in the future. One bad spot where I had my finger when I dipped it, need to be more careful of that. Lung chuan looks gross on red clay bodies but looks nice on the white body.

Some more practice with bottle / bud vase shapes :

Getting better. I like the red splash, I think I did that with a spoon but I forget, that's an old one.

Cups to practice throwing off the hump :

Unfortunately these cups are all junk because I wasn't careful about the shape of the rim. To be able to drink from a cup comfortably, the rim has to be either perfectly straight up and tapered (ideally straight up on the outside and tapered on the inside), or slightly tipped outward. It's safer to always flare out a tiny bit, because when it dries or fires a straight up lip might shrink inward a bit.

Some notes to self on throwing off the hump : it's useful to at least approximately center the whole lump so it doesn't disturb your eyes as you throw ; once you center a handful, press in a dent below it to clearly define your bottom ; open and then compress the bottom well ; throw as usual but remember you can always move clay from the hump up to your piece if necessary ; to remove : find the bottom of your piece and cut in a dent where you want to cut off, either a good ways down if you want to trim, or directly under and you can avoid trimming ; use the wheel to wrap the wire and pull it through ; now dry your fingers, use first two fingers on each side in a diamond shape, make contact and then use a spin of the wheel to break it free and lift up.

Some bowls :

This is my first batch of bowls that are correctly made in the sense of being reasonably thin and round. I'm horrible at fixing a pot once it's mostly thrown, so the way I get perfectly round bowls is to start perfectly centered and to open perfectly on center. If you can do that, then you just pull up and you wind up with a round bowl. This is not what pro potters do, it takes too long, they tend to just center real approximately and open quickly, and then they can man-handle the pot or cut off a bit to get it into shape.

To open perfectly on center, don't try to plunge your fingers in right at the center, it's impossible, you will always accidentally wiggle to one side or another. Instead, open by pulling down towards yourself as you plunge, and that will be centered automatically.

The other thing of course is to start with the clay centered. I think I wrote this before but the main issue for me is using lots of water and releasing the pressure very gradually. Also I'm now using plastic bats because they have precise bolt holes, the wooden bats all wobble and it's impossible to get a perfect center that way. I've noticed that most pro potters don't use bolts like we do, they attach the bat with a disc of clay, and if you do that you don't get any wobble.

04-27-11 | Things we need

Things that the world needs :

1. A real privacy law in America.

It should be illegal to give my personal information away without my express permission. It's absolutely sickening that my banks and credit cards are selling information about me.

Furthermore, it should be illegal to require personal information for marketing purposes in exchange for discounts. eg. stores that ask for your phone number when you check out, stores that use "club cards" to compel you to give your personal info, etc.

Nobody should be able to check your credit report without your explicit permission. eg. when some credit card company goes to ping your credit info, it should email you and say "is this allowed?" and you can just say no.

2. An academic journal that's free, peer reviewed, and gives the writers full ownership of their own work.

2.A. Information should be free. There's no reason for technical papers to be anything but online, so the costs are near zero, so there is absolutely no excuse for charging money for it. The only paid position is the editor, and that could easily be supported through donations.

2.B. Peer review is important to weed out junk. I would go further actually, I don't think the typical academic peer review structure is good enough. I'd like an organized way for other experts to post official counter arguments or addenda. The best journal I've ever seen is the Physical Review, in which you would frequently see a paper, and then a small counter-paper right after it.

2.C. It's absolutely sickening that major publishers take away the rights of the author. Authors would retain all rights to reproduce, distribute, whatever they want (however, the full free access to the paper could never be revoked).

2.D. I would like to see every paper start with a declaration of whether the authors or their organization have (or will try to get) patents on the described technique. This would also be a spot where they could state that they will not ever patent the work described, or they could donate the work to the defensive patent collection described next.

3. A viral defensive patent collection. I've written about this before, but the idea in brief is to create a pool of patents which is "viral" like the GPL is. That is, you can use absolutely any patent in the pool, if and only if you do not own any patent that is outside the pool. If you don't comply, then the patents in the pool cost money, and that money pays for administration of the pool and law suits against violators and so on. This is probably not realistic due to the necessity of corporations to cross-license, eg. even if someone like Google wanted to be altruistic and do this, they can't because they need a patent portfolio to make cross-license arrangements with the other fuckers.

04-27-11 | Mexico

Just back from Mexico and a few random thoughts I want to record.

We were there during Semana Santa which is this huge Mexican holiday where everyone goes to the beach and parties. Lots of people told us not to go during that time, it was horrible, blah blah blah. Well those people suck, they're the kind of people who say things like "Mexico would be great if it wasn't full of Mexicans". I thought it was very cool to see. For one thing it meant that these gringo tourist towns were actually full of Mexicans, which hides some of their horribleness. It's also just a wonderful lively atmosphere, it's like being in New Orleans for Mardi Gras or something. Obviously you have to be smart about where you book your hotel - not right on a busy beach, not right on the town square - but other than that it's fine.

It was really cool seeing all the families. So the families come to the beach and find a table at a palapa and just sit there all day. Mexican restaurant hospitality believes in never rushing the patron or making them feel like they have to leave, so you can just sit there all day. The families even often bring their own food and drink, and you just have to order a tiny bit from the restaurant over the course of the day. Then the parents just sit there and the kids run around and play all day. It's an absolutely wonderful system.

It's so good for me just to be away from computers and television. And it's lovely to be somewhere warm where you can be outside all the time. I love the evening promenade, after the sun is down and it's cool enough, everyone strolls around town or hangs out in the square. It's tragic that as soon as I get home I immediately go for the TV and computer and fuck myself over. I have such peace when I wake up in the morning and sit on the porch and I know that all I can do right then is sit there because there is no computer, there is no TV.

Whenever I get back from a third world country, it strikes me how little freedom we actually have in the US. We are an incredibly repressed people. For example, during Semana Santa tons of people descend on the beaches and camp. Many of the places that they camp are technically private property or even public beaches but with forbidden camping; it's allowed because there's just not very much enforcement and because it's tradition. If a house is abandoned or something of course you move into it. If you're poor, you can make work by selling stuff on the beach, or opening up a food cart.

The latest trend I'm seeing in America that really bothers me is that even parking lots are gated or chained off. God forbid somebody goes on your parking lot when you're not using it. In Seattle city area I don't think there's a single parking lot that isn't fenced/chained at night. There's this oppressive feeling that you have to stay within the lines all the time.

We spent most of the time in the Costa Alegre area, a huge, empty, dry, hot space. It sort of reminded me of the trips to Mexico we used to take when I was a kid. We went to Isla Mujeres primarily, but it was 20 years ago and it was very different than it is now. There was only one big hotel for gringos, and the rest of the town was still local fisherman or a few small Mexican hotels (concrete boxes with no AC, very cheap). We went every summer for many years, and it was gradually changing, getting more developed, but the big change happened when some huge hurricane wiped the island clean, and of course it was rebuilt by the big money tourism developers, and now it's like a Disneyland styrofoam facsimile of a "Mexican island village".

There are lots of little farm towns around the Costa Alegre area. They're dusty and poor and feel like time moves very slowly there. It occurs to me that little farm towns like these are all over Mexico; I have no idea if this is true, but it feels like the towns are full of children and old people, like the middle aged have all left to seek work. It's sort of tragic for the world that it's not possible for people to make a living in these little farm towns. The amount of money they need to make to stay in the town is so tiny too, probably only a dollar a day or so, and they can't even make that.

It seems to me that it's in the best interest of the whole world to make subsistence farming viable. It would prevent mass immigrations and refugee problems. If you are anti-illegal-immigrant in the US, forget building a wall, what you need to do is stop the US government from subsidizing crop prices and domestic agribusiness.

The other issue is that the wonderful handicrafts they make in these towns just can't support them. There's lots of wonderful weavers, leather workers, etc. with great skills, and in theory they could make that stuff and sell it to the city folk, but they just can't sell enough of it for high enough prices. The problem is that western consumers just don't want quality hand made stuff, we want stuff that's the latest trend, and we don't care if it's cheap factory made junk. Certainly part of the problem is that the country handiworkers don't cater to modern tastes enough; I feel like there is an opportunity there for some charity to work with the small town craftsmen to teach them how to make things more aligned with what western consumers want. But it's not clear that would make a big difference, there just aren't enough consumers who care about getting a hand made belt vs a machine made one.

Anyway, I'm just amazed by the tacos you get for fifty cents. The tortillas are always made to order, even at the most ghetto road side cart, of course they make the tortillas to order! Someone (usually an older woman) grabs a handful of masa, slaps it around to shape it, presses it, and tosses it on the griddle (comal). The cooked tortilla has a subtle flavor of malty browned corn. My favorite thing was probably the quesadillas (which aren't really like our quesadillas at all), because they use a larger thicker tortilla which lets you get more of the flavor of the masa.

Some random impressions of my youth in Mexico :

Packs of wild dogs. On Isla Mujeres there was a pack of some 20 dogs or so that roamed the streets, constantly barking, stopping to smell things then running to catch up, breaking into sudden vicious fights with each other, fucking each other, snarling. It was variable in size, because dog ownership in Mexico is sort of a gray area; some people own dogs like Americans do, coddling them and keeping them indoors and feeding them, but others own them in a looser way, sort of just allowing a semi-wild dog to hang around, and the semi-wild dog might run off and hang with the pack for a while before coming home. We would chase them around town, or be chased by them, always sort of afraid of their wildness and fascinated by the way they could roam the streets and nobody seemed to be bothered much by it.

Rebar. Every building is ready to have another floor added. There's a shop on the first floor, the second floor is poured concrete, but unfinished and empty, and above that rebar juts into the sky. Everywhere, even on buildings that look perfectly finished, there's odd bits of rebar sticking out. Concrete walls always have a bit of extra rebar sticking out in case you want to go higher.

Wonderful loose hand-woven hammocks. I think the best hotel we ever stayed at was a desperation find after something else didn't work out. It was a concrete box with one wall missing facing the sea. There were beds, but also piles of giant cockroaches, cockroaches the size of a deck of playing cards, and it was hot as hell, so nobody slept in the beds. You can always tell you're in a real proper Mexican hotel when there are hammock hooks drilled into the walls. We all slept in hammocks and the sea air blew in through the side of the room that had no wall and it was delightful. (delightful other than the fact that you had to walk down to the ground floor to operate a hand pump to get water pressure up to the third floor).

I remember eating hamburgers a lot. And the great Cristal sodas (Toronja por favor, to which they would always say "Naranja?" and I would have to say "no, toe-ron-ja"). In the summer a little travelling carnival would come to town for a few weeks; the rides were all hand made, rickety, mostly human-powered. One of my favorites was this large wooden box that was held off the ground by an axle through the middle of the box. You would sit inside and then two strong men would just grab the outside of the box and rock it back and forth increasingly hard until it spun all the way around.

04-12-11 | Some TV Reviews

Circus - PBS reality show about a shitty circus ; they don't really do a good job of tapping into the crack-rock capabilities of reality TV, it's too much like an interview/documentary. The biggest problem is that the Circus managers and the head clown and all the people in power are just giant douchebags, they're literally like the Office Space manager, "I'm going to need you to work through the weekend mmmkay" and you just hate them, but not in a compelling reality TV way, more in just a "I don't want to watch this any more" way. Anyhoo, it's saved by some of the Circus performers, who are just such sympathetic characters, so full of art and passion, such as Sarah Schwarz. Meh, can't recommend it, but if you fast-forward extensively it's okay.

Luther - Idris Elba (Stringer Bell) as the hammy over-acted ever-so-intense John Luther. It's enjoyable and well made, though rather heavy on the gruesomeness and trying too hard to be serious all the time. Yeah it's ridiculous, over the top, cliched, but Idris has an awful lot of charisma and carries it pretty well.

Sherlock - modern resetting of some Holmes stories ; at first it seems appealing, because it is at least trying to be intellectual; unfortunately it's only "smart" on a wafer-thin vapid surface level. Tolerable but not worth seeking out.

Long Way Round / Long Way Down - Ewan McGregor motorbikes around the world. Actually, to be more accurate I should say, "Ewan McGregor plans to motorbike around the world in agonizing detail, whines about every little inconvenience, travels with a massive support crew, and then goes on and on about what a difficult thing it is". The main thing that struck me was how someone who's so rich and outgoing can be so bad at life. They seem to have no concept of how to travel. Point of advice #1 would be don't bring a film crew. And then #2 is don't plan out an hourly itinerary for yourself in advance. The whole show is just sort of upsetting because you watch them botching their journey so badly, just constantly worrying about getting to the next checkpoint (or coming across a small puddle and calling in the support team to help them across it). It's pathetic.

I did find watching Ewan somewhat inspiring though. Actually I got the same thing from Michael Palin in his travel shows. You can watch both of them turn on the performance personality when it's called for, even when they don't really want to, they're tired, they're embarrassed, it's a situation where a normal person like me would just beg off "no no, I can't" but you watch them sort of take a deep breath and steel themselves and get on with it, make a speech or sing a song or whatever it is that's called for. Performance situations happen constantly in life, not just on stage, you come across a moment when the appropriate thing to do is to tell a funny story, or to dance, or whatever you should do to smooth out the situation or make it fun for everyone in that moment, and unless you're a psychopath, much of the time you won't really want to do it or you'll be embarrassed or afraid or whatever, so the question is whether you just get on with it, or whether you wuss out and be bland and boring.

They both also do a very good job of talking to people in a nice way but without sacrificing their own dignity. You know like when you're in a weird place and you talk to some guy and he turns out to be a rabid racist or some awkward shit, there are two easy ways to deal with it which most people fall back on (1) is to just pretend that you agree with what he's saying, lots of nodding and mm-hmm, (2) is to just go "you're a nutter" or get sarcastic or just generally not engage with him any more. The hard thing is to stay engaged and talk to the guy in a joking, friendly manner, but without agreeing, and even making it clear that you disagree. I'm very impressed with that.

04-08-11 | Friday Filters

So in the last post we made the analysis filters for given synthesis filters that minimize L2 error.

Another common case is if you have a set of discrete data and wish to preserve the original values exactly. That is, you want to make your synthesis interpolate the original points.

(obviously some synthesis filters like sinc and box are inherently interpolating, but others like gauss and cubic are not).

So, the construction proceeds as before, but is actually simpler. Rather than using the hat overlaps we only need the values of the synthesis at the lattice points. That is :

Given synthesis/reconstruction filter S(t)
and discrete points P_i
Find points Q_i such that :

f(t) = Sum_i Q_i * S(t - i)

f(j) = P_j for all integers j

We can write this as a matrix equation :

S_ij = S(j - i) = S(i - j)

S * Q = P

note that S is band-diagonal. This is the exact same kind of matrix problem that you get when solving for B-splines.

In general, if you care about the domain boundary effects, you have to construct this matrix and solve it somehow. (matrix inversion is a pretty bad way, there are fast ways to solve sparse band-diagonal matrices).

However, if the domain is large and you don't care too much about the boundary, there's a much simpler way. You just find S^-1 and look at a middle row. As the domain size goes to infinity, all the rows of S^-1 become the same. For finite domain, the first and last rows are weird and different, but the middle rows are about the same.

This middle row of S^-1 is the analysis filter.

A = middle row of S^-1

Q = A <conv> P

Q_i = Sum_j A(i-j) P_j

note A is only defined discretely, not continuously. Now our final output is :

f(t) = Sum_i Q_i * S(t - i)

f(t) = Sum_i Sum_j A(i-j) P_j * S(t - i)

f(t) = Sum_j C(t-j) * P_j

C(t) = Sum_i A(i) * S(t-i)

where C is the combined analysis + synthesis filter. The final output is just a simple filter applied to the original points.

For example, for the "canonical cubic" synthesis (box convolved thrice), we get :

cubic analysis :
const float c_filter[11] = { -0.00175, 0.00876, -0.03327, 0.12434, -0.46410, 1.73205, -0.46410, 0.12434, -0.03327, 0.00876, -0.00175 };
chart : 

(A is 11 wide because I used an 11x11 matrix; in general it's infinitely wide, but gets smaller as you go out)

The synthesis cubic is piece-wise cubic and defined over four unit intervals from [-2,2]
The combined filter is piece-wise cubic; each unit interval is a linear combo of the 4 parts of synthesis

{ ADDENDUM : BTW this *is* the B-spline cubic; see for example : Neil Dodgson Image resampling page 128-129 ; the coefficients are exactly the same }

So, you could do all this work, or you could just use a filter that looks like "combined" from the start.

gaussian analysis :
const float c_filter[11] = { -0.00001, 0.00008, -0.00084, 0.00950, -0.10679, 1.19614, -0.10679, 0.00950, -0.00084, 0.00008, -0.00001 };
chart : 

mitchell1 analysis :
const float c_filter[11] = { -0.00000, 0.00002, -0.00028, 0.00446, -0.07115, 1.13389, -0.07115, 0.00446, -0.00028, 0.00002, -0.00000 };
chart : 

Now, not surprisingly all of the "combined" filters look very similar, and they all look rather a lot like windowed sincs, because there simply aren't that many ways to make interpolating filters. They have to be = 1.0 at 0, and = 0.0 at all other integer locations.

ADDENDUM : well, I finally read the Nehab/Hoppe paper "Generalized Sampling" , and guess what, it's this. It comes from a 1999 paper by Blu et al. called "Generalized Interpolation: Higher Quality at no Additional Cost".

The reason they claim it's faster than traditional filtering is that what we have done is to construct a sinc-like interpolating filter with wide support, which I call "combined", which can be separated into a simple compact "synthesis" filter, and a discrete matrix (S^-1). So for the case where you have only a few sets of data and you sample it many times (eg. texture in games), you can obviously implement this quickly by pre-applying the discrete matrix to the data, so you no longer have samples of the signal as your pixels, instead you have amplitudes of synthesis basis functions (that is, you store Q in your texture, not P). Now you can effectively sample the "combined" filter by applying only the "synthesis" filter. So for example Nehab/Hoppe suggests that you can do this fast on the GPU by using a GPU implementation of the cubic basis reconstruction.

Both of the papers are somewhat misleading in their claims of "higher quality than traditional filtering". You can of course compute a traditional linear filter that gives the same result, since their techniques are still just linear. They are higher quality *for the same complexity of filter* , and only if you have pre-done the S^-1 multiplication. The speed advantage is in being able to precompute the S^-1 multiplication.

04-07-11 | Help me I can't stop

Some common synthesis filters and their corresponding analysis filter :
BEGIN review

if you want to approximate f(t) by

Sum_i P_i * synthesis(t-i)

you can find the P's by :

P_i = Convolve{ f(t) analysis(t-i) }

END review
a note on the method :

BEGIN note

the H overlap matrix was computed on a 9x9 domain
because my matrix inverse is ungodly slow

for sanity checking I compared to 11x11 a few times and found the difference to be small
for example :

linear filter invH 9x9 :


linear filter invH 11x11 :


(ideally I would use a very large matrix and then look at the middle row, because that is
where the boundary has the least effect)

for real use in a high precision environment you would have to take the domain boundary more seriously

also, I did something stupid and printed out the invH rows with the maximum value scaled to 1.0 ; the
unscaled values for linear are :


but, I'm not gonna redo the output to fix that, so the numbers below have 1.0 in the middle.

END note

For box synthesis, analysis is box.

linear : invH middle row : = 

(note: we've seen this linear analysis filter before when we talked about how to find the optimum image such that when it's bilinear interpolated you match some original as well as possible)

quadratic invH middle row : = 

gauss-unity : invH middle row : = 

note : "unity" means no window, but actually it's a rectangular window with width 5 ; the gaussian has sdev of 0.5

sinc half-width of 20 :

sinc-unity : invH middle row : = 

note : obviously sinc is its own analysis ; however, this falls apart very quickly when you window the sinc at all, or even just cut it off when the values get tiny :

sinc half-width of 8 :

sinc-unity : invH middle row : = 

lanczos6 : invH middle row : = 

lanczos4 : invH middle row : = 

Oh, also, note to self :

If you print URLs to the VC debug window they are clickable with ctrl-shift, and it actually uses a nice simple internal web viewer, it doesn't launch IE or any such shite. Nice way to view my charts during testing.

ADDENDUM : Deja vu. Rather than doing big matrix inversions, you can get these same results using Fourier transforms and Fourier convolution theorem.

cbloom rants 06-16-09 - Inverse Box Sampling
cbloom rants 06-17-09 - Inverse Box Sampling - Part 1.5
cbloom rants 06-17-09 - Inverse Box Sampling - Part 2

04-06-11 | And yet more on filters

In the comments of the last post we talked a bit about reconstruction/resampling. I was a bit confounded, so I worked it out.

So, the situation is this. You have some discrete pixels, P_i. For reconstruction, each pixel value is multiplied by some continuous impulse function which I call a "hat", centered at the pixel center. (maybe this is the monitor's display response and the output value is light, or whatever). So the actual thing you care about is the continuous output :

Image(t) = Sum_i P_i * hat( t - i)

I'll be working in 1d here for simplicity but obviously for images it would be 2d. "hat" is something like a box, linear, cubic, gaussian, whatever you like, presumably symmetric, hopefully compact.

Okay, so you wish to resample to a new set of discrete values Q_i , which may be on a different lattice spacing, and may also be reconstructed with a different basis function. So :

Image'(t) = Sum_j Q_j * hat'( t - r*j )

(where ' reads "prime"). In both cases the sum is on integers in the image domain, and r is the ratio of lattice spacings.

So what you actually want to minimize is :

E = Integral_dt {  ( Image(t) - Image'(t) )^2 }

Well to find the Q_j you just do derivative in Q_j , set it to zero, rearrange a bit, and what you get is :

Sum_k H'_jk * Q_k = Sum_i R_ij * P_i

H'_jk = Integral_dt { hat'(t-r*j) * hat'(t-r*k) }

R_ij = Integral_dt { hat'(t-r*j)) * hat(t-i) }

or obviously in matrix notation :

H' * Q = R * P

"H" (or H') is the hat self-overlap matrix. It is band-diagonal if the hats are compact. It's symmetric, and actually

H_ij = H( |i-j| )

that is, it only depends on the absolute value of the difference :

H_ij = 
H_0 H_1 H_2 ...
H_1 H_0 H_1 H_2 ...
H_2 H_1 H_0 ...

and again in words, H_i is the amount that the hat overlaps with itself when offset by i lattice steps. (you could normalize the hats so that H_0 is always 1.0 ; I tend to normalize so that h(0) = 1, but it doesn't matter).

If "hat" is a box impulse, then H is the identity matrix (H_0 = 1, else = 0). If hat is the linear tent, then H_0 = 2/3 and H_1 = 1/6 , if hat is Mitchell1 (compromise) the terms are :

H_0 : 0.681481
H_1 : 0.176080
H_2 : -0.017284
H_3 : 0.000463

While the H matrix is very simple, there doesn't seem to be a simple closed form inverse for this type of matrix. (is there?)

The R matrix is the "resampling matrix". It's the overlap of two different hat functions, on different spacings. We can sanity check the trivial case, if r = 1 and hat' = hat, then H = R, so Q = P is the solution, as it should be. R is also sparse and sort of "band diagonal" but not along the actual matrix diagonal, the rows are shifted by steps of the resize ratio r.

Let's try a simple case. If the ratio = 2 (doubling), and hat and hat' are both linear tents, then :

H_0 = 2/3 and H_1 = 1/6 , and the R matrix has sparse rows made of :

...0..., 0.041667, 0.250000, 0.416667, 0.250000, 0.041667 , .... 0 ....

Computing H^-1 * R makes a matrix with rows like :

0 .. ,0.2500,0.5000,0.2500,0 ..

which is a really complicated way to get our triangle hat back, evaluated at half steps.

But it's not always that trivial. For example :

if the hat and hat' are both cubic, r = 2, 

H (self overlap) is :

0 : 0.479365
1 : 0.236310
2 : 0.023810
3 : 0.000198

R (resize overlap) is :

0 : 0.300893
1 : 0.229191
2 : 0.098016
3 : 0.020796
4 : 0.001538
5 : 0.000012

and H^-1 * R has rows of :


which are actually the values of the standard *quadratic* filter evaluated at half steps.

So the thing we get in the end is very much like a normal resampling filter, it's just a *different* one than if you just evaluated the filter shape. (eg. the H^-1 * R for a Gaussian reconstruction hat is not a Gaussian).

As noted in the previous comments - if your goal is just to resize an image, then you should just choose the resize filter that looks good to your eyes. The only place where this stuff might be interesting is if you are trying to do something mathematical with the actual image reconstruction. Like maybe you're trying to resample from monitor pixels to rod/cone pixels, and you have some a-priori scientific information about what shape reconstruction functions each surface has, so your evaluation metric is not ad-hoc.

.. anyhoo, I'm sure this topic has been covered in academic papers so I'm going to leave it alone.

ADDENDUM : another example while I have my test app working.

Gaussian filter with sdev = 1/2, evaluated at half steps :


The rows of (H^-1 * R) provide the resizing filter :


which comes from :

filter self overlap (H) :
0 : 0.886227
1 : 0.326025
2 : 0.016231
3 : 0.000109
4 : 0.000000
5 : 0.000000

filter resample overlap (R) :
0 : 1.120998
1 : 0.751427
2 : 0.226325
3 : 0.030630
4 : 0.001862

Let me restart from the beginning on the more general case :

Say you are given a continuous function f(t). You wish to find the discrete set of coefficients P_i such that under simple reconstruction by the hats h(t-i) , the L2 error is minimized (we are not doing simple sampling such that P_i = f(i)). That is :

the reconstruction F is :

F(t) = Sum_i P_i * h(t-i)

the error is :

E = Integral_dt { ( f(t) - F(t) ) ^2 }

do derivative in P_i and set to zero, you get :

Sum_j H_ij * P_j = Integral_dt { f(t) * h(t-i) }

where H is the same hat self-overlap matrix as before :

H_ij = h_i <conv> h_j

(with h_i(t) = h(t-i) and conv means convolution obviously)

or in terse notation :

H * P = f <conv> h

(H is a matrix, P is a vector )

rearranging you can also say :

P_i = f <conv> g_i


g_i(t) = Sum_j [ H^-1_ij * h(t-j) ]

what we have found is the complementary basis function for h. h (the hat) is like a "synthesis wavelet" and g is like an "analysis wavelet". That is, once you have the basis set g, simple convolution with the g's produces the coefficients which are optimal for reconstruction with h.

Note that typically H^-1 is not compact, which means g is not compact - it has significant nonzero value over the entire image.

Also note that if there is no edge to your data (it's on infinite domain), then the g's are just translations of each other, that is, g_i(t) = g_0(t-i) ; however on data with finite extent this is not the case (though the difference is compact and occurs only at the edges of the domain).

It should be intuitively obvious what's going on here. If you want to find pixel P_i , you take your continuous signal and fit the hat at location i and subtract that out. But your neighbors' hats also may have overlapped in to your domain, so you need to compensate for the amount of the signal that they are taking, and your hat overlaps into your neighbors, so choosing the value of P_i isn't just about minimizing the error for that one choice, but also for your neighbors. Hence it becomes non-local, and very much like a deconvolution problem.

04-04-11 | Yet more notes on filters

Topics for today :

N-way filters

symmetry of down & up

qualitative notes

comb sampling

1. N-way filters. So for a long time cblib has had good doublers and halvers, but for non-binary ratios I didn't have a good solution and I wasn't really sure what the solution should be. What I've been doing for a long time has been to use doublers/halvers to get the size close, then bilinear to get the rest of the way, but that is not right.

In fact the solution is quite trivial. You just have to go back to the original concept of the filters as continuous functions. This is how arbitrary float samplers work (see the Mitchell papers for example).

Rather than a discrete filter, you use the continuous filter impulse. You put a continuous filter shape at each pixel center, multiplied by the pixel value. Now you have a continuous function for your whole image by just summing all of these :

Image(u,v) = Sum[ all pixels ] Pixel[i,j] * Filter_func( u - i, v - j )

So to do an arbitrary ratio resize you just construct this continuous function Image, and then you sample it at all the fractional u,vs.

Now, because of the "filter inversion" principle that I wrote about before, rather than doing this by adding up impulse shapes, you can get the exact same output by constructing an impulse shape and convolving it with the source pixels once per output pixel. The impulse shape you make should be centered on the output pixel's position, which is fractional in general. This means you can't precompute any discrete filter taps.

So - this is cool, it all works. But it's pretty slow because you're evaluating the filter function many times, not just using discrete taps.

There is one special case where you can accelerate this : integer ratio magnifications or minifications. Powers of two obviously, but also 3x,5x, etc. can all be fast.

To do a 3x minification, you simply precompute a discrete filter that has a "center width" of 3 taps, and you apply it in steps of 3 to make each output pixel.

To do a 3x magnification, you need to precompute 3 discrete filters. The 3 filters will be applied at each source pixel location to produce 3 output pixels per source pixel. They correspond to the impulse shape offset by the correct subpixel amounts (for 3X the offsets are -1/3,0,1/3). Note that this is just the same as the arbitrary ratio resize, we're just reusing the computation when the subpixel part repeats.

(in fact for any rational ratio resize, you could precompute the filters for the repeating sub-integer offsets ; eg. to resize by a ratio of 7/3 you would need 21 filters ; this can save a lot of work in some cases, but if your source and dest image sizes are relatively prime it doesn't save any work).

2. Symmetry of down & up . If you look at the way we actually implement minification and magnification, they seem very different, but they can be done with the same filter if you like.

That is, the way we actually implement them, as described above for 3X ratio for example :

Minify : make a filter with center width 3, convolve with source at every 3rd pixel to make 1 output

Magnify : make a filter with center width 1, convolve with source at every 1/3rd pixel to make 1 output

But we can do magnify another way, and use the exact same filter that we used for minify :

Magnify : make a filter with center width 3, multiply with each source pel and add into output

Magnify on the left , Minify on the right :

As noted many times previously, we don't actually implement magnify this way, but it's equivalent.

3. Qualitative notes. What do the filters actually look like, and which should you use ?

Linear filters suffer from an inherent trade-off. There is no perfect filter. (as noted previously, modern non-linear techniques are designed to get around this). With linear filters you are choosing along a spectrum :

blurry -> sharp / ringy

The filters that I've found useful, in order are :

[blurry end]

gauss - no window
gauss - cos window (aka "Hann")
gauss - blackman

Mitchell blurry (B=3/2)
Mitchell "compromise" (B=1/3)
Mitchell sharp (B=0)

sinc - blackman
sinc - cos
sinc - sinc (aka Lanczos)

[sharp end]

there are lots of other filters, but they are mostly off the "pareto frontier" ; that is, one of the filters above is just better.

Now, if there were never any ringing artifacts, you would always want sinc. In fact you would want sinc with no window at all. The output from sinc resizing is sometimes just *amazing* , it's so sharp and similar to the original. But unfortunately it's not reliable and sometimes creates nasty ringing. We try to limit that with the window function. Lanczos is just about the widest window you ever want to use with sinc. It produces very sharp output, but some ringing.

Note that sinc also has a very compelling theoretical basis : it reproduces the original pixels if you resize by a factor of 1.0 , it's the only (non-trivial) filter that does this. (* not true - see later posts on this topic where we construct interpolating versions of arbitrary filters)

If you are resizing in an interactive environment where the user can see the images, you should always start with the sharp filters like Lanczos, and the user can see if they produce unacceptable artifacts and if so go for a blurrier filter. In an automated environment I would not use Lanczos because it is too likely to produce very nasty ringing artifacts.

The Mitchell "compromise" is a very good default choice in an automated environment. It can produce some ringing and some blurring, but it's not too horrible in either way. It's also reasonably compact.

The Gauss variants are generally more blurring than you need, but have the advantage that all their taps are positive. The windowed gaussians generally look much better than non-windowed, they are much sharper and look more like the original. They can produce some "chunkiness" (raster artifacts), that is under magnification they can make edges have visible stair steps. Usually this is preferable to the extreme blurriness of the true non-windowed gaussian.

4. comb sampling

Something that bothers me about all this is the way I make the discrete filters from the continuous ones is by comb sampling. That is, I evaluate them just at the integer locations and call that my discrete filter.

I have some continuous mother filter function, f(t) , and I make the discrete filter by doing :

D[] = { f(-2), f(-1), f(0), f(1), f(2) }

and then I apply D to the pixels by just doing mult-adds.

But this is equivalent to convolving the continuous filter f(t) with the original image if the original pixels are dirac delta-functions at each pixel center.

That seems kind of whack. Don't I want to imagine that the original pixels have some size? Maybe they are squares (box impulses), or maybe they are bilinear (triangle hats), etc.

In that case, my discrete filter should be made by convolving the base pixel impulse with the continuous filter, that is :

D[] = { f * hat(-2) , f * hat(-1) , .. }

(* here means convolve)

Note that this doesn't really apply to resizing, because in resizing the filter we apply is itself standing in for the basic pixel impulse. It does apply if I simply want to run a filter over my pixels.

Like say I want to apply a Gaussian to an image. I shouldn't just evaluate a gaussian function at each pixel location - I should convolve pixel shapes with the gaussian.

Note that in most cases this only changes the discrete filter values slightly.

Also note that this is equivalent to up-sizing your image using the "hat" as the upsample filter, and then doing a normal comb discrete filter at the higher res.

04-03-11 | Some more notes on filters

Windowing the little discrete filters we use in image processing is a bit subtle. What we want from windowing is : you have some theoretical filter like Gauss or Sinc which is non-zero over an infinite region and you wish to force it to act only on a finite domain.

First of all, for most image filtering operations, you want to use a very small filter. A wider filter might look like it has a nicer shape, and it may even give better results on smooth parts of the image, but near edges wide filters reach too much across the edge and produce nasty artifacts, either bad blurring or ghosting or ringing.

Because of that, I think a half-width of 5 is about the maximum you ever want to use. (in the olden days, Blinn recommended a half-width of 3, but our typical image resolution is higher now, so I think you can go up to 5 most of the time, but you will still get artifacts sometimes).

(also for the record I should note that all this linear filtering is rather dinosaur-age technology; of course you should use some sort of non-linear adaptive filter that is wider in smooth areas and doesn't cross edges; you could at least use steam-age technology like bilateral filters).

So the issue is that even a half-width of 5 is actually quite narrow.

The "Windows" that you read about are not really meant to be used in such small discrete ranges. You can go to Wikipedia and read about Hann and Hamming and Blackman and Kaiser and so on, but the thing they don't really tell you is that using them here is not right. Those windows are only near 1.0 (no change) very close to the origin. The window needs to be at least 4X wider than the base filter shape, or you will distort it severely.

Most of this stuff comes from audio, where you're working on 10000 taps or something.

Say you have a Gaussian with sigma = 1 ; most of the Gaussian is inside a half-width of 2 ; that means your window should have a half-width of 8. Any smaller window will strongly distort the shape of the base function.

In fact if you just look at the Blackman window : (from Wikipedia)

The window function itself is like a cubic filter. In fact :

Odd Blackman window with half-width of 3 :

const float c_filter[7] = { 0.00660, 0.08050, 0.24284, 0.34014, 0.24284, 0.08050, 0.00660 };

Odd Nutall 3 :

const float c_filter[7] = { 0.00151, 0.05028, 0.24743, 0.40155, 0.24743, 0.05028, 0.00151 };

can be used for filtering by themselves and they're not bad. If you actually want it to work as a *window* , which is just supposed to make your range finite without severely changing your action, it needs to be much wider.

But we don't want to be wide for many reasons. One is the artifacts mentioned previously, the other is efficiency. So, I conclude that you basically don't want all these famous windows.

What you really want for our purposes is something that's flatter in the middle and only gets steep at the very edges.

Some windows on 8 taps :

from sharpest to flattest :

//window_nutall :
const float c_filter[8] = { 0.00093, 0.02772, 0.15061, 0.32074, 0.32074, 0.15061, 0.02772, 0.00093 };

//blackman :
const float c_filter[8] = { 0.00435, 0.05122, 0.16511, 0.27932, 0.27932, 0.16511, 0.05122, 0.00435 };

//cos : (aka Hann)
const float c_filter[8] = { 0.00952, 0.07716, 0.17284, 0.24048, 0.24048, 0.17284, 0.07716, 0.00952 };

//window_blackman_sqrt :
const float c_filter[8] = { 0.02689, 0.09221, 0.16556, 0.21534, 0.21534, 0.16556, 0.09221, 0.02689 };

//window_sinc :
const float c_filter[8] = { 0.02939, 0.09933, 0.16555, 0.20572, 0.20572, 0.16555, 0.09933, 0.02939 };

//sin :
const float c_filter[8] = { 0.03806, 0.10839, 0.16221, 0.19134, 0.19134, 0.16221, 0.10839, 0.03806 };

I found the sqrt of the blackman window is pretty close to a sinc window. But really if you use any of these on a filter which is 8-wide, they are severely distorting it.

You can get flatter windows in various ways; by distorting the above for example (stretching out their middle part). Or you could use a parameterized window like Kaiser :

Kaiser : larger alpha = sharper

// alpha = infinite is a delta function
//window_kaiser_6 :
const float c_filter[8] = { 0.01678, 0.07635, 0.16770, 0.23917, 0.23917, 0.16770, 0.07635, 0.01678 };
//window_kaiser_5 :
const float c_filter[8] = { 0.02603, 0.08738, 0.16553, 0.22106, 0.22106, 0.16553, 0.08738, 0.02603 };
//window_kaiser_4 :
const float c_filter[8] = { 0.03985, 0.09850, 0.16072, 0.20093, 0.20093, 0.16072, 0.09850, 0.03985 };
//window_kaiser_3 :
const float c_filter[8] = { 0.05972, 0.10889, 0.15276, 0.17863, 0.17863, 0.15276, 0.10889, 0.05972 };
// alpha = 0 is flat line

Kaiser-Bessel-Derived :

//kbd 6 :
const float c_filter[8] = { 0.01763, 0.10200, 0.17692, 0.20345, 0.20345, 0.17692, 0.10200, 0.01763 };
//kbd 5 :
const float c_filter[8] = { 0.02601, 0.10421, 0.17112, 0.19866, 0.19866, 0.17112, 0.10421, 0.02601 };
//kbd 4 :
const float c_filter[8] = { 0.03724, 0.10637, 0.16427, 0.19213, 0.19213, 0.16427, 0.10637, 0.03724 };
//kbd 3 :
const float c_filter[8] = { 0.05099, 0.10874, 0.15658, 0.18369, 0.18369, 0.15658, 0.10874, 0.05099 };

these are interesting, but they're too expensive to evaluate at arbitrary float positions; you can only use them to fill out small discrete filters. So that's mildly annoying.
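
Kaiser, on the other hand, is cheap to evaluate anywhere via the I0 power series. Here's a sketch (not the post's code) that fills an N-tap Kaiser window, with the endpoints placed half a tap past each end of the tap range; alpha here is the raw I0 argument scale, which matches the window_kaiser_N tables above :

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Modified Bessel function of the first kind, order 0, via power series :
// I0(x) = Sum_k (x/2)^(2k) / (k!)^2
double bessel_i0(double x)
{
    double sum = 1.0, term = 1.0;
    for (int k = 1; k < 40; k++)
    {
        term *= (x * x) / (4.0 * k * k);
        sum += term;
        if (term < 1e-12 * sum) break;
    }
    return sum;
}

// N-tap Kaiser window, normalized to sum to 1.
// Window endpoints are placed 0.5 taps past each end of the discrete range,
// so for 8 taps the window half-width is 4.0 .
std::vector<double> kaiser_window(int N, double alpha)
{
    std::vector<double> w(N);
    double half = N * 0.5;
    double sum = 0.0;
    for (int i = 0; i < N; i++)
    {
        double x = (i - (N - 1) * 0.5) / half; // in (-1,1)
        w[i] = bessel_i0(alpha * std::sqrt(1.0 - x * x)) / bessel_i0(alpha);
        sum += w[i];
    }
    for (int i = 0; i < N; i++) w[i] /= sum;
    return w;
}
```

kaiser_window(8, 3.0) reproduces the window_kaiser_3 taps above to within rounding.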

There's something else about windowing that I should clear up as well : Where do you put the ends of the window?

Say I have a discrete filter of 8 taps. Obviously I don't put the ends of the window right on my last taps, because the ends of the window are zeros, so they would just make my last taps zero, and I may as well use a 6-tap filter in that case.

So I want to put the end points somewhere past the ends of my discrete filter range. In all the above examples in this post, I have put the end points 0.5 taps past each end of the discrete range. That means the window goes to zero half way between the last tap inside the window and the first tap outside the window. That's on the left :

On the right is another option, which is the window could go to zero 1.0 taps past the end of the discrete range - that is, it goes to zero exactly on the next taps. In both cases this produces a finite window of the desired size, but it's slightly different.

For a four tap filter, the left-side way uses a window that is 4.0 wide ; the right side uses a window that is 5.0 wide. If you imagine the pixels as squares, the left-side way only overlaps the 4 pixels centered on the filter taps; the right side way actually partially overlaps the next pixels on each side, but doesn't quite reach their centers, thus has zero value when sampled at their center.

I don't know of any reason to strongly prefer one or the other. I'm using the 0.5 window extension in my code.
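
In code, the 0.5-tap extension looks like this; sampling sin over (i + 0.5)/N reproduces the 8-tap sin window listed above :

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// N-tap sin (half-Hann) window with endpoints 0.5 taps past the discrete
// range : tap i sits at fraction (i + 0.5)/N of the window's domain, so the
// window hits zero half a tap outside the first and last taps.
std::vector<double> sin_window(int N)
{
    const double pi = 3.14159265358979323846;
    std::vector<double> w(N);
    double sum = 0.0;
    for (int i = 0; i < N; i++)
    {
        w[i] = std::sin(pi * (i + 0.5) / N);
        sum += w[i];
    }
    for (int i = 0; i < N; i++) w[i] /= sum; // normalize to sum 1
    return w;
}
```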

Let me show some concrete examples, then we'll talk about them briefly :

//filter-windower-halfwidth :

//gauss-unity-5 :
const float c_filter[11] = { 0.00103, 0.00760, 0.03600, 0.10936, 0.21301, 0.26601, 0.21301, 0.10936, 0.03600, 0.00760, 0.00103 };

//gauss-sinc-5 :
const float c_filter[11] = { 0.00011, 0.00282, 0.02336, 0.09782, 0.22648, 0.29882, 0.22648, 0.09782, 0.02336, 0.00282, 0.00011 };

//gauss-blackman-5 :
const float c_filter[11] = { 0.00001, 0.00079, 0.01248, 0.08015, 0.23713, 0.33889, 0.23713, 0.08015, 0.01248, 0.00079, 0.00001 };

//gauss-blackman-7 :
const float c_filter[15] = { 0.00000, 0.00000, 0.00015, 0.00254, 0.02117, 0.09413, 0.22858, 0.30685, 0.22858, 0.09413, 0.02117, 0.00254, 0.00015, 0.00000, 0.00000 };

//gauss-blackman-7 , ends cut :
const float c_filter[11] = { 0.00015, 0.00254, 0.02117, 0.09413, 0.22858, 0.30685, 0.22858, 0.09413, 0.02117, 0.00254, 0.00015 };

//gauss-blackman-13, ends cut :
const float c_filter[11] = { 0.00061, 0.00555, 0.03085, 0.10493, 0.21854, 0.27906, 0.21854, 0.10493, 0.03085, 0.00555, 0.00061 };

1. The gaussian, even though technically infinite, goes to zero so fast that you really don't need to window it *at all*. All the windows distort the shape quite a bit. (technically this is a rectangular window, but the hard cut-off at the edge is invisible)

2. Windows distorting the shape is not necessarily a bad thing! Gauss windowed by Sinc (for example) is actually a very nice filter, somewhat sharper than the true gaussian. Obviously you can get the same thing by playing with the sdev of the gaussian, but if you don't have a parameterized filter you can use the window as a way to adjust the compactness of the filter.

3. Cutting off the ends of a filter is not the same as using a smaller window! For example if you made gauss-blackman-7 you might say "hey the ends are really close to zero, it's dumb to run a filter over an image with zero taps at the end, I'll just use a width of 5!". But gauss-blackman-5 is very different! Making the window range smaller changes the shape.

4. If you want the original filter shape to not be damaged too much, you have to use wider windows for sharper window functions. Even blackman at a half-width of 13 (full width of 27) is distorting the gaussian a lot.

04-02-11 | House rambling

There are a lot of short sales and a few foreclosures popping up in the area. (one of the houses we stopped by had a lovely sign saying "this house is the property of xx bank and if you enter you will be prosecuted"). Some of the short sales look like good deals, but everything I've read says to stay the fuck away from them. Apparently what's happening is neither the owner nor the banks really want to sell their short sales. While the short sale is listed, the owner can live there for free, so they don't actually want to sell. And, the banks would have to write down a loss if the sale went through, so they want to delay it to keep it off their books.

So, no short sales for me I think. Foreclosures might be okay though, dunno. I have seen a few places that are cheap enough that I believe you could buy them and immediately have positive cash flow from rent, which is where you want to be with real estate investing. (mainly places that have "Mother in Law" units because you get two rents; I can't believe how much those things rent for; the one in my house is rented out for $850 and it's an absolute hole). (you wouldn't actually be cash+ because that's not counting property tax or upkeep or months when it's not rented, etc, etc, but then it's also not counting appreciation; in any case rent = mortgage is a rough sign of price sanity, or maybe just a sign that rents are awfully high around here).

There are houses in the Beacon Hill area that are under $200k. I could buy that outright and have no mortgage! My depression-era financial mentality finds that rather appealing, but it's really a bad investment. One of the biggest advantages that houses have as an investment vehicle is leverage (the other of course is tax benefits) (and another is that the market for houses is actually propped up by the government, they actually subsidize your risk).

My understanding on fees is that the 6% is usually locked into the seller's contract, then they give 3% to the buyer's agent. This is an obviously illegal way of making you take a buyer's agent.

One option is to use Redfin, which gives you back half their fee (1.5% back). On a 500k house that's 7500, so it's not trivial, but maybe a good buyer's agent would save you 7500 worth of time and haggling and fixes on the house, so it's not a clear win. I've read good and bad things about buying through Redfin. It seems like sort of a shoddy operation, but I do like doing things online and I hate talking to humans, so that aspect is right up my alley.

The advantage of Redfin is that they will at least make appointments for seeing houses and take you around, so you don't have to call up agents. There's another one here called "$500 Realty" which gives you back 75% of the 3%, but they literally do nothing, you have to do all the touring yourself.

On top of the 6% real estate agent fee you get various closing costs which I understand are 3-5% typically. WTF WTF. 10% in fees on this transaction. This is such a fucking scam.

And I was thinking about mortgages a bit. Obviously fixed rate is the way to go right now to lock in the low interest rates. I'm kind of tempted by 15-year mortgages because I like the idea of paying it off (depression era youth, there you are again!), but I'm pretty sure the 30 year is actually a better deal. For one thing, the spread between 15 and 30 is very small right now; apparently sometimes the spread is much bigger and that makes more of a case for the 15. If you look at the total amount of payment difference, it looks like a big difference (almost twice as much), but that's a lie. You have to account for inflation - the dollars paid after the first 15 years are worth *way* less, and you have to account for investment income on all the dollars you didn't put towards the mortgage, and you have to count the larger tax deduction with the 30 year, and I believe the result of all that is a decent positive on the side of the 30 year. One thing I spotted that I wasn't aware of is pre-payment penalties. WTF, you fucking scammers. So, have to watch out for that.

I dunno.

04-01-11 | Dirty Coder Tricks

A friend reminded me recently that one of the dirtiest tricks you can pull as a coder is simply to code something up and get it working.

Say you're in a meeting talking about the tech for your game, and you propose doing a new realtime radiosity lighting engine (for example). Someone else says "too risky, it will take too much time, etc. let's go with a technique we know".

So you go home and in your night hours you code up a prototype. The next day you find a boss and go "look at this!". You show off some sexy realtime radiosity demo and propose now it should be used since "it's already done".

Dirty, dirty coder. The problem is that it's very hard for the other person to win the argument that it's too hard to code once you have a prototype working. But in reality a sophisticated software engineer should know that a demo prototype proves nothing about whether the code is a sensible thing to put into a big software product like a game. It's maybe 1% of the total time you will spend on that feature, and it doesn't even prove that it works, because it can easily work in the isolation of a demo but not be feasible in real usage.

I used to pull this trick all the time, and usually got away with it. I would be in a meeting with my lead and the CTO, the lead would not want some crazy feature I was pushing, and then I would spring it on him that it was "already working" and I'd show a demo to the CTO and there you go, the feature is in, and the lead is quietly smoldering with futile anger. (the most dickish possible coder move (and my favorite in my youth) is to spring the "it's already done, let me show you" as a surprise in the green light meeting).

But I also realize now that my leads had a pretty good counter-play to my dirty trick. They would make me put the risky feature they didn't like in an optional DLL or something like that so it wasn't a crucial part of the normal production pipeline, that way my foolishness didn't impact lots of other people, and the magnified workload mainly fell only on me. Furthermore, optional bits tend to decay and fall off like frostbitten toes every time you have to change compiler or platform or DirectX version. In that way, the core codebase would clean itself of the unnecessary complicated features.

04-01-11 | Pots 1

This is sort of a test of putting some images in my blog. I'm not sure how the cbloom.com bandwidth will handle it...

I've been taking pottery classes for the last several weeks. It's quite pleasant. We're finally getting some finished pieces out of the kiln so I'll post some pics and notes.

It's very good meditation for me. You sit with the clay and stare into it; you have to really relax and go slow, because pots don't like to be forced around. I like to watch a point on the pot as it goes around, which gets you into a rocking rhythm with the wheel.

I really like the way any one pot is no big deal. There are a million ways you can ruin a pot, and they can happen at any time (surprise air bubbles, off centering, dry spots in throwing, cutting through the bottom trimming, S cracks, etc etc etc). So you can't get too attached to any one pot, you have to be ready to just throw it away and not care. At first I thought that was stressful because I cared too much about each pot, but once you change your viewpoint and let go, it's actually really relaxing and liberating. If you screw up a throw, no biggie, do another, you can experiment with new techniques and get it wrong many times, no biggie.

For one thing, it's important to get out of the house. If you live with someone, it's extremely inconsiderate to be home all the time. Everybody deserves some alone time in their own house, and it's just tacky to impose your presence on them constantly. It's especially good to have a weekly scheduled time when you will be gone, so that they can count on it and look forward to it, so that they can crossdress and eat a gallon of ice cream, or whatever it is they want to do. I know that I am generally pretty bad about being home too much, and I feel bad about that, but I just hate the outside world (when it's gray and wet and cold), so this is my way to get out a bit.

These are the first pieces of crap I made :

You have to make a bunch of crap and get it in the pipeline. Pottery is a lot like a deeply pipelined CPU - the latency from grabbing a hunk of clay to getting a finished pot might take a month, but your throughput could be very high; you can easily throw 10 pots an hour, but not see them come out for a month. So in the beginning my goal was just to get some stuff in the pipeline.

Pro tip : centering : if you are even slightly off center it will be annoying to deal with. The things that really help me are : 1. putting my elbow in my groin, so that the forward pressure is braced against my whole body, and 2. when you release the pressure make sure it's very gradual, and equal with both your hands - you can get something perfectly centered under pressure, and then you release unevenly and it's back off center.

Pro tip : wedging : don't fold the clay over itself, you can create air bubbles; it's not really aggressive like kneading, it's more a gentle motion like massaging. Coning up and down can substitute for good wedging.

Pro tip : fixing to the wheel ; a slightly convex bottom of your lump is better than flat or concave when you slam it down, because it ensures the middle touches. You can press the edge onto the wheel before centering to help it stick.

This is my first bowl :

It's not bad, though I was messing around with different glazes too much and that just made everything look really messy. I need to glaze simpler and cleaner. I also tend to not leave enough foot on things, which is okay, but it's just a pain in the ass to dip things if you don't have a good foot to grab. A lot of the things you're supposed to do in throwing are really just to make life easier on yourself later; eg. you can always fix things in trimming, but trimming is a major pain in the ass, so it's just easier if you throw it as close to the right form as possible to begin with.

Pro tip : bowls should not get too wide too soon, or they can get weak and fall. Basically you want to throw an upside down cone shape first, and then you can round out the bottom as the last thing you do.

Pro tip : pulling is pretty basic; some notes to self that I forget some times. The most important thing first is that the pot is evenly wet, if there are dry spots they will catch during the pull and fuck the pot. Go slow and steady, don't stop, if something unexpected happens along the way, just keep moving slow and steady. Try to get your eyes directly over the pull and look straight down , if your head is off to one side you won't pull straight. Don't rest your elbows on your body, it makes you pull an arc, get them out in space. Make sure the two hands are well connected. Do not apply much pressure, you aren't squeezing hard, just gentle pressure and slide up. One finger tip should be slightly higher than the other, if the outside hand is lower you can narrow the pot, if the outside hand is higher you can widen the pot.

I like to open "Simon Leach style" against the whole flat of my hand and do the first rough pull that way. One trick I've seen for wetting is potters hold their sponge in the heel of their hand while they pull with their fingertips, so if they hit dry they can squeeze the sponge right there as they go. You can also pull directly against a sponge.

Some little tea bowl type things :

My first attempts at vase shapes :

Necking down to narrow is quite difficult. If the clay gets too wet or too thin, it will ribbon as you neck like the bottom pot. I've learned two different techniques for necking, one is the "six points of contact" technique which is very slow and delicate, the other is to just grab the clay with both hands like you're trying to strangle it and brute force it. The main thing is that it's much easier to widen than it is to narrow, so you want to do your initial open and pull narrow, and keep it narrow at the top the whole time, don't throw out and try to bring it back in.

03-31-11 | Some image filter notes

Say you have a filter like F = [1,2,3,2,1] . The normal thing to do is compute the sum and divide so you have pre-normalized values and you just do a bunch of madd's. eg. you make N = [1/9,2/9,3/9,2/9,1/9].

Now there's the question of how you handle the boundaries of the image. The normal thing to do is to take the pre-normalized filter N and apply all over, and when one of the taps samples off edge, you have to give it something to sample. You can use various edge modes, such as :

SampleBounded(int i, int w) :

clamp :
  return Sample( Clamp(i,0,w-1) );

wrap :
  return Sample( (i+256*w)%w ); // bias by a big multiple of w so C's signed % stays non-negative (assumes i > -256*w)

mirror (no duplicated edge pixel) :
  if ( i < 0  ) return SampleBounded( -i, w );
  if ( i >= w ) return SampleBounded( -i + 2*w - 2 , w );
  else return Sample( i );

mirror with duplicated edge pixel :
  if ( i < 0  ) return SampleBounded( - i - 1, w );
  if ( i >= w ) return SampleBounded( -i + 2*w - 1 , w );
  else return Sample( i );

(the correct edge mode depends on the usage of the image, which is one of those little annoying gotchas in games; eg. the mips you should make for tiling textures are not the same as the mips for non-tiling textures). (another reasonable option not implemented here is "extrapolate" , but you have to be a bit careful about how you measure the slope at the edge of the image domain)
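
For reference, here are those edge modes written as pure index remapping (a sketch equivalent to the Sample calls above, but returning the remapped index so it's easy to test in isolation) :

```cpp
#include <cassert>

// Map an out-of-range tap index i onto [0,w) for each edge mode.

int edge_clamp(int i, int w)
{
    return i < 0 ? 0 : (i >= w ? w - 1 : i);
}

int edge_wrap(int i, int w)
{
    int m = i % w;
    return m < 0 ? m + w : m; // safe for any i, unlike the (i+256*w)%w trick
}

int edge_mirror(int i, int w) // mirror, no duplicated edge pixel
{
    while (i < 0 || i >= w)
    {
        if (i < 0) i = -i;          // reflect about pixel 0
        else       i = 2 * w - 2 - i; // reflect about pixel w-1
    }
    return i;
}

int edge_mirror_dup(int i, int w) // mirror with duplicated edge pixel
{
    while (i < 0 || i >= w)
    {
        if (i < 0) i = -i - 1;      // reflect about the image edge
        else       i = 2 * w - 1 - i;
    }
    return i;
}
```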

The reason we do all this is because we don't want to have to accumulate the sum of filter weights and divide by the weight.

But really, in most cases what you really should be doing is applying the filter only where its domain overlaps the image domain. Then you sum the weights in the area that is valid and renormalize. eg. if our filter F is two pixels off the edges, we just apply [3,2,1] / 6 , we don't clamp the sampler and put an extra [1,2] on the first pixel.
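
A sketch of that (hypothetical helper name, not the post's code) : apply the filter only over the taps that land inside the image, and divide by the sum of the weights actually used :

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Apply a centered odd-length filter to a 1d image. Near the edges, use
// only the taps that land inside the image and renormalize by their sum,
// instead of clamping/wrapping the sampler.
std::vector<float> filter_renormalized(
    const std::vector<float>& src, const std::vector<float>& filt)
{
    int w = (int)src.size();
    int half = (int)filt.size() / 2;
    std::vector<float> dst(w);
    for (int x = 0; x < w; x++)
    {
        float sum = 0.f, wsum = 0.f;
        for (int t = 0; t < (int)filt.size(); t++)
        {
            int i = x + t - half;
            if (i < 0 || i >= w) continue; // tap off the edge : just skip it
            sum  += filt[t] * src[i];
            wsum += filt[t];
        }
        dst[x] = sum / wsum; // renormalize by the weight that was valid
    }
    return dst;
}
```

Note this preserves flat signals exactly at the edges (a constant image stays constant), which the edge-mode approaches only manage for clamp-like modes.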

ADDENDUM : in video games there's another special case that needs to be handled carefully. When you have a non-tiling texture which you wish to abut seamlessly to another texture. That is, you have two textures T1 and T2 that are different and you wish to line them up beside each other without a seam.

I call this mode "shared", it sort of acts like "clamp" but has to be handled specially in filtering. Let's say T1 and T2 are laid against each other horizontally, so they abut along a column. What the artist should do is make the pixels in that border column identical in both textures (or you could have your program enforce this). Then, the UV mapping on the adjacent rectangles should be inset by half a pixel - that is, it picks the center of the pixels, not the edge of the texture. Thus the duplicated pixel edge only appears to be a single column of pixels.

But that's not the special case handling - the special case is whenever you filter a "shared" image, you must make border column pixels only from other border column pixels. That is, that shared edge can only be vertically filtered, not horizontally filtered. That way it stays identical in both images.

Note that this is not ideal with mipping, what happens is the shared edge gets fatter at higher mip levels - but it never develops a seam, so it is "seamless" in that sense. To do it right without any artifacts (eg. to look as if it was one solid bigger texture) you would have to know what image is on the other side of the shared edge and be able to filter tap into those pixels. Obviously that is impossible if your goal is a set of terrain tiles or something like that where you use the same shared edge in multiple different ways.

(is there a better solution to this issue?)

I did a little look into the difference between resizing an image 8X by either doubling thrice or directly resizing. I was sanity checking my filters and I thought - hey if I use a Gaussian filter, it should be the same thing, because convolution of a Gaussian with a Gaussian is a Gaussian, right?

In the continuous case, you could either use one wide Gaussian directly, or convolve a narrower Gaussian with itself repeatedly - convolving Gaussians adds their variances, so a Gaussian with sdev 2 convolved in three times gives a Gaussian with sdev 2*sqrt(3) (the exact numbers aren't right for 8X mag because of the rescaling between steps, but you get the idea : the repeated Gaussians should compose into one wider Gaussian).

So I tried it on my filters and I got :

Gaussian for doubling, thrice :


Gaussian for direct 8x :


and I was like yo, WTF they're way off, I must have a bug. (note : these are scaled to make the max value 1.0 rather than normalizing because it's easier to compare this way, they look more unequal after normalizing)

But then I realized - these are not really proper Gaussians. These are discrete samples of Gaussians. If you like, it's a Gaussian multiplied by a comb. It's not even a Gaussian convolved with a box filter - that is, we are not applying the gaussian over the range of the pixel as if the pixel was a box, but rather just sampling the continuous function at one point on the pixel. Obviously the continuous convolution theorem that Gauss [conv] Gauss = Gauss doesn't apply.
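
You can check this numerically : convolving integer samples of a (truncated, normalized) Gaussian with themselves does not give samples of the sqrt(2)-wider Gaussian. A small sketch, not the post's code :

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Sample a gaussian at integer tap offsets -half_taps..half_taps,
// normalized to sum to 1 (so it's a comb-sampled, truncated gaussian).
std::vector<double> sampled_gauss(double sdev, int half_taps)
{
    std::vector<double> g;
    double sum = 0.0;
    for (int i = -half_taps; i <= half_taps; i++)
    {
        double v = std::exp(-0.5 * i * i / (sdev * sdev));
        g.push_back(v); sum += v;
    }
    for (double& v : g) v /= sum;
    return g;
}

// Discrete convolution of two filters.
std::vector<double> convolve(const std::vector<double>& a,
                             const std::vector<double>& b)
{
    std::vector<double> c(a.size() + b.size() - 1, 0.0);
    for (size_t i = 0; i < a.size(); i++)
        for (size_t j = 0; j < b.size(); j++)
            c[i + j] += a[i] * b[j];
    return c;
}
```

Convolving a 5-tap sdev-1 sampled gaussian with itself and comparing to a directly sampled sdev-sqrt(2) gaussian of the same support gives taps that differ by several thousandths - small, but clearly not the continuous-case identity.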

As for the difference between doing a direct 8X and doubling thrice, I can't see a quality difference with my eyes. Certainly the filters are different numerically - particularly filters with negatives, eg. :

sinc double once : 
sinc double twice : 
sinc double thrice : 

sinc direct 8x : 

very different, but visually meh? I don't see much.

The other thing I constantly forget about is "filter inversion". What I mean is, if you're trying to sample between two different grids using some filter, you can either apply the filter to the source points or the dest points, and you get the same results.

More concretely, you have filter shape F(t) and some pixels at regular locations P[i].

You create a continuous function f(t) = Sum_i P[i] * F(i-t) ; so we have placed a filter shape at each pixel center, and we are sampling them all at some position t.

But you can look at the same thing a different way - f(t) = Sum_i F(t-i) * P[i] ; we have a filter shape at position t, and then we are sampling it at each position i around it.

So, if you are resampling from one size to another, you can either do :

1. For each source pixel, multiply by filter shape (centered at source) and add shape into dest, or :

2. For each dest pixel, multiply filter shape (centered at dest) by source pixels and put sum into dest.

And the answer is the same. (and usually the 2nd is much more efficient than the first)
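
Assuming a symmetric filter shape (as all the filters in this post are, so F(i-t) = F(t-i)), the two versions are literally the same sum reordered. A tiny sketch with a hypothetical tent filter :

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// hypothetical continuous filter shape : the tent (linear) filter
double tent(double t) { return std::fabs(t) < 1.0 ? 1.0 - std::fabs(t) : 0.0; }

// 1. scatter : for each source pixel, add its filter shape into dest
std::vector<double> scatter(const std::vector<double>& src,
                            const std::vector<double>& dst_pos)
{
    std::vector<double> dst(dst_pos.size(), 0.0);
    for (size_t i = 0; i < src.size(); i++)
        for (size_t j = 0; j < dst_pos.size(); j++)
            dst[j] += src[i] * tent((double)i - dst_pos[j]);
    return dst;
}

// 2. gather : for each dest position, sample the filter at the source pixels
std::vector<double> gather(const std::vector<double>& src,
                           const std::vector<double>& dst_pos)
{
    std::vector<double> dst(dst_pos.size(), 0.0);
    for (size_t j = 0; j < dst_pos.size(); j++)
        for (size_t i = 0; i < src.size(); i++)
            dst[j] += tent(dst_pos[j] - (double)i) * src[i];
    return dst;
}
```

(For clarity this skips normalization and edge handling; it's just demonstrating that the scatter and gather forms agree.)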

And for your convenience, here are some doubling filters :

box        : const float c_filter[1] = { 1.00000 };
linear     : const float c_filter[2] = { 0.25000, 0.75000 };
quadratic  : const float c_filter[3] = { 0.28125, 0.68750, 0.03125 };
cubic      : const float c_filter[4] = { 0.00260, 0.31510, 0.61198, 0.07031 };
mitchell0  : const float c_filter[4] = { -0.02344, 0.22656, 0.86719, -0.07031 };
mitchell1  : const float c_filter[4] = { -0.01476, 0.25608, 0.78212, -0.02344 };
mitchell2  : const float c_filter[4] = { 0.01563, 0.35938, 0.48438, 0.14063 };
gauss      : const float c_filter[5] = { 0.00020, 0.20596, 0.78008, 0.01375, 0.00000 };
sqrtgauss  : const float c_filter[5] = { 0.00346, 0.28646, 0.65805, 0.05199, 0.00004 };
sinc       : const float c_filter[6] = { 0.00052, -0.02847, 0.23221, 0.87557, -0.08648, 0.00665 };
lanczos4   : const float c_filter[4] = { -0.01773, 0.23300, 0.86861, -0.08388 };
lanczos5   : const float c_filter[5] = { -0.04769, 0.25964, 0.89257, -0.11554, 0.01102 };
lanczos6   : const float c_filter[6] = { 0.00738, -0.06800, 0.27101, 0.89277, -0.13327, 0.03011 };

These are actually pairs of filters to create adjacent pixels in a double-resolution output. The second filter of each pair is simply the above but in reverse order (so the partner for linear is 0.75, 0.25).

To use these, you scan it over the source image and apply centered at each pixel. This produces all the odd pixels in the output. Then you take the filter and reverse the order of the coefficients and scan it again, this produces all the even pixels in the output (you may have to switch even/odd, I forget which is which).

These are created by taking the continuous filter function and sampling at 1/4 offset locations - eg. if 0 is the center (maximum) of the filter, you sample at -0.75,0.25,1.25, etc.
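
That sampling can be sketched like so (a hypothetical generator, not the post's code); with a tent filter it reproduces the linear pair {0.25, 0.75} above :

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// hypothetical continuous filter shape : the tent (linear) filter
double tent_filter(double t) { return std::fabs(t) < 1.0 ? 1.0 - std::fabs(t) : 0.0; }

// Build one of the pair of doubling filters by sampling a continuous
// filter F at 1/4-pixel offsets from the tap grid; the partner filter
// is the same taps in reverse order (or use offset 0.75).
std::vector<double> doubling_filter(double (*F)(double), int ntaps, double offset)
{
    std::vector<double> taps(ntaps);
    double sum = 0.0;
    for (int i = 0; i < ntaps; i++)
    {
        // tap positions ..., -0.75, 0.25, 1.25, ... relative to the filter center
        double t = (i - ntaps / 2) + offset;
        taps[i] = F(t);
        sum += taps[i];
    }
    for (double& v : taps) v /= sum; // normalize to sum 1
    return taps;
}
```

doubling_filter(tent_filter, 2, 0.25) gives {0.25, 0.75}; offset 0.75 gives the reversed partner {0.75, 0.25}.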

And here's the same thing with a 1.15 X blur built in :

box        : const float c_filter[1] = { 1.0 };
linear     : const float c_filter[2] = { 0.30769, 0.69231 };
quadratic  : const float c_filter[3] = { 0.00000, 0.33838, 0.66162 };
cubic      : const float c_filter[5] = { 0.01586, 0.33055, 0.54323, 0.11034, 0.00001 };
mitchell0  : const float c_filter[5] = { -0.05174, 0.30589, 0.77806, -0.03143, -0.00078 };
mitchell1  : const float c_filter[5] = { -0.02925, 0.31410, 0.69995, 0.01573, -0.00052 };
mitchell2  : const float c_filter[5] = { 0.04981, 0.34294, 0.42528, 0.18156, 0.00041 };
gauss      : const float c_filter[6] = { 0.00000, 0.00149, 0.25842, 0.70629, 0.03379, 0.00002 };
sqrtgauss  : const float c_filter[6] = { 0.00000, 0.01193, 0.31334, 0.58679, 0.08726, 0.00067 };
sinc       : const float c_filter[7] = { 0.00453, -0.05966, 0.31064, 0.78681, -0.03970, -0.00277, 0.00015 };
lanczos4   : const float c_filter[5] = { -0.05129, 0.31112, 0.78006, -0.03946, -0.00042 };
lanczos5   : const float c_filter[6] = { 0.00499, -0.09023, 0.33911, 0.80082, -0.04970, -0.00499 };
lanczos6   : const float c_filter[7] = { 0.02600, -0.11420, 0.34931, 0.79912, -0.05497, -0.00837, 0.00312 };

The best doubling filters to my eyes are sinc and lanczos5, they have a good blend of sharpness and lack of artifacts. Stuff like gauss and cubic are too blurry, but are very smooth ; lanczos6 is sharper but has more ringing and stair-steps; wider lanczos filters get worse in that way. Sinc and lanczos5 without any blur built in can have a little bit of visible stair-steppiness (there's an inherent tradeoff in linear upsampling between sharpness and stair-steps) (by stair steps I mean the ability to see the original pixel blobs).

03-24-11 | Some Car videos I like

Most car racing is just excruciatingly boring. In-car time attack videos in crazy fast cars, blah blah, boring. Some exceptions :

YouTube - On Board with Patrick Long at Lime Rock 2010
It's cool to actually hear the driver talk about what he's thinking. All throughout racing there's tons of crazy stuff going on, but you can't appreciate it from watching cuz you don't know all the subtle things the drivers are thinking about; it's actually very strategic, one move sets up the next. It's so much more interesting with the voice over.

It's pretty nuts the way the Nurburgring races run cars of all kinds of different speeds. You've got crazy race cars running with just slightly modified road cars, and that leads to lots of passing and action, way better than something like F1 where everyone is the same speed and you can never pass :
YouTube - Corvette Z06 GT3 vs. Porsche Cayman @ Nürburgring Nordschleife VLN
YouTube - BMW Z4 M Coupe vs Porsche 997 RSR @ Nurburgring 24 Hr Race

Then you've got people who take their race car out during normal Nurburgring lapping days : (the Alzen 996 Turbo runs 6:58 on the ring, one of the fastest times ever)
YouTube - Porsche Team Jurgen Alzen Motorsport

The other way you get exciting races is in amateur races where you have some very fast cars, and some cars that are woefully bad and spinning out - especially when the fast cars fail to qualify and have to start at the back of the pack and move forward :
YouTube - Scott Goodyear On Board - 1988 Rothmans Porsche Turbo Cup Series Mont Tremblant Race
NASA GTS Putnam Park, May 15, 2010 on Vimeo

(BTW the Porsche Carrera Cup races are some of the most boring races I've ever seen; all the cars are identical so they can never pass, and the drivers are all rich amateur boneheads who can hardly work their auto-blipping sequential shifter)

The RUF Yellowbird might be the "lairiest" car ever. Take an old Porsche with no traction, lighten it and stick a giant turbo in it. It did an 8:05 at the ring with the tail sliding the entire time. Completely insane.
YouTube - Ruf Yellowbird DRIFT in Nurburgring
YouTube - 930 Ruf CTR Yellowbird on nurburgring
YouTube - Insane driving in PorscheRUF Yellowbird - Nurburgring Hotlap

I like this video as contrast, it shows how hard it is to drift in the new 911 (even the GT3, which is much easier to drift than the base models). What you have to do is come in to a corner very hot, over 60 mph, brake very hard and very late, continue braking as you turn in ("trail brake"), this should get the weight loaded up on the front wheels and make the rear end light, now you heel and toe into 1st, wait for the nose to get turned in a bit, then on power hard out of the corner. It's much easier going downhill too.
YouTube - Porsche 911 (997) GT3 drifting

Some other random car shite :

Weight distributions :

Porsche 997 C2 : 38% front , 62% rear
Porsche 997 C4 : 40% front , 60% rear
Porsche Cayman : 45% front , 55% rear
Lotus Elise    : 38% front , 62% rear
Lotus Evora    : 39% front , 61% rear

I was surprised how rear-biased the Loti are. So the best handling cars in the world (Loti) have a 911-like rear weight bias. Granted, the weight at the wheels is not the whole story, the location of the engine lump matters a lot for dynamic weight transfer purposes, but still. Weight under acceleration and lateral forces under cornering would tell you a more complete story.

(people often say a car has a 50/50 weight distribution and thus it's "perfectly balanced" - but that's only true when it's not moving; under acceleration it gets more weight in the rear, and under braking it gets more in the front; I think the best cars are slightly rear-biased, like the Cayman, because under braking they become only slightly front-biased; front-biased cars get dangerously light in the rear under hard braking)

It's also interesting to compare tire sizes :

Porsche 997 C2 : 235 front , 295 rear
Porsche Cayman : 235 front , 265 rear
Lotus Evora    : 225 front , 255 rear 
Lotus Elise 1  : 185 front , 205 rear 
Lotus Elise 2  : 175 front , 225 rear 
Lotus Exige    : 195 front , 225 rear 
Honda S2000 AP1: 205 front , 225 rear
Mazda RX8      : 225 front , 225 rear

The differences are revealing. The Evora for example weighs about the same as a 911 GT3 and has the same weight distribution, but has much less staggered tire sizes. This means the tail will come out more easily on the Evora, you generally have less rear grip. The Cayman has much less rear weight bias but has the same wide rear tires, which again edges towards understeer. The Elise setup was changed during its life to a much more staggered setup (its much narrower tires in general are due to its much lower weight).

(I also tossed in the AP1 S2000 and the RX8 since they are just about the only OEM cars that are tweaked for oversteer; the newer S2000's are on a 255 rear cuz honda pussed out; note that unlike the Elise, the S2000 actually weighs close to the same as the Cayman and Evora, yet is on much narrower tires; that provides more "driving pleasure")

(BTW it's risky to try to learn too much from Lotus because they do things so differently from any other car maker; they use stiff springs, *NO* sway bar or very weak sway bar, no LSD, generally narrow front tires for quick steering, etc.)

For my reference :

BMW M coupe (E86) (2007)
330 hp , 262 lb-ft
3230 curb weight (manufacturer spec)

Cayman S (987.2) (2009)
320 hp, 273 lb-ft
2976 curb weight (manufacturer spec)

I really like the M coupe ; I think it's the last great BMW, reasonably light weight, and tuned for oversteer from the factory. But it is just not as good as the Cayman in pretty much any way. It's also got a much smaller interior and much less cargo space. It's also really not much cheaper, because it's rare and is holding value well, while used Porsche values fall fast. The only real advantage is that the BMW engine is a bit better (it has more potential), and the OEM suspension setup is more enthusiast-oriented. The M coupe would be so much better if they hadn't separated the boot from the cabin; it should have been a proper hatchback; that would provide more feeling of space in the cabin, and much more cargo room. Instead you get a small claustrophobic cabin, and a small boot.

Sometimes I lust after the really old BMW's, like the E28 M5 or the E30 M3; I love how small and boxy they are, and the downward-pointed noses, but they are just so far off modern performance, you would have to do a lot of work on them (suspension, engine). So then I look at newer ones, like the E46 M3, but they really have most of the disadvantages of a new car - big, heavy, tuned for safety, etc. I wind up at the M coupe as sort of the sweet spot of old values and new engineering, but then it just doesn't make sense compared to the Cayman either.

BTW the 2009 Cayman is a big improvement over the earlier cars. It's got a new engine that is really much better; if you just look at the figures it doesn't look like a big improvement (20 hp or something) but that hides the real value - it doesn't blow up like the old ones do; the older ones have power steering problems, air-oil separator problems, oil starvation problems, all of which is fixed in post-2009 cars. But you have to wait until 2013 because the real values in used Porsches come after the lease returns start showing up - when the cars are 4 years old.

Continuing the for my reference theme :

Porsche GT3 (997.1) (2007)
415 hp, 300 lb-ft
3075 - 3262 curb weight (manufacturer spec)

note 1 : it's very hard to find accurate weights of cars. Wikipedia is all over the place with inaccurate numbers. For one thing, the US and European official standards for how weight is measured differ (eg. what kind of fluids are required, whether a standard driver weight is added, etc); also the weights differ with options, and one of the tricks manufacturers play is to measure the weight without options, but then make those options mandatory. This is another one of those things you would like to see car magazines report on - give you true weights - but of course they don't.

note 2 : the GT3 is more than 2X the price of the M Coupe or Cayman, but in terms of depreciation I'm not sure it costs much more (and the M Coupe is actually cheaper than the Cayman I believe because it will depreciate less). In some sense, you should measure car cost not by initial outlay, but rather by the annual depreciation + opportunity cost of the money sunk. There are some classic cars that have expected ZERO depreciation - that is, other than opportunity cost and transaction cost, they are completely free to own (Jag E-types, classic Ferraris, etc.)

However, buying a car based on expected depreciation sucks as a life move. You have to be constantly worried about how your usage is affecting possible resale value. It's like the future purchaser is watching your every move and judging you. OMG you parked outside in the rain? That's $5k off. You drove in a gravel parking lot? That's $5k off. You put too many miles on it? That's $10k off. It's a horrible feeling. It's much more fun to buy a car and assume you will never sell it and just do as you please with it. (of course many people allow this disease to affect their home ownership experience - they are constantly thinking about how what they do to their home will affect resale, which is a sad way to live).

So for example I think a $70k GT3 is actually "cheaper" than a $50k base 997 at the moment - but that thinking sucks you into depreciation horror.

The Boss 302 Mustang has laid down some great lap times, in M3 / Porsche territory, for about half the price. The old "retard's wisdom" - that European cars may be slower in a straight line (than cheaper American cars), but they go faster around corners - is no longer true. (In fact a recent comparo of the Porsche Turbo S vs. the Corvette ZR1 found the Porsche to be faster in a straight line, but the Vette to be faster around a bendy track! That's the exact opposite of the old stereotype which people still deeply associate with these cars). That said, I'm totally uninterested in this car (the Mustang). It weighs over 3600 pounds (the GT500 is over 3800). It's got those damn tall doors and tiny glass that make it feel like a coffin. I drove a standard Mustang with the same body style recently as a rental car, and it was just awful; it felt so huge, so unwieldy. The whole high-power giant heavy car thing is such a turn off. The only American sports car is the Pontiac Solstice. But the other reason I don't like the 302 is Aero.

I believe Aero is a very bad thing for road cars. Much of the speed of the 302 comes from aerodynamic bits (henceforth Aero). The same is true of the Viper ACR, and even the Porsche GT3. These cars have laid down much faster track times than their progenitors, and the main difference is the Aero (the other big difference is stiff track suspension). You look at the lap time and think the car is much improved, but the fact is you will never actually experience that improvement on the road. Aero only has a big effect over 100 mph; you are not taking corners at 120 on the road.

And furthermore - even if you *are* taking fast corners I contend that aero is a bad thing. The reason is that amateur drivers don't know how to handle aero and can get unpredictable effects from it. For example if you are going through a corner at 120 steady state and you brake, you decrease your downforce and suddenly can lose grip and either spin out or understeer. Aero can create false confidence because the car feels very stable and planted, but that's only there as long as you keep on the gas. So IMO when a car is improved by getting better Aero, that is not actually a benefit to the consumer.

Take a car and slap a big wing on it; the lap time goes down by 3 seconds. Is it a better road car? No, probably not. But if you look at lap time rankings it seems much better than its rivals.

I believe that lap times in general are not a great way to judge cars. Granted it is much better than 0-60 or 1/4 mile times, the way the US muscle mags compared cars in the old days, but lap times can be gamed in weird ways (tires, aero, suspension, etc) that aren't actually beneficial to the buyer. (of course, even worse is looking at top speeds, which that moron Clarkson seems to fixate on; he even does the most lol-worthy thing of all, which is to compare top speeds of cars that are *limited*, like he'll say that some shit sedan with a top speed of 160 is "faster than a BMW M5" ; umm that's because the M5 is limited at 155 you giant fucking moron, and even if it wasn't, top speed is totally irrelevant because it depends so much on drag and gearing; you can actually greatly improve most cars by regearing them to lower their top speed to 120 or so).

Another issue is that lap times heavily reward grip. And grip is not really what you want. You want a bit of tail sliding fun, and ideally at a safe speed in a predictable way, which means reasonably low grip. This is part of what makes the Miata so brilliant, they intentionally designed it with low grip, so that even though it doesn't have much power, you could still get the tail out (the original was on 185 width tires).

I love the Best Motoring car reviews ; for example in this one :
YouTube - Boxster S vs Elise vs S2000 Touge Test & Track Battle - Best Motoring International
Tsuchiya rates the cars by how progressively they go from stable to spinning ; the ideal is a car that is steady and gives you lots of feedback, the worst is a car that suddenly goes to spinning without warning you.

If you just look at lap times, you will favor cars with aero downforce, stiff suspension, and wide tires. That's not really what you want. You want cars with "driving pleasure". For some reason, the Japanese seem to be the only manufacturers who get this; cars like the Miata, S2000, RX8 are not about putting up figures, they are about balance, and all those little things that go into a car making you happy (such as shift feel and getting the rev ranges right and so on).

03-24-11 | Image filters and Gradients

A friend recently pointed me at John Costella's supposedly superior edge detector . It's a little bit tricky to figure out what's going on there because his writing is quite obtuse, so I thought I'd record it for posterity.

You may recognize Costella's name as the guy who made Unblock which is a rather interesting and outside-the-norm deblocker. He doesn't have an image science background, and in the case of Unblock that led him to some ideas that normal research didn't find. Did he do it again with his edge detector?

Well, no.

First of all, the edge detector is based on what he calls the magic kernel . If you look at that page, something is clearly amiss.

The discrete 1d "magic kernel" for upsampling is [1,3,3,1] (unnormalized). Let's back up a second, we wish to upsample an image without offseting it. That is, we replace one pixel with four and they cover the same area :

+---+     +-+-+
|   |     | | |
|   |  -> +-+-+
|   |     | | |
+---+     +-+-+

A 1d box upsample would be convolution with [1,1] , where the output discrete taps are half the distance apart of the original taps, and offset by 1/4.

The [1331] filter means you take each original pixel A and add the four values A*[1331] into the output. Or if you prefer, each output pixel is made from (3*A + 1*B)/4 , where A is the original pixel closer to the output and B is the one farther :

| A | B |

| |P| | |

P = (3*A + 1*B)/4

but clever readers will already recognize that this is just a bilinear filter. The center of P is 1/4 of an original pixel distance from A, and 3/4 of a pixel distance from B, so the 3,1 taps are just a linear filter.

So the "magic kernel" is just bilinear upsampling.

Costella shows that Lanczos and Bicubic create nasty grid artifacts. This is not true, he simply has a bug in his upsamplers.

The easiest way to write your filters correctly is using only box operations and odd symmetric filters. Let me talk about this for a moment.

In all cases I'm talking about discrete symmetric filters. Filters can be of odd width, in which case they have a single center tap, eg. [ a,b,c,b,a ] , or even width, in which case the center tap is duplicated : [a,b,c,c,b,a].

Any even filter can be made from an odd filter by convolution with the box , [1,1]. (However, it should be noted that an even "Sinc" is not made by taking an odd "Sinc" and convolving with box, it changes the function).

That means all your library needs is odd filters and box resamplers. Odd filters can be done "in place", that is from an image to an image of the same size. Box upsample means replicate a pixel with four identical ones, and box downsample means take four pixels and replace them with their average.

To downsample you just do : odd filter, then box downsample.
To upsample you just do : box upsample, then odd filter.

For example, the "magic kernel" (aka bilinear filter) can be done using an odd filter of [1,2,1]. You just box upsample then convolve with 121, and that's equivalent to upsampling with 1331.

Here are some odd filters that work for reference :

Box      : 1.0
Linear   : 0.25,0.50,0.25
Quadratic: 0.128,0.235,0.276,0.235,0.128
Cubic    : 0.058,0.128,0.199,0.231,0.199,0.128,0.058
Gaussian : 0.008,0.036,0.110,0.213,0.267,0.213,0.110,0.036,0.008
Mitchell1: -0.008,-0.011,0.019,0.115,0.237,0.296,0.237,0.115,0.019,-0.011,-0.008
Sinc     : -0.003,-0.013,0.000,0.094,0.253,0.337,0.253,0.094,0.000,-0.013,-0.003
Lanczos4 : -0.008,0.000,0.095,0.249,0.327,0.249,0.095,0.000,-0.008
Lanczos5 : -0.005,-0.022,0.000,0.108,0.256,0.327,0.256,0.108,0.000,-0.022,-0.005

Okay, so now let's get back to edge detection. First of all let's clarify something : edge detectors and gradients are not the same thing. Gradients are slopes in the image; eg. big planar ramps may have large gradients. "edges" are difficult to define things, and different applications may have different ideas of what should constitute an "edge". Sobel kernels and such are *gradient* operators not edge detectors. The goal of the gradient operator is reasonably well defined, in the sense that if our image is a height map, the gradient should be the slope of the terrain. So henceforth we are talking about gradients not edges.

The basic centered difference operator is [-1,0,1] and gives you a gradient at the middle of the filter. The "naive difference" (Costella's terminology) is [-1,1] and gives you a gradient half way between the original pixels.

First of all note that if you take the naive difference at two adjacent pels, you get two gradients at half pel locations; if you want the gradient at the integer pixel location between them you would combine the taps - [-1,1,0] and [0,-1,1] - the sum is just [-1,0,1] , the central difference.

Costella basically proposes using some kind of upsampler and the naive difference. Note that the naive difference operator and the upsampler are both just linear filters. That means you can do them in either order, since convolution commutes, A*B = B*A, and it also means you could just make a single filter that does both.

In particular, if you do "magic upsampler" (bilinear upsampler) , naive difference, and then box downsample the taps that lie within an original pixel, what you get is :

-1  0  1
-6  0  6
-1  0  1

A sort of Sobel-like gradient operator (but a bad one). (this comes from 1331 and the 3's are in the same original pixel).

So upsampling and naive difference is really just another form of linear filter. But of course anybody who's serious about gradient detection knows this already. You don't just use the Sobel operator. For example in the ancient/classic Canny paper, they use a Gaussian filter with the Sobel operator.

One approach to making edge detection operators is to use a Gaussian Derivative, and then find the discrete approximation in a 3x3 or 5x5 window (the Scharr operator is pretty close to the Gaussian Derivative in a 3x3 window, though Kroon finds a slightly better one). Of course even Gaussian Derivatives are not necessarily "optimal" in terms of getting the direction and magnitude of the gradient right, and various people (Kroon, Scharr, etc.) have worked out better filters in recent papers.

Costella does point out something that may not be obvious, so we should appreciate that :

Gradients at the original res of the image do suffer from aliasing. For example, if your original image is [..,0,1,0,1,0,1,..] , where's the gradient? Well, there are gradients between each pair of pixels, but if you only look at original image pixel locations you can't place a gradient anywhere. That is, convolution with [-1,0,1] gives you zero everywhere.

However, to address this we don't need any "magic". We can just double the resolution of our image using whatever filter we want, and then apply any normal gradient detector at the higher resolution. If we did that on the [0,1,0,1] example we would get gradients at all the half taps.

Now, finally, I should point out that "edge detection" is a whole other can of worms than gradient operators, since you want to do things like suppress noise, connect lines, look for human perceptual effects in edges, etc. There are tons and tons of papers on these topics and if you really care about visual edge detection you should go read them. A good start is to use a bilateral or median filter before the sharpen operator (the bilateral filter suppresses speckle noise and joins up dotted edges), and then sharpen should be some kind of laplacian of gaussian approximation.

03-21-11 | ClipCD

Copy current dir to clipboard :

c:\bat>type clipcd.bat
@echo off
cechonr "clip " > s:\t.bat
cd >> s:\t.bat
REM type r:\t.bat
(cechonr is my variant of "echo" that doesn't put a \n on the end).

I'm sure it could be done easier, but I've always enjoyed this crufty way of making complex batch files by having them write a new batch file. For example I've long done my own savedir/recalldir this way :

c:\bat>type savedir.bat
@echo off
cd > r:\t1.z
cd \
cd > r:\t2.z
zcopy -o c:\bat\echo_off.bat r:\t3.z
attrib -r r:\t3.z
type r:\t2.z >> r:\t3.z
cechonr "cd " >> r:\t3.z
type r:\t1.z >> r:\t3.z
zcopy -o r:\t3.z c:\bat\recalldir.bat
echo cls >> c:\bat\recalldir.bat
call dele r:\t1.z r:\t2.z r:\t3.z
call recalldir.bat

Less useful now that most CLI's have a proper pushdir/popdir. But this is a bit different because it actually makes a file on disk (recalldir.bat), I use it to set my "home" dir and my dos startup bat runs recalldir.

In other utility news, my CLI utils (move,copy,etc) have a new option which everyone should copy - when you have a duplicate name, you can ask it to check for binary identity right there in the prompt :

r:\>zc aikmi.BMP z
 R:\z\aikmi.BMP exists; overwrite? (y/n/A/N/u/U/c/C)?
  (y=yes, n=no, A=all,N=none,u=update newer,U=all,c=check same,C=all)
 R:\z\aikmi.BMP exists; overwrite? (y/n/A/N/u/U/c/C)c
CheckFilesSame : same
 R:\z\aikmi.BMP exists; overwrite? (y/n/A/N/u/U/c/C)y
R:\aikmi.BMP -> R:\z\aikmi.BMP

And of course like all good prompts, for each choice there is a way to say "do this for every prompt".

(BTW if you want a file copier for backing up big dirs, robocopy is quite good. The only problem is the default number of retries is no good; when you hit files with problems it will just hang forever (well, 30 million seconds anyway, which is essentially forever). You need to use /R:10 and /W:10 or something like that).

03-21-11 | Slow Coder

I'm doing the cross platform build script for my RAD library, and I am SO FUCKING SLOW at it. Somebody else could have done it in one day and it's taking me a week.

It reminds me that in some of my less successful job interviews, I tried to be honest about my strengths and weaknesses. I said something along the lines of "if you give me interesting technical work, I'm better than almost anyone in the world, but if you give me boring wiring work, I'm very ordinary, or maybe even worse than ordinary". I didn't get those jobs.

Job interviewing is one of those scenarios where honesty is not rewarded. Employers might give lip service to trying to find out what an employee's really like, but the fact is they are much more likely to hire someone who just says "I'm great at everything" and answers "what is your weakness" with one of those answers like "my greatest weakness is I spend too many hours at work".

It's sort of like the early phase of dating. If you are forthcoming and actually confess any of your flaws, the employer/date is like "eww yuck, if they admit that, they must have something really bad actually wrong with them". You might think it's great to get the truth out in the open right away, see if you are compatible, but all the other person sees is "candidate A has confessed no weaknesses and candidate B has said he has a fear of intimacy and might be randomly emotionally cold to me at times, and that was a really weird thing to say at a job interview".

Furthermore, it's sort of just a faux pas. It's like talking about masturbation around your parents. It's too much sharing with someone you aren't close with yet. All the people who understand the social code of how you're supposed to behave just feel really uncomfortable, like "why the fuck is this guy confessing his honest weaknesses? that is not what you're supposed to do in an interview/date". Job interviews/early dates don't really tell you much deep factual information about a person. There's an obvious code of what you're supposed to say and you just say that. It's really a test of "are you sane enough to say the things you are supposed to in this situation?".

03-19-11 | Fitness Links

I've started working out again recently. I'm trying to do things differently this time, hopefully in a way that leads to more long term good foundational structure for my body problems. Obviously that would have been much easier to do at a young age, but better late than never I guess. I believe that in the past I may have overdeveloped the easy muscles, which is basically the "front" - pecs, abs, biceps, etc. I'm not sure if that contributed to my series of shoulder injuries, but it certainly didn't help.

My intention this time is to try to develop musculature that will help support my unstable shoulders as well as generally help with "programmer's disease". So generally that means strengthening the back, shoulder stabilizers, lots of over-head work, and dynamic work that involves full body moves, flexibility and extension.

The other change is that the gym I'm going to here happens to have no proper weights (aka barbells and racks). Hey dumb gym owners : if you only put ONE thing in your gym, it should be a power rack with barbells - it's the most useful and general purpose single piece of gym equipment. And of course this gym has no power rack, just a bunch of those stupid fucking machines. You could get a full workout with just bodyweight moves for the small muscles and a power rack for the big ones. In fact I would love a gym that's just a big empty room and a bunch of racks and bars, but that's reserved for pro athletes and nutters like crossfit.

Anyway, the one thing they do have is kettlebells, so I'm doing that. It's pretty fun learning the new moves. If you read the forums you'll see a bunch of doofuses talking about how kettlebells "change everything" and are "so much more fun". No, they're not. But they are different. So if you've done normal weights for many years and you're sick of it, it might be a nice change of pace. Learning new moves gives your mind something to do while your body is lugging weight around; it keeps you from dying of boredom.

I'm also trying to avoid all crunch-like movements for abs, that is, all contractions. So far I'm doing a bunch of plank variants, and of course things like overhead farmers walks, but I may have to figure out some more to add to that. One of the best exercises for abs is just heavy deadlifts, but sadly I can't do that in the dumb yuppie gym.

My new links :

YouTube - Steve Cotter Talks Kettlebell Fundamental Steve Cotter Workshop Tour
YouTube - Steve Cotter Snatch
YouTube - Steve Cotter Kettlebell Turkish Get Up Instructional Video
YouTube - Steve Cotter Kettlebell Overhead Press Instructional Video
YouTube - Steve Cotter Kettlebell High Windmill Instructional
YouTube - Steve Cotter Kettlebell Dead Postion Snatch Instructional
YouTube - Steve Cotter Kettlebell Combo Lift Clean Squat Press
YouTube - Steve Cotter Kettlebell Clean Instructional Video
YouTube - Stability Ball Combination Exercises Stability Ball Exercises Oblique & Abs
YouTube - Stability Ball Combination Exercises Stability Ball Exercises Ab Tucks
YouTube - Squat Rx #9 - overhead squat
YouTube - Squat Rx #22 - overhead
YouTube - The Most Fun Abs Exercises You Can Do with a Ball
YouTube - The Evolution of Abs Exercises and Core Workouts
YouTube - Spinal mobility
YouTube - Push Ups with a Vengeance Part 2
YouTube - Push Ups with a Vengeance Part 1
YouTube - PUSH UPS Hindu vs Divebomber Push Up Pushup Cool Pushup
YouTube - Power Clean
YouTube - Power Clean Teaching Combination 3
YouTube - perfectly executed kettlebell full body exercise - turkish get up
YouTube - perfect russian kettlebell push press technique
YouTube - Pahlevan Akbar
YouTube - Naked Get Up
YouTube - Move Better
YouTube - Medicine Ball Ab Workout Exercises
YouTube - Mark Rippetoe The Deadlift Scapula
YouTube - Mark Rippetoe Intro to the Deadlift
YouTube - KILLER ABS - Stability ball workout
YouTube - Kettlebell Swing (Hardstyle)
YouTube - Kettlebell Snatch by Valery Fedorenko
YouTube - Kettlebell Bootstrapper Squat
YouTube - Kettlebell Basics with Steve Cotter
YouTube - Kettlebell Basics - The Kettlebell Press
YouTube - Kadochnikov System - screwing into the floor
YouTube - Kadochnikov System - Crocodile
YouTube - IKFF Joint Mobility Warm-up Phase 1-Part
YouTube - How to Perform the Kettlebell Snatch Steve Cotter Workshop Tour
YouTube - How to Master the Kettlebell Snatch
YouTube - How To Do 90 Degree Pushups
YouTube - How to avoid banging your wrist in Kettlebell Snatch-Steve Cotter
YouTube - Hip opener mobility
YouTube - geoffcraft's Channel
YouTube - Flexibility Drills for Hips & Lower Body
YouTube - Dan John - Teaching Bootstrapper Squat
YouTube - core medicine ball workout 251007
YouTube - Basic Serratus Anterior Activation
YouTube - Band Pulls for better shoulder strength and health
Yoga for Fighters Releasing the Psoas stumptuous.com
Tweaking the Overhead Squat Dislocates, Reaching Back, Grip Width and Mobility Drills - Ground Up Strength
FARMER'S WALK my new quick morning work out - www.MichaelReid.ca

My Old Links :

YouTube - Tommy Kono lecture instruction on Olympic Lifting Part 1
YouTube - Dabaya 5x200 front squat
YouTube - Broadcast Yourself. - stiznel
Yoga Journal - Upward Bow or Wheel Pose
Women's Weight Training
Welcome to CrossFit Forging Elite Fitness
Viparita Dandasana
Training to failure
Training Primer
TN Shoulder Savers 2
TN Shoulder Savers 1
TN Monster Shoulders
TN Band Man
The York Handbalancing Course
The Video FitCast- Episode 6 - Google Video
The TNT Workout Plan - Men's Health
The One Arm Chin-upPull-up
The Coach - Dan John - Lifiting and Throwing
The 2+2 Forums Rotator Cuff Exercises
TESTOSTERONE NATION - The Shoulder Training Bible
TESTOSTERONE NATION - Romanian vs. Stiff-Legged Deadlifts
TESTOSTERONE NATION - Neanderthal No More, Part V
TESTOSTERONE NATION - Most Powerful Program Ever
TESTOSTERONE NATION - Mastering the Deadlift
TESTOSTERONE NATION - HSS-100 Back Specialization
Testosterone Nation - Hardcore Stretching, Part II
Testosterone Nation - Forgotten Squats
TESTOSTERONE NATION - Feel Better for 10 Bucks
TESTOSTERONE NATION - Essential Waterbury Program Design
TESTOSTERONE NATION - Core Training for Smart Folks
TESTOSTERONE NATION - Computer Guy Workouts
TESTOSTERONE NATION - Computer Guy part 2
TESTOSTERONE NATION - A Thinking Man's Guide to Sets and Reps
tennis ball ART
Stretching and Flexibility - Table of Contents
Squat Rx
San Francisco Sport and Spine Physical Therapy San Francisco, The Castro Yelp
Romanian Dead lift
Rippetoe-Starting Strength FAQ - Bodybuilding.com Forums
Rippetoe's Starting Strength - Bodybuilding.com Forums
Rippetoe's program - Bodybuilding.com Forums
Rack Pull
PSOAS Massage & Bodywork - San Francisco's Best Massage
Posture for a Healthy Back
PNF Stretching
Physical Therapy Corner Iliotibial Band Friction Syndrome Treatment
OHPositionSnatchFB10-6-07.mov (videoquicktime Object)
My RSI Story
MIKE'S GYM Programs
Mastering the Deadlift Part II
Madcow Training - Table of Contents, 5x5 Programs, Dual Factor Theory, Training Theory
Low Back Program Exercises — Portal
Low Back Pain Exercise Guide
Kyphosis - Robb Wolf Shorties Catalyst Athletics The Performance Menu
Kettlebells Training Video - The Turkish Getup Exercise
Iliotibial band stretch
HST - Charles T. Ridgely
HSNHST Articles
How to benefit from Planned Overtraining
Hollywood Muscle
Hindu Pushups
Gym Jones - Knowledge
Guide to Novice Barbell Training, aka the Official RIPPETOE-STARTING STRENGTH FAQ - Bodybuilding.com Forums
Grease the Groove for Strength A Strength Training and Powerlifting article from Dragon Door Publications
Got Rings
GNC Pro Performance® - Therapy - Pelvic Tilts
Girl Squat
Foam Roller Exercises
Finger training
ExRx Exercise & Muscle Directory
eMedicine - Hamstring Strain Article by Jeffrey M Heftler, MD
EliteFitness.com Bodybuilding Forums - View Single Post - HELLO! your on STEROIDS REMEMBER
Deepsquatter back & abs
Dan John Front Squat
CrossFit Exercises
Chakra-asana - The Wheel Posture - Yoga Postures Step-By-Step
Building an Olympic Body through Bodyweight Conditioning A Bodyweight Strength Training article from Dragon Door Publication
Bodybuilding.com Presents Diet Calculation Results
Bodybuilding.com - Patrick Hagerman - Flexibility For Swimming!
BikeTheWest.com - Nevada's Best Bike Rides
BetterU News - Issue #35 - Best Rear Delt Exercise, The Formulator for Forearms, Lower Back Pain and Bodybuliding
Beast Skills
Beast Skills - Tutorials for Bodyweight Feats

03-14-11 | cbloom.com/exe BmpUtil update

I put up a new BmpUtil on the cbloom.com/exe page . Release notes :

bmputil built Mar 14 2011 12:49:42
bmp view <file>
bmp info <file>
bmp copy <fm> <to> [bits] [alpha]
bmp jpeg <fm> <to> [quality]
bmp crop <fm> <to> <w> <h> [x] [y]
bmp pad <fm> <to> <w> <h> [x] [y]
bmp cat <h|v> <fm1> <fm2> <to>
bmp size <fm> <to> <w> [h]
bmp mse <im1> <im2>
bmp median <fm> <to> <radius> [selfs]
file extensions : bmp,tga,png,jpg
  jpg gets quality from last # in name

fimutil by cbloom built Mar 14 2011 12:50:56
fim view <file>
fim info <file>
fim copy <fm> <to> [planes]
fim mse <fm> <to>
fim size <fm> <to> <w> [h]
fim make <to> <w> <h> <d> [r,g,b,a]
fim eq <fm> <to> <eq>
fim eq2 <fm1> <fm2> <to> <eq>
fim cmd <fm> <to> <cmd>  (fim cmd ? for more)
fim interp <to> <fm1> <fm2> <fmt>
fim filter <fm> <to> <filter> [repeats] ; (filter=? for more)
fim upfilter/double <fm> <to> <filter> [repeats]
fim downfilter/halve <fm> <to> <filter> [repeats]
fim gaussian <fm> <to> <sdev> [width]
fim bilateral <fm> <to> <spatial_sdev> <value_sdev> [spatial taps]
file extensions : bmp,tga,png,jpg,fim
 use .fim for float images; jpg gets quality from last # in name

fim cmd <fm> <to> <cmd>
 use cmd=? for help

Some notes :

Most of the commands will give more help if you run them, but you may have to give some dummy args to make them think they have enough args. eg. run "fimutil eq ? ? ?"

FimUtil sizers are much better than the BmpUtil ones. TODO : any resizing except doubling/halving is not very good yet.

FimUtil eq & eq2 provide a pretty general equation parser, so you can do any kind of per-sample manipulation you want there.

"bmputil copy" is how you change file formats. Normally you put the desired jpeg quality in the file name when you write jpegs, or you can use "bmputil jpeg" to specify it manually.

Unless otherwise noted, fim pixels are in [0,1] and bmp pixels are in [0,255] (just to be confusing, many of the fimutil commands do a *1/255 for you so that you can pass [0,255] values on the cmd line); most fim ops do NOT enforce clamping automatically, so you may wish to use ClampUnit or ScaleBiasUnit.

Yeah, I know imagemagick does lots of this shit but I can never figure out how to use their commands. All the source code for this is in cblib, so you can examine it, fix it, laugh at it, what have you.

03-12-11 | C Coroutines with Stack

It's pretty trivial to do the C Coroutine thing and just copy your stack in and out. This lets you have C coroutines with stack - but only in a limited way.


Major crack smoking. This doesn't work in any kind of general way; you would have to find the right hack per compiler, per build setting, etc.

Fortunately, C++ has a mechanism built in that lets you associate some data with each function call and makes those variable references automatically rebase to that chunk of memory - it's called member variables, just use that!

03-11-11 | Worklets , IO , and Coroutines

So I'm working on this issue of combining async CPU work with IO events. I have a little async job queue thing, that I call "WorkMgr" and it runs "Worklets". See previous main post on this topic :

cbloom rants 04-06-09 - The Work Dispatcher

And also various semi-related other posts :
cbloom rants 09-21-10 - Waiting on Thread Events
cbloom rants 09-21-10 - Waiting on Thread Events Part 2
cbloom rants 09-12-10 - The deficiency of Windows' multi-processor scheduler
cbloom rants 04-15-09 - Oodle Page Cache

So I'm happy with how my WorkMgr works for pure CPU work items. It has one worker thread per core, the Worklets can be dependent on other Worklets, and it has a dispatcher to farm out Worklets using lock-free queues and all that.

(ASIDE : there is one major problem that ryg describes well , which is that it is possible for worker threads that are doing work to get swapped out for a very long time while workers on another core that could have CPU time can't find anything to do. This is basically a fundamental issue with not being in full control of the OS, and is related to the "deficiency of Windows' multi-processor scheduler" noted above. BTW this problem is much worse if you lock your threads to cores; because of that I advise that in Windows you should *never* lock your threads to cores, you can use affinity to set the preferred core, but don't use the exclusive mask. Anyway, this is an interesting topic that I may come back to in the future, but it's off topic so let's ignore it for now).

So the funny issues start arising when your work items have dependencies on external non-CPU work. For concreteness I'm going to call this "IO" (File, Network, whatever), but it's just anything that takes an unknown amount of time and doesn't use the CPU.

Let's consider a simple concrete example. You wish to do some CPU work (let's call it A), then fire an IO and wait on it, then do some more CPU work B. In pseudocode form :

    A();
    h = IO();
    Wait(h);
    B();
Now obviously you can just give this to the dispatcher and it would work, but while your worklet is waiting on the IO it would be blocking that whole worker thread.

Currently in my system the way you fix this is to split the task. You make two Worklets, the first does work A and fires the IO, the second does work B and is dependent on the first and the IO. Concretely :


Worklet1 :
    A();
    h = IO();
    QueueWorklet( Worklet2, Dependencies{ h } );

Worklet2 :
    B();

so Worklet1 finishes and the worker thread can then do other work if there is anything available. If not, the worker thread goes to sleep waiting for one of the dependencies to be done.

This way works fine, it's what I've been using for the past year or so, but as I was writing some example code it occurred to me that it's just a real pain in the ass to write code this way. It's not too bad here, but if you have a bunch of IO's, like do cpu work, IO, do cpu work, more IO, etc. you have to make a whole chain of functions and get the dependencies right and so on. It's just like writing code for IO completion callbacks, which is a real nightmare way to write IO code.

The thing that struck me is that basically what I've done here is create one of the "ghetto coroutine" systems. A coroutine is a function call that can yield, or a manually-scheduled thread if you like. This split up Worklet method could be written as a state machine :

  if ( state == 0 )
    A();
    h = IO();
    state++; enqueue self{ depends on h };
  else if ( state == 1 )
    B();
In this form it's obviously the state machine form of a coroutine. What we really want is to yield after the IO and then be able to resume back at that point when some condition is met. Any time you see a state machine, you should prefer a *true* coroutine. For example, game AI written as a state machine is absolutely a nightmare to work with. Game AI written as simple linear coroutines is very nice :

    WalkTo( box )
    obj = Open( box )
    PickUp( obj )

with implicit coroutine Yields taking place in each command that takes some time. In this way you can write linear code, and when some of your actions take undetermined long amounts of time, the code just yields until that's done. (in real game AI you also have to handle interruptions and such things).

So, there's a cute way to implement coroutines in C using switch :

Protothreads - Lightweight, Stackless Threads in C
Coroutines in C

So one option would be to use something like that. You would put the hidden "state" counter into the Worklet work item struct, and use some macros and then you could write :

  crStart   // macro that does a switch on state

    A();
    h = IO();

  crWait(h,1)  // macro that does re-enqueue self with dependency, state = 1; case 1:

    B();

  crEnd

that gives us linear-looking code that actually gets swapped out and back in. Unfortunately, it's not practical, because this C-coroutine hack doesn't preserve local variables, creates weird scopes all over, and just is not usable for anything but super simple code. (The switch method gives you stackless coroutines; obviously Worklet can be a class and you could use member variables.) Implementing a true (stackful) coroutine system doesn't really seem practical cross-platform (it would be reasonably easy to do for any one platform - you just have to record the stack in crStart and copy it out in crWait - but it's too much of a low-level hacky mess that would require intimate knowledge of the quirks of each platform and compiler). (You can do coroutines in Windows with fibers; not sure that would be a viable solution because I've always heard "fibers are bad mmkay".)
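To make the switch trick concrete, here's a toy version of those macros (mine, in the Protothreads style, not Protothreads' actual API) - note how 'phase' has to live in the struct, because a plain local would not survive the return :

```cpp
// Toy stackless-coroutine macros. The coroutine's "program counter" is an
// int stored in the object; crYield returns out of the function and drops
// a case label at the resume point (the classic Duff's-device trick).
#define crStart(st)    switch (st) { case 0:
#define crYield(st, n) do { (st) = (n); return false; case n:; } while (0)
#define crFinish(st)   } (st) = 0; return true

struct Worklet
{
    int state = 0;
    int phase = 0;   // a "local" hoisted into the struct so it survives yields

    // returns true once the worklet has run to completion
    bool Run()
    {
        crStart(state);

        phase = 1;              // work A
        crYield(state, 1);      // pretend we re-enqueue on an IO handle here

        phase = 2;              // work B

        crFinish(state);
    }
};
```

Each call to Run() advances the worklet by one segment, which is exactly the re-enqueue-with-dependency behavior described above, minus the dependency plumbing.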

Aside : some links on coroutines for C++ :

Thinking Asynchronously in C++ Composed operations, coroutines and code makeover
Dr Dobbs Cross-Platform Coroutines in C++
COROUTINE (Keld Helsgaun)
Chapter 1. Boost.Coroutine proposal

The next obvious option is a thread pool. We go ahead and let the work item do IO and put the worker thread to sleep, but when it does that we also fire up a new worker thread so that something can run. Of course to avoid creating new threads all the time you have a pool of possible worker threads that are just sitting asleep until you need them. So you do something like :

  A();
  h = IO();

  number of non-waiting workers --;
  CheckThreadPool();

  Wait(h);

  number of non-waiting workers ++;

CheckThreadPool :

  if ( number of non-waiting workers < desired number of workers &&
    is there any work to do )
    start a new worker from the pool

  if ( number of non-waiting workers > desired number of workers )
    sleep worker to the pool

// CheckThreadPool also has to be called any time a work item is added to the queue

or something like that. Desired number of workers would be number of cores typically. You have to be very careful of the details of this to avoid races, though races here aren't the worst thing in the world because they just mean you have not quite the ideal number of worker threads running.
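As a sketch of just the bookkeeping (the names and signature here are mine, not from any real API) : the worker count can be a simple atomic, and the check returns what action to take. A race only means a briefly wrong worker count, as noted above :

```cpp
#include <atomic>

// Count of workers that are actually runnable (not blocked in a Wait).
std::atomic<int> g_activeWorkers{0};
const int g_desiredWorkers = 4;   // typically the number of cores

// Decide what the pool should do right now :
//   +1 = wake a parked worker, -1 = park this worker, 0 = do nothing.
int CheckThreadPool(bool workAvailable)
{
    int active = g_activeWorkers.load(std::memory_order_relaxed);
    if (active < g_desiredWorkers && workAvailable)
        return +1;
    if (active > g_desiredWorkers)
        return -1;
    return 0;
}
```

A worker would decrement g_activeWorkers and call this before blocking, then increment it again after waking, so the pool always has roughly g_desiredWorkers runnable threads.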

This is a reasonably elegant solution, and on Windows is probably a good one. On the consoles I'm concerned about the memory use overhead and other costs associated with having a bunch of threads in a pool.

Of course if you were Windows-only, you should just use the built-in thread pool system. It's been in Windows forever in the form of IO Completion Port handling. New in Vista is a much simpler, more elegant thread pool that basically just does exactly what you want a thread pool to do, and is managed by the kernel so it's fast and robust and all that. For example, with a custom system you have to be careful to use ThreadPoolWait() instead of a normal OS Wait(), and you don't get nice behavior when you do something that puts you to sleep in other ways (like locking a mutex or whatever).

Some links on Windows thread pools and the old IO completion stuff :

MSDN Pooled Threads Improve Scalability With New Thread Pool APIs (Vista)
MSDN Thread Pools (Windows) (Vista)
MSDN Thread Pooling (Windows) (old)
MSDN Thread Pool API (Windows) (Vista)
So you need a worker thread pool... - Larry Osterman's WebLog - Site Home - MSDN Blogs
Managed ThreadPool vs Win32 ThreadPool (pre-Vista) - Junfeng Zhang's Windows Programming Notes - Site Home - MSDN Blogs
Dr Dobbs Multithreaded Asynchronous IO & IO Completion Ports
Concurrent, Multi-Core Programming on Windows and .NET (Part II -- Threading Stephen Toub)
MSDN Asynchronous Procedure Calls (Windows)
Why does Win32 even have Fibers - Larry Osterman's WebLog - Site Home - MSDN Blogs
When does it make sense to use Win32 Fibers - Eric Eilebrecht's blog - Site Home - MSDN Blogs
Using fibers to simplify enumerators, part 3 Having it both ways - The Old New Thing - Site Home - MSDN Blogs

So I've rambled a while and don't really have a point. The end.

03-11-11 | Rant Rant Rant

Well I found out you're not allowed to contribute to a Roth IRA if you make more than $120k or something. WTF god damn unnecessarily complicated tax laws. So now I get to deal with penalties for excess contribution. If you just leave it in there you get a 6% penalty *every year*. God damnit, the fucking Roth limit is $5000 anyway, it's not like the government is missing out on a ton of tax revenue because I made a contribution, it's just part of the fucking retarded way that they raise money without "raising taxes" because they aren't allowed to touch the nominal percent tax rate, they get it in other ways. (actually I'm sure that I made illegal contributions in past years too, god fucking dammit).

Oh, and god damnit why can't the IRS just do my taxes for me !? All I have are W2's and 1099's , you fuckers have all the information, and you're going to electronically check them against my filing, so you just fucking tell me what I'm supposed to pay.

Anyway, I hate fucking retirement savings. You're locking it up in a box where you aren't allowed to use it until you're old. Fuck you future self, you don't get my money, you can earn your own damn money.

My "Brownstripe" internet has been super flakey for the last week. It's incredibly frustrating trying to browse the web when the net is slow, because you become excruciatingly aware of all the unnecessary shit that people are doing on all their web pages. I'm just loading some blog I want to read and I keep getting "waiting for blah blah", site after site, various ad hosts, various tracker sites, etc. Shit like Google Maps is just horrible on slow/flakey nets. I want to be able to manually tell it to cancel all its previous requests and please update this fucking image tile right here that I'm clicking on.

Anyway, because of this I have discovered that Perforce is not actually robust over a flakey net connection. WTF Perforce, you are supposed to be well tested and super-robust. I submitted a big changelist over my flakey net connection. P4 crapped out (rather than retrying and just taking a long time like it should have), and managed to get itself into an invalid state. Some of the files in the changelist got submitted, and when I tried to do anything else to that changelist it told me "unknown changelist #". So I moved all the files in that changelist out to a new one and re-submitted once I got into the office, and discovered that about half the files had merge conflicts because they had already been sort of submitted (not actually conflicts because it was just the same change) (and "add of added file" errors). WTF, not confidence-inspiring, P4. Changelist submission is supposed to be atomic.

My fucking PS3 wants to fucking update its system software every two minutes. The worst thing is that it won't let me use Netflix until I do. And it's the worst kind of prompt. I mean, first of all, it's my fucking system, don't force me to update if I don't want to (especially not when the major change in this update is "you can now set controller turn-off timeouts per controller" or some shit). Second of all, if I can't fucking run anything without doing the update, then just do it. Don't ask me. Especially don't pretend that it's optional. I get a prompt like "there is an update available; press X to do it or O to continue". Okay, I don't want to fucking update so I press O. Then after a minute of grinding, I get "press X to update or O to continue" , okay, I press O, don't update. Then I get "you need to log in", WTF I was logged in, but here goes... then I get "you must update your system software to log in". ARG! WTF if it's not optional just fucking do it.

I also wish I could make the PS3 boot directly into Netflix, since that's all I ever use it for. For a device that could be a simple consumer electronics device, it sure is making itself feel like an annoying computer. Oh, and in other PS3 complaint news : the wireless controllers are sort of fail. 1. We spilled like two drips of water on one of them and it doesn't work anymore; 2. They're too heavy, maybe that's the vibration motors, old PS1 controllers were much lighter. 3. The battery runs out in like two seconds if you don't set them to auto-turn off, and 4. if you do set them to auto-turn off they take way too long to wake up. Like, why is my TV remote so much better than my PS controller? The PS3 fan is also a bit too loud. It's much quieter than the Xenon, and it's tolerable when you're playing games, but when you're watching movies it's annoying. The PS3 audio output also has some shoddy non-ground-loop-protected wiring. I was getting a nasty hum out of my stereo and I finally tracked it down to the PS3 RCA wires that I had hooked up. I have various other loops of the same sort and none of them caused any hum, so I put the blame on the PS3.

In non-computer related ranting news, my fence got tagged (spray paint graffiti). I guess that's what happens when you live in an "up and coming" neighborhood. The tagging is just sort of amusing to me (my main complaint is that it's just a shitty tag, come on, put some artistry into it!). The annoying thing is that I have to get the landlord involved. I would just paint over it myself and not report it at all, but then they might see it's a shitty paint job and I'd be responsible. The landlord is just one of those fucking nightmare people who turn everything into a huge stressful hassle. She over-reacts and gets into a giant tizzy about things, it makes you just not want to tell them about any kind of problem. (I've worked with this kind of person before and it's a real nightmare, because you wind up not wanting to assign them any tasks because they act like it's just so onerous, and they wind up working less overtime than everyone else, but complaining more). So now the landlord wants to get in the house to get the old paint stashed in the closet, so I have to dispose of the dead bodies and the meth lab. God dammit.

03-10-11 | House Contemplation

Well, I'm thinking about buying a house. Property values are plummeting fast around here. I think they have a ways to fall still, but asking & selling prices are starting to come together a bit (for the past 2-3 years there's been a huge gap between initial asking price and final sale price as people refused to accept the reality of the situation). By the time I get my shit together and actually buy in 6-12 months it should be a nice buyer's market. And interest rates are super low and I have a bunch of cash that I don't know what to do with, so that all points to "buy".

On the other hand, it sort of fucking sucks to live in Seattle. I feel like I've explored most of it already and I need a new place to explore. The countryside is really far away here; it's weird because you think of Seattle as being a beautiful place surrounded by mountains, but it's actually one of the most difficult places to actually get away from civilization that I've ever lived. (eg. downtown San Francisco is much much closer to real countryside). Here, you can get out I90, but the I90 corridor really actually sucks, there are zero country roads going off the freeway, and all the hikes are straight up the valley within earshot of the freeway (the thing that doesn't suck is backpacking, when you get far enough in to Alpine Lakes or whatever it's fantabulous). To really get out to country roads and wild open spaces you have to drive 3-4 hours from Seattle, up to Mountain Loop or across a pass, or down to Mount Rainier, something like that.

There's nowhere to fucking bike except Mercer Island over and over (unless you drive 2+ hours, and even then it's not great because it's very hard to find good country roads around here, the ones within 1 hour are generally narrow, trafficky, and pot-holed (eg. Duvall, Green River Valley); I think probably Whidbey is the best spot within 2 hours). And even if there was somewhere to bike it would be raining.

The gray horrible winter is also a sneaky bastard. I find myself starting to think, "I'm used to this, I can handle it" , but the thing I'm not realizing is that I'm just always constantly slightly depressed. It seeps into you and becomes the new norm, and humans have this way of habituating and not realizing that their norm has been lowered. All winter long, I don't laugh, I don't play, I don't dance, I don't meet new people or try new things, I sleep in and eat too much sugar and drink too much booze, I'm just constantly depressed, and I think pretty much everyone in Seattle is, they just don't realize it because it becomes their baseline. You only realize it when you go on vacation somewhere sunny and it's like somebody just lifted a weight off your head and you're like "holy crap, life doesn't have to suck all the time! who knew!?"

And of course the people in Seattle are fucking terrible. Passive-aggressive, busybody, uptight, bland, ugly, pale, pastey, out of shape, unfashionable, slow-driving, sexually timid, white-bread, unfriendly, cliquey. I'm sure whatever house I move into, the neighbors will watch through the window and raise their eyebrows disapprovingly at various things I do. Capitol Hill is by far the best part of Seattle because it's full of The Gays and people who have moved here from out of state, and that's a better population. (in general, the new-comers are almost always a better population than the old-timers; it's generally a better portion of the population who moves to a new place looking for adventure or their fortune; that's why everyone in CA is so beautiful, it's why the West in general is better than the Midwest, it's why America used to be so great and why our closed doors are now hurting us; it's so retarded, of course we should allow citizenship for anyone with a college degree, we would basically steal all the best people from China and India, though it may already be too late for that move).

Okay, Seattle rant aside, I'm still considering it, cuz hey, I'm sick of fucking renting and moving, I want to be able to do what I want to my own house, and you have to live somewhere, and the jobs up here are really good, and if you lock yourself in your bedroom and watch TV all the time it really doesn't matter where you are.

It's pretty insane to go back and look at the property records for sale prices over the last 15 years or so. ( King County eReal Property Records and Parcel Viewer ). For example one house has these sell values :

3/11/2010   asking 500k (sale probably less)
3/23/2006   $739,000.00
1/08/2002   $215,302.00 
1/27/1997   $130,000.00

N found the most insane one :

03/11/2010  asking 475k
05/18/2007  $605,000
10/31/1997  $30,500  

Whoever sold in the bubble sure did well. Assuming they took the profit and moved to The Philippines or somewhere sane.

The other reason I'm thinking about buying is this area around where I live is in the process of gentrifying (see, for example: recent graffiti attack) and I think there's a decent chance to strike lucky. Of course the big percent gain from that has already happened - that's why the prices above have gone so crazy - they were in very poor, crime-ridden, black neighborhoods, that have already semi-gentrified and cleaned up quite a lot. But it's still a bit grungey around here, and only half a block away the real wave of yuppie motherfuckers is marching forward like a khaki tidal wave. The hard thing about the gentrification wave is timing - it can take 50 years.

Of course the whole idea of individuals "investing" in the home they live in is retarded and is a real sickness of the last ten years. I have to keep myself from getting swept up in that "norm" (when everyone around you is saying the same wrong idea, it's easy to forget that it's shite). Actual real estate investors invest in lots of properties, not one, and they generally invest for income, not appreciation. And of course home value appreciation is only income if you actually move to a much cheaper place when you sell, which hardly anyone actually does. Unfortunately this belief causes homes to be valued at prices that don't make any sense if you don't believe that it is an "investment".

Anyway, there are two really bad things about buying a house :

1. Transaction costs. They're absolutely absurd. 3-5% !? WTF !? For what? The realtors and mortgage brokers and so on are the only ones really making money long term on housing. Anyway, in the modern era of the internet, there is absolutely no reason for this, I can find my own damn house, I don't need an agent, and I can get my own damn online mortgage. But as usual in the modern "high efficiency economy" the new electronic service providers are doing much less for you, but not actually charging much less. (consider things like Kindle, iTunes, online customer service for banks, etc. , the modern economy is all about reducing producer costs, giving you much less service, and charging roughly the same amount).

2. The seller's information advantage. This is the same thing that fucks buying used cars. The seller may know things about the house that they aren't telling you, and there may be things that are basically impossible for you to know, like once a year a 200 mph wind blows away all the soil. Or the house next door is a death metal band's practice space, and they just happen to be on tour right now. Whether or not you actually get fucked by the information inequality, it costs you an expected $10k or whatever on each transaction.

02-28-11 | Some Car Shit

I wrote about differentials before . Since then I have learned a bit more about LSD's in practical use.

LSD's have a lot of advantages beyond just making your corner exit faster. When I first learned about them I was told they help corner exit speed, and that's like kind of "meh", I don't care about my lap time, so it's hard to get too excited about that. But they do much more.

They make power oversteer much more predictable and controllable. Your two rear wheels are semi-locked together, so you know how fast they are spinning and thus how much grip you have back there. With an open diff, if you push the throttle you might just spin up one wheel while the other still has grip, so breaking traction is much more random. If you want to fool around with drifts or donuts or whatever, an LSD is much more fun. It's also safer in the sense that your car is predictable - it's less random whether you oversteer or not.

Without an LSD you can get a nasty chirping/hopping through tight turns. What happens is one wheel lifts, and you speed it up, then it connects to pavement again and you get a sudden jerk and chirp. Through a tight turn (especially with some elevation change) you can get several lift-fall events so you go jerk-chirping around the corner. Not cool.

There are two main types of diff used in sports cars. One is Quaife/Torsen mechanical type, the other is clutch-pack friction disk type. Some random things I have read about these diffs :

Clutch-type diffs can have variable amounts of "preload" and "lockup" or "dynamic" locking. Usually this is expressed as the % of lockup (eg. 0% = open diff, 100% = solid axle), but sometimes it's expressed as the amount of force through the clutch pack. "preload" is the amount of lockup with no force coming through the drive shaft. If you lift the wheels up and turn one side and see how much the other side turns - this is the "preload". Preload can make the car hard to make tight turns - it eliminates the ability to turn one wheel without turning the other, so street cars often have 0% - 20% preload. The housing of the LSD has this ramp built into it and when force comes through the drive shaft, it pushes a pin against the ramp which forces the clutch plates together, creating more lockup. This is the "dynamic number". You will see LSD's described as 20/40 or something, which means 20% preload and 40% under force. Sometimes this is also described as the "ramp angle" because the angle of that ramp determines how quickly the LSD adds more pressure.

Race cars use LSD's with 40/60 or 50/80 settings. This is lots of lockup. This works for races because you are never actually making super tight turns on race tracks. For Autocross people generally use less preload or a Torsen diff. If you have a high lockup, you can only take tight turns by drifting the rear end. Preload also greatly increases low speed understeer. Most OEM LSD's max out at 20-40% (Porsche Cayman/911 non-GT3 LSD maxes at 30%).

Torsen diffs act like zero-preload , and also don't provide much lockup under braking. Clutch type LSD provide stability under braking - they keep the rear end straight, because a wiggling rear end can only happen if the rear wheels are spinning at different speeds.

Clutch type LSD's wear out and have to be serviced like any other clutch plate. Torsen-type (torque-biasing) diffs generally don't need much servicing unless they are beat up hard, roughly like transmission gears. Many OEM LSD's have very low settings so you would want to replace them anyway. It may still be wise to tick the OEM LSD option because it can be easier to fit an aftermarket LSD if you had the OEM one in the transmission (details depend on the car). OEM clutch-type LSD's also often wear very quickly; when buying a used car "with LSD" it is often actually an open diff because the LSD is shot.

Gordon Glasgow LSD tech
Porsche GT3 LSD Buster Thread
Early 911 LSD setup

BTW Lotus doesn't fit LSD's, and the new McLaren has no LSD. If your goal is sharp steering and fast lap times, then an LSD is not a clear win. If your goal is the "driving pleasure" of being in good control of your rear end slip angle, an LSD is more obviously good. See Lotus Evora with no LSD for example.

In non-slip scenarios, LSD's increase understeer by binding the sides of the car to each other. (this is why Lotus prefers not to fit them). In some cases, it may be that ticking the optional LSD box actually makes a car worse. For example this may be the case with the Cayman S, where the LSD added to the standard suspension increases understeer too much ; in contrast, the Cayman R has LSD standard and the suspension was set up for that, which involves more camber and stiffer sways, both of which are anti-understeer moves.

Some commentators have been knocking the MP4-12C for not having an LSD, but those commentators "don't get it". The MP4 is a computer-controlled car, like the Nissan GTR or the Ferrari 599. It does not rely on mechanical bits for power transfer, stability under braking, control of over/under-steer / etc. Traditional mechanical bits like LSD or sway bars or spring rates or whatever just don't apply. You don't need an LSD to keep you straight under braking when the car is doing single-wheel braking based on the difference in steering angle and actual yaw rate. Normal car dynamics is a balance between twitchy vs. safe, or turn-in vs. stability; Lotus and McLaren have bent the rules and avoided this trade-off, however that means less hoon-ability. See for example : Lewis Hamilton fails to do donuts in the MP4 .

It's much easier to kick a drift and countersteer through it correctly if you don't have to think through it rationally. It's almost impossible to catch the countersteer fast enough and with the right angle if you are doing it with your conscious mind going through the steps. What you have to do is get your body to do it for you subconsciously. Fortunately there's an easy way to do that :

It just requires using the old adage "look where you want to drive, and drive where you look".

Step by step : when the rear steps out, turn your head and look where you want the car to go, then point the front wheels at that spot.

This helps you countersteer quicker, and also helps you to not over-do your countersteer (ala Corvette club), because you aren't thinking "countersteer", you're just thinking "point the wheel where I want to go". The standard mistake is to catch it too late, and then over-do it because you are thinking too hard about "countersteer", so then you fish-tail in the opposite direction. You do not have to "countersteer"; all you have to do is point the front wheels in the direction you want to go.

(* note : this is not actually true, the correct steering angle is slightly past pointing the wheels where you want to go, but I find if I think this way, then my hands automatically do the right thing).

It really is true that you should "look where you want to drive, and drive where you look" ; it's one of those things they tell you in race school and you're like "yeah yeah duh I know, let me out on the fucking track already", but it doesn't sink in for a while. 99% of the time that I make a mistake driving, I realize after the fact that I wasn't looking where I wanted the car to go. You want to be looking far ahead, not ahead on the ground right near you. When I really fuck up I realize after the fact that I was looking at my steering wheel or my shifter, eg. looking down inside the cabin.

One thing that does help is to put a piece of tape on the top of your steering wheel (if you don't have a mark built in). Again this seems totally unnecessary, you will think "I know where fucking straight ahead is, I don't need a mark", but it does help. What it does is give you a *visual* cue of where straight ahead is, so it connects the eyes to what the hands are doing. You want that connection of visual to hand motion to be subconscious, to avoid the rational mind. Your hands perform better when they are out of conscious control.

I got a lot of bad information when I was shopping for cars. You have to ignore all the mainstream press, Edmunds, Motortrend, all that kind of shit. The professional automotive writers are 1. lazy (they don't actually research the cars they write about, or how cars work in general), 2. corrupt (they praise cars so that they can get free testers), and 3. just generally stupid. They don't write about what you actually want to know or what would actually be useful. Furthermore, I believe the idea that you can tell a lot from a test drive is just not true. One thing I have to get over is that I'm afraid to really thrash cars in the test drive, and without a really hard thrashing you can't tell how they behave at the limit. For example, you'd like to know things like: what does the car do if you mash the brake while cornering near the limit of grip? The auto writers won't tell you and you can't find out in a test drive.

A good source of information is web forums. Not your normal "I love my car"/"check out my bling" web forums, but the actual racer's forums, like the SCCA and other groups (spec miata, spec boxster, etc, lots of racer forums out there). There you can read about the things that really matter in a sports car, how it acts on the edge, how much prep it needs to be track-worthy, whether the engines blow up on the track, etc. Even if you don't track and have no intention of tracking, the guys who do tend to be a better class of humans than the normal web forum car people, who care more about "brand history" and rims and who has a faster quarter mile.

The vast majority of modern sports cars are set up for understeer (eg. Porsches, BMW's, etc.). The only ones I know of that aren't set up for understeer are the RX8 and S2000. Of course most of the retarded mainstream press says absolutely nothing about the intentional understeer (which you can fix pretty easily) and they say that the RX8 and S2000 have bad handling because they are "twitchy" or "bad in the rain". That's retarded. Those cars have great handling in the sense that it is predictable - if you go around a corner fast in the rain and jab the brakes, you will likely spin. That's not bad handling, that's a bad driver.

To some extent, "sharpness" or "liveliness" inherently go together with "twitchiness" or "dangerousness". For example cars that give you a nice little bit of oversteer will generally also give you lift-off oversteer, which you probably don't want. The ideal thing would be a car that you could press a button to make it milder and safer for normal use, or twitchy and sharp for fun times. This is sort of impossible with traditional mechanical suspension, you can only get a good compromise. This is why the new crazy cars from Ferrari and McLaren use computers to control the handling. The Ferrari approach is a bit like the "Euro Fighter" - they make a car that is mechanically inherently unstable, and then use computers to keep it under control. The McLaren MP4-12C approach uses crazy new techniques that nobody else is using.

02-28-11 | Game Branching and Internal Publishing

Darrin West has a nice post on Running branches for continuous publishing. Read the post, but basically the idea is you have a mainline for devs and a branch for releases.

At OW we didn't do this. Whenever we had to push out a milestone or a demo or whatever, we would go into "code lockdown" mode. Only approved bugfixes could get checked in - if you allow dev work to keep getting checked in, you risk destabilizing and making new bugs.

This was all fine; the problem is you can lose some dev productivity during this time. Part of the team is working on bug fixes, but some aren't, and they should be proceeding with features which you want after the release. Sure, you can have them just keep files checked out on their local machines and do work there, and that works to some extent, but if the release lockdown stretches out for days or weeks, that's not viable, and it doesn't work if people need to share code with each other, etc.

If I had it to do over again I would use the dev_branch/release_branch method.

To be clear : generally coders are working on dev_branch ; when you get close to a release, you integ from dev_branch to release_branch. Now the artists & testers are getting builds from release_branch ; you do all bug-fixes to release_branch. The lead coder and the people who are focused on the release get on that branch, but other devs who are doing future fixes can stay on dev_branch and are unaffected by the lockdown.

The other question is what build is given to artists & designers all the time during normal development. I call this "internal publication" ; normal publication is when the whole team gives the game to an external client (the publisher, demo at a show, whatever), internal publication is when the code team gives a build to the content team. It's very rare to see a game company actually think carefully about internal publication.

I have always believed that giving artists & designers "hot" builds (the latest build from whatever the coders have checked in) is a mistake - it leads to way too much art team down time as they deal with bugs in the hot code. I worked on way too many teams that were dominated by programmer arrogance that "we don't write bugs" or "the hot build is fine" or "we'll fix it quickly" ; basically the belief that artist's time is not as important as coder's time, so it's no big deal if the artists lose hours out of their day waiting for the build to be fixed.

It's much better to have at least a known semi-stable build before publishing it internally. This might be once a day or once a week or so. I believe it's wise to have one on-staff full time tester whose sole job is to test the build before it goes from code to internally published. You also need to have a very simple automatic rollback process if you do accidentally get a bad build out, so that artists lose 15 minutes waiting for the rollback, not hours waiting for a bug fix. (part of being able to rollback means never writing out non-recreatable files that old versions can't load).

Obviously you do want pretty quick turnaround to get new features out; in some cases your artists/designers don't want to wait for the next stable internal publication. I believe this is best accomplished by having a mini working team where the coder making a feature and the designer implementing it just pass builds directly between each other. That way if the coder makes bugs in their new feature it only affects the one designer who needs that feature immediately. The rest of the team can wait for the official internal publication to get those features.

02-27-11 | Web Shows

My picks :

The old classic is Clark and Michael . It gets a bit repetitive, but not bad.

Wainy Days is pretty great. The super fast paced insanity of it is hard to take at first, but once you watch a few you get into the groove. It's really like a sitcom for the web generation; everything is super compressed, all the unfunny talky bits in a normal sitcom are squeezed out, and any pretense of logical consistency is abandoned, but otherwise it's standard sitcom fare. I've watched some other Wain products before and found them intolerable (eg. Wet Hot and The Ten are just awful), I think he's much better in 5 minute doses.

I really enjoyed Thumbs Up! . Thumbs up America!

In related news, I now watch Youtube with no comments, thanks to CommentSnob. This saves me from going into a spiral of despair about how fucking awful all of humanity is every time I visit Youtube.

The next step is blocking comments from just about everywhere. I found these but haven't tried them yet :

YouTube Comment Snob Add-ons for Firefox
Stylish Add-ons for Firefox
Steven Frank shutup.css
CommentBlocker - Google Chrome extension gallery

The ideal thing would be for all web pages to have their comments initially hidden, with a "show comments" button I can click.

I would also like an "ignore user" button that I can apply myself to web forums, comments, ebay sellers, etc. that will make it so I never have to see the web contribution from certain sources ever again.

02-27-11 | Nowhere to invest

I believe there is basically nowhere good to invest any more. Bonds, commodities, stocks, savings, etc.

I believe the fundamental reason is because of hedge funds, large automatic quant funds, cheap fed capital, and the easy flow of huge amounts of money. Basically if there is an edge anywhere in the market, the large funds will immediately pump tons of money into that sector, and that edge will go away. They move funds very quickly and they move enough to shift the profit margin, and they keep doing it until it's no longer desirable.

Put another way, why the hell would a company want your capital? The reason why investments return a profit is because somebody needs your money for their company. In exchange for your $10k they pay you back some interest. But when the fed funds rate is near 0%, the big banks can suck out any amount of money they want and give it to whoever needs capital. So why the hell would any company deal with paying a return to small individual investors when they can just get capital so easily from the big banks? The answer of course is that they don't. When capital is cheap, the rate of return on investments rapidly goes to zero.

I believe that the last 100 years or so (and mainly from 1950-2000) has been an anomaly - a period when individuals could easily get a nice return from investments, and that there's no reason to think we will have this in the future. In the last 100 years stocks have returned 3-5% after inflation, which roughly tracks GDP growth (it's no surprise; total stock market value and dividend yield both directly track GDP growth).

Now, obviously if you look at the market recently it doesn't look weak. But I believe that we are in a very large bubble. It's an unusual bubble because it appears to be affecting almost every type of investment - including things that should normally track opposite of each other. That's very strange. A quick rundown of what I think is happening :

Stocks : I believe the current stock bubble is driven by these factors : 1. massive free money from the Fed has to go somewhere; when it turns into buy orders, stocks have to rise (obviously in theory the traders could say "no I don't want the free money because I don't see anywhere profitable to invest it", but LOL that doesn't actually happen). 2. massive free money makes the economic numbers look better than they really are. 3. under-counting of inflation and unemployment makes the economic numbers look better than they are. 4. individual investors and such have tons of money in 401ks and such and don't know what else to do with it, so it goes back into the market. I think everyone sane should realize that stock valuations right before the Great Recession were way out of whack - and values are right back to that level, which makes no sense since nothing good has happened in the economy, in fact we're fundamentally much worse now than before the GR.

Gold : historically, from 1850-2000, gold returned slightly *below* inflation, eg. real return was negative. The only period when it returned well was during the massive inflation era of the 70's (and the last 10 years). Gold should generally track against other economic trends. Yet gold has done very well for the past 10 years, even during times that stocks have gone up and inflation is nominally low. I don't think this makes any sense and I believe that gold is in a massive bubble. If nothing else, when you have people like Glen Beck touting gold, and all the "Buy Gold Now!" web sites pushing it, there's obviously some bubble aspect, and it's all very reminiscent of real estate 5 years ago.

Real Estate : residential & commercial are both bad. Oddly, REIT funds are up to levels matching before the GR, to me this clearly indicates they are at bubble levels. Residential values in most of the US never fell as much as they should have, and tons of the MBS's that the Fed continues to buy up are not worth what they claim. Some people think commercial real estate is due for a big crash; I'm not sure a crash will actually happen, but if it doesn't it will mean a weak market for many years.

Bonds - oddly bond values have gone up at the same time as stocks. Treasuries in particular have been hovering around zero yield, which only makes sense if you think there is a real risk of massive corporate failure or currency crisis or other reasons why you wouldn't just keep your money in a bank returning 1% or a AAA corporate bond returning 4%. (BTW I was always a little confused about the whole idea that bond values rise when yields drop, but really it's very simple. Yields are the return of newly issued bonds. If you have an old bond that returns 4% and the newly issued yield is 1%, then people who want a 1% return could buy your old bond at a 3% higher price and get the same yield ; that is, prices move to make the final return at maturity the same). Bond prices should generally track against stocks, but they've both been doing quite well since the GR.

I don't see anywhere good to put money. And furthermore I do think there is some risk of heavy inflation, so sitting with cash in the bank isn't great either. (a few ideas : I believe gold is so inflated it might actually be a good short; it's also possible that real estate in some areas might be a good investment, at least real estate is a hedge against inflation even if it doesn't return anything; unfortunately Seattle is one of the places where real estate hasn't fallen as much as it should which will lead to long term price stagnation, or even declines in real dollars).

02-24-11 | RRZ On 16 bit Images

Jan Wassenberg sent me a 16 bit test image, so I got my old PNG-alike called RRZ working on 16 bit. (many old posts on RRZ, search for PNG).

The predictors all work on a ring, that is, they wrap around [0,uint_max] so you need to use the right uint size for your pixel type. To make this work I just took my 8-bit code and made it a template, and now I work on 8,16, and 32 bit pixels.
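A minimal sketch of what that looks like (hypothetical names, not the actual RRZ source) - unsigned arithmetic gives you the mod-2^bits ring for free, and the same ClampedGrad predictor (filter 4 in this post) works at any pixel width :

```cpp
#include <cstdint>

// Hypothetical sketch : residuals on the ring [0, uint_max] via unsigned
// wraparound ; instantiate with uint8_t, uint16_t or uint32_t pixels.
template <typename T>
T PredResidual(T actual, T predicted)
{
    return (T)(actual - predicted); // wraps mod 2^bits
}

template <typename T>
T PredRestore(T residual, T predicted)
{
    return (T)(predicted + residual); // exact inverse, also mod 2^bits
}

// the ClampedGrad predictor, pixel-width agnostic :
// gradient (left + up - upleft) clamped to [min,max] of left and up
template <typename T>
T ClampedGrad(T left, T up, T upleft)
{
    int64_t g = (int64_t)left + (int64_t)up - (int64_t)upleft;
    int64_t lo = left < up ? (int64_t)left : (int64_t)up;
    int64_t hi = left < up ? (int64_t)up : (int64_t)left;
    if ( g < lo ) g = lo;
    if ( g > hi ) g = hi;
    return (T)g;
}
```

The residual/restore pair round-trips even across the wrap, which is the whole point of working on the ring.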

RRZ without any changes does pretty well on 16 bit data :

Original : (3735x2230x4x2)

Zip :

PNG : (*1)

JPEG-2000 :


RRZ default :  (-m5 -z3 -fa -l0) (*2)

My filter 4 + Zip : (*3)

RRZ with zip-like options : (-m3 -z4 -f4 -l0)

RRZ optimized : (-m3 -z5 -f4 -l1)

My filter 4 + LZMA :

*1 : I ran pngcrush but couldn't run advpng or pngout because they fail on 16 bit data.

*2 : min match len of 5 is the default (-m5) because I found in previous testing that this was best most often. In this case, -m3 is much better. My auto-optimizer finds -m3 successfully. Also note that seekChunkReset is *off* for all these RRZ's.

*3 : filter 4 = ClampedGrad, which is best here; default RRZ filter is "adaptive" because that amortizes against really being way off the best choice, but is usually slightly worse than whatever the best is. Even when adaptive actually minimizes the L2 norm of prediction residuals, it usually has worse compression (than a uniform single filter) after LZH, because it ruins repeated patterns by choosing different predictors on different scan lines.

Note that I didn't do anything special in the back-end for the 16 bit data, the LZH still just works on bytes, which means for example that the Huffman gets rather confused; the most minimal change you could do to make it better would be to make your LZ matches always be even numbers - so you don't send the bottom bit of match len - and to use two huffmans for literals, one for odd positions and one for even positions. LZMA for example uses 2 bits of position as context for its literal coding, so it knows what byte position you are in. Actually it's surprising to me how close RRZ (single huffman, small window) gets to LZMA (arithmetic, position context, large window) in this case. It's possible that some transpose might help compression, like doing all the MSB's first, then all the LSB's, but maybe not.

ADDENDUM : another thing that would probably help is to turn the residual into a variable-byte code. If the prediction residual is in [-127,127] send it in one byte, else send 0xFF and send a two byte delta. This has the disadvantage of de-aligning pixels (eg. they aren't all 6 or 8 bytes now) but for small window LZ it means you get to fit a lot more data in the window. That is, the window is a much larger percentage of the uncompressed file size, which is good.
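One concrete way to do that variable-byte residual (my guess at an encoding; the escape byte and the +127 bias are my choices, not from the post) :

```cpp
#include <cstdint>
#include <vector>

// residuals in [-127,127] take one byte (biased to [0,254]) ;
// anything else gets an 0xFF escape followed by a two-byte delta
void PutResidual(std::vector<uint8_t> & out, int16_t delta)
{
    if ( delta >= -127 && delta <= 127 )
    {
        out.push_back( (uint8_t)(delta + 127) ); // [0,254], never 0xFF
    }
    else
    {
        out.push_back( 0xFF ); // escape
        out.push_back( (uint8_t)((uint16_t)delta >> 8) );
        out.push_back( (uint8_t)((uint16_t)delta & 0xFF) );
    }
}

int16_t GetResidual(const uint8_t * & in)
{
    uint8_t b = *in++;
    if ( b != 0xFF )
        return (int16_t)(b - 127);
    uint16_t hi = *in++;
    uint16_t lo = *in++;
    return (int16_t)(uint16_t)((hi << 8) | lo);
}
```

Since most residuals are small, most pixels shrink to half size, which is exactly the "more data in the window" effect described above.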

As part of this I got 16-bit PNG reading & writing working, which was pretty trivial. You have to swap your endian on Intel machines. It seems to be a decent format for interchanging 16 bit data, in the sense that Photoshop works with it and it's easy to do with libPNG.
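For reference, PNG stores 16-bit samples big-endian, so on little-endian (Intel) machines each sample gets the usual one-line swap on read and write :

```cpp
#include <cstdint>

// byte-swap one 16-bit sample ; applying it twice is the identity
uint16_t Swap16(uint16_t x)
{
    return (uint16_t)( (x >> 8) | (x << 8) );
}
```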

I also got my compressor working on float data. The way it handles floats is via lossless conversion of floats to ints in an E.M fixed point format, previously discussed here and here . This then lets you do normal integer math for the prediction filters, losslessly. As noted in those previous posts, normal floats have too much gap around zero, so in most cases you would be better off by using what I call the "normal form" which treats everything below 1.0 as denorm (eg. no negative exponents are preserved) though obviously this is lossy.
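I won't reproduce the E.M format here, but for flavor, the standard order-preserving lossless float-to-int bijection (which is the same general idea : map float bits to ints so integer prediction arithmetic respects float ordering) looks like this :

```cpp
#include <cstdint>
#include <cstring>

// map a float's bits to an int32 so that integer compares agree with
// float ordering ; the mapping is its own inverse, so it's lossless
int32_t FloatToOrderedInt(float f)
{
    int32_t i;
    std::memcpy(&i, &f, sizeof(i));
    // negative floats : reflect so bigger float => bigger int
    return ( i >= 0 ) ? i : (int32_t)(0x80000000u - (uint32_t)i);
}

float OrderedIntToFloat(int32_t i)
{
    int32_t j = ( i >= 0 ) ? i : (int32_t)(0x80000000u - (uint32_t)i);
    float f;
    std::memcpy(&f, &j, sizeof(f));
    return f;
}
```

Note this raw mapping keeps all the negative exponents, which is exactly the "too much gap around zero" problem the "normal form" mentioned above is designed to squash.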

Anyway, the compressor on floats seems to work fine but I don't have any real float/HDR image source data, and I don't know of any compressors to test against, so there you go.

ADDENDUM: I just found that OpenEXR has some sample images, so maybe I'll try those.

ADDENDUM 2 : holy crap OpenEXR is a retarded distribution. It's 22 MB just for the source code. It comes with their own big math and threading library. WTF WTF. If you're serious about trying to introduce a new interchange format, it should be STB style - one C header. There's no need for image formats to be so complex. PNG is over-complex and this is 100X worse. OpenEXR has various tile and multi-resolution streams possible, various compressors, the fucking kitchen sink and pot of soup, WTF.

02-23-11 | Some little coder things : Tweakable Vars

So a while ago I did the Casey "tweakable C" thing. The basic idea is that you have some vars in your code, like :

static float s_tweakFactor = 1.5f;

or whatever, and your app is running and you want to tweak that. Rather than write some UI or whatever, you just have your app scan its own source code and look for "s_tweakFactor =" (or some other trigger string) and reparse the value from there.

So I put this in cblib a few years ago; I use ReadDirChanges to only do the reparse when I see a .c file is changed, and I actually use a C++ constructor to register the tweakable vars at startup, so you have to use something like :

static TWEAK(float,s_tweakFactor,1.5f);

which is a little ugly but safer than the prettier alternatives. (the parser looks for TWEAK to find the vars).
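The actual cblib TWEAK isn't shown here, but the constructor-registration trick can be sketched like this (all names are hypothetical, and only float is handled, for brevity) :

```cpp
#include <map>
#include <string>

// global name -> address registry, built before main() runs
struct TweakRegistry
{
    static std::map<std::string, float *> & Vars()
    {
        static std::map<std::string, float *> s_vars;
        return s_vars;
    }
};

// the constructor runs at static-init time and registers the var
struct TweakRegistrar
{
    TweakRegistrar(const char * name, float * addr)
    {
        TweakRegistry::Vars()[name] = addr;
    }
};

// TWEAK expands to the variable plus a registrar for it ;
// usage matches the post : static TWEAK(float,s_tweakFactor,1.5f);
#define TWEAK(type,name,val) \
    type name = (type)(val); \
    static TweakRegistrar name##_registrar(#name, &name)

static TWEAK(float, s_tweakFactor, 1.5f);
```

Once everything is in the registry, the source reparser, a command line switch, or an optimizer can all poke values by name through the same map.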

I thought Casey's idea was very cool, but then I proceeded to never actually use it.

Part of the issue is that I already had a text Prefs system which already had auto-reloading from file changes, so any time I wanted to tweak things, I would make a pref file and tweak in there. That has the advantage that it's not baked into the code, eg. I can redistribute the pref with the exe and continue to tweak. In general for game tweaking I think the pref is preferable.

But I just recently realized there is a neat usage for the tweak vars that I didn't think of. They basically provide a way to set any value in my codebase by name, programmatically.

So, for example, I can now set tweak vars from command line. You just use something like :

app -ts_tweakFactor=2.f fromfile tofile arg arg

and it lets you do runs of your app and play with any variable that has TWEAK() around it.

The other thing it lets me do is optimize any variable. I can now use the generic Search1d thing I posted earlier and point it anything I have registered for TWEAK and it can search on that variable to maximize some score.

02-23-11 | Some little coder things : Clip

I wrote a little app called "clip" that pastes its args to the clipboard. It turns out to be very handy. For example it's a nice way to get a file name from my DOS box into some other place, because DOS does arg completion, I can just type "clip f - tab" and get the name.

The other big place it's been useful is copying command lines to the MSVC debugger property sheet, and turning command lines into batch files.

Clip is obviously trivial, the entire code is :

void CopyStringToClipboard(const char * str);

int main(int argc,const char *argv[])
{
    String out;
    for(int argi=1;argi < argc;argi++)
    {
        if ( argi > 1 ) out += " ";
        out += argv[argi];
    }
    CopyStringToClipboard(out.CStr()); // CStr() - or however your string class exposes const char *
    lprintf("clip : \"%s\"\n",out);
    return 0;
}

void CopyStringToClipboard(const char * str)
{
    // test to see if we can open the clipboard first before
    // wasting any cycles with the memory allocation
    if ( ! OpenClipboard(NULL) )
        return;

    // Empty the Clipboard. This also has the effect
    // of allowing Windows to free the memory associated
    // with any data that is in the Clipboard
    EmptyClipboard();

    // Allocate the global memory for our data : the text
    // plus one character for the terminating null character
    // required when sending ANSI text to the Clipboard
    HGLOBAL hClipboardData;
    hClipboardData = GlobalAlloc(GMEM_DDESHARE,strlen(str)+1);

    // Calling GlobalLock returns to me a pointer to the
    // data associated with the handle returned from
    // GlobalAlloc
    char * pchData;
    pchData = (char*)GlobalLock(hClipboardData);
    strcpy(pchData, str);

    // Once done, I unlock the memory - remember you
    // don't call GlobalFree because Windows will free the
    // memory automatically when EmptyClipboard is next
    // called.
    GlobalUnlock(hClipboardData);

    // Now, set the Clipboard data by specifying that
    // ANSI text is being used and passing the handle to
    // the global memory.
    SetClipboardData(CF_TEXT, hClipboardData);

    // Finally, when finished I simply close the Clipboard
    // which has the effect of unlocking it so that other
    // applications can examine or modify its contents.
    CloseClipboard();
}

(BTW note that the lprintf of my string class in main is not a bug - that's an autoprintf which handles everything magically and fantastically)

(I didn't remember where I got that clipboard code, but a quick Google indicates it came from Tom Archer at CodeProject )

02-23-11 | Some little coder things : Loop

We talked a while ago about how annoying and error-prone for loops are in C. Well at first I was hesitant, but lately I have started using "for LOOP" in earnest and I can now say that I like it very much.

#define LOOP(var,count) (int var=0;(var) < (count);var++)
#define LOOPBACK(var,count) (int var=(count)-1;(var)>=0;var--)
#define LOOPVEC(var,vec)    (int var=0, loopvec_size = (int)vec.size();(var) < (loopvec_size);var++)

so for example, to iterate pixels on an image I now do :

for LOOP(y,height)
    for LOOP(x,width)
        // do stuff

the way I can tell that this is good is because I find myself being annoyed that I don't have it in my RAD code.

There are tons of advantages to this that I didn't anticipate. The obvious advantages were : less bugs due to mistakes in backwards iteration with unsigned types, reduced typing (hence fewer typo bugs), and making it visually more clear what's happening (you don't have to parse the for(;;) line to make sure it really is a simple counting iteration with nothing funny snuck in).

The surprising advantages were : much easier to change LOOP to LOOPBACK and vice versa, much easier to use a descriptive variable name for the iterator so I'm no longer tempted to make everything for(i).

One thing I'm not sure about is whether I like LOOPVEC pre-loading the vector size. That could cause unexpected behavior if the vector size changes during the iteration.


Drew rightly points out that LOOPVEC should be :

#define LOOPVEC(var,vec)    (int var=0, var##size = (int)vec.size();(var) < (var##size);var++)

to avoid variable name collisions when you nest them. But I think it should probably just be

#define LOOPVEC(var,vec)    (int var=0; (var) < (int)vec.size(); var++)

Though that generates much slower code, when you really care about the speed of your iteration you can pull the size of the vec out yourself and may do other types of iterations anyway.
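For what it's worth, the ##-pasted version does nest cleanly, since each pre-loaded size variable takes its name from its own iterator ("i" gives "isize", "j" gives "jsize"). A tiny self-contained check (macros copied from above) :

```cpp
#include <vector>

#define LOOP(var,count) (int var=0;(var) < (count);var++)
#define LOOPBACK(var,count) (int var=(count)-1;(var)>=0;var--)
#define LOOPVEC(var,vec)    (int var=0, var##size = (int)vec.size();(var) < (var##size);var++)

// nested LOOPVECs : no collision because the size vars are name-pasted
int CountNestedVisits(const std::vector<int> & a, const std::vector<int> & b)
{
    int visits = 0;
    for LOOPVEC(i,a)
        for LOOPVEC(j,b)
            visits++;
    return visits;
}

// LOOPBACK counts count-1 down to 0
int SumBackwards(int n)
{
    int s = 0;
    for LOOPBACK(i,n)
        s += i;
    return s;
}
```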

02-23-11 | Some little coder things : Error cleanup with break

I hate code that does error cleanup in multiple places, eg :

    FILE * fp = fopen(fileName,"wb");
    if ( ! fp )
        return false;

    if ( ! stuff1() )
    {
        fclose(fp);
        return false;
    }
    if ( ! stuff2() )
    {
        fclose(fp);
        return false;
    }

    // ok!
    fclose(fp);
    return true;

the error cleanup has been duplicated and this leads to bugs.

In the olden days we fixed this by putting the error return at the very end (after the return true) and using a goto to get there. But gotos don't play nice with C++ and are just generally deprecated. (don't get me started on setjmp, WTF is libpng thinking using that archaic error handling system? just because you think it's okay doesn't mean your users do)

Obviously the preferred way is to always use C++ classes that clean themselves up. In fact whenever someone gives me code that doesn't clean itself up, I should just immediately make a wrapper class that cleans itself up. I find myself getting annoyed and having bugs whenever I don't do this.

There is, however, a cleanup pattern that works just fine. This is well known, but I basically never ever see anyone use this, which is a little odd. If you can't use C++ self-cleaners for some reason, the next best alternative is using "break" in a scope that will only execute once.

For example :

rrbool rrSurface_SaveRRSFile(const rrSurface * surf, const char* fileName)
{
    FILE * fp = fopen(fileName,"wb");
    if ( ! fp )
        return false;

    for(;;) // scope that executes once ; "break" jumps to the cleanup
    {
        rrSurface_RRS_Header header;
        if ( ! rrSurface_FillHeader(surf,&header) )
            break;
        if ( ! rrSurface_WriteHeader(fp,&header) )
            break;
        if ( ! rrSurface_WriteData(fp,surf,&header) )
            break;

        // success :
        fclose(fp);
        return true;
    }
    // failure :
    fclose(fp);
    return false;
}

Really the break is just a simple form of goto that works with C++. When you have multiple things to clean up, obviously you have to check each of them against its uninitialized value before cleaning it up.

(BTW this example is not ideal because it doesn't give you any info about the failure. Generally I think all code should either assert or log about errors immediately at the site where the error is detected, not pass error codes up the chain. eg. even if this code was "good" and had a different error return value for each type of error, I hate that shit, because it doesn't help me debug and get a breakpoint right at the point where the error is happening.)


Another common style of error cleanup is the "deep nest with partial cleanup in each scope". Something like this :

  bool success = false;
  if ( A = thing1() )
  {
    if ( B = thing2() )
    {
      if ( C = thing3() )
      {
        success = true;
        cleanup C;
      }
      cleanup B;
    }
    cleanup A;
  }

I really hate this style. While it doesn't suffer from duplication of the cleanups, it does break them into pieces. But worst, it makes the linear code flow very unclear and introduces a deep branching structure that's totally unnecessary. Good code should be a linear sequence of imperatives as much as possible. (eg. do X, now do Y, now do Z).

I think this must have been an approved Microsoft style at some point because you see it a lot in MSDN samples; often the success code path winds up indented so far to the right that it's off the page!

02-13-11 | JPEG Decoding

I'm working on a JPEG decoder sort of as a side project. It's sort of a nice small way for me to test a bunch of ideas on perceptual metrics and decode post-filters in a constrained scenario (the constraint is baseline JPEG encoding).

I also think it's sort of a travesty that there is no mainstream good JPEG decoder. This stuff has been in the research literature since 1995 (correction : actually, much earlier, but there's been very modern good stuff since 95 ; eg. the original deblocker suggestion in the JPEG standard is no good by modern standards).

There are a few levels for good JPEG decoding :

It's shameful and sort of bizarre that we don't even have #1 (*). Obviously you want different levels of processing for different applications. For viewers (eg. web browsers) you might do #1, but for loading to edit (eg. in Photoshop or whatever) you should obviously spend a lot of time doing the best decompress you can. For example if I get a JPEG out of my digital camera and I want to adjust levels and print it, you better give me a #2 or #3 decoder!

(* : an aside : I believe you can blame this on the success of the IJG project. There's sort of an unfortunate thing that happens where there is a good open source library available to do a certain task - everybody just uses that library and doesn't solve the problem themselves. Generally that's great, it saves developers a lot of time, but when that library stagnates or fails to adopt the latest techniques, it means that entire branch of code development can stall. Of course the other problem is the market dominance of Photoshop, which has long been the pariah of all who care about image quality and well implemented basic loaders and filters)

So I've read a ton of papers on this topic over the last few weeks. A few notes :

"Blocking Artifact Detection and Reduction in Compressed Data". They work to minimize the MSDS difference, that is to equalize the average pixel steps across block edges and inside blocks. They do a bunch of good math, and come up with a formula for how to smooth each DCT coefficient given its neighbors in the same subband. Unfortunately all this work is total shit, because their fundamental idea - forming a linear combination using only neighbors within the same subband - is completely bogus. If you think about only the most basic situation, which is you have zero AC's, so you have flat DC blocks everywhere, the right thing to do is to compute the AC(0,1) and AC(1,0) coefficients from the delta of neighboring DC levels. That is, you correct one subband from the neighbors in *other* subbands - not in the same subband.
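To make the objection concrete : even the crudest cross-band corrector predicts the lowest AC coefficients of a block from the DC deltas of its neighbors, something like this (a hypothetical sketch; the scale factor is a placeholder, not a derived constant) :

```cpp
// in the all-DCs-flat case, the ramp between neighboring DC levels
// implies nonzero AC(0,1) and AC(1,0) in the center block ; predict
// them from the cross-block DC gradient. kScale is a made-up
// placeholder for a properly derived weight.
double PredictAC01(double dcLeft, double dcRight)
{
    const double kScale = 0.125; // placeholder, not derived
    return kScale * (dcLeft - dcRight);
}

double PredictAC10(double dcUp, double dcDown)
{
    const double kScale = 0.125; // placeholder, not derived
    return kScale * (dcUp - dcDown);
}
```

The point is just structural : the correction for one subband comes from *other* subbands (here, the DCs), which a same-subband linear combination can never do.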

Another common and obviously wrong fault that I've seen in several papers is using non-quantizer-scaled thresholds. eg. many of the filters are basically bilateral filters. It's manifestly obvious that the bilateral pixel sigma should be proportional to the quantizer. The errors that are created by quantization are proportional to the quantizer, therefore the pixel steps that you should correct with your filter should be proportional to the quantizer. One paper uses a pixel sigma of 15, which is obviously tweaked for a certain quality level, and will over-smooth high quality images and under-smooth very low quality images.
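That is, the weight function should look something like this (a sketch; the particular constants here are illustrative, not tuned) :

```cpp
#include <cmath>

// bilateral weight whose pixel sigma scales with the JPEG quantizer,
// instead of a hard-coded value like 15 ; constants are illustrative
double BilateralWeight(double spatial_dist, double pixel_delta, double quantizer)
{
    const double spatial_sigma = 1.5;              // in pixels
    const double pixel_sigma   = 0.75 * quantizer; // errors scale with Q
    return std::exp( - (spatial_dist*spatial_dist) / (2*spatial_sigma*spatial_sigma)
                     - (pixel_delta*pixel_delta)   / (2*pixel_sigma*pixel_sigma) );
}
```

With this form, the same pixel step that gets smoothed at low quality (big quantizer) is correctly treated as real detail at high quality (small quantizer).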

The most intriguing paper from a purely mathematical curiosity perspective is "Enhancement of JPEG-compressed images by re-application of JPEG" by Aria Nosratinia.

Nosratinia's method is beautifully simple to describe :

Take your base decoded image

For all 64 shifts of 0-7 pixels in X & Y directions :

  At all 8x8 grid positions that starts at that shift :

    Apply the DCT, JPEG quantization matrix, dequantize, and IDCT

Average the 64 images

That's it. The results are good but not great. But it's sort of weird and amazing that it does as well as it does. It's not as good at smoothing blocking artifacts as a dedicated deblocker, and it doesn't totally remove ringing artifacts, but it does a decent job of both. On the plus side, it does preserve contrast better than some more aggressive filters.
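The loop is simple enough to sketch end-to-end. The simplifications below are mine, not the paper's : grayscale only, dimensions a multiple of 8, shifted grids wrap toroidally at the edges, and "quant" stands in for the quantization matrix recovered from the JPEG being decoded.

```cpp
#include <cmath>
#include <vector>

// one 8-point orthonormal DCT (forward or inverse)
static void Dct8(const double in[8], double out[8], bool inverse)
{
    const double PI = 3.14159265358979323846;
    for ( int k = 0; k < 8; k++ )
    {
        double s = 0;
        for ( int n = 0; n < 8; n++ )
        {
            int f = inverse ? n : k; // frequency index
            int t = inverse ? k : n; // time index
            double c = ( f == 0 ) ? std::sqrt(0.5) : 1.0;
            s += c * in[n] * std::cos( PI * f * (2*t+1) / 16.0 );
        }
        out[k] = s * 0.5;
    }
}

// forward 2D DCT, re-quantize, inverse 2D DCT, in place
static void Block8x8(double b[8][8], const double quant[8][8])
{
    double tmp[8][8], row[8], col[8];
    for ( int y = 0; y < 8; y++ ) Dct8(b[y], tmp[y], false);
    for ( int x = 0; x < 8; x++ )
    {
        for ( int y = 0; y < 8; y++ ) row[y] = tmp[y][x];
        Dct8(row, col, false);
        for ( int y = 0; y < 8; y++ ) tmp[y][x] = col[y];
    }
    for ( int y = 0; y < 8; y++ )
        for ( int x = 0; x < 8; x++ )
            tmp[y][x] = std::floor( tmp[y][x] / quant[y][x] + 0.5 ) * quant[y][x];
    for ( int x = 0; x < 8; x++ )
    {
        for ( int y = 0; y < 8; y++ ) row[y] = tmp[y][x];
        Dct8(row, col, true);
        for ( int y = 0; y < 8; y++ ) tmp[y][x] = col[y];
    }
    for ( int y = 0; y < 8; y++ ) Dct8(tmp[y], b[y], true);
}

// re-apply JPEG at all 64 grid shifts and average the results
std::vector<double> Nosratinia(const std::vector<double> & img, int w, int h,
                               const double quant[8][8])
{
    std::vector<double> sum(img.size(), 0.0);
    for ( int sy = 0; sy < 8; sy++ )
    for ( int sx = 0; sx < 8; sx++ )
    {
        for ( int by = 0; by < h; by += 8 )
        for ( int bx = 0; bx < w; bx += 8 )
        {
            double b[8][8];
            for ( int y = 0; y < 8; y++ )
            for ( int x = 0; x < 8; x++ )
                b[y][x] = img[ ((by+y+sy)%h)*w + ((bx+x+sx)%w) ];
            Block8x8(b, quant);
            for ( int y = 0; y < 8; y++ )
            for ( int x = 0; x < 8; x++ )
                sum[ ((by+y+sy)%h)*w + ((bx+x+sx)%w) ] += b[y][x];
        }
    }
    for ( size_t i = 0; i < sum.size(); i++ ) sum[i] /= 64.0;
    return sum;
}
```

At shift (0,0) the re-application is a no-op on a valid JPEG decode (the coefficients are already on the quantization lattice); it's the other 63 shifts that do the work.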

Why does Nosratinia work? My intuition says that what it's doing is equalizing the AC quantization at all lattice-shifts. That is, in normal JPEG if you look at the 8x8 grid at shift (0,0) you will find the AC's are quantized in a certain way - there's very little high frequency energy, and what there is only occurs in certain big steps - but if you step off to a different lattice shift (like 2,3), you will see unquantized frequencies, and you will see a lot more low frequency AC energy due to picking up the DC steps. What Nosratinia does is remove that difference, so that all lattice shifts of the output image have the same AC histogram. It's quite an amusing thing.

One classic paper that was way ahead of its time implemented a type 3 (MAP) decoder back in 1995 : "Improved image decompression for reduced transform coding artifacts" by O'Rourke & Stevenson. Unfortunately I can't get this paper because it is only available behind IEEE pay walls.

I refuse to give the IEEE or ACM any money, and I call on all of you to do the same. Furthermore, if you are an author I encourage you to make your papers available for free, and what's more, to refuse to publish in any journal which does not give you all rights to your own work. I encourage everyone to boycott the IEEE, the ACM, and all universities which do not support the freedom of research.

02-11-11 | Some notes on EZ-trees

I realized a few weeks ago that there is an equivalence between EZ-tree coding and NOSB unary less-than-parent coding. Let me explain what that means.

EZ-tree coding means coding values in bitplanes with tree-structured flagging of significance and insignificance. "NOSB" means "number of significant bits". "significant" at bit level b means the value is >= 2^b . (if you like, NOSB is just the position of the top bit - the bit count minus countlz ; 0 means there is no on bit, 1 means the top bit is the bottom bit, etc)

"NOSB" encoding is a way of sending variable length values. You take the number, find the number of significant bits, send that count using some scheme (such as unary), and then send the bits below the top bit (the top bit itself is implied by the count). So, eg. the value 30 has 5 significant bits, so first you send 5 in unary (000001), then you send the remaining bits, 1110. A few unary-NOSB encoded values follow :

0 : 1
1 : 01
2 : 001,0
3 : 001,1
4 : 0001,00
5 : 0001,01
6 : 0001,10
7 : 0001,11
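As a concrete sketch (the function names here are mine, not from any codec), this is the scheme the table above illustrates : NOSB(v) zeros, a terminating 1, then the bits below the top bit (the top bit itself is implied by the prefix) :

```cpp
#include <cassert>
#include <string>

// number of significant bits = position of the top bit (0 if v == 0)
static int NOSB(unsigned v)
{
    int n = 0;
    while ( v ) { n++; v >>= 1; }
    return n;
}

// unary-NOSB encode : NOSB(v) zeros then a 1, then the NOSB(v)-1 bits
// below the top bit, high to low
static std::string EncodeNOSBUnary(unsigned v)
{
    int n = NOSB(v);
    std::string out(n, '0');
    out += '1';
    for (int b = n - 2; b >= 0; b--)
        out += ((v >> b) & 1) ? '1' : '0';
    return out;
}
```

Running this on the table's values reproduces the codes above (with the comma dropped), eg. 30 encodes as 000001 followed by 1110.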

To be concrete I'll talk about a 4x4 image, with the DC in A and the three high-pass quartets below :

A B b b
C D b b
c c d d
c c d d

The traditional EZ-tree uses a parent-child relationship where the lower case quartets (b,c,d) are children of the upper case letters. The spot B has its own value, and it also acts as the parent of the b quartet. In a larger image, each of the b's would have 4 kids, and so on.

In all cases we are talking about the ABS of the value (the magnitude) and we will send the sign bit separately (unless the value is zero).

EZ-tree encoding goes like this :

1. At each value, set 
significance_level(me) = NOSB(me)
tree_significance_level(me) = MAX( significance_level(me), tree_significance_level(children) )

so tree_significance_level of any parent is >= that of its kids (and its own value)

2. Send the max significance_level of B,C, and D
  this value tells us where to start our bitplane iteration

3. Count down from level = max significance_level down to 0
  For each level you must rewalk the whole tree

4. Walk the values in tree order

5. If the value has already been marked significant, then transmit the bit at that level

5.B. If this is the first on bit, send the sign

6. If the value has not already been sent as significant, send tree_significant ? 1 : 0

6.B. If tree not significant, done

6.C. If tree significant, send my bit (and sign if on) and proceed to children 
    ( (**) see note later)

In the terms of the crazy EZW terminology, if your tree is significant but you are not, that's called an "isolated zero". When you and your children are all not significant, that's called a "zerotree". etc.

Let's assume for the moment that we don't truncate the stream, that is we repeat this for all significance levels down to zero, so it is a lossless encoder. We get compression because significance level (that is, log2 magnitude) is well correlated between parent-child, and also between spatial neighbors within a subband (the quartets). In particular, we're making very heavy use of the fact that significance_level(child) <= significance_level(parent) usually.

The thing I realized is that this encoding scheme is exactly equivalent to NOSB coding as a delta from parent :

1. At each value, set NOSB(me) = number of significant bits of my value, then
NOSB(me) = MAX( NOSB(me) , NOSB(children) )

2. Send the maximum NOSB of B,C, and D 
  this value will be used as the "parent" of B,C,D

3. Walk down the tree from parents to children in one pass

4. At each value :
    Send ( NOSB(parent) - NOSB(me) ) in unary
   note that NOSB(parent) >= NOSB(me) is guaranteed

5. If NOSB(me) is zero, then send no bits and don't descend to any of my children

6. If NOSB(me) is not zero, send my bits plus my sign and continue to my children

This is not just similar, it is exactly the same. It produces the exact same output bits, just permuted.

In particular, let's do a small example of just one tree branch :

B = 6 (= 110)

bbbb = {3,0,1,2}

Significance(B) = 3

Significance(bbbb) = {2,0,1,2}

And the EZ-tree encoding :

Send 3 to indicate the top bit level.

Level 3 :

send 1  (B is on)
  send 1 (a bit of B)
  send a sign here as well which I will ignore

Go to children
send 0000 (no children on)

Level 2 :

B is on already, send a bit = 1

Go to children
send significance flags :
1001

For significant values, send their bits :
1  1

Level 1 :

B is on already, send a bit = 0

Go to children
send significance flags for those not sent :
01

send bits for those already significant :
1 10


Bits sent, grouped per value (columns are B,b1,b2,b3,b4 ; rows are levels 3,2,1) :

 11   0   0   0   0
  1  11   0   0  11
  0   1   0  11   0

but if we simply transpose the bits sent (rows<->columns) we get :

 1110
 0111
 000
 0011
 0110

Which is clearly unary + values :

1 + 110
01 + 11
000 (*)
001 + 1
01 + 10

* = unary for 3 would be 0001 , but there's no need to send the last 1
because we know value is <= 3

exactly the same !

(**) = actually at the bottom level (leaves) when you send a significance flag you don't need to send the top bit. The examples worked here treat the b,c,d groups as nodes, not final leaves. If they were leaves, the top bits should be omitted.
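The per-value rule of the NOSB formulation (steps 4-6) can be sketched like this for interior nodes ; the helper names are mine, and the truncated unary handles the (*) case where the delta hits its maximum. Fed the example values above with parent NOSB 3, it reproduces the per-value codes exactly (eg. 0111 for the child value 3) :

```cpp
#include <cassert>
#include <string>

// number of significant bits = position of the top bit (0 if v == 0)
static int NOSB(unsigned v)
{
    int n = 0;
    while ( v ) { n++; v >>= 1; }
    return n;
}

// truncated unary : v zeros, terminated by a 1 unless v == maxv
// (when the delta hits its known maximum, the terminator is redundant)
static std::string UnaryMax(int v, int maxv)
{
    std::string s(v, '0');
    if (v < maxv) s += '1';
    return s;
}

// one interior node : ( NOSB(parent) - NOSB(me) ) in truncated unary,
// then all NOSB(me) bits of the value (top bit included, since this is
// a node rather than a leaf) ; sign bits ignored as in the example
static std::string EncodeNode(unsigned value, int parentNOSB)
{
    int n = NOSB(value);
    std::string s = UnaryMax(parentNOSB - n, parentNOSB);
    for (int b = n - 1; b >= 0; b--)
        s += ((value >> b) & 1) ? '1' : '0';
    return s;
}
```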

So, that's pretty interesting to me. Lots of modern coders (like ADCTC) use NOSB encoding, because it gives you a nice small value (the log2) with most of the compressability, and then the bits under the top bit are very uncompressable, and generally follow a simple falloff model which you can context-code using the NOSB as the context. That is, in modern coders the NOSB of a value is first arithmetic coded using lots of neighbor and parent information as context, and then bits under the top bit are coded using some kind of laplacian or gaussian simple curve using the NOSB to select the curve.

We can see that EZW is just a NOSB coder where it does two crucial things : set NOSB(parent) >= NOSB(child) , and transmit NOSB as | NOSB(parent) - NOSB(child) |. This relies on the assumption that parents are generally larger than kids, and that magnitude levels are correlated between parents and kids.

Forcing parent >= child means we can send the delta unsigned. It also helps efficiency a lot because it lets us stop an entire tree descent when you hit a zero. In the more general case, you would not force parent to be >= child, you would simply use the correlation by coding ( NOSB(parent) - NOSB(child) ) as a signed value, and arithmetic-code / model that delta ( using at least NOSB(parent) as context, because NOSB(parent) = 0 should very strongly predict NOSB(child) = 0 as well). The big disadvantage of this is that because you can send child > parent, you can never stop processing, you must walk to all values.

Of course we can use any parent-child relationship, we don't have to use the standard square quartets.

The NOSB method is vastly preferable to the traditional EZ-Tree method for speed, because it involves only one walk over all the data - none of this repeated scanning at various bit plane levels.

A few more notes on the EZ-Tree encoding :

At step 6, when you send the flag that a tree is significant, there are some options in the encoding. If your own value is on, then it's possible that all your children are off. So you could send another flag bit indicating if all your children are 0 or not. Generally off is more likely than on, so you could also send the number of children that are on in unary, and then an identifier of which ones are on; really this is just a fixed encoding for the 4 bit flags so maybe you just want to Huffman them.

The more interesting case is if you send that your tree is significant, but your own value is *off*. In that case you know at least one of your children must be significant, and in fact the case 0000 (all kids insignificant) is impossible. What this suggests is that a 5 bit value should be made - one bit from the parent + the four child flags, and it should be Huffman'ed together. Then the 5 bit value 00000 should never occur.

It's a little unclear how to get this kind of action in the NOSB formulation. In particular, the fact that if the parent is significant but the parent's bits so far are zero, then one of the kids must be on - that requires coding of the children together as a unit. That could be done thusly : rather than using unary, take the delta of NOSB from parent for all of the children. Take the first two bits or so of that value and put them together to make an 8 bit value. Use parent bit = 0 as a 1 bit context to select two different huffmans and use that to encode the 8 bits.

Finally, a few notes on the "embedded" property of EZ-trees ; that is, the ability to truncate and get lower bitrate encodings of the image.

Naively it appears that the NOSB formulation of the encoding is not truncatable in the same way, but in fact it is. First of all, if you truncate entire bit levels off the bottom, you can simply send the number of bit levels to truncate off and then you effectively just shift everything down by that number of bits and then proceed as normal. If you wish to truncate in the middle of a bit level, that means only sending the bottom bit for the first N values in that bit level, and then storing 0 implicitly for the remaining values. So you just have to send N and then check in the decoder; in the decoder for the first N values it reads all the bits, and then for remaining values it reads NOSB-1 bits and puts a zero in the bottom. Now you may say "that's an extra value you have to send" ; well, not really. In the EZ-tree if you just truncate the file you are effectively sending N in the file size - that is, you're cheating and using the file size as an extra information channel to send your truncation point.

One thing I don't see discussed much is that EZ-tree truncated values should not just be restored with zeros. In particular, truncation is not the same as quantization at a coarser level, because you should sometimes round up and set a higher bit. eg. say you have the value 7 and you decided to cut off the bottom 3 bits. You should not send 0, you should send 8 >> 3 = 1.
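That rounding rule is just an add-before-shift ; a minimal sketch (my function names) :

```cpp
#include <cassert>

// plain truncation : drop the bottom b bits (floors toward zero)
static unsigned TruncateBits(unsigned v, int b) { return v >> b; }

// quantize with rounding : add half the step before shifting,
// so 7 with 3 bits cut becomes (7+4) >> 3 = 1 instead of 0
static unsigned RoundBits(unsigned v, int b) { return (v + (1u << (b - 1))) >> b; }
```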

A related issue is how you restore missing bottom bits when you have some top bits. Say you got 110 and then 3 bits are cut off the bottom so you have [110???] ; you should not just fill in zeros - in fact you know your value is in the range 110000 - 110111 ; filling zeros puts you at the bottom of the range which is clearly wrong. You could go to the middle of the range, but that's also slightly wrong because image residuals have laplacian distribution, so the expected value is somewhere below the middle of the range. I have more complex solutions for this, but one very simple bit-shifty method is like this :

To fill the missing bits
Add one 0 bit
Then repeat the top bits

So 110??? -> 110 + 0 + 11 , 101???? -> 1010101 , etc.
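Here is a sketch of that fill rule (my names : top holds the nk known bits, nm counts the missing ones) :

```cpp
#include <cassert>

// restore missing low bits : append one 0 bit, then repeat the known
// top bits (MSB first, cycling if needed) to fill the remaining slots
static unsigned FillMissingBits(unsigned top, int nk, int nm)
{
    unsigned v = top << nm;           // known bits, zeros below
    for (int i = 0; i < nm - 1; i++)  // highest missing slot stays 0
    {
        int bit = (top >> (nk - 1 - (i % nk))) & 1;
        v |= (unsigned)bit << (nm - 2 - i);
    }
    return v;
}
```

This reproduces the two examples : 110??? becomes 110011 and 101???? becomes 1010101.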

Of course the big deal in EZ-trees is to sort the way you send data so that the most important bits come first. This is like R/D optimizing your truncation. See SPIHT, EBCOT, etc. Modern implementations of JPEG2000 like Kakadu have some perceptual D heuristic so that they do more truncation where bits are less important *perceptually* instead of just by MSE.

01-24-11 | Blah Blah Blah

I'm reading "Creation" now and one tidbit is that Pythagorus believed that beans contained men's souls. A while ago I read a bunch of Ghandi, and one thing that struck me was Ghandi's bizarre emphasis on homespun clothing and spinning your own cotton (the other thing that struck me is that Ghandi was kind of a dick, but that's a rant for another day). Newton believed that God mediated gravity and that any attempt to explain it physically was not only foolish but sacrilege. It's strange to separate the man who can believe in some utter nonsense from the man who is quite reasonable and intelligent, and it just seems so odd to me that so many people can have both aspects in them.

Having ideas is fucking easy. People love to say shit like "I had the idea for Facebook two years before it came out, I could be rich!". Big fucking whoop, I'm sure tons of people had that idea (obviously Myspace had that idea, as did ConnectU, etc.). Ideas are fucking easy, I have a million ideas a day. The hard thing is identifying the ideas that are the really good ones, and then making the decision to go for them. Anyone who's smart and creative has ideas, but we're afraid to go for it, or we don't believe in it enough to take a risk, or whatever. I see lots of people who sit on the sidelines and make retarded comments like this ("I had that idea! I'm so smart!"), but there are also plenty of businesses that have made their fortune and falsely think it was because of their "great ideas" (in general the ability of businesses to be un-self-aware and not understand why they are successful is astonishing).

Working on software tools that enhance my own computing experience is incredibly satisfying. The things that I've done in the last few years that please me most are my NiftyPerforce replacement, my window manager, autoprintf, my google chart maker, my bitmap library, etc. things that I use in my coding life to write more code. It's like being a metal worker and spending your time making tools for metalworking. It's incredibly satisfying because you use these things every day so you get to have the benefit yourself. Paul Phillips talked about the "exponential productivity boost of writing software for yourself" ; in theory there is an exponential benefit, because if writing software tools makes you X% more efficient at writing software tools, you can write more to help yourself, then even more, etc. I believe in practice that it is in fact not exponential, but I'm not sure why that is.

There is, however, an interesting non-linear jump in tool making and process enhancement. The issue is that what we do is somewhat art. You need inspiration, you need to be in the right mindset, you need to be able to play with your medium. When the craft is too difficult, when there's too much drudge work, you sap the vital juices from your mind, and you will never have a big epiphany. If you spend some time just working on your tools and process, you can make the actual act of creation easier and more pleasant for yourself, so that you come into it with a totally different mind set and you have different kinds of ideas. It might seem more efficient to just knock out the work the brute force way, but there is something magical that happens if you transform the work into something that feels natural and fun to play with and experiment with.

Things I want to avoid : dumb TV, booze, sugar, surfing the net, web forums, lying on the couch. But god damn, when you cut all those things out, life is hard. When night time rolls around and you're tired and bored, it's hard to get through the dark hours without those crutches. My real goal is to spend less time on the computer that's not real good productive time.

It's depressing trying to manage your investments in a down-market period. Despite the fact that we're in a mini-bubble false recovery at the moment, I believe that stocks will perform badly for the next 10 years or so; if you can beat inflation during this period you are doing well. I believe that there is very little skill in most "business" ; if you happen to start a company during an up-swing in the economy, you will do well and think you are a genius, if you do it during a down-swing you will do badly (but still probably think you are a genius). In particular, if you are lucky enough to be able to run something with big leverage during a general market up-swing, you basically just get to print free money. It's not that you were some brilliant real estate developer (or whatever), it's just that you put big money into the market when all boats were rising.

01-23-11 | Cars.com extraction

I wrote a little program to extract records from Cars.com and make charts. For example :

Subaru WRX'es :

2002 : median=  9394.00 mean=  9462.42 sdev=  1932.74
2003 : median=  9995.00 mean= 10287.76 sdev=  1667.03
2004 : median= 11495.00 mean= 11295.16 sdev=  2122.11
2005 : median= 13995.00 mean= 13306.94 sdev=  1892.77
2006 : median= 15995.00 mean= 15854.00 sdev=  2065.38
2007 : median= 17990.00 mean= 17455.84 sdev=  2029.77
2008 : median= 20500.00 mean= 20591.19 sdev=  1935.96
2009 : median= 23595.00 mean= 23540.47 sdev=  1523.07
2010 : median= 25900.00 mean= 25750.80 sdev=  3505.73
2011 : median= 25989.00 mean= 24726.00 sdev=  1786.15

The WRX results are roughly what our intuition expects. There is a small step down associated with model year, but the steady price decrease for increasing mileage is much stronger. This is however somewhat surprising because there was a major model change in 2008, and the 2008 cars are known to be much worse than the 2009's.

Porsche 911 3.8 S Manual Coupes :

2006 : median= 56900.00 mean= 55829.26 sdev= 4804.42
2007 : median= 61991.00 mean= 61161.85 sdev= 5069.27
2008 : median= 67981.00 mean= 68558.00 sdev= 4532.27
2009 : median= 79880.00 mean= 79492.58 sdev= 6171.92

The Porsche is the exact opposite of the WRX. I'm surprised at how stable prices are vs mileage, but there is a very steady step down with model year. This is also strange since the 2006-2008 cars are identical.

Honda Civics , Manual , excluding the new 2.0 I4 :

2000 : median=  6990.00 mean=  6978.33 sdev=   977.55
2001 : median=  7999.00 mean=  7999.00 sdev=     0.00
2002 : median=  7995.00 mean=  7472.50 sdev=   738.93
2003 : median=  6995.00 mean=  6852.88 sdev=  1109.37
2004 : median=  8995.00 mean=  8615.50 sdev=  1186.42
2005 : median=  8657.00 mean=  8710.30 sdev=  1412.42
2006 : median= 12495.00 mean= 12604.31 sdev=  2074.50
2007 : median= 12990.00 mean= 13538.41 sdev=  2174.33
2008 : median= 14694.00 mean= 14918.31 sdev=  2282.32
2009 : median= 17975.00 mean= 17244.96 sdev=  2532.70

Hondas behave strangely on the market. There is very little depreciation with mileage for a given model year. There is a big price step at 2006 when the new model came out - people seem to like the new one much better - but other than that there is very little depreciation with age either.

01-19-11 | Good practices

Some things that I always regret when I don't do them. These are as much reminders for myself not to get lazy as they are finger-wags at you all.

1. Save every log.

My programs generally log very verbosely. I log more to the file than to stdout. The key thing is that you shouldn't just overwrite your previous log. You should save every log of your runs *forever*. Disks are big, there is no reason to ever clean this up.

Your log file should contain the time & date of the run + the time & date of the build (this is mildly annoying to do in practice, just using __DATE__ somewhere in your code doesn't work because that module may not be fresh compiled). Ideally it would have the P4 sync state of the code as well (see comments). Ideally it would also log the modtime and/or checksum of every file that you take as input, so you can associate a run with the condition of the files it loads as well.

This is an okay way to start your logs :

// log the command line :
void lprintfCommandLine(int argc,char * argv[])
{
    for(int i=0;i < argc;i++)
        lprintf("%s ",argv[i]);
    lprintf("\n");
}

// log the command line + build time :
void lprintfLogHeader(const char * progName,int argc,char * argv[])
{
    __time64_t long_time;
    _time64( &long_time );
    // note: asctime has a \n in it already
    lprintf("Log opened : %s",asctime(_localtime64( &long_time )));
    lprintf("%s built %s, %s\n",progName,__DATE__,__TIME__);
    lprintf("args: ");
    lprintfCommandLine(argc,argv);
}

Something I've been doing for a while now is to automatically write my logs to "c:\logs\programname.log" , and if a previous one of those exists I append it onto programname.prev . That's okay but not perfect, for one thing the big prev file gets hard to search through; perhaps worse, it doesn't play well with running multiple instances of your program at once (their logs get interleaved and the prev is moved at a random spot).

My videotest does something that I like even better. It makes a new log file for each run and names it by the date/time of the run and the command line args. They all go in a dir "c:\logs\programname\" and then the names are like "-c0rpjt1.rmv-irparkjoy_444_720p_lagarith_50.avi-p-d0-Sun-Jan-16-15-42-27-2011.log" which makes it very easy for me to find particular runs and see what the args were.

2. Make tests reproducible.

Often I'll run some tests and record the result, and then later when I go back to it I'm confused by what was run exactly, where the data files were, etc. Doing this is really just about discipline. There are a few things that I believe help :

2.A. Always make batch files. When you run command lines, do not just type in the command line and run it to execute your test. Put the command line in a batch file and run that. Then check your batch file into perforce!

2.B. Take the results of the test and write a quick note about how they were run and what they were testing. eg. exactly what #define did you flip in the code to run the test. It's so easy to not take this step because you think "its obvious" what you did, but 12 months later it won't be obvious. Check this into perforce.

2.C. For more interesting tests, make a directory and copy the whole thing! Copy in the data files that you ran on, the batch files, the results, and the code! Also copy in any tools you used! eg. if you used imagemagick or ffmpeg or whatever as part of your process, just copy it in to the directory. Disk is cheap and huge! Save the whole thing somewhere so you have a record of what exactly you ran.

2.D. If you changed code to run the test - check it in! Even if it's just for the test run - check in the test code and then check it in back to normal. eg. if you flip a #define or change some constants - check that in to P4 with a description saying "for test such and such".

(ASIDE : the first company I ever worked at was CTC in Houston. At CTC when we finished a major project, we would make an archive of the entire project and everything we needed to reproduce it. That means the code, the compilers, any tools or libraries, and the whole OS. We could take an archive and restore it to a fresh machine and instantly have a working system that would build it. I just thought that made a lot of sense and obviously every developer did that, right? Nope. I have yet to see anyone else do it since. People seem to believe that they can just archive the code without recording everything that needed to be done to the tools and build environment and so on to make a working build system.)

3. Never leave magic numbers in code.

I often stumble back on some old code and find something like :

x = y + 7 * z;

umm.. WTF where did that 7 come from? Even if you only use that value in one spot in the code, give it a name. eg.

const double c_averageACtoDCratio = 7;

x = y + c_averageACtoDCratio * z;

Ah, okay. Which is related to :

4. When values come from training or tweaking, document exactly how they were generated.

Ideally you saved the training run as per #2, so you should be able to just add a comment like

// this value comes from training; see train_averageACtoDCratio.bat

You may think you've found the perfect tweak value, but some day you may come back to it and think "where did this come from? am I sure this right? does this still apply since I've changed a bunch of other things since then?". Also, what data did you train it on exactly? Does that data actually reflect what you're using it for now?

Ideally every value that comes from training can be regenerated at any time. Don't just do your training run and save the value and then break your training code. Assume that any tweak value will have to be regenerated in the future.

5. Some notes on default values.

In more complex programs, don't just put the default value of a parameter in its initializer, eg.

int maxNumMovecSearches = 32;

no no. The problem is that the initializer can be hidden away in a weird spot, and there may be multiple initializers, and you may want to reset values to their defaults, etc. Instead give it a proper name, like :

const int c_maxNumMovecSearches_default = 32;

int maxNumMovecSearches = c_maxNumMovecSearches_default;

Now you can also show defaults on the command line, such as :

lprintf("videotest usage:\n");
lprintf("-m# : max num movec searches [%d]\n",c_maxNumMovecSearches_default);

For args that are unclear what scale they're on, you should also show a reasonable parameter range, eg.

lprintf("-l# : lagrange lambda [%f] (0.1 - 1.0)\n",c_lagrangeLambda_default);

(though this example violates my own rule of baking constants into strings)

6. Make all options into proper enums with names.

You should never have a "case 2:" in your code or an "if ( mode == 1 )". Every time I short-cut and do that I wind up regretting it. For example I had some code doing something like :

qtable = qtables[5];

To select the 6th qtable. Of course that inevitably leads to bugs when I decide to reorder the qtables.

Give your enums the right name automatically using XX macros , eg. :

#define DCTQTableIndex_XX  \
    XX(eDctQ_Flat) YY(=0),\
    XX(eDctQ_Jpeg)

enum EDCTQTableIndex
{
    #define XX(x) x
    #define YY(y) y
    DCTQTableIndex_XX
    #undef XX
    #undef YY
};

const char * c_DCTQTableIndex_names[] =
{
    #define XX(x) #x
    #define YY(y)
    DCTQTableIndex_XX
    #undef XX
    #undef YY
};
And then make sure you echo the name when somebody selects an enum numerically so you can show that they are correct.

Do not keep a table of names and a table of enums that must be kept in sync manually. In general do not keep *anything* in code that must be kept in sync manually.

Show them as part of the command line help, eg :

        for(int f=0;f < eImDiff_Count;f++)
            myprintf("%2d : %s\n",f,c_imDiffTypeNames[f]);

For example if you tried to be a good boy and had something like :

    lprintf("dct qtable names:\n");
    lprintf("0: flat\n");
    lprintf("1: jpeg\n");

You have just "hard-coded" the values through the use of string constants. Don't bake in values.

One nice way is if command line arg "-m#" selects mode #, then "-m" with no arg should list the modes ("-m" followed by anything non-numeric should do the same).
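A sketch of that convention (SelectMode is a hypothetical helper, and the mode names are just the flat/jpeg examples from above) :

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <cstdlib>

// hypothetical mode table for illustration
static const char * c_modeNames[] = { "flat", "jpeg" };
static const int c_modeCount = 2;

// handle the text after "-m" : numeric selects a mode (echoing its name
// back, per the rule above) ; empty or non-numeric lists the modes
static int SelectMode(const char * arg)
{
    if (arg[0] == 0 || !isdigit((unsigned char)arg[0]))
    {
        for (int m = 0; m < c_modeCount; m++)
            printf("%d : %s\n", m, c_modeNames[m]);
        return -1;
    }
    int mode = atoi(arg);
    if (mode < 0 || mode >= c_modeCount) return -1;
    printf("mode %d : %s\n", mode, c_modeNames[mode]);
    return mode;
}
```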

If somebody has to go read the docs to use your program, then your command line user interface (CLUI) has failed. Furthermore docs are another example of bad non-self-maintaining redundancy. The list of arg mappings in the docs can get out of sync with the code : never rely on the human programmer to keep things in sync, make the code be correct automagically.

7. Copy the exe.

It's so easy during work to think of the exe as a temp/derivative item which you are replacing all the time, but you will often get yourself into scenarios where it's hard to get back to some old state and you want to play with the exe from that old state. So just copy off the exe every so often. Any time you generate major test results is a good time for this - output your CSV test files or whatever and just take a copy of the exe used to make the tests.

A semi-related practice that I've taken up now is to copy off the exe out of the build target location any time I run a test, so that I can still build new versions, and if I do build new versions, the version I wanted to test isn't fouled. eg. I use something like :

run_vt_test.bat :

  call zc -o release\vt.exe r:\vt_test.exe
  r:\vt_test.exe %*

8. Automate your processing

If you're running some process that requires a few steps - like run this exe, take the number it outputs, copy it into this code, recompile, run that on this data file - automate it.

It might seem like it's faster just to do it yourself than it is to automate it. That may in fact be true, but automating it has lots of ancillary value that makes it worth doing.

For one thing, it documents the process. Rather than just having to write a document that describes "how to export a character for the game" (which you will inevitably not write or not keep up to date), instead you have your batch file or script or whatever that does the steps.

For another, it gives you a record of what's being done and a way to exactly repeat the process if there are bugs. Any time you have a human step in the process it adds an element of non-repeatability and uncertainty, is this character broken because there's a bug or just because someone didn't follow the right steps?

There are some very simple ghetto tricks to easy automation. One is to have your program fwrite some text to a batch file and then just run that batch.

More generally, I wish my computer kept a full journal of everything I ever did on it. Everything I type, the state of every file, everything run should be stored in a history which I can play back like a movie any time I want. I should just be able to go, "mmkay, restore to the condition of May 13, 2009 and play back from 3:15 to 3:30". Yeah, maybe that's a bit unrealistic, but it certainly is possible in certain limited cases (eg. apps that don't access the net or take input from any weird drivers) which are the cases I mainly care about.

01-18-11 | Hadamard

The normal way the Hadamard Transform (Wikipedia) is written is not in frequency order like the DCT. For example the 8-item Hadamard as written on Wikipedia is :

    +1 +1 +1 +1 +1 +1 +1 +1
    +1 -1 +1 -1 +1 -1 +1 -1
    +1 +1 -1 -1 +1 +1 -1 -1
    +1 -1 -1 +1 +1 -1 -1 +1
    +1 +1 +1 +1 -1 -1 -1 -1
    +1 -1 +1 -1 -1 +1 -1 +1
    +1 +1 -1 -1 -1 -1 +1 +1
    +1 -1 -1 +1 -1 +1 +1 -1

The correct reordering is :

  { 0 7 3 4 1 6 2 5 }

When that's done what you get is :

    +1 +1 +1 +1 +1 +1 +1 +1
    +1 +1 +1 +1 -1 -1 -1 -1
    +1 +1 -1 -1 -1 -1 +1 +1
    +1 +1 -1 -1 +1 +1 -1 -1
    +1 -1 -1 +1 +1 -1 -1 +1
    +1 -1 -1 +1 -1 +1 +1 -1
    +1 -1 +1 -1 -1 +1 -1 +1
    +1 -1 +1 -1 +1 -1 +1 -1

which more closely matches the DCT basis functions.
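One way to sanity-check the reordering (a sketch ; it builds the natural-order matrix from the identity H[r][c] = (-1)^popcount(r&c), which matches the table above, and reads the list as "natural row j lands at frequency position reorder[j]") is to verify that frequency-ordered row k has exactly k sign changes :

```cpp
#include <cassert>

// portable popcount for small values
static int Pop(int v) { int n = 0; while (v) { n += v & 1; v >>= 1; } return n; }

// natural (Wikipedia) order Hadamard entry : (-1)^popcount(r & c)
static int Had8(int r, int c) { return (Pop(r & c) & 1) ? -1 : +1; }

// natural row j moves to frequency position c_reorder[j]
static const int c_reorder[8] = { 0, 7, 3, 4, 1, 6, 2, 5 };

// count sign changes across frequency-ordered row k ; should equal k
static int SignChanges(int k)
{
    int j = 0;                        // find the natural row that lands at k
    while (c_reorder[j] != k) j++;
    int changes = 0;
    for (int c = 1; c < 8; c++)
        if (Had8(j, c) != Had8(j, c - 1))
            changes++;
    return changes;
}
```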

You can of course do the Hadamard transform directly like an 8x8 matrix multiply, but the faster way is to use a "fast hadamard transform" which is exactly analogous to a "fast fourier transform" - that is, you decompose it into a log(N) tree of two-item butterflies; this gives you 8*3 adds instead of 8*8. The difference is the Hadamard doesn't involve any multiplies, so all you need are {a+b,a-b} butterflies.

ADDENDUM : to be more concrete, fast hadamard is this :

    vec[0-7] = eight entries
    evens = vec[0,2,4,6]
    odds  = vec[1,3,5,7]

    Butterfly :
        vec[0-3] = evens + odds
        vec[4-7] = evens - odds

    Hadamard8 :
        Butterfly three times
This produces "canonical order", not frequency order, though obviously using a different shuffle in the final butterfly fixes that easily.
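That recipe in plain C is just (a sketch of the pseudocode above ; in-place, canonical order, 24 adds and no multiplies) :

```cpp
#include <cassert>

// one butterfly pass : split into evens/odds, write sums then differences
static void Butterfly8(int v[8])
{
    int evens[4], odds[4];
    for (int i = 0; i < 4; i++) { evens[i] = v[2*i]; odds[i] = v[2*i+1]; }
    for (int i = 0; i < 4; i++) { v[i] = evens[i] + odds[i]; v[i+4] = evens[i] - odds[i]; }
}

// fast Hadamard = three butterfly passes (log2(8) levels)
static void FastHadamard8(int v[8])
{
    Butterfly8(v);
    Butterfly8(v);
    Butterfly8(v);
}
```

Applying it twice returns 8 times the input, which is the self-inverting property (up to normalization) noted below.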

To do 2d , you obviously do the 1d Hadamard on rows and then on columns. The normalization factor for 1d is 1/sqrt(8) , so for 2d it's just 1/8 , or if you prefer the net normalization for forward + inverse is 1/64 and you can just apply it on either the forward or backward. The Hadamard is self-inverting (and swizzling rows doesn't change this).

The correctly ordered Hadamard acts on images very similarly to the DCT, though it compacts energy slightly less on most images, because the DCT is closer to the KLT of typical images.

In these examples I color code the 8x8 DCT or Hadamard entries. The (0,y) and (x,0) primary row and column are green. The (x,y) (x>0,y>0) AC entries are a gradient from blue to red, more red where vertical detail dominates and more blue where horizontal detail dominates. The brightness is the magnitude of the coefficient.

If you look at the two images, you should be able to see they are very similar, but Hadamard has more energy in the higher frequency AC bands.

original :

dct :

hadamard :

I also found this paper :

Designing Quantization Table for Hadamard Transform based on Human Visual System for Image Compression

which applies JPEG-style CSF design to make a quantization matrix for the Hadamard transform.

In straight C, the speed difference between Hadamard and DCT is not really super compelling. But Hadamard can be implemented very fast indeed with MMX or other SIMD instruction sets.

It seems that the idea of using the Hadamard as a rough approximation of the DCT for purposes of error or bit-rate estimation is a good one. It could be made even better by scaling down the high frequency AC components appropriately.

ADDENDUM : some more stats :

root(L2) : 
|126.27 |  7.94 |  4.41 |  2.85 |  2.00 |  1.48 |  1.12 |  0.90 |
|  8.47 |  4.46 |  3.06 |  2.10 |  1.51 |  1.15 |  0.86 |  0.69 |
|  4.86 |  3.27 |  2.43 |  1.76 |  1.34 |  1.02 |  0.78 |  0.62 |
|  3.13 |  2.33 |  1.88 |  1.45 |  1.12 |  0.90 |  0.72 |  0.60 |
|  2.22 |  1.71 |  1.47 |  1.18 |  0.96 |  0.77 |  0.63 |  0.55 |
|  1.62 |  1.26 |  1.13 |  0.98 |  0.80 |  0.65 |  0.58 |  0.56 |
|  1.23 |  0.94 |  0.90 |  0.79 |  0.66 |  0.58 |  0.52 |  0.51 |
|  0.92 |  0.77 |  0.72 |  0.67 |  0.59 |  0.56 |  0.51 |  0.79 |

Hadamard :
root(L2) : 
|126.27 |  7.33 |  4.11 |  3.64 |  2.00 |  2.03 |  1.97 |  1.73 |
|  7.82 |  4.01 |  2.75 |  2.16 |  1.47 |  1.46 |  1.33 |  1.00 |
|  4.51 |  2.92 |  2.15 |  1.69 |  1.28 |  1.21 |  1.09 |  0.81 |
|  3.90 |  2.26 |  1.74 |  1.41 |  1.10 |  1.04 |  0.93 |  0.71 |
|  2.22 |  1.66 |  1.39 |  1.14 |  0.96 |  0.89 |  0.78 |  0.62 |
|  2.24 |  1.59 |  1.28 |  1.06 |  0.88 |  0.81 |  0.74 |  0.60 |
|  2.19 |  1.42 |  1.14 |  0.94 |  0.78 |  0.74 |  0.67 |  0.59 |
|  1.88 |  1.04 |  0.85 |  0.74 |  0.65 |  0.60 |  0.59 |  0.92 |

FracZero : 
|  4.48 | 36.91 | 52.64 | 61.32 | 68.54 | 76.64 | 82.60 | 87.18 |
| 34.78 | 48.93 | 57.95 | 65.56 | 72.17 | 79.63 | 84.84 | 88.69 |
| 49.88 | 57.31 | 63.19 | 69.21 | 76.14 | 81.96 | 86.48 | 89.98 |
| 58.79 | 64.01 | 68.24 | 73.48 | 79.53 | 84.15 | 88.10 | 91.14 |
| 66.13 | 70.03 | 74.37 | 78.79 | 82.44 | 86.57 | 89.97 | 92.38 |
| 72.79 | 76.66 | 79.56 | 82.31 | 85.42 | 88.87 | 91.62 | 93.51 |
| 79.74 | 82.22 | 83.90 | 86.16 | 88.68 | 91.13 | 93.16 | 94.72 |
| 85.08 | 86.57 | 87.88 | 89.59 | 91.35 | 93.07 | 94.61 | 95.74 |

Hadamard :
FracZero : 
|  4.48 | 38.38 | 53.95 | 50.15 | 68.54 | 69.20 | 67.81 | 65.50 |
| 36.14 | 51.13 | 60.26 | 61.90 | 72.82 | 73.11 | 73.67 | 76.59 |
| 51.21 | 59.22 | 65.63 | 68.40 | 76.63 | 76.53 | 77.78 | 82.02 |
| 47.59 | 61.22 | 68.33 | 70.63 | 78.36 | 78.52 | 80.18 | 84.12 |
| 66.13 | 70.80 | 75.02 | 77.85 | 82.44 | 82.72 | 83.62 | 88.09 |
| 66.09 | 71.37 | 75.47 | 77.84 | 82.67 | 83.22 | 84.83 | 88.54 |
| 65.34 | 72.54 | 77.05 | 79.82 | 83.99 | 85.04 | 86.52 | 89.92 |
| 63.44 | 76.15 | 81.42 | 83.66 | 87.91 | 88.36 | 89.81 | 92.43 |

01-17-11 | ImDiff Release

Imdiff Win32 executables are now available for download. You can get them from my exe index page or direct link to zip download .

If you wish to link to imdiff, please link to this blog post as I will post updates here.

If you wish to run the JPEG example batch file, you may need to get JPEG exes here and get PackJPG here . Install them somewhere and set a dos environment variable "jpegpath" to where that is. Or modify the batch files to point at the right place. If you don't know how to use dos batch files, please don't complain to me, instead see here for example.

Once you have your JPEG installed correctly, you can just run "jpegtests image.bmp" and it will make nice charts like you've seen here.

I am not connected to those JPEG distributions in any way. Imdiff is not a JPEG tester. The JPEG test is just provided as an example. You should learn from the batch files and do something similar for whatever image compressor you wish to test.

ADDENDUM : See the Summary Post of all imdiff related blog posts.

01-12-11 | ImDiff Sample Run and JXR test

This is the output of a hands-off fully automatic run :

(on lena 512x512 RGB ) :

I was disturbed by how bad JPEG-XR was showing so I went and got the reference implementation from the ISO/ITU standardization committee and built it. It's here .

They provide VC2010 projects, which is annoying, but it built relatively easily in 2005.

Unfortunately, they just give you a bunch of options and not much guide on how to get the best quality for a given bit rate. Dear encoder writers : you should always provide a mode that gives "best rmse" or "best visual quality" for a given bit rate - possibly by optimizing your options. They also only load TIF and PNM ; dear encoder writers : you should prefer BMP, TGA and PNG. TIF is an abortion of an over-complex format (case in point : JXR actually writes invalid TIFs from its decoder (the tags are not sorted correctly)).

There are two ways to control bit-rate, either -F to throw away bit levels or -q to quantize. I tried both and found no difference in quality (except that -F mucks you up at high bit rate). Ideally the encoder would choose the optimal mix of -F and -q for R/D. I used their -d option to set UV quant from Y quant.

There are three colorspace options - y420,y422,y444. I tried them all. With no further ado :

Conclusions :

This JXR encoder is very slightly better than the MS one I was using previously, but doesn't differ significantly. It appears the one I was using previously was in YUV444 color space. Obviously Y444 gives you better RMSE behavior at high bitrate, but hurts perceptual scores.

Clearly the JXR encoders need some work. The good RMSE performance tells us it is not well perceptually optimized. However, even if it was perceptually optimized it is unlikely it would be competitive with the good coders. For example, Kakadu already matches it for RMSE, but kills it on all other metrics.

BTW you may be asking "cbloom, why is it that plain old JPEG (jpg_h) tests so well for you, when other people have said that it's terrible?". Well, there are a few main reasons. #1 is they use screwed up encoders like Photoshop that put thumbnails or huge headers in the JPEG. #2 and probably the main reason is that they test at -3 or even -4 logbpp , where jpg_h falls off the quality cliff because of the old-fashioned huffman back end. #3 is that they view the JPEG at some resolution other than 1:1 (under magnification or minification); any image format that is perceptually optimized must be encoded at the viewing resolution.

One of the ways you can get into that super-low-bitrate domain where JPEG falls apart is by using images that are excessively high resolution for your display, so that you are always scaling them down in display. The solution of course is to scale them down to viewing resolutions *before* encoding. (eg. lots of images on the web are actually 1600x1200 images, encoded at JPEG Q=20 or something very low, and then displayed on a web page at a size of 400x300 ; you would obviously get much better results by using a 400x300 image to begin with and encoding at higher quality).

01-10-11 | Perceptual Results : PDI

This is my 766x1200 version of the PDI test image (that I made by scaling down a 3600 tall jpeg one).

Hipix and JPEG-XR are both very bad. I wonder if the JPEG-XR encoder I'm using could be less than the best? If someone knows a reference to the best command line windows JPEG-XR encoder, please post it. The one I'm using is from some Microsoft HD-Photo SDK distribution.

It's interesting to see how the different encoders do on the different metrics. x264, webp and kakadu are all identical under MyDctDelta. Kakadu falls down in SSIM, but does much better on SCIELAB. This tells you that kakadu is not preserving local detail characteristics as well, but is preserving smooth overall DC levels much better.

01-10-11 | Perceptual Results : mysoup

Two sets of compressors cuz I have too many on this image ; jpg_pack is on both charts as a baseline.

As before, we see AIC , Hipix and JPEG-XR are all very poor. WebP is marginally okay (the early encoder I'm using is still very primitive) ; it does well on SSIM, but doesn't have the good low bitrate behavior you would expect from a modern coder. x264 and Kakadu are quite good. My two coders (vtims and newdct) are better than the terrible trio but not close to the good ones (I need to fix my shit).

01-10-11 | Perceptual Results : Moses

Moses is a 1600x1600 photo of a guy.

01-10-11 | Perceptual Metrics Warmup : JPEG Settings

This is a repeat of old info, just warming up the new system and setting baselines.

Our JPEG is just regular old IJG JPEG with various lossless recompressors ; results on PDI :

As seen before, PAQ has some screwups at low bitrate but is otherwise very close to packjpg, and flat quantization matrix is obviously best for RMSE but worse for visual quality ("flat" here is flat + pack).

From now on we will use jpg_pack as the reference point.

01-10-11 | Perceptual Metrics Warmup : x264 Settings

Beginning the warmup. Quick guide to these charts :

On the left you will see either "err" or "fit". "err" is the raw score of the metric. "fit" is the metric after fitting to a 0-10 human visual quality scale. SSIM err is actually percent acos angle as usual.

The fit score is 0-10 for 0 = complete ass and 10 = perfect, but I have set the graph range to 3-8 , because that is the domain we normally care about. 8 = very hard to tell the difference.

x264 on mysoup , testing different "tune" options. I'm using my y4m to do the color convert for them which helps a lot.

Well "tune psnr" is in fact best on psnr - you can see a big difference on the RMSE chart. "tune ssim" doesn't seem to actually help much on SSIM, it only beats "tune psnr" at very low bit rate. "tune stillimage" just seems to be broken in my build of x264.

Oh, and I use "xx" to refer to x264 because I can't have numbers in the names of things.

Henceforth we will use tune = ssim. (change : psnr)


I looked into this a little more on another image (also trying x264 with no explicit tune specified) :

You can see that "--tune ssim" does help a tiny bit on MS-SSIM , but it *hurts* IW-MS-SSIM , which is a better metric (it hurts also on MyDctDeltaNew). Though the differences are pretty negligible for our level of study. No explicit tune x264 is much worse. "tune psnr" seems to be the best option according to our best metrics.

01-10-11 | Perceptual Metrics

Almost done.

RMSE of fit vs. observed MOS data :

RMSE_RGB             : 1.052392
SCIELAB_RMSE         : 0.677143
SCIELAB_MyDelta      : 0.658017
MS_SSIM_Y            : 0.608917
MS_SSIM_IW_Y         : 0.555934
PSNRHVSM_Y           : 0.521825
PSNRHVST_Y           : 0.500940
PSNRHVST_YUV         : 0.480360
MyDctDelta_Y         : 0.476927
MyDctDelta_YUV       : 0.444007

BTW I don't actually use the raw RMSE as posted above. I bias by the sdev of the observed MOS data - that is, smaller sdev = you care about those points more. See previous blog posts on this issue. The sdev biased scores (which is what was posted in previous blog posts) are :

RMSE_RGB             : 1.165620
SCIELAB_RMSE         : 0.738835
SCIELAB_MyDelta      : 0.720852
MS_SSIM_Y            : 0.639153
MS_SSIM_IW_Y         : 0.563823
PSNRHVSM_Y           : 0.551926
PSNRHVST_Y           : 0.528873
PSNRHVST_YUV         : 0.515720
MyDctDelta_Y         : 0.490206
MyDctDelta_YUV       : 0.458081
Combo                : 0.436670 (*)

(* = ADDENDUM : I added "Combo" which is the best linear combo of SCIELAB_MyDelta + MS_SSIM_IW_Y + MyDctDelta_YUV ; it's a static linear combo, obviously you could do better by going all Netflix-Prize-style and treating each metric as an "expert" and doing weighted experts based on various decision attributes of the image; eg. certain metrics will do better on certain types of images so you weight them from that).

For sanity check I made plots (click for hi res) ; the X axis is the human observed MOS score, the Y axis is the fitted metric :

Sanity is confirmed. (the RMSE_RGB plot has those horizontal lines because one of the distortion types is RGB random noise at a few fixed RMSE levels - you can see that for the same amount of RGB RMSE noise there are a variety of human MOS scores).

ADDENDUM : if you haven't followed old posts, this is on the TID2008 database (without "exotics"). I really need to find another database to cross-check to make sure I haven't over-trained.

Some quick notes of what worked and what didn't work.

What worked :

Variance Masking of high-frequency detail

Variance Masking of DC deltas

PSNRHVS JPEG-style visibility thresholds

Using the right spatial scale for each piece of the metric
  (eg. what size window for local sdev, what spatial filter for DC delta)

Space-frequency subband energy preservation

Frequency subband weighting

What didn't work :

Luma Masking

LAB or other color spaces than YUV in most metrics

anything but "Y" as the most important part of the metric

Nonlinear mappings of signal and perception
  (other than the nonlinear mapping already in gamma correction)

01-09-11 | On Ranting

In the last few weeks there have been an awful lot of rants on other tech blogs that I follow. (there have also been a lot of pointless "year end summaries" and "keep alive" posts; something about being on vacation for the holidays makes people write a lot of drivel). As a long time ranter, allow me to give you all some tips.

Don't rant about boring shit that everybody already knows. eg. Windows is so annoying, OpenGL is all fucked up, C++ is not what it should be, producers don't understand developers, blah blah blah, snooze. If you want to rant about some boring shit to blow off some steam, at least make it funny or angry or something. Don't take yourself too seriously. eg. don't make bullet-pointed outlines of your ranting. It's fucking ranting, ranting is juvenile, it's self-indulgent, it's ill-conceived, don't dress it up like it's a fucking academic paper.

Moving on.

We watched about 15 minutes of Scott Pilgrim (cringing the whole time) before turning it off in disgust. Can we fucking get past this comic book fad already? Just because you call them "indie" or "graphic novels" or whatever doesn't make them any less vapid. And no more superheroes please.

I'm fucking sick of seeing the NYT talk about how some banker or mortgage criminal "lost $100M in the financial crisis". That's ridiculous. First of all, when your pay is in stock or equity or whatever and the value goes down, you aren't really losing pay. You can't set your norm point at the highest value of the stock; if you're a sensible trader or gambler, you don't book the value of your win until you cash it out, and you know that your positive variance one year may well be balanced by negative variance in another. But the main issue is that these people really *made* massive amounts on the financial crisis. Just because they lost back a little at the end doesn't mean they lost overall. Some fucking Goldman or Lehman crook made $50 M over the last 10 years by speculating with free government funds, over-leveraging, and selling off bogus mortgage packages. Then they lost $10 M at the end. You didn't fucking lose $10M on the crisis, you liar, you *made* $40M on it. If you profited in the bubble that ran up to the collapse, you did not lose money. It is absolutely not comparable to people (or pension funds or municipalities) who made money over many years, and then invested it in vehicles they were told were very safe and conservative, and then lost most of it.

I get a similar cognitive dissonance when I see small business owners who are "struggling". They drive a fancy new car, they have a big house, they eat out fancy food all the time, and yet their "business is hardly making it". Wait a fucking minute. The division between your personal finances and your company's is of your own making. How can it be that you are in the pink and your business is struggling because of the recession? Are you still paying yourself salary? Are you still illegally taking cash from the company till and paying your personal expenses as if they were business expenses? Then don't fucking tell me you're struggling. It's bizarre, and yet this behavior is completely standard for sole-proprietors. It seems like these people just don't even make the cognitive connection between the two; I met a guy at a track day last year who would alternate between talking about the new car he was buying, and the fact that his business was struggling, and there was no hint that he considered the two topics at all related.

A common retarded winter time rant that I see is how the city is fucking up the snow preparedness, they aren't doing enough plowing or salting or chaining buses or whatever you think they should be doing. This usually goes along with "government is incompetent" rants from small-government ideologues (occasionally you get some "they should privatize it" harmonies). Ummm... hello. You guys are the ones who have crippled our local governments by starving them of any income. They literally have no money, and now you're complaining that they aren't spending enough to take care of the streets?

Seattle's entire annual budget for pot hole repairs is $3M. It was cut last year due to general budget shortfalls, and the four teams were reduced to three. Well that $3M is already spent, and we have finally announced some emergency pot-hole fillers, which is now coming out of the general street maintenance fund (which is a paltry $42M) which will cripple other programs. Among other things, the pot-hole repairs are temporary, and spending more on pot-hole filling means having less cash to repave streets, which means the problem just gets worse year after year. (Seattle's general budget for transportation has been below maintenance level since 1995 ). (see also here or here ).

Seattle's budget for street repairs on non-arterial streets is $0. Zero. It has been $0 since the 90's. Seattle's budget to even *look* at non-arterial streets is $0. There is no survey of non-arterial street condition and no program to pave them. (see here )

BTW buses are in a similar situation to the streets. Maintenance has been deferred for years, it's the easy/sleazy/lazy way for politicians to cut budgets to comply with the retarded voters wish to starve the government - you simply slash those pesky repair budgets and let things gradually decay.

I only know the details in Seattle, but it seems like the same retarded contradictory outroar goes up all over America. One week it's "the government isn't taking care of the streets well enough!" and the next week it's "we need to cut government spending!". Uhh.. do you guys just not see the connection between these two issues? You can moan about taxes or you can moan about shitty government services, but you can't moan about both! (at least not in a place like WA where there *are* no taxes).

In other "you all are retarded about taxes" news; WA is desperately out of money at all levels of government, services are being cut everywhere, and of course the government is turning to horrible non-tax ways of raising money; all sorts of fees are going up, Seattle is increasing the parking meter rate dramatically, and our very limited number of public prosecutors are now being assigned to traffic court to get revenue : Why It Just Got a Lot Harder to Fight a Traffic Ticket in King County - Seattle News - The Daily Weekly
Kitsap aims to turn traffic court into money maker
So watch out WA speeders, it may no longer be quite so trivial to make tickets disappear here.

In other news, last week the NYT ran this amazing chart of stock return for various entry and exit years . It's a beautiful piece of info-graphics.

01-08-11 | Random Music Stuff

In the recent pop music category, Tame Impala and Caribou are my faves :

YouTube - Tame Impala - Solitude is Bliss
YouTube - CARIBOU - Odessa

(I actually am doing a pretty good job these days of completely cutting myself off from pop media like TV and radio, so I have very little concept of what is "overplayed" and thus uncool; it lets me just listen to what I like without caring if it's too mainstream to be cool enough for me, which is nice)

There's so much retro-remake 80's synth shit going on right now; I don't know why you would listen to that when you could just go to the originals :

YouTube - Visage - Fade to Grey
YouTube - Sharevari @ The Scene, Detroit (Remastered)
YouTube - Bruce Haack Party Machine
(BTW the documentary on Bruce Haack is pretty great)

Decent sexy groove, a bit cheezy :

YouTube - The Weeknd - What You Need

This is the one that really stuck from my old prog searches, the song just carries you on this epic journey, takes you away some place weird, then brings you home to rock :

YouTube - Room - Andromeda (1970)

I watched Lila Says which was a pretty shit movie, but it reminded me how amazing this song is :

YouTube - ( Air - ) Run

Digging dubstep in general; it's amusing to compare the Ed Solo version to the original :

YouTube - Ed Solo - Age of Dub

Dig this heavy slow jam :
Sukh Knight - Clotted Cream
Sukh Knight - Jewel Thief

Some other electronica that's good :

YouTube - Moderat - Rusty Nails
YouTube - Burial & Four Tet - Moth

We got a record player for christmas, and it's fun finding old shit and rediscovering the magic of simple stereo-mastered vinyl. Older music sounds so much better, I think you have to go before roughly 1980 to get stuff that hasn't been ruined by studios and producers (plus, modern shit is mixed in really weird ways so it sounds okay on headphones and 7.1 systems and such, you want to find good old stereo mixes). It's so raw and immediate, it sounds like you're in the same room as the band. All that horrible classic rock that they play constantly on the radio sounds so amazing on vinyl with a plain old analog stereo amp. Anyway, some random shit that I rediscovered the greatness of through the vinyl :

YouTube - Eddie Money - Baby Hold On
YouTube - Be My Friend by Free - their best-ever version
Free - Woman ; Free is so fucking great I can't get over it

01-08-11 | Random Car Stuff

There's a whole industry of schadenfreude videos of winter car crashes (for example : Watch An Icy 20-Car Pileup As It Happens ). The thing that really strikes me when I watch this shit is that these are not "accidents". These are largely intentional crashes. That is, first of all the driver chose to drive on a day that his car is woefully unprepared for, then second of all (and worst) he saw a street where there is a huge pileup from other people crashing and he doesn't just stop and reconsider trying to go up/down that hill, he thinks "ha ha those guys all crashed, but I'll be fine". The car damage should not be covered by insurance, when you intentionally crash your car you should have to pay out of pocket. You morons.

Interesting bit in here about the GT3 RSR - they actually move the engine forward on the race car (compared to the street GT3) : JB tests drives the GT3 and interviews Patrick Long for Jay Leno's Garage JustinBell.com - Justin Bell's Official Website
Because of the extra-wide track of the wide body Porsches, they can actually get the engine between the wheels, it doesn't have to be behind the rear axle. Similarly in the M3 GT2 race car, they move the engine down and back. BMW M3 GT Race Car - Feature Test - Car and Driver
By regulation the cabin firewall has to be in the same place, so they get it as close to the driver as possible. In both cases I wonder why they don't do similar things for the road cars (things like the M3 GTS or the Porsche GT3 are not as close to the race cars as they want you to believe).

There are more "Corvette fail" videos on Youtube than any other car type; not sure if that more reflects the car or the average owner. Certainly if you see a Corvette car club coming down the road - take cover !
YouTube - Corvette oops
YouTube - Corvette Fail

This sarcastic Corvette review walks a fine line between funny and excessive douchery : YouTube - 200mph in Corvette ZO6. Fun or Fail ; it's hard to make fun of others' douchery when you yourself are a humongous douche

Sometimes I think a Lotus might be cool (though the Evora S seems to still not exist in America), but then I am reminded that it's not a real car. It's a British sports car, which means it's a few bits of plastic held together with paper clips. For example :
Warranty Work - LotusTalk - The Lotus Cars Community
3 days of raining waterlogged my Evora. - LotusTalk - The Lotus Cars Community
How hard is it really to buy a Toyota engine and some standard 3rd party brakes and suspension bits and tack them onto some aluminum bars? Seriously Lotus?

I think the E86 BMW M Coupe (2006-2008, Z4 based, 330 hp, 3200 pounds) is pretty sweet, though I guess a modern Cayman just beats it in almost every way. The earlier weird-looking ones (though sort of adorable for their ugliness) had engine problems apparently. I test drove one a while ago and thought it felt a bit weird to be sitting so far back on the chassis, and also the amount of cargo space is too small.
BMW M Coupe M Roadster Resource Network - MCoupe.com - BMW M Coupe, M Roadster Z3 & Z4 Resource Network

Mazda RX8's have a lot of problems with their engine. Mazda stepped up and took the rare and amazing action of extending the engine warranty to 8 years / 100k miles. Nice one Mazda!
Mazda expects to recall RX-8s - RotaryNews.com
Mazda RX 8's Engine Failure Problem - Mazda Discussions at Automotive.com
Mazda extends rotary warranty on RX-8 to 100k miles — Autoblog

There are various reports around of coking/deposit problems on intake valves with the new DFI engines that are in lots of modern cars, for example : (very amazing pictures in this link :)
Journal N54 Total Engine Rebuild & Upgraded Internals - BMW 3-Series (E90 E92) Forum - E90Post.com

I love the BMW E30 M3, it's the greatest M3 IMO. (the good BMW design is all boxy angles, with forward slanting nose, ala the 2002, and a rectangular grill, not the dual-butt-chin bangled bullshit) (people are putting modern S54 engines in them now). It's a chance to watch this video, one of my favorite rally clips of all time : Youtube - Patrick Snijers - E30 Rally

And some more crazy build/mod stuff :

Ran When Parked
JDM Legends - Vintage Japanese Car Sales and Restoration Vehicles for Sale
2001 porsche 996 ls1 conversion - LS1TECH

Aside : this is a real dumb addiction I've fallen into recently. I always seem to have some dumb thing that I think about way more than I should, but cars are particularly bad because 1. they're very expensive and 2. being a "car guy" is repellant to women (and good men). Car guys are either white trash (with cars up on blocks), nerds (who drive an imported JDM R33 GTR and can give you lectures on the correct racing line but promptly crash when they try to execute it), or insecure yuppies (small-penis midlife-crisis better-than-the-joneses).

01-05-11 | QuadraticExtremum

In today's story I do someone's homework for them :

QuadraticExtremum gives you the extremum (either minimum or maximum)
of the quadratic that passes through the three points x0y0,x1y1,x2y2
inline double QuadraticExtremum(
    double x0,double y0,
    double x1,double y1,
    double x2,double y2)
{
    // warning : this form is pretty but it's numerically very bad
    //  when the three points are near each other
    double s2 = x2*x2;
    double s1 = x1*x1;
    double s0 = x0*x0;
    double numer = y0*(s1-s2) + y1*(s2-s0) + y2*(s0-s1);
    double denom = y0*(x1-x2) + y1*(x2-x0) + y2*(x0-x1);
    double ret = 0.5 * numer / denom;
    return ret;
}
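As a quick sanity check (my own example, not from the original post): the parabola through (0,1),(1,0),(2,1) is y = (x-1)^2, whose minimum is at x = 1. The function is repeated here so the snippet stands alone:

```cpp
#include <cassert>
#include <cmath>

// copy of QuadraticExtremum from above, repeated so this snippet is self-contained
inline double QuadraticExtremum(
    double x0,double y0,
    double x1,double y1,
    double x2,double y2)
{
    double s2 = x2*x2;
    double s1 = x1*x1;
    double s0 = x0*x0;
    double numer = y0*(s1-s2) + y1*(s2-s0) + y2*(s0-s1);
    double denom = y0*(x1-x2) + y1*(x2-x0) + y2*(x0-x1);
    return 0.5 * numer / denom;
}
```

QuadraticExtremum(0,1, 1,0, 2,1) returns 1.0 ; note it returns a maximum just as happily as a minimum, since it only solves for the stationary point.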

01-05-11 | Golden 1d Searches

GoldenSearch1d finds the minimum of some function if you know the finite range to look in. Search1d_ExpandingThenGoldenDown looks in the infinite interval >= 0 . Pretty trivial thing, but handy.

The Golden ratio arises from doing a four point search and trying to reuse evaluations. If your four points are { 0, rho, 1-rho, 1 } and you shrink to the lower three { 0, rho, 1-rho }, then you want to reuse the rho evaluation as your new high interior point, so you require (1-rho)^2 = rho , hence rho = (3-sqrt(5))/2 and 1-rho = (sqrt(5)-1)/2
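Spelling out that little bit of algebra (my arithmetic): (1-rho)^2 = rho expands to rho^2 - 3 rho + 1 = 0, whose root in (0,1) is rho = (3 - sqrt(5))/2 ~= 0.381966, which is the constant used in the code, and then 1-rho = (sqrt(5)-1)/2 ~= 0.618034, the golden ratio conjugate:

```cpp
#include <cassert>
#include <cmath>

// root of rho^2 - 3*rho + 1 = 0 in (0,1) :
static double GoldenRho() { return (3.0 - sqrt(5.0)) / 2.0; }
```

GoldenRho() satisfies (1-rho)^2 == rho to machine precision, which is exactly the evaluation-reuse condition above.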

BTW because you have various points you could obviously use interpolation search (either linear or quadratic) at various places here and reduce the number of evaluations needed to converge. See Brent's method or Dekker's method .

Also obviously this kind of stuff only works on functions with simple minima.

Also obviously if func is analytic and you can take derivatives of it there are better ways. I use this for evaluating things that aren't simple functions, but have nice shapes (such as running my image compressor with different quantizers).

template< typename t_functor >  
double GoldenSearch1d( t_functor func, double lo, double v_lo, double hi, double v_hi, double minstep )
{
    const double rho = 0.381966;
    const double irho = 1.0 - rho; // = (sqrt(5)-1)/2 

    // four points :
    // [lo,m1,m2,hi]

    double m1 = irho*lo + rho*hi;
    double m2 = irho*hi + rho*lo;

    double v_m1 = func( m1 );
    double v_m2 = func( m2 );

    while( (m1-lo) > minstep )
    {
        // step to [lo,m1,m2] or [m1,m2,hi]
        // only one func eval per iteration :
        if ( MIN(v_lo,v_m1) < MIN(v_hi,v_m2) )
        {
            hi = m2; v_hi = v_m2;
            m2 = m1; v_m2 = v_m1;
            m1 = irho*lo + rho*hi;
            v_m1 = func( m1 );
        }
        else
        {
            lo = m1; v_lo = v_m1;
            m1 = m2; v_m1 = v_m2;
            m2 = irho*hi + rho*lo;
            v_m2 = func( m2 );
        }

        ASSERT( fequal(m2, irho*hi + rho*lo) );
        ASSERT( fequal(m1, irho*lo + rho*hi) );
    }

    // could do a cubic fit with the 4 samples I have now
    //  but they're close together so would be numerically unstable
    //return (lo+hi)/2.0;

    // return best of the 4 samples :
    if ( v_lo < v_m1 ) { v_m1 = v_lo; m1 = lo; }
    if ( v_hi < v_m2 ) { v_m2 = v_hi; m2 = hi; }
    if ( v_m1 < v_m2 ) return m1;
    else return m2;
}

template< typename t_functor >  
double GoldenSearch1d( t_functor func, double lo, double hi, double minstep )
{
    double v_lo = func( lo );
    double v_hi = func( hi );
    return GoldenSearch1d( func, lo, v_lo, hi, v_hi, minstep );
}

template< typename t_functor >  
double Search1d_ExpandingThenGoldenDown( t_functor func, double start, double step, double minstep , const int min_steps = 8)
{
    struct Triple
    {
        double t0,f0;
        double t1,f1;
        double t2,f2;
    };

    Triple cur;
    cur.t2 = cur.t1 = cur.t0 = start;
    cur.f2 = cur.f1 = cur.f0 = func( start );

    int steps = 0;
    Triple best = cur;

    for(;;)
    {
        cur.t0 = cur.t1; 
        cur.f0 = cur.f1;
        cur.t1 = cur.t2; 
        cur.f1 = cur.f2;
        cur.t2 = cur.t1 + step;
        cur.f2 = func( cur.t2 );
        steps++;

        if ( cur.f1 <= best.f1 )
            best = cur;

        // if we got worse and we're past min_steps, stop expanding :
        if ( cur.f2 > cur.f1 && steps > min_steps )
            break;

        const double golden_growth = 1.618034; // 1/rho - 1
        step *= golden_growth; // grow step by some amount
    }

    // best is at t1 bracketed in [t0,t2]   
    // could save one function eval by passing in t1,f1 as well
    return GoldenSearch1d(func,best.t0,best.f0,best.t2,best.f2,minstep);
}

Usage example :

#define MAKE_FUNCTOR(type,func) \
struct STRING_JOIN(func,_functor) { \
  type operator() (type x) { return func(x); } \
};

double TFunc( double x )
{
    return 100 / ( x + 1) + x;
}

MAKE_FUNCTOR(double,TFunc);

int main(int argc,char *argv[])
{
    double t = Search1d_ExpandingThenGoldenDown(TFunc_functor(),0.0,1.0,0.0001);

    lprintf(TFunc(t) , "\n");

    return 0;
}

More :

10/2010 to 01/2011
01/2010 to 10/2010
01/2009 to 12/2009
10/2008 to 01/2009
08/2008 to 10/2008
03/2008 to 08/2008
11/2007 to 03/2008
07/2006 to 11/2007
12/2005 to 07/2006
06/2005 to 12/2005
01/1999 to 06/2005

back to cbloom.com