Old Rants 5


The older rants are semi-regularly moved off this page. You can always read the old rants :
07/2006 to 11/2007
12/2005 to 07/2006
06/2005 to 12/2005
01/1999 to 06/2005
You can see some of my photos at Flickr .


03-25-08

If you want to Yelp and have it actually be useful for you, here's what you do :

1. Make an account.

2. Click "Member Search" in the upper right, and enter "Toro E". Click on his name to visit his profile.

3. In the left side bar, click "Add To Favorites".

4. Now search for places. The review from Toro Eater should be on top. Read it. Ignore the average rating and all other reviews. If there's no review from Toro Eater it probably sucks.

5. Yes, Toro Eater is pretty much only in San Francisco. If you don't live in San Francisco, every place probably sucks. Live somewhere better.


03-25-08

The Tenderloin is the scariest, but also perhaps the most fun neighborhood in SF.

My suggested itinerary :

check out art at Shooting Gallery / White Walls
eat at Bodega Bistro, Pagolac, Thai House Express or Lahore Karachi
get drinks at Bourbon & Branch , or Rye, or Koko
go clubbing at 222 or see a band at the GAMH


03-25-08

I walked around the park today. The cherry blossoms are amazing right now. Obviously there are great specimens in the Japanese Tea Gardens, but that place is mobbed with tourists, even on a gray weekday; I can only imagine the horror on a sunny weekend. If you just walk around the park randomly near there, there are lots more good ones, and the Botanical Garden right there also has tons of cherry blossoms and lots of other random flowering stuff.

Also saw them setting up the new Chihuly exhibit at the de Young. It doesn't open until June or something, and there will be a bunch of it at the Legion of Honor as well. There were tons of boxes marked Chihuly piled up, so it looks like it should be pretty rad.

I love skipping.


03-25-08

Thoughts on intrusive vs. non-intrusive ref counting : (intrusive = refcounted base class, non-intrusive = boost::shared_ptr or equivalent)

We'll do pros & cons in a second, but perhaps the biggest difference is philosophical. Intrusive ref counting forces you to use ref counting & smart pointers to manage those objects. Non-intrusive lets you use them or not, and lets you use a mix of different lifetime management schemes. Some people consider this an advantage for non-intrusive, but personally I consider it a big advantage for intrusive. I consider it to be much better to have a single global clear rule for lifetime management so that you can go into any part of the codebase and things are being done the same way and you are familiar with the gotchas and how things work. Also if I design some classes and managers, I consider it a big advantage that I can force my clients to manage the objects the way I intended and not do whatever they please with them. In general in coding I'd rather have fewer options as long as they are reasonable options, and be able to enforce usage that I know is correct, as opposed to allowing a variety of uses and leaving it up to the client code to be correct.

Non-Intrusive Pros/Cons :

Pro : Can easily refcount classes from outside your codebase, like FILE or Windows/OS objects, or whatever. Doing the same with an Intrusive system would require a wrapper, though it's a pretty trivial template wrapper, so this isn't really a huge difference.

Con : cannot (easily) get from the object to its refcount or smart pointer. That means you can never get from a naked pointer to the controlled pointer. That means any function which could affect lifetime, or could call anything that affects lifetime, must take a smart pointer as an argument. That in turn means you pretty much have to use smart pointers everywhere, which means the celebrated flexibility of being able to use different types of lifetime management is a bit of a phantom. Also means that all those functions that take smart pointers can't be called from object members because "this" is not a smart pointer.

Pro : becoming pretty common through boost so there's a lot of familiarity and code out there that uses this stuff.

Con : speed and memory use overhead for the extra shared refcount. Obviously it uses some kind of fast pooled allocator, but it's still off somewhere else in memory space, it's not right with the object, which is ugly.

Con : because the object does not own its ref count, it can't verify that it's being counted correctly or managed correctly; in general no way to enforce proper lifetime management because you don't know how your object is being managed, it's left to the client.

Intrusive Pros/Cons :

Pro : Can easily go from naked pointers to smart pointers, which means you can just pass naked pointers in functions, and also use "this". If you choose, the speed hit or threading performance hit can be very very low, because you only need to actually use smart pointers when you are doing something that affects lifetime - otherwise you can just use naked pointers.

Con : a bit ugly with multiple inheritance; either need to use virtual inheritance to refcounted base, or use macros to make concrete classes refcountable. Non-intrusive doesn't have this problem, you can just refcount whatever multiple inherited thing you want.

more ... ??? @@
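To make the intrusive style concrete, here's a minimal single-threaded sketch (my own illustration; the names RefCounted / SmartPtr are made up here and not from any particular library) :

class RefCounted
{
public:
	void AddRef()  { m_refCount++; }
	void Release() { if ( --m_refCount == 0 ) delete this; }
protected:
	RefCounted() : m_refCount(0) { }
	virtual ~RefCounted() { }	// protected dtor : clients can't just delete
private:
	int m_refCount;
};

template <typename T> class SmartPtr
{
public:
	SmartPtr(T * p = 0) : m_ptr(p) { if ( m_ptr ) m_ptr->AddRef(); }
	SmartPtr(const SmartPtr & o) : m_ptr(o.m_ptr) { if ( m_ptr ) m_ptr->AddRef(); }
	~SmartPtr() { if ( m_ptr ) m_ptr->Release(); }
	SmartPtr & operator=(const SmartPtr & o)
	{
		if ( o.m_ptr ) o.m_ptr->AddRef();	// addref first so self-assignment is safe
		if ( m_ptr ) m_ptr->Release();
		m_ptr = o.m_ptr;
		return *this;
	}
	T * Get() const { return m_ptr; }
	T * operator->() const { return m_ptr; }
private:
	T * m_ptr;
};

The object carries its own count, so you can freely convert between naked pointers and SmartPtrs, which is the whole point of the intrusive style.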


03-25-08

I thought of a good sound bite for my code design philosophy : "Compilation equals correctness".

Of course correctness does not necessarily mean you are bug free, but the idea is that any "policies" which are required by your classes or functions should be enforced via the compiler as much as is reasonably possible. Incorrect coding should show up as a compiler error, not simply as something that you have to know is incorrect because you read the comment and know that "hey, that's the wrong way to use that thing".
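For example (my own made-up illustration, reusing the RefCounted / SmartPtr sketch from the ref counting rant above) : if a class is only supposed to exist on the heap and be lifetime-managed by smart pointers, make the compiler enforce that instead of a comment :

class Thing : public RefCounted
{
public:
	static SmartPtr<Thing> Create() { return SmartPtr<Thing>( new Thing ); }	// the only way to make one
private:
	Thing() { }		// private ctor : no stack objects, no raw "new Thing" in client code
	~Thing() { }	// private dtor : no raw "delete" in client code
};

Client code that tries "Thing t;" or "delete pThing;" just doesn't compile, which is exactly the kind of policy enforcement I mean.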


03-25-08

GOD DAMN MY 5 AM WAKING UP NEIGHBOR! Fuck me, I have to move, this is fucking awful. I think she's a lesbian, but I have very little evidence so far, so that's just wild speculation at this point. Also it seems she's rich; she uses a car service all the time. Car service is one of those things that doesn't actually cost much more than a taxi, but only rich people do it.


03-24-08

I just realized I have to tell girls I'm dating about my web page like it's an STD. I ask them to sit down. "There's something I have to tell you. You're going to find this out eventually, so I thought you should hear it from me. Umm ... I have a blog."


03-24-08

I did my Google interview today. It was pretty rough, it's just a long time to be talking to strangers. I want to write about it more but I'm scared of the Google NDA death squad coming after me. I got woken up repeatedly in the middle of the night last night by my neighbors, so I woke up feeling awful and was really in bad shape in the morning. I think I did pretty poorly on the first two interviews, but then did pretty well on the last two.

Google basically picks four engineers semi-randomly to grill you. That's good in that you get to meet a few different people, and they have different questions and areas of expertise, so maybe as a whole they can get a better picture of you. That's bad in that they aren't necessarily people who are particularly good at interviewing. Doing a good tech interview is a difficult skill that takes a lot of thought and practice. Some of the guys were pretty good interviewers; some were not.

I got a few really good questions that I could tell were well thought out and made me think and work out how to do things and demonstrate competence; I'd like to post them because they were cool, but I'm afraid.

I also got two of the classic bad tech interview types of questions :

A) the brain teaser : present the candidate with a clever question that's really quite easy if they see the answer, but is just a stumper that you can't really work out unless you see the trick. These questions are horrifically bad interview questions, they don't test intelligence (except a certain kind of riddle-solving intelligence) or problem solving skills.

B) the "one true way" question : in this you ask a rather open question with various decent answers, but in your head you have an idea of the "one true way" that you believe is superior, and you try to prod and lead the candidate to come to your solution. These questions only succceed in telling whether the candidate has the same coding background and priorities as the questioner. A better way to test the same point is to just go ahead and tell them the way you're thinking of (after they give their own solution) then ask for the pros & cons or why/when you might use one or the other of the different ways to see if they really understand it.

I also have a general mental problem in those kind of situations where I tend to not give the really simple obvious answer. For simple quiz questions that should have a really simple answer, I tend to overcomplicate them. Like "how would you store 3 names?" , I'm like "well, vector < string > " , when I should just go "char * names[3]". I also spaced on what a semaphore is exactly, so :(


03-22-08

Jordan Chavez wrote me to point out that the Weak Pointer implementation I have in cblib is pretty dumb. Yeah, he's right. Let me explain a bit. I wrote an article on smart & weak pointers long ago. I think they're rad. I'm going to assume you have an intrusive refcount because it's just so superior to non-intrusive.

Now so far as I can tell there are two major styles of weak pointer. One is where the WeakPtr actually holds a pointer straight to the object and gets cleaned up intrusively when the object dies. The other is where the WeakPtr has some kind of indirect handle to the object which becomes an invalid handle when the object goes away.

The first way where the WeakPtr actually points at the object has one big advantage : pointer accesses are very fast, in fact they're free. There's no need to look up a handle, if you have a pointer you just use it. Of course it is slower to destruct an object, because you have to clean up your weak pointers. There are various ways to implement this, but they pretty much all have pretty high memory use overhead. My Galaxy3 has a WeakPtr of this style, implemented using a doubly linked list to clean up the weaks.

The second way with a handle can be very compact in memory, but has the disadvantage of dirtying cache for any weak lookup because you have to go through an extra indirection. The implementation in cblib is based on what I did for Oddworld, and it's intended to be as small as possible in a fixed-memory console environment. On the console, the WeakPtr is 4 bytes per WeakPtr + 6 bytes per object, but you are limited to 64k objects max. The version in cblib is just a lazy port of that to 32bit indexes so it's got 8 byte WeakPtrs which is a waste. Note that because of the GUID system in the cblib version, weak table entries are reused right away, so you only need as many weak table entries as you have objects, so I can statically allocate a 64k-element weak reference table.

Jordan's suggestion is basically like this :

struct WeakPtr
{
	int index; // index to weak table
};

struct WeakTableEntry
{
	RefCounted * ptr;	// NULL once the object dies
	short weakCount;	// how many WeakPtrs reference this slot
};
To use the WeakPtr you just look it up in the table with [index] and use the [ptr] there. When an object dies, you set its [ptr] to NULL and leave the entry in the weak table. The [weakCount] counts how many weak pointers point at that slot so you don't reuse the slot until the weak pointers go away.

WeakPtrs can also update themselves for free; if you use the [index] to look up the table and you find a [ptr] that's NULL, then you just null out your own index and decrement [weakCount]. This is a lazy way of eventually releasing all the weak references to a slot.
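A sketch of what that lazy lookup might look like (rough single-threaded illustration with invented names; a thread-safe version would hand back a smart pointer instead, as discussed below) :

static const int WEAK_NULL_INDEX = -1;

RefCounted * WeakPtr::GetPtr()
{
	if ( index == WEAK_NULL_INDEX )
		return NULL;
	WeakTableEntry & entry = g_weakTable[ index ];
	if ( entry.ptr == NULL )
	{
		// the object died; lazily release our claim on the slot
		entry.weakCount--;
		index = WEAK_NULL_INDEX;
		return NULL;
	}
	return entry.ptr;
}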

In theory there are pathological things you could do to have a huge amount of unnecessary weak table entries, but that's very unlikely in practice. This method is 4 bytes for a weak pointer and 6 bytes per weak table entry - but you can't just statically allocate a 64k entry weak table for a max object count of 64k because there will be empties in your weak table.

Smart Pointers are falling a bit out of favor (not that they ever were IN favor) because of multi-threading. Multi-threading is ubiquitous in games these days so we have to handle it. Passing refcounted objects around between threads is totally fine - it just means you need to thread-protect all your ref access. This also means thread protecting the smart pointer constructor & destructor that make the ref count go up and down, as well as the weak pointer's accesses to the weak table. This can be a bit of a performance hit. In fact the weak table is especially ugly because it's memory that all your processors would have to share. The linked-list style weak pointer is actually better for multiprocessor memory access, assuming that object lookup is far more common than object deletion.
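FWIW the refcount part of that thread protection doesn't need a full mutex, interlocked ops are enough; a rough sketch (using std::atomic purely for illustration; the weak table itself still needs its own protection) :

#include <atomic>

class RefCountedMT
{
public:
	void AddRef()  { m_refCount.fetch_add(1, std::memory_order_relaxed); }
	void Release()
	{
		if ( m_refCount.fetch_sub(1, std::memory_order_acq_rel) == 1 )
			delete this;
	}
protected:
	RefCountedMT() : m_refCount(0) { }
	virtual ~RefCountedMT() { }
private:
	std::atomic<int> m_refCount;
};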

Now, this doesn't have to be a big performance problem, because I've always been a fan of the "ref holder" style of smart pointer use, rather than the full Herb Sutter exception-safe "everything takes smart pointers" style. The ref holder style looks like :

void MyFunc()
{
	SmartPtr sp = world.GetSomeObject();
	thingy * ptr = sp.Get();
	...
	... do stuff on "ptr" ...
	...
}
basically we're just holding sp in the function scope to hold a ref, but all our work is on a plain old naked pointer "ptr". The sp would cause one interlock to take a ref at the top and release one at the bottom, no biggy unless your function is tiny.

There are surely other threading issues that maybe people with more experience can chime in on.

For example, to be thread safe, there should not be any "Get()" function on the WeakPtr to return a naked pointer, you can only return a smart pointer and have an IsNull check, to avoid the cases where your WeakPtr returns a pointer but the object gets deleted by another thread before you use it.


03-22-08

It's a good day in my world. My cold is gone. My left shoulder is almost better. The sun is out. I've been hiking and biking. I did some pretty sweet tricking, such as vaults and kick-jumps. I've been practicing the bike balance a lot, slowly getting better, working up to 5 second balances, but today it suddenly clicked for me and I did a 30 second balance. I almost don't mind red lights and stop signs because I can practice (and show off) my sweet bike stand.


03-22-08

"No Country for Old Men" was really masterfully executed, and Javier Bardem is so creepy in it, but at the same time, it's just elegant emptiness. It says nothing, and it's just a standard ( Texas Sheriff / take the drug money and run / serial killer ) movie. I guess the derivative boringness of it is sort of intentional, the Coen Bros have always loved classic Hollywood movie formulas and many of their movies are just homages to standardized forms.


03-22-08

I've been doing a lot of heavy deep squatting and high box jumping recently, and my knees are happier than they've been in years. The vast majority of people make the mistake of "taking it easy" on their damaged body parts. They have iffy knees or whatever and their response is to not bend them much, never do anything too athletic with them, just not push them. That provides a temporary decrease in discomfort, but it actually makes the joint even less healthy (because it tightens up, the muscles atrophy, blood flow decreases, etc.). What you want to do is to continue to use the joint as heavily as possible, but without aggravating it. You achieve that through controlled stresses.

Addendum : on the other hand I'm a retard and just moved my bed around by myself. That's a good example of a non-controlled stress which can suddenly put excessive forces on body parts that aren't well suited to it.


03-22-08

All these rescue chef shows are retarded because the stars are just bad teachers. They use this teaching method where they just say "don't do what you were doing, let me show you a whole new way". The lesson is not related at all to the student's experience, it's just a presentation of the "right way". That's a horrible way of teaching, but it's very common. People learn much faster if you work with the way they've already figured things out. The student is already trying to do it a certain way and has reasoned out thoughts on the subject. If you can look at what they're doing and go "okay, I see what you were going for, but here's what you did wrong.." or, "okay, I get your way of thinking about the problem, but actually you're missing this key point.." people learn much much faster if you do that.


03-22-08

I've never presented my food well. In fact I almost intentionally present it badly, just piles of slop on an earthen plate. I feel like the food should be judged by its flavor, not the way it looks, and I'm bothered with all the amateur photographers who make mediocre food but make it look gorgeous and everyone drools over it. Of course my attitude is just retarded. I've taken a similar attitude with my personal presentation as well, my fashion & coiffure and so on. I always thought I should be judged by my actions, and not all those other trappings that people care so much about, and I rebelled against them by intentionally being frumpy. Of course that's totally retarded. People do care how the food is presented, even the smart people who mainly care about flavor, even the people that I want to impress, of course they are still affected by presentation, both subconsciously and consciously, and to intentionally flub that aspect just makes you rate lower than you deserve.


03-21-08

I found a nice general page about shoulder health . It's a good read for any athlete, since prevention is so much better than recovery.


03-21-08

Fucking hell. My new neighbor that just moved in has some job where she wakes up at 5 AM and gets ready then goes out slamming the front door, starts up her car and drives off around 6 AM. I know all the details because I sit there in misery listening to each sound and watching the clock tick by. Fucking people should not be allowed to live in my building if they keep fucked hours like that.

In other "this fucking building sucks and I know more about my neighbors than I want" , I talked to the boyfriend of Slutty Girl. He seems like a really sweet intelligent guy and it's made me feel even worse about the situation. I also found out more about the story. They've been dating a while. They met up in the wine country when they were both living there. He owns a house up there somehow and they lived together there. They just got bored and decided to get an apartment in the city here, but he still owns the house and goes up there a few days each week, which is why he's often not here. She got a job here and stays in the city all week. Now it all makes sense; almost every day when he goes back out to the country, she has some random guy over. It's so disturbing. I guess part of what disturbs me so much is that the boyfriend is a lot like me, sort of naive and trusting and easily run over, and Slutty Girl would seem totally normal if I didn't know about her secret life. Any of my past or future girlfriends could be just like her, and I'd never know.


03-20-08

I haven't actually watched a movie in months. I put the movie in the player, watch about 15 minutes, get antsy and bored, boot up the computer and start writing and surfing the web while the movie plays. It really doesn't give the movie a fair viewing, but FUCK sitting and watching something is so fucking boring. Sometimes I'll do my shoulder physical therapy while watching, and that works okay. The only way I can actually sit and watch a movie is if I have a girl leaning against me to give my hands something fun to do. Yeah I guess I have ADD really bad.


03-20-08

MapJack reminds me of those old dungeon crawl games like Wizardry or Dungeon Master. It would be pretty fucking rad to mod mapjack to give you a character and have random monsters pop out as you walk around the dots.

It's also rad you can link to specific views like : this cool stairway near my house


03-20-08

Despite all the internet information sites, there's still no substitute for massive amounts of learned local information and the human memory. Part of the problem is that when you're trying to think of a good place to go, the criteria in your head are not so simple. You don't just think "I need a decent restaurant in Hayes Valley". You think "I need a decent restaurant somewhere near Hayes Valley (though I would go farther if necessary), it should be mid-range priced, classy but not stuffy or a big scene, I need to be able to get in without a reservation, it should be decently small and cozy, not a big cavernous noisy mess, somewhere I can wear jeans and a t-shirt and not feel under-dressed". There's no way to really search like that on the web and weight all the different factors. In your head you can store all this local knowledge of good places, and you can do a really nice weighting, so maybe you pick a place that doesn't even fit a single one of the criteria exactly, but it's pretty close on all of them.

After about 1.5 years in SF I'm still not even close to having the necessary local knowledge.


03-20-08

I just found a ruined Calphalon pan on the street and refinished it. It takes about an hour to resurface a pan and quite a bit of muscle work. How to restore the surface of a metal pan :

Start with a metal scouring pad (brass is good, an SOS pad is okay too). Scrub HARD mainly in a back-and-forth manner. The goal is not to get the dirt off, it's to actually remove the top layer of metal from the pan's surface, so you have to scrub hard hard hard and a long time - 5 minutes or so. Make sure you get in the corners too. There should be a large amount of fine metal powder building up in the pan, wipe it out with a towel and move on to the next step :

Metal sanding paper. You want at least 2 and preferably 3 different grits of sand paper, you want the kind made specifically for sanding metal. Start with the coarsest grit and again sand HARD all over. This is where you attack any bumps or uneven spots and try to get a smooth surface. Start with back-and-forth but finish with round-in-circles sanding. Wipe out metal filings with a towel and progress to the next smoother grit. Sand some more. As you get to the finest grit you want to start worrying about your sanding pattern. You don't want to leave big grooves in straight lines from sanding as that will make the food stick to the pan. Your goal is a mirror-like smooth finish. Try to use semi-randomized circular motions so that you are smoothing everywhere and not digging big grooves with hard back & forth sanding. When you think you've done enough, now do that same amount again. Wipe out the very fine metal dust with a towel.

Your pan should now be smooth, but don't try to use it yet. It's still full of fine metal dust. First wash with hot soapy water, and towel dry. Then pour in a little oil and rub it around the pan, then wipe out all the oil with a paper towel. If your pan is cast iron, you need to cure it. It doesn't hurt to cure other pans either. Put a bunch of shortening (Crisco) in the pan to coat, stick it in a 350 degree oven for 30 minutes (or 300 for 2 hours, meh). Wipe out the grease after it cools.

Finally we want to get the last bit of metal shavings out. We do this by cooking something to coat and throwing it out. Making a batch of scrambled eggs works fine for this, just cook them up and throw them out. Alternatively I think making a roux would work too and be cheaper. Your pan is now good as new.


03-20-08

Dan never screwed the cap back on the toothpaste after using it. I never said anything; I mean I specifically chose not to complain about it; I feel like I tend to be that jerk who has to have everything his way and is constantly pointing it out when you do something in a way that I consider the wrong way, and it just gets really annoying to live with me because I'm always nagging about something. So I try to stop myself from being that guy. Some things that matter (like don't use my good chef's knife on any surface but wood) I will nag about but other things that I decide I can let slide (like the cap on the toothpaste) I try to just not say anything about. But the truth is it always bugged me. Every time I walked into the bathroom and saw the cap off the toothpaste it felt like a finger poking me in the ribs. I would often screw the cap on myself. And now that Dan is gone, it's one of the odd little things that I'm happy about. Ahh, every time I walk into the bathroom the cap is on the toothpaste. So relaxing. BTW I've learned absolutely nothing from this anecdote.


03-19-08

Went to Briones park today; put some pictures up in my Flickr. It was pretty wonderful. Read real descriptions here : Gambolin & Kevin . I got kind of lost and accidentally did a 12 mile loop instead of 7 miles. Oops. Also I did a clean box jump onto the back of a bench and stuck the balance which was pretty sweet. There's lots of big meadows and hills there that you can just wander around off the trails. I wouldn't go there any time but spring, though.

I was trying to think of what hikes or natural things I really need to do while it's still lovely spring time. Down in San Luis Obispo the spring is really the only amazing time of year and the rest of the time it's kind of shitty. Up here that's not so true, the redwood parks are better in summer, there's not really a big green flush and wildflower bloom up here. Maybe I'll try to get down to Sunol Wilderness or Henry Coe, but they're so far from the city. Actually San Bruno Mountain right here is really nice in spring, I should try to do that before it gets dry.

I'd never been out the 24 there before. The little pocket around Orinda to Lafayette is really gorgeous, rolling hills and lots of greenery. I'm sure it's full of fucking disgusting tech company yuppie types with their knobby tired self-balancing strollers and biodegradable SUV's. But BART goes right through there which would make it a pretty sweet place to live when I'm old and married.


03-19-08

I've been thinking a lot about Danielle and Tiffiny lately. I try to keep them out of my head to move on, but that's not really possible. I'm trying to make myself more emotionally open and honest, and that's not something you can do selectively, when you open the gates everything comes through. I just learned this word saudade which is pretty much what I feel for them. It's something I'm okay with. I don't think about them constantly and it's not holding me back in any way, it's just a slight sadness that's always present in my mind which I greet and acknowledge and choose not to pay too much attention to.

I wish I had more pictures from my past. I've never been good at taking pictures, it just seems like a pointless nuisance at the time, but after the fact I wish I had them. At the time I think "I'll never forget this, it will be in my memory forever" but in fact the memory fades and becomes only an outline - or a memory of when I could remember it.


03-18-08

On Unit Interval Remappers

Some people have looked at the "controller/lerper" problem and mentioned interpolators. Interpolators are great things but they are not useful as controllers, because the target is not necessarily constant, and even the "current" object is not under 100% control of the controller. The target can be jumping around, and hell, the object may have other forces being applied to it as well. If you try to use an interpolator and just keep resetting it, you will have discontinuities and very ugly responses.

Anyway, it reminded me of all the 0->1 : 0->1 float functions. Everybody knows so-called "smooth step", which IMO is more correctly called a hermite lerp (though I'm sloppy and say "lerp" for any interpolation) :

hermitelerp(t) = (3 - 2*t)*t*t
So, like you have your t in [0,1] which is your lerp parameter, and you want to get between two values A and B, instead of A + (B-A)*t , you use A + (B-A)*hermitelerp(t).

BTW amateurs can easily get into overusing this. It's not necessarily "better" than a lerp. Yes it is C1 (velocity continuous) if you require that the end points are zero velocity. It makes the curve slow at the ends and fast in the middle. Sometimes that's bad. I've seen people just tossing hermitelerps everywhere because it's "smoother" and of course that's retarded.

Of course very similar is

coslerp(t) = 0.5 - 0.5 * cos(t * pi)
which is C-inf if the end points are non-moving. Another problem with these of course is if you use them repeatedly to chase a target, you get really stair-steppy motion, you stop, speed up, slow to a stop, speed up, slow to a stop, it's dumb.

Anyway, we all know that, but if we think a bit more about our lerpers, it's obvious that there's this whole class of functions that remap the unit interval :

f : [0,1] -> [0,1]
f(0) = 0
f(1) = 1
A true remapper is also always in bounds and monotonically increasing in [0,1] :
0 <= f(t) <= 1 , for t in [0,1]
(d/dt) f(t) >= 0 , for t in [0,1]
though these last two strict requirements can often be relaxed depending on the application.

Of course the identity f(t) = t is your basic option.

I think of these as "response shapers". There are lots of times when you are really doing some kind of arbitrary response curve and you don't realize it. For example, say you want to show the user's health through color. You might do something like :

color = red + (green - red) * (health / health_max)
But really what you should have written was :
t = health / health_max;
color = red + (green - red) * response(t);
where response() is a unit interval reshaper. Of course the identity is one option, but it's not necessarily the best one.

It's particularly obvious that a response() curve belongs anywhere that you change units, because the meaning of values in one set of units is not necessarily linear in the other units. I don't really like the notion of sticking response() curves arbitrarily all over to fudge things within the same units - as long as you're in one set of units you should be able to do math to tell you what the curves should be like, but when you change units all bets are off.

One example is the one we used already - any time you convert a variable into a graphical indicator, such as color, brightness, the size of an explosion, etc. - those could easily get a response() curve. The other classic example is converting a mouse or gamepad stick to physical units. Instead of

torque = k * (stick deflection)
it should be
torque = k * response( stick deflection )

Now of course if your unit conversion is meters -> feet that's not what I mean by a unit change, and if it's something where you have a natural equation to convert units, that doesn't call for a response curve, but pretty much any time you change units using some non-physical unitful constant multiplier, it's suspicious. What makes you think those units are on the same scale that you can just use a linear conversion? You should assume you need a response curve, then you just might use the identity response.

Let's look at a few remappers. You can build a bag of tricks and get an intuition for these things and then start tossing them in the right places and understand how they affect control response.

First of all obviously you need to first map your values into 0->1 using something like :

float fmakelerper(float val,float lo,float hi) { return fclampunit( (val - lo) / (hi - lo) ); };
I wind up writing a lot of
flerp( response( fmakelerper(val,lo,hi) ), lo, hi )

Now, given a few basic remappers, there are various transforms you can do which create new valid remappers.

Any time you have a remapper(t), you also have

remap2(t) = 1 - remap1( 1-t )
which is the reflection in X and Y (or the 180 rotation if you prefer). So for example if you have a curve that's always above the straight line, you can change that to a curve that's always below using 1-curve(1-t). And of course if you have any two remappers, then any lerp between remappers is also a remapper :
remap3(t,k) = k * remap1(t) + (1-k) * remap2(t)
This is a way you can expose shape control to the user or artist. Lerping a shaped curve with the identity curve is a good way to tweak the nonlinearity. Finally you can also remap a remap :
remap3(t) = remap1( remap2(t) )
Or any combination of these ways to make a new one.
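As code those transforms are trivial one-liners; something like (just a sketch, "Remapper" is a made-up typedef) :

typedef float (*Remapper)(float);

float FlipRemap(Remapper r, float t)							{ return 1.f - r(1.f - t); }		// reflect in X and Y
float LerpRemaps(Remapper a, Remapper b, float k, float t)		{ return k*a(t) + (1.f-k)*b(t); }	// blend two remappers
float ComposeRemaps(Remapper a, Remapper b, float t)			{ return a(b(t)); }					// remap a remap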

The first ones that are obvious are the polynomials. Any (t ^ n) works, but we usually work with

square(t) = t*t
cube(t) = t*t*t
Intuition : these push the curve downward. That makes the response stay low longer and then shoot up at the end. This is useful for giving a linear ramp more "impact" and is used a lot in graphics. Linear ramps look boring, if you want a light to turn on over time t, something like square(t) is better. Obviously cube is the same thing but more extreme. These are also useful for providing fine control in UI elements like gamepad sticks or microscope zooms or camera focus; it makes small deflections even smaller, so that you can do very fine work, but it still lets you ramp up to a full speed change when you crank it.
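For example, a typical stick-shaping curve (totally made-up numbers) is just cube lerped with the identity, so small deflections stay fine-grained but you still get the full range :

#include <math.h>

float ShapeStick(float deflection)	// deflection in [-1,1]
{
	float t = fabsf(deflection);
	const float k = 0.6f;	// made-up tweak : how nonlinear the response is
	float shaped = k * (t*t*t) + (1.f - k) * t;
	return ( deflection < 0.f ) ? -shaped : shaped;
}

// torque = maxTorque * ShapeStick( stickX );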

Of course you can do powers of polynomials < 1 too , the basic gamma correction approximation is sqrt :

sqrt(t) = t^.5
intuitively this makes low values jump up fast, then it tails off slower. This is roughly how you convert light linear pixels to gamma corrected pixels. Of course there are also quadratic approximations of sqrt
approxsqrt(t) = t * (27 - 13*t)/14
which I optimized over [0,1]. That's slightly better than (2*t - t^2) but of course that super simple version is okay too. Note that the super simple version there is just the flip & inverse of square(t).
1 - square(1-t) = (2*t - t^2)

Then of course there are various kinds of parameterized warpers that you can expose to prefs to tweak response curves. One is the exponential :

explerp(t,k) = (e^(k*t) - 1)/(e^k - 1)

k > 0
for k very small this becomes a straight line (the identity)
as k gets larger the curve is pushed into the corner in the bottom right
for k very large it's a step function at t=1
and of course 1-exp(1-t) is useful too. (that's the same as using k < 0)
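In code that's something like (my sketch; you want to special-case small k since the formula goes 0/0 as k -> 0, and the limit is just the identity) :

#include <math.h>

float explerp(float t, float k)
{
	if ( fabsf(k) < 1e-4f )
		return t;	// k -> 0 limit is the straight line
	return ( expf(k*t) - 1.f ) / ( expf(k) - 1.f );
}

Negative k gives you the 1-exp(1-t) flavor directly.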

Then there's the quadratic spline remappers.

quad1(t,p) = t*t + 2*p*t*(1-t)

p in [0,0.5] is a true unit remapper
Of course this is just lerp of the identity and the quadratic remappers. More generally :
inline float fquadraticsplinelerper(const float f, const float ptX, const float ptY )
{
	// quadratic Bezier through (0,0),(ptX,ptY),(1,1) used as a unit remapper :
	// solve x(t) = t*t + 2*ptX*t*(1-t) = f for the spline parameter t,
	// then evaluate y(t) = t*t + 2*ptY*t*(1-t)
	ASSERT( fisinrange(ptX,0.f,1.f) );
	ASSERT( fisinrange(ptY,0.f,1.f) );
	ASSERT( !fequal(ptX,0.5f) );	// singularity here : the quadratic coefficient vanishes
	
	float bx = ptX;
	float a = (1.f - 2.f*bx);
	float A = bx*bx + a*f;
	float t = (sqrtf(A) - bx)/a;
	float y = t*t + ptY*2.f*t*(1.f-t);
	return y;
}
will make a curve that goes through (ptX,ptY)

BTW I've only talked about unit remapping, but half-line remapping is very useful too and is a whole other topic. Two of the most useful are tanh(t) , which takes [0,inf] -> [0,1] smoothly, and log(t) which is [0,inf] -> [0,inf] but is very useful for putting things in linear importance scale when the relative magnitudes are what determine importance. For example both sound and image deltas have more linear information content in log scale.

Now we can get to what reminded me of this. Won sent me a link to Herf on Stopping . Mmmm, okay. He comes up with some funny curve. But if you look at the curve he wants, it looks just like a "smooth step" but all squished over to the left. Well, guess what, squishing to the left is exactly the kind of thing that remappers are good at. If we have some plot of a function F(x) in [0,1] and we want to make a new function that has the same endpoints but is squished to the left or the right or up and down - that's exactly what a remapper is. If you imagine having a picture of a checker board pattern, when you run one of these unit warpers, you're stretching some squares and squeezing others.

So, to make the kind of shape Herf wants, we can just use :

stopping(t) = hermitelerp( 1 - cube(1-t) )
and that looks pretty darn good. If you want more control you can use :
stopping(t) = hermitelerp( 1 - explerp(t,k) )

where k = -4 looks good to me

Just for more intuition building I'll show some other ways to get this kind of curve :

squareinv(t) = 1 - (1-t)^2

stopping(t) = hermitelerp( hermitelerp( squareinv(t) ) )
Each time you pass through hermitelerp it makes the ends slower and the mid faster, so doing it twice exaggerates that. squareinv skews the shape over to the left. So we make a shape with really flat ends and a steep middle, then we skew the whole thing to the left and that's the kind of stopping curve we want.
stopping(t) = squareinv( hermitelerp( squareinv(t) ) )
Similar, but the squareinv on the outside takes the shape and skews it upwards. Of course the squares could be cubes to be more extreme, or the squares could be lerped with identities to make them less extreme.

BTW it's a pet peeve of mine when people use physics terms to try to justify something totally hacky, especially when they get it wrong. There's absolutely nothing wrong with being unphysical and just saying "this function looks good". Herf says his "stopping" is based on physical friction; in fact he is not modeling "friction", he's modeling linear drag. Friction is not velocity dependent, it's a constant force (proportional to the normal force, which is proportional to mass in the normal case). Linear drag is a deceleration proportional to your velocity which produces the exponential slowing. Linear drag is the easiest diffeq anyone can solve : a = -k v.

While I'm ranting let me also note that linear drag is not what you usually see in nature. Drag in the air is mostly quadratic drag. That is a = - k v^2. You do see linear drag in the water at low speeds when you have laminar flow. BTW this is why the shape of boat hulls is really crucial - it's not just like a 10% change, it's the difference between laminar and turbulent flow, which changes your drag from linear to quadratic, which is like 1000X difference for fast boats. (it's actually way more complicated than this but that's the rough idea).


03-17-08

Gambolin' Man is a sweet Bay Area hiking blog, except that he put a bajillion pictures on the front page so it takes ten years to load and FireFox retardedly stalls out while it loads.


03-17-08

I've always known that implicit integration is like "the stable way" , but I guess I don't really grok *why* it's super stable. I guess if you look at some graphs of the potential wells of spring forces you can kind of get why it works for those, with a regular Euler integrator you are drawing these tangents that are very steep and you easily overshoot, whereas if you step ahead and take the tangent of the end-point it's milder. But that feels awfully hand wavey.

Anyway, I reminded myself that even the implicit Euler is really very rough and not what you want, because it is just evaluating the function at one point along the time step.

You want to solve (d/dt) y(t) = F(y,t) . When you discretize to steps of " h ", ideally you'd get the average F over the [ t , t+h ] interval. But if you could analytically get the average over the interval you could just analytically solve the whole thing and you wouldn't have to numerically integrate at all. Which of course I guess should always be the first step that we sometimes forget - see if you can just solve it and fuck this numerical shit.

Euler : (bad)

( y(t+h) - y(t) ) / h = F( y(t), t )

Implicit Euler : (stable, but still bad)

( y(t+h) - y(t) ) / h = F( y(t+h), t+h )

Midpoint : (pretty accurate)

( y(t+h) - y(t) ) / h = F( y(t) + (h/2)*F(y(t),t) , t+h/2 )

Implicit Midpoint : (accurate & stable)

( y(t+h) - y(t) ) / h = F( (y(t) + y(t+h))/2 , t+h/2 )

In particular, implicit midpoint will actually keep you on the correct path in phase space {p,q} when you're solving a Hamiltonian system. There will be error, but it will be parametric error in following the path, not error that makes you jump to other paths.

Of course for implicit methods you have to do algebra to solve for y(t+h) in terms of y(t) which is not always possible.
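As a concrete example of that algebra (my own sketch) : for a damped spring, a = -k*x - d*v, the implicit Euler equations v' = v + h*(-k*x' - d*v') and x' = x + h*v' can be solved in closed form :

void ImplicitEulerSpringStep(float & x, float & v, float k, float d, float h)
{
	// substitute x' = x + h*v' into the v' equation and solve for v'
	float newV = ( v - h*k*x ) / ( 1.f + h*d + h*h*k );
	float newX = x + h*newV;
	x = newX;
	v = newV;
}

That denominator >= 1 is exactly the artificial damping I mention below : it always shrinks the velocity update, which is why this is so stable and also why it's inaccurate.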

I guess this is all duh duh review. Also more generally Runge-Kutta is a much more accurate version of "midpoint" and then you could also do implicit Runge-Kutta as well.

The thing I forget sometimes is that the standard Implicit Euler that we like to use in games is not like the awesome solution. It just happens to be super stable for spring type systems, but it actually acts like an artificial damping of the system, numerically in terms of how close it gets to the right answer it's very bad, in fact it's just as bad as the plain forward Euler.


03-16-08

My family on my dad's side is partly from "State College" PA. It never occurred to me until just now what a ridiculous name for a town that is. When I was a kid they said yeah we're from State College, and I was like okay, yeah, those are just words that designate a place that don't mean anything, fine, you're from there. But it actually is where the state college is. It's like naming your town "Steel Mill" or "Capital City" or "Shipping Port". Where do you live? Oh I live in "Walmart Distribution" TX.


03-16-08

Addendum / FYI : there's been a lot of followup to this and I'll post some useable code some day soon, in like a simple .h you can just include or something.

So, Jon asked this question and I realized that I don't actually know of a good answer :

You have some state variable with inertia, we'll call it a position & velocity {P,V}
You want to drive that variable towards some target, which may or may not continue to change {TP,TV}
You want to reach the target in finite time, and if the target is still you should reach it within time K
You want both your position and velocity to be at least C0 (value continuous)
You want a "good path" which is rather hard to define exactly, but overshooting and oscillating is bad, as is unnecessarily fast accelerations; you don't exactly want to minimize the energy required to move because that will give you very loose springy motion.

(side note : this is not actually an "interception" problem, in that the target point is not considered to be actually moving linearly, you need to actually hit the exact end value "TP" and when you hit it you must have the velocity "TV" , you don't get to hit the point TP + t* TV somewhere along its path ; note that if you can solve this problem then you can solve the interception problem by using a moving frame, but the converse is not true).

PD controllers are very commonly used for this in games. They're nice and smooth (they're always C1, and are also C2 if the target moves smoothly), but have a few bad properties. For one thing they never actually reach the target, they just keep getting closer forever. Furthermore, if you damp them enough to prevent overshoot, they converge very slowly. People often use a critically damped controller, but it doesn't really solve these issues (overshooting vs. overdamping), just picks a middle ground where you have a little of each problem.
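For reference, the critically damped version I'm talking about looks something like this (semi-implicit step, my own sketch; stiffness k is the one tuning knob and damping = 2*sqrt(k) is what makes it critically damped) :

#include <math.h>

void CriticallyDampedStep(float & pos, float & vel,
                          float targetPos, float targetVel,
                          float k, float dt)
{
	float accel = k * ( targetPos - pos ) + 2.f * sqrtf(k) * ( targetVel - vel );
	vel += accel * dt;	// semi-implicit Euler : velocity first, then position
	pos += vel * dt;
}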

So far as I know there is no standard good solution to this, which is odd because it seems like something we want to do all the time. Does anybody know if there are good solutions to this problem ? It kind of blows my mind that we don't all just have this routine sitting around.

So I made a test app with various more or less hacky solutions : testdd zip and there are algorithm descriptions in the readme

I also just added a new mode in there today which so far as I can tell is the best way to do this. It's "mode 1" the "cubic maxaccel".

It solves a cubic, which is the lowest order polynomial that hits the endpoints (and thus is C1). The cubic is constrained to never accelerate faster than a max acceleration parameter. This removes all the hacky stuff about resetting the time to get to the target when the target moves slightly. I can't find a flaw with this, it just seems to work.

Now it might also be interesting to go ahead and use a quartic curve, because then you could also keep the acceleration continuous as well. Dunno if that would actually make better looking curves because the cubic paths look pretty good already.

It's very important IMO for whatever algorithm here to not have a lot of hacks/tweaks/prefs that need tuning, because you want to be able to take your interpolator and toss it anything and have it "just work". For example, from what I can tell a PID controller is just a no-go because tweaking the values is very difficult and data dependent so it's not practical to use as a generic interpolator. What I want is basically one single parameter for the interpolator which controls how fast the value converges to the target.

I would like to see or compare to an AI steering type of interpolator, but I don't really want to write it, so if somebody has code for something like that, please send it to me, ideally as function that plugs right into that testdd/main.cpp

Also the code I've got in there for dealing with finding the root of the quartic polynomial is kind of nasty, so if you want to fix that up go for it. In words : I need to find the lowest non-negative root of a "depressed quartic". Any kind of iterative thing like Newton's is nasty because you have start with some seed, and there's not a simple way to be sure you picked the right seed to get the lowest positive solution.


03-15-08

I've talked before about how Web 2.0 is fucked, but cooking & food information is probably the best example, so let's look at that.

The information that exists out there currently is basically in three forms :

1. Food blogs. There's a ton of great food blogs, but they are not aggregated/indexed, so it transforms useful reference material into useless feature/diary materials.

2. Recipe collection sites like cooks.com; These are ridiculously overloaded with just god-awful recipes. The star rating systems they use are worthless because they're dominated by morons who have zero concept of cooking. A big part of what makes the recipe problem so obvious is that there's such a massive amount of bad information out there, and also a massive amount of retarded people rating things, so that any non-localized rating system is worthless.

3. Sponsored sites like FoodNetwork or major individual's pages like Joy Of Baking. These are actually the most useful places to get recipes, because the standards are high, they're searchable, and furthermore they give you the author's name which gives you context.

The first most obvious thing that's needed is a global aggregator that breaks the data atoms from blogs into a single searchable database. As usual stupid Google is not a great way to search for recipes, because pagerank is not at all what you want for sorting results, and you get tons of spurious pointers to restaurant menus and junk like that.

More importantly though you need author context. To some extent, all information in the modern era is only useful with author context. There's too much information out there for you to trust the quality of it without knowing where it came from. Now if you go to FoodNetwork or something and see actual names and know that a recipe from "Alton Brown" is probably good, that's fine, but that is only possible with a limited network, for the whole web there will be too many people and they may have only a few recipes so the information is too sparse for you to remember them all.

Obviously what you need is some kind of Collaborative Filtering, but really that's not ideal, what you want is a Network of Trust. The big difference is I get manual control over my links and I also get to see where they're coming from. So if I want I can just say thumbs up/ thumbs down on various recipes and get CF recommendations, but I can also manually say "I trust Alton Brown" and automatically get recommendations from him and also everything that he trusts. Seeing WHY you got something recommended is also very useful ; some recipe that looks nasty is recommended to you, but on the side it shows a network connection from Alton Brown and you can conclude that in fact this bizarre looking recipe is worth trying.

The reason recipes are such a strong example of how Web 2.0 is fucked is that there's tons of good information out there that you should be able to find, and currently you just really can't.

The whole thing that's great about Web 2.0 is that there are actual communities with actual people, and you can get to know them, and when you read bits of content you can see who it came from, and that little "who" piece of information is incredibly valuable. The problem is it's balkanized and they aren't working with collaborative filters. When they do have collaborative filters, they just run an impersonal CF algorithm which throws away the valuable revelation of the "who" to the user which is the value of the community.


03-14-08

Today I was watching a Rick Roll (I love getting Rick Rolled, I'm so addicted to that song, sometimes I just Rick Roll myself), and I realized that I dance a hell of a lot like Rick Astley. That's mostly not a good thing. In fact the only time it is a good thing is when I'm trying to do a Rick Astley impersonation.


03-14-08

Whether you can talk to someone really has nothing to do with what you have in common. Dan used to try to set me up on "blind dates" with guys because I'm such a fucking loser I need help meeting guys. She would say "oh, you would like this guy, he rides bikes". Oh really? So our conversation will be like :

"hey, uh, I hear you like bikes"
"yeah"
"me too"
"cool"

Sweet conversation, glad we did that.


03-14-08

Walking around today I noticed a hispanic guy going through recycling bins taking out bottles. That's not unusual at all here, quite a few people seem to be scraping by in the city on the refuse of the rich. He was wearing an iPod as he did it. Something is seriously wrong with the universe when the guy who's scavenging in people's trash for a living is listening to an iPod.


03-14-08

I've gotta stop drinking so much. I really don't like it. Well I like it, but I don't like being hung over at all. The problem is I just cannot handle social interaction without booze, so for me meeting people = boozing. I mean, have you ever hung out with people without alcohol !? it's unbearable, you're totally aware of everything, everyone is so dorky, then you get bored, it just doesn't work. I've known some really cool coders that I really liked, but I could never really be friends with them because they didn't drink, and me trying to socialize without boozing = kill self.


03-13-08

The best condom review site is "RipNRoll" because they have the detailed dimensions of every condom. There's actually a lot of size variation among normal sizes, and it's not printed on the boxes, so without teh interweb you don't know what you're getting. For example, most of the much-celebrated Japanese condoms are slightly smaller than normal. Some others are slightly bigger than normal, and none of it is on the packaging.


03-13-08

The iPod Shuffle is such a shitty product. For one thing, the physical design of the buttons is retarded. It's this cute looking circle, which I guess is intended to mimic the look of the normal ipod wheel thing, but it's not a wheel it's just 4 buttons. The consequence of the circle is that it's rotationally symmetric so you can't tell what you're touching and it makes it really hard to hit buttons by touch. A correct design would be to just have four buttons with distinct shapes, like the Playstation control buttons.

Furthermore, the software/firmware is shit. You should be able to set up playlists that choose a subset of the songs, and then select between the playlists. This could easily be done with no additional buttons, just a shift-click or something like that. Also the sequential play mode is a pain because you have to just skip through a ton of songs, but if there was a "skip album" shift-click it would be pretty useable.


03-13-08

Yikes. 45 year old short bald guy just came out of Slutty Girl's apartment. So gross. He had a spring in his step. WTF is she thinking. That's the first time I've seen anyone outside of the 20-30 hipster demographic.


03-12-08

Projects I'd like to work on :

ImDoub. See notes here. One of the complicating things is that you really need the ML (maximum likelihood) double, not the Ln-norm lowest error double.

Network of Trust. See many previous writings here. Basically this is just collaborative filtering, but where I'm in control of my own graph, more like my "friends network" on facebook or something. This is only really awesome if it's global, which is connected to :

Unified blog / recommender accounts. The balkanization of Web 2.0 is severely reducing how useful it could be. All the separate accounts and friend connection networks mean that your return on effort is linear, instead of exponential like it should be. (an open friend system is exponential because making a new connection linearly increases your rate of new connections). This is mainly a political problem, but if you built a really good framework for it, maybe you could convince more of the little forums to drink the koolaid. Obviously the sites are opposed to this because their proprietary user-created data is their treasure.

My text -> blogger app so I can write my blog like this but have it update to blogger or whatever.


03-12-08

My left shoulder is pretty much painless & has full mobility, but there's a weird loud pop that it makes in some movements. I've been reading up on symptoms and I think I may have a torn labrum (probably a SLAP), which would fit with the mode of injury (falling on outstretched arm). I've learned that arms are fucking lame and if you're falling, don't try to stop your fall by putting out your hand, just take the impact with your face, you'll take less damage. Of course the treatment for a SLAP is probably nothing.

I've been making myself go out to cafes a bit to read & write just so I'm not sitting alone in my house all the time. I fucking hate it. In general I hate all these "activities" that people do which are basically just leaving home to sit around or stand around somewhere else. If I'm gonna be doing nothing, I'd rather be at home where I can get naked and browse the internet and watch TV and dance around and cook. If I'm gonna have all the trouble and pain of leaving home, I wanna fucking do something. Let's build a fort from cardboard boxes, or paint graffiti, or practice cheerleader routines, or get naked and have a feather tickling fight, or fucking something, anything.

My computer has started doing some of those weird awful things that inevitably happens to Windows if you use it a lot. Like sometimes I'll hit standby and the machine will just seem to lock up for 5 minutes and then eventually do it. WTF Windows.


03-12-08

My resume' really doesn't look that hot. If you just saw it and didn't know me, you would never hire me. I'm pretty much relying on my rep in the industry, which is still pretty decent, but I think I'm fucked for jobs that aren't somehow connected to games. For one thing I have the big gap for the last 3 years, but even before that it looks like I moved around a lot. I really didn't move around that much, there are things like Eclipse->Wild Tangent which is the same company, and then there were part time and summer jobs when I was a kid, and then I was at Oddworld a long time, until they closed, but if you look at the resume there are all these 3-6 months stints and it looks like I don't stick it out when times are tough. The only place I really bailed on was Surreal because it was a total disaster.

Also as usual I've timed things to be at exactly the wrong phase of the economy. Yay me.

Also I'm not sure who to use as a reference. Past employers are meh. I guess I'll just use past coworkers and general game industry people.


03-12-08

How to take the arms off your Aeron :

No chair should have arms, they're horrible for you. Fortunately it's pretty easy to take them off the Aeron. First remove the back of the chair. There are 4 large hex bolts that hold the back on to the two posts coming up from the seat. Those come off easily. Now you can see the inside of the sliding arm assembly. Each arm is held on by one bolt. These are star-head ("torx") bolts which nobody has the tool for. Fortunately if you jam a hex key in there really hard and press firmly while you turn you can get these off. The arm tightening wheels will spin so you have to hold them in place to unscrew these bolts. The sliding arm assembly is sprung and the tightening wheels have ball bearings, so you want to be careful as you get close to taking these bolts out - try to hold it all together as you pull the bolts so it doesn't all go flying. The arms should fall right off now and you can put the back back on.

LOL on the negative side I keep reaching for the arm to brace myself when I sit down and almost just wiped out when I reached and grabbed air.

Also a small bonus, the really ugly Aeron looks much better without arms.


03-11-08

So many people say they have "no regrets". That basically means you're retarded. Have you made decisions which in hindsight were the wrong decision? Of course you have. That's a fucking regret. If you just don't care when you fuck up and realize it, it means you are not learning and getting better at living, which means you make the same mistake over and over, which literally means you are retarded. "I have no regrets". You should regret being so fucking dumb.


03-11-08

I hate fucking IM. People who IM have the worst manners. It's the culture I guess, they're multitaskers, so they get a hundred different things going on, and the pace of the conversation is all screwy. I can handle slow pace (email) or fast pace (talking in person), but IM will be like fast pace fast pace, and then they disappear for 30 seconds. It's really disturbing. I pretty much just refuse to IM with anyone now because it always pisses me off. I also don't like the way it's a quick casual dialog but then it's all stored in history so you can get smeared.


03-11-08

Netflix Prize notes , Part 1 : Intro

DISCLAIMER : this is mainly a brain dump for my records. If you really want to know how the good people are now doing it you should just go read the papers on the Netflix Prize forums, in particular the stuff from BellKor and all the SVD stuff that everyone is doing.

I worked on the Netflix prize from fall 2006 to spring 2007. It's been about a year since I touched it so this will be a little rough, but I realized I never wrote up what I was doing, so I figure I should try to get some notes down before I lose them forever. There are a lot of little quirks that are specific to the Netflix data set (stuff related to just having 5 ratings, trends related to when the rating was done in absolute time, when the rating was done relative to when the movie was released, and trends related to when the movie was released in absolute time, and others). I'm going to ignore all those factors and just talk about general collaborative filtering topics. Obviously to make a top-tier competitor you have to compensate for all those quirks on the front & back end. Also I haven't really followed developments too much since then so this is well out of date.

The basic Netflix problem goes like this. You have a bunch of users and a bunch of movies. Users have provided their ratings for some portion of the movies. Some users have rated only a few; a very few users have rated tons. Generally we think of it as one giant matrix, rows are movies and columns are users, and the ones with past data are filled in with a rating, and there are tons of blanks. Your goal is to predict the most likely value for a blank. (actually in the case of the Netflix Prize, it's to minimize the L2 error on a blank, which is not the same as predicting the most likely value since you'll predict values in between the integers to reflect uncertainty).

In the particular case of Netflix you have 480k users, 17k movies, and 100 million previous ratings. That sounds like a lot of ratings, but in this big matrix, only 1.2% of it is filled in, so it's very sparse. Probably the biggest issue is dealing with the sparsity. The huge dimensions and sparsity are why you can't just use a standard supervised learning approach.

Basic "collaborative filtering" is a kind of machine learning or prediction algorithm; the basic operation is like this : a query is a {movie,user} index in the matrix that's blank. You find other similar movies to the query movie, and other similar users to the query user. This gives you a little local matrix. Say you found 20 similar movies and 50 similar users, now you have a 20x50 matrix, which is still sparse, but less so, and small enough to manage. You then use this little matrix to predict the blank. The most obvious way is to find the single most similar non-blank entry and take that rating. Obviously that sucks but it's a start.

I'm gonna go into some quick general notes on training. Of course you need to be careful not to test on the data you trained on. Netflix had this concept of a "probe set" but actually that's a bad way to go also because you're just excluding the probe from the training which gives up information. A good general way to do this is if you have your big data set, you divide it into N chunks. Then you do N separate train & test operations, where you train on (N-1) of the chunks and test on the chunk left out. You then average the N errors to get your whole set error. Since you trained N times you have N models, to create your final predictive model, you could train on the whole N chunks, but usually it's better to take the N models, each trained on (N-1) chunks, and average the results from the N models. This is better because it smooths out overtraining quirks. Another general training thing is that when training like this, you need to account for the number of free terms in your model. If you have T training samples and a model with M free parameters, a better error estimate is (error total)/(T - M). Basically each time you add a term to your model, that should be able to at least cancel out the error from one sample by fitting to it, so you need to pretend your data is one step smaller. This makes it a better way to compare models of different sizes. Models that are too big will tend to overfit and not be good on unknown data. Note that in the simple case of linear models, averaging the result is the same as averaging the models.
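
To make the N-chunk idea concrete, here's a minimal sketch in Python (numpy assumed); "fit" and "predict" are hypothetical stand-ins for whatever model you're actually training :

import numpy as np

def cross_train(X, y, fit, predict, N=10):
    # shuffle and split the training data into N chunks
    idx = np.random.permutation(len(y))
    chunks = np.array_split(idx, N)
    models, errors = [], []
    for i in range(N):
        test = chunks[i]
        train = np.concatenate([chunks[j] for j in range(N) if j != i])
        model = fit(X[train], y[train])
        errors.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
        models.append(model)
    # error estimate = average over the N held-out chunks; the final predictor
    # averages the N models rather than retraining on the whole set
    return models, np.mean(errors)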

I should also note that this is one particular problem where regression & classification both make sense. The Netflix Prize error measure is an L2 norm which should immediately suggest regression with least squares - that way you're optimizing your model on the exact same error measure that you will be judged on, which is good. But classification also has some appeal because of how discrete the ratings are. There are only 5 ratings in Netflix; to use classification you would not just predict the most likely rating, but rather predict a probability for each class, and your final output would be P(1)*1 + P(2)*2 + P(3)*3 + P(4)*4 + P(5)*5. This lets you use whatever kind of discrete classification method you want, as long as you do it in a kind of Bayesian way and generate probabilities of being in each class. For example you can use SVM to classify and use the distance from the SVM boundary curve to generate a probability of being on each side. Perhaps a better way to make probabilities is to use a big ensemble of classifiers built from random subsets of the data and count how many members of the ensemble vote for each class to generate a probability. Most classifiers are binary, but there are many different ways to make binary classifiers work on N classes. These classes happen to have a sensible linear relationship, so just doing a binary tree is perfectly reasonable. For classes that are not necessarily linearly orderable, the binary tree method is a bad way because it imposes a false linear order and you have to do something like an error correcting binary code.

Making ensembles of simple predictors and averaging is a really amazing thing in general; I'm not gonna really talk about it, but you can search for "bagging" and "boosting", in particular AdaBoost & the many more modern followups. Also obviously classification and regression are closely related. You can make a binary classifier by doing regression and just saying if output is > 0 that's class 1, if it's <= 0 that's class -1. If you use a linear regression that's the same as a plane fit classifier ("perceptron"); the difference is the error metric; regression is using some L2 or other norm, while most classifiers are designed to optimize to # of classification errors. Obviously you need to use the training method that matches the error metric you care about. There's also logit or logistic regression but I'm getting off track.

Let me also digress a second to note some things that are obviously wrong with all the standard literature on collaborative filtering. First of all, everyone makes the "missing at random" assumption. This is the assumption that which movies a user has chosen to rate is totally random, that is which elements are blank or not blank is random. Another way of saying this assumption is that the ratings which a user has previously provided have the same statistics as the theoretical ratings they would've provided on all the other movies. This is just obviously very wrong, since people do not choose to see movies randomly, there's a lot of information not only in the ratings they have provided, but also in which movies they have provided ratings for. For example, 22% of the users in Netflix have only given ratings of 4 & 5. That means they only rated the things they really like. Obviously if you do something retarded like normalize to their average (4.5 or so) and say all the 4's were things they "didn't like" you would be way off. Though it's also important to note that the Netflix test query is also not actually a random movie choice, it's also something the user chose.

Also another quick note on general learning stuff. I talked briefly about cross-training, the N chunk thing, and also about overfitting. Almost all the learning algorithms have some kind of damping or model size parameter. Even with linear regression you want to use a regularized/damped regression solver. With neural nets there are lots of things that affect overfitting, like how you feed your data through, how fast you train, how many passes you make on the data, etc. With SVM you have the # of support vectors as your control (usually this is controlled through some kind of error margin size parameter). The less you damp, the more you overfit, and the better your results will be on that particular training set, but it will make the results worse when you test on new data, which is what you actually care about. The goal is to optimize on your training set, but not in a way that overfits and thus hurts you on new data. Cross-training is one way to do this, another is the use of an ensemble, as are the damping parameters in the algorithm. Unfortunately there's no theoretical way to optimize the damping parameter, you have to just use trial & error to search for the best damping parameter, which you can find through cross-training.


03-11-08

Netflix Prize notes , Part 2 : Collaborative Filtering

I'm gonna describe basic collaborative filtering along with ways to improve on the standard literature. There are a few steps :
1. Finding similar movies & users
2. Weighting the similar movies & users
3. Using the similar movies & users & weights to make a prediction.

I'm going to try to be sort of vague and general, but then I'll also add in very specific details of what I actually found to be best on the Netflix data, so hopefully we can make that weird mix work.

1. Finding similar movies & users. We're interested in some query movie & user {m,u}. First let's find movies similar to m. We can greatly reduce our search by only finding similar movies that the query user u has rated. Those are the ones that are really useful to us so it's a good speedup. So each movie is a row in this big matrix of ratings, with many blanks. There are two sort of general ways to define similarity, one is a dot product, and the other is a difference/distance or MSE. Of course those are very closely related, if you normalize your vectors, then the dot product is linearly related to the distance squared via the cosine rule. A lot of the papers jump right ahead to using "correlation" but I don't want to do that. "Correlation" is a very specific mathematical thing, and may not be the best thing for our similarity metric.

So I have these two movies that have been rated by various users, they have some users in common (both not blank), some not in common, and many entries that are blank in both. Now, if you just did a distance squared using only the elements where you had ratings in both movies, that would be very bogus, because it actually favors movies that have a very small user base intersection, which are probably not in fact very similar movies - a tiny overlap just isn't much evidence of similarity.

The other thing we want to do is remove linear bias when comparing movies. If two movies are very similar, but one is just better so that everyone rates it higher on average, they can still be highly correlated and useful predictors. We're going to remove average bias when we do the prediction, so we should do it now. So, basically any time we use a rating, we subtract off the average rating of that movie. NOTE : in general you might also want to remove linear scale. On the Netflix data I found that hurt, that scale contains important information and you should not remove it, but on more general data you should consider removing scale. Of course then you also have to remove it & restore it when you do the prediction.

A note on the average rating though : you don't want to just use the average of the ratings that exist on the row. Rather you want to make a prediction of what the average rating of that movie would be if all the blanks were filled in. There are fancy ways to do this, but it turns out that just pretending you have C more ratings with the global average rating works plenty well.

movie average = [ ( sum of ratings on movie ) + C * ( global average rating ) ] / [ ( num ratings ) + C ]
and C = 12.5 was best for Netflix. You want to do the same thing for the customer average, but for that C = 25 was best.
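
In code that damped average is just this (sketch, constants as above) :

def damped_average(ratings, global_avg, C):
    # pretend we saw C extra ratings at the global average; this shrinks
    # sparsely-rated movies/users toward the global mean
    return (sum(ratings) + C * global_avg) / (len(ratings) + C)

# movie_avg = damped_average(ratings_on_movie, global_avg, C=12.5)
# user_avg  = damped_average(ratings_by_user,  global_avg, C=25.0)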

Now, most of the literature uses a Pearson or "correlation" measure for similarity, which are forms of dot product measure, higher is better. I found using a distance was easier to tweak and gave better results, but of course they are directly related as I noted before.

So, let's create this distance between two movies. First we have the distance squared between ratings where they both have a rating for a given user :

movie errsqr = Sum[u] {  ( ( rating(m1,u) - average(m1) )  - ( rating(m2,u) - average(m2) ) )^2 }

m1 & m2 = movies to compare
u = a user index , sum is over users that have a rating for m1 and m2
The number of users in common is "num shared" and the number of users where one movie has a rating but the other does not is "num unshared". We create a modified movie distance thusly :
movie distance squared = ( errsqr + (num unshared) * A + B ) / (num shared)

where A & B are constants
A = 0.016 and B = 4 were best on Netflix
And the movies with the smallest distance are the most similar.
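
As a sketch in Python, with each movie's ratings held in a dict of {user : rating} (a hypothetical layout, but any sparse row representation works the same way) :

def movie_distsqr(r1, r2, avg1, avg2, A=0.016, B=4.0):
    # r1, r2 : {user : rating} dicts for the two movies; avg1, avg2 : their damped averages
    shared = set(r1) & set(r2)
    if not shared:
        return float('inf')
    errsqr = sum(((r1[u] - avg1) - (r2[u] - avg2)) ** 2 for u in shared)
    num_shared = len(shared)
    num_unshared = (len(r1) - num_shared) + (len(r2) - num_shared)
    # penalize small overlaps : each unshared user adds A, plus a flat B
    return (errsqr + num_unshared * A + B) / num_shared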

Okay, now we need a similar thing for customer-customer similarity. We could of course just use the exact same type of thing, but I found something similar was better & faster. We're going to use the similar movies list that we already found and find customers that are similar over those movies, rather than finding customers that are globally similar. In fact this should immediately give you the idea that we could've done the same thing for movies - rather than comparing the movies globally we could compare them only around users similar to the current one. More generally you want to pick rows & columns of the big matrix which produce a sub-matrix that is related to the query. You could do this with more generalized clustering or with something like SVD, but I'm getting away from regular collaborative filtering.

So, for customers, we do a similar thing but only over the similar movies. First, for acceleration we only consider other users which have rated the query movie. So we walk the list of users that rated the query movie and find the ones that are most similar when measured over the similar movies list. It's very similar to before with similar motivation :

user errsqr = Sum[m] {  W(m) * [  ( rating(m,c1) - average(c1) ) - ( rating(m,c2) - average(c2) ) ]^2 }   / Sum[m] { W(m) }

c1 & c2 = customers to compare
m = movie in the "similar movie" list which is rated by c1 & c2
W(m) = weight of movie m
Note that weighting has appeared and we'll talk later about how we weight a movie. The customer average ratings are corrected using the formula previously given. We then make our distance :
user distsqr = [ (user errsqr) * (num shared) + A * (num unshared) + B * max(0, C - num shared) ] / (num shared + num unshared)

num shared = # of movies used in the user errsqr sum
num unshared = # of similar movies without ratings in common (note this includes both-blank here)
num shared + num unshared = # of similar movies, a constant for this movie

for Netflix :
A = 1
B = 1.168
C = 7.7
There's this extra term with the max which makes users with fewer than "C" movies in common get a big penalty. So then we gather the users with smallest distsqr.
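
Same deal for the user-user distance, as a sketch (ratings restricted to the similar-movie list; same hypothetical dict layout as before) :

def user_distsqr(c1, c2, avg1, avg2, movie_weight, A=1.0, B=1.168, C=7.7):
    # c1, c2 : {movie : rating} dicts for the two users, over the similar-movie list
    # movie_weight : {movie : weight} for the similar-movie list
    shared = set(c1) & set(c2) & set(movie_weight)
    num_similar = len(movie_weight)
    num_shared = len(shared)
    if num_shared == 0:
        return float('inf')
    wsum = sum(movie_weight[m] for m in shared)
    errsqr = sum(movie_weight[m] * ((c1[m] - avg1) - (c2[m] - avg2)) ** 2 for m in shared) / wsum
    # penalize unshared similar movies, and heavily penalize tiny overlaps (< C shared)
    return (errsqr * num_shared + A * (num_similar - num_shared) + B * max(0.0, C - num_shared)) / num_similar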

The set of N most similar movies and M most similar users is the local neighborhood of the query. For Netflix I found the best results with N=30 and M=100. We index the local neighborhood sorted by similarity, so the 0th movie is the query movie, the 1st is the most similar, etc. The [0,0] element of the local matrix is the query spot, it's blank and it's what we want to fill. The whole 0th row and 0th column are fully populated with ratings, by construction - we only considered similar movies & users which came off the query. The [0,1] element for example is the rating of the most similar user on the query movie. The [1,1] is the rating of the most similar user on the most similar movie. Roughly the farther away in this matrix, the lower the correlation, but of course you shouldn't use the indexes but rather the distances we computed above. Note that of course you can't always find a valid N movies or M users, in which case the last few rows or columns are left blank and their weights are set to zero.

2. Weighting the similar movies & users

Now we need to talk about weighting. We already used the movie weights in the user-user similarity metric, and we'll keep using them similarly. We want a weight that will multiply how much a term should contribute, proportional to the similarity to the query. Implicitly all these quantities are related to the query user & movie.

Our weight is going to be based on a cosine / dot product , so let's start with that.

First define rub = rating unbiased = rating - average. If we're comparing two movies, that "average" should be the movie's average, if we're comparing two customers it should be the customer's average. The cosine formula is deceptively simple :

cosine(m1,m2) = rub(m1) * rub(m2) / ( |rub(m1)| * |rub(m2)| )

Remember the ratings are like a row and we just treat them as a vector and do a dot product and magnitude. But there's some subtlety. The dot of rub1 and rub2 is done only on elements where they are both not blank, that is only over "shared" elements. However, the magnitudes |rub1| and |rub2| are the sqrt of dot products over ALL the elements in rub1 and rub2 respectively. That means we are in fact penalizing unshared entries. Note that if you pretended that elements where one was blank might be the same, that should contribute positively to the numerator of the dot product, and here it contributes zero.

The cosine is in [-1,1] , and in theory it's the absolute value of the cosine that you care about - eg. a -1 would indicate perfect opposing correlation and would also be a good predictor. Some of the papers use a "bipolar" model to use this. On Netflix I found the opposing correlations to not be helpful and excluded movies with a cosine < 0.

You do this exact same cosine thing for user-user similarity and it's called the Pearson similarity measure (but the "rub" uses the user average not the movie average to unbias).

Now we have this cosine but it's not actually the weight I found to be best. The weights I used on Netflix were :

weight(m) = cosine(qm,m) / ( distance(qm,m) + 0.00001 )

qm = the query movie
m = movie to weight

weight(u) = cosine(qu,u) / ( distsqr(qu,u) + 2.0 )

qu = the query user
u = the user to weight
There's really no theoretical reason to prefer any particular form of weight. I tried a lot of things. Just the cosine is okay. Just 1/dist is okay. Even something as weird and simple as just using the # of shared ratings is actually very good. These forms were best on Netflix but the exact best form is going to depend on your problem. This is obviously a flawed part of the problem, the reason the weird forms are working is because they're catching something about the shared / overlap / sparse problem which is hard to solve for.

Of course the overall scale of the weights doesn't matter because any time we use them we divide out the sum of the weights. We also force the self-weight to be exactly 1.0 , weight(qm) = 1.0 and weight(qu) = 1.0
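
A quick sketch of the cosine and the weights, same dict-of-ratings layout as before; note the magnitudes are over all of each row, not just the shared part :

import math

def cosine(r1, r2, avg1, avg2):
    shared = set(r1) & set(r2)
    dot = sum((r1[u] - avg1) * (r2[u] - avg2) for u in shared)   # shared entries only
    mag1 = math.sqrt(sum((r - avg1) ** 2 for r in r1.values()))  # ALL entries
    mag2 = math.sqrt(sum((r - avg2) ** 2 for r in r2.values()))
    return dot / (mag1 * mag2) if mag1 > 0 and mag2 > 0 else 0.0

# weights actually used (negative cosines excluded, self-weights forced to 1.0) :
# W(m) = max(cosine(qm,m), 0) / (distance(qm,m) + 0.00001)
# W(u) = max(cosine(qu,u), 0) / (distsqr(qu,u)  + 2.0)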

3. Using the similar movies & users & weights to make a prediction.

Okay, we have our local neighborhood of ratings, let's call it L[] , and our goal is L[0,0]. Let's build up a series of predictors because they will all be useful to us. From now on I'm using indexes sorted by similarity, with 0 being the query, so m0 is the query movie, m1 is the most similar movie, etc.

I. "One Movie" :

pred = average(m0) + ( L[1,0] - average(m1) )
This is the rating of movie 1 by the same user, compensated for the average bias difference between movie 0 and 1.

II. "One User" :

pred = average(u0) + ( L[0,1] - average(u1) )
Similar to One Movie.

III. "N movie" :

pred = Sum[mi>0] { W(mi) * clamp[ L[mi,0] - average(mi) + average(m0) ] } / Sum[mi>0] { W(mi) }
This is just the weighted sum of a bunch of "One Movie" predictors on each of the similar movies. In the future I'm not going to bother writing the denominator - any time there's a weighted sum you of course divide by the sum of weights to normalize. NOTE : on Netflix it's beneficial to clamp the term in brackets into the valid range [0,5]. Of course we always clamp at the end but it's maybe a bit odd that clamping internally is beneficial also, this may or may not be good in general.
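
Here's "N movie" as a rough sketch, assuming the neighborhood L is a small 2D array with None for blanks, plus arrays of movie averages & weights indexed the same way (all hypothetical structure) :

def clamp(x, lo=0.0, hi=5.0):
    return max(lo, min(hi, x))

def pred_n_movie(L, movie_avg, movie_weight):
    # L[m][0] = the query user's rating of similar movie m, or None if blank
    num, den = 0.0, 0.0
    for m in range(1, len(L)):
        if L[m][0] is None:
            continue
        num += movie_weight[m] * clamp(L[m][0] - movie_avg[m] + movie_avg[0])
        den += movie_weight[m]
    return clamp(num / den) if den > 0 else movie_avg[0]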

IV. "N user" Just like N movie. Duh.

V. "N movie slope-1" :

pred = Sum[mi>0] { W(mi) * clamp[ L[mi,0] + slope(mi,m0) ] }

slope(m0,mi) = Sum[u>0] { W(u) * ( L[m0,u] - L[mi,u] ) }

and we add a pretend extra term of (average(m0) - average(mi)) with a weight of 0.0075
This is exactly like N Movie, but instead of just using the difference in average bias between the two movies, we use this "slope" thing. If we just used (average(m0) - average(mi)) for Slope it would be exactly the same as N Movie. The "slope" is just the average difference in rating, but summed over the local similar users and weighted by the user weight. Thus instead of using a global average delta we use the delta where it matters to us weighted by how much it matters to us. This is a general trend that we can do - stick weights on everything and make all the deltas and biases local & weighted.

This should be pretty intuitive. If you picture the L matrix, remember our goal is to predict [0,0]. We want to find a way to map other entries in the 0th column down to the [0,0] spot. So, we want a way to get from one row (m) down to the 0th row. The answer of course is just the average delta between the mth row and the 0th row.
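
And the slope-1 variant as a sketch (same assumed layout, reusing clamp from the sketch above; the 0.0075 prior is the pretend extra term described) :

def slope1(L, mi, user_weight, movie_avg, prior_weight=0.0075):
    # weighted average difference between the query movie's row and row mi,
    # measured over the similar users, plus a weak prior toward the average delta
    num = prior_weight * (movie_avg[0] - movie_avg[mi])
    den = prior_weight
    for u in range(1, len(L[0])):
        if L[0][u] is None or L[mi][u] is None:
            continue
        num += user_weight[u] * (L[0][u] - L[mi][u])
        den += user_weight[u]
    return num / den

def pred_n_movie_slope1(L, movie_avg, movie_weight, user_weight):
    num, den = 0.0, 0.0
    for m in range(1, len(L)):
        if L[m][0] is None:
            continue
        num += movie_weight[m] * clamp(L[m][0] + slope1(L, m, user_weight, movie_avg))
        den += movie_weight[m]
    return clamp(num / den) if den > 0 else movie_avg[0]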

VI. "N user slope-1" : Exactly like N movie but switch user <-> movie.

VII. "Off Axis Slope 1" : So far we've only dealt with "on axis" parts of L[] , that is the row and column that go through [0,0]. But we can use the whole thing. To do so, we need a weight for arbitrary users and movies, which we can easily make :

W(m,u) = sqrt( W(m) * W(u) )
That's the geometric average of the movie and user weight, relative to the query. Note that arithmetic averages or sums make no sense on these weights because they are not on the same scale.

pred = Sum[mi>0, ui>0] { W(mi,ui) * clamp[ L[mi,ui] + slope(mi,m0) + slope(ui,u0) ] }
Pretty obvious. Note that this is only the "off axis" terms, we're not including the stuff that we had in N movie and N user, those could definitely be crammed in here but we prefer to keep them separate.

"N movie slope-1" and "N user slope-1" are both good predictors on their own. I should note that traditional Collaborative Filtering is only "N user". That is, the strict definition is using similar users' ratings of the same item to predict my rating of that item. "N movie" is not strictly CF, but it's so analogous that I consider them the same thing, and of course the off axis naturally follows. On Netflix, "N movie" actually performs slightly better than "N user". If you average the two, that's even better.

To make a prediction for each query, you can just pick one of these predictors, or just pick the query movie average rating, or the query user average rating, or some linear combination of all that. In fact you can optimize the linear combo on the L2 norm by just doing an LSQR fit of all these predictors + the averages + a constant 1.0 term, and that will give you the optimal coefficients for averaging these preds up.

So these are basic collaborative filters. These are all basic linear models making simple "DPCM" style predictions. All this can be done pretty fast and you can beat CineMatch just doing this junk. Next time we'll talk about where I went beyond this, and what some of the other cutting edge people did.


03-11-08

Netflix Prize notes , Part 3 : Local weighting of predictors

So, last time we built some CF predictors and talked a bit about combining them, so let's go into more detail on that. This is going to be even more rough because it's the stuff I was still actively working on when I stopped.

First, here's how you would make an optimal global weighting of predictors. This is easy because our metric is an L2 error, we can just use an LSQR fit. For each query in the test set, we run the CF predictors. We put all the CF predictors, [I] through [VII] into a row vector. We also tack on the movie average and the user average and a constant 1.0. We take all these rows from all the test queries and tack them together to form a big matrix A and solve to minimize |Ax - b|^2 , where b are the actual ratings. In reality we want a damped solver as mentioned previously to reduce overtraining. We also want to do this with the N-chunk cross training previously mentioned. The solution "x" gives us the coefficients for each of the predictors in our row. Since this is a totally linear model if we train N times we can just blend those together to make a single average set of coefficients. This is the optimal set of coefficients for training our predictors. Note that this is also a really good way to see what predictors are working well - they will have coefficients that are large, the ones that are worthless will have coefficients near zero.
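
A sketch of that global blend : stack the predictor outputs per query into rows, tack on the averages and a constant, and do a damped (ridge) solve. Names are made up; the damping constant is something you'd tune by cross-training :

import numpy as np

def fit_blend(preds, movie_avgs, user_avgs, ratings, damping=0.1):
    # preds : (num_queries, num_predictors) array of the CF predictor outputs
    A = np.column_stack([preds, movie_avgs, user_avgs, np.ones(len(ratings))])
    # damped least squares : minimize |Ax - b|^2 + damping * |x|^2
    x = np.linalg.solve(A.T @ A + damping * np.eye(A.shape[1]), A.T @ ratings)
    return x   # coefficients; a blended prediction is then row . x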

But it's immediately obvious that a global weighting can be beat. Different queries have different characteristics, and will have different optimal weightings. For example, you might do one query and find that there are similar movies that are very very similar, but there are no really good similar users for that query. In that case you will want to not use the N-user prediction at all, and just use the N-movie prediction. Local weighting will let us select for the predictor that is more suited to the query, but of course you don't just want to select you want to do some linear combo.

Now, this is very similar to combining "experts" and there's lots of theory on that. It does not however fit the normal experts model, because we aren't getting feedback as we go along, and we don't really have time continuity of the queries, so we can't track how well experts are performing and adjust their weight using those schemes.

One thing we can do is estimate the error of the various predictors. Think of each predictor as a predictor for data compression. Rather than just predict a value, you need to predict a probability spectrum. Let's make Gaussian probabilities, so we need to predict a center and a width. The center is just the prediction value we already did in Part 2. We need a width which minimizes the entropy. eg. if we're more confident we can have a smaller width, if we're very unsure we must predict a large width. This is the same as estimating an MSE for each predictor.

Once we have an MSE estimate for each predictor, we can combine them in various ways which I've written about in the past here. For example, we could weight each predictor by

e^(- Beta * MSE)

for some constant Beta to be optimized
or by
(1 / MSE)
and of course normalize by dividing through by the sum of the weights.
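
The combine step is then trivial (sketch; the e^(-Beta*MSE) form, with the 1/MSE form as the commented line) :

import numpy as np

def combine_by_mse(preds, mses, beta=1.0):
    # preds : predictor outputs for one query ; mses : their estimated MSEs
    w = np.exp(-beta * np.asarray(mses))
    # w = 1.0 / np.asarray(mses)
    return np.dot(w, preds) / np.sum(w)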

How do we get an error estimate for each predictor? We train a learner to output MSE given some good conditioning variables. The most obvious thing is the "distance" to the similar movies & users that we computed to find our similar neighbors. Small distance should be a good indicator of confidence. Another pretty obvious one is the sdev of ratings in the L[] matrix along the 0 row and 0 column (after adjusting using slope1). That is, if you look at the "N movie slope 1" it's just a weighted average of an array. Instead I can take the weighted sdev of that same array and it tells me how much variation is in it. I don't just use that sdev as my estimate of the MSE of the predictor, I use it as a training variable. So I gather 4 or so interesting variables like this which are good indicators, and now I have a 4 float -> 1 float supervised learner to train. In my research I didn't work out exactly what the best learner is here; you can certainly use a Neural Net or anything like that. I just used a linear fit.

BTW I should note that any linear fit can easily be made polynomial by adding cross terms. That is, say I have a 4 float input set to learn on. I want to try the quadratic functions, you just add (4*5)/2 more terms for all the squares you can make from the inputs. The actual learner is still linear, but you find quadratic functions in the inputs. Of course you can do other functions of the inputs besides polynomials, but you can't learn the parameters of those functions.
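
For example, something like this turns a linear learner into a quadratic one (sketch) :

import numpy as np
from itertools import combinations_with_replacement

def quadratic_features(x):
    # x : raw inputs; append all pairwise products (for 4 inputs that's
    # (4*5)/2 = 10 extra terms) and keep the learner itself linear
    x = np.asarray(x, dtype=float)
    cross = [x[i] * x[j] for i, j in combinations_with_replacement(range(len(x)), 2)]
    return np.concatenate([x, cross])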

So, to summarize : take measurements of the neighborhood that reflect confidence in the different predictors, feed them to a trained learner that will guess the MSE of a predictor (one learner for each predictor), use the MSE to weight the various predictions.

Aside : when combining all these predictors, you do not really want to get as many good predictors as you can to combine, because they will be very similar and not complementary. What you really want are predictors that offset each other well, or predictors that are very good for certain types of neighborhood which can be reliably identified. This is something I learned from the Volf switching compression work - if you just try to weight together the best compressors in the world, it doesn't help because they're all PPM variants and they don't complement. Instead if you weight in some really shitty compressors that are very different, like LZ77 or LZP or Grammars, you get a big benefit because they complement the main PPM coder well.

Aside #2 : at this point I'll mention there are some weird things with pretending we're predicting a Gaussian, because the values are only discrete 1-5. Instead of just averaging together the centers of the preds, we could treat each one actually as a probability spectrum, and add the spectrums. This is messy, I'm not going to get into it here.

Now let's look at a totally different way to do the same thing. Again we're trying to find local weights for the different predictors. This time we're going to think of it as a local regression problem. Instead of finding global coefficients with an lsqr to weight our predictors, we want to find a bunch of regions, and find coefficients in each region.

There are a few ways to do this kind of local regression. The first is to use a classification learner. First let's simplify and just worry about the "N movie slope-1" and "N user slope-1" predictors, since they are the most valuable and it gives us just two categories, let's call these "M" and "U" here. We look at all our training samples and see if the actual value is closer to the M or the U value. We label each value with an M or a U depending on which is closer. Now our goal is to train a learner which can look at the local neighborhood and predict a category, either M or U.

To train this learner we want to use the same kind of things as we used to estimate the local MSE - stuff like the distance to the similar user and the similar movie, the sdevs, etc. Also note that this learner must be nonlinear - if it's just a linear learner like a simple NNet or LSQR then we may as well just do a global linear fit of coefficients. We have a bunch of conditioning values for the learner, these are the axes of some high dimensional space. In this space are scattered the training samples labelled with M's and U's. We want to find the pockets where there are lots of M's or lots of U's and put down splitting curves. The ideal solution for this is an SVM (Support Vector Machine) with a radial basis kernel. SVM's are pretty ugly to train, so before you try to train one you need to try to get all your variables nice, get rid of any that are redundant, remove bias & scale. There are also ugly things about this, one is that to use an RBF machine you need to define a distance in this space, but your values are not in the same units and it's not obvious how to combine them.

I'm not gonna really talk about SVM's, you can follow the links below if you like, but I will wave my hands for a second. To do category prediction the simplest thing you can do is find a single plane that puts one category on the front and one category on the back. That's the basic linear categorizer, which can be found with neural nets among other ways. The basic trick of SVM's is to note that the plane test only relies on a dot product, and mathematically any time you have a dot product you can replace it with any other valid "kernel". This is the "kernel trick" and it's very common and useful. So for example a Gaussian RBF is a kernel, so it can be used in the exact same kind of "plane" fit; instead of the dot product you use e^(-k*(a-b)^2).

If you can make an SVM that makes guesses about where the good "M" and "U" regions are, you still want to actually weight each one rather than selecting. There are a few good ways to do this. One is the SVM can tell you the distance from a query point to the nearest threshold surface. You can then turn this distance into weights some way or other. Another way is instead of training one SVM, you make a bunch of random subsets of your training data and train N SVM's. Now when you query a value you query all N of them, and your weights are the # that voted M and the # that voted U. This method of making a big randomized ensemble and using it to vote is very powerful. It's also great in practice because SVM's are evil on huge data sets, so doing N separate trains on chunks of (1/N) samples is much better.
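
The ensemble voting bit, sketched; train_svm and svm_classify are hypothetical stand-ins for whatever SVM package you use :

import numpy as np

def train_ensemble(X, labels, train_svm, N=16):
    # N SVMs, each trained on a random subset of the data
    machines = []
    for _ in range(N):
        subset = np.random.choice(len(labels), size=len(labels) // N, replace=False)
        machines.append(train_svm(X[subset], labels[subset]))
    return machines

def mu_weights(machines, svm_classify, query):
    # weight for the M predictor = fraction of the ensemble voting "M"
    votes_m = sum(1 for svm in machines if svm_classify(svm, query) == "M")
    wm = votes_m / len(machines)
    return wm, 1.0 - wm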

Now the final way I'll talk about is using a Decision Tree to select areas and do local regression. Now, with the whole M/U labeling thing we could've totally used a decision tree there as well, so you could apply the same ideas. All of these techniques are just things in your bag of tricks that you can use anywhere appropriate.

A Decision Tree is basically just a binary tree on various attributes of the neighborhood. If we again think of this high dimensional space where the axes are the useful properties about our neighborhood, the decision tree is just a BSP tree in that space. We want to build a BSP tree that takes us down to leaves where the neighborhoods within a leaf have similar attributes. Trying to do this greedy top-down does not work very well because you have to search tons of directions in high-D space and it's hard to find axes that provide good separation. Instead what we're going to do is just randomly build a deep tree and then prune leaves.

To make our DT, we just start splitting. To make each split, we choose a direction in parameter space at random. We then find the centroid of the values in that direction and put a plane there, then go to each child and repeat. I made 8-deep trees (256 leaves). Now we want to collapse leaves that aren't too useful. The reason we need to do this is we are worried about overtraining. We want our tree as small as possible.

What we do is within each leaf, we do an LSQR to linear fit the predictors and find the best coefficients in that leaf. We store the error for these. Then for each node that's just above the leaves, we look at the error if we pruned it - we put the values together and do an LSQR on the union and measure the error there. It's very important to account for the parameters of the model when you do this, because the extra parameters of the leaves always let you fit the data regardless of whether it's a good leaf or not :

C = # of LSQR coefficients to weight predictors
P = # of parameters used in DT plane

Q(leaves) = (sum of errors in L0) / (N0 - C - P) + (sum of errors in L1) / (N1 - C - P)

Q(pruned) = (sum of errors in L0+L1) / (N0 + N1 - C)

L0,L1 = leaf 0 and 1
N0,N1 = # of items in leaf 0 and 1

prune if Q(pruned) < Q(leaves)
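
The prune test in code, with the leaf fits done by a damped LSQR on the predictor rows falling in each leaf (sketch; A0/b0 and A1/b1 are the predictor matrices & target ratings in the two leaves, C and P as above) :

import numpy as np

def fit_err(A, b, damping=0.1):
    # damped least squares fit in a leaf, returns the sum of squared errors
    x = np.linalg.solve(A.T @ A + damping * np.eye(A.shape[1]), A.T @ b)
    return np.sum((A @ x - b) ** 2)

def should_prune(A0, b0, A1, b1, C, P, damping=0.1):
    q_leaves = fit_err(A0, b0, damping) / (len(b0) - C - P) + \
               fit_err(A1, b1, damping) / (len(b1) - C - P)
    q_pruned = fit_err(np.vstack([A0, A1]), np.concatenate([b0, b1]), damping) / (len(b0) + len(b1) - C)
    return q_pruned < q_leaves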

These trees can kind of suck because they are randomly made, but as usual we can throw a big hammer at them. Just randomly make a ton of them. Then we can test how well they work by trying them on some training data. The ones that suck we can just throw out. The rest we can average.

BTW an alternative to the DT thing is a kind of k-Means. You pick seed samples, map each sample to the closest seed, this defines clusters. Then you do the LSQR on each cluster. Then to query you interpolate between the fits at each seed. There are various ways to interpolate. Some good ways are just to weight each seed using the distance to that seed using either 1/D or e^(-k*D). Again instead of trying hard to find really good seeds you're probably better off just making a big ensemble by randomly picking seeds and then throw out the bad ones.

The final usage looks like this : build a local neighborhood L[] and compute the basic CF predictors as well as the distances and other good parameters to select predictors with. This defines a point in parameter space. Use the point to walk down the DT (or ensemble of DT's) that we built to find a region. That region tells us coefficients to use to weight all our predictors, which gives us our output prediction value.


03-11-08

Netflix Prize notes , Part 4 : other stuff @@ ... coming soon ...

user bias from the value they were shown ; filling blanks ; dimensional reduction, k-Means ; SVD and NNet reduction ; similars that don't necessarily have great overlap ; hidden germinators.


03-11-08

Netflix Prize notes , Part 5 : References

Links :

Torch3 The Dream Comes True
TinySVM Support Vector Machines
The SMILES System
The Robotics Institute
The OC1 decision tree software system
TANAGRA - A free DATA MINING software for teaching and research
tan classifier - Google Search
SVM-Light Support Vector Machine
SVM Page of Keerthi's Group at NUS, Singapore
Support Vector Machines Platt
Support vector machine - Wikipedia, the free encyclopedia
Supervised Learning and Data Mining (Tom Dietterich)
Slope One Predictors for Online Rating-Based Collaborative Filtering
Singular value decomposition - Wikipedia, the free encyclopedia
Singhi papers
Rapid - I - Home
Radford Neal's Home Page
qinv.c
Publications, Roman Lutz
Problems of the multivariate statistical analysis
Principal components analysis - Wikipedia, the free encyclopedia
Predictive complexity
Performance Prediction Challenge Results
Penn Data Mining Group Publications
PDP++ Home Page
PCP - Pattern Classification Program
PB NNets
Nonlinear Estimation
Neural Networks & Connectionist Systems
Neural Network FAQ, part 5 of 7 Free Software
Neural Network FAQ, part 2 of 7 Learning
Netflix Update Try This at Home
Netflix Challenge
Moonflare Code Neural Network
MLIA Technical Papers
Michael Pazzani Publications on Machine Learning, Personalization, KDD and Artificial Intelligence
Machine Learning and Applied Statistics - Home
LSI - Latent Semantic Indexing Web Site
Linear discriminant analysis - Wikipedia, the free encyclopedia
LIBSVM -- A Library for Support Vector Machines
Learning with Kernels
LAPACK++ Linear Algebra Package in C++
Kernel Machines
Keerthi npa
Joone - Java Object Oriented Neural Engine
Introduction to Radial Basis Function Networks
Independent component analysis - Wikipedia, the free encyclopedia
Huan Liu
GroupLens
Geoffrey E. Hinton's Publications in reverse chronological order
Fast Artificial Neural Network Library
FANN Help
FANN Creation-Execution - Fast Artificial Neural Network Library (FANN)
ensembles decision trees - Google Search
Decision Trees
Decision tree - Wikipedia, the free encyclopedia
DBLP Gavin C. Cawley
Data Mining (36-350) Lecture Notes, Weeks 4--7
CS540 Fall 2005 Lectures
Collaborative Filtering Resources
Collaborative Filtering Research Papers
CiteULike bpacker's library
Chris Burges's Publications
Ben Marlin - Research
Balaji Krishnapuram
Atkeson Local Learning
ARTIFICIAL NEURAL NETWORKS
Amazon.com The Elements of Statistical Learning Books T. Hastie,R. Tibshirani,J. H. Friedman
Aldebaro's publications
- Netflix Prize Home


03-10-08

The sad thing is the work you get to do as a game programmer might be the best programming work in the world. The community of game programmers is great, they share information and help each other and are all good guys, it's very different than most industries, or even academia, which is super secretive and cut-throat and full of pompous jerks. When I say "game programming" people who don't know think that it's awful work, because they just know about the hundreds of bums doing gameplay code for EA. That's not what game programming is to me. Game programming is super cutting edge virtual world simulation and large database manipulation that has to be done on limited CPUs and memory. You get to do graphics, lighting, physics, collision, spatial indexes, animation, locomotion & motion synthesis, advanced AI, etc. etc. all these things at the absolute cutting edge and with hard efficiency problems. It's really super fun work.

Is there a way I could work on game technology stuff and not actually be making a game? (I've heard bad things about most of the game engine companies too)


03-09-08

I think the depression from not being able to exercise is starting to set in. I'm feeling occasional sudden attacks of misery, but most of the time I feel pretty good, like I'm making progress. Addendum : fuck, yes, I need to exercise really bad.

It's ridiculously gorgeous here today. Dolores is a crazy party today, tons of people, a DJ playing dance music, a big drum circle, people barbecuing. It's spring, but locals call it the "San Francisco summer" because this is the hottest and sunniest time of year. The hill above my house, around 20th and Sanchez, is covered with gardens and stairs, with loads of flowers in bloom and lots of fun little passages between the different levels of streets. It's one of the little-known gems of the city.

There are a million people out and about, so many gorgeous girls and people having fun and all hip and cool. It makes me completely miserable. Here you can't hide from the reality of how shitty your life is. In the suburbs, you can go to your job, get in your car, go home to your shitty wife or shitty friends and do nothing special and think your life is normal and fine. Here there are amazing people doing everything that I ever wanted in my life, and I'm not part of it, and I can't hide from that fact.

Sometimes walking down the street I feel like I'm somehow disconnected from this reality. To make a totally retarded pseudo-scientific analogy, I feel like I'm on a different Brane than everyone else, I'm offset by one centimeter in one of the higher dimensions, so I'm right there, right next to this reality, and yet cannot interact with it. Obviously that's not true it's just a depressive self-indulgence.


03-08-08

Sex Sim ADDENDUM : (scroll down and read the first post before this) :

I actually forgot to write about the most interesting thing. When you add actual STD transmission simulation you find that the "supervectors" can cause a critical point where the infection rate becomes nonlinear. This is super interesting, I guess this must be well known in infectious disease.

The simple model of STD's in the system goes like this : people are born infected with STD's at some small rate (this is just to seed the system, I use 0.1% of births infected so it's a very small number and quickly becomes irrelevant). At each coupling, if one partner is infected there's some transmission rate which determines if the other partner is infected. Something like 10% is reasonable for this; remember it's not really just one night of sex, it's a whole relationship. Same basic model as before. We then measure the percentage of people infected at time of death for both normals and sluts.

For example :

Varying the population fraction of sluts :

	population:1000
	lifetime:100
	couplingchance0:0.25
	couplingchance1:10
	relationduration0:25
	relationduration1:4
	infectedbirth:0.001
	transmissionrate:0.15

fracsluts	normalinf	slutinf	overallinf
0.05	 	0.234 %		0.70%	0.26%
0.10		0.327 %		0.86%	0.38%
0.15	 	0.531 %		1.14%	0.62%
0.20	 	0.995 %		2.06%	1.21%
0.25	 	2.754 %		5.21%	3.37%
0.30	 	10.720 %	18.18%	12.96%
0.35	 	23.612 %	36.35%	28.08%
0.40	 	33.670 %	47.81%	39.33%
0.45	 	42.281 %	56.58%	48.71%

You can see the overall infection rate grows slowly and linearly with a small "slut" population, but once the slut population hits a critical point (around 25%) it suddenly jumps. You can see it really well in the graph of slut population vs. overall infection %

The whole system behaves differently on the other side of critical; before critical, infection is roughly linearly proportional to slut fraction. Also the normal infection rate is 1/3 the slut rate. Above critical, the normals quickly catch up and their infection rate is no longer much less than the sluts'.

Of course slut population isn't the only thing that causes a critical point, the promiscuity of the sluts can have the same effect, and so can the transmission rate of STD's.

Varying the transmission rate :

	population:1000
	lifetime:100
	couplingchance0:0.25
	couplingchance1:10
	relationduration0:25
	relationduration1:4
	popfraction0:0.80
	popfraction1:0.20
	infectedbirth:0.001

behavior is the same in all cases :
normalsex 	fracslut	slutpartners
6.82	 	70.093%		22.129

transmission fraction vs. infected at death :

trans	overall infection
0.10	0.37%
0.11	0.43%
0.12	0.55%
0.13	0.68%
0.14	0.86%
0.15	1.30%
0.16	2.20%
0.17	4.84%
0.18	12.63%
0.19	23.65%
0.20	32.55%

You can see the critical point here really radically, infection proceeds in a normal linear way up to a transmission rate around 16% (0.16), and then suddenly hits a point where a massive chunk of the population is infected. And the graph is really striking : Graph of overall population infection vs. transmission rate

This is a bit interesting for thinking about pandemics and plagues too. The difference between an influenza that kills millions and one that spreads a bit then goes away may not actually be that big. If one is slightly above or below the critical point, that produces a massive change in results. What this means is that for near-critical transmission situations, even a slight reduction in the number of vectors can cause a huge drop in infection if you get below the critical point.


03-08-08

If you are going to be on TV, get somebody who knows something about clothes to help you pick your outfit. Don't wear all the same color. Don't wear a turtleneck. Don't wear a fine pattern that will alias. Do not wear anything poofy or see-through. Good god. I mean, sure you have bad fashion sense, that's fine, but don't you have any friends!?


03-08-08

Hello, I'm a girl that thinks she's way too good for you. My favorite things are cuddling and puppies and my friends. I spend my free time drinking beers with my roommates at the park. I'm not remarkably attractive and I don't really work out. I love Will Ferrell and Kate Winslet and Pan's Labyrinth. I think saving the earth is so important, I have an eco yoga mat. I'm a passionate liberal but I don't actually know anything about current events or history. I don't really have any skills and work as a receptionist or waitress. I'd like to be a writer/artist/musician but I don't really spend any time working on that, I just imagine it's going to somehow magically happen some day. I've got tons of student loan debt that I'm not paying off. I claim that I can cook but that pretty much tops out at scrambled eggs. I am so unique and special!! Some day a handsome prince is going to whisk me out of this life and treat me like I deserve.

(disclaimer : this is really not about anyone in my past, it's what I imagine all the girls that walk by are thinking)

I have no respect for these girls who have nothing to offer but think they deserve the best. Ideally great people get great people and losers get losers. Of course that doesn't happen reliably, and I'm not quite sure where I fit in that anyway; I swing between thinking I'm a fucking god and deserve the best, and thinking I'm a huge loser and should be happy if I ever find anyone again.

addendum : Obviously everybody wants to feel like they're special, but lots of people take it much farther and imagine that they're super stars, that they deserve worship and attention and so on. Even people who rationally know they aren't still want to be treated like stars by their friends and lovers. I'm not really down for that. I don't care to be treated like a VIP, I'd rather just stay with the normal people, and I don't need a big fancy life, just a little corner of happiness would be just fine. I'm also not down for helping other people feel special who obviously aren't.


03-08-08

Fucking Google Product search is so useless. If you just search for products half the results are fucking patents, the other half are fucking epinions pages, and the rest are fucking sold in the UK. I need a double shoulder support like this for use in sports, not one of the Thermoskin fucking things for warmth that google keeps telling me about. But I need one that's sold in the US cuz those UK prices are not good for me. I'm convinced I have some severe shoulder dysfunction and I need to wear something like this even to walk around the neighborhood safely.

I also don't know who to see about these fucking shoulders. Something is wrong with me, but nobody can fix it. The orthopedic surgeons take one look and go "no surgery" and then don't do anything else cuz that's all they know. The physical therapists say "do these exercises" and I do them and nothing changes. Fuck. The chiropractor says I have nerve impingement preventing energy from going to the shoulders, but he's a fucking quack that's inventing causes to match the symptoms.


03-08-08

There's this girl who lives in my building. Thanks to the thin walls and my windows in the front, I know pretty much everything that goes on in the building - I wish I didn't, I don't want to be a nosey neighbor, but it's impossible to block out of your ears when you can hear people talking. She has a steady boyfriend who lives with her, but he seems to leave pretty regularly, I dunno if he goes out of town for work or what. Literally every single night that he's not here, she has some new random guy over. She leaves the house around 10, usually gets back around 2, and then the new guy leaves in the morning. She's only lived here a few months and has had at least 20 partners that I know of. I assume she's going out to bars alone to find these guys.

She's a living example of a "supervector" in human sexuality. A "vector" here means a connection for disease transmission. If we think about the system of STD's in humans, humans are a bunch of random scattered points that don't transmit disease except through vectors, which are coupling events. I've long held a theory that there exist "supervectors" in the population which facilitate much wider transmission than simple coupling would indicate. Basically if you have a large population with a constant low coupling rate, STD's cannot survive. Talking about average rates is misleading, because the distribution of sexual partners is very skewed, particularly in women. There are lots of women with 3 or fewer partners, and then a very tiny minority with a huge number of partners, on the order of 100 over a lifetime.

So I just wrote a quick simulator. The simulation works like this : The population is randomly seeded. Old people die when their lifetime is reached. Couples break up when their relationship duration is reached. Single people randomly couple in each discrete timestep. When two singles meet they form a couple with probability equal to the product of their "coupling chances", Ci * Cj.

The population is divided into two groups, the "normals" and the "sluts". In this test the sluts are 10% of the population. The sim runs until steady state is reached and then we look at how the people have coupled. In particular we look at the average # of partners of each group, and the % of the normals partners that are sluts. If the populations behaved the same, the normals should sleep with sluts 10% of the time, since that's the fraction of the population.

Coupling chance captures how "easy" someone is, that when two random normals meet they often don't connect, but if a normal-slut or a slut-slut pair happens they are far more likely to engage. Relation duration captures the fact that normals tend to commit and stay together more, so that once the population is paired up, the single people left out are more likely to be sluts.
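
For reference, here's a stripped-down sketch of the sim in Python. The parameters match one of the runs below; one detail not specified above is which relation duration applies to a mixed couple, so this sketch just takes the shorter one :

import random

LIFETIME = 100
POP = 1000
FRAC_SLUT = 0.20
COUPLE_CHANCE = [0.25, 1.0]   # group 0 = normals, group 1 = sluts
DURATION = [33.0, 5.0]        # how long each group stays in a relationship

class Person:
    def __init__(self):
        self.group = 1 if random.random() < FRAC_SLUT else 0
        self.age = 0.0
        self.partner = None
        self.relation_left = 0.0
        self.num_partners = 0

def breakup(p):
    if p.partner is not None:
        p.partner.partner = None
        p.partner = None

def step(people, dead):
    for p in people:
        p.age += 1.0
        p.relation_left -= 1.0
    for i, p in enumerate(people):
        if p.age >= LIFETIME:                  # death : record stats, replace with a newborn
            dead.append((p.group, p.num_partners))
            breakup(p)
            people[i] = Person()
        elif p.partner is not None and p.relation_left <= 0.0:
            breakup(p)                         # relationship duration expired
    singles = [p for p in people if p.partner is None]
    random.shuffle(singles)                    # singles meet at random
    for a, b in zip(singles[0::2], singles[1::2]):
        if random.random() < COUPLE_CHANCE[a.group] * COUPLE_CHANCE[b.group]:
            a.partner, b.partner = b, a
            a.num_partners += 1
            b.num_partners += 1
            a.relation_left = b.relation_left = min(DURATION[a.group], DURATION[b.group])

people = [Person() for _ in range(POP)]
for p in people:
    p.age = random.uniform(0.0, LIFETIME)      # stagger ages at startup
dead = []
for t in range(5 * LIFETIME):                  # burn in to rough steady state
    step(people, dead)
dead = []                                      # then measure
for t in range(20 * LIFETIME):
    step(people, dead)
for g in (0, 1):
    counts = [n for grp, n in dead if grp == g]
    print("group", g, "avg partners at death :", sum(counts) / len(counts))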

Here are the results :


pop = 90/10
pop 1000
life 100
group 0 = normals
group 1 = sluts

Run 1:
coupling chance variation :
both relation durations = 10

couple0, couple1, partners0, percentslut, partners1
0.100, 2.500, 2.114 , 42.873% , 9.134
0.150, 1.667, 3.079 , 28.150% , 8.966
0.200, 1.250, 3.981 , 20.873% , 8.785
0.250, 1.000, 4.766 , 16.549% , 8.600
0.300, 0.833, 5.441 , 14.378% , 8.291
0.350, 0.714, 6.017 , 12.847% , 8.006
0.400, 0.625, 6.495 , 11.632% , 7.744
0.450, 0.556, 6.904 , 10.729% , 7.492
0.500, 0.500, 7.255 , 10.001% , 7.254

Run 2:
both coupling chances equal 0.5
relation duration1 = 5
relation duration0 variation :

duration0, partners0, percentslut, partners1
10, 7.734 , 13.503% , 10.905
15, 5.973 , 15.854% , 10.223
20, 5.013 , 17.773% , 9.790
25, 4.409 , 19.213% , 9.467
30, 3.958 , 20.419% , 9.201
35, 3.701 , 21.425% , 9.054
40, 3.413 , 22.322% , 8.857
45, 3.158 , 23.198% , 8.677
50, 3.046 , 23.793% , 8.579

Run 3:
coupling chance 0.25,1.0
relation duration1 = 5
relation duration0 variation :

duration0, partners0, percentslut, partners1
10, 5.293 , 22.934% , 14.228
15, 4.439 , 25.573% , 13.835
20, 3.916 , 27.523% , 13.544
25, 3.565 , 29.236% , 13.346
30, 3.311 , 30.550% , 13.183
35, 3.112 , 31.635% , 13.052
40, 2.951 , 32.528% , 12.925
45, 2.839 , 33.321% , 12.842
50, 2.754 , 33.699% , 12.768

Run 4:
coupling chance : 0.25,1.0
relation duration : 33, 5
slut population fraction variation :

slutfrac, partners0, percentslut, partners1
0.050, 2.814 , 18.761% , 12.409
0.100, 3.188 , 31.200% , 13.099
0.150, 3.526 , 40.320% , 13.640
0.200, 3.844 , 47.589% , 14.080
0.250, 4.148 , 53.546% , 14.454
0.300, 4.441 , 58.587% , 14.763
0.350, 4.714 , 62.880% , 15.026
0.400, 5.000 , 66.790% , 15.271
0.450, 5.282 , 70.379% , 15.481
0.500, 5.565 , 73.629% , 15.671

The sim confirms what you would obviously expect, which is that a normal is much more likely to have sex with a slut than their small fraction of the population would suggest. The average partner counts here for sluts are not even that extreme; obviously if I crank up the promiscuity of the sluts the numbers get even more radical.

One thing that may be surprising is that even with only 5% of the population being sluts, 20% of the normals' partners are sluts. Also even with equal coupling chances, relation duration alone creates a big bias.


03-08-08

Gary Gygax's death has spawned a lot of fond reminiscing about D&D all over the internet. Something that's not being said much is the sad way that Gary's work has been sullied over the years. Gary knew that the important thing was providing a basic structure for human creativity, that D&D was really a kind of communal story-telling, and the important thing was the fantasy, not the numbers.

Even before Wizards, TSR started driving D&D into the ground trying to scrape out a buck. They released a ton of box sets and new campaigns and new worlds, trying to sell more books. Gary left TSR around this time and didn't get a ton of money. Then Wizards bought them and did a good job of bringing a little energy back to the brand, but also began to release version upon version to sell more books, and added more and more complication and numbers crunching. Original D&D is a very inflexible system, you pick a class and roll your die and that's about it, you don't obsess over optimizing your character traits. The newer D&D is a Wizards points allocation system with all kinds of complexity that stat geeks can fiddle with.

Of course computers were the fatal blow for RPG's. I love computer RPG's, but they're really completely unrelated to what playing a real live RPG is like. It's sort of like playing solitaire vs. playing bridge.


03-06-08

The way you interview a candidate really makes a big impression. It's not really fair, because it's not very reflective of how the company is run in general, but if the interview process is a mess the candidate will think the company is a mess. Things I have often encountered :

Show up for the interview and nobody is ready to talk to you. You sit for a half hour or whatever waiting for someone to come and then it's just some HR guy and you just sit in his office for an hour.

You're dumped in a conference room with a programmer to interview you and he has no idea who you are or what to ask you. Or he's ridiculously junior and has no clue of how to qualify you. They obviously haven't read your resume before the meeting and have not prepared for you at all.

They schedule phone interviews and then don't call, or are just like a nightmare to schedule, like "can you do it at 8 AM? no? well how about 7 PM?". Or the ridiculously no-notice scheduling, like "I have a spot in 10 minutes, can I call you then?". WTF, give me a range of reasonable hours I can pick from; let's not send specific times back and forth forever.

They fly you somewhere for an interview then give you minimal or shitty help finding hotel/transportation etc. That should be all completely mapped out for you so you feel taken care of. They make you pay for expenses and require receipts for reimbursements, and don't pay for weeks unless you bug them. Relic did that to me and I don't think I ever got reimbursed.

Interviewing was one of the things Oddworld did really well. It doesn't take any more time or money to do it professionally and it creates such a better impression that you don't have your head up your ass.

The other thing I see even from pretty smart programmers is asking questions that are totally opinion/style questions and acting as if their way is the One True Way and anyone who doesn't get it is not a great candidate. They might for example ask you to write a class factory and if you don't use a Meyers style "Singleton" pattern they think you "don't get it". Not only is that dumb, it's just a horrible way to qualify people because it isn't testing anything except whether they agree with your style choices.


03-05-08

Won sent me this Wired article on Netflix Prize . As usual it's too basic, it just tells you the things that are completely obvious and stops before it gets interesting.

One thing I was talking to checker about the other day is the way users' ratings are highly affected by the rating that was predicted for them. He claimed this makes recommender systems broken. I contend that the recommender is working totally fine, the problem is the Web 2.0 web site that lets you see the guessed rating before you rate yourself.

There are a few different psychological factors at work. One is that the guessed rating becomes a baseline. Then if you like it you want to rate above that baseline, and if you didn't you rate below it, so you're not really using the absolute scale. The other is that seeing the rating before you watch the movie preconditions your response; it tells you something is supposed to be really great and you are biased based on that.

Did I never write a description of my ideas on Netflix? I can't find it anywhere in my old rants. I should do that.


03-04-08

I had this ridiculous like 70k todo list text file. I decided it was unwieldy so I split it into like 10 files based on categories. Now I can never remember what file I put something in and go searching around for things, and wind up keyword searching my todos which is totally fucking broken.


03-04-08

On the plus side, I'm now an expert in shoulder rehab. On the minus side, one of the things I've learned is that you really need to lay off it way longer than you think you need to, and that is absolutely what I don't want to do.


03-04-08

I just started whistling "If I were a rich man" and went immediately into the fucking Gwen Stefani song. UGH. That song is lost to me now. There should be a law against pop musicians taking great old songs because they completely ruin the memory of the original song in the collective mind.


03-03-08

FUCK FUCK FUCK I'm so retarded.

After doing a hard HST workout in the gym and then going to a Yoga class in the morning, all of which was pretty good, this afternoon I decide it's a nice day I'm gonna go run around in the park. I took the rugby ball to toss around to myself and practice hands. So I'm running around being a goof like I do, and I decide I'm gonna practice some ugly bounce recoveries. So I start flipping weird bounce grounders and sprinting after them and trying to make a clean pickup on the run. The field at Dolores is absolutely treacherous, it's full of pot holes, so I make a throw and go running after it, and make a hard swoop to pick up the ball and slip on one of the huge divots and go flying. I put my left arm out to brace the slide, and when I landed on it my left shoulder made a huge POP. (for those of you who remember my shoulder injury last year, that was on the right).

So now I'm icing my shoulder and taking a ton of advil. Best case I'm out of commission for a week. Worst case is much much worse. I'm sore and I wish I had a lover to take care of me and put my bandages on. When you're sick or injured is when you find out just how alone you are. This is so fucked. I'm so fucking stupid. I always do risky shit that I don't need to because it's "fun". Screw fun. When I hurt my right shoulder last year, I was laid up and acting like a baby, and Dan took care of me. I probably didn't need that much help, but it's so nice to be taken care of by someone who loves you when you're hurt.

If I can't do physical things I have no joy in my life. If I can't work out and run around, I have nothing to do, I'll go insane. This is a bad time for this to happen. Plus I have a free week of Yoga right now and it's going to go to waste because I don't want to risk putting weight on my left shoulder. All the group activities I've been trying to make myself do to meet people are physical things. Now I'm going to sit at home alone and stew. FUCK.


03-03-08

I've done Yoga like three times now so I consider myself one of the world's experts on Yoga, and I'll tell you a bit about it.

I keep noticing similarities between Yoga and modern fitness training. It's interesting when you see an ancient practice that has already got things that scientific people have only recently figured out. In particular I'm thinking about the way stretches in Yoga are done rhythmically, and generally with muscular opposition. This is opposed to the stupid old way that people stretch which is just passive stretching. Yoga is a lot like "PNF" stretching, which is one of the new ways that modern understanding advocates stretching. There's a great link on my fitness page about modern stretching technique.

So, it's interesting when these ancient practices that don't really understand the science have come up with the right answer over time through trial and error. You see it a lot in cooking too, trying to figure out the science of how to logically cook things rarely works, and if you just copy what the French worked out over hundreds of years you'll probably be better off. On the other hand I think it's dangerous to look at these examples and assume everything they're doing is right.

It actually reminds me a lot of how species evolve. You'll see these adaptations happen over time that just make them amazingly perfectly suited to their environment, but then there are also totally random bizarro adaptations that don't help at all. I think all these sorts of ancient practices are similar: 50% of the practice really doesn't make sense.


03-02-08

I think when I get a job I will release the full source code for my poker apps, since at that point I won't care any more if it gets out and my accounts get locked and the games get ruined. If the high stakes poker guys only knew what I had they should pay me millions to NOT release my code. I wonder if there could be any way that I'm liable for breach of the terms of service with the poker sites. Most of them are offshore entities that are illegal in the US so I don't think they can practically come after me. I might need to investigate that a bit more.

To be clear, I personally never used my apps to make money, other than a form of HUD just like PokerAce. The apps were sort of fun research projects & something I considered selling. When I played high stakes I just played with my brain. What always concerned me is what other less scrupulous people might do with the apps, which is why I never released them.


03-01-08

Why isn't rent a bidding system like house sales are? I just watched our neighbor's apartment get rented out, and the day it went on the market some 30 people came by to look at it. It was very disturbing because the stupid landlord wasn't here so they would show up and bang on my window to be let in. I'm sure if the landlord took bids for rent there would be a price war and it would go way above asking. I suppose there may be laws against this.


03-01-08

buzzinfly is an awesome weekly mix by the great DJ Ben Watt. It's sort of minimal house. I guess it's called "deep house" though I don't get in what way it's "deep". This week part 1 is meh but part 2 is pretty rad.


03-01-08

I've got this problem where my laptop screen is broken and I need more disk space, it would be probably like $1000 to get everything I want. Spending that is pretty retarded though, I should just buy a brand new laptop. But then I have to set it up, config all the little things I do, and install all the software I use, and that takes days and days, the time cost of that is immense. So I'm faced with a tough decision that has no urgent time pressure on it and do what I usually do, which is nothing.


03-01-08

I just did a box jump to my waist height. That's not actually that great, there are people who can do like 60 inch box jumps (no run up allowed), but I still think waist high is pretty good considering that my vertical leap is like 1 inch.


03-01-08

San Francisco is the best clubbing city in the US, by far, by really far. I've heard good things about clubs in foreign lands but nothing comes close here. There are several venues that are just the right size IMO, big enough dance floors (eg. not just bars), but not super huge, I don't like those giant factory clubs. There are lots of great sound systems, partly from the Burning Man influence which creates lots of high tech resident DJ groups here. There are plenty of clubs that are totally devoid of the whole velvet rope bottle service pretentious nonsense scene which totally ruins places like Vegas and Miami IMO. There are also plenty of clubs that don't have a strong pickup scene, where the people just want to have fun and dance which is nice, which also means you can avoid the whole popped-collar cruising guy crowd.

I kind of like going out dancing alone, cuz it means I can cruise around and dance with anyone. As much as I enjoy going with a girlfriend, you do get kind of bored of dancing with the same person all night if it's like 4 hours or more. I guess going out with guy friends would be even better cuz you don't dance with each other and you can chat if you want, but I've never known guys cool enough to do that without totally bringing me down and making me feel repressed. I do wish there was like a form I could sign to swear that I'm not cruising for chicks, and then I'd get a special hat or something that lets them know I just wanna dance and they can relax and stop playing stupid games.

The ideal thing would be to go with a girl but then be able to mingle and trust each other and come back to each other frequently. It's unlikely I'll ever be with a girl that would really be okay with that, and even if I was I doubt I would really be okay with it unless she handled herself just right.


03-01-08

I've been thinking of going to therapy but I can't imagine that it will help. For one thing I know various people in therapy and they don't seem to really be improving the way they live. For another I'm plenty introspective already, I don't need somebody to dig out what's wrong with me, I pretty much know it.

The whole "don't care what other people think" prescription is completely inane. Of course you should care what other people think. People who don't are called sociopaths. Well then, "don't let it bother you when you're rejected or laughed at". Of course it should bother you. It's one of the worst insults anyone can give you. Getting love and approval from other humans is one of our strongest happiness triggers, of course getting shut down or put down should hurt immensely. You can make it not hurt by closing off and not feeling anything, but that's not an improvement. In general there's this spectrum that you can either be really emotionally exposed and react to everything, or you can totally shut down and not feel anything. Both ends of the spectrum suck. I believe it's a fallacy that you can somehow be open to feeling the positives and yet not care about the negatives, they necessarily come together.


02-29-08

I'm totally going gray now. It's alright, I think I'll be able to rock the sexy salt & pepper look. I think my dad actually looked better once he went salt & pepper, he lost the nerdiness and became distinguished. The downside is it will be harder to fit in with the youngsters, not that I really do anymore.

On the one hand, spending all your time worrying about getting girls is pretty retarded, it seems so shallow and unproductive. On the other hand, finding the right mate is probably the most important single thing in life, so not spending most of your energy on doing that is pretty retarded.


02-29-08

"proselytize" is mis-spelled so often that Google won't even correct it. Some common ones you can find are "prosletize" and "prosletyze", "proseltize" and "proseltyze". The last one actually has a Yahoo Answers page where people define it and fail to point out the horrible misspelling.


02-27-08

New phone etiquette : when you answer a cellphone and you have the caller Id so you know who it is, you should not just say "Hello?" you should say "Hi Chris" , which tells the caller you know who they are and they don't have to identify themselves.


02-27-08

Almost every single bar in America is retarded. Small tables and chairs do not belong in bars. The proper furniture for bars is picnic tables. (one long bench seat along the wall is okay too). Picnic tables accommodate big groups, and also encourage mingling if couples or a few small groups sit at them. Most other countries in the world get this right.

Also, Karaoke is fucking disgusting unless you are a Japanese business man in a suit with your coworkers. The proper way to sing in a bar is for everyone to sing folk songs together in a regular bar.


02-27-08

Some notes on bike balancing :

I wrote before about how the crucial thing is to hold the brakes and keep tension in the pedals so that you can transfer power through your legs to balance the bike. (search the archive for this). Some other little notes :

You don't want to come in too hot. Do most of your stopping before you try to go into a balance and get yourself really slow, then prep for the balance, then do it. Coming in hot to a full stop balance is a very advanced skill.

The balance is much easier if your two feet are level with each other. Once you're slowed down and getting ready to balance, get your feet in this ready position. There will be some slack in your freewheel when you come to a stop, so to compensate prepare your front foot slightly higher, then as you pull the brakes to go into the balance push forward with your front foot to take up the slack and get your feet into the level position.

It's important that your whole body is tensed but not stiff. You need to be supple, sprung, coiled with muscle but still loose. You are connected to the bike at your hands and your feet, so the energy is transferring through your entire chain, so you want elbows bent, knees bent, abs and back tight, shoulders slightly forward.

You obviously should be standing up to balance (standing and getting your body loose and flexed is part of the "ready position"), but you don't want to go too far forward. You don't want all your weight to go over the front wheel. For one thing you need weight in your legs because that's where your power is to control the balance, but also you need the front wheel to still be easy to maneuver. You may want to do some slight twitching of the front wheel, and that only works if your weight is centered back a bit.

Don't steer too much. You want to stay active and loose with the steerer, don't try to keep the front wheel perfectly straight, but at the same time, don't try to steer to balance. The balance is best achieved through lots of little movements, shifting your hips slightly, leaning your head, and steering ever so slightly. A good correction involves all these things, and not one big movement with any one of them.

Addendum : it helps a lot to look down when you balance. I like to focus on the spot where the front wheel touches the pavement.


02-27-08

My blog has always been a totally inappropriate combination of technical and personal. I guess that's what made it somewhat interesting back in the day, though this style is totally ordinary now. Actually writing interesting personal things is becoming more and more difficult. When I started, the internet was a much smaller place. Pretty much only highly technical people were on the net then, eg. none of my family or girlfriends or their families. I could write about my personal life without it coming back to hurt the people around me. My own inclination is to be completely honest all the time, I'm not really embarrassed of anything in my life, I would rather just write about everything, but people around me wouldn't understand and would take things the wrong way, and unfortunately I have to worry a bit about people's feelings. Despite how it may seem I actually think about other people's feelings a lot; I think of tons of stuff I would love to write about, because it's interesting and amusing, but stop myself or delete it for the sake of others.


02-27-08

Things I wanted to do while on "sabbatical" :

1. Drive around the US and camp and hike. Did it, but did it much faster than I dreamed. Got bored and came home.

2. Travel, to South America and South East Asia. Didn't do it. Couldn't find anyone to go with and didn't want to go alone.

3. Get back in shape, in particular work on my back/neck/posture/etc which was in serious trouble when Oddworld closed. Sort of did it, though wound up spending most of my time and energy on shoulder rehab which is still fucked. My back/neck was way better for a while when I was off the computer, but I've wound up spending a ton of time sitting at the computer again and those problems come right back.

4. Play team sports. Didn't do it.

5. Be more social / work on getting out of my shell. Made some serious effort at the beginning but failed and gave up. Regressed completely with Dan.

6. Make an indie game or some kind of other fun software project for profit. Didn't do it. Made a few little aborted attempts but never thought of a game that I actually wanted to make.

7. Get better at poker and try to beat the highest levels. Mmmm.. never really did this the way I wanted, partly because I was spending a lot of energy on Dan and for me anyway it's impossible to get to the highest levels and have a relationship at the same time. I did play and beat 2000 NL online, which is pretty good, so we'll call this "sort of done".

8. Try some other careers to see how I would like them - like being a chef, a farmer, a journalist. Didn't do it. I did cook more but that counts for nothing. I also wrote some articles to submit to editors to try to get a writing job, but never actually submitted any of them.

9. Have lots of sex with different girls. I never really had a partying / screwing around phase, since I was working on software pretty much full time all through college, in addition to lots of extracurricular physics work. This has never been anything that I really wanted to do, but what annoys me is that I always have the thought somewhere in the back of my head that I didn't do it, so I wanted to do it just so that thought would go away and stop taking up brain space. Didn't do it.

10. Learn to play guitar. Made a few half hearted attempts. Failed.

11. Learn to make music on the computer with some kind of tracker thingy. Spent a few hours on this a few times and was flummoxed by how unusable these apps are. I actually started this again in the last few days, but it's still a "Didn't Do" at the moment.

12. Figure out what's going to make me happy and how I should spend the rest of my life. Didn't do it.

13. Be a really great boyfriend for Dan and succeed in having a healthy open relationship. Didn't do it.

Addendum : I know potential employers are reading this garbage brain dump of mine at the moment so I should note the thing I actually did accomplish was writing a bunch of code. Some stuff I did :

	Netflix Prize - a machine learning application on a huge dataset
	Guitar tuner - developed an unusually accurate and stable pitch detector
	Image doubler - in progress work on maximum likelihood super-resolution inference
	Sports bettor - solve sparse matrices and use machine learning to automate bets
	Limit poker bot - developed a Bayesian limit bot, as well as an anti-TTH perfect simulator
	Poker site helper app - developed for potential sale, learned win32 hooking
	Short stack poker bot - developed optimal game theory solution for simplified NLHE


02-26-08

Holy SHIT the Google Maps "Take Public Transit" thing is UNBELIEVABLE. My god it's like people with half a brain are actually writing software and making it do the things that it obviously should do. It seems to only have BART in the system, but I'm sure that better data coverage of other systems will be forthcoming.


02-25-08

Whoah, I just realized why house music resonates so much with life in the city. It's always in my head when I walk around and I automatically start beat-boxing. It's because of the footsteps! If you walk fast like a city person, you're walking left-right-left-right , dum-dum-dum-dum , about 120 bpm. The beat of your steps automatically sinks into your brain and sets up a rhythm base with each step and you walk quick bum-bum-bum-bum , and then your brain automatically starts adding the off beats, bum-chik-bum-chik-bum-chik-bum-chik .

ps. no I was not on drugs when I wrote this. I wish I was.


02-25-08

A "Chicken Burger" is a Chicken that lives in a city. A chicken served in the style of a Hamburger should be called a "Chicken Hamburger". The 'ham' does not mean that it's made of ham, and "Hamburg" is not separable.

Addendum : Ah crap, it appears this rant was a case of Cryptomnesia , perhaps from Demetri Martin, but probably the original source is even older.


02-25-08

Anybody out there work for Pixar or ILM ? Any idea if that would actually be a fun job, working on tools/algorithms/rendering stuff?


02-25-08

I realize now that I've been extremely depressed for a long time. I guess it was obvious, but you don't really see it when you're in it. The thing is it comes on so gradually. One day you're happy, then life just beats you down and you make less and less effort, and day by day very gradually you give up and lose hope, and you start sleeping more and not wanting to go out, and just not looking forward to anything and thinking everything is shit and everything is pointless. But it doesn't feel like depression, you just think that's the way life is, because it came on so gradually that you can hardly remember when it was different, and the happy times in the past seem like fleeting moments in a sea of gray. Then something happens to shake it up and all of a sudden the birds are singing and the world is beautiful and you can't wait to get started on all these new projects and you're looking around for new fun things to do and trying to pack your days full of new experiences and new people - and only then do you realize that wow, do normal people actually feel like this all the time?


02-22-08

Bleck I'm being such a retard about working out and the worst thing is I know it. I'm lifting pretty hard, but I'm just not eating enough, and I can't add muscle unless I eat more. My bodyfat% is already super low and it's just not possible to put on more muscle at my bodyweight. I really want some more shoulder muscle to protect my injuries but it's not gonna happen unless I add 500 cals a day or so. The problem is my stomach tells me to stop eating and I listen to it; partly it's all the years of trying to eat right and stay lean, it goes against all my instincts to force myself to overeat. I mean, I have no problem overeating once in a while for special occasions, but that isn't what I need, I need to slightly overeat every day, and it's really hard. In the mean time all this working out is pretty much completely pointless because I'm not providing the fuel to create new muscle.


02-22-08

Yes, yes, I really am looking for a full time job ; I keep telling people this but they don't believe me. Yes I really want to go to work and write code. I realized a little while ago that I was basically working full time writing code anyway, so I may as well get paid for it. Plus I miss having other people to work with.

No, head-hunters, that does not mean you should send me all the shit jobs you're trying to staff. I'm still not looking for anything awful. Game company jobs in general don't appeal to me that much; the management is too shitty, I don't want to get involved in another badly run disaster where I have to work my butt off and don't make any money. I want to ideally work on interesting things where I can write fun code, either at a big safe corporation or a tiny tiny place with friends. I don't want a management job or anything with a lot of friction and non-technical headaches. I'd love to be able to just write interesting code and work with smart people.

I'm gonna be talking to Google and NVidia ; if you have any other good suggestions that fit the bill let me know.


02-20-08

People at the GDC ask me what I'm up to these days. I tell them I'm learning to do the Cabbage Patch, from this amazing teacher .


02-20-08

It's annoying that one of the most common things you want to do with floats is also one of the least precise : summing or taking an average of an array. Say you have a bunch of values that are all similar, and you want the average and sdev. To get a good sdev you need a very accurate average. If you have say a million or more numbers, each one becomes tiny compared to the sum, and if you just do a normal sequential sum you make total shit. Of course if your numbers are floats you can do your sum in doubles and mostly not have a problem, but if you really need fine accuracy you need a better solution.

Of course the solution is both easy and obvious. In the common/standard case that the numbers are all very close to the average, a simple recursively dividing sum works perfectly :


// sum by splitting the range in half recursively, so the partial sums
// being added stay similar in magnitude
template < typename _t >
_t recursive_sum(const _t * begin,const _t * end)
{
	int count = (int)(end - begin);
	switch(count)
	{
	case 0: return _t(0);
	case 1: return begin[0];
	case 2: return begin[0] + begin[1];
	case 3: return begin[0] + begin[1] + begin[2];
	case 4: return (begin[0] + begin[1]) + (begin[2] + begin[3]);
	default:
		{
		const _t * mid = begin + (count/2);
		return recursive_sum(begin,mid) + recursive_sum(mid,end);
		}
	}
}

ADDENDUM : Okay this is a pretty dumb way to do it. It actually works fine and is plenty fast but the better way is this Kahan Summation thing. BTW the PDF linked at the bottom of that page is an awesome awesome doc on floating point.

(UPDATED 3-04-08) : Here's a version that roughly matches the signature of std::accumulate ; the type inference from accum is a little bit of a subtle thing that can bite you in code if you're not careful. BTW I flipped the sign compared to Wikipedia cuz it just makes it really obvious what's happening IMO.


template < typename Iter , typename T >
static inline T kahan_accumulate(Iter begin, Iter end, T accum) 
{
	T err = 0;	// running compensation : the low-order bits that got rounded off so far
	for(;begin != end;++begin)
	{
		T compensated = *begin + err;	// add the lost bits back onto the new term
		T tmp = accum + compensated;
		err = accum - tmp;		// (accum - tmp) + compensated = what just got rounded off
		err += compensated;
		accum = tmp;
	}
	return accum;
}

template < typename Iter >
static inline typename std::iterator_traits<Iter>::value_type
kahan_accumulate(Iter begin, Iter end)
{
	return kahan_accumulate(begin, end,
                         typename std::iterator_traits<Iter>::value_type());
}
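
A quick usage sketch (the vector here is just a stand-in for your data, and assume the obvious <vector> include) ; note that passing 0.0 for accum makes T deduce to double, so a float array gets accumulated with double precision, which is exactly the accum type inference subtlety mentioned above :

	std::vector<float> vals;	// ... filled with your float samples
	double sum = kahan_accumulate(vals.begin(), vals.end(), 0.0);
	double avg = vals.empty() ? 0.0 : sum / vals.size();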

There are a bunch of ACM papers on this stuff; if you're serious you should go there. For our rough graphics kind of use this Kahan thing is going to be fine. The "Cascading Accumulators" method seems to be the awesome way to go but it's a bit complex to implement. Apparently Kahan also has less error than the recursive way, though in practice they're both tiny.


02-18-08

I discovered there are all these amazing videos of Foundries on YouTube made by amateurs, many of them made in the last days of US and UK steel production. You can start here : Youtube Iron Pour and then browse to similar videos and find tons of amazing stuff.

I also found this performance art group : Iron Guild which looks totally awesome, and some stuff one of them made : Iron Art


02-17-08

I can't figure out a trick to remember how to spell "guarantee". I always want to write "garauntee" which IMHO makes much more sense and you can remember it because it has "aunt" in it making the same sound.


02-17-08

Poker and coding both have this really sucky nonlinear return property. I guess physics did too and probably most really complicated mental endeavors. Basically if you are thinking about it 100% of the time, then you are in the zone and your efficiency per unit time is E. So like if you work for 8 hours you accomplish 8*E. If you try to do other things and balance your life more, you don't just spend less time at E, the time you do spend is at way way below E. The problem is it takes a huge amount of time to get back in the zone when you're out of it, to remember all those many connected things that you can't really write down and all the hunches and intuitive senses of the problem that you had in the zone. If you spend 6 hours, maybe you lose 3 to getting back in the zone and then only have 3 hours at full efficiency. It makes me feel like there's no point to doing it unless you go all out.

Coding is bad this way (and worse in a big complex code base or a really hard problem), but Poker is worse. One of the things that makes poker so bad for this is the lack of good feedback due to the high variance, which makes it very tricky to get yourself back in the zone because you'll get both false positive feedback and false negative feedback. As you work back into the zone you may have multiple false plateaus where you think you've made it, only to realize later that you were wrong. It literally takes weeks to get back in the zone, which makes dipping in and out of play pretty impractical.


02-17-08

I found a bunch of craigslist ads for free massages for "fit men". Hmm.. I'd like a free massage, I wonder if they'll agree to use latex gloves.

So I guess I should play soccer. I've been looking for some kind of fun team sport, and there are just tons of pickup soccer leagues, and that's a relatively low-injury sport aside from twisted ankles and such.


02-16-08

XLR8R has a shit ton of good free music to download. Stop playing those god awful party mixes you make yourself that have no flow and jump from Conchords to Britney Spears to Gnarls Barkley. Download these pre-mixed podcasts and become a party music super hero.


02-13-08

Watching the Nature episode "Crash" about the Red Knot seabird that lost 90% of its population in almost one generation due to human disruption of their food source. It reminds me of something I wrote in "Fitness" about how the human anatomy has evolved to not build unnecessary muscle. You see with this occurrence of the Red Knot how catastrophic events can cause massive rapid genetic selection. Up until the 1970's or so, everyone thought of genetics as gradually evolving over the aeons, lots of tiny changes adding up. We now know (and it seems quite obvious in hindsight) that in fact the overall genetic makeup of the population makes very rapid and massive changes in response to cataclysmic events. Lots of minor differences evolved into the population over the years, and they weren't strongly selected. Then suddenly something happens and every individual without a certain gene is dead, either from famine, or a disease, or a change in a food source or a predator. With humans, there have been countless famines (and a recent ice age and plague) which have wiped out the individuals that built muscle more easily and required more food to survive.


02-13-08

I put a bookcase in my kitchen out of necessity and have all my pantry goods on it; at first I thought it was pretty ghetto but now I realize it's TOTALLY AWESOME. Everything is easy to grab and you can see what you have. It's even better than those restaurant style racks cuz those racks are too big, the nice little cubbies of the bookshelf are perfect. It made me realize the whole obsession with "kitchen cabinets" is totally retarded; cabinets suck, it's way more functional to just have no doors and have lots of shelves, and IMO it looks much better too if you have a cool functional kitchen like that. The kitchen cabinet obsession is like the 50's gauche bourgeois manicured lawns and plastic on the sofa.


02-13-08

It's so hard to find good people to play board games with. They can't be too geeky, you have to be able to talk about things outside of games, cuz the actual game isn't that fun it's just a venue for socializing. On the other hand, they have to be smart enough to play well or it just feels pointless. They also have to be smart enough and sharp enough to move really quickly. Most board games are just excruciatingly boring if people take too long, or aren't even paying attention, so they don't realize it's their turn. Sometimes I wind up turning into the "Dealer", going "okay it's your turn now, okay that's what you do? you're done? okay next person", which is not really a good spot to be in, it makes you an asshole and means you can't chitchat at all yourself, but god damn people when it's your turn you just FUCKING GO, and then you can go back to chit-chatting when you've made your move. Most board games are super high variance and also quite shallow. They're much more fun if you treat them as a quick semi-random match, and play over and over quickly.


02-12-08

Hi-Def San Francisco has some amazing videos; check out the "Best Of" section to watch the fog roll across the city.

Refocus Imaging and Helicon Soft are two different ways to do 3d-photography where you can set the focal plane (or arbitrary focus surface) after the fact.


02-09-08

I bought a Samson C01U USB microphone. I have to say I'm quite disappointed and would not recommend it. Most of the reviews I read were very positive, and in theory you get better quality and save money by buying a USB microphone so you don't have to have a separate A2D and Mic PreAmp and all that stuff. The problem is it's really really quiet, it's made for you to put your lips almost right on the mic. It works fine when you're that close, but anything further is inaudible, and if you pump up the volume after recording it's just noise. Like even one foot away and the sound goes to shit. God damn it.

Sweet page on how mics work


02-09-08

When skiing I thought of this invention. Ski Goggles should have a nose cover. Not a sealed nose cover but just a piece of plastic or maybe cloth attached over the nose. It would protect the nose from cold wind and also from the sun, since the nose is the most likely thing to get sun burned. I drew a picture . On the right you see normal goggles, on the left the deluxe goggles with Nose Guard. Patent Pending!

It's such an obvious and seemingly awesome thing to have there must be some reason why it's not a good idea or it would've already been done.


02-09-08

Writing the date in my rant entries is just about my only connection to the calendar.


02-07-08

Went skiing the last few days. WTF is up with Truckee? First of all, TRAFFIC CIRCLES !? I actually think traffic circles are generally superb and good for traffic, but Americans are retarded around them, and to have a tourist town that's a winter destination with icy roads covered with traffic circles is so insane. Second, when did the downtown area get all fancy? I remember Truckee as a total trashy mountain town but now there's this strip of Aveda spas and all that kind of yuppie shit. Bah humbug. The skiing was mostly pretty great.

This image doubling thing I'm thinking about is mostly called "Super Resolution" in the literature (all 2 papers). I hope that name doesn't stick. There is a rather separate thing which is also called "Super Resolution", which is creating hi res images from crappy video sequences. It seems to be further along in development. Sina Farsiu has good papers on the video kind of super resolution. There's even a free research application called MDSP that does video super-resolution.

It's funny that this kind of video resolution enhancement was being done in the movies way before it was actually being done on computers in real life. I mean, it's no surprise that movies made up semi-sciency mumbojumbo for plot purposes, what's funny is they got it almost exactly right, and I'm sure most computer people at the time knew it was a totally reasonable thing to do, it just hadn't really been fully worked out yet. Unfortunately for all the replicant hunters out there, we still haven't perfected the zooming around the corner technology.


02-04-08

LOL internetaments. For some reason I keep trying to spell "schizophrenic" like "pschyzophrenic". I was writing it today and it looked wrong so I typed it into Google to spell check as I often do. Google tells me the right spelling, but I also notice a few results down is a link to my fucking rants where I used the wrong spelling. If I keep spelling things wrong I can be the definitive site for words like "seperate" and "beaurocracy".

In the future everyone will have a completely unique name so that they're easily searchable.


02-04-08

I've got two computer problems maybe someone can help me with.

1. "Shortcut Keys" in Windows. I'm talking about the thing where from an app's right-click Properties dialog you can set a "Shortcut Key" and set a key combo. I use this a lot as a way to key-chord to apps. It works great except once in a while it stalls out like crazy. You hit the key combo and nothing happens, and in fact the whole taskbar becomes nonresponsive; all the other apps still respond fine, and the process monitor doesn't show any CPU spike. A minute later suddenly windows goes ahead and executes the shortcut key. Note that just switching to a macro program doesn't fix it, because the shortcut key thing is smart enough to switch you to an existing instance rather than start a new one.

2. Getting files from Perforce that don't exist. This is some problem with the VC-Perforce integration. As usual I can't get any help from either of them. This happens when you have a VC project which is under source control, but some file in the project does not exist either on your disk or in the depot. You start up VC and load the project, and VC goes into this mode "Getting files from source control" where it tries to get the file from the depot. If the file is in the depot it gets it fine, but if it's not in the depot, VC appears to hang. I've never let it sit long enough to get past this, but I've watched the disk activity while it's hung in this mode and it appears that VC is doing a recursive scan of the entire depot and touching every single file; I have no idea why it does this and I can't stop it. It's pretty fucking annoying. There is a workaround which is to open the vcproj and figure out what file is missing and make an empty file with that name or remove it from the project, but that's a pain and I'd like to just not have this stupid hang.


02-04-08

My next project is to get back on the Image Doubler and see if I can actually make the predictive/learning doubler do something worthwhile. I went looking for a big repository of hi-res research/reference images a while ago and couldn't find a damn thing that was decent, it's all super low res or super small collections, like 16 pictures or less. Yesterday I had a "duh" moment. Torrents! Go to the torrent sites and filter for pictures. Of course there's a lot of stupid pictures of ugly girls, but there's also awesome stuff like a package of 800 photos of nature at 1920x1200. Each pixel in each color plane is a training sample, so that's 5.5 billion training samples right there which should hold me for a while.

Ideally I'd get the uncompressed so I don't have spurious JPEG artifacts in my images gunking things up, but it's hella hard to find a good uncompressed image data set.

Ideally I would like an image training set which statistically exactly mirrored the overall statistics of all digital images in existence (weighted by the probability of a user interacting with that image). That is, if 32% of the neighborhoods in all the images in the universe were "smooth" , then in my ideal training set 32% of neighborhoods would be smooth. The average entropy under various predictors would be the same, etc. Basically it would be an expectation-equivalent characteristic sample. Some poor graduate student needs to make that for all of us.


02-04-08

I updated the guitar tuner app with slightly better harmonic/fundamental tracking. It's still not ideal, that's sort of a messy heuristic problem, I could do better than I am doing but whatever it's not hurting the app much so there it is. There are a lot of advantages to all the noise tolerance work; I can tune my guitar while cars are driving by outside, which totally freaks out my handheld crystal tuner thingy; also I can just use the super shitty mic that's built in to my laptop and actually tune pretty well. One ugly thing is that the guitar's low E is very close to the kill noise frequency threshold I'm using (it's 82 Hz and I kill everything below 77 Hz) which can cause some inaccuracy on the low E if you aren't careful, cuz part of the spectrum tail gets cut off.

For my own reference in case I come back to it : there are a few issues with the whole fundamental frequency thing. I mainly modulate out octaves, so F - 2F - 4F harmonics aren't a problem, it gets really screwy when you switch from 2F to 3F , and actually even 5F can be the highest peak. If you get a spectrum where the 3F peak is biggest, but there are still solid 1F and 2F peaks, that's pretty easy to detect. Another case is sometimes the 1F peak is completely missing, but if there's a 2F peak you can still deal with that by looking at 2/3 or by seeing that the spacing between peaks is F. The really really evil case comes from time evolution. Sometimes you can strike a note and it starts up with strong peaks at 1F,2F, and 3F. Over time each of their amplitudes changes and you can get cases where the 1F and 2F peaks almost completely disappear and you're left with only a 3F peak. To handle that correctly you have to use temporal continuity and just assume that that sound is still acting like a 1F pitch (sometimes the 1F and 2F peaks come back and the 3F peak dies out as time keeps evolving). The spacing between the peaks in the frequency domain is a pretty way to get the base pitch (you imagine an additional peak at zero frequency).
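
To make the peak-spacing idea concrete, here's a hypothetical little helper (not the actual tuner code, just the idea) : append the imagined zero-frequency peak, sort, and take the median gap between adjacent peaks as the fundamental. With enough harmonics detected this tolerates a missing 1F peak.

#include <algorithm>
#include <vector>

double fundamental_from_peak_spacing(std::vector<double> peaks)
{
	if ( peaks.empty() ) return 0.0;
	peaks.push_back(0.0);			// the imagined peak at zero frequency
	std::sort(peaks.begin(), peaks.end());
	std::vector<double> gaps;
	for (size_t i = 1; i < peaks.size(); i++)
		gaps.push_back(peaks[i] - peaks[i-1]);
	std::sort(gaps.begin(), gaps.end());
	return gaps[gaps.size()/2];		// median gap ~= F when most harmonics are present
}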

I hate relying too much on temporal continuity because it means that short-time errors get persisted. It's much cooler if I can do everything without relying on the previous frame. You do get a tiny short-time error from fourier analysis of transients, but that's negligible. The real problem comes from real world short-time sounds. With tuning a guitar, when you first strike a string there are lots of funny sounds from your finger rubbing the string and perhaps the string slapping the body and all that stuff that only lasts a second or so. If you start building in too much continuity, you could pick up weird pitches in that mess and then try to persist them. Really once the clean note starts sounding you want all that junk to be forgotten.


02-02-08

There's something about Jamie Oliver that I really hate; maybe it's his big pouty lips, or maybe all his cute little Britishisms, or the way he says "yeah" all the time as punctuation. Anyway, his show "Jamie at Home" is probably the best cooking show on TV at the moment. (apparently Jacques Peppin has some new shows but the PBS here doesn't carry them). Jamie's sous-chef Gennaro is absolutely amazing, such a weird character.


02-02-08

Musharraf has certainly done a lot of things we should be unhappy about, but we have to remember he's in a very difficult situation. He has to contend with four very separate and powerful forces in Pakistan. 1) The middle class and the lawyers, which want democracy, rule of law, and stability; they would mostly vote against him if there was a good alternative. 2) The devout muslims and the tribes, which want Sharia law and independence from the government; this faction could easily become very violent if upset, and is dangerously close to a majority which means if there were true open elections they might win. 3) The military. This is Musharraf's base (remember he became president in a coup and his power is still backed by the military) - but the military is quite independent, and portions have strong ties to the tribes or the ISI; if anyone tried to curtail the power of the military they could face a coup. 4) the ISI (the intelligence service), which has strong ties with the tribes and the Taliban, and is very independent and certainly responsible for many political assassinations in Pakistan; again moving against them could easily lead to disaster.

The US official policy is that we want real democracy in Pakistan, but behind closed doors the CIA and State aren't so sure about that. An election could easily destabilize Pakistan if someone is elected that is disliked or moves too fast against the Islamic extremists, the military or the ISI. On the other hand if anyone comes to power that works too closely with any of those factions that could also be bad. Musharraf is doing a half decent job at the moment of keeping all the factions reasonably pacified.

Much like Iraq, Lebanon, and Palestine, pushing for elections too soon could be a disaster. First you need some level of stability and rule of law, protection against assassination, a fair election system, the confidence of the populace so that no big groups boycott the vote, etc.


02-02-08

If recommender systems like Netflix were augmented with a Network of Trust, one thing you would want is to remember where you got the recommendation from. That way when I watch a movie that was highly recommended for me, and I hate it, I can go back and see the link that recommended it for me and mark it as "don't trust" (for movies). In theory a simple collaborative filtering system will eventually learn your similarity with everyone else in the system, but in practice you have to have a very large # of movies in common before that becomes accurate, far more than normal people watch. If you allow the user to provide extra information you can converge on their tastes much faster.


02-01-08

What should our next president actually do?

1. Get rid of the G.W.B. tax cuts. Politically it would be best to leave the tax cuts for everyone making $100k or less and phase them out above that. Restore the estate tax and dividend taxes. Our government desperately needs money to fix this country and putting back those taxes on the rich will hurt the economy the least. It's a damn shame those tax cuts ever passed.

2. Figure out a way to get out of Iraq and restore our military's morale and fighting capability. This is a mess so I won't go into big details.

3. Get more troops in Afghanistan. Hopefully get an administration with more leverage internationally that can get some more support from NATO. Then we have to do something about Pakistan which is a huge problem. Getting the Pakistani government to really go after the tribal areas is impossible, but it might be possible to get more of the Pakistani military on the border to try to reduce the amount of border crossings. Make it clear to the tribal leaders that they are not being invaded and they are free in their domains but they aren't to cross the border. That certainly won't work without a much bigger force on the Afghan side of the border.

4. Actually do something to improve education in the US. "No Child Left Behind" is such a retarded worthless law, it creates standards and holds schools to them, but A. the tests are horrible measures of learning and B. doesn't provide funding. We need a lot more federal money for schools. We need legal maximum class sizes, maybe 20 kids per class. We need to stop cramming together kids of different aptitudes; maybe require that the slower kids get even smaller class sizes. Offer more after school programs, particularly for the kids that don't have good home environments. Then we need to start thinking about doing something with the colleges. College prices are skyrocketing and at the same time the quality of education is going way down with lots of kids getting rubber stamp degrees.

5. Spend on infrastructure. This is just something we should be doing all the time that we cut at some point to save money in budgets. Infrastructure spending is great for the overall economy; no it's not a great way to get out of recession because it's not fast enough, but it provides lots of steady jobs, and the infrastructure that results makes many other businesses work better. We need roads, bridges, flood control, commuter trains, etc. etc.

6. Get a handle on health care. This is another huge mess so I don't want to get distracted by it. I will just point out how big of a problem it is. We already spend around 15% of our GDP on health care, which is far far more than any other country (many Western countries are around 5%). Furthermore, health care costs are still rising much faster than inflation. This is a huge drain on our economy that needs to get fixed. I think the basics of Hillary's plan are the right way to go, but reducing all the unnecessary expenditure and fraudulent profit taking is going to be politically very difficult in our pro-shyster culture.

7. Do something serious for the environment and energy independence. I don't want to oversell the importance of global warming, but it's high time we started doing something serious to reduce fossil fuel usage; if you like you can pretend that the reason is for strategic energy independence. It could also be very good for the economy. There are lots of obvious steps that make sense. IMO the best thing would be huge taxes on water, electricity and gas, which would allow the market to figure out the best ways to reduce use and make it all financially driven. I'm sure that's a political non-starter, so instead you could start penalizing companies that are extremely inefficient with their energy use or waste disposal. Put a bunch of money into researching alternative energy. The amount of money needed for this is microscopic compared to the benefits. A few billion dollars is a HUGE amount to fund research, but is a drop in the bucket compared to our oil imports. A few cents of gas tax could easily fund alternative energy research. Tax subsidies for the shitty alternative energy solutions we currently have do nothing but give free money to businesses.

What are our current political parties actually about? They talk about this and that, but what have the parties consistently actually accomplished in office :

Republicans are :
tax cutters
deficit spenders
grow the defense/military budget immensely
anti-environmental controls, anti-parks
anti-gun-control
anti-abortion
anti-gay-rights

Democrats are :
????

It's hard to remember anything that the Democrats have really done in recent history. I mean if we list things done under Clinton it was mainly free trade, cutting welfare, etc. I don't know if we can really draw a distinction in terms of foreign policy, both parties tend to fuck around all over the globe in rather retarded ways that generally have negative long term consequences. Both Clinton and Bush allowed genocides to occur on their watch. I'm not really sure what the Democrats stand for other than being "not Republicans".


02-01-08

I'm coining the use of the word "sitcom" as an adjective. Sitcom means basically "formulaic and an exaggeration of a stereotype, similar to the lowbrow repetitive comedy on bad sitcoms". An example would be like if your wife is mad at you for leaving the toilet seat up, you could say "god honey that is so sitcom". Another would be like if a girl goes black and she never goes back, that would be sitcom. It can also refer to just banal typical daily living stuff, like "yeah we went over to Glen and Margie's house, we talked about their kitchen renovation, it was totally sitcom".


02-01-08

The fully Bayesian approach to modeling says you should consider all possible models, and weight each model by the probability that it is the right one.

You've seen past data D. You know for a given model the chance that it would produce data D is P(D|M) , so the probability of M being right is P(M|D) = P(D|M)*P(M)/P(D)

Now you want to make some new prediction given your full model. Your full model is actually an ensemble of simple models all weighted by their likelihood. The probability of some new event X based on past observations D is thus P(X|D) = Sum on M { P(X|M) * P(M|D) }

P(X|D) = Sum on M { P(X|M) * P(D|M) * P(M) } / P(D)

Note that the P(M) term here is actually important and often left out by people like me who do this all heuristically. It's a normalizing term in your integration over all models, and it's pretty important to get right depending on how you formulate your model. Basically it serves to make the discretization of model space into cells of equal importance so that you aren't artificially over-weighting models that are multiple covers. You can also use P(M) to control what models are preferred; for example you might make smoother or simpler models more likely. eg. if you're modeling with a sine function you might make lower frequency waves more likely, so you make P(M) like e^(- freq^2) or something. This makes your model less prone to overfit noisy data. This whole step is referred to as "estimating the priors" ; that is the a-priori probabilities of the parameters of the model.

A common shortcut is just to use the most likely model ; this is aka "maximum likelihood". This just means you pick the one model with the highest P(M|D) and use that one to make predictions. This is potentially giving away a lot of accuracy. People often do a lot of hard work and math to find the model that is maximum likelihood, but we should remember that's far less desirable than weighting the whole ensemble.

In simple situations you can often write down P(D|M) analytically, and then actually explicitly integrate over all possible models. It's common to use models with Gaussian probability spreads because they are normalizable and integrable (over the infinite domain), eg P(D|M) of the form e^(-(D-M)^2).
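
Here's a toy of that, totally made up for illustration : the "model" is just a constant value m, P(D|M) is a product of e^(-(d-m)^2) terms, and P(M) is an explicit prior that mildly prefers small m. The model space is crudely discretized and the ensemble prediction is the weighted average :

#include <cmath>
#include <cstdio>

int main()
{
	double data[] = { 1.1, 0.9, 1.3, 1.0 };		// the observed data D
	double predictive = 0.0, totalWeight = 0.0;

	for (double m = -5.0; m <= 5.0; m += 0.01)	// crude discretization of model space
	{
		double logL = 0.0;
		for (double d : data)
			logL += -(d - m)*(d - m);	// log of P(D|M) = prod e^(-(d-m)^2)
		double prior = exp( -0.1 * m*m );	// P(M) : mild preference for small m
		double w = exp(logL) * prior;		// P(D|M) * P(M)
		predictive  += w * m;			// each model predicts its own m for the next value
		totalWeight += w;			// sums to (a discretized) P(D)
	}
	printf("ensemble prediction = %f\n", predictive/totalWeight);
	return 0;
}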

An alternative to Maximum Likelihood is to use just a few different models and estimate the probability of each one. Again this is sort of a strange discretization of the model space but works well in practice if the chosen models span the state space of reasonable models well. This is usually called weighting "Experts" , also called "Aggregating" ; in Machine Learning they do a sort of related thing and call it "Boosting". There are a lot of papers on exactly how to weight your experts, but in practice there's a lot of heuristics involved in that, because there are factors like how "local" you want your weighting to be (should all prior data weight equally, or should more recent data matter more?).

In all the weighting stuff you'll often see the "log loss". This is a good way of measuring modeling error, and it's just the compression loss. If you think of each of the models as generating a probability for data compression purposes, the log loss is the # of bits needed to compress the actual data using that model as opposed to the best model. You're trying to minimize the log loss, which is to say that you're trying to find the best compressor. Working with log loss (aka compression inefficiency) is nice because it's additive and you don't have to worry about normalization; eg. after each new data event you can update the log loss of each model by just adding on the error from the new data event.

As I said there are various weighting schemes depending on your application, but a general and sort of okay one to start with is the previous probability weight. That is, each model M is weighted by the probability that that model would've generated the previously seen data D, eg. P(D|M). If you're working with a loss value on each expert, this is the exponential loss, that is e^(-loss). Obviously divide by the sum of all weights to normalize. Roughly this means you can use simple weights like e to the minus (# of errors) or e to the minus (distance squared).
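
A tiny sketch of that scheme, with the names made up by me : each expert accumulates log loss in bits as data comes in, and its weight is 2^(-loss), which is the P(D|M) "previous probability" weight (same idea as e^(-loss) with the loss measured in nats) :

    #include <vector>
    #include <cmath>

    struct Expert
    {
        double loss;   // accumulated log loss in bits, additive per data event
    };

    // probAssigned = probability this expert gave to the event that actually occurred
    void UpdateLoss(Expert & e, double probAssigned)
    {
        e.loss += -std::log2(probAssigned);
    }

    // weight each expert by 2^(-loss) = P(D|M) , then normalize by the total
    std::vector<double> ExpertWeights(const std::vector<Expert> & experts)
    {
        std::vector<double> w(experts.size());
        double total = 0.0;
        for ( size_t i = 0; i < experts.size(); i++ )
        {
            w[i] = std::pow(2.0, -experts[i].loss);
            total += w[i];
        }
        for ( size_t i = 0; i < w.size(); i++ )
            w[i] /= total;
        return w;
    }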

My favorite tutorial is : Bayesian Methods for Machine Learning from Radford Neal

If you like to come at it from a data compression viewpoint like me, the canonical reference is : Weighting Techniques in Data Compression , Paul Volf's thesis. This work really shook up the data compression world back in 2002, but you can also look at it from a general modeling point of view and I think it's an interesting analysis in general of weighting vs. selecting vs. "switching" models. Volf proved that weighting even just two models together can be a huge win. When choosing your two models you don't really want the best two; you want two good models that don't contain any inherent inefficiencies, but you want them to be as different as possible. That is you want your two models to sort of be a decent span of "model space"; whereas the single best model might be in the middle of model space, you don't want that guy, you want two guys that average to the middle.

A Tutorial on Learning With Bayesian Networks by David Heckerman is more mathematical and thorough but still half way readable.

More modern research mainly focuses on not just weighting the experts, but learning the topology of experts. That is, optimizing the number of experts and what parts of the past data each relies on and what their models are, etc. A common approach is to use a "Simple Bayesian Estimator" as a building block (that's just a simple model that generates random numbers around some mean, like a Gaussian), and then figure out a topology for combining these simple bayes guys using product rule of probabilities and so on.

Here's sort of a cool example that game-tech graphics people should be able to relate to : Bayesian Piecewise Linear Regression Using Multivariate Linear Splines . The details are pretty complex, but basically they have a simple model of a bunch of data which is just a piecewise linear (planar) fit, which is just C0 (value continuous but not derivative continuous). They form an ensemble model in a different way using a monte carlo approach. Rather than trying to weight all possible models, they draw models randomly using the correct probability weighting of the models for the data set. Then you just average the results of the drawn models. The resulting ensemble prediction is much better than the best found linear model, and is roughly C1.

This last paper shows a general thing about these Bayesian ensembles - even when the individual models are really shitty, the ensemble can be very good. In the linear regression models, each of the linear splines is not even close to optimal, they don't search for the best possible planes, they're just randomly picked, but then they're combined and weighted and the results are very good. This was seen with AdaBoost for classification too. With boosting you can take super shitty classifiers like just a single plane classifier (everything on the front is class A, everything on back is class B), but if you average over an ensemble of these planes you can get great fits to all kinds of shapes of data.

I wrote about a way to do simple weighting back on 12-16-06 using very simple variance weighting.


01-31-08

This Salt-cured steak method looks interesting. The last steak I did I tried a quick refrigerator "dry age" ala Alton Brown or Jeff Steingarten ; it turned out pretty great but it's hard for me to tell how much of that was the drying. These methods don't really accomplish the aging part of the enriching of flavor, they just do some drying, which reduces water and thus concentrates the flavor of beefiness.

The RealThai Food Blog is what travel food blogs should be.


01-31-08

Historically economies have always gone through boom & bust cycles. I guess it's an open question whether that's a fundamental characteristic of the system or if it's just a flaw we can fix. In any case, the US Fed in recent years has been on an unprecedented campaign to attempt to smooth out the down cycles in the economy. Certainly in theory if you have an economy that is going up and down but always trending slowly upwards, you should be able to reduce the swings and make it just a more stable rise. That, however, is not what we have been doing. Bernanke has specifically said that he doesn't think it's right for the Fed to try to prick bubbles or slow down overheated growth. On the other hand, any time we have a slow down we pull out all the stops to try to get the engine moving again.

Part of the reason for the subprime bubble was in fact the Fed's actions trying to get us out of the dot com bust, when they put the funds rate down to 1%, which pumps a shit load of cheap capital into the big banks, then the banks have to find instruments to put that capital in to get a profit. At some point you pump in so much cheap capital there just aren't enough good loans out there to make, so you start looking harder and harder to find somewhere to make a profit from that free money. I have no idea how this "zero recessions" philosophy will work out in the long run, nobody does because it's never been done before, we're in the middle of a big economic experiment.


01-30-08

I bought an Ab Wheel so I can work up to an ab wheel rollout . OMG it's harder than I expected. I'd love to be able to do an "L-Seat" too but hamstring flexibility is gonna be my limiting factor with that for the foreseeable future. That site (Beast Skills) is great, and I find it a fun motivating factor to pick some cool bodyweight move and work up to that. I would absolutely love to be able to do a "flag", but I need to get a handstand pushup first, and before I even start that I need to do more shoulder rehab :(

I should emphasize that the Ab Wheel and pretty much everything on Beast Skills is a super-advanced physical exploit and should not be attempted by beginners and should never be done without proper warmup and technique. Now of course I'll probably go ahead and ignore my own advice and hurt myself, but you've been warned.


01-30-08

I need to get a job. If I could find something where I just write interesting code and don't have to manage or do meetings or any of that junk, that would be pretty sweet.

In the last year I've rediscovered my love for computers. I've gotten back to what I fell in love with originally - just writing little apps for myself. I think of some algorithm or project that I'd like to have for myself, and then I just go research it and write it, and FUCK YEAH I can make this little metal box full of electrons do what I want, it's so much fun.


01-30-08

Apartment search notes for myself. The little things to remember that you might forget.

A gas stove is crucial; a broiler is nice; a decent fume hood & vent is pretty important; a garbage disposal would be nice. Laundry in the building is a pretty big plus. Hard-wired smoke detectors are a disaster.

Quiet is important. Avoid street-side, especially first floor. Don't be adjacent to the building's main door or mail boxes - the mail boxes are surprisingly loud. Don't be adjacent to the laundry facility, people will use it at all hours. Similarly don't be near the dumpster. Don't be adjacent to parking, especially if car headlights will shine in your windows. One thing you'd really like to test is how noisy people walking on the floor above you is, or you can avoid that by getting a top floor unit.

Insulation and heating. Those single unit gas heaters really suck, especially if there's no good air flow around the unit. Bottom floor units will have really cold floors, upper floor units will benefit from the neighbors' warmth.

A professional complex or management company is almost always a better landlord than an individual; they will follow all the laws, they will have a full time repair guy who will come quickly and fix anything, they will respond to complaints, etc.

Being south-facing to get lots of light is semi-important to me.

In general the most important thing is what it's like to sleep there; good sleep is crucial, without it your whole life sucks. The second most important thing is location. Walking to bars & groceries is a big plus. The third most important is the kitchen because I love cooking. The least important is what the living rooms are like and what it looks like; these things make a big impression when you're touring apartments but don't actually matter too much, you can make do with whatever and can always decorate to hide problems.

Also for allergic people like me, the allergen character of a place is really important. I think that pretty much rules out old places. I love the architectural details and character of old buildings, but they are full of mystery allergen sources. In particular holes into the wall spaces are really bad, as are those gas room heater things that so many apartments in SF have. Carpet is really bad, especially the retarded places that have carpet in bathrooms or kitchens. Wood floors and tile and waterproof materials are much better. Curtains need to be washed often.


01-29-08

My bud Drew finished his XBLA game "Poker Smash"; it's an idea he had when we were working at Oddworld. We used to play Tetris Attack on an old SNES to burn off steam (ba-kak!), and we were all totally into poker and thinking about capitalizing on the poker boom, Drew had the idea that it would be cool to do like a Tetris Attack but where you have cards coming down instead of just colors so you can make straights and flushes and stuff, not just match-three. So Drew actually took the dive and went indie, and against everyone's expectations he actually finished! Along the way it became a lot more than that simple idea, the bombs and freezes and stuff actually make the gameplay really different. Anyhoo, help the man out and Digg Drew's Game ; more at Void Star website


01-27-08

Okay I put up the binary of the guitar tuner app

It's a little ugly but I think it's pretty functional. There are more things I could do on it for sure but I'm gonna try not to waste any more time on it.

This was built on the IGJ4 sound engine which is really cool and should really be cleaned up and released to the public domain some day. There's no IGJ4 web page, but it was made by Atman Binstock so I'll link to him.

My references :

YMEC software - Introduction to simple sound measurement using a notebook computer
The Scientist and Engineer's Guide to Digital Signal Processing
Tartini - The Real-Time Music Analysis Tool
SWA Services Page
Ron Nicholson's Digital Signal Processing Page
PARSHL: An Analysis/Synthesis Program for Non-Harmonic Sounds Based on a Sinusoidal Representation
Music Technology Group - News
Music 147 Final Exam Preparation
James A. Moorer's Published Articles Page
Introduction to Pitch Detection
Introduction To Filters
High-Speed Analog-to-Digital Conversion - Google Book Search
Harmonic oscillator - Wikipedia, the free encyclopedia
Guitar tunings - Wikipedia, the free encyclopedia
Goertzel algorithm - Wikipedia, the free encyclopedia
DiscreteTFDs -- Time-Frequency Analysis Software
Digital filter - Wikipedia, the free encyclopedia
Butterworth C
biquad filter cookbook
Biquad C


01-27-08

So this interesting dude Bret sent me some audio help and I checked out his site : WorryDream ; there's a lot of entertaining stuff there, but the "Bio" section is one of the coolest geekisms ever.

Jammin Power / James A. Moorer has a bunch of cool articles from the infancy of digital music processing (the scanned older papers are the good ones); they're both amusing for the historical context, and also because the technical content is quite clear.


01-26-08

BTW if anybody has good links to highpass/lowpass/bandpass filter info, drop me a line. I've read some articles about generating filter coefficients, and it's all very complicated. I found some Java applets to give you coefficients, but they take the frequency as a numeric parameter which is horrible, I want functions of the frequency. I have some code from Jeff which appears to be a biquad filter. Biquad filters are 2nd order discretizations of the traditional analog filters; biquad filters are a general class which can be guaranteed to be stable, unlike other filters which can blow up badly at poles (not all biquad filters are stable but the ones you want are). The site "MusicDSP" has this cool BiQuad Filter Cookbook . Actually there's a lot of fun stuff at MusicDSP. Their other code section has a lot of random contributed code snippets, some of which are intended as optimizations and are totally retarded, some of which are interesting.
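
For reference, here's roughly what a cookbook-style biquad highpass looks like as a function of the cutoff frequency. These coefficient formulas are from my memory of the BiQuad cookbook, so double check them against the real thing before trusting them :

    #include <cmath>

    struct Biquad
    {
        double b0,b1,b2,a1,a2;   // coefficients, normalized so a0 = 1
        double x1,x2,y1,y2;      // previous two inputs and outputs

        void SetupHighpass(double sampleRate, double cutoffHz, double Q)
        {
            const double kPi = 3.14159265358979323846;
            double w0 = 2.0 * kPi * cutoffHz / sampleRate;
            double alpha = std::sin(w0) / (2.0 * Q);
            double c = std::cos(w0);
            double a0 = 1.0 + alpha;
            b0 =  (1.0 + c) * 0.5 / a0;
            b1 = -(1.0 + c)       / a0;
            b2 =  (1.0 + c) * 0.5 / a0;
            a1 = -2.0 * c         / a0;
            a2 =  (1.0 - alpha)   / a0;
            x1 = x2 = y1 = y2 = 0.0;
        }

        // Direct Form I : one sample in, one sample out
        double Process(double x)
        {
            double y = b0*x + b1*x1 + b2*x2 - a1*y1 - a2*y2;
            x2 = x1; x1 = x;
            y2 = y1; y1 = y;
            return y;
        }
    };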

Well, all these biquad ones I've looked at are just junk because they're so low order. Low order means they introduce very little delay which is nice, but they're like super low-order shape approximations of what you want. Just playing around it seems like you need to get to like order 20 or more to get decent filter response that actually looks like what you want, eg. in my case a highpass with a very narrow transition band and a very quiet stop band. Also sort of an odd thing but it seems hugely preferable to filter in non-real-time. That is, instead of only filtering using past data, which is asymmetric and introduces a big delay, it would be better to filter time-centered on offline data.


01-26-08

This pitch detection play has got me thinking about two general algorithm/programming things that have been running around my head for a while.

One is how nice it is to work on things that humans do better than computers. Humans are just AMAZING at pitch detection. We can pick out the pitches of individual instruments or voices even in the middle of massive amounts of noise. Obviously there are tons of other things humans still do better than computers - play poker, compress text, recognize images, etc.

There are a few advantages to working on these things. For one thing, it lets you know for sure that there does exist a solution that's within practical reach. The human brain is complex but it's not many orders of magnitude beyond what a computer can do these days, so you know that if a human can do it, you could teach a computer to do it and actually get an algorithm that runs fast enough. If you didn't have that proof of feasibility, you might spend a long time working on a problem and find out it just can't be practically done. The other big advantage is that you can sort of ask yourself how you do it. Copying the way that humans approach the problem isn't always the best algorithm for computers, but it is an interesting reference and lets you sort of get a personal closeness to the problem. It also gives you a really good sanity check, when some retard says "I've got this image recognition algorithm and it's state of the art nyah" you can just go "look dude, any human can do better so obviously there's a lot of room for improvement". It lets you know that it's worth still exploring, which is one of the most useful pieces of information to have; you don't want to spend a long time researching a better way to do something if there might not be a much better way - knowing how close you are to optimal is very valuable information. eg. the great ability of humans to predict text tells us that state of the art text compressors are still missing something crucial and it's worthwhile to explore further in that field. If you just looked at the way they've been asymptotically improving you might think they were near a dead end.

The other general topic I've been thinking about is the "physics way" of exploring new algorithms. In physics when you're trying to write down the equations of motion (you usually want to write a Hamiltonian or a Lagrangian), if you don't know exactly what they should be, you take guidance from the units and the symmetries of the system. Weinberg does this beautifully in his QFT book, but Feynman does it all the time in Lectures, and actually it's a really common trick in fluid dynamics.

Just as an example, say you want to write the Hamiltonian for a harmonic oscillator and you don't know what it is. You know you have a mass M and a frequency W to work with. Of course you also have the momentum and position, P and X. Your H has units of energy, so what can you write down that has units of energy? Well, PP/M is one. You can also make a fake momentum Q = M*W*X , so QQ/M is one too, which is M*W^2*X^2 , which is in fact the right term. But, there's more. You can also make a unitless number from (P/Q). Any time you can make a unitless number you can stick it anywhere you want. If we were in 3d you would require symmetry under rotations; eg. if I rotate my coordinates H shouldn't change, which would mean that only terms in X dot X and P dot P would be allowed, but I could still make a unitless number from (PP/QQ) and screw around with that, though these terms have poles so we can guess they're not the system that we want.

The point of view in physics is that any equations of motion that you can write down which have the right units and symmetry are perfectly valid physical systems which might exist somewhere or model some real physical thing. If you want to model some specific physical system, you have to find the one of those possibilities that matches your situation. Typically we want to start with the simplest possible model and see what kind of behavior that gives us; adding more terms is certainly allowed but generally models some more complex system.

I use this same kind of thinking to explore algorithms. Say you're trying to write a correlation between two vectors X and Y. You want the output to be unitless, so it's independent of the units or overall scale of the inputs. You want it to be independent of the length of the vectors (in a simple linear sense, obviously it will be affected, but if you just repeat all the same values twice it should be unaffected). So, there are various things you can write down that make sense. The numerator should be something like < X * Y > or < X - Y >. To make it unitless and unaffected by overall scale you need to divide by something like ( < X^2 > + < Y^2 > ) or sqrt( < X^2 > * < Y^2 > ). You want symmetry in swapping labels X <-> Y so other forms are no good. All of these possibilities that are allowed by symmetry and units are perfectly viable candidates. There's something else which is totally an option which is subtracting off the average of each vector. Using the physics/symmetry point of view, we should consider all of these to be possibilities. Fortunately in computers we can easily just try them all and see what's best.
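
As a concrete example, here's the version with the means subtracted off and the geometric-average normalizer (this is just the standard correlation coefficient) - unitless, scale and bias invariant, symmetric in X <-> Y :

    #include <vector>
    #include <cmath>

    // assumes x and y are the same length
    double Correlation(const std::vector<double> & x, const std::vector<double> & y)
    {
        size_t n = x.size();
        double mx = 0, my = 0;
        for ( size_t i = 0; i < n; i++ ) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;

        double sxy = 0, sxx = 0, syy = 0;
        for ( size_t i = 0; i < n; i++ )
        {
            double dx = x[i] - mx, dy = y[i] - my;
            sxy += dx*dy; sxx += dx*dx; syy += dy*dy;
        }
        return sxy / std::sqrt( sxx * syy );   // result in [-1,1]
    }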

There are a few things that should raise your spidey senses. One is whenever somebody writes an equation which doesn't have the right symmetries or units or invariances. Like anything that's not immune to scale and bias in general should make you wonder why the units and zero-point have become crucial parts of your problem. The other thing is when someone is trying to write down an equation and just pulls some terms out, and you see that there are lots of other possibilities which are perfectly valid. Why was one chosen over the other? Often there is no good reason and you should consider the other possibilities.

A common example is averaging. You can take an arithmetic average : (X+Y)/2 , or a geometric average : sqrt(X*Y) , or a quadratic (rms) average : sqrt( (X*X + Y*Y)/2 ). All are perfectly valid depending on your problem - and in fact you can take the ratios of them to make unitless numbers which you can stick anywhere you want.


01-26-08

Some pitch detection followup :

I'm curious how those little hand held electric guitar tuner things work. I can't really find a description on the net, if anyone knows gimme a shout. They might use oscillators tuned to the notes and do some kind of interference thing? You certainly could make a pretty simple DSP thing using Goertzel response filters; you only need to track 2 filter responses, one on each side of the target frequency.

Paper : "A Smarter Way to Find Pitch" : this is basically just autocorrelation. They normalize in a slightly funny way, what they're doing is :

< x * y > / ( ( sdev(x) + sdev(y) ) / 2 )

That is, dividing the raw correlation by the average of the two standard deviations. The true mathematical correlation measure is :

< x * y > / ( rmse(x) * rmse(y) )

Which is a geometric average of the standard deviations rather than a linear average. It's unclear to me why you would choose one or the other, both are correlations in -1 to 1, but the latter is better justified as a true measure of correlation.

I also went ahead and did fractional bilinear bin indexing for autocorrelation, but the fact is it's just too noisy for that to be worth anything (also the parabolic interpolation is probably a better way to do the same thing).

Paper : "YIN, a fundamental frequency estimator for speech and music" by Cheveigne and Kawahara. The "YIN" method is autocorrelation with lots of heuristic improvements. One good idea there is that rather than sampling fractional bins you can just use parabolic fit of the autocor results to give you sub-bin accuracy. Another good idea is searching around trying different offsets for your autocor (sliding the phase of the autocor window) in addition to searching for different periods. In theory different offsets shouldn't matter but with noise and transients finding the best offset might help a lot. At the end they talk about noise and very ways to address it but it's all very nasty. This is one of those messes where it's hack piled on top of hack you just know this can't be the right way to approach the problem.

Paper : "Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates" : goes over mostly the same material I went over and is a pretty well written paper.

Most of the modern papers on the subject are about second level topics, like finding the fundamental frequency from a harmonic spectrum, or tracking pitches through time. There are lots of these, but the most elegant that I've found is "Maximum a Posteriori Pitch Tracking" which is quite solidly built. For their basic pitch likelihood they use the true autocorrelation.

There are some quite strange things with sound and pitch and human hearing. Especially with voice and some stringed instruments, the fundamental frequency, what we actually hear as "the pitch" can actually be very quiet. eg. you perceive the sound to be of pitch "F" , but the actual peak at F might be tiny, and instead there are peaks at 2F and 3F. Which is another weird thing, the sound at 3F is actually a different note (unlike 2F and 4F which are the same note at different octaves), but you don't really hear it as a different note, it's perceived as part of the timbre of the instrument. Over time as a string sounds, the intensity of the various harmonics can rise and fall; the fundamental might sound strongly at some point but may be totally quiet at other points; the funny thing is that we just sort of hear this as a pleasant variation of the sound, we don't hear the "pitch" changing at all. With a guitar the magnitude of the various harmonics can be very strongly affected by where exactly on the string you pluck it. If you pluck right in the middle of the string you will get more of just the fundamental. The sort of normal place to pluck is about 1/4 of the way along the string which gives you F and 2F very strongly, but quite often 2F is much stronger and we still perceive the pitch as F. I'm way out of my element on this stuff but it's interesting.

Okay, so 3F is the "Fifth" ; Wikipedia has a good page on Equal temperament . So like, when you play a "C" , often the fifth harmonic (a "G") is very strong at 3F, but if you play a "G", you won't hear a spike at "C", the fifth from "G" is a "D" , so hearing C+G = C with fifth , while hearing G+D = G with fifth, etc.

Also I found this research app : Tartini which is pretty sweet in theory ; it tracks pitches over time and makes nice graphs and can output musical scores and all that, it's got a nice useable GUI. Unfortunately they use a pretty ass pitch detection method, it's super noisy and not useable on my setup. If you had a professional mic and a sound booth and all that stuff it might be pretty nice.


01-24-08

Some notes on accurate pitch detection in sounds :

You have a sound buffer in time space buf[i]. You captured the last N samples. Obviously using more samples gives you more precision, but longer latency (less time resolution). This is like the Heisenberg uncertainty principle kind of thing. For accurate pitch detection, you have to eat some temporal delay, around 1 second, and use 16k-32k samples. (pitch resolution will be like 1/N - but if your sounds are shorter than N using too long of a buffer will give you lots of nasty noise from partials)

Obviously an FFT is the first place to start. If you have N samples, there will be N/2 frequencies in the spectrum. The bottom frequency is 0, and the top frequency is (samples per second)/2. The resolution is thus (samples per second)/N Hz. So for example if you have a 44k sample data and you do an FFT with 44k bins, your resolution is 1 Hz. That sucks bad so obviously we need to do more. Note that you're doing an FFT of only a portion of the sound, which is technically a "Short Time Fourier Transform" (STFT).

Before you FFT you want to apply a window to the buffer. The reason is that if you just FFT a raw sine wave, if it happens to line up exactly on one of the FFT bins you will get a nice single peak, but if it's between two bins, it will spread out in this evil spectrum. If you multiply the data by a window, then the spectrum will be the *convolution* of the original spectrum by the spectrum of the window. Nice smooth windows have fourier transforms that are like sinc functions, which act to sharpen the result around the true peak. In any case a "Hann" window (simple 0-1 cosine) works great.
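
Applying the Hann window is a one-liner per sample; here's a quick sketch (just the window - the FFT itself comes from whatever library you're using) :

    #include <vector>
    #include <cmath>

    // multiply the capture buffer by a Hann (raised cosine) window in place
    void ApplyHannWindow(std::vector<float> & buf)
    {
        const double kPi = 3.14159265358979323846;
        size_t n = buf.size();
        for ( size_t i = 0; i < n; i++ )
        {
            double w = 0.5 * ( 1.0 - std::cos( 2.0 * kPi * i / (n - 1) ) );
            buf[i] = (float)( buf[i] * w );
        }
    }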

Once you have the FFT you create a power spectrum with the sum of the squares of the real and imaginary parts. You now have N/2 power samples.

Another note on the FFT. Your input buffer had N samples, but there's a funny thing you can do to get more resolution. Do your FFT with M samples (M > N) and just pad the buffer up with (M-N) zeros. Since you windowed your buffer it already goes to zero on the edges, so you can just tack on a bunch of zeros. This effectively improves your FFT res to 1/M instead of 1/N. There's some weirdism here I don't quite get; you're not really improving your resolution at all, you're just getting more bins in the FFT and the FFT of the zero-padded data is really just some interpolation of the non-padded data. Also the PARSHL guys do this thing where they shift the buffer so the zeros are in the middle; I'm not really sure why you would do that, applying a constant shift to the buffer before FFT is equivalent to applying a constant phase shift to the output of the FFT. Anyhoo.

A practical issue you're going to have to address is noise. Normal computer mics are just awful. I found that the noise is primarily low frequency. There are two hacky things you can do. One is set a threshold intensity for the power spectrum; anything below that is treated as zero. Another is a threshold frequency, any values below that are treated as zero (more generally you could set a region of interest, and ignore all entries outside that region). A more general thing that I found to work nicely is just to sample the noise; tell the user to shut up and read the mic for a while and average those readings. Then scale that up by 1.5 and set that as your noise base. Subtract that noise base off all future power spectrums.
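
A sketch of that noise-base hack (the 1.5 scale and the overall structure are just what's described above, not the app's actual code) :

    #include <vector>
    #include <algorithm>

    // noiseFrames = power spectra captured while the user is silent
    std::vector<float> BuildNoiseBase(const std::vector< std::vector<float> > & noiseFrames)
    {
        size_t bins = noiseFrames[0].size();
        std::vector<float> base(bins, 0.f);
        for ( size_t f = 0; f < noiseFrames.size(); f++ )
            for ( size_t b = 0; b < bins; b++ )
                base[b] += noiseFrames[f][b];
        for ( size_t b = 0; b < bins; b++ )
            base[b] = 1.5f * base[b] / (float)noiseFrames.size();
        return base;
    }

    // subtract the noise base off a live power spectrum, clamping at zero
    void SubtractNoiseBase(std::vector<float> & power, const std::vector<float> & base)
    {
        for ( size_t b = 0; b < power.size(); b++ )
            power[b] = std::max( 0.f, power[b] - base[b] );
    }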

So, now you can just find the peaks in the power spectrum and those are your principal frequencies. I'm working on pretty clean sounds so I have a simple heuristic (just find the one sample of highest intensity). The audio compression algorithms have complicated heuristics to track peaks coherently across frames.

Once you have a peak bin index you can refine its location. You want to try to make a guess at where the true center of the peak is, which may not be at an integer bin location. Since we windowed, we know that a pure sine will turn into a sinc, so we want to just fit a sinc near the peak and use the location of the sinc. In practice, you want to only use a few samples near the peak so that you don't run into other peaks when you do the fit. Also in practice you don't need to actually fit a sinc, you can fit anything with a smooth peak, such as a quadratic or a Gaussian. Finding the center of a Gaussian fit is super easy. For an array of data[] the center of the Gaussian fit is Sum{ i * data[i] } / Sum{ data[i] }
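
In code that refinement is just a weighted centroid over a few bins around the integer peak ("halfWidth" is a made-up knob for how many neighbors to use) :

    #include <vector>

    // power = power spectrum, peakBin = index of the highest-power bin
    double RefinePeakCenter(const std::vector<float> & power, int peakBin, int halfWidth)
    {
        double sumW = 0.0, sumIW = 0.0;
        for ( int i = peakBin - halfWidth; i <= peakBin + halfWidth; i++ )
        {
            if ( i < 0 || i >= (int)power.size() ) continue;
            sumW  += power[i];
            sumIW += i * (double)power[i];
        }
        return ( sumW > 0.0 ) ? ( sumIW / sumW ) : (double)peakBin;   // fractional bin index
    }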

Okay, the next thing you need to deal with are the harmonics. Most sounds have harmonics, which means if the fundamental frequency is at F, there will also be peaks at 2F, 3F, 4F, etc. There are a few issues here. One is that with naive peak tracking the peak could jump around between the harmonics. That's easy to deal with by treating your frequencies in "octaveless" space, that is treat frequency as a toroidal value where 2F is equivalent to F. The other is a little more subtle, which is that the spacing between the harmonics can actually be more reliable than the frequencies of the harmonics themselves. Some of the frequency algorithms use the FFT of the FFT, which will give you the spacing between the harmonics, there's this whole cepstrum thing, bleh I don't do any of that.

I should take a second to remind you that musical notes are geometrically related. Octave frequencies are 2* apart, and each of the 12 notes are *(2^(1/12)) from each other. Standard concert A is 440 Hz; the A two octaves below it (the guitar's open A string) is 110 Hz, and all other notes you can find from there. You can put a frequency into "note space" by doing 12*log2( freq / 110 ). That will be 0-11 in the fundamental octave, or you can do a mod(12).
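
The note-space conversion in code, using the 110 Hz low A as the reference like above :

    #include <cmath>

    // 0 = A, 12 = A one octave up, etc.
    double FreqToNoteSpace(double freqHz)
    {
        return 12.0 * std::log2( freqHz / 110.0 );
    }

    // fold out the octave : result in [0,12)
    double NoteClass(double freqHz)
    {
        double n = std::fmod( FreqToNoteSpace(freqHz), 12.0 );
        return ( n < 0.0 ) ? n + 12.0 : n;
    }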

One thing you can do with the harmonics is wrap your FFT to improve accuracy. For one thing, the higher harmonics have more frequency accuracy. If you divide your spectrum up into octaves, each octave gets twice as many bins as the one below it. The bottom octaves only get 1-2 bins so you can't really measure notes at all there. The top octave gets roughly half the bins of the whole FFT. Actually this is one of the really shitty properties of the FFT, that most of your resolution is wasted up in super high frequency land where nothing good happens. Most music and voice is in the 100 - 1000 Hz range. Anyway, to take advantage of this you want to use the highest harmonic you can. The other thing is to take the highest octave in the FFT spectrum, and bilinear filter from the lower octaves and add them onto the highest octave. By adding them up the little errors in each harmonic average out and you get more accuracy.
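
A rough sketch of that folding, with the indexing details guessed by me (the top octave of an N/2-bin power spectrum is the top half of the bins, and each lower octave maps onto it at half the bin index, sampled with linear interpolation) :

    #include <vector>

    // add the lower octaves of the power spectrum onto the top octave
    std::vector<float> FoldOctavesToTop(const std::vector<float> & power)
    {
        size_t bins = power.size();            // N/2 bins; top octave is [bins/2, bins)
        std::vector<float> top( power.begin() + bins/2, power.end() );
        for ( size_t b = bins/2; b < bins; b++ )
        {
            // walk down the octaves : same pitch class at half the bin index each time
            for ( double src = b * 0.5; src >= 1.0; src *= 0.5 )
            {
                size_t i0 = (size_t)src;
                double frac = src - i0;
                float v = (float)( power[i0] * (1.0 - frac) + power[i0+1] * frac );
                top[b - bins/2] += v;
            }
        }
        return top;
    }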

Okay, at this point we should already have a really nice pitch estimate, but now that we have a really good estimate we can refine it. We've been working on the spectrum but now we're going to go back and work on the time-sampled wave data. A peak of frequency F in the spectrum means that the samples have a periodic component of period P = (samples per second) / F. (P is an integer number of samples). (one quick note : you can improve accuracy here by resampling your input buffer to give more samples per second; that's equivalent to working with fractional number of samples; in theory it would be best to make all your code work with non-integer bin sampling).

So we have a good estimate of the period of the data (P) which came from the frequency spectrum work. Now we can search around to see if there's a better P by trying P+1,P-1, etc. There are two standard ways to measure the period of the samples. Note that these measures go nuts if you allow P to roam globally (because it's garbage for small P and also the periodicity is itself periodic in P), so you need this very good seed value to start the search, then you just look for a local max.

1. Autocorrelation. This is the linear correlation of samples[t] with samples[t-P]. You're basically shifting and multiplying by yourself. Find the P with highest correlation. This thing sucks balls. The problem is that it is very highly affected by noise or other signals at lower frequencies. Basically for it to work ok, you need the autocorrelation of samples[t] with samples[t-2*P] and samples[t-3*P] to all be very similar. In fact they are not if there are big low frequency waves running through the data. Now in theory you should be able to fix this by running a high pass filter over your data with a frequency that's like F/2 to get rid of all the low frequencies and just leave the stuff near F. Hell maybe you could just do a bandpass around F and then autocorrelate on that. But there's no point to muck around here cuz there's a better way :

2. Noll "Maximum Likelihood". I prefer to call this "Coherent Average". Basically what we want to do is to look at the waveform in a periodic window of length P. You put your waveform in this window by wrapping on the period and averaging all the values that go in the periodic window. Any components which are periodic at length P will add up coherently; all other components will add destructively. This is actually a cool way to look at your waveform too. It's sort of like a tuned oscilloscope. If you have your oscilloscope tuned to the right frequency, the waveform will appear to stand still. If you tune it out of phase the waveform on the scope will move; here it will destructively add and go to zero. To find the best P, select the periodic window with the highest rmse.

There's another approach to doing this refinement on the frequency, and that's to use some kind of bandpass filter response. What we want to do is take the sample array and apply a bandpass centered on some probe frequency F. We then measure the total power output by the bandpass and that tells us how close we are to being tuned to the actual frequency in the sound. Then you can search around with F using a binary search or whatever to maximize the power output and thus find your true frequency. The nice thing is we can explore fractional F without writing a bunch of code to do correct non-integer bin sampling.

The coolest bandpass for this purpose is the "Goertzel algorithm" which is a really fast simple filter nicely peaked at the probe. One page that actually talks about Goertzel as a bandpass . I dunno maybe there are actually more appropriate bandpass filters for this purpose, you want one that is kind of like Gaussian shaped or something with a single point max, not a flat top one.
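
The Goertzel recurrence is only a few lines; here it is used as a probe you can evaluate at any (fractional) frequency and then search around for the maximum power :

    #include <vector>
    #include <cmath>

    // run the Goertzel filter over the buffer at a probe frequency and return the output power
    double GoertzelPower(const std::vector<float> & samples, double freqHz, double sampleRate)
    {
        const double kPi = 3.14159265358979323846;
        double w = 2.0 * kPi * freqHz / sampleRate;
        double coeff = 2.0 * std::cos(w);
        double s1 = 0.0, s2 = 0.0;
        for ( size_t i = 0; i < samples.size(); i++ )
        {
            double s0 = samples[i] + coeff * s1 - s2;
            s2 = s1;
            s1 = s0;
        }
        return s1*s1 + s2*s2 - coeff*s1*s2;
    }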

BTW I like to think of this as a damped resonant harmonic oscillator; you're banging it with a bunch of forces, and then you measure how much it's moving at the end. It will only really get moving if you're banging it on resonance, so then you tune it to go searching around to find the resonance where it responds most. Looking at it this way it's really also just like a radio receiver. You're tuning your radio oscillator around the frequency that you think you should get a signal and you look at how much power you're receiving, and tune to maximize the output.

With these methods you should be able to measure a pitch to around 0.01 Hz (@ 110 Hz).

BTW there are more things you can do using frame-to-frame information, such as tracking how the peak is moving from frame to frame with phase information (see the "phase vocoder"). That could theoretically give you even more accuracy but I'm not yet doing any of that. When we converted to a power spectrum we threw away all the phase information in the FFT which is good information that we could use. Basically if you measured a peak at frequency F1 and phase P1 in the first frame, and then you see F2 and P2 in the following frame, you can figure out that that's one wave of a linearly varying frequency F + dF*t with a single phase.

Another note : frequency-space pitch detection is very accurate & robust for high frequencies, but breaks down badly for low frequencies. Really any octave below a guitar's low E at 82 Hz gets pretty ugly because the spacing between notes becomes close to or smaller than one FFT bin. On the other hand, time-domain pitch detection is really ugly for very high frequencies, because the period is only a few samples which means that the quantization due to the sampling rate becomes very hard to deal with and the period is very inaccurate and sensitive to noise, but conversely time-domain detection is quite good for low frequency pitches (as long as the frequency isn't so low that you have too few periods in your buffer; if you have a 1 second buffer you can't really work below 8 Hz or so).

There are other obvious things you could do that I'm not bothering with. One is dynamic resolution. When you detect that you have a nice long run at a steady pitch (when a string is sounding) you can use a big window which will give you better pitch resolution. If you use a big window all the time though it would give you horrible latency and bad noise from the transients and attack. So if you detect that the signal is highly varying in the big buffer you shrink your buffer down so that during times of rapid change you can still respond quickly, and you just have less accuracy during that time.


01-23-08

It's funny that the limiting factor in human success is generally "energy". Everyone could do anything they want so much better if they just work at it and hustle. The real advantage of things like assistants isn't that they save you so much time or they do things you couldn't do yourself, it's that they let you save your get-it-done energy for the things that are important. The funny thing is that there's no reason why we shouldn't all have infinite energy to do everything we need to do. All the hustling and studying doesn't really burn a resource that we can't replenish easily enough with some food and sleep.


01-22-08

I'm making Chicken Pot Pie right now. Frozen pie crust is fine. Thicken with a roux. Toss in whatever pleases you. Fennel and leeks make it hoity toity. Such good soothing cold weather food. The best reference recipe I could find was at this food blog : Miche Mache . There are a lot of great food blogs out there, but there needs to be some kind of recipe aggregator that points to them and rates them or something. On the flip side, there are these recipe aggregator sites like "cooks.com" but all their recipes are like "use a can of chicken soup and put pillsbury biscuits on top for the crust" and then the retards give it 5 stars.

I've given up on my master poker project. I was bored and aimless for a few days, but now I'm working on a quick realtime sound play app in the igj4 framework. It started because I wanted a guitar tuner helper app and I downloaded and installed like literally 20 of them and every single one of them sucked absolute donkey balls. I'm about two days in now and it's looking quite promising, I should have something to post soon. In the mean time you can scope out some of my reference material :

YMEC software - Introduction to simple sound measurement using a notebook computer
Ron Nicholson's Digital Signal Processing Page
Introduction to Pitch Detection
High-Speed Analog-to-Digital Conversion - Google Book Search
Guitar tunings - Wikipedia, the free encyclopedia
DiscreteTFDs -- Time-Frequency Analysis Software
Digital filter - Wikipedia, the free encyclopedia


01-20-08

We went hiking yesterday off Bon Tempe lake near Mount Tam; it was probably the nicest hike we've done in the whole area. In particular the Kent Trail along Alpine Lake is just gorgeous, lots of variety, not too steep, good mix of trees and open spaces, some redwoods, views, plenty of sun. It was perfect. It's a little crowded on the weekend but I bet it's nice and empty during the weekdays. I also enjoy checking out the dams and pipes and pumpstations and all that stuff too. I also finally found a good trail map of the whole area. You can figure out whatever loops you want of whatever length.


01-20-08

"Free Lunch: How The Wealthiest Americans Enrich Themselves At Government Expense (and Stick you With the Bill)" looks like a pretty solid book with lots of specific case evidence. There are two good interviews online from rather different sides : on Democracy Now and Reason magazine (libertarian) .

One thing we should understand is that "selective libertarianism" actually makes markets less free and less competitive. There's a wing of the Republican party that supports any tax cut or reduction of regulation. They claim to be supporting the goal of free markets and libertarianism - lower taxes, less regulation, smaller government. That goal is understandable and reasonable in theory, however in practice we rarely actually get an across the board reduction of regulation. Instead taxes are lowered for some specific thing, or regulation is reduced on one specific thing. This is what I call "selective libertarianism" and it's actually worse for the competitive environment than just leaving the higher taxes and regulation that was more even. The supporters claim they are opening up the market for business, but in reality they are supporting one specific business or one specific practice, which is highly distorting of the market.

Low unemployment is often trotted out these days as a sign that economy is basically healthy. There are a lot of ways those numbers are questionable, but even aside from that I'm skeptical that unemployment actually means much. You can have low unemployment (eg. China or India) and still have people in abject poverty because the jobs are just awful. You can have very high unemployment (like much of Europe) and yet still have a high standard of living because the welfare is so generous (which is what causes the high unemployment, people won't take the shitty jobs because they don't have to).


01-18-08

Economists who argue that there is no disadvantage to job losses caused by free trade are being intentionally dense, and it's frustrating. The debate is centered on the effects on the US in terms of outsourcing - we lose some manufacturing jobs to 3rd world countries, and in exchange we get cheaper goods to buy. Basically their argument says the net benefit from the cheaper goods is greater than the loss of someone's job. They argue that someone losing their job because it's no longer as cost effective as overseas is nothing to be bothered by, and the types of jobs we do will wind up being more efficient and profitable.

This is all sort of true, but it's also very naive and "1st level" discussion. It's the background that everyone should already know that gets us started talking about the more subtle issues.

1. risk/stability of jobs : Every economist should know that making a "trade" (in this case giving up a manufacturing job for cheaper goods) is not necessarily good just because it's +EV - you need to look at the risk you're accepting as well. Manufacturing jobs provide a powerful stabilizing base to an economy. A factory provides a certain number of jobs which you know with relatively high certainty will be there, because the company needs to keep the factory pumping. Having a stable job base like that has value far beyond its basic income level, because it smooths out economic bumps and it provides a reliable job for people in a certain location. Different types of job are more or less fluid. Jobs that require little specific training and low cost or time to set up are very fluid (call center operators is one example). Some jobs are very non-fluid, which brings us to :

2. industrial inertia and economic fluctuations : Having an economy based entirely on fluid jobs is very dangerous, because it means that those jobs can quickly leave for more suitable places if economic conditions change. Having an economy based on very non-fluid jobs such as manufacturing is very stable in the sense that even if it becomes more desirable to have those jobs elsewhere, it will still take a lot of startup money and time to get them set up there. The problem is that once you lose those manufacturing jobs, you are now in the position of having difficulty getting them back. Say the world economic conditions swing the other way and suddenly it would be best for you to get back into manufacturing. Because it's not a fluid job, you can't just do that, you don't have the trained people, the facilities, the infrastructure, and you may go through a very long recession before you can get back into manufacturing. BTW this is why there is some value to producing your own food and extracting your own natural resources even if it's not the cheapest place to do it, just to keep the capacity domestic so that you have some stability in case you need it.

3. nonlinear value of money : When a worker loses a job and we get cheaper goods, our economy may have a net benefit, but who gets that benefit? It's generally everyone but the workers who lost their jobs; even if they get other jobs, the median salary of the new jobs is much lower. It may be +EV in terms of total dollars in the economy, but it's unclear if it's +EV for the median (not the mean). In particular we'd like to judge using a nonlinear measure of value. Providing slightly cheaper goods to rich people is of very little benefit to the economic health of the society. Providing a slightly higher wage to the lower/middle class is of much greater benefit. Basically dollars should be valued at something like log(dollars) per person.

4. individual difficulty of job change : In general, there are a lot of people who say "if your employer is rotten, get a different job" ; they treat jobs as if they are a fluid thing where someone can just jump to the most suitable job, and thus the system will shake out so that the appropriate jobs go to the appropriate people. The reality is that job change is very difficult and of very high cost (both monetary and non-monetary cost). For example, someone who loses a manufacturing job may need years of expensive training to find an equivalent job; that's not only high cost, it's high risk since they don't know if that job will be there when they're done. They may have to move to a different part of the country; again that's high cost and high risk and has very high non-monetary cost, they may have various reasons they want to stay where they are in their home with family/friends etc.

Now, what should we actually do about it? Then we can get into interesting tradeoffs. Protectionism is very bad in a lot of ways; certainly the extreme French-style protectionism is very harmful. The US has long practiced (and still does) protectionism through subsidy and tariff. On the other hand, contending that losing stable manufacturing jobs is not a bad thing at all puts us back on level 0 where we can't even start an interesting discussion because we're arguing about what the problem is.

This has been a brilliant tactic for scuttling progress on a myriad of issues. The most obvious is global warming, where instead of having a mature discussion about what we should try to do about global warming and what the costs would be, instead we're mired in this ridiculous farce of a debate over whether it even exists.

Let's be clear, free trade on the whole is generally good (though what we have now is really a complex set of rules that distort the system far too much to be called "free"). The benefits on the whole are too good to pass up, but that doesn't mean we should ignore the downsides. If we can make some small policy tweaks to shift the economics so that good quality stable jobs are viable in our country, that's clearly a good thing.


01-17-08

My TIPS are finally doing well. It seems to me we have a pretty high probability of going through the "theoretically impossible" (actually rare) phenomenon of recession and inflation. The domestic US economy is far weaker than it might seem by just looking at the S&P, because large US corporations are doing very well on the international market. The inflation will be driven partly by the extended period of low interest rates and tax cuts, and partly by the low value of the dollar which drives up the price of international goods in terms of dollars (oil is most significant but all commodities are sky high in terms of the US dollar). One might argue that inflation is being kept artificially low by the flood of cheap goods from China, which is pegged to the dollar.


01-17-08

I'm a big Ken Loach fan; he's a master of capturing little everyday slices of life (usually lower class life in the middle of England). He certainly doesn't romanticize his subjects, his camera looks on with a quiet sadness. I really like the style of the movies; they're not quite false-documentary style but they're extremely simplified, always on location with natural lighting.


01-16-08

The honest encyclopedia entry on David Brooks


01-16-08

There's been some recent popular press on MRSA outbreaks in San Francisco and Boston. It's unfortunate that they're painting it as a "gay disease" like they did with AIDS, that makes the majority of Americans think that they aren't part of the problem. MRSA is Methicillin-resistant Staphylococcus aureus, a horrible bacterium resistant to common antibiotics. The fact is MRSA is booming all over the US; it mainly comes from hospitals, where workers are horribly lax about proper cleanliness and hand washing. Several outbreaks have happened recently at schools around the US. Ironically parents respond by giving their kids antibiotic hand creams.

There are many drug resistant bacteria out there, but normally they would be rare. They only thrive in areas that have been cleansed of non-resistant bacteria through use of antibiotic cleansers. When the microbial population is wiped out like that, the resistant bacteria step into the empty space and thrive.

We've been pouring antibiotics into everything for the past 20 years - livestock feed, household cleansers, treatment of viral colds, etc. It's only a matter of time before this leads us to new antibiotic resistant pathogens.

What makes matters worse is that bacteria don't evolve in the same way as other species. In normal evolution, each bacterium would have to evolve drug resistance on its own. Bacteria, however, can share genes through "lateral gene transfer" (most often mediated by bacteriophages which grab snippets of bacterial DNA and take it over to other bacteria). The result is that drug resistance in one species can be transfered to a completely unrelated species. So if even one bacterium develops a certain resistance, that can spread all over the bacterial spectrum.

The media needs to stop painting this as another "gay disease" and start raising awareness of how everyone's overuse of antibiotics could bring a horrible plague upon us all.


01-15-08

Wesley Snipes has accomplished something very impressive : making the scientologists look sane. He tried to establish a military training facility for Nuwaubians


01-12-08

Choosing Voluntary Simplicity is a good manifesto for how I've been living & wanted to live for some time now. Anything you make yourself is better than the fanciest junk you can buy. A smile is a better gift than money. Doing anything (making something, doing a chore) is better than watching anything. Fixing something or making do is better than replacing. Relaxing is better than being busy. The beauties of nature are better than anything man can make. The pleasures of our senses are the greatest pleasures we have. Not wanting physical things or status is the path to freedom. There's a ton of good self-helpy stuff on that site, but also some silly nonsense.


01-12-08

Perfect roast chicken used to be one of those benchmarks for whether you could execute well in the kitchen. These days with digital probe thermometers that you can just leave in, it's really trivial. If you can't do it now you're really just incompetent.


01-12-08

I'm reading "Banco" the followup to "Papillon". It's obviously a shitty concocted followup to make a buck after the first one did well, but I'm still enjoying it. While I'm reading I feel energized, like "this is the way to live!" and "I could be like this guy" just running around and having adventures. Unfortunately that feeling goes away quickly when I stop reading.

What can we learn from Papillon? Don't overthink decisions. Just make up your mind quick, don't really worry about the pros and cons and long term effects and everything, just do something and do it 100%. But don't stick to it either, if it's not working out, abandon it and do something else. Always have a goal. Papillon admires people who are "full of life" which means ready to charge head-on into anything at a moment's notice. Don't worry too much about failures, just put them behind you and move on to the next challenge.


01-12-08

Mavericks big wave surf contest is going off right now. You can watch it live.


01-11-08

Some things I now know better about my shoulder injury :

In the very early phases I should've tried to move it around more even though it was hella painful. Even using tons of painkiller and ice to be able to just have someone else passively move it around through full range of motion would have helped a lot.

In the recovery phase I think I tried to get back too quickly. I should have spent more time doing just physical therapy to build the stabilizers and stay away from other exercise, like 3-6 months of nothing but really serious therapy exercises.

Most of all, I should have taken HGH and steroids. Low doses in an adult have very little negative effect, and they massively improve your injury recovery. Every injured professional athlete uses them. Your injured tissues heal up faster and more fully, and it's far easier to restore the stabilizer muscle you lost. If I ever get something like an ACL tear, I'll definitely use HGH+juice to recover.

Doctors were pretty much useless. I saw orthopedic surgeons who were shoulder specialists, and physical therapists at some of the top sports recovery clinics, and none of them told me anything I couldn't pretty much find on the internet on my own.

Addendum : I just rewrote this as advice to Brian. Here's the new version :

Go see a doctor right away; he won't do anything at all but it will make you stop worrying about whether you should've seen a doctor sooner.

In the first month, keep the shoulder mobile but don't put any stress on it. Move it around while lying on the floor, if possible get in a swimming pool and move it around (but don't actually swim). Move it using your other arm (passive range of motion). The goal here is avoid the shortening of ligaments and tightening of the shoulder capsule that will happen if you keep it too immobile. Try not to keep it in a sling for very long at all. Use lots of ice and NSAID to keep swelling down in this phase even if you aren't really in much pain; the swelling will make the shoulder move incorrectly which will cause the tissues to heal in the wrong location.

As the pain goes away and you can move it around more on its own - do NOT resume sports activities. Don't do anything that puts strain on the shoulder that's hard for you to control, like throwing a ball, yoga, rock climbing, weight lifting, etc. I know you'll want to get back to normal activities to recover but don't do it. Do start working the shoulder muscles to restore function and strength of the stabilizer muscles, but only in easy to control low weight high-rep movements. You can find plenty of the weird little moves on the internet that will stress the stabilizers, or see a PT. Buy a set of resistance bands to have at home and just play with them all the time.

As you get tempted to resume normal sports activity, ease in very slowly, be careful, and again avoid dynamic moves like throwing a ball or tackling or cartwheels or anything that will apply sudden shocks to the area. Make sure that the stabilizer muscles are leading the way and are strong enough. You absolutely are going to lose some general fitness and muscle during this phase and that is fine, suck it up and let yourself lose some pec muscle, just focus on the shoulder stabilizers.

There are several complications you can pick up from any shoulder injury and your goal is to really avoid them. They are "frozen shoulder", "shoulder instability" and "sick scapula".


01-11-08

I suspect there's a very high correlation between people who can cook and people who are good in the sack. They're very similar skills. Really both are very easy. You need just a basic level of dexterity to be able to move your body around deftly and not hurt yourself. You need enough memory to remember what you've tried in the past and how it turned out, so that you can tweak things slightly when you try again. Then the process is simply using your senses to judge what you did, using trial and error to improve your technique, and practicing enough to be able to execute.


01-11-08

Countrywide took imprudent risks based on bogus finance, passed the risk on to others, paid its CEO at least $500 million, and has now been rescued. What a great gig.


01-10-08

When coding, I often want to have some random event happen roughly once per mean interval T. There are a few ways to do this.

1. Store the next time the event should happen. When that time passes, fire the event and generate a new time. The new time should be a function of the event time, not the frame time, which could differ by the duration of the frame. This method gives you the most flexibility since you can generate the new time from whatever spectrum you want. For example it could just be a constant probability across some interval, or it could be a Gaussian centered on the mean time.

2. Store a probability accumulator. When the accumulator goes over 1.0, fire the event. When the event fires, subtract off 1.0. The way you step the accumulator has some flexibility, but roughly it should be proportional to the frame duration and randomized as you see fit. The advantage of this over #1 comes if your events aren't always random, and for example if two or three events suddenly fire off, you want them to delay longer until they happen again. For single events this isn't very interesting, but for multi-choice events it provides a way of flipping a coin and reducing the chance of repeats.

3. Use a memoryless random event. This is the only way to do it without any side storage. You have some spot in code where you just want to generate a random event with a certain mean interval and you don't want to have to add another state variable. The only random generator that does this is a Poisson Process with rate lambda = 1/T.

Basically the probability rate lambda is the probability of an event per unit time. But you can only use that linearly over an infinitesimal time slice. For a finite frame length, to be properly framerate independent, you should look at the chance of an event happening over the course of the frame. For a frame of length "dt" the probability of any event happening is :

P = 1 - e^(-dt/T)

Of course technically you need to allow 2 or more events to happen per frame, but if we assume dt << T that chance is tiny and we don't care about it anyway.

So, to check whether you should generate a random event you just do :

if ( frandunit() < 1 - e^(-dt/T) )

You may be familiar with doing

if ( frandunit() < (dt/T) )

which is the non-framerate-independent version that sort of works okay for fast frames and long intervals.
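Here's a minimal C++ sketch of the memoryless version; frand_unit is an assumed helper, not from any particular library :

#include <cmath>
#include <cstdlib>

// assumed helper : uniform random float in [0,1)
static float frand_unit()
{
    return (float) rand() / ((float) RAND_MAX + 1.0f);
}

// Memoryless (Poisson) event check, framerate independent.
// dt = frame duration, meanInterval = T, in the same time units.
// The naive check ( frand_unit() < dt/T ) is just the first-order
// expansion of this, only acceptable for dt << T.
bool PoissonEventFired(float dt, float meanInterval)
{
    float p = 1.0f - expf( -dt / meanInterval );  // P(at least one event this frame)
    return frand_unit() < p;
}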

This Poisson random is generally not a good-looking random for graphical events, but it's a really nice way to do periodic things without a state variable.


01-08-08

I wanna go ski but I really don't wanna go up on the fucking weekend when everyone from SF goes. I'm unemployed dammit, I wanna go for a Tue-Thurs trip, but Dan can't go, and going up by myself doesn't sound like fun. :(


01-08-08

Moral sacrifice comes from giving up personal good for the benefit of others. The highest level comes from giving to those who are very unlike yourself, very far away, and very needy, because helping them provides very little reward to you, but provides a huge benefit relative to the cost. The very lowest level comes from giving to your own direct family; while it is slightly more admirable than just being 100% selfish, it is not really a big sacrifice because anything you give to your family is still very much benefiting you, even if it's only in terms of love, grandchildren, etc. To admire someone for being a "good family man" is very thin admiration indeed. It's a classic stereotype that even some of the great immoral or psychopathic people can still be very good "family men".

There are various ways to achieve long term happiness (as opposed to short-term acts of "fun" which can't really be strung together to create happiness). The easiest way is probably through having a goal, something to "live for". Whether or not that goal is actually important, if you believe it is and it motivates you, the constant pursuit and minor successes along the way provide a sense of purpose and a way to occupy your time which equals "happiness". For some people this goal is the pursuit of money (money itself doesn't help happiness that much, but the successful pursuit of money can make you happy), for others it's some more rarefied goal such as a research discovery or trying to benefit the world. Probably the most common goal which people use to occupy themselves, however, is their family. They will say they "did it all for their family". It lets them go to a horrible job they would hate, if they believe it is a sacrifice to get money for their kids. It gives them an overall purpose, raising these kids, which gives them an overall long term happiness. Again, this is not particularly admirable, in fact it means they are rather weak and unambitious. Everyone needs some goal to live for. More dynamic people choose difficult and interesting goals. People who have no abilities or are afraid or lazy still need some goal to pretend that they live for, and these people generally choose "family".

I've been thinking about why Mike Huckabee gives me the creeps. Obviously some part of it is prejudice on my part against people like him, but I've been trying to force that out of my head to see what else is in there.

One problem with Huckabee obviously is his complete lack of political thought. He doesn't seem to have any policies whatsoever. Certainly he didn't when he started running, but he has cooked a few up. One of the few unique things he pushes is this "fair tax", a national sales tax proposal, which is just one of the most insane things any major candidate has supported in recent memory. Not only is it a political non-starter with no chance of going anywhere, it would also be a huge disaster for the economy. To support the government with no income tax, the sales tax would have to be absolutely huge, prices would shoot through the roof, spending would go way down, there would be a huge black market of people avoiding taxes, and it would be very very regressive.

Okay, aside from the lack of policies, I'm really bothered by someone running a campaign primarily on the basis that he's a good Christian and a nice guy and a family man. For one thing, none of those are really related to the skills you need to run the US. But even more than that, the idea that you would tout your homey Christianity as something to admire is something that I find deplorable. Now, I have no problem with someone being a Christian, or whatever, obviously most of our presidents have been very religious and that's fine, but running for president as a Christian is different from being a Christian who happens to be running for president. It implies an exclusiveness, that we should vote for people of our own type, and it's us against them in the world (combined with all the rhetoric from various candidates about the evil of Islam, the message certainly is being sent that it's our faith against theirs). To me running with your Christianity as a selling point is not that different than running with your Caucasian-ness as a selling point.


01-07-08

I like to follow the news and politics, but I just can't stand all this election nonsense. I know it's hugely important, the excuse for apathy that "they're all the same" is pure garbage, but the whole US campaign system has become so shallow. The press coverage is almost all about who's ahead in the polls, or trivia about the campaign, who's their press guy, what kind of bus do they have, which state's babies are they kissing now. The actual interviews and speeches are targeted at the retarded press and special interest groups. God forbid you actually ever talk seriously about an issue and admit that it's complicated and you might not stick with the party line, you'd be destroyed.

A parliamentary system would be far better. The American public has no qualifications to choose a leader. We'd do better just voting for a party, and then letting the party choose a leader, presumably they actually have some idea of other politicians' abilities and who would actually make a good leader. Also the kind of political in-fighting that you need to get ahead in a parliamentary system is the same skill that's needed to be a good leader. This is in contrast to a public election, where the ability to win the popular vote is a completely different skill which favors actors and false bumpkins. It's sort of inherently obvious that you should have a selection system in which survival of the fittest produces a victor that is well suited for the job. You wouldn't use an IQ test to pick candidates for World's Strongest Man.


01-04-08

Gah, Firefox is so retarded. I Adblock and then I randomly click in the whitespace on my browser and get taken to god knows where.


01-04-08

The real problem with C++, for my money, is with medium-sized repeated patterns. For major chunks of complicated operation, you can bundle it up in a class or a library or whatever and that works okay for sharing code, because the pain of wiring and complying with the interface is very small compared to the savings. The problem is that the majority of code is made up of these little tiny patterns that you use over and over, but they're too small to bundle up with the weight of C++. That is, the pain of wiring is too great, you would write almost as much nasty possibly-buggy code to do the wiring as you would to just dupe the tiny little snippet.


01-04-08

Huge rainstorm here today. I'm expecting the power to go out any minute now. Either that or for water to start gushing through these shitty walls.


01-03-08

One of the hard things about dating as a way to find your mate is that people are such phonies for the first year. I don't mean they necessarily even intentionally lie, but everyone tries to show their good side when they're dating, you hide your insecurity and your baggage and your criticism and just try to impress the other person and get along and do things together. Of course it goes well, you're going out to restaurants, you're travelling, it's so easy to get along and like each other in that mode. It's not for perhaps a year when you move in together and start seeing all of each others' sides that you know what a person would be like to marry. Now if you find you don't really like them, you've put a year in to find that out.

Many people get around this by just hurrying up and getting married during the initial easy phase. Then after marriage they find out what they're really like to live with and either make it work or get divorced. That's not a horrible way to go about it. I've always been in the "trying living together and make sure you like it first" camp, but I think people in that camp basically just never get married.

Of course I'm one of the worst about this. When I'm dating I can be charming, talkative, social, I go out and do things and all that. Eventually that just becomes too exhausting for me and I give up and go back to being a grouchy hermit, and my girlfriends are left wondering what happened to the man they fell in love with. It's not even that I'm trying to fool them, it's more that I'm trying to fool myself. Each new relationship I tell myself is a chance to start fresh and be a better person; that this new girl doesn't know what a dork I am and I can just play a new role and live a new kind of life.


01-02-08

Today I watched the "No Reservations" about Beirut, where Tony goes to make a food show and winds up trapped in an evacuation from the Israeli bombardment. It inspired me to go buy some Arak, the Lebanese anise alcohol. Mmmm, yeah, it's pretty much identical to Ouzo, I can't really tell any difference, and it's also equal to Pastis but without the extra herby notes. Anyway, the show reminded me of that sick episode of recent history that was so little covered here in the US. Lebanon was sort of making some decent progress towards democracy and getting out from under the thumb of Syria, then some nutters on the border kidnapped two Israeli army guys, and in response the Israelis bombed the holy living fuck out of the entire country. Real smart, guys. I'm sure that's not going to further radicalize the country at all. Hard line reprisals have worked great against terrorists for the last 50 years, right?


01-02-08

I was thinking Google would be way better if they provided a simple Thumbs Up / Thumbs Down mechanism on search results. Then people can look at the links and provide feedback on how well the results matched the search. That feedback can be used to train the search. People love their Google and would be willing to put a lot of time into this. The problem of course is that it would be abused by all the search-rank-boosting jackasses. There are various heuristics you could use to counteract that (like any user that puts in too many signals just gets ignored), but the best solution of all is of course a Network of Trust. That way each user is only affected by the thumbs up/down of their neighbors in the trust network (aka their "friends"), so that the spammers would have a minimal effect on the populace since presumably most people would not "trust" them.
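A toy sketch of the trust-network scoring idea (everything here - the names, the types, the whole API - is hypothetical, just to illustrate the mechanism) :

#include <map>
#include <set>
#include <string>

typedef std::string UserId;

// A user's personalized score for one search result : only thumbs from
// people inside that user's trust network are counted, so votes from
// spammers outside the network have zero weight.
int PersonalizedScore(const std::map<UserId,int> & votes,   // +1 or -1 per voter
                      const std::set<UserId> & trusted)     // this user's "friends"
{
    int score = 0;
    for ( std::map<UserId,int>::const_iterator it = votes.begin(); it != votes.end(); ++it )
    {
        if ( trusted.count(it->first) )
            score += it->second;
    }
    return score;
}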


01-02-08

I think Dick Wolf should sue the creator of "House". "House" is a direct rip-off of Law & Order with just the setting changed. Every episode starts with the cheezy exposition of the crime/medical problem. Then the detectives (House's assistants) look into it, usually screwing up a few things. Then they call in the D.A. / House for advice. They start with one suspect and make a first arrest / initial diagnosis and try a treatment. That never works, so they send the detectives for more clues, and some key piece of evidence always miraculously pops up (this is the funniest part - on "House" the assistants actually go and gather evidence from the patient's homes, they couldn't think of any substitution for this part of Law & Order). Finally with the last bit of evidence Jack McCoy makes a brilliant diagnosis and saves the patient's life - though he often has to confront a moral dilemma of his own which brings up characters from his past.

It also is sort of bizarre to me that people who seem to be pretty smart, like Sam Waterston and Hugh Laurie, would choose to spend year after year doing such formulaic dreck.


01-02-08

Over Christmas I ran a sprint against my littlest brother. He beat me (barely), but the real insult and injury was that I pulled my hamstring in the process. My middle brother beat me in a sprint a few summers ago, so I am now officially dethroned. It sucks getting old, I'm more and more injury prone all the time, and all my little core dysfunctions keep compounding and have become systemic. Things like camping or backpacking are becoming more and more impractical for me.

I've always thought it was ironic that my skills are primarily mental, but the things I enjoy are physical. It's not so simple as some Psych 101 nonsense that I want to prove I can do the things I'm not so good at, the things the jocks used to lord over me in school. No, the pleasures I take in physical activity are the purely chemical pleasures of endorphins, the engorgement of the pump and testosterone, the zen meditative monotony of a bicycle pedal's cadence, the smell of the air and the constant variety of hiking in nature.

In the bigger picture of life, the one thing I've constantly really wanted since I was 16 or so has been to find the right woman, settle down, have a house and kids, be a dad, play with the family, cook, garden, all that stuff. It's sort of a similar irony that that is the exact thing I'm completely not wired to do. I'm a mean bastard and I suck at relationships, I shut people out and make them feel shitty. I would be really really good at being the single bachelor guy who works a lot and makes money and hustles around, that is what my skills are made for, but that disgusts me, I have no desire for that life.


01-02-08

Michael Pollan is one of the masters of the modern trend of turning a completely obvious 1-sentence throwaway into a whole book.


01-02-08

Software linkies :

DS Software has lots of cool tiny little techy apps, mostly encryption stuff and crc's but also just some little utilities. Good and functional programmer-ware.

tinyapps.org is a huge site full of lots of great tiny useful apps.

Anandtech Freeware list is a good aggregation of downloads.


12-29-07

If you look at the Bush administration as if they are trying to run a successful government, you would be shocked and amazed at the depth of their incompetence and poor judgement. Time after time, they seem to intentionally appoint people to head agencies that are grossly unqualified, and in fact quite often they appoint people who are specifically ideologically opposed to the mission of that agency. If instead you think of it as a clever way to destroy the government, it makes a lot more sense. The Republicans have found you can't really democratically kill government agencies, because the voters actually want government services, as much as they cry about taxes and fatcats in Washington, they want disaster aid and environmental protection and so on. On the other hand, if you just run the agencies horrifically badly, partly by restricting funding so they can't do their job, and partly by appointing incompetent leaders, the people will lose their love of those agencies; private sector alternatives will spring up that do the job better and more and more people will switch to the private solutions, see how bad the public agency is, and eventually be happy to let the public agency die. This isn't just insane conspiracy theory talk, it is in fact the explicit plan laid out by many of the more right wing think-tanks, the idea to starve government agencies to death through executive-branch administration rather than legislation. Of course no one in government could admit that they were actually doing this.


12-28-07

I've always been attracted to girls who are very emotional, who are very communicative and passionate. I'm sure I would be more compatible with a reasonable, nerdy, rational, stable girl, but I'm just not attracted to them. I guess it's a pretty cliché opposites attract kind of thing, but it's the defining dilemma of my adult relationships.


12-28-07

I get a lot of value out of using Perforce for solo work. I hardly ever use the history to revert things, the main thing I get is just the work pattern. I still atomize my checkins in functional transactions with descriptions. This quantizes my work nicely into packets in a way that just helps me mentally to tie up loose ends and check things off my todo lists. I also still diff my checkins against the depot so that I can see what I did. I try to make myself read the changes as a 3rd party observer, though I often get lazy and don't do that as well as I could. Still I will often catch some temp debug code in the diff that I need to comment out to check in, or more importantly I'll see the diff and there will be a little one line change with no comment and it will make me realize I really need to go back and mark what I was thinking there.


12-28-07

Dan has been waking up crazy early and falling asleep right when she comes home. I sleep pretty normal hours, so we are seeing each other only like 1 hour a day. I told her it's like the movie "Ladyhawke". I briefly thought of renting it for us, but unfortunately that awful musical score makes it totally unwatchable now. Ladyhawke is a super cheezy fantasy tale of this prince and princess who are cursed (or something like that) so that the guy turns into a wolf every night and the girl turns into a hawk by day, thus they can only be together for a second at sunrise and sunset, but they're still in love. Or something, it's laughably awesome. It's right at the peak of the 80's fantasy period which included such silly awful flawed gems as Legend, Krull, The Dark Crystal, Conan, Neverending Story etc. (they seem to all be in 82-85)

We thought "Batman Begins" was almost unwatchably awful. I really hate this mix of trying to make things more realistic, and yet still being just totally retarded and laughable. The idea of putting a super hero movie in a more realistic setting where you could imagine it's almost our world and it's really happening is a good one; Bourne is of course basically a super hero, but perhaps "Unbreakable" is the best recent real-world superhero movie. Anyway, I'd much rather have comic book movies just go totally fantasy world surreal. "Sin City" is probably the best, but I also like the old comic movies, "Dick Tracy" was great, but I also like the '89 Burton Batman. The new Batman greatly reduces all the cool designs and visuals, gives us these action scenes that are just closeups of limbs flying; actually let me stop, the fight scenes are some of the absolute worst fight scenes I've ever seen in any movie. Over and over Batman literally drops into a mob of guys, and then the camera goes into "close up on the elbow" mode where you can't see anything, but for some reason that whole gang stands around while Batman fights one guy at a time, and for some reason all the guys who had guns are no longer there. That would be okay if it was a silly "zap pow bang" Batman, but it's not supposed to be. The whole training/genesis story is so overdone now and this is one of the worst I've ever seen, it's so teenager goofy with the citadel on the mountain.

A cooler Batman movie would've been to actually go 100% realistic and make him not a superhero at all but just a guy with more realistic gadgets and ninja skills. Then make the whole first movie genesis, and spend way more time on his wandering period, make it more of a road movie ala "Into the Wild" where he's not sure what to do with his rage and uncertainty, maybe do the whole series-of-teachers motif where he spends a year here and there at different places around the world learning different skills from different people and picking different tools that will help him on his quest. Also the way to update it for the modern world is to make Batman's quest more about CIA corruption and corporate malfeasance rather than just cleaning up the streets which feels awfully quaint now.


12-27-07

More images from neardd :

1 2 3 4


12-27-07

I dropped my laptop over christmas and broke the screen. It still works plugged into a monitor, so I'm gonna keep using it for the foreseeable future, but man this fucking sucks. Christmas wound up costing me at least $2000 when laptop replacement is considered. It's kind of ironic because I was just thinking that one thing I really wanted to get was an attache-style aluminum briefcase for my laptop that locks and has rubber corners so it's drop proof. I was thinking if I become a nomad and wander the world I could keep my cash and my computer in a secure locking case, not that it would really do much good, people would just steal the whole case since it's obviously valuable.

Anyway, my broken laptop gave me the idea that there should be a "mini pc". Basically it's just a laptop, but with no keyboard, no screen, and no battery. You can carry it point to point (home to office) but need to plug it in to gear to use it, which is pretty much exactly what I've always done with my laptop. Eliminating all that junk should make it really tiny and light and cheap. It should also be able to run super cool and quiet since it has like no moving parts. It's different than just a "small pc" because it's running all laptop parts not normal PC parts so it can be really tiny and cool and low power (eg. it uses a power brick instead of a big power supply). Well of course it already exists : AOpen MiniPC ; oh yeah, I guess there's a mac one too that looks pretty rad if you're into that flavor.


12-27-07

How a proper paint program should work :

The artist should work with infinite resolution. There should be not a single pixel in the user's interaction. When you zoom in on an area, it gets finer and finer and finer. The disk data format is simply a series of commands - eg. stroke from here to here with this brush. Stroke coordinates and brush sizes should all be floating point. You can of course "render" the image out to a bitmap format, but that should be considered an "export" not a save. Since you always have a full execution sequence, you can of course go back and change the brush that you used for a given stroke or whatever you want to do.

The colors the artist picks from should be floating point and light-linear. These are of course only the palette colors, the rendered out (exported) bitmap will be in the appropriate finite color space.
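A tiny C++ sketch of what that stroke-command document might look like (all names and fields here are made up for illustration) :

#include <vector>

struct Color { float r, g, b; };     // floating point, light-linear palette color
struct Point { float x, y; };        // no pixels anywhere - coordinates are floats

struct Stroke
{
    int    brushId;                  // which brush; can be edited after the fact
    float  brushSize;                // floating point, in document units
    Color  color;
    std::vector<Point> path;
};

// The saved document is just the ordered command list; "export" replays the
// strokes into a bitmap at whatever resolution you want.
struct Document
{
    std::vector<Stroke> strokes;
};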


12-26-07

A while ago we went to the Sugimoto exhibit at the Asian Art Museum. @@


12-26-07

Traditional Blues music was spawned from the Black American condition in the South - the songs are about how hard life is, how the system's keeping you down, your woman's out runnin' round, landlord's after you for the rent again, you can't find no job, and your only comfort is in a bottle. The singer is certainly not a saint, but the blues comes from the outside world, how hard life is. "Songs Ohia" is Modern White Man's Blues. It's the blues of people who have every privilege in the world and yet still can't be happy. Why didn't I make the right choice, why did I fuck this relationship again, will I ever be a better man? This is a dumb armchair analysis but the point is - Songs Ohia is really whiney.


12-22-07

Radio silence for Christmas. How will you live without my snarky cynical commentary on life's trivia? Just another seasonal suicide.

I'm anticipating wanting all these wonderful familial holiday moments, singing carols at the piano, playing catch in the yard, and them somehow not working out. I'm anticipating trying hard not to roll my eyes as people pontificate on subjects they know nothing about, or convert every story into something that relates to them. I'm anticipating having to eat horrible unhealthy food in order to not hurt people's feelings. Noel.


12-21-07

At Oddworld we used to joke about making a Tetris-like game to trick people on the web into doing our lightmap packing for us. (optimal chart packing aka the pants problem is NP-hard). Well, guess what, people are doing it. ESP Game gets people to label pictures on the web. This is different from the dumb thing Google has had for a long time where you label pictures, because the ESP Game actually has some reasonably clever game design elements to make it sort of fun and actually motivate people into playing. Their other game Phetch is pretty okay too. It's fucking retarded that you have to sign in to play though.

The dream is that someday you have "Ender's Game" where you can put these various video games up on the web and people just think they are playing a fun game, but actually you've converted various hard problems into game form and had them solve it. For example, you could do something like convert the real stock market into a little management game type of thing and let people play the game and use their decisions to do real trades. (actually all of Web 2.0 is sort of based on this, you build "communities" where people think they are socializing when really they're creating free content for the site to make money from)


12-21-07

So I made the caramels and fudge yesterday.

I made the Good Eats Fudge recipe because it's more chocolaty than the Joy of Baking . One note : I think the Good Eats recipe has a misprint. It says 1 tablespoon of vanilla. Every single other fudge recipe in the world has 1 teaspoon of vanilla. I compromised and used a half tablespoon (=1.5 teaspoon) and it tastes good to me. I also added a tiny bit of salt. The texture came out great, and good walnuts are absolutely crucial, it would be insipid without them.

For the caramels, I found there are two general types of recipe. There's the one pot recipes, such as the New York Times recipe, where you don't separately brown the sugar. Then there's the two pot recipes, such as the Gourmet Magazine Salted Caramel Recipe . Brian Sharp has a big thing about how you need the caramelization (browning) flavor from browning the sugar, so I went with the two-pot method. But I noticed that the NYT recipe uses a lot more salt, and I really want the nice salty caramel effect, so I used 1.5 teaspoon instead of 1 teaspoon, and also put fleur de sel crystals on top. Note that using fleur de sel inside the caramel is totally pointless since all the salt is the same when it dissolves.

Later I found the much more instructive Jacques Pepin Caramel Recipe which is a joy to read. Jacques is one of the few chefs in the world that doesn't have his head up his ass. He respects the proper techniques and is an exacting gourmet, but completely without pretension, and he has no problem with using half assed and lowbrow ingredients when it doesn't make a difference. And he seems to really enjoy food and cooking, and he has good taste. I can't think of anyone else in his league, all the modern stars are such fucking retarded pricks. A lot of the problem is that mediocre people don't really understand why they're doing things, so they don't understand when it's okay to cut corners, and they can only talk about the way things "should" be done.

BTW I used my digital probe for the candy temperatures and it worked fine. There's no need for a candy thermometer these days. Actually the probe is way better because you can set an alarm temperature. Also, I had the problem that Brian predicted with heavy pans. I have Calphalon pans and they have a very large "heat momentum". When you're heating the caramel up to 248, you can shut off the heat but the pan keeps going. I panicked for about 3 seconds then just poured the caramel off into a cool empty pan.

Today I dipped half the caramels in chocolate. I tempered the chocolate using the simple seed method . I think it worked out okay to restore the chocolate to good temper. There's a big thing about the crystals involved in tempering chocolate at wikipedia ; it gives you an idea how you can have various qualities of temper; the seed method doesn't really give you a super hard super shiny chocolate like you would dream of, but it's better than nothing. After dipping the caramels I had a bunch of melted chocolate left over, so I dipped a few pieces of fudge. HOLY ZOMG WTF BBQ !!! Chocolate dipped fudge is the fucking bomb.

BTW every time I see a comedian or a NYT article make a joke about Wikipedia, it just reveals to me how fucking retarded they are.

Tasting notes : I think the caramels with the fleur de sel were the best, rich and buttery and when you get a big salt crystal it just explodes with a zing in your mouth that's quite pleasant as a contrast to the caramel. I know, I know, it's so 2003.

Oh, I also made the cookies the next day. A few little notes for myself - the walnuts in cookies are a little tricky. If you just put in raw walnuts they don't toast enough with the cookie; if you put in fully preroasted walnuts some of them will burn; the ideal thing would be something like half-roasted nuts. I still haven't found a good chocolate chunk solution. The pre-chunked ones in stores like Nestle chunks are ridiculously overpriced and also shitty ass quality chocolate. My solution this time was tasty but labor intensive. I bought the 54% Pound Plus bar at Trader Joe's, put it in the oven briefly to just get it soft but not melted, and then cut it into chunks. If you cut it cold it slivers into tiny pieces that aren't good in a cookie. I then put the cut pieces in the fridge to solidify. It worked fine but it's not worth doing for normal occasions. I cut the bar pieces into 4 chunks, they were still a bit too big that way.


12-20-07

More and more programmers are becoming responsible about documenting the code, describing what it does, what you need to pass in. That's all well and good, but it's still missing a huge aspect - documenting what's NOT in the code.

Code that just does something really has very little value. If you know what the code should do, it's easy to write. What does have a lot of value is a record of the knowledge and experimentation that went into a piece of code. If someone spends months trying all these different ideas, and finally comes up with a great solution and writes the code - the most valuable piece of that is all the things that were tried and ruled out. You need to write up why you are not doing the alternatives.

Often the trickiest bits of experience-based code look really trivial and don't get commented at all. Sometimes the nastiest weird case bug fixes just consist of doing a check that seems redundant. These are the things that really need to be richly commented for the future.
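For example, the kind of comment I mean (a made-up snippet, not from any real codebase) :

#include <cstddef>

struct Node { Node * child; int value; };

int SumTree(const Node * node)
{
    // NOTE : this null check looks redundant - callers are "supposed to" pass
    //  a valid tree - but a cancelled streaming load can hand us a half-built
    //  tree, and we crashed here for a week before finding that. Don't remove
    //  it without fixing the loader ordering first.
    if ( node == NULL )
        return 0;

    // NOTE : we tried an iterative version with an explicit stack; it was no
    //  faster here and much harder to read, so it was ruled out.
    return node->value + SumTree(node->child);
}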

Without this stuff, the code picks up a lifespan where its usefulness dies as the knowledge used to make it dies.


12-19-07

Just because someone can have a draw doesn't mean you can think about calling. For example if someone only jams the flop with sets and draws - you must fold everything to them. In that case the chance they have a draw is quite high, the problem is even when they have a draw their equity is still good (30-35%). Either you have 0% (vs a set) or only 65% (vs a draw). You need to be actually beating some of their range to consider calling. Most people are way too loose about calling just because "the board is drawy, he could be shoving a draw" ; this is why it's so great to shove real hands, even as weak as single pair overpairs on drawy flops.
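A quick worked example with made-up numbers : say his jamming range really is half sets and half draws, and your one-pair hand has ~0% equity against the set and ~65% against the draw. Your overall equity is 0.5 * 0 + 0.5 * 0.65 = 32.5%, which is right around (or below) the pot odds you're typically getting on a pot-sized jam - so even giving him maximum credit for draws, calling doesn't make money.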


12-19-07

I made a little test app that does local regression on k-NN. It's a crappy test app, maybe I'll post it. Anyhoo, while it had bugs it made some really cool images. I didn't actually cap the best ones cuz I was just fixing bugs, but then I realized man these visualizations are cool looking so I did cap a few as I fixed the last few bugs. Unfortunately I could never intentionally recreate the bizarre bugs I had that made such cool images.

Okay I put up the exe


12-19-07

Watched the movie "Police Beat" last night. It's definitely a flawed movie; a lot of the police incidents are just too random and don't fit, and I didn't like the ending at all, some of the supporting actors are terrible. Overall though it's brilliant. Basically the entire movie is one man talking to himself, thinking inside his own head, and yet it's engaging and interesting. It has a beautiful bizarre feel to it, and lots of great little-known Seattle scenery.


12-19-07

The walnuts we get at farmer's market are such a revelation; they're so sweet, and have these floral notes, and they completely lack the bitterness that makes your mouth pucker when you eat normal walnuts. I'd always heard that nuts go rancid quickly, but never thought it was a big issue, now I realize that pretty much every single supermarket nut is well on its way to being rancid. It's quite a treat to discover how good these basic ingredients can be, but on the other hand it leads me down the path to being a horrible food snob who says things like "oh, where did you get these walnuts? from safeway? no thanks, I don't eat supermarket walnuts". good grief.


12-19-07

Back in the old C days, to do fast allocations I would use various manual allocators. A standard one was a simple pool with freelist. These days I usually just replace the global allocator with some fancy thingy and let it go. I was looking over some ancient code today (the Genesis curved surfaces with view dependent tessellation, that was fun), and it reminded me of some advantages of the old custom allocator scheme. (I'm talking about an allocator which is used only for one type of object; some tree node for example)

Obviously you have the advantage of being able to custom design the allocator for your usage pattern to optimize speed. That's not a huge edge these days, but there are things you can't do with a normal allocator :

1. Just tossing the whole object. When your object is some complex tree, you can do all these little allocations, then when you're done with it you can just reset the whole pool, you don't have to walk the tree. This is not so much just a speed win as it is nice for simplicity and code reduction, and when you're in the destruction phase you don't have to worry about tracking pointers or who owns the pointers or anything.

2. Getting a linear memory iteration on the nodes. This is the really cool thing. So you build up this whole tree using some complex logic. Now you want to do something where you have to visit every node. If you descend the tree it will be in random memory order and be totally horrible for performance. What you really want is to walk the nodes in linear memory order. Of course you could maintain this as a side structure with a normal allocator, but if you have a pool allocator, the linear hunks of nodes are right there. You just get the memory blocks from the allocator and iterate over them (see the sketch after this list).

3. Other nice accounting things, like easily being able to ask your allocator how many are allocated and have it give you the answer for just this exact type of object.
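A rough C++ sketch of that kind of per-type pool (my names, not from any particular codebase; no freelist or placement-new here for brevity - just block allocation, whole-pool reset, linear iteration, and a count) :

#include <vector>
#include <cstddef>

template <typename T, size_t BLOCK_SIZE = 256>
class Pool
{
public:
    Pool() : m_countInLastBlock(0) { }
    ~Pool() { Reset(); }

    // hand out the next slot; new T[] default-constructs the whole block,
    // which a real pool would avoid with raw storage + placement new
    T * Alloc()
    {
        if ( m_blocks.empty() || m_countInLastBlock == BLOCK_SIZE )
        {
            m_blocks.push_back( new T [BLOCK_SIZE] );
            m_countInLastBlock = 0;
        }
        return & m_blocks.back()[ m_countInLastBlock++ ];
    }

    // (1) toss the whole object graph without walking the tree
    void Reset()
    {
        for ( size_t i = 0; i < m_blocks.size(); i++ )
            delete [] m_blocks[i];
        m_blocks.clear();
        m_countInLastBlock = 0;
    }

    // (2) visit every node in linear memory order, ignoring tree structure
    template <typename Visitor>
    void ForEach(Visitor visit)
    {
        for ( size_t b = 0; b < m_blocks.size(); b++ )
        {
            size_t n = ( b + 1 == m_blocks.size() ) ? m_countInLastBlock : BLOCK_SIZE;
            for ( size_t i = 0; i < n; i++ )
                visit( m_blocks[b][i] );
        }
    }

    // (3) accounting for just this object type
    size_t Count() const
    {
        if ( m_blocks.empty() ) return 0;
        return ( m_blocks.size() - 1 ) * BLOCK_SIZE + m_countInLastBlock;
    }

private:
    std::vector<T *> m_blocks;
    size_t           m_countInLastBlock;
};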


12-17-07

Doh! I had all these 2032 batteries for my old graphing calculators (I was a Casio man, I liked that it was slimmer and the buttons had a lighter touch than the HP and TI which were built like bricks). I threw them away cuz I thought I'd never use them, and now I find it's the same battery that failed on my motherboard (the 3V CMOS battery). On the plus side, the computer works totally fine without it, it just doesn't save your CMOS settings when you unplug it, so what.


12-16-07

I'm just so disgusted by the self help pseudoscience new age mumbo jumbo that PBS peddles these days. FYI all the "detox" regimens are just complete nonsense. People are misled into thinking they are beneficial because they sort of feel like they should be. In many cases they can give you a pleasant feeling and perhaps a feeling of being energized and reinvigorated. Basically what you've done is starve yourself. With low blood sugar you get a euphoric feeling which is rather pleasant. Your body also supplies adrenalin under mild starvation, presumably to help you get some food. None of these things are actually beneficial. You can get a better euphoric effect with less body damage just by doing some drugs.

Also, why do they have to do fundraisers all the time for BBC programs? Can't they get BBC shows nearly for free?


12-16-07

There's this weird nonlinear thing about understanding. If you don't think about something too much, you can do it alright. Or if you really deeply understand something, you do alright. But in between, if you sort of understand something but not really, you can get totally confused and do completely the wrong thing. I saw it all the time in physics (and of course went through it myself). There were the engineer types in classes who would just learn the equations and how to use them to do problems and they were quite successful. The people who were trying to really understand it on a deep level but not getting it would develop all sorts of confused and wrong ideas - especially in a field like quantum mechanics that's quite deep and hard to get, you'd get people thinking confused things like "quantum mechanics can't be a whole solution because it postulates an external observer that can collapse the waveform via measurement" (of course with deeper understanding you would understand the observer is a quantum system and the "collapse" is just via loss of entanglement through ensemble decorrelation).

I seem to go through this spectrum almost daily with one thing or another. I've figured out certain rules to just follow to get the right answer, and that works fine. But then I forget why I'm doing that way and start questioning and thinking too much, and then I start doing really retarded things. After spending quite a bit of time I come back to really understand it and see why those rules were right. It would be better just to follow the rules and not think so much.

One area where you see this immensely is poker. Poker is such a good field to study human behavior because poker itself gives you very indirect feedback about your actions. Because of the randomness it's really hard to tell when what you're doing is right (compare to say, stabbing yourself with a knife, which gives you very immediate feedback that you did something wrong). Because of the lack of feedback people will do what is natural to their brain and not correct their mistakes. Anyway, you see people who just read a book and follow the directions, and they can actually do quite well. Some of the worst people are actually those who sort of understand the game and kind of get the logic and start thinking things out for themselves. These people do absolutely retarded things because they've developed strategies like slowplaying their big hands to balance their range and deceive their opponent.


12-16-07

We may never have reasonable health care costs in America. The problem is that the American public has this incorrect idea that allowing people to take huge profits out of the system is a crucial and necessary and even admirable part of capitalism, which is the inalienable right of every American and drives innovation and improvements in service. This is questionable in any market, but it's just completely wrong in a non-competitive area like health care. The most important part of capitalism is that the sellers have the freedom to provide the services that they think the market wants, and the buyers have the ability to choose between various sellers with good information about the costs and quality of the different choices. Both of these are completely missing in health care. The problem is when you tack this onto an American ethos where people believe they are entitled to massive profit, you wind up with an inefficient bloated mess. To fix health care, we need to do things like motivate the providers to cut costs, which would encourage them to do more prevention; we need to get way more primary care doctors and fewer emergency room visits, but primary care doctors make way less. In America any system such as forcing more doctors to do primary care instead of higher paying specialist jobs is a non-starter because of our whacked out misunderstanding of the benefits of capitalism.


12-15-07

The Bladerunner rerelease should remind us all how much better movies were before these fucking computers got involved. Look at the beauty and simplicity and naturalism of the special effects in the movie. No bizarro shiny surfaces, no crazy 1000 g-force camera sweeps, just good art and design. Of course it does have a lot of those 80's-movie touches which are so unwatchable today.


12-15-07

One thing I was trying to say with the "ergonomics" post is that everything is related, and often the flare ups are just symptoms of this larger system. You might develop elbow pain, and you think the solution is to change the way you're holding your arms. Most likely that's just a noticeable sign of a much larger hard to see problem. Bad body use is like the underground mass of a mushroom, the little pains you get are just the mushroom caps shooting out. If you just attack the symptoms you never cure the problem, and it will keep manifesting in various ways. One of the more subtle ways that bad body use can affect you is just by increasing your likelihood of injury. You might actually injure yourself skiing or biking or playing catch, but it was the mushroom body which caused your muscle support to be imbalanced, or your ligaments to be too tight, which made you more prone to that injury. p.s. I know this is borderline chiropractic/holistic mumbo-jumbo.


12-15-07

Basic method of cooking mustard greens : start sautéing some onion. Chop the greens, making very small pieces of the bottom thick stem parts, and very large pieces of the top leafy parts. Salt & pepper of course. Once the onion is transparent, toss in just the stem parts of the greens; add some butter to keep the fat content high enough that you get some real browning. Once the stem parts soften toss in the leafy parts. Toss and sauté just briefly, then add a few tbsp of stock to make some steam, put on the lid and turn to low for about 6 minutes.

Quick fruit cobbler : prepare streusel topping; basic streusel is 1 cup flour, 1/2 cup brown sugar, 1 stick butter, cut the cold butter into the flour and sugar. It's quite flexible though, so you can add nuts (pecans are best) or oats; you can also add baking powder to puff it a bit but I don't recommend that. I like both nuts and oats, in which case you reduce the flour to 1/2 cup.

Peel and chop 1 apple and 1 persimmon (Fuyu non-astringent type). Meanwhile, put 2-3 tbsp sugar in a pan and heat up until it liquefies and starts to brown; just as it starts to brown toss in a 2 tbsp pat of butter. Stir to melt the butter then toss in the fruit. Flatten it so it cooks, stir occasionally until fruit is soft. Salt it. Season the fruit with a little cinnamon and fresh grated nutmeg; not too much! we just want a slight accent not a pumpkin-spice explosion. Dump the fruit in a ramekin, it should be piled almost to the top; cover with lots of streusel. Put in 375 degree oven for about 30 minutes (start checking at 25).

Apples are really a shitty fruit to make dessert from, they have no taste at all. I hate killing them with pumpkin spice, I found this is a really nice solution, lots of butter and caramel flavor, and the persimmon is a nice subtle companion.


12-14-07

Pinocchio robot game idea @@@


12-13-07

It's really cold here (like 36) and our apartment is awful, so I've been using a hot water bottle every night to heat the bed. They're really amazingly effective, a very pleasant kind of heat, and you can put it right down by your feet where you need it. I also find it charmingly old fashioned. An even cooler old fashioned gizmo that you never see these days is the wood-framed bed warmer. They're like a body-sized wooden boat with struts over top, and you heat up a stone in your fire and then put the stone in the boat and put the whole thing in your bed. The struts tent up the sheets so they don't touch the stone and burn, and the whole thing makes the bed toasty hot. This is pretty close to what I describe ; here are some other different forms of the same device. I just found that OldAndInteresting site, it's pretty great. here's an Italian bed warming dealy


12-12-07

Techie trends that are horrible for the body :

1. Messenger bags and satchels. These things are asymmetrically weighted and generally apply unilateral pressure (pressure to just one side of the body). Aside from concentrating the weight in hot spots, they cause leaning which is just horrible for the shoulders and the spine. Try to always use symmetric whole body carrying devices with good load distribution, such as backpacks.

2. Text messaging, small mobile devices. People are using these things more and more, and the tiny keyboards make you put your hands together like claws. They're like 10x more powerful than a mouse at generating RSI, plus you have a tiny screen so you stick your neck out to get closer and look down. On the plus side, you might be standing up and walking around which is great, but heavy use is still going to destroy the hands and wrists.

3. Laptops and cute little desktops. As computers become more design driven and people want to hide them in the living room, they become smaller and people are not as willing to have monitor raisers and proper desks and such. This is directly choosing appearance over health. In particular, not having a separable keyboard and monitor is just awful. Hopefully we'll get some better laptop designs soon where you can detach the screen and stand it up, but you still really need a wider keyboard and a screen that can be raised to neutral height.

I also found these "Computer Guy" workouts at T-Nation which are pretty good : part 1 , part 2 . They're basically strengthening to fight kyphosis and promote scapular retraction and stability, which is what you should focus on. Again, it's a crazy lifter web site, so ignore the retarded side bars and don't go browsing around, but the content of these specific articles is good.


12-10-07

Most people aren't just retarded and inconsiderate and lazy, they're willfully selfish and vindictive and greedy, and what's more, they're proud of it.


12-10-07

The problem with video games is there's no sex. I don't mean virtual sex in the game world, that's awful, I mean in the industry and community, amongst the fans and the fan-sites. Sex is what drives most art. Why do boys want to be rock and roll stars? To get sex. All the parties with celebrities and musicians are so exciting because the people are beautiful and everyone is having sex. What do the crazed fans of musicians dream of? sex. Why do the interns and roadies work in those industries for crap wages in shit positions? Because they want to be around the sex and get some cast-offs. What made myspace so popular? Fans trying to hook up with bands, impress each other and bed each other. Lots of sex.

The sex in these industries is not just relevant to the people who are specifically in it for the sex. In fact less than 10% of the people are actively involved in seeking or having sex, but the effect spills out to the whole fan base. The presence and competition for sex creates an excitement, an energy, that fills the whole social interaction. It brings in girls, and makes everyone want to impress each other. Everyone tries way harder to seem "cool" because that leads to hookups - and then the peripheral people who aren't involved in the hookups also try to seem cool to keep up, or to impress the cool people who were drawn in.

Look at something like Extreme Sports. Sure there are a few people who are actually into it for the excitement of doing it, but that group is very tiny. Then you get a huge female fanbase that is in it for sex with the stars of the sport. Maybe not actual sex, but fantasizing about them, thinking how cool they are, etc. This creates a huge explosion of guy fans who dress up in the style of the sport and try to do the moves and act like the stars in order to get the cast offs. This leads to even more girl fans dressing up who are just interested in hooking up within that subculture.

The huge websites like myspace and facebook are basically driven by sex. They were tiny and not much used until they became a hookup site, and that led to an explosion. Not only does it draw in lots of cooler people, it motivates everyone to put more effort into their pages, to actually post pictures of themselves, it also made it cool for popular people instead of just being nerdy. Having a good page became a way of peacocking for partners.

Note that sites and activities that are specifically *for* sex don't really have this same effect. Nobody wants to admit that they're after the hookups, and certainly the popular people that you need to drive the pyramid can't be seen actively seeking hookups. You need to be able to at least pretend you're there for a different reason, and there does need to be a legitimate networking activity underlying the site, since only 10% or less of the traffic is actually for sex.

Video games are totally lacking this. There are no sexy video game makers, no parties, no reason why anybody would want access to the industry, the people who play games are not sexy, the fan sites don't lead to hookups, etc. The closest thing that video games have are very social simple MMO games. I think there's a possibility for an explosion in that genre, but at the moment all those types of virtual worlds are basically worse than Facebook and really provide zero reason to play them. For one thing, seeing stupid 3d avatars is not hot, you want to see actual photos.


12-10-07

The AMT has gotten a lot of flak, but the criticism is basically without merit. A lot of people will be affected by it, but the canard that that is just because of inflation is not true. Yes, the fact that it's not adjusted for inflation brings a lot of people under its domain, but the reason those people are so widely affected is because of the Bush tax cuts, which lowered their regular rate but not their AMT. People without a lot of deductions generally aren't affected at all.

Basically the AMT is a flat tax with a large deductible, something like a 28% tax with a $60k deductible. That's an extremely simple and fair system, and again the claims that the AMT is "too complex" or "unfair" are preposterous. It's the simplest and fairest tax we have. I'm quite sure all the anti-AMT pressure is coming from the super rich, who are the only people that are very heavily affected by the AMT.
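To make the numbers concrete (using the rough figures above, which are only approximate) : a household making $160k would owe 28% of ($160k - $60k) = $28k, and one making $60k or less would owe nothing.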

The AMT disallows lots of deductions and income hiding schemes. Even with the AMT in place the super rich seem to generally find good ways to not pay taxes.

Rather than repeal the AMT we should repeal the whole regular tax code and just adopt the AMT. (A few little fixes to the AMT would be warranted, such as allowing the deduction of local and foreign taxes paid).


12-08-07

Treats I'm thinking of making for Christmas :

Some real fudge, cuz I've never made actual real fudge before, only stuff like "Million Dollar Fudge" and the other easy faux-fudges.

Salted caramel. Cuz it's really delicious and super easy and trendy.

Chocolate chip cookies. Cuz I make the best in the universe.

Roast peanuts. Probably nobody will appreciate these, but it will at least be something on the treat table that I myself will enjoy eating.

One of my favorite cookies we used to always have around christmas was Mexican Wedding Cookies; we always called them "Pecan Balls" , or "Russian Tea Balls" which seems to be an identical concoction. I guess some people call them "Russian Tea Cakes" which is a bizarre thing to do. Anyway I think I probably won't make them but I will fantasize about them.


12-08-07

Sensitive men in their 20's try to treat women as human beings, as peers; they show them the true respect of having high expectations of their competence, the true respect of sometimes not agreeing with them or not letting them have their way and expecting them not to throw a tantrum about it. Older men tend to treat women like pets or retarded children; they often speak of how they love and respect women, and will be very gracious and courteous to them, but they simply lie and emotionally manipulate and avoid problems while never really letting them do anything important. The vast majority of women seem to prefer the latter.


12-08-07

Trivial way to do iterative game theory solutions : player A goes first. Try all possible moves for player A. Choose the move that optimizes his EV. Assume that player A knows that player B will be playing the best possible way for himself. For each move of player A's in the EV computation, simulate all possible moves, eg. go to player B's move and try all possible moves, assume player B knows player A's strategy but not his actual hand, and let player B choose the strategy that is best for him. (the branch is not actually on the # of moves, it's on the # of strategies, which is much larger).

Note that this does not necessarily give you the correct game theory solution in all cases since it's a greedy search, but in simple games it will (games with a piecewise linear EV shape). It's also exponentially branching, but it's not actually that bad. The reason is you're not actually trying to find a full solution into the future, I just want player A's next move, then after he moves I'll do this again from scratch to find player B's next move. Player A's best move has a decreasing dependence on the future moves (in simple games anyway, this is crucial, I don't want moves N into the future to suddenly be more important than earlier moves). That is, A's best move is highly dependent on player B's next move, less so dependent on the next move and even less dependent on the next move. What that means is you only need to search a few moves ahead, and then you can just use some heuristic EV evaluation of the situation and terminate the branching. You also don't need to simulate branches that are obviously horrible for the person making the choice (actually even good branches which you can determine are definitely worse than some other branch can also be dropped).

In poker in particular you can usually stop the sim when the current round ends and just do a heuristic EV for the next round. eg. simulate all the possible player actions in the current round, but whenever somebody caps the betting or just calls, you simulate drawing a future card and just evaluate a heuristic EV based on the probabilities of improving. The heuristic EV still needs to be reasonably complex, it should include factors for position and being the leader, the fact that people on draws will put more money in if they hit but just check-fold if they miss, etc. but it doesn't need to simulate every possible action on all possible future cards.


12-07-07

The cold seeps into our apartment through the floor, it radiates through the windows like negative sunlight, it creeps through the cracks all over, at the front door, the baseboards, it's inescapable.


12-07-07

There's a new movement that's growing, I'm not sure if it has a name yet. It hasn't made the Style page of the New York Times yet, it hasn't been labelled by Rolling Stone. It's about the unpretentious creation of joy. It's about random happenings, art that doesn't mean anything, creative alternative histories, it's about dressing like a hipster but not ironically, it's about playing ethnic and forgotten music not because it's funny or a curio but because it's great music, it's about punk diy art, making your own bikes, making your own everything, it's about rejecting commercialism and norms but not in an aggressive anti-society way, rather just peacefully choosing not to be part of that. This movement is by its nature small and will die when it gets discovered and becomes popular and exploited to sell products. Some of the people who live the joyous life will quietly continue to do so.


12-07-07

So placing a credit report fraud report was really easy. Just call 800-680-7289 and it's an automated phone thing and it gets sent to all 3 agencies. Why I can't do it online I dunno, but whatever. Anyway, it's kind of a cool thing to do even if you haven't actually had any warnings of fraud. All it does is make them send you notification and get confirmation if anyone tries to open a new account in your name or get credit. Obviously something they should be doing all the time, but they want you to pay for it.


12-06-07

Followup on my previous eSATA report :

The best drive is the Samsung F1 which is very fast, runs quite cool, and is also nice and quiet. The Western Digital GP is even cooler and slightly quieter, and has a longer MTBF and better head parking, so if you just care about backing up data really safely it would be a better drive (but it's a lot slower).

All the SATA PCMCIA CardBus cards seem to be about the same. CardBus can run at 132 MB/sec which is theoretically slightly less than SATA can do; in practice I can't imagine it will be a limit on any real world drive use. ExpressCard is even faster of course. BTW these cards are what provide the eSATA ports.

Now you could of course stick this in a normal enclosure and be good to go. It's good to pick a cool drive cuz all the enclosures suck pretty bad for cooling, even ones with fans. They also all pretty much suck for noise reduction. Despite the manufacturer advertising, the sealed all-aluminum enclosures are not particularly good for noise reduction, because the drive is bolted to the enclosure it just transfers vibration and acts as an amplifier. The ones with fans pretty much all suck for quietness as they have ass-tastic fans. Your best options seem to be : CoolerMaster X-Craft fanless enclosure but with good thermal design (open vents - presumably very loud), Apricorn EZ-BUS fanned enclosure is supposedly decently quiet (I haven't found reliable reviews on this), or the Rosewill RX-358 crappy noisy fanned enclosure, but it uses a standard 80mm fan so you can replace it with one of the high quality silent fans and presumably get a decent result.

But there's another way to go, which is basically just running your SATA drive bare. What you do is just take the bare drive and run a SATA-to-eSATA cable and plug it into your eSATA port. Then you just power on the drive with an AC-to-molex or AC-to-eSATA power cable. this blog is the closest thing I've found to a "how to" on that simple operation. One tricky bit I'm finding is just finding the power adapter. I want one with a hard power switch and they're really hard to find. WTF I just want an AC to DC-molex (4 pin) power brick with a switch, how is that not cheap and standard? One option for getting this power cord is to just buy a USB to SATA box and not use the USB part at all and just use the power supply.

The final piece of the puzzle is instead of just sitting your drive on your desktop bare, you put it in a Scythe Quiet Drive . I don't know why more external enclosures aren't designed like the Quiet Drive. It's got noise dampening heat-conductive foam, so the whole box acts like a heat sink (just like all the aluminum enclosures) but it has excellent noise reduction properties. To make it really quiet you should suspend the whole box in an elastic web, which you can of course do. Quiet Drive should not be used with a drive that runs hot, which is why we have to buy one that's reasonably cool. It does make the drive cooler than just sitting it bare on a desk.

Even though this is just a bare external drive, it gets expensive. The SATA-eSATA cable is around $10, the external power supply is $15-$20, and the Quiet Drive box is $35-$40, making a $70 enclosure.


12-06-07

"The Departed" is a really trashy movie. Sure it's entertaining, plenty of movie stars hamming it up and lots of violence, but it's devoid of any intellect or character. The whole first 20 minutes is one big ridiculous Irish stereotype. Ah, look at the mick cops drinking whiskey and fighting and talking about the mothers. In fact, this movie is the straw that pushes Scorcese into the ridiculous category for me. He seems to have voluntarily pigeon-holed himself as a director of gangster flicks, and they all center around ridiculous stereotypes. His Italians are all "wassa mada you" and "you lookin at me?". He's touted as a great American director - but really he's only capturing the American experience in the fact that we love racist depictions of minorities. His movies are still good fun, flashy overwrought camera work, lots of violence and money, and the more and more tired use of classic rock - they're superb exploitation flicks.


12-06-07

Real Belgian Waffles at the Waffle Truck are pretty exciting. They're at Civic Center market on Wednesday mornings. The truck is run by two guys from Belgium who go around to different farmers markets. They're real yeast-raised waffles with pearl sugar and everything. (addendum : I just read that they import their dough from Belgium flash-frozen. WTF that seems so retarded, how can they not make it themselves?) (they do apparently have a real cast iron belgian waffle press; yes, it's just a press, you heat it over a gas flame just like a frying pan, I've been wanting one of these for years but they're super hard to find in the US)

I've been thinking about buzzing off all my hair again (I do it every few months), but when I was at market yesterday this gay black homeless guy talked me out of it. Apparently he used to be a hair stylist before his life went off the rails. He was drunk at 10 AM and still had half a six pack left, carrying around the beer cans by the loose plastic rings of the ones that were gone. I gave him half my waffle.

I made kind of a fancy dinner but it didn't come out that great. Chanterelle risotto was good, the texture was almost perfect; actually risotto is one of those things that's really easy if you can cook at all but people think it's way harder than it is so it's impressive; anyway, the problem is chanterelles are too mild and it wasn't really a good use of them. I'd rather do a Porcini or Morel or King Trumpet risotto, and the chanterelles would be better just sautéed and tossed with some plain pasta with butter and garlic.

Main course was Porter-braised Lamb Shank. I made up the recipe, sort of inspired by the idea of cola-braising or osso buco. Basic prep : brown the meat, remove, toss in mirepoix and sauté in pan juices, add tomato paste and cook out the raw flavor, add lots of garlic, deglaze with porter and chicken stock, now boil hard to reduce a bit, return meat to pan and bake at 350 for 1.5 hours with lid on, remove lid and bake a half hour more. Braising liquid should be way reduced to a thick sauce. That all worked but I made a few mistakes. It wasn't the ideal cut of meat, it was too lean and got dried out; something like beef short ribs or pork shoulder would've been better. I also made the mistake of adding a bit of brown sugar to the liquid to enhance the sweetness, but I shouldn't have, it was plenty sweet without it and it made it too sweet.

One thing that did work really well is I roasted carrots and pearl onions separately to plate with the meat. In the past I would've tried to cook them in the braising pot, but it's so much harder to control and get everything to finish at the same time when you cook them together. It's way easier to do what restaurants do, which is cook everything separately and then just assemble a plate as if it was done together, drizzle the sauce around and everyone's happy. Roast carrots and pearl onions was an excellent accompaniment; I'm just in love with plain roast vegetables these days.

In other food news, I made some roast chicken the other night that was some of the best I ever made, cuz I cheated. I wasn't really planning on making it and just picked up some random pieces at the store and tossed it in. The secret, I believe, was that it was not an actual whole chicken, but rather just breasts and legs. Having it pre-cut lets you cook it hotter and faster which makes it easier to get that crispy skin with meat inside that's just cooked. You can also start the legs 5 minutes before the breasts so they finish at the same time. I just rubbed the skin with butter, lots of salt and pepper, and stuck whole rosemary twigs between the skin and the flesh (easy to remove when it's done, you don't want to eat rosemary). Cook at 400 for 15-20 minutes (+5 more for legs).


12-04-07

I wrote about Hugo Chavez a while ago so I feel compelled to follow up now that his constitutional changes have failed to pass. I'm quite surprised and pleased (all the commentators in Venezuela were quite surprised as well), I think it's a sign that democracy is still somewhat alive in Venezuela and perhaps his power will not go unchecked. Unfortunately, the legislature is still illegitimate due to the lack of opposition parties, and Chavez will be able to pass most of the changes as laws rather than constitutional amendments, though a few things will be forbidden such as his ability to be president for life. Anyway, a positive sign and hopefully good news for Venezuela going forward. Unfortunately at the same time Russia has slipped even further away from democracy.


12-04-07

"Calorie Restriction" is still a relatively new area of research, so let's not presume it's right yet, but it's not really a huge surprise that it would prolong life. Basically what you're doing is putting your body into semi-hibernation. If you're on a severely calorie restricted diet (they recommend 1000-1500 calories), and you've acheived a steady state (no longer losing weight), that necessarilly means that your physical activity level is very very low. Basically you're consuming very little and doing very little. All the pathways of your body slow down and do less, your metabolism does less, your mitochondria don't have to work as much, etc. If we assume that each element in your body has some fixed failure rate, like eg. cell division has a 1% chance of screwing up each time - the less you work the longer you can postpone destructive failures. (this fixed failure rate is in fact a good model for DNA/RNA transcription and also for production of free radicals and misformed proteins).

To put it another way so that you can see how retarded this idea is - if you severely reduce the amount of gas you use in your car, your car will last much longer. Yes, of course this is true, because using less gas inherently means you are driving less, and accelerating less, and the life of the car is roughly based on how much you use it, not calendar age.

On the other hand this may be part of the reason why some people who seem super fit don't live longer. Naive people often think it's ironic when a serious runner dies young. They think all that running was for nothing because it didn't prolong life. That's almost as foolish as thinking that running a marathon is good for your health. Moderate exercise probably prolongs life (though there are so many other factors that it's not a 100% correlation). Very heavy exercise, however, probably shortens life. For one thing being in a near-starving state as distance runners often are is very hard on the organs and the brain. For another thing, the opposite of calorie restriction, which is a high "g flux" (consuming a ton of calories and burning a ton of calories), almost certainly shortens life, because you are constantly breaking down and creating new cells and proteins which is putting a big strain on your body and increasing the chance of mistakes happening somewhere in all that molecular work.

I personally choose to live the high g-flux lifestyle myself, just as I choose to use alcohol and drive fast and do many other things that are likely to shorten my life. I totally don't understand the desire to slightly increase your predicted lifespan by giving up quality of life today. Are those extra 0.5 years when you're 85 really going to be awesome? (of course people do retarded things in the opposite direction too, like choose to not wear a seat belt because they don't like the feel of it; okay, you choose to greatly increase your chance of severe injury in an accident because you don't like the feel of the strap, good decision, let me make sure you are never my manager).


12-03-07

It would be cool to have a manual coffee dripper cone thing with an adjustable spout so you can slow down the flow even more if you want to. It would be pretty easy, two metal crescent moons connected by a bolt so you can screw them closer or farther apart.


12-03-07

I'm listening to Jon's talk at Montreal on game design. It's worth a listen, but it's kind of long, so you can just read the PPT slides and pretty much get the point (I listened to it, there's not a ton in the dialog that's not in the slides).

I definitely agree with the general idea that games are uninteresting and could be so much better and aren't. I'm also very glad guys like Jon and checker are out there shaking up the industry trying to get people to do better work.

Jon does a really good job of presenting it as sort of an attack on the game industry, which makes it controversial without being too offensive, and also makes it a challenge for the industry.

Games are an interactive medium which could be an art form which could let the user experience a wide range of discoveries and emotions and different intellectual and physical challenges, but they rarely get outside of a very narrow band. Almost all games (and not just video games, but also board games and card games and sports) are in the mode of "work on a skill, get rewarded when your skill improves, repeat".

For one thing I object to the idea that games take advantage of players and are only enjoyable in a Pavlovian "drug-like" sense. Good multiplayer games certainly hit the exact same mental pathways as sports or board games. I don't think that anyone claims that sports or board games are mentally destructive or that the pleasure they give you is somehow inferior to other forms of pleasure. Now, of course that is not the only form of pleasure that games work via. Another is the "slot machine" pleasure which is indeed "drug like"; this is almost a trance-like mental state, and again I don't really think there's anything inherently wrong with it. Pretty much every Popcap game works on this level, and it's not really too different from sitting and playing Solitaire with cards, or even from watching TV. It's not really a high form of pleasure, but criticizing people for wanting a low form of pleasure, or companies for providing it, is pretty goofy; 90% of consumer products cater to simplistic "low" forms of pleasure, be it TV, junk food, booze, sex, etc.; it's no surprise that tons of games work on this same level.

In general, it should be no surprise that 90% of games suck. It's the same way with TV and movies and books. The ideal is just that there's a small portion of games that are more interesting and appeal to a more refined consumer. To some extent, those games already exist.

What Jon is really pining for is "games" that aren't actually games, in the sense that you don't play them and you don't necessarily win, they're just interactive experiences. When people read a book or watch a movie it's not necessarily to have fun, it's to experience something different, and in theory "games" could be the same way.

The thing that I think Jon gets wrong is the idea that game designers are not trying to get outside of the box. (sure, some of them just suck and are trying to reproduce Doom, but those guys are not the innovators). Every really good game designer I've ever met really really wants to do different interesting things. And in fact, I'd say that 50%+ of games start out development with more interesting experimental mechanics driving them. But, during dev, things start going wrong. Really novel free form mechanics are hard to control and lead the player to getting stuck in unplayable situations, or ruining the game world. They're really hard to balance so you can't create a progression that works. Often they just aren't fun. You play test them and people don't get it, or get it and just don't enjoy it. So, the new game modes get stripped out or toned down into simple controllable mechanics that work in the tried and true forms. For the most part this is still within the "game" paradigm and is for the purpose of giving the player fun and challenges.

Making interactive "art" which provides an interesting experience and is also playable (in the sense that you actually want to spend more than 10 minutes doing it) is really hard. Pretty much all the novel interactive experiences I've ever seen are just not a piece of software that you would want to choose to spend your time playing with it.

In practice, in terms of making a game that's interesting for adults and people who don't like typical games, the most important things are easy install and quick loads, compatibility with all machines, great art and content and dialog and characters, not too much frustration and repetition, very forgiving mechanics for people who screw up or don't get it, never getting stuck for long periods, never having big long boring sections, a steady supply of new pleasing content and experiences, a good progression of difficulty that ramps up and keeps the challenge moderate, a good variety of play styles or movement styles to break up the monotony, etc. etc. Stuff like that.


12-03-07

I've been trying to find a new external HD enclosure for a 1 TB drive. I'm super excited by the WD 1TB GP Caviar because it runs cool and quiet which means I can just stick it in an aluminum enclosure with no fan and it won't ruin my rather quiet setup. These days there are a ton of very cheap USB 2.0 + eSATA enclosures which is pretty awesome except for two things : 1. I don't have eSATA, and 2. USB absolutely sucks balls.

I just tested my current drive with HD Tach. On Firewire : average read speed 34 MB/s , CPU utilization 4%. On USB : average read speed 27 MB/s , CPU utilization 28%. The slower speed is a bit annoying, but the ridiculous CPU utilization of USB is pretty much a no-go.

The sucky thing is that Firewire enclosures are several times more expensive. USB + eSATA enclosures go for as low as $25; Firewire + eSATA you're looking at more like $100. (BTW try finding the Vantec NST-360UFS-BK which is their only Firewire + eSATA device. I dare you.)

Also, eSATA is sort of blowing my mind. There's something I don't get. Basically they just took the normal internal SATA cabling you would use in your desktop, put some more rugged connectors on it, and ran the cable straight out of your PC to an external hard drive. Okay, that sounds awesome. So why in the fuck have we not been doing this all along with IDE and SCSI !?!?! Why have we suffered with these retarded USB and Firewire standards that are so much slower? Firewire 800 is the one that boggles the most. It came out pretty recently, it's quite expensive, it uses a new connector that doesn't physically fit Firewire 400 (though the protocol is backwards compatible via adapter cables), and yet it only doubled the speed ?

ps. yeah I know eSATA has a max cable length of 2 meters while Firewire can go to 100 meters or more. pps. yeah I guess we've had external SCSI for a long time, in fact I had external SCSI devices on my Amiga, and there's also this new "SAS" thing, but in practice for consumer-level PC stuff SCSI may as well not exist. The price disconnect these days for "server" stuff is becoming more and more retarded as stuff like hot-swappable RAID arrays have moved into the consumer space; you can get a fast hot swappable eSATA RAID array for around $200, or you can get an equivalent SCSI "server" device for $10,000. I imagine that most IT guys are still going with the latter.

One of the awesome applications of eSATA is that you can basically have a desktop hard drive that you carry around with you. For example if you have a work and home dev machine setup, you can have your normal working hard drive be a hot pullable eSATA drive, and you just carry it with you, rather than lugging a notebook or whatever.

I need to stop ranting about PC hardware because I literally know dick about it these days.

The winner enclosure at the moment looks like the Wiebetech Toughtech FS. They seem to be a Mac-oriented company which means there's a 10% surcharge on everything.

D'oh, wrong. The winner is getting an eSATA PCMCIA card and just getting an eSATA enclosure.


12-02-07

I think maybe I don't actually really like chocolate chip cookies. When I think of making them I get all excited, but I think that's just the Pavlovian response from childhood and remembrances of mommy and all that kind of nonsense. When I actually eat them the first few bites are good, but after that they're just too sweet and insipid and the sugar gives me a headache.

Anyway, Nigella Lawson's recipe is the bomb. She calls for 2 cups of flour. Instead I use 1.5 cups of flour + 1 cup of oatmeal. You barely even notice the oatmeal in the resulting cookie, it just gives it a little more substance. They're pretty amazing the first 5 minutes after they come out of the oven. BTW That's why I think "Specialty's" bakery is the best store-bought cookie I've ever had; the quality of their cookies is not the best in the world, but they're constantly making new ones so you can always get them fresh out of the oven with the chocolate chunks still all melted, which is such a trump factor over all the fancy pantsy bakeries that serve hours-old gourmet garbage.


12-02-07

"Border Café" aka "Café Transit" is a pretty good little movie. It made me think about how liberal Iran is in some ways (allowing a very honest and not entirely favorable portrayal of islamic customs and law), and how our government and media have done such a disservice to everyone by trying to make Iran seem more fundamentalist and oppressive than it is (certainly compared to our good buddies like Saudi Arabia).


12-02-07

Holy crap, it looks like I'm being identity-thefted. I just got my credit card statement and noticed two strange charges from something called CLKBANK that I didn't remember. Turns out that's "Clickbank" an online payment thing, and fortunately they have this deal where you can look up where the charge came from. The two charges were done at www.gov-records.com and www.People-Records.net , which are identity selling services ; the fact that the charges were done in my name would seem to indicate they were pulling records on me.

Now I'm faced with the absolute retarded insanity of id control in the US. I can't change my Social Security Number. I can't change my Driver's License number. I can't even change my bank account numbers, all I could do is close accounts and open new ones. I'd like to do something preventive but I don't seem to have many options.

The credit card that's compromised is a Chase Visa that's like 3 months old and I've hardly used it. I guess there are a million ways to steal credit card numbers but I thought it would be way more likely with a card I've had a long time. It's also ironic that I've just lately put my computer in total lockdown and scrambled all my passwords. Of course that only prevents electronic attacks, this feels like an old fashioned phone and paper attack.

It would be so easy to make credit cards very secure online. You just have to stop using credit card numbers. Instead you run a program on your local machine which generates a temp code that's only active for one charge or one day or whatever. That way the retailers and the various payments processors never get access to your number.
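Here's a toy sketch of that flow. The "crypto" is completely faked (it's just a counter) and all the names are made up; a real system would hand out unguessable, signed codes, but the point is just that the merchant and payment processors only ever see a temp code that dies after one charge.

#include <cstdio>
#include <map>
#include <string>

// Toy issuer : hands out single-use codes tied to a real account and
// kills each code after one charge. "Crypto" is faked with a counter.
struct Issuer
{
    std::map<std::string, std::string> activeCodes;   // temp code -> real account
    int counter = 0;

    // runs on your machine / the bank's site; the merchant never sees the real number
    std::string issueTempCode(const std::string & realAccount)
    {
        std::string code = "TMP-" + std::to_string(++counter);
        activeCodes[code] = realAccount;
        return code;
    }

    // the merchant submits only the temp code; it's dead after one use
    bool charge(const std::string & code, double amount)
    {
        auto it = activeCodes.find(code);
        if (it == activeCodes.end()) return false;     // unknown or already spent
        printf("charging $%.2f to account %s\n", amount, it->second.c_str());
        activeCodes.erase(it);
        return true;
    }
};

int main()
{
    Issuer bank;
    std::string code = bank.issueTempCode("4111-1111-1111-1111");
    printf("first charge ok  : %d\n", (int) bank.charge(code, 19.99));
    printf("second charge ok : %d\n", (int) bank.charge(code, 19.99));   // rejected, code already spent
    return 0;
}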

BTW yes I know I can put a fraud alert on my credit reports. This Call for Action group is pretty cool for helping consumers.


12-01-07

Short article on Kyphosis at Performance Menu.


11-30-07

Hard drives are the new floppy . For $40 you get a thing that fits in a 5.25" floppy bay on your desktop. It's got a door on the front and you just pull 3.5" hard disks in and out. No screwing or turning off your computer or anything, hard disks are literally like floppies. That's kind of rad. this one is even better (physically use a hard disk like a NES cartridge).


11-29-07

The first web server I interacted with was a VMS machine and it had this horrible ";1" ";2" etc. system for automatically keeping backups of everything. Among other things it sucked because I had a tiny disk space limit and the stupid backups would chew up my disk allocation. Now I wish Windoze had a decent auto backup thing. I should be able to set aside X% of my drive for backups, and set extensions that I want backed up, and they should automatically go in the backup dir in an LRU kind of way. Then any file you want you should be able to click and say "give me the backup". I could almost just write an app to do this using disk change notifications. So much better than having retarded .bak files scattered everywhere. Also the backups could be delta-compressed which means as long as you're only making small changes you could have tons of backups of any given file.
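Here's a rough sketch of the idea. To keep it simple it just polls with C++17's std::filesystem instead of hooking real disk change notifications, and there's no delta compression; it keeps versioned copies in a backup dir and evicts the oldest ones when a size budget is blown. All the paths, extensions and numbers are made up.

#include <cstdio>
#include <cstdint>
#include <string>
#include <set>
#include <map>
#include <chrono>
#include <thread>
#include <filesystem>

namespace fs = std::filesystem;

// all of these are made-up placeholders
const fs::path kWatchDir  = "C:/work";
const fs::path kBackupDir = "C:/backups";
const std::uintmax_t kBudget = 100ull * 1024 * 1024;               // 100 MB of backups
const std::set<std::string> kExtensions = { ".cpp", ".h", ".txt" };

// delete the least-recently-written backups until we're under budget
static void evictOldest()
{
    for (;;)
    {
        std::uintmax_t total = 0;
        fs::path oldest;
        fs::file_time_type oldestTime = fs::file_time_type::max();
        for (const auto & e : fs::directory_iterator(kBackupDir))
        {
            total += e.file_size();
            if (e.last_write_time() < oldestTime) { oldestTime = e.last_write_time(); oldest = e.path(); }
        }
        if (total <= kBudget || oldest.empty()) return;
        fs::remove(oldest);
    }
}

int main()
{
    fs::create_directories(kBackupDir);
    std::map<fs::path, fs::file_time_type> lastSeen;
    int version = 0;

    for (;;)
    {
        for (const auto & e : fs::recursive_directory_iterator(kWatchDir))
        {
            if (!e.is_regular_file()) continue;
            if (!kExtensions.count(e.path().extension().string())) continue;

            fs::file_time_type t = e.last_write_time();
            auto it = lastSeen.find(e.path());
            if (it != lastSeen.end() && it->second == t) continue;   // unchanged since last pass
            lastSeen[e.path()] = t;

            // VMS-style versioned name so old copies of the same file don't collide
            fs::path dest = kBackupDir / (e.path().filename().string() + ".v" + std::to_string(version++));
            fs::copy_file(e.path(), dest, fs::copy_options::overwrite_existing);
            printf("backed up %s\n", dest.string().c_str());
            evictOldest();
        }
        std::this_thread::sleep_for(std::chrono::seconds(10));
    }
}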


11-29-07

X and Y are vectors, (or a series of numbers). You want to do a regular linear best fit, Y = m * X + b. If we use the notation that <> is the average over the series, then :

m = ( < X * Y > - < Y > * < X > ) / ( < X * X > - < X > * < X > )

b = < Y > - m * < X >;

This is super standard but it's nice and concise which makes it a nice thing to gather. "m" is very almost the "correlation". If we use the formulas

sdev(X) = ( < X * X > - < X > * < X > )

rmse(X) = sqrt( sdev(X) )

then :

correlation = ( < X * Y > - < X > * < Y > ) / ( rmse(X) * rmse(Y) )

Note that if you put the variables in "unbiased form" by subtracting off the average and dividing by the rmse (making it have an average of zero and rmse of 1.0), then the correlation is just < X * Y > , which is the same as the "m" in the linear best fit for unbiased variables.
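For reference, here's the whole thing as a tiny piece of code; avg, avgProduct and the sample data are just made up for illustration.

#include <cstdio>
#include <cmath>
#include <vector>

// arithmetic mean of a series
static double avg(const std::vector<double> & v)
{
    double sum = 0.0;
    for (double x : v) sum += x;
    return sum / v.size();
}

// mean of the element-wise product of two equal-length series
static double avgProduct(const std::vector<double> & a, const std::vector<double> & b)
{
    double sum = 0.0;
    for (size_t i = 0; i < a.size(); i++) sum += a[i] * b[i];
    return sum / a.size();
}

int main()
{
    std::vector<double> X = { 1, 2, 3, 4, 5 };
    std::vector<double> Y = { 2.1, 3.9, 6.2, 8.0, 9.9 };

    double mx = avg(X), my = avg(Y);
    double covXY = avgProduct(X, Y) - mx * my;   // < X*Y > - <X>*<Y>
    double varX  = avgProduct(X, X) - mx * mx;   // < X*X > - <X>*<X>
    double varY  = avgProduct(Y, Y) - my * my;

    double m = covXY / varX;             // slope of the best fit Y = m*X + b
    double b = my - m * mx;              // intercept
    double correlation = covXY / ( std::sqrt(varX) * std::sqrt(varY) );

    printf("m = %f , b = %f , correlation = %f\n", m, b, correlation);
    return 0;
}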


11-29-07

Ryan's Blog is videogame related and pretty damn entertaining.

The best things on Yelp are the Sushi + Japanese restaurant reviews by Toro Eater and Nobu K . Toro is a great reviewer, very analytical and thorough; Nobu is not so accurate but he's a brilliant wild man poet.

Sometimes when going to the movies I think how insane the $10 movie ticket is. Really it's not, in fact it's pretty much just a normal inflation increase from the old $5 tickets when I was a kid, it corresponds to higher rents and costs of power and benefits for employees and so on. The thing that's changed is I can get Netflix for $20 which is just insanely cheap (or get torrents for $0, well not really zero of course cuz of the price of power and the internet bill, but pretty much zero).


11-29-07

Checker poked me and it made me upload some more junk to Flickr. Bastards took away my "pro" so all the nice high res versions are gone. Lame. Anyway it made me realize I don't have any photos at all of the good stuff in SF. I'm not trying to do a photo journal, my camera is awful and I'm lazy about carrying it, but all the amazing sights I've seen here - the alleys full of graffiti, all the great old houses, the sun "rising" over the Transamerica tower, the TV tower poking through a wall of fog, looking back at the city from the Golden Gate Bridge, etc. etc. - I don't have photos of any of that stuff.

We saw "Into the Wild" yesterday which kind of made me want to go off into the woods alone and die. I know, I bet there will be copycats, and that's lame, but I have always kind of wanted to do that. I've done my own little mini-tramps, but I'm not really cut out for it. Some part of me really wants to just get rid of everything I own and completely get out of society and become a hippie or a tramp or whatever. Actually it was "Man vs Wild" most recently that really made me think of getting out backpacking in the real wild. Before that I always imagined having a little VW Camper bus or something and wandering that way, but I like the idea of the physical challenge of getting by in the woods.

Anyway, I thought of a more realistic option. I think it would be really fun to rent a totally isolated cabin for a month or so. I mean isolated like, not in a community or anything, completely out in the woods, no power, no running water. Presumably there would be a well and propane or maybe even just wood for heat and cooking. I don't think I'd want to live like that long term, but it would be really fun to rent for a month and play old fashioned "house" for a while. I have no idea how to find something like that though. Not even sure if it would be legal, there are all these laws in the US about minimum functionality of rental properties.

We finally went to "Range" last night; it was something we kind of had to do since it's like 2 blocks from my house and it has a Michelin star. Mmmm it was very good but I don't think I would go back unless some friend really wanted to go. Basically it's normal American/French bistrot type food (braised pork loin, roast chicken, stuff like that), but it is executed really subtly and artfully - the same way that I try to cook at home. Anyway, I don't think it's really possible for basic French/American bistrot food to really impress me any more. It's just so easy for me to make that stuff at home, and there are a lot of advantages to doing it myself. I get all the fun of the cooking process when I do it, and I can drink a whole bottle of wine for less than the price of one glass.

I used to really enjoy dressing up and going to fancy restaurants and acting all sophisticated. It was a chance to prove to girls that I was rich and cultured and could have good manners despite my usual impolity. That entire element is gone for me now. I just feel kind of goofy and out of place, and the way everyone acts to each other and the interaction with the waiters and everything just seems so bizarre. I keep getting the impulse to chuck my plate at a wall and take off my clothes and go running around between the tables hitting everyone on the head.


11-28-07

I realized Creme Anglaise is just like not-frozen ice cream (proper ice cream made from cooked egg yolks, not the frozen cream stuff you get in America; those really should go by two different names, frozen custard vs. frozen cream) - duh, I guess that's obvious but I never really knew what this creamy junk on my restaurant dessert plate was. I made some for the bread pudding the other day but now I'm out of bread pudding so I've just been drinking it in shot glasses. Yum.


11-27-07

So I've been writing this thing on fitness . I dunno, I'm not really happy with how it came out, but what are you gonna do? There it is.


11-27-07

On Ergonomics

I'm going to go over some of the basics which everyone should know, and also some thoughts that perhaps most of the ergonomics guys don't talk about because they aren't computer users, and some things I've learned through my shoulder injury.

Constant sitting and computer use is one of the most destructive things you can do to the body. Not only do you put the appendages in tightened positions, which can pinch nerves and cut off blood flow, you often place big pressure on the spine which can cause the vertebrae to shift, and the hours and hours of sitting with no activity cause the tendons to shorten and the muscles to atrophy. It's the atrophy of stabilizer muscles which may be the most harmful, because it means you cannot support yourself with your muscles and hold good posture, and instead you rely on your skeleton to support you, which leads to all the other injuries. It also means that any time you do something athletic you're not bearing the forces with your muscles, which leads to more injuries. People who sit do damage to their knees, hips, back, neck, shoulders, elbows and wrists !! (only your ankles are safe)

Let me start with the summary : it's good to know the basic "ergonomic" ways to sit (and good to have a bunch of different options), but that is only the first small part of the battle. The real solution is to get out of an inactive, sitting, hunched-forward, atrophied-muscle life. You need to do exercises to correct the bad habits and posture of computer users. You need to sit "actively" using your muscles and moving around. You need to change positions constantly, take breaks, stretch and rest. If all you do is sit and use computers, your body will be wrecked regardless of how well you sit.

First some review of the standard advice. Everybody by now should know the "90 degree position". Feet are on the ground, knees bent 90 degrees, sitting on your "sit bones", hips at 90 degrees, neck straight up, shoulders back, humerus straight down, forearms level. Okay, this is the "90 degree position" which is commonly advocated, but it's only sort of okay.

Basic sitting style : You want to sit on your "sit bones", not your butt. You can feel the two hard bones around the base of your butt where it meets your legs. To sit up on them, lean forward slightly and engage the ab muscles to hold your body erect. It may help to imagine that someone is holding your skull and pulling you upward by the head. Puff out your chest, engage the ab and back muscles slightly in neutral position. This is easiest to practice on a firm bench like a piano bench. Your spine should be in a slight "S", going inwards in the low back and outwards in the upper back. Okay, now you know this. You actually want a chair with a back when sitting so that it helps you keep this posture, but you need to imagine that you're sitting up with your muscles, and the chair back should provide uniform pressure across your whole back to just help you. Anyway, this is also bad.

The problem with both of these is that it's just too hard to hold them for any length of time. If you're going to be coding 8+ hours a day, you will not be able to hold these positions with your muscles and you will begin to let your weight rest on your bones and cartilage instead. These positions are very very hard on the body if not supported by muscle. BTW the very best thing you can do for your postural health is to get stronger muscles; you need very strong abs & back and shoulders to be able to sit all day. Ironic, I know.

The head should be up and "back" and not looking down. I say "back" in quotes because really it should be a neutral position, but just about everyone has it forward, so you need to push it back from where you've been holding it. The bones of the neck are in a neutral position roughly straight up when you're sitting right or perhaps very slightly angled forward. Your monitor needs to be high enough that you can look pretty much straight ahead. The ideal spot is roughly where if you look straight ahead that's about 1/4 of the way down the screen (pretty much no monitor is tall enough on its own, you have to put something under it). While I'm talking about monitors - the common dual screen setup that coders use is very bad. You should not be turning your head to either side for any significant length of time. If you need to look to the side, you should turn your whole body. It's much better to have one large monitor than two. If you do have one large monitor, make sure your windows are centered, not left-justified, as that would cause you to be looking slightly left all the time. Bad neck position squeezes the discs of the spine around the neck and shoulders, which can impinge the nerves coming out of the spine and going to the shoulders and arms. This can cause weakness, muscle spasm, numbness, constant muscle tightness, and pain. Once you get vertebra damage, it's basically impossible to fix by any means. Seriously, don't get it. Your head should be far enough from the monitor that you don't have to look very far in any direction to see the whole thing.

Shoulders should be back and down. Again, this is just the neutral position, but so many computer users are hunched forward that you really need to focus on getting the shoulders back. It's basically impossible to have them back too far, so go ahead and hold them back as much as possible. You should be retracting using the scapula muscles in the mid back (rhomboids), not hunching the shoulders up with the trapezius. The same goes roughly with getting them down - it's pretty impossible to hold them down too far for any length of time, and lots of people have them constantly hunched up, so just try to keep the shoulders down as much as possible. Note that arm rests on chairs usually get in the way of this, so you want a chair with no arm rests or removable arm rests which you can take off. When reaching for anything - the keyboard, the mouse, etc. - the shoulders need to stay back, don't reach out with the shoulder. Basically this means that anything you reach for regularly should be within forearm distance of your torso. Elbows should be close to your ribs at all times. One way to be aware of the right position is to pay attention to how your scapulae feel against your chair-back. You should feel the flat surfaces of your scapulae flush against the chair back (when you lean back) - not the points of your scapulae sticking into the chair.

Forearms should be roughly level, and wrists should be level or slightly down, and relaxed. In particular, the arm should be supported from the shoulder, not by resting the weight of the arm on the hand. Many people use these wrist pad things. Those are certainly better than resting your wrist on a hard surface, but they encourage a very bad habit of resting the arm weight on the pads. Split keyboards are nice, but the MS ones are awfully thick, which means your desk surface needs to be really low to avoid having the key surface too high. Usually this means that the desk needs to be as low as possible such that you can still get your legs under it. You should not be able to cross your legs under your desk.

The head-forward, shoulders-hunched posture of the typical computer user is called "kyphosis", which is a forward rounding of the upper spine. It's bad for the vertebrae as well as the function of the scapula, the shoulder muscles, and the load bearing function of the core. These dysfunctions make simple activities like holding a weight over your head very dangerous. One way to feel if you're in danger is to run your hand over the back of your neck. You can feel the vertebrae. Feel near the level of the shoulders, the vertebra here is C7 and it will be a pronounced protuberance if you've had your neck too far forward for a long time. (BTW another contributor to kyphosis is the modern obsession with pecs & abs; even people who do work out will often overtrain the front of the body leading to constant hunching forward).

Typical computer users are in a severe state of muscle atrophy. It may feel like a real strain just to sit up without back support, such as when sitting on a physio ball. Similarly it may feel very difficult on the upper back to hold the shoulders back. Neither of these should be difficult for someone with even a basic level of body function. An immediate course of physical therapy and stretching is warranted to correct these problems. I'm not going to go into a ton of detail right here about the best exercises and stretches, but they can be done as often as every day, and a full course would take about an hour a day. Exercises should start pretty light and involve isometric holds with each repetition, using high-rep sets, something like a 3x10 pattern. Once some function and stability and posture is achieved the exercises can be done in the more typical hypertrophy range of higher intensity.

On equipment : buyers should procure chairs that are highly adjustable. If you can find a chair that perfectly fits your body in a 90-degree sitting position, that's fine, but for office managers you need chairs that can be adjusted to any employee. That means height adjustment, removable arm rests, back tilt (back tilt should not tilt the seat), and adjustable lumbar support (preferably in depth as well as position). Desks should also be height adjustable. The height of the mousing surface needs to be about 1 inch above the user's waist in a 90 degree sitting position. Note that this also requires that the desk top should be very thin, and there should be no support bars under the desk top where the user's knees will go. Height adjustable keyboard and mouse trays are one option, but in that case the desk top needs to be very high, and most cheap trays are really flimsy and horrible to use. Height adjustable thin-top desks are quite cheap and all office managers should procure them. The exact keyboard and mouse that a user wants is not really a big deal, they can use what they like. What is important is that they can put their hands in position on those devices while keeping the back and shoulders neutral. That can be hard to arrange, though it's easier with a track ball or a chair-mounted mouse tray.

Okay, this is a good start, but this will still wreck your body. For one thing, as mentioned before, it's just too hard to hold this position for a long time. But even aside from that it's bad. Your hips are not made to be bent 90 degrees like that for long periods, they need to be straight. The combination of bent hips and bent knees leads to severe hamstring shortening and hip tightness which is very bad and dangerous for athletic performance. A similar thing happens in the shoulders - having them down and immobile all the time leads to atrophy of the shoulder girdle. Something that most people aren't aware of is that the shoulder is not a deep ball and socket joint like the hip. Rather, the head of the humerus is simply held against the shallow glenoid socket by muscles and tendons. It's like it's just strapped on there with soft tissue, and when that soft tissue atrophies, you're at increased risk of dislocation, as well as soft tissue injuries like rotator cuff tears, "slap" injuries and separations. Shortened and immobile joints also lead to nerve shortening and loss of blood flow. This "90 degree sitting position" that we've advocated is almost a fetal position with all your joints curled up and shortened and it's just horrible for you.

So, what's better? Well, not much. One common alternative that's advocated is a "kneeling chair" (sometimes called an "ergonomic chair"). These things provide pretty much no benefit, but could be used as part of position cycling (see later). Another device I have used is sitting on a "physio ball" (big blow up balls). Make sure you get a ball the right size and blow it up so you can be in a 90 degree position. This is a useful training tool to help your sitting posture, because it engages the muscles and makes you aware of posture, but you should not sit on it for more than 30 minutes or so at a time as it's very fatiguing. A simple bench or stool at the right height can serve the same purpose.

The real best thing you can do is two part : position cycling and taking breaks. You need to take a 5 minute break at least once an hour. I know this is really hard to do, but there's no substitute for it. The break should involve some simple stretching and active mobility work. Position cycling means not sitting in the same way for long, ideally using as many different positions as possible. One option would be to change positions every hour when you take your break. A better option is to just be changing positions and stretching constantly. Any time you start up a program and it's taking a second, stand up! When you compile, stretch your arms out to your sides, then up over your head. When resting or waiting for something, don't just sit there - move around. There are various free & not free programs to help force you to take a break. These can be very useful to get you in the habit because most of us won't take enough breaks if left to our own devices.

Basically you need to stop resting on your skeleton and ligaments all the time, and start using your muscles. But you don't just want to lock up your muscles and try to hold the "90 degree" position. You want to stay as relaxed and mobile as possible. You want to keep the body moving in natural ways and stretch and let the muscles move around and contract and relax. There's also really no substitute for getting plenty of exercise outside of work. If all you do is sit at a desk and then sit at home your body is going to be wrecked no matter how "well" you sit.

BTW I haven't mentioned the most important thing, which is using a computer less, because I presume it's basically not possible. One thing you should work on is getting away from the computer when you don't need to be at it.

As for position cycling, some of the useful positions : 1. regular 90 degree sitting, 2. sitting on a ball or a pogo-stick chair where you're "actively sitting", 3. reclining in a normal desk chair; this is actually a very good position, but you have to be careful. Recline from the hips with a straight back, not a slouch in the low back. Make sure you can still reach your keyboard and mouse near your lap, not reaching up or straining the shoulders. You may also need to be able to elevate the monitor to make it high enough that your neck can be neutral, not tilted forward relative to your torso. 4. standing up. Standing up is one of the very best work positions you can have. You will need to elevate your keyboard and monitor a lot, so you probably need a "sit to stand" desk, which is one of the best pieces of ergonomic equipment you can get.

Having no desk surface at all and having a wireless keyboard in your lap is an interesting option. The standard keyboard with numpad presents a lot of problems for mouse placement. Putting the mouse off the right side makes it too big of a reach. If you have a corner desk, the mouse can be in front of the numpad. Alternatively the mouse could be on a tray or on the chair, or it could be a trackball.

Frequent use of laptops is just horrifically bad. They do just about everything wrong to your body and really actively promote the hunched kyphotic posture. It's highly discouraged.

Let me sum up and emphasize that the solution is not any particular "ergonomic position" or any piece of equipment you can buy. It's a lifestyle. It's a mentality of listening to your body and putting your body before your work. It's about being mentally aware of how your body feels and keeping your "mind in your muscles" - feeling your abs and scapular adductors holding you erect, not just resting on your frame. It's about stretching and exercising and resting every single day. You need to start listening to your body. If you really listen, your body will tell you when you do bad things to it - it's just that you're so used to abusing it constantly that you automatically ignore it.

BTW if you want to do exercises that will be beneficial, some of the things you should focus on are strengthening the back, fighting kyphosis, strengthening the shoulders, in particular the posterior shoulder girdle such as the scapular retractors and the rotator cuff, hip mobility and hamstring stretching movements, and in general extension and pulling movements. Rowing is actually a superb all-around full body anti-computer-use movement which does most of these things.

It may be impossible for someone who's chronically heavily using computers to really fix their neuromuscular patterns. One suggestion that might help is the next time you take a week or two vacation, try to really exercise and stretch and treat your body well during that time (do lots of swimming and rowing and yoga and good active mobility and extension work). Now when you return to work be aware of how healthy your body feels. When you sit down, keep that feeling. If the work starts to make that feeling go away - fix your work pattern.

Another addendum : if your workstation is not set up well, it doesn't matter how much good work you do away from work. It's valuable to know body-friendly positioning and desk setup, even though that is not the "solution". Basically sitting at your desk is wrecking you, and movement and strength is restoring you. If your desk has too much wrecking power, you can't beat it. You want your workstation set up to be as non-damaging as possible. It will *always* be damaging, no matter what kind of active sitting you do, but it's important to minimize how bad it is, as well as minimizing your time spent sitting.

"proprioception"


11-27-07

Fucking Christmas presents is going to be a nightmare hanging over my head for the next month. Somebody hit me with a car so I can get crippled and have a good excuse to just disappear for the next 40 days.


11-27-07

Poker notes

I want to write some poker thoughts before I forget them because I've hardly been playing at all in the past 4 months and I'm losing my edge. These are generally overall play frequency and style notes. This is for someone who can already play solid 2+2 TAG poker. If I'm reading this to get back up to speed I need to take it slow and just play good basic poker first; read hands and make the right play and don't force things.

The more randomly an opponent plays, the more you must take risks against them. This goes for both good and bad opponents and doesn't necessarily affect EV, but it does affect variance. For example, against a really horrible player, you might not really know what they have, you can't read them because they have no idea what they're doing - you just need to go with decent hands that you might not normally play, stuff like top pair for big pots. Similarly for good players with well randomized ranges who can be making a lot of bluffs - you need to repop them a lot and accept the variance or you will get beaten up.

Any time you would never do a certain move with a certain hand, that's a leak and there's a way to exploit it. It might be a tiny leak that's very hard or rare to exploit, but it's still a leak. You can identify these in other people who play standard style. For example, most people will only check-raise the flop with very big hands or bluffs (often with draws, which is kind of a bad play), they never do it with decent made hands like top pair. That's a leak and if you know they have that pattern you can use it.

Good technical play is almost impossible to beat. "Technical" play is about getting your frequencies and ranges right. One technical issue I ran into at higher levels is cbetting too much. At lower levels you can almost cbet 100% of the time. At higher levels you need to check more, and then sometimes check-fold and sometimes check-raise. Also when you do decide to cbet, then you need to have good frequencies on the turn. Again on the turn you want to be value betting some percent, second barrel bluffing sometimes, sometimes check-fold, sometimes check-call to catch bluffs and sometimes check-shove. All those options should have a reasonably balanced frequency. In theory you want to keep balancing ranges on the river, but it's harder to do on each street and I never really got a good balanced frequency of river actions.

Playing too nitty in small pots is a very very very small leak. When in doubt, fold early. Playing too nitty in big pots is bad. In general I want to just give up on the tiny pots but I want to win the big pots. Similarly an opponent who folds too much in small pots is indeed slightly exploitable, but only barely, and you need to be careful not to give up your EV against them when you choose not to fold.

Somebody who plays a lot of hands aggressively from position is very hard to deal with. You may think they're often playing junk and it's a leak and you can exploit it by playing tighter. That is true, but don't kid yourself - their leak is very small, and if you try too hard to get them you can easily spew. You will also have to accept a lot of variance to get after them, reraising a lot preflop and check-raising flops.

Any time somebody's ranges aren't balanced across streets or actions, that's a leak. For example, say you open a lot of hands preflop - that means you need to be willing to bet and bluff with a lot of hands postflop. If not, opponents can easily take you off the hand postflop. A lot of people have this un-spread aggression - they're very active preflop and on the flop, but then get scared on the turn and river, and in particular very rarely make big river bluffs. If you try to bluff raise these guys on the flop they will call, but if you wait to the river, they fold. To be unexploitable, you need to have a balanced activity level on every street. One example that's come up a lot recently is preflop 3-betting - if you are 3-betting a ton preflop, you also need to potentially call a lot of 4-bets or shoves. If you 3-bet a wide range, and then call 4-bets with a tight range, that's a leak.

You never want to make moves that you wouldn't make with very good hands. My goal when playing my tight/solid/aggressive game is to ALWAYS be making moves that I could make with a monster, or just fold. For example, say I just call a raise from the big blind, then I check-call the flop, I check-call the turn, I check the river. NO NO NO. I would never do that with a big hand, so I just won't do that ever. Instead, I will either check-raise the flop or just fold. (this is just an example). You also want to make a wider range of moves with your good hands sometimes, but you don't want to make certain types of plays (leading the betting) with good hands, and other plays (passive) with weak hands. I want my hand to never be defined, I want to always be representing a monster. Any time you do show weakness, it's intentional to induce a bluff or just fold. For example, say I raise preflop, I cbet the flop, now I just check the turn. I'm showing weakness on the turn. I do that on purpose because I'm just going to fold, or to call a bluff on the river.

Your bluffing and value betting should be balanced. Are you thinking of value betting top pair on the river? Do you ever triple barrel bluff? The more you bluff, the more you can value bet. If you rarely bluff, you shouldn't value bet so thin. If you're playing very nitty, as I sometimes do in wild games, then you need to stop thin value betting so much. On the other hand, if you are bluffing a lot, as I was doing in the high stakes games, then you can go ahead and value bet top-pair-no-kicker (especially if you hit top pair after the flop). For example, you raise AT in position, get called. Flop blanks, you cbet, get called. Turn is a Q, you decide to rep it and bet again, get called. Okay, now you're giving up and won't bet again, but the river is an A. Go ahead and value bet if you would ever bluff.

One of the ways you can make a lot of money is by having an image that is different from how you actually play. Any time your actual range doesn't match your perceived range, that is a value opportunity for you. For me this usually means that people think I steal and bluff way more than I really do. I play a lot of hands from the CO and Button when I can be the first raiser, but that doesn't really equal "looseness". I will bet and triple barrel when I think people are weak and my line is consistent with a monster, but I'm really not wildly bluffing. But people think you're wilder than you really are. That means they call too much and keep paying you off and that's how you make money. Once in a while you can find people who think you're nittier than you really are - or even that you wouldn't bluff in a certain situation - they think your range is very tight, which means you can profit by opening your range and bluffing more. For example playing against someone who really respects your game you have lots of bluff opportunities, eg. if you are nearly all in and push for $400 into a $1200 pot, they will think you can't be bluffing and will fold a lot of hands.

Related to that, any time your opponents peg you as a certain style, such as the "typical 2+2 solid TAG style", you can make money by slightly deviating. For example, if the flop is drawy and you bet-3bet shove, they will put you on a combo draw. You can use that knowledge of how they think you play.

If someone is really bad and lets you, you can maximize EV by waiting for later streets. Think of it this way - say your opponent turns his cards face up and you see he has a draw. Why bet the flop when he still has a good chance to improve? Just wait for the river, until you know you are winning, and then let him bluff. Generally, preflop your edges are very small (eg. AT vs KJ, or 88 vs QJ). On the flop, people can easily still have 30-40% equity with bad hands; on the river, if you are ahead, you have 100% equity. You can only do this against people who are very bad and let you, but if they do let you then you should go ahead and do it, because it greatly reduces variance. In general you want to put your money in at the point where it earns the biggest edge.
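
To see why the same bet is worth more later, here's a quick sketch (my own numbers, purely illustrative) of the expected profit on one bet of size B that gets called, as a function of your equity in the resulting 2B :

    #include <stdio.h>

    // you bet B, get called, and hold "equity" share of the 2B that went in :
    // EV = equity*2B - B = B*(2*equity - 1)
    double ev_of_called_bet(double bet, double equity)
    {
        return bet * (2.0 * equity - 1.0);
    }

    int main()
    {
        double bet = 100.0;
        printf("preflop ~55%% equity : %+.0f\n", ev_of_called_bet(bet, 0.55)); // +10
        printf("flop    ~65%% equity : %+.0f\n", ev_of_called_bet(bet, 0.65)); // +30
        printf("river   100%% equity : %+.0f\n", ev_of_called_bet(bet, 1.00)); // +100
        return 0;
    }

Same $100, wildly different profit, which is why letting a bad player chase and then charging him on the river beats racing with him early.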

Design your play to make them define their hand, even if that means losing the pot. eg. if you bet and they raise, but they would only raise with hands that beat you - that's a great outcome, because now you can fold and lose the minimum. Generally this is done by playing aggressive, especially against people who will only raise with the goods. You bet bet bet and apply pressure, and they only continue with good hands, so you know exactly what they have.

People tend to chase way too much preflop and on the flop; they just love to see more cards, which makes those streets bad spots for bluffing. You want to bluff when people can't call, which means bluffing the river and bluff-raising. Dry (drawless) flops are the best to bluff; when you bluff wet flops, people will put you on the draw if they have a made hand, or they might well have the draw themselves and shove it.

Don't make big bluffs that win small pots, make small bluffs that are likely to win big pots. For example, if someone 3bets preflop and you shove, you're risking 100BB to win like 24BB, that's retarded. In some cases you can make very small bluffs into big pots, and that's awesome because they don't have to work very often to be +EV. Part of why this works is that people are so retarded about pot odds. One thing people don't do correctly is count the bets already put in as part of the pot. For example, the pot is $50, somebody bets $30, someone calls, now you raise to $100. That's not a huge raise, cuz there was $110 in the pot by the time it got to you, but people think of it as a $100 raise into a $30 bet and they fold; in fact your raise is less than pot-sized, which makes it a cheap bluff. Another awesome situation is when someone is almost all in: they will often fold because they don't want to reload. In some cases you can bluff the river for like $20 into a $100 pot and they will fold stuff like bottom pair or ace high if calling would put them all in.
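
The arithmetic behind "small bluffs into big pots", in a minimal sketch (the helper function is mine) : a pure bluff breaks even when the fold frequency equals risk / (risk + what's already out there to win).

    #include <stdio.h>

    // fraction of the time a pure bluff must get a fold to break even
    double breakeven_fold_pct(double risk, double pot)
    {
        return risk / (risk + pot);
    }

    int main()
    {
        // the $100 raise into the $50 + $30 + $30 = $110 pot from the example above
        printf("raise to 100 into 110 : %.0f%%\n", 100.0 * breakeven_fold_pct(100.0, 110.0)); // ~48%
        // the $20 river bluff into a $100 pot
        printf("bet 20 into 100       : %.0f%%\n", 100.0 * breakeven_fold_pct(20.0, 100.0));  // ~17%
        return 0;
    }

The $20 bluff only needs to work about one time in six, which is why it's so good against the guy who folds anything marginal rather than go all in.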

Part of the awesomeness of being aggressive early is that you are always threatening to build a big pot, and it lets you make big river bets. The bigger the bets are, the more you profit. Maybe you only have a 1% edge in each pot; if you play small pots you never make any money, but if you are always jamming it up, you play big pots and then get to either make a big bet on the river or check-call a big bluff - you're taking the same edge but on a bigger bet, hence more profit. If you're 2nd barreling and even 3rd barreling a lot, you can value bet thinner and start winning some really big pots with only decent hands. Of course you know this with the classic semibluff hands like a flush draw, but those are also more transparent to your opponents. A hand like AK overs is also an awesome hand, because if you spike an A or K on the river you can value bet it and take a big profit.
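
The "same small edge, bigger pots" point is just multiplication; a grossly simplified sketch with made-up numbers :

    #include <stdio.h>

    int main()
    {
        double edge  = 0.01;   // say a 1% edge on the money that goes in
        int    hands = 10000;

        double small_pot = 20.0;    // little limped pots
        double big_pot   = 400.0;   // pots you've inflated by barreling

        printf("small pots : $%.0f\n", edge * small_pot * hands);  // $2000
        printf("big pots   : $%.0f\n", edge * big_pot   * hands);  // $40000
        return 0;
    }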

Against better players you need to jam more when you have an edge or to build pots because they don't pay off as much once they're beat. Against bad players you don't need to do that. With draws against bad players you can just take cheap cards and try to hit. Against good players you need to keep your ranges balanced and always be playing like you have a monster.

If you have good equity but don't know your spot - just jam. This is something I really like but don't see discussed much. It applies mainly against good players, or bad players who are hard to read. When you know you have very good total equity but you don't know whether you're ahead or drawing, go ahead and jam now. If you know you are ahead, or know you are drawing, you can make finer decisions - maybe jam, maybe just call. Getting all in is protection against not knowing your situation on later streets. It's also protection against the disadvantage of being out of position on later streets, so if you have good equity and are out of position, you want to jam as soon as possible in the hand. In particular I'm talking about spots where your hand might be best or you might be drawing. One example is an ace high flush draw: there's a good chance your ace high is the best hand, since he might have a worse flush draw or just random whatever. If you can't read his action well, just try to get all in. Another is a weak pair plus a draw - maybe you have 88 and the board is 679, so you have a pair plus a straight draw. The wilder your opponent is, especially if you're OOP, the more you want to just jam it in now in these spots, because you don't know whether you're drawing or not.
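
For the 88-on-679 example, a rough back-of-envelope sketch using the standard rule-of-4 estimate (assuming all the outs are clean, which they won't always be - this is just an illustration) :

    #include <stdio.h>

    int main()
    {
        // 88 on a 679 flop : any 5 or T makes a straight, plus two 8s make a set
        int straight_outs = 8;   // four 5s + four Ts
        int set_outs      = 2;   // the two remaining 8s
        int outs = straight_outs + set_outs;

        // rule of 4 : with two cards to come, equity is roughly 4% per out
        double approx_equity = 4.0 * outs;
        printf("~%d outs, roughly %.0f%% to improve by the river\n", outs, approx_equity);
        // and the pair of 8s may already be best - that's exactly the
        // "might be ahead, might be drawing" spot where jamming is attractive
        return 0;
    }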

In terms of playing profitably at low levels, none of these things are as important as tilt control, focus, game selection, etc. You really need to just stay basic and play solid. "Solid" doesn't necessarily mean nitty/weak, though; it just means sticking to basic +EV decisions, mainly playing for value, because people call and bluff way too much and don't fold enough. That leak is far more important than any other.

One of the hardest things for me in practice is getting into the right mental state. You need to be active and engaged and always going after +EV spots - but not too active, not bored, not pushing; you still have to be patient, see it as a long grind, and wait for your spots - but don't let yourself go into a trance and start playing by "rules" either. It's obvious when you're frustrated and bored and just pushing too much. One of the lazy things you do when you're grinding and sort of turning off your brain is you start thinking about hands in only one way. eg. I have a flush draw, I'll see if I hit, okay I missed, I give up. When you're playing right, you reevaluate based on each new card and each action. Really grinding correctly is exhausting. The best way for me was to play 1-2 hours, then take a break for 1-2 hours, then play another 1-2, etc. Ideally the break is exercise, as that really freshens the brain.


Charles Bloom [cb][at][cbloom][dot][com]
Send Me Email

Back to the Index
