Tuesday, February 07, 2006

pair programming, sponsored links, and fuzzy matching

I got my ass kicked by the flu last week. Last Tuesday morning I felt more tired than normal rolling out of bed, and it got progressively worse throughout the day. I left work about an hour early, but thanks to the prompt and speedy service of Muni, I arrived home at the same time as usual. I waited about 45 minutes in the cold for a bus to come and barely managed to squeeze on. I wish it were hyperbole, but I was literally standing in the stairwell of the bus next to a cell-phone-yapping fool. When I got home I downed some Tylenol, and from then until about Friday morning I was in a haze. I promptly caught a cold after the flu, and it wasn't until yesterday that I felt like my normal self.

My first project coming back was helping my coworker pull some data to give to a client, and that was actually a nice one since it was relatively easy and got me back in the flow of coding. We've been trying to do more pair programming and I'm all for it, since more eyes means fewer errors. Retrieving multiple datasets that aren't stored consistently isn't exactly the most exciting project either, so having someone to discuss different strategies with and keep the focus usually speeds things up. I've pair programmed before and found it to be very fast when both programmers are at similar skill levels. The flip side is that it takes twice as long when you're paired with a rookie, but that's also a great way to share knowledge, provided the rookie isn't overwhelmed à la Michael Jordan and Kwame Brown.

That took up most of my morning. For the rest of the day I programmed solo and implemented a sponsored link section on our site for the ad team. After I finished that, I worked on fuzzy matching private schools that our staff had manually entered into our database against a set of data provided by the government. Initially I was going to use MySQL's built-in full-text queries on the names and addresses, but the queries required can get a bit tricky and I wasn't ready to implement them in Hibernate's query language.

Instead, my algorithm was to go through each school in the government's file and grab all the schools in our database that are located in the same city. I computed the Levenshtein distances on the school name and also the street address, and that gave me some really good matches for distances of 10-20.
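
Here is a minimal sketch of that approach in Python. It is not the actual code from the project: the field names (`name`, `address`, `city`), the sample schools, and the `best_match` helper are made up for illustration, and adding the name and address distances into a single score with a `max_distance` cutoff of 20 is an assumption about how the two distances might be combined.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def best_match(gov_school, our_schools, max_distance=20):
    """Return (distance, school) for the closest same-city school, or None.

    Assumption: the score is name distance plus address distance.
    """
    best = None
    for school in our_schools:
        # Only compare against schools located in the same city.
        if school["city"].lower() != gov_school["city"].lower():
            continue
        distance = (
            levenshtein(gov_school["name"].lower(), school["name"].lower())
            + levenshtein(gov_school["address"].lower(), school["address"].lower())
        )
        if distance <= max_distance and (best is None or distance < best[0]):
            best = (distance, school)
    return best


# Hypothetical sample data, not real records.
gov = {"name": "St. Mary's School", "address": "123 Main Street", "city": "San Francisco"}
ours = [
    {"name": "Saint Marys School", "address": "123 Main St", "city": "San Francisco"},
    {"name": "Bay Area Academy", "address": "900 Ocean Ave", "city": "San Francisco"},
    {"name": "St. Mary's School", "address": "123 Main Street", "city": "Oakland"},
]

match = best_match(gov, ours)
if match is not None:
    print(match[0], match[1]["name"])
```

Restricting candidates to the same city keeps the number of comparisons small and rules out identically named schools elsewhere, like the Oakland entry in the sample data.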
