A software development war story follows. It will bore you to tears if you are not interested in software development. On second thought, it will probably bore you to tears if you ARE interested in software development. Because it truly IS boring. But it was painful, so I purge here now.
We finally went live with our web-based legal case management system for Kentucky Department of Public Advocacy and, as is typical of any software deployment, we encountered problems. A nasty search performance issue cropped up. Searching for trial cases in the app took over two minutes on average. Which is really terrible when you consider how little time it takes for Google to search the entire fracking internet !
So, while troubleshooting for many hours, we extracted the SQL and ran it by hand. Was there an issue with the database itself, or the indexing, or the query SQL?
Nope. The results returned from the database instantaneously.
Weird. We thought, “OK, is there maybe some weird connection pooling problem?”
Nope. Pools were working fine.
“Funky garbage collection janitor thread in the Tapestry web components?”
Nope. The thread was well-behaved.
“Slight differences in the syntax of the SQL generated by different JDBC query modes?”
Nope. Trying different modes resulted in the same fast queries.
We spent about three long days poking at this really obnoxious problem. Finally, through obsessive Googling, I found some obscure forum post explaining a peculiarity of SQL Server’s Unicode-based parameterization of SQL. Turns out that under certain situations, Microsoft SQL Server 2000 is really inefficient when sending Unicode-encoded parameters to queries. So inefficient that it turned our blistering fast search queries into depressing mush.
So, we’d have a database query like this: select [blah] from [table] where [column] = @P
where @P is a Unicode-compatible parameter (in SQL Server, it is of type nvarchar). Then, I added a setting to turn that Unicode stuff off. The result is that in the SQL, the parameter type of nvarchar is replaced with varchar, and the searches went from taking 2 minutes to taking a second or two.
nvarchar vs varchar.
One character. THREE DAYS of lost time. The needle in the haystack almost kicked our asses.