Mixing the Two Main Debugging Styles

As a followup to my recent post on debugging, when confronted with an extremely difficult bug, I find there are essentially two mental processes you mix:

Deductive: Spend enough time to fully work through all of the code paths to try and find an intuitive solution to the problem. In some cases, you can eliminate the debugging step entirely, particularly if you sleep on it and take the time to unleash your subconscious problem solving abilities. How awesome when you wake up or get out of the shower with the fix ready to apply?

Inductive: Figure out how to get as much context for the problem as quickly as possible in the form of either log files or a targeted debugging session before spending too much time on difficult thought processes. Try to break the debugger in the affected code path when it is in as close a state as possible to the problematic code path and step around. You may trip across the bug or at least guide your deductive thinking with more information.

Back in college, my professors taught us to focus on the deductive process. You should understand your problem domain, ensure it is well designed from top to bottom. Avoid relying on the narrow insights you gain by observation to guide code changes. Leaping into the debugger may help you quickly patch the problem but lead to an overall less stable system. Maybe the fix you find in the debugger only addresses one case of many or a category of similar bugs could be fixed at a higher level with a different change.

But to counter that wisdom, it’s often much faster to understand the problem with a more targeted approach. Time spent visualizing the code paths in your head may be more productive with a debug session in front of you so you can more quickly trace code paths, and avoid drawing out data structures.

In debugging, either approach may waste time. For the inductive process, you can waste time writing unnecessary logging code or throwaway code to catch a complex code condition with a breakpoint, or stepping through a morass of code and learning nothing. Your deductive approach may lead you to waste time relearning a piece of code at a level of detail you’ll quickly forget again when the problem is something simple or unrelated to your initial assumptions. We programmers tend to find it hard to account for debugging time so learning how to debug more efficiently is key to making deadlines. So let me reveal the key to debugging productivity: use induction to focus deducation and deduction to focus induction. Make sense?

Some more related tips. When your system’s turnaround time between code change and test is low, your ability to use inductive processes increases, leading to less frustrating time on thinking which derails other creative thought, like the next new feature. But don’t let ideal conditions lead you to lazy thinking. You’d be a fool to eliminate deductive reasoning during debugging entirely. When your system’s turnaround time is high, use deductive thinking while waiting to plan each test to make most use of your next session.

I’ve found the best state is to let your intuition guide you and mix approaches at least enough to avoid wasting too much time on the other one. A debugging session to learn more or writing simple logging or conditional breakpoint code can be a mental break from too much deductive thinking. Some deductive thinking can lead to inspiration when you are mentally taxed by stepping through too much code.

Now go make your stuff work so we have less crappy software to infuriate our lives 🙂

Debugging Hard Problems

The complexity of the solutions you can build is limited by the complexity you can debug. When code gets complex, debugging gets harder. More data is involved, setup is more difficult, problems harder to reproduce. Code paths are more complex. Data structures become unwieldy nested graphs. Stack traces many hundreds of frames deep. Multi-threaded timing problems, deadlocks, and intermittment errors. Do you yield at this point or do you dig in and find and fix the problems? If you walk away in fear, your system faces failure or at least may live as a buggy, hated thing people want to replace, instead of a solid, stable system that runs for years.

One of the reasons I’ve been successful in my career is that I’m good at debugging the hard problems. I’m not afraid to tackle a more complex design as I am confident I can solve the more complex problems that will arise. When working with teams, I can save a lot of time by helping others find those problems that can suck up days and weeks often in much less time.

I’ve wanted to write a post about debugging for a while but it is a dauntingly complex problem, worthy of a book. I learned most of my tricks by sitting over the shoulders of great programmers while they debug problems. I’ve learned over the years that there’s no substitute for taking an intuitive approach in the time savings involved. Even with a list of the many things you to find a problem, the secret is applying them in the right way at the right time.

Despite the difficulty, I took some time to write down some approaches I’ve found useful over the years:

Have the right attitude. You will find the bug, it’s only a matter of time. The more frequent a bug occurs, the easier it is to find because of all the data gathering opportunities. The longer between occurrences, the more time you have to prepare for the next occurrence so you can catch it.
Familiarize yourself with all of the processes, threads, data structures involved.
Even if you can’t easily reproduce the bug in the lab, use the debugger to understand the affected code. Judiciously stepping into or over functions based on your level of “need to know” about that code path. Examine live data and stack traces to augment your knowledge of the code paths.
When you can’t reproduce the bug, you may need to instrument the code with additional logging. Make sure the skeleton of major operations has adequate logging to understand what’s happening in the system. Investing some effort in improving the targeted quality, readability of these logs will go far in the overall lifecycle of a complex system. Too much logging swamps performance and hurts readability. But with time-stamps, user-ids, user-agent strings, session-id, basic operations, you learn a lot about the running system and why it might have failed for one particular user. Logging is crucial for multi-threaded interactions.
Generate theories as to what might be causing the problem and test those theories. Keep an open mind. Generate as many theories as possible before you start the longer process of testing those theories. You may decide to test more than one at the same time.
If you have no theories, you need to learn more about the system, particularly information relevant to the code paths causing the bug. Adding additional logging is a good way to do that when sporadic errors cannot be reproduced.
Familiarize yourself with all of the layers of the system, at least at an intuitive level, from the hardware on up. This will help you visualize what’s going on in in your mind’s eye so your intuition can help steer you towards the most likely source of the problem.
For certain types of complex code, I will write debugging code, which I put in temporarily just to isolate a specific code path where a simple breakpoint won’t do. I’ve found using the debugger’s conditional breakpoints is usually too slow when the code path you are testing is complicated. You may hit a specific method 1000’s of times before the one that causes the failure. The only way to stop in the right iteration is to add specific code to test for values of input parameters. Aways do a System.out.println or log some visible, unique consistent token. This makes it easy to find and remove these code snippets when debugging is complete. Once you stop at the interesting point, you can examine all of the relevant state and use that to understand more about the program.
Some people start out by drawing pictures, flow charts, entity-relationship diagrams of their data structures, and detailed state tables. I will do this only as a last resort or for documenting the project as it is time consuming. Examining a real instance of that data structures in the debugger is much faster, more accurate and more informative than any diagram. The JavaDoc or structured code browser are enough for me to understand the entities and relationships. I try to visualize data structures in my head and only resort to a drawings, or state tables when necessary. I think that over time, this has made me faster at visualizing and building systems.
If you get stuck, take a break. Sleep on it, approach things fresh the next day. You may not have enough information and may need more to get the next piece of the puzzle. Too much frustration impedes your motivation and ability to focus. For the best debugging approach, you need to research all relevant aspects of a system, simulate that all in your head, use your intuition to flush out ways things could be going wrong. Instead of trying to find the problem, perhaps you need to learn more about the failure. Under what conditions does it happen? What’s unique about those cases? How can you learn more about those unique code paths?
Do not spend too much time on minor bugs but do keep in mind the value of true reliability in a system. Your pride is not relevant. Your customers’ experience in using the software is all that matters.

If you become good at debugging complex problems, your confidence as a programmer will grow, letting you tackle bigger, more relevant problems. When things go wrong, you’ll be able to step up and make things right again.

Did I miss any of your favorite debugging tips? Continue the discussion in the comments!

Choosing Your Programming Language – The Inside Scoop

Many programmers prefer typeless, interpreted languages like PHP and Ruby for several reasons. They are more concise and easier to read and write for a novice. They tend to be interpreted languages, not compiled, which are simpler to use and typically offer faster round-trip time between making a change and seeing the result. They support a “google, cut and paste” type workflow more easily which, frankly is how many programmers operate these days.

And yet still strongly typed languages are more wide used, particularly as the complexity of the project and the number of the developers grows. I have discussed this issue with a number of colleagues and wanted to write down my thoughts. It’s important to choose the right language for the right job and today unfortunately, there’s no one size fits all answer so knowing the details may help. My opinions were formed by poking around into the guts of the JVM, Python, PHP, Ruby, and Flash interpreters, and from coding in Java, C, and C++ extensively.

Typeless versus Typed

One reason I believe typed languages are used is the robustness of the code itself. Typeless languages offer a single-point of failure with each code construct. If you misspell a variable name, you do not find out until runtime and only by debugging the problem or through code inspection. With a typed language, each misspelling is caught at compile time because every name must occur in the program at least twice, once for the type definition, once for the usage. This fact alone will often make up for the extra key strokes you need to use in a typed language.

With typed languags, more is known about the system during the code editing process. This makes the tooling opportunities richer and reduces keystrokes which can make it faster to write code in a typed language than an untyped one, even though the typed language is more verbose. For example, handling imports, completion of member or method names. The “find all usages” feature is extremely valuable at tracing code paths and doing refactoring. Typeless languages may offer such features but they are much less specific as they must do only name matching, not type+name matching. The ability to change a field or method name and reliably update all references is a big time saver when modifying a large existing project.

Another reason people prefer typed languages of course is runtime performance. But why exactly do typed languages run so much faster? The biggest reason is that they offer a much faster way to evaluate “a.b” expressions and do method lookups (a.b()) at runtime. With a dynamic language, every single indirection requires a hashtable or binary search which turns into dozens or 100s of instructions. With a typed language, a compiler can frequently generate an “a.b” with just a few instructions using a “load from fixed offset” pattern. That’s why a typeless language will run usually at least 10X slower than a typed language no matter how many engineers Facebook puts on the problem.

Some folks today are trying to infer types in typeless languages to improve runtime performance. In limited cases they could compile typeless code to use fixed offsets. That may well be an area of research which could improve the performance of some typeless code. I suspect though that the code which will speed up will need to be well organized around common types and so written a lot like a typed language.

It also perhaps poorly understood that even typed languages do not always realize the a.b speedup for using fixed offsets. For example, when you use a feature like interfaces in Java, you do end up with some searching to find the right method in the general case. You may not see this all the time because Java employs a trick to cache the offset for the last type seen which sometimes eliminates that search in many cases. I have a project in which changing one interface to an abstract class improved performance by over 50%.

One other poorly understood performance factor in comparing typeless and typed language is when interpreted code calls native code. For example when PHP or Java calls some C function. Native transitions are usually substantially slower than normal method calls because of the extra work they need to do in translating data types, pinning down memory used by the native code, copying memory from an unmanaged to managed environment etc.

Though both typed and typeless languages suffer the same problem, in general typeless languages use more, higher level C libraries. That’s probably because writing them in the language itself is too slow or just the effort involved in writing the code itself is too high given the limited commercial support for typeless languages. With more native transitions, the performance hit for this design increases so just moving more code into the native layer may not make things faster when you need to make lots of native method calls.

Of course more use of native code turns into an advantage when you have a small amount of typeless code which just strings together a few efficient but long running native methods, like copying a file. In these systems the typeless language is almost as fast as C.

In general typeless languages have faster round trip times between changing code and seeing the change. Because they are typeless, when you update a module, you do not have to update the entire application. Changed code constructs can co-exist with unchanged constructs. In a typed language however, you have to update the type in a way that preserves the stricter typing contracts. Since the code itself relies on fixed offsets, when those offsets change, you have to update all of the code atomically which is hard to do and get right. Most typed languages cannot do that seamlessly and worse still, there’s no way to know when it will or won’t work making “Class patching” useful only in special cases where you can isolate all dependencies on that class that is being changed.

Interpreted versus Compiled

To get good performance as a project grows, even interpreted languages these days must cache compiled descriptions of the code. They do however retain the ease of use benefits in most cases because this is all done transparently, by the browser or the runtime engine. When the code changes, these caches are updated automatically. Without such a feature, interpreted languages bog down as code sizes grow. Each time a process restarts, too much code must be interpreted before you can use the system.

Thread Architecture

Java, C, and C++ are all multithreaded using operating system thread scheduling. In general, this means that all code must be “thread aware” though in practice, frameworks try to reduce the likeklihood of thread conflicts. When a framework is well designed, the burden of synchronization is not imposed on application code.

You need a threaded architecture when you need to share a large pool of memory or efficiently perform I/O with a bunch of sockets or files. You can more easily leverage a multi CPU environment with OS threading.

In contrast, even multi-threaded VMs like Python may have a global interpreter lock or will do VM based thread scheduling. Either of these architectures eliminates opportunities to do parallel I/O unless you switch to a multi-process model. For example, PHP will run each HTTP request in a separate process and so achieves some form of parallelism that way. But in doing so, it eliminates the use of shared memory which reduces the efficiency of memory caching. It also means that any data structure used by all HTTP requests must be replicated across all PHP processes further increasing both computation and RAM usage.

So for PHP, you’ll need even more memory and more CPU to populate that memory. You do still benefit from OS level file caching of course.

What about the Future?

I tried to be neutral in my analysis but you can probably tell from the above that I like the benefits of typed languages. When you consider long term costs, and include modifications, enhancements, transfer of code between developers, runtime efficiency for either large scale or mobile deployments, strongly typed languages win out.

I agree however with Ruby and PHP developers that we are not there yet when any strongly typed language today will beat out PHP and Ruby for any given project. As long as the code is easier to read and edit for most people, the typed language advantages may easily be outweighed by availability of people, cost, and the poor workflows that exist between complex typed languages like Java, C, C++ for designers, analysts, and admins.

To bridge the gap, we need a strongly typed language which has:

simplified tools – the Java IDE is too complex for entry level programmers and others who work with PHP and Ruby code today
syntax improvements to eliminate imports, use inferred typing, and in general simplify the syntax will bring typed languages much closer to untyped languages in readability/brevity.
mixed interpreted/compiled modes and a way to migrate code between them as it solidifies
updating of types for the common cases for immediate code updates. When that’s not possible the ability to know as soon as the code is changed that a restart is required.
built in compilation, dependency management for automated builds, updates, deployments. Maven, ant, and IDE configuration are too complex today.

What do you think? Did I miss any important issues that affect your choice of a language? Let me know in the comments!

Evolution of Forms (More about Why I left Adobe)

An article of mine about evolution of forms technology was published on The Register. The need for this technology is why I went to work at Adobe and why I left when I realized they would not market LCDS this way.

BTW, Froyo – aka Android 2.2 update arrived on my Nexus One July 1. My phone runs Flash! Congrats to my friends at Adobe for creating the first/best universal portable runtime for rich UIs. As a stock holder, I just wish you had a better monetization vehicle for it (hint, hint). Thanks Google for not being afraid of Flash, plus all of the great things you did with android: tethering, navigation, my tracks, maps, gmail, etc.