My article on the register.
Source control plays an essential role in software engineering. I’ve been using it ever since my first job and it transformed how I code. But like every tool, it seems, it can be your best friend or at times your worst enemy. Most painfully, CVS, SVN and P4, for example, are all terrible at merging a branch the second time. They lose track of what was already merged and start registering false conflicts.
At Adobe, on some complex projects during lockdown you’d have to coordinate with someone before each checkin. He’d bracket batches of commits with tags, then carefully merge a set of deltas one batch at a time. Not a fun job – everyone’s waiting on you, while you are trying to juggle lots of code you did not write at a critical juncture of the project.
The other time source control let me down in a big way was on my trip to India a couple of years ago. Access to the source control system back in San Jose was so poor, it made me change how I worked – in a bad way. I did not verify the diffs and checkin comments for affected code before making changes. I batched up all sync/checkins during breaks (and yes took more breaks).
The reasons Git is superior:
Local history, local branches
I started a new project by creating a Git repository on my local machine (git init, git add). A few months later, I wanted to share the code with a friend. I cloned my local repository into a bare repository on a hosted Linux VPS, then gave out that URL (git clone ssh://myserver.com/var/git/myapp.git). Now I can “git push” and “git pull” changes to/from that remote server as needed to share or back up my work. Each repository maintains the entire history of shared branches, so even if there is a central repository, you use it less often. When you have conflicts trying to push or pull, there’s one straightforward process to merge and resolve them.
Occasionally you need to put work on hold to fix some other more important bug. Git lets you set your uncommitted changes aside with “git stash”, do the fix, then bring your changes back with “git stash apply”, all without touching a server.
Because you can check in changes to your own repository without affecting others and without having to run the complete test suite, your checkins tend to be smaller which improves the quality of your version history. At Adobe I was known for massive checkins sometimes with as many as 10 bug fixes. That’s because the test suites would take an hour or more to run. I could run them at most two times a day without interrupting my work. Later this cost me time when trying to identify or merge a particular fix. With Git you make checkins to your local project at natural intervals for history. You push/pull at natural intervals for synchronization.
On all but the smallest projects, you need a way to test in environments that are isolated from active development prior to release. Usually you tell coders to stop checking in changes during lockdown, or you might create a branch and start merging. Either approach slows you down at the most critical phase of the project.
With Git you define a separate server repository for each level of isolation that is required. You might have a development repository which developers sync to, a staging repository for testing primarily used by QA, followed by a live one that is used to mirror what is actually released or to be released. During normal development, you might have staging automatically pull from development so QA stays on the latest. But after lockdown, you turn this off. QA can move changes as needed from the development repository into staging and sync that to live as needed. Any developer can change their default repository and sync to either staging or live as needed when problems arise.
So far, I like the performance characteristics of Git. Given the architecture, some things are faster and some things are slower, but I suspect that since Linus wrote the core, most things you do day-to-day are faster even on large projects. Version information is maintained per-repository, not per-file, so getting the changes which affect an individual file can be slower – e.g. the “git blame” command (similar to cvs annotate). But commit, push and pull commands have so far been very fast for me. Despite the fact that Git’s object model does not store changes as “diffs”, but instead stores each version of a file as a compressed blob, space has not been an issue.
Smarter Than You’d Expect
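Renaming a file? Git figures that out automatically: an unchanged file keeps the same SHA-1 hash under its new name, and a renamed file that was also edited is matched by content similarity. Git can even figure out when you refactor a big chunk of one file into another one. It prints a fancy ASCII-art summary after each pull to show you how much was added and removed in each file.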
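Here is a rough sketch of the simplest case, a file that moves without changing. It is written in Python rather than Git’s own C, and the function names and the dict-based “snapshots” are my own illustrative assumptions, not Git internals. The one part taken from Git itself is the ID: the SHA-1 of a short “blob <size>” header plus the file’s bytes. Because the ID depends only on content, an unchanged file has the same ID under its old and new path.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the ID Git gives file content: SHA-1 of b'blob <size>\\0' + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def find_exact_renames(before: dict, after: dict) -> dict:
    """Map old path -> new path for files whose content did not change.

    `before` and `after` are hypothetical snapshots of the form
    {path: content bytes}; this is not how Git stores trees.
    """
    # Paths that disappeared, with the ID of the content they held.
    removed = {path: blob_id(data) for path, data in before.items() if path not in after}
    # Paths that appeared, indexed by the ID of their content.
    added = {blob_id(data): path for path, data in after.items() if path not in before}
    return {old: added[oid] for old, oid in removed.items() if oid in added}


if __name__ == "__main__":
    before = {"util.py": b"def helper():\n    return 1\n", "main.py": b"print('hi')\n"}
    after = {"lib/helpers.py": b"def helper():\n    return 1\n", "main.py": b"print('hi')\n"}
    print(find_exact_renames(before, after))
```

This only catches byte-for-byte moves. Matching a renamed file that was also edited needs a similarity comparison, which this sketch leaves out.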
Verifies All Files
Kernel programmers tend to be paranoid (a good thing). Git verifies the integrity of all files using SHA-1 hashes. If any bit is out of place, it will barf with some cryptic error that may require a Google search to fix. But this has already paid off for me. One problem I had with Git on Windows was running it in Cygwin without newlines getting destroyed (it only works in one of Cygwin’s binary modes). Git complained, which prevented me from checking in any corrupted files.
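The idea behind that check can be sketched in a few lines of Python. This is a simplified illustration, not Git’s implementation: the TinyStore class and its method names are hypothetical, and a real Git repository lays its objects out differently on disk. What it shows is that when content is filed under its own hash, any later change to the stored bytes becomes detectable.

```python
import hashlib
import zlib


def object_id(content: bytes) -> str:
    """SHA-1 over a 'blob <size>\\0' header plus the content, as Git does for files."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class TinyStore:
    """Hypothetical in-memory object store keyed by content hash."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        """Compress and file the content under its own hash; return that hash."""
        oid = object_id(content)
        self.objects[oid] = zlib.compress(content)
        return oid

    def verify(self, oid: str) -> bool:
        """True only if the stored bytes still hash to the ID they are filed under."""
        try:
            content = zlib.decompress(self.objects[oid])
        except zlib.error:
            return False  # the stored bytes are not even valid compressed data
        return object_id(content) == oid


if __name__ == "__main__":
    store = TinyStore()
    oid = store.put(b"hello\n")
    print(store.verify(oid))
    # Simulate a tool rewriting the line ending behind the store's back.
    store.objects[oid] = zlib.compress(b"hello\r\n")
    print(store.verify(oid))
```

The second check fails because the mangled newline changes the hash, which is the same class of damage I ran into under Cygwin.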
My favorite app server, Resin, is now using Git behind the scenes to sync files across a cluster of servers. I like that use since a) it is pretty fast, b) it makes it easy to make an isolated change on a live server while tracking that change robustly, c) you can check the history even on production, and d) the verification comes in super handy here – any local changes can be detected and traced.
As with all new technology, there are caveats. Git is still fairly low-level, has numerous options and does not fully follow industry-standard conventions (e.g. to revert: “git checkout file”). It takes more thought to set up repositories and workflows, and the two-phase commit/push process requires some mental re-wiring. Because it is so flexible, people are still figuring out how to use it best for different purposes. Since no one is making money off of Git (except maybe GitHub?), it is evolving fairly slowly in the “polish” area. But from now on, for me it’s gotta be Git.
Like many software architects, I’ve built quite a bit of framework code in support of applications because the design patterns I wanted to use were not present in the core language. As a software engineer who cares for the whole life-cycle of the products I build, I’ve always been looking for the best way to get declarative programmers more involved in development and customization of applications. In 1996 I had finished building a visual programming language called AVS/Express with a sophisticated data binding system. I realized AVS would never market it as a horizontal software tool and yet I was intrigued by the power these designs might have in the broader marketplace. Like others, I believed Java would be the next big thing in software engineering and luckily found a great company looking to innovate in Java web platforms, ATG. We designed a page template language, an IoC framework called Nucleus and a sophisticated ORM solution called “data anywhere”, all of which were keys to ATG’s success as both a platform and a customizable e-commerce solution. Despite my advice to create horizontal offerings with these APIs, they remained high-cost, five-figure “enterprise” products and lost out to free offerings such as Spring, Struts and Hibernate when those were developed.
Today, ATG makes money regardless, but their customers must spend a lot of money on headhunters, given how many unsolicited phone calls and emails I get looking for trained ATG developers. I bet their current e-commerce business would be even better if more developers were trained on their system. I also suspect they are feeling saddled by a large platform code base to maintain that hinders them as much as it helps now. The cheaper stuff is evolving faster because more developers are using it.
I joined Adobe in 2005 because I believed they could push these design patterns into robust platform software that would get broad enough adoption to be successful. With Flash/Flex they had a competitive UI but still were largely ignored by mainstream business engineering companies. They needed server connectivity, round-trip UI to database tooling and improved standards compliance to really advance software engineering efficiency in a significant way among the corporate developers. At the time I had come to believe that big companies were the best way to build software platforms because they could make large quantity, lower cost products successful and had the resources to plan for long-term investments. It seemed like a perfect fit.
During the first Adobe internal developer’s conference I attended, the theme was “Platforms” and I was encouraged early on. Sadly, I learned many lessons about big-company politics that showed me their limitations when it comes to innovation. In Adobe’s case, they are all about platforms on the client but when you get to the server, it becomes a political minefield. When I was hired, I was told Flex would be much cheaper than its low five-figure price at that time. Shortly after I joined, with the merger just starting, things changed. My product would have a free version but would cost an even higher five-figure price for an unlimited, one-CPU, clusterable license. I never liked that pricing strategy but recently it just got worse. They dropped the free version and raised the price on the unlimited version by another 20%. For a set of tools so widely applicable (forms, persistence, etc.) and evolving elsewhere simultaneously, that price will ensure other technology evolves more quickly to fill this need and Adobe loses its last/best monetization vehicle for Flash.
While I still believe I was right about big companies being the best places to evolve software platforms, I now see their limitations more clearly. A big company has a complex political landscape and the deeply hierarchical management structure makes it more difficult to make good engineering decisions unless the leaders understand the vision. Platforms combine standards with meticulous design and must include an efficient process for rapid evolution to ensure the success of their solutions. Big companies will always be tempted to use their leverage and momentum to steer technology projects away from the path of efficiency towards a tactical advantage. This type of decision making is the path to brittle and slowly evolving tools infrastructure. To me software engineering is still engineering first and foremost. Like other engineers, we build mission-critical components that become important public investments, so corporate misdirection and bungling of technology advancements that harm efficiency particularly pain me.
I’ve spent the last 8 months part time researching the state of the industry looking for the next big language – the one like Java was in 1996 – that would help us make even better, more solid and maintainable software designs going forward. I want maximum portability from mobile to desktop to cloud. I want to leverage all of the language idioms we’ve all learned and all of the library code we’ve built, but improve the design integrity, flexibility and robustness of the designs. I’ve looked at Ruby, Scala and JavaFX – the three major contenders – and found them all to be an unsuitable base from an engineering perspective for my purposes.
I’ve also found what I consider fragile support for each as well:
JavaFX: With Oracle taking over Sun, I believe that we have lost a major supporter of quality open source engineering languages and tools. Oracle’s record of doing what’s right for engineering efficiency and promoting standards even if it means potential loss of leverage is not good. Sun was ok at best but it will in all likelihood just get worse from here. JavaFX is not as open as Java and is not fully compatible with Java, so it almost looks like Sun was trying to fork Java back into a proprietary language before the merger. Can Sun help Oracle? In my experience, Macromedia had a great effect on Adobe’s culture in terms of opening up engineering, raising awareness of standards and treating developers as an important constituency. But at the end of the day, Adobe’s management determined what went down and that did not change after the merger. I don’t expect Oracle to be more open than Sun or more successful at advancing such a core technology smoothly. Their business unit managers like to make decisions about technology. I have not met any Oracle BU heads personally but I feel like I know a couple of them because every meeting I’ve been at with Oracle starts with a discussion of their thoughts. This is in contrast to Google, where it appears that the engineers make decisions about which technology to use for a particular solution. So I am not optimistic about JavaFX’s long term prospects to solve our core engineering challenges.
Ruby: A dynamic language without static typing has very little chance of ever competing on performance with a compiled language. If you can’t turn “a.b” into “load from offset” and instead need a method call, you are sunk from the get-go. I believe a language for writing languages is a great prototyping tool but a poor way to enforce design practices and build a single consistent language adopted by a broad community. In terms of support, there is no one officially paid to work on the original Ruby runtime that most people use, but there are paid projects trying to port it to Java and .NET. Without compatible native plugins, though, these efforts are guaranteed to fragment the community, and so far they have not really helped improve Ruby’s performance or toolability.
Scala: The best attempt yet to make an advanced functional language suitable for the masses but advanced functional languages are not suitable for the masses. It seems that most of the interesting languages these days are coming from the academic world but I think academics have a different focus than commercial programmers. Systems tend to be built and maintained by the same person in academia but in the corporate world, code will get handed off and probably to someone with less programming experience than the author. Scala is also a language for writing languages – again a poor choice for mainstream engineering.
We are all used to free languages but this has left a void of credible language and tools companies. JetBrains, one of the few, has recently open sourced the core of their Java tool suite so that they can compete effectively with Eclipse as a tools ecosystem. I hope their market is still solid so they can continue to innovate, as Eclipse would not be nearly as good without them.
Google shines in this space, continuing to make incredible contributions across the spectrum. They have yet to show that they are using their leverage unfairly. You do not see them making corporate bets on any one language – they support Python, Java and C/C++, and have released two languages, Simple and Go, in the last few months. Neither of these looks particularly strategic to me. Simple is another Visual Basic-like language designed to help entry-level programmers be more productive but does nothing to improve their workflows with other types of programmers. Go might be interesting as an alternative to C but, like C, only targets systems programmers.
I should mention IBM as they have been big supporters of open source projects in the past and I do think they’ve done a lot of good in the industry over the years. But their challenge is that for any truly horizontal product, there are dozens of competitive efforts internally and so they are not likely to drive change as quickly as would benefit the industry as change for them involves considerable risk and retraining.
Along with researching the industry, I’ve also been building a new platform, so the question of how best to market it in today’s climate is something I’ve been doing a lot of thinking about. All I’m sure of is that it will not be easy. We developers are tough to sell to. We are reluctant to accept lock-in, we want all the source, and we want complete control over every link in the supply chain without royalties. The risk of going overboard, of course, is that we do not invest directly in the tools which improve our productivity and lose out on competitive advantage. For now, my project will be independent but I’m interested in any ideas you have for the best way to market platforms in today’s climate.
I’ve spent most of my career building frameworks which enable efficient delivery of large scale, expensive, mission critical systems. These solutions usually have six-figure budgets breaking down into license revenue, services and consulting to build the solution, and support, maintenance and upgrades over time. Because of the money involved, these types of solutions tend to drive a lot of the innovation in the software industry. Unfortunately, just because a buyer in an organization has a large budget does not necessarily mean they know enough about software to make educated decisions on what they buy. The complexity and number of variables in making such a decision are large: license cost, type and cost of developers to build the system, system maintenance and administration costs, etc. Each solution in this space is unique and each vertical may have a relatively small market, so getting reliable references is tough, particularly as it is likely these other customers are your competitors. Further, I’ve noticed that when people spend a lot of money on software they are reluctant to be vocal about its failure. The amount of money involved gives everyone skin in the game – buyer and supplier alike – and so there is some pressure for people to try to cover up problems and hide the reality that some enterprise product’s customers are mostly unsuccessful or unhappy.
One thing about being in the industry a long time though is that you do see that quality prevails in the long run. When an enterprise software vendor does get a public black mark in a particular vertical, it can be a swift blow that ends their market potential quickly. I’ve seen a few recipes for long term success in selling to enterprise customers and a few recipes for long-term failure. These may not be on everyone’s top-ten list when they are starting a company in this space but these are the qualities that in my experience ultimately differentiate the winners from the losers.
* Don’t ignore support and maintenance costs. Your buyer is probably not too focused on how hard the system will be to back up, how often they’ll need to upgrade the software to deal with security threats, or how difficult it will be to find programmers to extend and maintain the system, but as their vendor you need to take care of these aspects for them. You need to provide responsive support and need to have the ability to spin patches the very same day the bug is found. In the frameworks I’ve built, support was encouraged to contact developers directly when they found what they believed was a serious bug. The developer could often confirm or deny and possibly spin a test patch right away. The systems allowed a quick way for the customer to drop the patch into a directory so it would be installed/uninstalled easily. Since support often cannot easily reproduce the problem, the customer could usually be encouraged to test the patch directly to expedite the fix. That willingness goes down the longer the patch takes. Most of the time this lean process leads to the quickest possible resolution of the problem and a happy customer. Of course if your first couple of attempts don’t work you need to get a test case, but I found that 8 out of 10 support calls that would have led to escalations and messy processes wasting lots of time could be streamlined dramatically. It also forces developers to refocus on quality over new features to ensure existing customers are happy first.
On the other hand, I’ve seen enterprise companies which discourage engineers from discussing bugs with the customer. Every communication goes through a formal process involving some escalations team. Further, in many cases, there is no patch mechanism. Even a 2 line code change would have to wait for some massive roll-up patch, or would require a big investment in creating a special qualified patch. The “upgrade” process could be employed to install each patch which made wholesale changes to the system that were not easily undoable and more problematic technically. You can recognize these companies because they will deny the existence of bugs as a first reaction when you call support. To them, to admit to a bug exposes the company to a liability. Of course, the fact that your software has a bug is the real liability long term.
* Don’t ignore services revenue. People selling enterprise software are usually in it for the license revenue. You can potentially get a much higher return on investment and are viewed by the market differently. Customers also like licensed products because they feel like they are not building a custom solution from scratch – they are getting a tried and proven set of components requiring minimal customization. But essentially all enterprise solutions are customized. Business requirements change and software needs to change along with them. This is a fact of life. You need to make services profitable and successful so you can properly support your customers through the entire lifecycle of the product. IBM is one of the most stable companies around and relies mostly on services revenue and customer satisfaction. They will make your solution successful and they will make you pay for it. They don’t get too bogged down on license revenue or even selling and using their own products. They adhere to industry standards and are constantly on the lookout for a better way to serve their customers.
* Require adherence to industry standards and open programming models. Some sellers will pitch their framework as “the secret sauce” which eliminates the need for developers. But cost and access to good professional programmers, designers, analysts, and admins who can work with this solution ultimately determines whether a customer will be happy with a solution long term. The buyers often don’t know they need standards and can’t tell an open programming model from a closed proprietary one. They don’t want to be told there will be maintenance costs or enhancements needed so this can’t be part of your pitch but if it is not part of your strategy long term you will fail. These days enterprise software companies and products come and go so if your solution is not standards based, you might need to rebuild it from scratch if a vendor stops supporting it.
* Beware the almighty direct sales person. Most people watching “All in the Family” think that Meathead was usually right but that Archie usually won the argument. This trend occurs in companies too – they are often unduly influenced by the polished sales person. Sales people will make impassioned pleas for high cost products, and if they don’t get the exact right commission plan they can let their own goals outweigh the goals of the company in how they use their influence. A good manager recognizes this and structures a sales person’s commission so they get a percentage of services and support revenue from their customers. They share commissions with a team of inside sales folks who can worry about the small fish. By working in teams, the inside sales person can improve the quality of the leads for the sales person. Support and services are difficult to monetize with direct sales, as you need to make it easy to do business with your company and direct sales only cares about the biggest fish in the water at any given time. Without shared commissions, direct sales might spin inside sales as a waste of time as they take some low-hanging fruit away.
* The market should dictate license cost. Horizontal software typically is inexpensive and vertical, more specialized software gets more expensive. Sellers would love to sell all products for $20K/CPU or more but that might not be the right price target for your product to be competitive. You need to survey the market, figure out who your customers are and what they are willing to pay, then maximize the revenue by choosing the right price. Direct sales does need a high average deal size to be worthwhile but inside sales can profitably sell much lower priced products.
* Give developers a voice. Enterprise solutions are all about customized software, which means developers are going to get involved at some point. Of course in the pitch, the seller will try to minimize this cost as customers want simplicity. Customers want to think of themselves as buying off-the-shelf, push-button software. But buyer beware. Quite often, there is still programming involved in using these systems – it just takes a more interactive form. That might be nice but since this is essentially reinventing how people program, it takes on quite a bit of complexity. Software must be version controlled, deployable, support collaboration, possibly on branches, use maintainable database schemas, etc. A shrink-wrapped, push-button, customizable software tool quite likely ignores all of these aspects. It locks the typical developer out, which just means you need a different type of developer and you have to reimplement, by yourself and using ad hoc means, all of these processes that took us decades as an industry to refine.
* Keep a balance of power between product management and engineering. Product management needs to have control over the feature set as they represent the customer, but there need to be checks and balances because at the end of the day engineering needs responsibility to ensure designs work efficiently in practice. I’ve seen product managers who were quite confident they did not need engineering’s input for specifications, but then made such basic snafus in their thinking regarding security, scalability, maintainability, etc. that it was almost mind boggling – “Of course it’ll be secure – we’ll use SSL”. If you have a rigid top-down management structure which prevents free flow of information and makes escalations so difficult as to be painful, you’ll get a bunch of developers who are just coding till 5 and don’t take responsibility for the customer’s success.
* Beware the “enterprise solution iceberg”. One of the challenges in selling enterprise software is that the markets tend to be fairly small and fragmented. For some companies, it is tempting to expand the addressable market by combining a large number of various software components into a single package. But to do this in a scalable way requires skilled engineering of packages, modules and dependencies. This is well supported in many industry-standard software environments but easy to mess up if you roll your own framework. When it goes wrong, the result is extremely long startup times and a very high memory footprint. Software engineers smell this type of system instantly and run, but your average enterprise buyer does not look closely under the hood before they buy. They might argue that memory is cheap and some systems rarely need to be restarted, so who cares if it takes 15 minutes? For developers, that 15 minutes means a trip to get coffee and they might need to do that 20 or 30 times a day. I try to keep my round-trip times to less than 20 or 30 secs max so I’m most productive. If your system ever needs to scale, an app that could be run on a $15/month 128MB slice of a Linux server now needs a quad-core dedicated system with 2GB, at maybe $150/month, just to store 1GB of code of which you are using only a few percent. Finally, because of the large number of lines of code, you are exposed to more security threats, required upgrades, etc. And because of the massive footprint, you’ll suffer long downtimes and late nights for IT when they have to perform those upgrades, all contributing to the pain and cost of system maintenance.
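To put numbers on the startup-time point in the last bullet, here is a back-of-the-envelope calculation using the figures quoted there: a 15-minute restart versus a 30-second round trip, at 20 or 30 restarts a day. The function name is mine, and it assumes the developer waits idle for the full restart each time, which is a simplification.

```python
def daily_wait_minutes(startup_seconds: float, restarts_per_day: int) -> float:
    """Minutes per day spent waiting, assuming every restart is waited out in full."""
    return startup_seconds * restarts_per_day / 60


if __name__ == "__main__":
    scenarios = (("15-minute startup", 15 * 60), ("30-second startup", 30))
    for label, startup_seconds in scenarios:
        for restarts in (20, 30):
            minutes = daily_wait_minutes(startup_seconds, restarts)
            print(f"{label}, {restarts} restarts/day: {minutes:.0f} minutes waiting")
```

At 15 minutes per restart that is 300 to 450 minutes, or 5 to 7.5 hours of a working day. At 30 seconds it is 10 to 15 minutes.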
Ultimately, to sell enterprise software effectively you have to realize why the customers are paying so much for their solution. They believe that the more money they pay, the more assurance they have of the success of the project. Whether you call it license or service is almost irrelevant to them as long as the purchase succeeds in the long term. As we developers know, for a project to succeed you need to follow best practices in the industry, support industry standards, and provide services and support for your customers as needed. These can’t be afterthoughts if you want to win in the long term with any piece of software. And buyers, remember: you are buying a very intricate machine that needs to be ultra efficient and reliable, not a slide deck.