Monday, June 16, 2008

Versioning: My Happy Place

There has been a lot of discussion about the JSR 277 and OSGi versioning standards. I won't revisit the arguments here, since people far more intelligent than me have plenty to say on the matter. If you want to dive in, go ahead and Google it, and I will talk to you later, since you will have forgotten to come back to this blog entry once you are done. For the rest of you, I beg your indulgence: I am going to assume that I actually have the qualifications to pontificate on this matter. So here goes.

Defining Versioning
Why do you want to version something? There are a lot of reasons, particularly when it comes to Source Control Management (SCM), also known as version control. There you need to know the version of the particular class you are working on. Using version control when you are working on software goes without saying. There are no debates about versioning or version formats in that setting: you get the one that comes with the SCM you are using, period, end of story. No, the issue comes up when you are versioning a component, which OSGi/Eclipse calls a bundle and JSR 277 calls a module. So what does a version stand for in this case?

I like the Wikipedia definition of a version in this case. It is pretty generic, but it goes something like this: "Software versioning is the process of assigning either unique version names or unique version numbers to unique states of computer software." The definition tries to cover all cases, so the point to focus on is the unique identifier for a unique state of the software. So what kind of state are we trying to uniquely identify? Well, the changes that you as a developer have made to the software, of course, but I think they fall into two categories, at least at the component level. The first category covers changes that affect dependency management because the external API on which clients rely has been altered. The second category I will term "bookkeeping" for this blog entry, because it denotes internal changes that should, notice I say should, not affect clients using the external API. "Bookkeeping" comes into play because software developers are human, and even though such a change should not impact clients adversely, sometimes it still does. "Bookkeeping" lets a system integrator or admin note that version X of a component worked but X+1 sure did not, which is exactly the information someone providing support wants to hear. It is really difficult to say that one category matters more than the other. Changes in the first category will prevent a system from coming up correctly, whereas changes in the second category could help explain why the system later goes haywire.
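To make the two categories concrete, here is a minimal sketch in Python, used purely for illustration. Neither OSGi nor JSR 277 defines versions this way; the `ComponentVersion` class, its `api` and `bookkeeping` fields, and the `first_bad_build` helper are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ComponentVersion:
    """Hypothetical two-part version; not the OSGi or JSR 277 format."""
    api: int          # bumped whenever the external API changes
    bookkeeping: int  # bumped for internal-only changes

    def same_api_as(self, other: "ComponentVersion") -> bool:
        # Dependency management only needs to compare the API part.
        return self.api == other.api


def first_bad_build(reports):
    """Given (version, worked) pairs from the field, return the earliest
    version reported as failing, or None if nothing failed."""
    failures = [version for version, worked in reports if not worked]
    return min(failures) if failures else None
```

With reports that version (3, 7) worked and (3, 8) did not, `same_api_as` says the two are interchangeable as far as clients are concerned, while `first_bad_build` points support at (3, 8). That is the "X worked, X+1 did not" information described above.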

I like to view a version as a shortcut, so that a client does not have to check the full external API at runtime to know whether there are changes it should concern itself with. In fact, ideally the "external API" version would be some sort of "hash", if you will, of the external API that could be maintained automatically. More on the ideal answer later. The "bookkeeping" component is in effect like a branch in version control: it lets the developer go back and retrieve the previous version when problems come in from the users.
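As a rough illustration of the "hash of the external API" idea, the following Python sketch fingerprints the public methods of a class together with their signatures. It is only one possible reading of the idea: the `api_fingerprint` name, the choice of SHA-256, the 12-character truncation, and the rule that underscore-prefixed names count as internal are all assumptions made for this example.

```python
import hashlib
import inspect


def api_fingerprint(cls) -> str:
    """Hash the public callables of a class together with their signatures."""
    entries = []
    for name, member in inspect.getmembers(cls, callable):
        if name.startswith("_"):
            continue  # assumption: underscore-prefixed names are internal
        entries.append(f"{name}{inspect.signature(member)}")
    # getmembers returns members sorted by name, so the digest is stable.
    digest = hashlib.sha256("\n".join(entries).encode("utf-8")).hexdigest()
    return digest[:12]
```

Two classes whose public methods have the same names and signatures get the same fingerprint even if their bodies differ, while adding a parameter or a method changes it. A fingerprint like this only says that the API changed, not how, so it cannot order versions the way a number can.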

I could go on about how many coordinates are needed for this or that, but there are many, many people far more qualified to comment on this and, you know what? They have commented, so if you are interested, go Google them and check them out. What I am going to do instead is describe what versioning would do for me in my ideal world.

Versioning: James Ervin's Happy Place
I am not a big Adam Sandler fan, but I am a guy, so I will say that in the movie "Happy Gilmore", I was moved by Happy's idea of his "Happy Place." So what I am going to describe is my "Happy Place" for software versioning, though I will say that the "Happy Place" described in the movie is far superior to the one I will go into.

First off, any versioning scheme at the component level has to take into account that all software changes fall into one of the two orthogonal categories I defined above: they are either external API changes or, for lack of a better word, internal "bookkeeping" changes. Any tooling support should be able to distinguish between the two.

Secondly, I would like the notion of backward compatibility to go away. You change the external API, you create incompatibilities, period. Incompatibilities between versions are a two-way street. The usual argument is that if you make so-called "backward compatible" changes, you won't break existing clients, and that's great, right? The problem with this view is that it assumes your new stuff will automagically be available everywhere, which does not take the full lifecycle of your component into account. What you actually do is allow a set of clients to be written that cannot use the older versions of your component that are already deployed. I don't have a great answer for this, but I would like people to take a wider view: all external API changes can force incompatibilities, not just the ones that break backward compatibility. Once an external API is defined, you need to treat it like set cement. How can we deal better with set cement? I try to address that in the next point.
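The two-way street can be shown with a toy model in Python. This is a sketch under simplifying assumptions: an API is just a set of member names, a client is the set of members it calls, and all of the names below (`can_run_on`, `api_v1`, `flush()`, and so on) are made up for the example.

```python
def can_run_on(client_needs: set, component_api: set) -> bool:
    """A client works only if every member it calls exists in the component."""
    return client_needs <= component_api


# Hypothetical API surfaces for two releases of the same component.
api_v1 = {"open(path)", "close()"}
api_v2 = api_v1 | {"flush()"}  # purely additive, "backward compatible" change

old_client = {"open(path)", "close()"}  # written against v1
new_client = {"open(path)", "flush()"}  # written after v2 shipped

print(can_run_on(old_client, api_v2))  # True: the old client is unaffected
print(can_run_on(new_client, api_v1))  # False: the new client breaks on v1
```

The additive change keeps every old client working, yet it creates a new client that cannot run wherever v1 is still installed.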

Thirdly, I would like better API tooling support. This follows from point two: I want better tooling to help with what is in effect an intractable problem. First, I should mention that I know about the new Eclipse API tooling push, and I want to investigate it thoroughly later on. I view tooling support here as falling into two categories again, which I will call "static support" and "client feedback support".

Static support, in my view, is what the Eclipse API tooling effort is focused on. Give me a warning to change the "external API" component of the version if I add things to the external API, and then provide an automatic way to update it. Give me an error or warning to update the "external API" component of the version if I change an existing method or make some other change that breaks backward compatibility. Tell me to update the "bookkeeping" component of the version when I make any significant change to the component that does not impact the external API. In other words, give me some help to make it more automatic. I, as a human being, have forgotten and will continue to forget to update versions without warning and error markers to guide my way.
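Those three rules are simple enough to sketch in Python. This is not how the Eclipse API tooling works; it is a hypothetical `suggest_bump` function that assumes the old and new external APIs are available as sets of member signatures and that something else has already decided whether the internals changed.

```python
def suggest_bump(old_api: set, new_api: set, internals_changed: bool) -> str:
    """Say which part of a two-part version the developer should update."""
    removed = old_api - new_api  # a changed signature shows up here as well
    added = new_api - old_api
    if removed:
        return "error: update external API version (existing members changed)"
    if added:
        return "warning: update external API version (members added)"
    if internals_changed:
        return "warning: update bookkeeping version"
    return "ok"
```

For example, going from `{"open(path)"}` to `{"open(path, mode)"}` is reported as an error, because the old signature no longer exists, while going to `{"open(path)", "flush()"}` is only a warning.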

Before I go into the second category, I want to emphasize again that this is my happy place, my ideal world. "Client feedback" tools are, to me, a potential answer to the problem of an external API being set like concrete once it is released into the wild. The trouble is that most of the time the whole API is not in use, usually only a subset, but which subset? That information would be invaluable. It would let a component developer know which parts of the API could be eliminated, or which need better documentation so that more clients would use them. "Client feedback" would provide the targets for refactoring and could prevent the kind of cruft that eventually dooms any software product. How do you best get this feedback? Some sort of tooling support, of course, and beyond that I dunno. Did I mention this was my "Happy Place"?
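Assuming the feedback could be collected somehow (the hard part, which this does not solve), the analysis side is easy to sketch in Python. The `usage_summary` and `unused_members` functions and the report format, one list of used API members per client, are hypothetical.

```python
from collections import Counter


def usage_summary(published_api, client_reports):
    """Count how many reporting clients use each published API member."""
    counts = Counter({member: 0 for member in published_api})
    for used_members in client_reports:
        # Ignore anything a client reports that is not part of the API.
        counts.update(set(used_members) & set(published_api))
    return counts


def unused_members(published_api, client_reports):
    """Published members that no reporting client uses."""
    counts = usage_summary(published_api, client_reports)
    return sorted(member for member, n in counts.items() if n == 0)
```

Members that nobody reports using are candidates for removal or for better documentation. The result only covers clients that actually report, so an empty count is a hint, not proof that a member is dead.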

Conclusion
So what do I want in a versioning scheme or tooling support? Prevent me from making the simple, obvious mistakes that I have made again and again. I want support that makes me take into account that when a component is released, people write clients to use it, and I have to keep them in mind for any changes, period. I think what I really want is the Staples Easy Button, which, if you know me, is a metaphor for an impossible-to-build tool. Still, there is a lot that can be done to make this problem less problematic before we need an "Easy Button".
