Archive for the ‘software configuration management’ Category
Do you now bear responsibility for the architecture of some code? I’d like to talk just with you.
I’d like to say that I think it’s really important to “listen and then lead”. You’re responsible to the customers of your code for its quality, and you’ll get a lot of pressure to “just do this” or “just do that”. The processes you put in place, to control the quality of contributions, are for the benefit of its users and the code as a whole—to maximize usefulness, not necessarily to maximize development.
Sometimes it’s okay, even necessary, to inconvenience a few people (including the contributors) in order that you don’t fail an even larger number of people. You’re not (at least I wasn’t) able to please everyone, and trying to do so can easily cause harm, both short term (immediate bugs) and long term (technical debt and bad architecture). It’s your, just your, job to balance all this and do what’s right overall, and it’s very much a judgement call.
Some of this might be replaced by a change control board’s decisions, but if you go that route, then you must have a change process that everyone follows, regardless of the demands of the moment. Some people are really slick and sophisticated in getting what they want, especially when it doesn’t hurt them if they end up creating a problem. (Heh, the more I reread that the more it sounds like parenting.)
I found I had to be confident enough to implement decisions and defend them, but flexible enough to shift policy and correct errors. On the plus side, I think these skills will help move you past your current position, and help you control the direction in which your career heads.
You’re doing pretty well if you don’t instantly take sides in a conversation, and instead weigh the merits; that’s better than I sometimes manage. Even if you’re slow making decisions (and there is so much to learn), a little slow is ok, since it gives a chance for more reflection and more mature ideas. Don’t be afraid to be wrong: if you never make mistakes, never make a bold architectural decision, then you’re playing it too safe, and you could probably be replaced with a small shell script.
Take care and good luck!
The thing that struck me most about this article is how closely it matched the way my own team of coworkers, who came and went, contributed and moved on, helped me develop two significant software build systems over ten years.
Without a doubt I had the role of “master”, and as Peter Naur would say, had the theory of the program in my head. I developed the build system from empty directories twice, to build products for two different software systems. Each time I had key responsibility to fit the system to customers’ needs, and to decide how it would evolve internally. (I’ve also been programming for thirty years, since BASIC on my first computers in grade school.)
I delegated what I saw fit to a handful of people who grew in responsibility as they demonstrated they understood the system and could make creative improvements to it (journeymen). One of the journeymen became the committer when I recently left. Since no-one had the theory as well as I did, architectural responsibility devolved from my fiat to a change control board, so that everyone could bring bits and pieces of their expertise together to make better decisions.
Hundreds of small changes were made over the years by a few dozen people who did not really grasp the principles of the system, so they needed a significant amount of guidance. I reviewed every line of code in the system, and knew how it all worked together. I expect that some of this efficiency will suffer in future, with the compensating advantage (to the organization that owns it) of not leaving “the keys to the kingdom” in one set of hands.
I’m not sure how well this approach would sit with people as a formalized organization chart and pay scale, but I’ve seen it work very well in practice with a software project I managed. Further, I believe that if you look closely most software projects in practice run this way (eg Linux has Linus, Alan, various subsystem committers, and a hoi polloi of patchers.) Even further, a single person may be an apprentice to one project, a journeyman to another, and a master to a third.
Perhaps the thing to do is to compensate people outside the company with bounties for the code they contribute; set up a contractor arrangement for journeymen; and actually hire only masters. This keeps companies small, focused on their core competencies and most valuable people. Even better for the community of programmers, this provides a structure for programmers to put food on the table while contributing to as many projects, and at whatever depth, as suits their inclinations and abilities.
Notes on Aspect Oriented Programming: Radical Research in Modularity, by Gregor Kiczales, for Google TechTalks on 16 May 2006.
Pointcut Language and Overall Program Structure
Abelson and Sussman: We have a language when we have primitives, a way to compose those primitives to perform useful work, and a mechanism to abstract and refer to compositions elsewhere.
Pointcut syntax is a little language, with syntax to select a set of join points (primitives), construct logical expressions with several sets (composition), and label an expression (abstraction).
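As a concrete sketch of that little language (the base script, the patterns, and the pc_* names are all invented for illustration), grep patterns can stand in for the three elements:

```shell
#!/bin/sh
# Hypothetical sketch of a pointcut "little language" over a shell script.
# The base script content and the pc_* names are invented for illustration.

base=$(mktemp)
cat >"$base" <<'EOF'
compile_step() { echo compiling; }
link_step() { echo linking; }
clean_step() { echo cleaning; }
EOF

# Primitives: each pattern selects a set of join points (here, lines).
pc_compile='^compile_step'
pc_link='^link_step'

# Composition: the union of two pointcuts, as pattern alternation.
pc_build="$pc_compile|$pc_link"

# Abstraction: the composed pointcut has a name (pc_build) we can reuse.
matches=$(grep -En "$pc_build" "$base")
rm -f "$base"
```

The named pointcut selects the compile and link join points but not the clean one, and could be referred to from any number of advice scripts.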
This is the sole advantage of modularity: abstract a set of primitives in one distinct place, and combine them with tools in the language. Using the language, instead of hand-weaving, provides a more structured way to consider how changing the base concern may change the pointcuts.
The dominant model applies crosscutting concerns to join points in base concerns structured as blocks within a hierarchy, instead of composing orthogonal sets of crosscutting concerns with the base concerns.
Procedural, Object, and Aspect Programming Methods
Say we describe the ordinary process of functional/procedural programming as a problem expert (eg, in the task at hand) collecting knowledge from other disciplines (security, optimization, distribution, synchronization, persistence, debug/log/trace, quality of service) to apply to the problem. Obviously the programmer can integrate all of this knowledge on a deep level, since all of it passes before the analytic capability of one human mind. However, there is a limit to the size and number of disparate disciplines which can be integrated before the scale and complexity produce an error rate unacceptable to customers.
Object-oriented programming does not alleviate this situation, since each class must still integrate all the crosscutting concerns with methods, or statements within methods, in the class definition.
Aspect-oriented programming allows a team of specialists to each apply expertise to many basic problems. One person could do everything the team does, but as mentioned can only work on a few very large problems at once, before reaching human limits; whereas a specialist can dispense information as needed across many large systems. This optimizes resources at the cost of the mentioned deep integration of specialist knowledge with other specialist knowledge. That integration is left up to the resource manager (the weaver), who schedules and reconciles the contributions of the specialists and the basic problem’s manager. The basic problem’s manager, in aspect-oriented programming, can represent the customer, verify correctness of the integration, and validate the integrated solution.
Consider Pointcuts in Aspects, and Pointcut Specifiers in Base Concerns, When Changing Base Concerns
We must be careful, when editing base concerns, not to break pointcuts; or, if we do break them, to remember to update the pointcut. Even so, this is less likely to cause error than the original method of weaving crosscutting concerns into the base concerns by hand.
With respect to my weaver for shell scripts and makefiles, I was hesitant to use explicit comments in the base scripts as search patterns for the sed scripts. However, such comments make a contract with any transforming program: we can consider these points reliable enough to anchor transforms. Without this contract, especially if the search patterns refer to code statements, pointcuts may not find the points they specify still in the program by the time the weaver gets to them (but at least they will fail safe and apply no advice).
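A minimal sketch of that contract (the “joinpoint:” comment convention and the advice text are invented examples): the sed advice anchors on the comment rather than on a code statement, and a pointcut with no match fails safe:

```shell
#!/bin/sh
# Sketch: an explicit comment as a reliable anchor for a sed-based weaver.
# The "joinpoint:" convention and the advice text are invented examples.

base=$(mktemp)
cat >"$base" <<'EOF'
#!/bin/sh
# joinpoint: before-build
make all
EOF

# Advice anchored on the comment, not on a code statement, so editing the
# code around it does not silently break the pointcut (GNU sed 'a' syntax).
woven=$(sed '/^# joinpoint: before-build$/a echo build starting' "$base")

# A pointcut that matches nothing fails safe: no advice is applied.
unwoven=$(sed '/^# joinpoint: no-such-point$/a echo never' "$base")
rm -f "$base"
```

The woven script gains the advice line directly after its anchor comment, while the unmatched pointcut leaves the base script byte-for-byte unchanged.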
The base concern’s comments should only refer to the base concern, not any crosscutting concern, and should certainly not say “at this point, insert that statement from the other crosscutting concern.” Those kinds of comments defeat the whole purpose of modularization, and do not recognize that we may eventually weave completely different crosscutting concerns into the base concern.
We should not try to apply a single comment across many base concerns, to reduce the effort to write the join point specifier. This binds, in a single relation, all the base concerns, and all the crosscutting concerns which use the comment. If the advice changes for half the base concerns, then half the instances of the single comment must change. If instead the pointcut is a combination of several comments, then we more easily split off a subset of these into a new pointcut, to receive new advice.
The Weaving-Loom Analogy
weave applies one crosscutting concern to one base concern. The variation lace customizes the base concern for an application (eg, by laying sed changes over the file).
loom weaves many crosscutting concerns into many base concerns, adding each crosscutting concern to the base concern as modified by the previous crosscutting concerns. Under user control, the loom orders the crosscutting concerns applied to each base concern. User control is essential because the user writes the crosscutting concerns, and so knows which crosscutting concerns apply to other crosscutting concerns, and therefore which should be woven last.
warp finds crosscutting concerns which modify a given base concern. It also finds crosscutting concerns and advice which modify a given base concern’s given code statement.
weft finds base concerns modified by a given crosscutting concern. It also finds base concerns’ code statements modified by a given crosscutting concern’s given advice. woof is already taken as a s(n)ide-acronym for another function in my build system, and is slightly silly anyway, although it does have the advantage of turning loom upside-down and backward.
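The warp and weft lookups could be sketched as queries over a plain-text registry of (crosscutting concern, base concern) pairs; the registry format and its contents here are invented, not my actual implementation:

```shell
#!/bin/sh
# Hypothetical sketch of warp and weft as queries over a registry mapping
# crosscutting concerns to the base concerns they modify. The registry
# format and the concern names are invented for illustration.

registry='logging build.sh
logging deploy.sh
tracing build.sh'

# warp: which crosscutting concerns modify a given base concern?
warp() { echo "$registry" | awk -v b="$1" '$2 == b { print $1 }'; }

# weft: which base concerns does a given crosscutting concern modify?
weft() { echo "$registry" | awk -v c="$1" '$1 == c { print $2 }'; }
```

With a registry like this, `warp build.sh` reports both logging and tracing, and `weft logging` reports both scripts that the logging concern touches.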
Weaving New Aspects into Reused Legacy Code
AOP removes all code for crosscutting concerns from base concerns. This is better than macros or libraries, which require at least calls in the base concern. The base concern can be completely oblivious to crosscutting concerns, which allows us to weave unanticipated crosscutting concerns as easily as we add the ones we had in mind when we wrote the base concern.
This is key to reusability, and in fact aspects should unlock the entire previous pre-aspect code base to modification, without editing the current working, tested, and delivered code files. I don’t believe in silver bullets, but the separation of concerns here is the first opportunity I’ve seen to keep the old code separate from the new code, yet use them together in a new program, while easily incorporating any further changes to the old code.
For new code, aspects promote writing code for reuse, since each concern is a separate file, and can be bundled and released separately. Reuse is available for base concerns since they do not refer to crosscutting concerns (low coupling), and completely implement one concern of the program (high cohesion). Reuse is easier for crosscutting concerns because: (1) if the aspect has no join point to match, it fails safe and applies no advice to the base concern; and (2) the advice code refers only to methods and fields exposed by the pointcut, which hides (low coupling) the remainder of the base concern.
Integrating Proprietary and Open-Source Code
This method could be very useful in incorporating open-source code into proprietary codebases. If we can show that we did not directly modify the original open-source code, then we may not need to release our proprietary changes to the open-source code.
Before compilation, the reused open-source code is in its files with its copyrights, in exactly the same state as when downloaded. Proprietary code is in its files with its copyrights. Since we must redistribute the open-source code files with any modifications to those files (the files under copyright), we just redistribute the original, unmodified open-source files, since they were not changed.
I am not a lawyer, so this analysis could be way off: specifically, we could still consider the distributed, woven version a derived work, which would mean that all code that went into producing it must be distributed. I definitely must speak to an IP lawyer before basing any decisions on this.
Dynamic Join Points
Since I’m writing a weaver, I should consider join points other than code statements (including comments) in shell scripts and makefiles. These implement a static join point model, which changes the code before it runs. Dynamic join point models create advice which executes only upon certain runtime behavior: for example, if a sub-script or particular makefile is called with certain parameters or targets, or a certain shell variable or makefile macro receives a particular value.
We could of course explicitly write more complicated advice to implement this. But it would be simpler to write the advice if the pointcut matched the runtime state of the program. To do this, we could either (1) monitor the execution state at runtime (better suited to modeling languages) or (2) extend the pointcut language to describe a runtime state. With (2), the weaver translates the runtime state description into static code to detect the state and apply the simpler advice.
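Under option (2), the translation might look like this sketch (the pointcut condition, the guard, and the advice name are all invented): the weaver generates the state-detection code, so the advice itself stays simple:

```shell
#!/bin/sh
# Sketch of translating a runtime-state pointcut ("when TARGET is release")
# into static detection code. All names here are invented for illustration.

base=$(mktemp)
cat >"$base" <<'EOF'
#!/bin/sh
# joinpoint: pre-step
do_step
EOF

# The simple advice the specialist writes:
advice='strip_symbols'

# The weaver wraps it in a guard generated from the runtime-state pointcut,
# then inserts the guarded advice at the static anchor (GNU sed 'a' syntax).
guard='test "$TARGET" = release && '"$advice"
woven=$(sed "/^# joinpoint: pre-step\$/a $guard" "$base")
rm -f "$base"
```

At runtime the woven script runs the advice only when the state described by the pointcut actually holds, without the advice writer ever testing that state.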
Since the crosscutting concern in my implementation “owns” the final executable form, I can prototype new changes to the base concern and affect only the product or platform which needs the change. Once it’s proven, I can move any new variables or code in the interface back to the base concern, to become a default part of the interface for other products or platforms. At this point, I may need to write advice to change the interface for products and platforms which must differently define its values.
We can extend this thinking to consider prototyping itself a crosscutting concern, and maintain several prototypes concurrently, weaving them in according to control options. Of course, this begs me to make control options crosscutting concerns.
Effect on Software Configuration Management
AOP introduces the weaving step between code checkin and compilation. For C/C++, this step constructs woven header and source code files by combining (1) base header and source code files with (2) header and source code file content in advice gathered in files which encapsulate each crosscutting concern. The order of application of crosscutting concerns should be specified in the target makefile.
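A sketch of that step for one file (the filenames, the “// joinpoint:” comment convention, and the advice lines are invented); the order of the sed scripts here stands in for the order the target makefile would specify:

```shell
#!/bin/sh
# Sketch of a weaving step before compilation: each crosscutting concern is
# a sed script applied in a specified order. All names here are invented.

base=$(mktemp)
cat >"$base" <<'EOF'
// joinpoint: function-entry
int run(void) { return 0; }
EOF

logging='/^\/\/ joinpoint: function-entry$/a log_entry();'
tracing='/^\/\/ joinpoint: function-entry$/a trace_entry();'

# Apply logging first, then tracing; tracing sees the output of logging,
# so its advice lands nearest the join point comment.
woven=$(sed "$logging" "$base" | sed "$tracing")
rm -f "$base"
```

Swapping the order of the two sed invocations swaps the order of the inserted statements, which is exactly why the makefile must pin the order down.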
Alternatively, we could make the Aspect C and C++ compilers available as alternative platforms (eg, *_aspect), and let the compilers sort it out. The caveat here is that I haven’t used one of these languages yet. I do know that they would integrate most cleanly if they were actually preprocessors which created new *.c/*.cpp (C/C++ source code) or *.i/*.ii (preprocessed C/C++ source code) files suitable for compilation with the build system’s cross-compilers.
Branching. Agile asks us to trade baseline stability for ease of integration. Take the latest baseline available, make it work with your change, and merge it as soon as it passes your tests. This reduces the number of branches active at once, and therefore the amount of simultaneous variability in the code base. Developers will not need to rebaseline from the last official build to pick up changes already merged to the integration branch.
Testing. This does mean that problems are more likely to be merged and affect others, so automated unit testing is needed to catch defects which do not depend on interaction with other components. Merging a change sooner gives it more second-order testing with other components before major builds, which increases the likelihood that interaction problems will be found before they can delay a release.
Quicker feedback about integration problems refocuses a developer’s attention on a change sooner, while the developer is still thinking about the problem, instead of at the next major build after a developer has moved on to other issues.
In September 2005, the Wall Street Journal reported how Microsoft changed its build process after it rebaselined Windows Longhorn.
The Windows division apparently had little or no integration criteria to block unstable or poorly integrated code from merging to the integration branch.
Mr. Allchin's reforms address a problem dating to Microsoft's beginnings. Old-school computer science called for methodical coding practices to ensure that the large computers used by banks, governments and scientists wouldn't break. But as personal computers took off in the 1980s, companies like Microsoft didn't have time for that. PC users wanted cool and useful features quickly. They tolerated — or didn't notice — the bugs riddling the software. Problems could always be patched over. With each patch and enhancement, it became harder to strap new features onto the software since new code could affect everything else in unpredictable ways.
The unstable codebase created by uncontrolled changes was fragile, and prone to unanticipated errors from subtle interactions. Without automated testing, dedicated developers (firefighters, probably not those that wrote the errors) spent all their time correcting builds.
Mr. Allchin says he soon saw his fears realized. In making large software programs engineers regularly bring together all the new unfinished features into a single "build," a sort of prototype used to test how the features work together. Ideally, engineers make a fresh build every night, fix any bugs and go back to refining their features the next day. But with 4,000 engineers writing code each day, testing the build became a Sisyphean task. When a bug popped up, trouble-shooters would often have to manually search through thousands of lines of code to find the problem.
This approach, and the architecture of Windows, differed from competitors', and those of other parts of Microsoft.
That was just the opposite of how Microsoft's new rivals worked. Google and others developed test versions of software and shipped them over the Internet. The best of the programs from rivals were like Lego blocks — they had a single function and were designed to be connected onto a larger whole. Google and even Microsoft's own MSN online unit could quickly respond to changes in the way people used their PCs and the Web by adding incremental improvements.
In response, the Windows division automated testing, improved and automated integration criteria, and focussed developers' attention on correcting errors.
By late October, Mr. Srivastava's team was beginning to automate the testing that had historically been done by hand. If a feature had too many bugs, software "gates" rejected it from being used in Longhorn. If engineers had too many outstanding bugs they were tossed in "bug jail" and banned from writing new code. The goal, he says, was to get engineers to "do it right the first time."
All to good effect: fewer defects and shorter cycle time.
On July 27, Microsoft shipped the beta of Longhorn — now named Windows Vista — to 500,000 customers for testing. Experience had told the Windows team to expect tens of thousands of reported problems from customers. Instead, there were a couple thousand problem reports, says Mr. Rana, the team member.
And last month, Microsoft delivered a test version of Mr. Gates's WinFS idea — not as a part of Longhorn but as a planned add-on feature. Microsoft this month said it would issue monthly test versions of Windows Vista, a first for the company and a sign of the group's improved agility.
The last word is significant, since automated testing is important to Agile Methods. The other major improvements are either decent SCM (eg, a merge plan) or management (if you write bugs, don't write new code).