Automating Commercial Off The Shelf Software Deployment

I currently work for a large enterprise. Our little corner of this company has been very successful in implementing and rolling out build and deployment automation across all of the applications that we work with. Much of that success stems from the fact that a lot of our applications are fairly simple JVM based apps with an embedded container. The DevOps team that I work with has created a simple pattern for building, packaging and deploying these apps, which we then apply to future projects of the same nature. When we find improvements to the existing method, we work with the teams still on the legacy pattern and apply the change to their process as well. One of the unexpected outcomes of this success is that teams who are currently not doing any automation are now approaching us and asking us to work with them on automating their build, packaging and deployment process.

These requests always put us in a bit of a bind. We have been successful due to the nature of the applications we are working with. JVM based apps invariably are simple to package up and deploy – especially when there is an embedded container and the application is therefore completely self-sufficient. When a new team comes looking for our help, overwhelmingly they are developing on a proprietary platform that does not lend itself to automation as well as a custom solution does. In this first post, I will discuss the strategy that we take when we approach these teams and how we know from day one if we are going to be successful. In the next post I will review one of the successes we have had in working with these teams – automating the packaging and deployment to an IBM DataPower instance.

This post is going to be relatively short and sweet. If you are trying to reliably automate a Commercial Off The Shelf (COTS) product, you need either an API or a command line based solution that allows you to do the following (a rough sketch of how these capabilities map onto a deployment script follows the list):

  • Stop/Start/Restart/Reload the running processes
  • Remove and clean the currently deployed application
  • Install the new application
  • Update environment specific information
  • Load any application specific data packages or database migrations
  • Verify the status of the application post deployment
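
For illustration only, here is a bare-bones sketch of what that can look like once such an interface exists. Everything here is hypothetical – 'appctl' stands in for whatever API or command line tool the vendor actually ships – the point is simply that each capability in the list maps onto a scriptable call:

# Hypothetical rake task - 'appctl' is a stand-in for the vendor's real CLI/API;
# each step below corresponds to one capability in the list above.
desc "deploy a new version of the COTS application"
task :deploy, [:package, :environment] do |task, args|
  sh "appctl stop"                                                          # stop the running processes
  sh "appctl undeploy --clean"                                              # remove and clean the current application
  sh "appctl deploy #{args[:package]}"                                      # install the new application
  sh "appctl configure --env-file config/#{args[:environment]}.properties"  # update environment specific information
  sh "appctl load-data data/#{args[:environment]}"                          # load data packages / migrations
  sh "appctl start"
  sh "appctl status --verify"                                               # verify the application post deployment
end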

If you don’t have an API or command line that provides this functionality, you are not completely out of luck BUT, in my experience, any solution that includes any of the following workarounds:

  • Driving the GUI via a tool like Sikuli
  • Capturing and templating HTTP calls (POSTS/PUTS) to a web based admin console
  • Reverse engineering the application deployment in the DB or filesystem of an application

will require a full-time team to maintain and fix, as it will break frequently, will require a lot of rework when a new version of the COTS product is deployed, and will not account for edge cases. In other words, it is doable if the business insists, and it will even provide some benefit, but don’t expect that you can ever walk away from it without worrying about it breaking. We only agree to work with teams whose tool provides an API or command line based interface that fulfills our criteria. We are not in the business of creating a solution that can’t be maintained by the development team that owns the product, so any solution that uses a custom set of tools to do the automation isn’t supportable.

My next post will focus on the work we did automating the IBM DataPower deployment for 2 teams that we work with. This is a product that does provide the necessary API driven interface to manage deployments remotely, but it still required a custom framework to be built to support Continuous Delivery for the teams using the platform.

Testing puppet manifests part 1 – Local Compilation

Testing puppet manifests

The pipeline approach we use to move our infrastructure changes from one environment to the next gives us some visibility into what will happen in an upstream environment. Still, it would be nice to get some quick feedback on potential issues with our puppet codebase before we even apply the changes. We have come up with 2 mechanisms that provide us with very fast feedback and some assurance that our changes won’t immediately break the first upstream environment. The first, covered in this blog post, is local node compilation.

Node manifest compilation

In the same way that a developer compiles their code locally prior to checking in, the node manifest compilation step is a verification step that runs through each and every node we have defined in our puppet manifests and compiles the puppet code. This catches errors such as:

  • Syntax errors
  • Missing resource errors – e.g. a file source is defined but not checked in
  • Missing variable errors for templates

The code to do this is pretty simple:

  1. Configure Puppet with the manifest file location (nodes.pp) and the module directory path
  2. Use the puppet parser to evaluate the manifest file and find all available nodes for compilation
  3. For each node found, create a Puppet node object and then call compile on it
  4. Compile all nodes, fail only at end of run if any nodes fail to compile and provide all failed nodes in output
require 'rubygems'
require 'puppet'
require 'colored'
require 'rake/clean'

desc "verifies correctness of node syntax"
task :verify_nodes, [:manifest_path, :module_path, :nodename_filter] do |task, args|
  fail "manifest_path must be specified" unless args[:manifest_path]
  fail "module_path must be specified" unless args[:module_path]

  setup_puppet args[:manifest_path], args[:module_path]
  nodes = collect_puppet_nodes args[:nodename_filter]
  failed_nodes = {}
  puts "Found: #{nodes.length} nodes to evaluate".cyan
  nodes.each do |nodename|
    print "Verifying node #{nodename}: ".cyan
    begin
      compile_catalog(nodename)
      puts "[ok]".green
    rescue => error
      puts "[FAILED] - #{error.message}".red
      failed_nodes[nodename] = error.message
    end
  end
  puts "The following nodes failed to compile => #{print_hash failed_nodes}".red unless failed_nodes.empty?
  raise "[Compilation Failure] at least one node failed to compile" unless failed_nodes.empty?
end

def print_hash nodes
  nodes.inject("\n") { |printed_hash, (key,value)| printed_hash << "\t #{key} => #{value} \n" }
end

def compile_catalog(nodename)
  node = Puppet::Node.new(nodename)
  node.merge('architecture' => 'x86_64',
             'ipaddress' => '127.0.0.1',
             'hostname' => nodename,
             'fqdn' => "#{nodename}.localdomain",
             'operatingsystem' => 'redhat',
             'local_run' => 'true',
             'disable_asserts' => 'true')
  Puppet::Parser::Compiler.compile(node)
end

def collect_puppet_nodes(filter = ".*")
  parser = Puppet::Parser::Parser.new("environment")
  nodes = parser.environment.known_resource_types.nodes.keys
  nodes.select { |node| node =~ /#{filter}/ }
end

def setup_puppet manifest_path, module_path
  Puppet.settings.handlearg("--config", ".")
  Puppet.settings.handlearg("--manifest", manifest_path)
  Puppet.settings.handlearg("--modulepath", module_path)
  Puppet.parse_config
end

Code available here: https://github.com/oldNoakes/puppetTesting

Note that in our production code, we break the nodes up into subsets and fork a process to compile each subset. Currently we run 20 parallel processes for over 400 nodes – this typically takes about 45 seconds on a fast machine (i.e. our build server) and up to 120 seconds on a slower one (i.e. the worst developer workstation that we have).
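
For reference, here is a rough sketch (not our production code) of what that forking can look like, reusing the compile_catalog method from the listing above:

# Split the node list into subsets and compile each subset in a forked child
# process; the child's exit status tells the parent whether its subset compiled.
def compile_nodes_in_parallel(nodes, process_count = 20)
  slice_size = [(nodes.length.to_f / process_count).ceil, 1].max
  pids = nodes.each_slice(slice_size).map do |subset|
    fork do
      failed = subset.select do |nodename|
        begin
          compile_catalog(nodename)
          false
        rescue => error
          puts "[FAILED] #{nodename} - #{error.message}"
          true
        end
      end
      exit!(failed.empty?)
    end
  end
  results = pids.map { |pid| Process.wait2(pid).last }
  raise "[Compilation Failure] at least one node failed to compile" unless results.all?(&:success?)
end

A production version would probably also pipe the failed node names back to the parent for reporting rather than relying on exit codes alone, but the skeleton is the same.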

Exposing facter facts via mcollective YAML plugin

At my current client, we use MCollective to support the deployment of code, configuration and test data amongst a large number of potential nodes. In order to ensure that we target the correct machines to run these tasks, we rely on the following:

  • A set of values in the /etc/mcollective/facts.yaml file that are application and node specific (i.e. deployment environment)
  • An additional set of custom facts that are deployed into the /var/lib/puppet/facts directory
  • The default set of facts made available by facter

To expose these to our mcollective server, we started off using the FactsFacter plugin along with a custom fact that read the contents of the /etc/mcollective/facts.yaml file. This has worked, but we have noticed that the time taken to instantiate the facts on a given node can be quite lengthy. This can impact our configuration deployment (which happens via an mcollective puppet agent) because the time required to get the facts established (along with other issues) causes the agent call to time out.

In order to improve the speed of facts collection by mcollective, we decided to return to using the YAML plugin – we just had to find a way to expose all of the facts that we relied upon previously into a YAML file and then make that YAML file available alongside the /etc/mcollective/facts.yaml that already existed.

Our solution is a cron job that reads all the facts from facter as well as our custom facts and writes them into a secondary yaml file in the /etc/mcollective directory.

Here is the script (facter_to_yaml.rb) that generates the yaml files on each of the nodes:

#!/usr/bin/ruby
require 'facter'
require 'yaml'
rejected_facts = ["sshdsakey", "sshrsakey"]
custom_facts_location = "/var/lib/puppet/facts"
outputfile = "/etc/mcollective/facter_generated.yaml"

Facter.search(custom_facts_location)
facts = Facter.to_hash.reject { |k,v| rejected_facts.include? k }
File.open(outputfile, "w") { |fh| fh.write(facts.to_yaml) }

We then deploy this script and use it in a cron job configured via puppet:

  file { "/usr/local/bin/facter_to_yaml.rb":
    source  => "puppet://puppet/modules/mcollective/usr/local/bin/facter_to_yaml.rb",
    owner   => root,
    group   => root,
    mode    => 0700,
  }

  cron { "factertoyaml":
    command => "/usr/local/bin/facter_to_yaml.rb",
    user    => root,
    minute  => [13, 43],
    require => File["/usr/local/bin/facter_to_yaml.rb"],
  }

Finally, we configure our mcollective server.cfg to use the newly generated file (snippet only below):

  # facts
  factsource = yaml
  plugin.yaml = /etc/mcollective/facter_generated.yaml:/etc/mcollective/facts.yaml

ISSUES

  • Something to note about the order of the YAML files listed in the plugin.yaml config option – the order matters. The values in the second yaml file take precedence over the first – therefore, if you are overriding any of the default facts (or any of your custom facts) in the facts.yaml file, it must be listed second. Not an issue in our case but something to keep in mind.
  • The other issue with this approach is that if we create a new custom fact, update an existing one, or something changes on a node that affects one of the default fact values, the change will not be picked up by our mcollective configuration until the next time the cron job runs. This could cause us issues in the future, so it is likely that we will also create an mcollective agent that can call the ‘facter_to_yaml.rb’ script outside of the regular cron times, giving us the option of running it on an as-needed basis (a rough sketch of such an agent is below).
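
We have not built that agent yet, but a minimal sketch of what it might look like as a SimpleRPC agent (the agent name, action name and paths here are assumptions, not working code) is:

# Hypothetical SimpleRPC agent that regenerates the facts YAML on demand.
# It would live in the mcollective agent directory and need a matching DDL file.
module MCollective
  module Agent
    class Facter_to_yaml < RPC::Agent
      metadata :name        => "facter_to_yaml",
               :description => "Regenerates /etc/mcollective/facter_generated.yaml",
               :author      => "us",
               :license     => "Apache 2.0",
               :version     => "0.1",
               :url         => "http://example.com",
               :timeout     => 120

      action "regenerate" do
        # run() is the SimpleRPC helper for shelling out; it returns the exit code
        reply[:exitcode] = run("/usr/local/bin/facter_to_yaml.rb",
                               :stdout => :output,
                               :stderr => :error,
                               :chomp  => true)
        reply.fail! "facter_to_yaml.rb failed" unless reply[:exitcode] == 0
      end
    end
  end
end

From the command line that would then be callable with something like ‘mco rpc facter_to_yaml regenerate’ against whichever nodes need refreshing.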

ALTERNATIVES
Alternatives to our approach – if you are simply looking to expose certain facter facts to mcollective then you should consider the approach detailed on the mcollective-plugin wiki: FactsFacterYAML

Environment based DevOps Deployment using Puppet and Mcollective

One of the challenges that we ran into at my current project was how to treat the deployment of our puppet configuration in the same way as we treat the deployment of our applications – i.e. push it to ‘test’ environments to verify the changes prior to pushing the changes to the production environment. We needed a way to validate that changes in the puppet code would produce the expected results when applied to the production environment without actually pushing them there.

Our solution to this was to set up 5 different puppet environments, each representing one of the environments into which code gets deployed. We then used a combination of puppet, mcollective and mercurial to promote changes between environments. With appropriate tests in each environment, we were able to validate that the infrastructure changes we had made were ready to be promoted up the ladder.

Technical Setup

We configured our machines into separate collectives that represent the deployment environment in which they live. Each of these collectives has a corresponding puppet environment, so that when its machines execute a puppet run they pull their infrastructure code from that environment’s codebase. A successful application of the infrastructure code to the previous environment triggers an update of the environment codebase to the same mercurial revision via our continuous deployment server.

Our puppetmaster config (in /etc/puppet/puppet.conf) looks as follows:

manifest = /usr/share/puppet-recipes/$environment/puppet/manifests/site.pp
modulepath = /usr/share/puppet-recipes/$environment/puppet/modules

Our puppet application is triggered via an mcollective agent running the following command:

/usr/sbin/puppetd --environment=${collective} --onetime --no-daemonize --verbose

Execution Setup

The puppet environments we have configured are:

  1. NOOP
  2. CI
  3. DevTest
  4. UAT
  5. Production

Each of these environments corresponds to a different stage in our continuous deployment server. The first stage is the most interesting as it has the majority of the tests in place to catch issues with our puppet manifests. The NOOP run does the following:

  1. Pulls the latest checkin into the NOOP puppet environment codebase
  2. Compiles the catalogs for each of our nodes using the NOOP codebase – this catches the majority of typo errors, missing dependencies, forgotten variables for templates and missing files.
  3. Runs a puppet NO-OP run against all nodes – this catches most of the remaining logical and cyclical dependency errors that can be introduced by a puppet module change.
  4. The puppet NO-OP run also produces an output report that provides us with the visibility to understand what changes are going to be applied to each environment with the latest codebase – this is very useful for auditing and tracing purposes
  5. If the NO-OP run completes without any errors, the mercurial revision of the last checkin is exposed via our continuous deployment server

The following four stages all do the same thing (a rough sketch of a single stage follows the list):

  1. Grab the mercurial revision exposed by the last successful run of the previous stage and update the appropriate environment codebase to that revision
  2. Trigger a puppet apply run for all the machines in that collective – capture and analyse the output to verify no warnings or errors
  3. If the run completes without any errors, expose the mercurial revision that was just applied out via the continuous deployment server
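
As an illustration only – the paths, task name and mco invocation below are assumptions rather than our actual pipeline code – a single promotion stage boils down to something like:

# Hypothetical rake task for one promotion stage: pin the environment's checkout
# to the revision that passed the previous stage, then kick off puppet on every
# node in that collective via mcollective.
desc "promote a known-good mercurial revision into a puppet environment"
task :promote, [:environment, :revision] do |task, args|
  env_dir = "/usr/share/puppet-recipes/#{args[:environment]}"

  # update the environment codebase to the exact revision that passed upstream
  sh "hg pull --cwd #{env_dir}"
  sh "hg update --cwd #{env_dir} --rev #{args[:revision]}"

  # trigger the puppet agent on all machines in the matching collective;
  # 'puppetd'/'runonce' assume the stock mcollective puppetd plugin - in our
  # setup a custom agent runs the puppetd command shown earlier. Capturing and
  # analysing the puppet output (step 2 above) is omitted from this sketch.
  sh "mco rpc puppetd runonce -T #{args[:environment]}"
end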

Because each of our deployment environments is set up similarly to the environments above it, this arrangement gives us the opportunity to verify that changes applied to a server are going to work in upstream environments. The primary difference between a CI environment and a production environment in our case is that the latter will have more servers (of the same type) and may offload some work to a dedicated server instead of hosting it on the same box as the application (i.e. a db server running alongside an application server in CI and DevTest vs. an independent db server in UAT and Production).

This setup isn’t perfect – in particular, running the puppet NOOP stage while another stage is running can cause issues, as puppet will fail if it detects another puppet run already in progress – but it provides us with a reasonable amount of certainty that the changes we have made are correct and will not break any of the systems in later environments.

Useful shell commands

As with all devs on a linux project, I seem to be spending a great deal of time figuring out how to do things on the UNIX command line that I know will come in handy again in the future. As such, I will contribute yet another command line post…

SSH TUNNEL:

ssh -f user@remote.machine -L localport:remote.machine:remoteport -N

Forwards anything sent to localhost:localport through the SSH tunnel to remote.machine:remoteport.

Delete files older than X:

find /path/to/dir -type f -name "*foo*" -mtime +3 | xargs rm -f

Finds all files in /path/to/dir whose names match *foo* and are older than 3 days, and deletes them. Quoting the pattern stops the shell from expanding the wildcard before find sees it.

Search contents of files for string:

find /path/to/dir -name "*foo*" | xargs grep "string to search"

Finds everything in /path/to/dir whose name matches *foo* (again quoted so the shell does not expand it first) and greps each result for the specified string.

Find files that match multiple different names:

find . -type f \( -iname "*.erb" -or -iname "*.rb" -or -iname "*.pp" \)

Finds all files under the current directory that end in .erb, .rb or .pp.

Job Hunting – Code Reviews

At my current company, I am sent at least one code submission to review each week. I am a pretty lenient reviewer – show me some decent code and I’ll get you in for an interview so we can see what you are all about in person. That said, I am getting less and less patient with some of the things that I have seen, so I thought I would put together a blog post to try to correct the most common problems before they end up on my desk. These are simple things that should take a normal developer about 5 minutes to think about and fix.

  1. Input guards – Most applications that are created for code reviews, including all 3 of the options we give to our potential recruits, use some form of input. The method of input is immaterial to the program but one thing that shouldn’t be forgotten is that you need to provide some verification that the input conforms to whatever is expected. Implementing this is simple – Regex and boundary checks should validate any possible inputs – but without it, I can easily find ways to break the implementation you have delivered and I am more than likely to pass on your code.
  2. Unit tests – Is there any developer out there who doesn’t yet know about unit tests? You don’t have to have 100 percent coverage – I’m not looking to see how you tested that a constructor worked – but you should have coverage around the major pieces of business logic in your code. If you don’t (and you are not coming to us straight out of uni), your code goes straight into the bin.
  3. Comments – Comments are only useful in one situation – where it isn’t obvious why things have been done in a specific way. Even here I would rather refactor the code to take away the ambiguity but in some cases, notably where the code you are writing relies on a large amount of legacy code, you are forced to do things in a certain way because of the framework you are working within. Nothing though is worse than comments that only reiterate what the method names already tell me. If the method is called determineTotalCost() and there is a comment such as Determines the total cost of the object, this is a total FAIL.
  4. Unused variables/fields/methods – Just because your IDE provides a way to automatically create both GET and SET methods doesn’t mean you should generate them. If you have no reason to use these methods (i.e. you use constructor injection and only use the values internally), then don’t put them in. Methods that do nothing simply complicate a code base. If it is a legacy method that you are worried someone might want to use later, rely on your Version Control System to provide them with the ability to resurrect it – don’t just keep it around in case. Same goes for variables and fields – if you added them at some point for a specific reason but later changed the code, make sure you go back and clean up. These unused pieces of code are all the more annoying because IDEs provide a way to identify unused code – find this option and make it an ERROR instead of just a WARNING.
  5. Not including all external libraries required to compile – Finally, when you submit your code, please create a package that includes EVERYTHING I will need to run it. Don’t assume that I have that library you have used in your code (i.e. junit). If you have a lib directory for your code, give it to me as well so I don’t have to worry about downloading diverse jars just to get your code to compile and run on my machine. If you think that the required library/technology is too large (ANT for instance), then at least provide the info in a ReadMe with a link to a download page so I don’t have to go hunting.

What I really want to impart with this post is that when you are submitting code to a company for them to assess, you want to make sure that it is SIMPLE. A simple implementation that solves the problem with a simple means for the assessor to open and run the code. A great code submission is not overcomplicated to show off a recruit’s skills – it is a simple implementation that gets the job done and doesn’t aggravate the assessor by forcing them to jump through hoops to get it to run.

Automating clearcase

One of the biggest pain points we have encountered on my current project is the use of ClearCase by the dev team. I won’t go into depth on the time we have wasted wrestling with this most frustrating of source control systems, but I will describe some of the things we have gotten to work that automate the tasks we previously had to do manually.

There are two pieces that we have automated that have made our lives easier:

Checking that the development branch is in sync with the main branch – This was a simple implementation but a fantastic means of notification. Main branch check-ins happen infrequently within the development environment here (1-3 times a month) and email notifications are supposed to be sent out each time but, as with anything, they sometimes don’t get sent or just get lost in the daily email deluge. As such, we set up a simple check that compares our branch to the main branch and reports whether anything on the main branch has not been merged into our branch:

    <taskdef name="cc-findmerge" classname="org.apache.tools.ant.taskdefs.optional.clearcase.CCFindmerge">
        <classpath path="${customAnt.dir}/ccfindmergeanttask.jar" />
        <classpath path="${log4j.dir}/log4j-1.2.14.jar" />
    </taskdef>

    <target name="check-insync-with-clearcase-main" description="Checks that the view is up-to-date with the main branch">
        <cc-findmerge viewpath="${view.dir}" version="/main/LATEST" />
    </target>

Here we are using the CCFindMerge task which compares the viewpath to the version and determines if a merge is required. If it is, it will cause the build to fail. We run this task on a nightly basis as part of a validation and metrics build and it gives us an instant heads up that something new has been checked into the main branch and we need to merge it.

Automating merge from dev branch into integration branch – The branching strategy at the client required that we do all development work in our development branch and then merge those changes into a shared integration branch. Since we are following an Agile methodology (and therefore commit code in small changes that are tested to work), we are basically able to merge into the integration branch on every successful build of our development branch. Unfortunately, merging manually is a long process and was not worth the effort, so we wondered if it would be possible to do an automated code merge on a nightly basis (it is a fairly heavy process, so we don’t want it running on every checkin). Here is the solution we came up with:

  • Update the local copy of the integration branch with the latest codebase
  • Attempt an automated clearcase merge from our dev branch into the local copy of the integration branch
  • Find all files that have been checked out as a result of the automated merge attempt
  • If the automated merge was successful, check in all the files found
  • If the automated merge failed, undo all checkouts and fail the build

Rather than anything too fancy, most of these calls are made using ClearCase’s command line interface – cleartool.exe. To do the auto merge attempt, we run the following ant target, which puts the result of the command into a property called merge.outcome:

    <target name="merge-from-view" description="Attempts to merge 2 views - if merge is non-trivial, fails and undoes checkout" if="view.name">
        <exec executable="cleartool.exe" failonerror="false" resultproperty="merge.outcome">
            <arg line="findmerge" />
            <arg line="${btweb.view.dir}" />
            <arg line="-fversion /main/${view.name}/LATEST" />
            <arg line="-log NUL" />
            <arg line="-c 'Merging latest from ${view.name}'" />
            <arg line="-unreserved" />
            <arg line="-merge" />
        </exec>
    </target>

We then run a target to find all checkouts in the directory and put them into a property called checkout.files:

    <target name="find-checkouts" description="Finds all local checkouts">
        <exec executable="cleartool.exe" failonerror="true" outputproperty="checkout.files">
            <arg line="lscheckout" />
            <arg line="-cview" />
            <arg line="-recurse" />
            <arg line="-fmt '%n;'" />
            <arg line="${btweb.view.dir}" />
        </exec>
        <echo message="${checkout.files}" file="${btweb.view.dir}/mergeResult.log" />
    </target>

Finally, we use the merge.outcome property to determine whether we should check in, or undo the checkout of, all the files in the checkout.files property:

    <target name="decide-merge" depends="decide-merge-outcome, checkin-files, undo-checkouts" />

    <target name="decide-merge-outcome">
        <if>
            <equals arg1="${merge.outcome}" arg2="0" />
            <then />
            <else>
                <property name="merge.failure" value="true"/>
            </else>
        </if>
    </target>
    
    <target name="checkin-files" unless="merge.failure">
        <foreach target="checkin-file" param="file" inheritall="true" list="${checkout.files}" delimiter=";" />
    </target>
    
    <target name="checkin-file">
        <echo message="Checking in changes to ${file}" />
        <cc-checkin viewpath="${file}" comment="ANT - checking in from merge of ${view.name}" nowarn="true" keepcopy="false" identical="true" />
    </target>
        
    <target name="undo-checkouts" if="merge.failure">
        <foreach target="undo-checkout" param="file" inheritall="true" list="${checkout.files}" delimiter=";" />
    </target>

    <target name="undo-checkout">
        <echo message="trying to uncheckout ${file}" />
        <cc-uncheckout viewpath="${file}" keepcopy="false" failonerr="true" />
    </target>

As well, we use the merge.outcome to determine if we should fail the build due to the inability to automatically merge the files:

    <target name="fail-if-merge-failed" if="merge.failure">
        <fail message="Merging of changes from ${view.name} failed, all checkouts were undone - please attempt a manual merge" />
    </target>

The process isn’t perfect and could do with a bit of improvement. It succeeds in automatically merging the changes about 95% of the time, but when it does fail, the undo-checkout command can itself fail, especially when dealing with files that have been newly added or deleted. The other major issue is when the branch is in use while the merge is attempted. If a user has a Reserved Checkout or if the branch is being used by another process (i.e. being checked out or backed up), the automatic merge attempt fails.

Branch Development Fallacies

I have now been on 2 major projects within Australia that have used the branch development philosophy. The idea behind this is:

  1. Project is defined by business
  2. Copy of main branch is created and made available to new Project team
  3. All development for that project is confined to the branch
  4. When new changes are committed to the main branch – maintenance fixes or other projects going live – each branch merges the changes from main into their branch and deals with any conflicts
  5. As the project nears completion, it is assigned a production deployment date alongside one or more other projects
  6. Projects assigned to the same production deployment date merge together into an integration branch which is used to create deployment artefacts for testing that release
  7. Development for each project continues in isolated branches – code is merged into integration branch when teams ready to commit
  8. Once project goes live, the integration branch then gets merged back into the Main branch and all other outstanding development branches pull the changes into their codebases

In the ideal situation as outlined above, it makes perfect sense to the business. They argue that this type of development reduces risk by:

  • Making it simple to cancel a project without having to worry about removing any completed code from the codebase
  • Keeping the main branch stable at all times
  • Ensuring that projects that miss their deadlines, or are parked in order to develop higher priority work, do not adversely impact the main branch or the ability to push new releases

There are 2 problems with this type of thinking. The first is that very few long running projects that have cost millions of dollars are ever going to get cancelled. More likely they will get pushed to the next release, but that means all the testing that has taken place against the integrated code base must now be re-executed, likely leading to the other projects in that integration branch also missing the deploy window. The second is that while the approach may decrease business risk, in that projects can be more easily managed, it actually increases technical risk. The problems with keeping each team isolated on its own branch until the last moment are:

  1. late integration – Since we wait until almost the end of a project to start merging our code with the other development branches going into Production with us, we have much less time to effectively test the codebase and resolve any merge issues that arise. Regression defects are much more likely to be introduced as functionality that previously worked in one branch in isolation now breaks against the merged codebase. The teams involved then need to get together and hash through code that was potentially written months ago to determine why the functionality broke, who is responsible and how to get it fixed while preserving the changes both teams made.
  2. code divergence – When branches are used for very long projects, each branch will make changes that take it further and further from other branches. Though each will still remain in sync with Main, when one of the projects goes into main, the merge pain encountered by the other projects will be significant because of the large amounts of change between the old main and the new main.
  3. stale branches – If a branch gets parked in favor of other branches and remains so for any significant length of time, not only will the code on it become stale, but so will the knowledge of the developers who created it. After a short amount of time, the work becomes effectively lost as the developed code no longer has an owner and is no longer in sync with the main branch.

I’m not completely against branch development but I think it should take one of 2 flavours:

  1. Discrete Functionality Branches – Branches that only exist for a very short amount of time (1-2 weeks) which are used to implement a discrete bit of functionality that can then be pushed back into the main branch and be deployed if necessary. The only issue with this type of setup is that the business must be reconfigured so that they are able to break down large projects into smaller components that can be deployed independently. This approach lends itself well to continuous deployment since each new piece of functionality can be pushed out to production as soon as it has been effectively tested.
  2. Early Integration – Ideally, a business would identify the projects it wants deployed into Production together as soon as they are approved to begin development. Those teams could then either all share a single branch for development or, alternatively, each have a separate branch but create the integration branch immediately and merge to it as often as possible. Any integration pain would then be encountered immediately rather than at the tail end of a project.

There is no magic bullet here, but I think that project managers have to take a longer view than just considering business risk when they insist on developing functionality in isolation. There is a lot of technical risk that is not taken into account and that will likely end up affecting them.

Java Generics fun

I have played with and used Java generics on various projects, but I found the following chunk of code very useful for fully understanding the possibilities:

    public <TRequest, TResponse> TResponse handleRequest(
            DataExtractRequestHandler<TRequest, TResponse> requestHandler, Class<TResponse> responseType,
            TRequest request,
            String uniqueId) {
        
        long startTime = new Date().getTime();
        String methodName = request.getClass().getSimpleName();
        logMethodEntry(methodName, uniqueId);
 
        TResponse response;
        try {
            response = requestHandler.handle(request);
            setResponseStatus(response, WrapServiceResponseStatus.SUCCESSFUL_REQUEST);
        } catch (WrapServiceException e) {
            response = buildErrorResponse(responseType, e.getErrorCode(), e.getDescription());
        } catch (Throwable e) {
            getLog().error(logPrefix(methodName, uniqueId) + " Unhandled Exception", e);
            response = buildErrorResponse(responseType, WrapServiceResponseStatus.UNHANDLED_EXCEPTION.getCode(), "Unhandled Exception: " + e);
        }
 
        logMethodExit(response, request.getClass().getSimpleName(), request.getClass().getSimpleName(), uniqueId, startTime);
        return response;
}

In this code, we pass in a request handler that itself uses generic type parameters to identify the specific request and response classes that the handler will use. The request handler implements the following interface:

    public interface DataExtractRequestHandler<TRequest, TResponse> {
        TResponse handle(TRequest request) throws Throwable;
    }

So the handleRequest method in the first code block provides a standard way of running the handle method of each of the different types of request handlers that implement the interface in the second block. That way, we don’t have to duplicate all the logging calls in each request handler and we have a single place to do all the status handling (i.e. setting success on the response or dealing with any exceptions thrown by the handle method).

The primary driver behind this implementation was that when each request handler was responsible for dealing with its own exceptions and creating the appropriate response, some of them missed a few of the exceptions thrown and would therefore return an exception via the webservice rather than a response that the consumer could handle.