Great Article On Testability and Class Design, Part II


This is a follow-up to the previous article that I posted on class design – a good read!

As an aside, one of the few things that I disagree with in either of the articles is the wrapping of collection classes within separate classes solely for encapsulation. As much as I like single-use classes, I just don’t think that encapsulating a collection in its own class just for access control is necessary. There are other means of achieving this. If this happens to be a concern, then more is probably going on with the collection than should be.

Making Large Classes Small
(In 5 Not-So-Easy Steps)


In my previous editorial, I discussed the benefits and other effects on code bases of using small classes, which I defined using a limit of 50-60 lines. I received many letters on this article, most analogizing it to edicts of olden times to keep functions short enough to fit on one pane of glass. (The letters will appear in the upcoming issue of Dr. Dobb’s Journal, our PDF magazine, which ships on June 19.)
Much as I appreciate those letters, they miss a crucial point. I was not discussing a single function, but rather an entire class, which implies multiple functions in most cases. Coding classes as diminutive as 60 lines struck other correspondents as simply too much of a constraint and not worth the effort.
But it’s precisely the discipline that this number of lines imposes that creates the very clarity that’s so desirable in the resulting code. The belief expressed in other letters that this discipline could not be consistently maintained suggests that the standard techniques for keeping classes small are not as widely known as I would have expected. (Given that the original editorial was inspired by an extended effort to clean up a project that contains much of my own code, I say this with all due humility.)
Let’s go over the principal techniques. I presume in this discussion that design has been done and it’s now just a matter of writing the code. Or in the less attractive case, of maintaining code.
Diminish the workload. The first technique to apply is the single responsibility principle (SRP), which states that classes should do only one thing. How big that one thing is will determine in large part how big your classes are going to be. Reduce the work of each class; then, use other classes to marshal these smaller classes correctly.
Avoid primitive obsession. This obsession refers to the temptation to use collections in their raw form. This is definitely a code smell. If you have a linked list of objects, that linked list should be in its own class, with a descriptive name. Expose only the methods that the other classes need. This prevents other classes from performing operations without your knowledge on an object they don’t own. The purpose of the list is also supremely clear and this encapsulation enables you to change easily to a different data structure if the need should arise later on.
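The linked-list advice above might look like the following in Java. This is a minimal sketch, and every name in it is hypothetical – the point is only that the raw collection gets a descriptive class and a narrow surface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example: instead of passing a raw List<String> around,
// give the collection its own descriptively named class and expose
// only the operations other classes actually need.
class PendingOrders {
    private final List<String> orders = new ArrayList<>();

    void add(String orderId) {
        orders.add(orderId);
    }

    boolean contains(String orderId) {
        return orders.contains(orderId);
    }

    int count() {
        return orders.size();
    }

    // Callers get a read-only copy; they cannot mutate internal state.
    List<String> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(orders));
    }
}
```

Because callers never see the `ArrayList`, swapping it for a linked list or a set later touches only this one class.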
Reduce the number of class and instance variables. A profusion of instance variables is a code smell. It strongly suggests that the class is doing more than one thing. It also makes it very difficult for subsequent developers to figure out what the class does. Very often, some subset of the variables form a natural grouping. Group them into a new class. And move the operations that manipulate them directly into that class.
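As a sketch of that grouping step (all class and field names here are invented for illustration), three address fields that always travel together move out of a bloated class into their own type, taking their formatting logic with them:

```java
// Hypothetical sketch: fields that form a natural grouping become a class,
// and the operations that manipulated them directly move in with them.
class Address {
    private final String street;
    private final String city;
    private final String zip;

    Address(String street, String city, String zip) {
        this.street = street;
        this.city = city;
        this.zip = zip;
    }

    String mailingLabel() {
        return street + ", " + city + " " + zip;
    }
}

class Customer {
    private final String name;
    private final Address address; // one field where there were three

    Customer(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    String shippingLabel() {
        return name + " / " + address.mailingLabel();
    }
}
```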
Subclass special-case logic. If you have a class that includes rarely used logic, determine whether that logic can be moved to a subclass or even to another class entirely. The classic example of the benefits of object orientation is polymorphism. Use it to handle special variants.
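A minimal sketch of what that might look like (the invoice domain and all names are hypothetical): the rarely used discount logic moves into a subclass, so the common path stays small and the variant is dispatched polymorphically.

```java
// Hypothetical sketch: the common case stays in the base class;
// the special case lives in a subclass and is selected polymorphically.
class Invoice {
    protected final double subtotal;

    Invoice(double subtotal) {
        this.subtotal = subtotal;
    }

    double total() {
        return subtotal; // common case: no special handling
    }
}

class DiscountedInvoice extends Invoice {
    private final double rate;

    DiscountedInvoice(double subtotal, double rate) {
        super(subtotal);
        this.rate = rate;
    }

    @Override
    double total() {
        return subtotal * (1.0 - rate); // the rare variant, isolated here
    }
}
```

Code that holds an `Invoice` reference never needs an if-branch for the discount case.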
Don’t repeat yourself (DRY). This suggestion appears pointlessly obvious. However, even coders who are attentive to this rule will repeat code in two methods that differ only in a single detail. In addition, they can overlook the introduction of duplicate code during maintenance. More than the other guidelines here, which are all techniques, DRY is a discipline within a discipline.
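The "two methods differing in a single detail" case can often be collapsed by passing the differing detail in as a parameter. A hypothetical sketch (names invented for illustration):

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: countShortNames() and countLongNames() once
// duplicated the same traversal; the single varying detail (the
// condition) is now a parameter, so the loop exists exactly once.
class NameFilter {
    static long count(List<String> names, Predicate<String> condition) {
        return names.stream().filter(condition).count();
    }
}
```

Callers then write `NameFilter.count(names, n -> n.length() <= 3)` and `NameFilter.count(names, n -> n.length() > 3)`, and any later fix to the traversal happens in one place.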
Taken together, these tools get you most of the way to small classes. To see how they are implemented in real life, I suggest a book I’ve mentioned many times before, namely Martin Fowler’s Refactoring, which is essentially a cookbook of techniques for cleaning up code, including the step-by-step process, and the illustrative code.
Returning to my own experience, I am finding that as I insist on this particular discipline in my code rework, my brain is slowly developing a “muscle memory” and is beginning to think automatically about class size prior to class development — and certainly during the cleanup of existing code. Cheers!
— Andrew Binstock
Editor in Chief
alb@drdobbs.com
Twitter: platypusguy


Great Article On Testability and Class Design


I got this in an email update from Dr. Dobb’s, but unfortunately I couldn’t find it on the web site to link to – very odd. So I’m reproducing it here, not to take credit for it but to herald it. I don’t quite agree with the line counting, but the principle is sound.

In Praise of Small Classes

If you’ve been doing OO programming for a while, you’ve surely run into the seemingly endless essays on testability. This debate focuses on how to write code to make it more amenable to automated testing. It’s a vein that is particularly intriguing to exponents of test-driven development (TDD), who argue that if you write tests first, as in the orthodox approach to TDD, your code will be inherently testable.

In real life, however, this is not always how it happens. TDD developers frequently shift to the standard code-before-tests approach when hacking away at a complex problem or one in which testability is not easily attained. They then write tests after the fact to exercise the code, then modify the code to increase code coverage. There are good reasons why code can be hard to test, even for the most disciplined developers. A simple example is testing private methods; a more complex one is handling singletons. These are issues at the unit testing level. At higher levels, such as UAT, a host of tools help provide testability. Those products, however, tend to focus principally on the GUI aspect (and startlingly few of those handle GUIs on mobile devices). In other areas, such as document creation, there is no software that provides automated UAT-level validation, because parsing and recreating the content of a report or output document is often an insuperable task.
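The singleton problem mentioned above deserves a concrete illustration. A common remedy (not something the editorial itself prescribes, and with hypothetical names) is to depend on an interface and inject it, so a test can substitute a fake where production code would reach for the hard-wired global instance:

```java
// Hypothetical sketch: code that called SystemClock.getInstance() directly
// would be untestable, because a unit test cannot control the singleton.
// Accepting any Clock lets a test inject a fixed, controllable one.
interface Clock {
    long now();
}

class SessionTimer {
    private final Clock clock;
    private final long startedAt;

    SessionTimer(Clock clock) {
        this.clock = clock;
        this.startedAt = clock.now();
    }

    long elapsedMillis() {
        return clock.now() - startedAt;
    }
}
```

In production the caller passes an adapter over the real singleton; in a test, a lambda returning a canned value suffices.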

I don’t want to get off my main point, however, which is that what makes code untestable is frequently not anything I’ve touched on so far, but rather excessive complexity. A high level of complexity, generally measured with the suboptimal cyclomatic complexity measure (CCR), is what the agile folks correctly term a “code smell.” Intricate code doesn’t smell right. According to numerous studies, it generally contains a higher number of defects and it’s hard — sometimes impossible — to maintain. Fortunately, there are many techniques available to the modern programmer to reduce complexity. One could argue that Martin Fowler’s masterpiece, Refactoring, is almost entirely dedicated to this topic. (Michael Feathers’ Working Effectively With Legacy Code is the equivalent tome for the poor schlemiels who are handed a high-CCR codebase and told to fix it.)

My question, though, is how to avoid creating complexity in the first place? This topic too has been richly mined by agile trainers, who offer the same basic advice: Follow the Open-Closed principle, obey the Hollywood principle, use the full panoply of design patterns, and so on. All of this is good advice; but ultimately, it doesn’t cut it. When you’re deep into a problem such as parsing text or writing business logic for a process that calls on many feeder processes, you don’t think about Liskov Substitution or the Open-Closed principle. Typically, you write the code that works and you change it minimally once it passes the essential tests. In other words, as you’re writing the code there is little to tell you, “Whoa! You’re doing it wrong.”

For that, you need another measure, one which I’ve found to be extraordinarily effective in reducing initial complexity and greatly expanding testability: class size. Small classes are much easier to understand and to test.

If small size is an objective, then the immediate next question is, “How small?” Jeff Bay, who contributed a brilliant essay entitled “Object Calisthenics” (in the book The Thoughtworks Anthology) that touches on this topic, suggests the number should be in the 50-60 line range. Essentially, what fits on one screen.

Most developers, endowed as we are with the belief that our craft does not and should not be constrained to hard numerical limits, will scoff at this number (or at any number of lines) and will surely conjure up an example that is irreducible to such a small size. Let them enjoy their big classes. But I suspect they are wrong about the irreducibility.

I have lately been doing a complete rewrite of some major packages in a project I contribute to. These are packages that were written in part by a contributor whose style I never got the hang of. Now that he’s moved on, I want to understand what he wrote and convert it to a development style that looks familiar to me and is more or less consistent with the rest of the project. Since I was dealing with lots of large classes, I decided this would be a good time to hew closely to Bay’s guideline. At first, predictably, it felt like a silly straitjacket. But I persevered, and things began to change under my feet. Here is what was different:

Big classes became collections of small classes. I began to group these classes in a natural way at the package level. My packages became a lot “bushier.” I also found that I spent more time in managing the package tree, but this grouping feels more natural. Previously, packages were broken up at a rough level that dealt with major program components and they were rarely more than two or three levels deep. Now, their structure is deeper and wider and is a useful roadmap to the project.

Testability jumped dramatically. By breaking down complex classes into their organic parts and then reducing those parts to the minimum number of lines, each class did one small thing that I could test. The top level class, which replaced its complex forebear, became a sort of main line that simply coordinated the actions of multiple subordinate classes. This top class generally was best tested at the UAT level, rather than with unit tests.

The single-responsibility principle (SRP), which states that each class should do only one thing, became the natural result of the new code, rather than a maxim that needed to be applied consciously.

And finally, I have enjoyed an advantage foretold by Bay in his essay: I can see the entire class in the IDE without having to scroll. Dropping in to look at something is now quick. If I use the IDE to search, the correct hit is easy to locate, because the package structure leads me directly to the right class. In sum, everything is more readable; and on a conceptual level, everything is more manageable.

— Andrew Binstock
Editor in Chief
alb@drdobbs.com

Twitter: platypusguy


Pains with Fuse / ServiceMix


Ok, so I’m just starting out with ESBs – crazy, I know, for as long as I’ve been at Java development (over 10 years). To be totally honest, I was never really able to wrap my head around exactly what they can do for you – the definitions are so ambiguous that it really just sounded like a marketing ploy.

Anyway, the light has kind of come on for me, and I decided to delve into some open source ESB stacks like ServiceMix, OpenESB, and Fuse. Needless to say, there were quite a few new concepts, as well as technical hurdles, that I needed to overcome in order to get a simple test running. All in all, about three days’ work. “Three days!”, you say to yourself, “This guy’s gotta be a nimrod.” That wasn’t wholly the case (well, maybe partially). If you know how I work and digest things, you’d know that just getting something to work simply isn’t enough. I actually like to know how things are working internally. Why does this do that, what makes that kick off, what exactly are MEP, JBI, and DSL? What’s the NMR and how does it work – that kind of stuff.

I settled on Fuse, which is really an enterprise-supported edition of ServiceMix, because it seemed to adhere to all the standards I wanted to follow as well as providing what appeared to be pretty decent documentation – at least as decent as open source documentation goes. I dug in and did my research on the different technologies involved and their interactions. As it turns out, the documentation was spotty and a few versions behind (gasp!) – so there I was, off and digging.

As it turns out, they have nifty maven archetypes for just about anything that you want to create. Simple SU and SA assemblies for ServiceMix / Fuse as well as other types of artifacts. Seems pretty easy and well put together… until the above bit me. Everything is out of date – from the documentation to the archetypes, which makes them not so handy for a variety of reasons.

Although the archetypes are moderately helpful for creating the basic project structure and the poms, they were completely useless in actually building a working project. The plugin tooling was the wrong version, the dependencies for the binding components were incorrect, and even worse – there is nowhere, and I mean nowhere, that the correct versions of all the dependencies are listed in order to fix the issue. Not outside of trial and error anyway – which is just not practical considering the sheer number of BEs and BCs that exist.

So I did some snooping around the examples and the dark recesses of the distribution. As it turns out, they did a nice job of pom inheritance in their examples, and this is what led me to a resolution. If you follow the pom inheritance chain up to its root from one of the examples, you come to the master pom. In it – and thank you so much, SMX/Fuse contributors – is a dependencyManagement section that lists all of the possible dependencies for the BCs and BEs!

Now that I had figured out the correct versions of everything, the only thing left to do was to create a master pom for my purposes that included that dependencyManagement section. Why didn’t I just use theirs as my master pom? There was way too much SMX/Fuse-specific build process in it, and it was just easier to strip out what I needed rather than molding my process to fit theirs.
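For readers who haven’t used dependencyManagement before, the shape of such a stripped-down master pom is roughly as follows. This is a hypothetical sketch – the group/artifact IDs and the `X.Y.Z` version are placeholders, not the actual Fuse/ServiceMix coordinates:

```xml
<!-- Hypothetical sketch. The dependencyManagement section pins versions
     centrally, so child SU/SA poms can declare a dependency by groupId
     and artifactId alone and inherit the version from here. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>esb-master</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <dependencyManagement>
    <dependencies>
      <!-- One entry per binding component / engine; version is a placeholder -->
      <dependency>
        <groupId>org.apache.servicemix</groupId>
        <artifactId>servicemix-http</artifactId>
        <version>X.Y.Z</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>
```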

So there you have it – simple, huh? I’m now off and building service assemblies sans issues, like a champ.

Hope this helps someone out there save a few hours of research!

Well, “like a champ” apparently was a little premature. More issues have arisen due to some xbean issues, which, again, I’m trying to scour the internet to figure out. Not that I can’t get things to work, just that they aren’t working the way I want them to work…


ClassNotFoundException When Executing JUnit Test Runner In Eclipse With M2Eclipse


Running unit tests in Eclipse with Maven tooling can be a little tricky, as the JUnit plugin for Eclipse expects to find all of the compiled classes under target/classes.

This would be fine, except that the JUnit plugin simply tells Eclipse to build, which invokes the Maven builder (contributed by the m2eclipse plugin). That builder only goes through Maven’s compile phase – meaning it compiles only regular source files, not test files.

This is kind of a miss by the folks who wrote the m2eclipse plugin as they should have added an additional builder for Eclipse that would kick off the compiler:testCompile goal which does compile the test files – but alas, it’s not my plugin.

Anyway, for the JUnit runner to work correctly in Eclipse we need to add an additional builder to each project that we want to run tests. This builder will invoke maven in a separate step of Eclipse’s build process to compile the test classes before JUnit executes.

Follow these steps:

  • Select the project to configure and press Alt+Enter to open its properties.
  • Select Builders
  • Click New to bring up the configuration type dialog.
  • Select Program and then click Ok

This brings us to the launch configuration properties dialog which we need to fill out to create the launch configuration.

  • Enter Maven Test Compiler for the Name property
  • Location refers to the executable that you’d like to run. This needs to point to your mvn.bat file
  • In Working Directory, click on the Variables… button and select project_loc and then click Ok
  • Under Arguments enter – without the quotes – “compiler:testCompile”
  • Go to the Build Options tab, and make sure that the check boxes under Run the builder are selected for all except During a “Clean” and then click Ok

This should take you back to the project’s properties. You can now select Ok and happily enjoy running the eclipse JUnit test runner!


Current Projects – Part II: Where the hell is that class? Or, The Jar Indexer


First, a little background for the developer novice:

As any (Java) developer knows there are a ton of external class dependencies when you are developing an application. This is done via “import” statements (or fully qualified class names) in the code itself. Every language has this type of mechanism in one form or another. It’s a very simple and straightforward process.

Problems soon follow, however, as the compiling application needs to be able to find these resources to successfully build the software. Each language has its own dependency packaging scheme and also a means of finding these packages. Java uses .jar files, which are really nothing more than zip files with a specific layout. These jar files need to be included on the application’s “classpath” (a special variable that Java looks to in order to find its compile-time and run-time dependencies) in order to compile and run.

Ok, now that the primer is out of the way we’ll get down to the issue at hand.

I actually spend a lot of my time going from project to project converting them to use the Maven build process. This entails finding and declaring every top-level dependency that is required at test, compile, and run time. Keep in mind that enterprise-level applications have dozens of dependencies. As you can imagine this is a daunting and very boring task.

Why so daunting, you ask? To be honest, it’s a basic flaw of the Java dependency packaging scheme. The class is the basic unit of Java, and it is on classes that software depends. Classes, to maintain a unique namespace, are housed in packages – which are no more than a directory structure inside the jar file itself (e.g. com.mgensystems.TestClass). That is all fine and good, but Java doesn’t force any sort of standard on the naming of jar files by what they contain. A single jar can contain any number of package and class combinations. Now you can start to see my problem – not being able to locate the jar that contains a particular package / class combination…
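The core of the problem can be shown with a few lines of standard-library Java: the jar’s file name tells you nothing, so the only reliable way to know what’s inside is to open it and turn its `.class` entries back into package-qualified names. This is a simplified sketch, not the Jar Indexer’s actual code; `demoJar()` just fabricates a throwaway jar for demonstration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

class JarScanner {
    // List the fully qualified class names contained in a jar,
    // e.g. the entry "com/mgensystems/TestClass.class" becomes
    // "com.mgensystems.TestClass".
    static List<String> classesIn(Path jar) {
        List<String> names = new ArrayList<>();
        try (JarFile jf = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jf.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(name.substring(0, name.length() - ".class".length())
                                  .replace('/', '.'));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Builds a throwaway jar containing one empty class entry, so the
    // scanner has something to demonstrate on.
    static Path demoJar() {
        try {
            Path jar = Files.createTempFile("demo", ".jar");
            try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
                out.putNextEntry(new JarEntry("com/mgensystems/TestClass.class"));
                out.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Run over every jar on a machine, output like this is exactly the raw material an index needs.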

Well, being lazy at heart, I decided to come up with a solution that would allow me to find these pesky dependencies with a lot less effort. What I wanted was some way to record every jar file on the filesystem and have every package and class they contain indexed, so that I could find a specific jar file when looking for a particular class. What I came up with is the very appropriately named Jar Indexer. And yes – I do this a lot. Some folks would call it lazy – I call it being ingenious and efficient 🙂

The Jar Indexer is, at its core, very simple. It utilizes an open source indexing engine from the Apache project called Lucene. The Lucene engine allows you to index whatever tidbits of information you want and provides you the means to search on it – which sounds like something that could be very handy!

Having the most difficult part of the project done for me (thank you open source!) all that I needed to do was a) provide the engine with the information to index, and b) wrap a simple interface around the index and search functions.

To accomplish Part A of the design I chose to rely upon another open source tool from the Jakarta Commons project – commons-io. Commons-IO allows me to specify a root location and have a list of recursively scanned files returned to me based on a filter – how nice is that? I take that output and break open each jar, noting every package and class it contains. I then pass that information into Lucene to index. Of course it was a little more complicated than that (I added a caching mechanism and whatnot for speed plus some other pieces of niftiness), but that was the high level process in a nutshell.
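For readers who haven’t used Commons-IO, the recursive-scan step it performs is straightforward to picture. This is a stdlib-only sketch of the equivalent operation, not the Jar Indexer’s actual implementation (which uses commons-io’s filtered listing):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stdlib equivalent of the Commons-IO step: walk a root
// directory recursively and collect every *.jar file under it.
class JarFinder {
    static List<Path> findJars(Path root) {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each path this yields is then cracked open and its packages and classes fed to Lucene.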

Now for Part B – a simple user interface. I again called upon the open source community and used another Jakarta Commons project – commons-cli. This handy API provides an easy way to create a command-line interface. You specify the arguments and it will parse them. Extremely simple, but it’s a tremendous time saver.

Below is an example of the command-line for indexing using the Jar Indexer:

usage: JarIndexer -p paths -i path
 -i,--index-path   full path to the location of the index
 -p,--jar-paths    comma-separated full paths to directories
                   containing jars to index

and the command-line for searching the index:

usage: JarIndexer [-q queries] [-o format] [-l repo] -i index [-f field]
 -q,--path-to-file    file containing multiple queries
 -o,--result-format   set to ‘m’ for maven dependency
 -l,--repo-location   the base path of the maven repository for
                      building the dep section
 -f,--field-name      sets the default field
 -i,--index-path      full path to the location of the index

When this application is run on the root of the filesystem it will index every jar / package / class on your machine and you can easily find in what jar any particular class is located… very handy indeed…

You can download the Jar Indexer from sourceforge.net here.

What’s next for Jar Indexer? More information indexed, more output formats, and a fancy GUI… I’ll keep you posted. Also, any ideas would be fantastic as well.

I hope maybe this tool will be useful to folks. But more importantly it was the first step in my goal of automating the Maven set up for new and existing projects. Be sure to read my next blog on the Maven Source Dependency Scanner!
