The Limber Lambda

Cheating at Word Games, Part 2

Posted in Fun by Eric Smith on February 7, 2012

In a previous post, I detailed how to cheat at a popular Zynga game which is a play on the age-old hangman, but with cute balloons (the style of which you can upgrade with purchasable coins).  I covered the part of the game where you need to choose a word that your opponent needs to guess.  I mentioned that, of course, it’s also possible to cheat on the flip-side, that is, in having to guess the word that your opponent has set as a challenge for you. At that stage, I was resorting to grepping through all the words with a regular expression, using the “any” regex token (“.”) for letters that hadn’t yet been guessed.  For example, if the word so far ended in at and consisted of four letters, finding possibilities was a case of:

grep -i '^..at$' /tmp/scowl/final/*

Of course, this isn’t particularly smart, because it will return words that can’t be candidates: they contain letters that I have already guessed incorrectly.  Assuming that I have already guessed e and r, neither of which was correct, a better grep would be:

grep -iE '^[^er]{2}at$' /tmp/scowl/final/*

Enter Ruby

I’m a great fan of knocking Unixy bits together in impressive ways, but the poor man’s word-guesser above needs an upgrade, and since I’m learning Ruby at the moment, I figured it was time to put in the effort to Ruby-fy what is essentially a simple need. In Part 1 of the cheating story I defined a class called EnglishWords which serves to encapsulate all possible words and operations on those words that I may be interested in.  I need to re-use this class for my word-guesser, so it’s time to move it out into its own source file. Using EnglishWords from a script then becomes:

require File.join(File.dirname($0), 'english_words.rb')

$0 is the path to the script being executed and File.dirname gives us the directory part of the path.  This require assumes that english_words.rb resides in the same folder as the top-level script.  Fortunately, I have a bit of background in Perl, from which Ruby borrows many things, so the idioms are familiar.  Ruby also borrows a few things from Unix shell scripting, e.g., $0, $1 etc., and dirname/basename.  How do I get the directory of the currently executing script in Bash?

echo "$(dirname "$0")"
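For comparison, here is a small sketch of the Ruby side of the same idiom. The path below is a made-up stand-in for $0, so the snippet runs anywhere; nothing here comes from the original script.

```ruby
# Hypothetical script path, standing in for $0 so the example is repeatable.
script_path = '/tmp/cheat/guess_word.rb'

dir  = File.dirname(script_path)            # directory part of the path
base = File.basename(script_path)           # file name part of the path
stem = File.basename(script_path, '.rb')    # file name with the extension stripped
lib  = File.join(dir, 'english_words.rb')   # sibling file, as in the require above

puts dir, base, stem, lib
```

On Ruby 1.9.2 and later, require_relative 'english_words' is another way to load a file that sits next to the current source file.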

This is one of the reasons why this language is growing on me so quickly—an example of the principle of least surprise in action (provided, of course, you’re a Unix-head and not from a purely Windows background). I took the grep above and Rubyfied it without much translation at all:

words = EnglishWords.new ARGV.shift
word_so_far = ARGV.shift              # e.g. "..at"
wrong_letters = ARGV.shift || ''      # may be omitted if nothing has been ruled out yet

regexp_so_far = wrong_letters.empty? ?
                Regexp.new("^#{word_so_far}$") :
                Regexp.new("^" + word_so_far.gsub('.', "[^#{wrong_letters}]") + "$")

The last line is the important part: replace each instance of “.” with a negated character class built from the letters already guessed wrongly, and we’re ready to apply the regular expression.  The first three lines are another example of a Unix idiom, namely shift, which “knocks” the first item off the array being acted on and returns it.  Interestingly, shift is an example I’ve come across in Ruby of a deviation from another convention: that of naming “dangerous” methods (that is, methods that alter the state of what’s being acted on) with a bang at the end, e.g., gsub!(..).  I guess in this case, assuming you’re familiar with the Unix shift, the bang is redundant since shift by definition changes state. Searching for (and outputting) candidate words becomes:

words.each do |w|
    puts w if regexp_so_far.match(w)
end

But wait … EnglishWords has no definition for each yet.  Here it is:

def each()
    @words.keys.each { |w| yield w }
end

Notice how each above takes no parameters, but I’m passing it a block (blocks are roughly what C# calls lambdas).  This is another feature of Ruby: the implicit block parameter, which can be passed to any method and invoked with yield. Without too much effort, and a little bit of Ruby, we’re up and running in the word-guessing department.
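To see the pieces working together without the SCOWL files, here is a minimal, self-contained sketch. TinyWordList and candidate_regexp are hypothetical names invented for this example (they are not part of the original script), and the word list is a handful of made-up entries. It also anchors with \A and \z and escapes the wrong letters, which is a slight variation on the script above.

```ruby
# Hypothetical stand-in for EnglishWords, backed by an in-memory list
# instead of the SCOWL files.
class TinyWordList
  include Enumerable   # builds select, map, etc. on top of each

  def initialize(words)
    @words = {}
    words.each { |w| @words[w] = w }
  end

  # Same shape as the each above: hand every word to the caller's block.
  def each
    @words.keys.each { |w| yield w }
  end
end

# Turn a pattern such as "..at" into an anchored regexp in which every "."
# means "any character that has not already been guessed wrongly".
def candidate_regexp(word_so_far, wrong_letters = '')
  if wrong_letters.empty?
    return Regexp.new("\\A#{word_so_far}\\z", Regexp::IGNORECASE)
  end
  excluded = "[^#{Regexp.escape(wrong_letters)}]"
  Regexp.new('\\A' + word_so_far.gsub('.') { excluded } + '\\z', Regexp::IGNORECASE)
end

words = TinyWordList.new(%w[beat boat chat rat seat that treat])
pattern = candidate_regexp('..at', 'er')
candidates = words.select { |w| pattern.match(w) }
puts candidates
```

Including Enumerable is a design choice rather than a requirement: once each exists, it provides select and friends for free, so the filtering loop collapses to one line.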

Cheating at Word Games

Posted in Development, Fun, Ruby by Eric Smith on January 30, 2012

After a gruelling end of 2011, I’ve finally had an opportunity to have some time off.  When you’re head down having to deal with the stress of trying to reach a deadline, other things tend to get neglected.  In my case, “other things” includes staying up-to-date with the technology status quo, whether it be keeping pace with my RSS feed or learning new things.

There has been so much fuss about Ruby that I’ve finally capitulated and decided to learn it.  Learning a new language is nothing to be sniffed at though; it’s a bit like eating an elephant, and there’s only one way to do that: small bite by small bite.

Whenever someone decides to dream up a new language my immediate response is: “why?”.  Why spend all that effort coming up with yet another language when so many already exist?  I’ve decided that the answer seems to be: “because you can”.  Ok, trying to be a little less facetious about it … the reason why new languages pop up is because, of course, there is a need.  If there wasn’t, then any new language would surely die as quickly as it was proposed.

Yukihiro Matsumoto says that the guiding principle of Ruby is that of “least surprise”, that is, programming in a language (any language) should be a natural and intuitive process.  One shouldn’t have to fight with the language; after all, computers are there to assist us in getting things done.

Word Games

I’ve become addicted to Zynga’s word games, that is, the ones where you play against other users.  The only issue, though, is that lately I’ve had a sneaky suspicion that one or two of my opponents are cheating.  When confronted with a cheater, the only option is to take it to ‘em.

Here’s “Hanging with Friends” a delightfully contrived game where your character gets to hang over runny lava with a bunch of helium balloons being his/her only saving grace.  You make up words and your opponent has to guess them and vice versa.  Incorrectly guessing a word means that you lose a balloon … getting that much closer to the hot stuff.

If you’re itching to play, you can pair up with a random user.  Despite being chosen randomly (I think) all of my opponents are doing a pretty bang-up job of lava-dunking my little guy, so I figured that it was time for a Ruby script to level the playing field.


There are two opportunities for cheating, that is, a) when trying to guess your opponent’s word and b) when trying to come up with a word that your opponent needs to guess.  I cheat at both points.  Getting some computer-aided assistance for the former is straightforward, and I might talk about that in another post.  Now though, I want to talk about the latter, that is, coming up with a new word.

You can’t just choose any word; you’re given a random selection of letters from which to build a word.  How about if we had something that could tell us what words can be made out of the given letter selection?

Something along the lines of:

  1. Find all permutations of letters for a given set, and …
  2. Select those permutations that are actual words.

Straightforward, really.

English Language Words

There are a whole bunch of word lists for download, categorised in various ways.  I’m not particularly fussy and just went for “Kevin’s Word List Page”, in particular, the “Spell Checker Oriented Word Lists” (SCOWL).  The lists comprise a bunch of files, categorised by type of English (UK, US, Canada) and in some other ways.  For the purposes of cheating, I don’t really care, because you don’t get penalised if you enter an incorrectly spelled word—you just have to try again.

Using SCOWL, to get a list of all words (including some proper nouns), is a matter of …

cat /tmp/scowl-7.1/final/*

… after unzipping the SCOWL zip.

Permutations

I’m not going to pretend that I have some fancy way of figuring out words—my methods in this case are very practical … brute-force-style practical.

The method, then, is straightforward.  Given a target number of letters, and given a selection of letters from which you can choose a word, just find all permutations of those letters fitting within the target number of letters and check whether or not each is a real word.

Example:

All three-letter permutations of the set of letters [ "a", "b", "c", "d" ] are:

["a", "b", "c"]
["a", "b", "d"]
["a", "c", "b"]
["a", "c", "d"]
["a", "d", "b"]
["a", "d", "c"]
["b", "a", "c"]
["b", "a", "d"]
["b", "c", "a"]
["b", "c", "d"]
["b", "d", "a"]
["b", "d", "c"]
["c", "a", "b"]
["c", "a", "d"]
["c", "b", "a"]
["c", "b", "d"]
["c", "d", "a"]
["c", "d", "b"]
["d", "a", "b"]
["d", "a", "c"]
["d", "b", "a"]
["d", "b", "c"]
["d", "c", "a"]
["d", "c", "b"]

Of those, the following are real words: bad, cab, cad, dab.
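The example above can be reproduced in a few lines of Ruby. This is only a sketch: known_words is a hypothetical four-word stand-in for the real word lists.

```ruby
letters = %w[a b c d]

# Every ordered selection of three distinct positions from the four letters.
perms = letters.permutation(3).to_a
puts perms.length   # 4 * 3 * 2 orderings

# Hypothetical stand-in for the full word list.
known_words = %w[bad cab cad dab]

real_words = perms.map(&:join).select { |candidate| known_words.include?(candidate) }
puts real_words
```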

Loading English Words, Ruby-Style

The easiest way of doing this is to load all words from our word lists into memory, into a Hash.  This is fairly pedestrian, and looks like this:

def load_words(path_to_word_files)
    raise "Path \"#{path_to_word_files}\" is not a directory" unless File.directory? path_to_word_files
    @words = Hash.new
    Dir.new(path_to_word_files).each do |f|
        path_to_file = File.join(path_to_word_files, f)
        if File.file?(path_to_file)
            File.new(path_to_file).each_line do |l|
                word_without_newlines = l.chomp
                @words[word_without_newlines] = word_without_newlines
            end
        end
    end
end

As promised by Matz, this stuff is easy to write and easy to read.  Anyone who has written some Perl can see a Perl influence coming in to Ruby, in particular chomp (for removing newlines) and unless (used as a statement modifier).
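Since the Hash above stores each word as both key and value, a Set would do the same job. The following is a sketch of that variation, not the code used in the rest of this post: load_word_set is a hypothetical name, and the word files are created in a temporary directory so that the example runs without SCOWL installed.

```ruby
require 'set'
require 'tmpdir'

# Hypothetical variant of load_words that returns a Set rather than
# populating a Hash instance variable.
def load_word_set(path_to_word_files)
  unless File.directory?(path_to_word_files)
    raise ArgumentError, "Path \"#{path_to_word_files}\" is not a directory"
  end
  words = Set.new
  Dir.foreach(path_to_word_files) do |f|
    path_to_file = File.join(path_to_word_files, f)
    next unless File.file?(path_to_file)          # skips "." and ".." too
    File.foreach(path_to_file) { |line| words << line.chomp }
  end
  words
end

# Two tiny made-up word files, one word per line, with one word repeated.
loaded = Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'list-a'), "bad\ncab\n")
  File.write(File.join(dir, 'list-b'), "cad\ndab\nbad\n")
  load_word_set(dir)
end

puts loaded.size
puts loaded.include?('cab')
```

File.foreach also closes each file once it has been read, which File.new(...).each_line leaves to the garbage collector.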

Permutations

Once all the words have been hashified, what remains is brute-forcing all permutations.  Fortunately, Ruby has permutation-finding built in to its Array implementation.  What we need to do becomes:

def get_words_with_letters(letters, word_length=8)
    word_length = letters.length if letters.length < word_length
    found = Hash.new
    letters.chars.to_a.permutation(word_length).each do |permutation|
        candidate = permutation.join
        next if found.has_key?(candidate)
        found[candidate] = candidate if @words.has_key?(candidate)
    end
    return found.keys
end

In a single chained call on a string containing the letters that we can work with, we:

  1. Get an enumeration of the characters comprising the string;
  2. Convert the enumeration to an array;
  3. Create an enumeration of all the permutations for a given length of word (each permutation being an array of characters); and
  4. Enumerate the permutations, passing each one to a block.

A few more statements with modifiers give us a terse but expressive bit of code that eliminates duplicates (which arise when the same letter appears more than once in the original set), and adds the item if it is a real word.
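The same idea can be written as a single chain, at the cost of building the whole list of candidates in memory before filtering. This is a sketch under assumptions: words_from_letters is a hypothetical free-standing function, and the dictionary is a tiny made-up Set passed in as an argument rather than the @words hash.

```ruby
require 'set'

# Hypothetical free-standing variant of get_words_with_letters.
def words_from_letters(letters, dictionary, word_length = 8)
  word_length = [word_length, letters.length].min
  letters.chars
         .permutation(word_length)   # every ordering of word_length letters
         .map(&:join)                # arrays of characters back to strings
         .uniq                       # repeated letters produce repeated candidates
         .select { |candidate| dictionary.include?(candidate) }
end

dictionary = Set.new(%w[add bad cab cad dab dad])
found = words_from_letters('abdd', dictionary, 3)
puts found
```

With two d's in the input, "add" and "dad" are each generated twice, which is exactly the duplication that the found hash guards against in the version above.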

An initial, rough-and-ready but effective script is now ready to join my arsenal of cobbled-together tools needed to bring humility to any would-be “Hanging” challengers:

#!/usr/bin/env ruby

class EnglishWords
    def initialize(path_to_word_files)
        load_words(path_to_word_files)
    end

    def load_words(path_to_word_files)
        raise "Path \"#{path_to_word_files}\" is not a directory" unless File.directory? path_to_word_files
        @words = Hash.new
        Dir.new(path_to_word_files).each do |f|
            path_to_file = File.join(path_to_word_files, f)
            if File.file?(path_to_file)
                File.new(path_to_file).each_line do |l|
                    word_without_newlines = l.chomp
                    @words[word_without_newlines] = word_without_newlines
                end
            end
        end
    end

    def get_words_with_letters(letters, word_length=8)
        word_length = letters.length if letters.length < word_length
        found = Hash.new
        letters.chars.to_a.permutation(word_length).each do |permutation|
            candidate = permutation.join
            next if found.has_key?(candidate)
            found[candidate] = candidate if @words.has_key?(candidate)
        end
        return found.keys
    end
end

def print_usage_and_exit
    puts <<EOF
#{$0[/[\w\d_-]+/]} <path_to_words> <letters> <word_length>

    <path_to_words>     A folder containing English word files.  The format of word files
                        is one word per line.
    <letters>           A string of letters, a subset of which determine words to find.
    <word_length>       Length of words to find (must be less than 9 and no greater
                        than the length of <letters>).

    Example:

    #{$0[/[\w\d_-]+/]} /tmp/scowl/final trasse 6

    yields:

    tasser
    tasers
    terass
    reasts
    asters
    assert
    straes
    stares
    stears
    essart
EOF
    exit -1
end

print_usage_and_exit if ARGV.length < 3

words = EnglishWords.new ARGV.shift
letters = ARGV.shift
word_length = ARGV.shift.to_i

words.get_words_with_letters(letters, word_length).each { |w| puts w }


I don’t know what a tostados is, but “Hanging with Friends” seems happy with it.

Architecture Revisited

Posted in Uncategorized by Eric Smith on April 16, 2011

For a while now, at work I’ve been in a kind of limbo; not quite an architect, not quite a developer, a pinch of project manager and a sprinkle of business analyst.  To tell the truth, as of recently, mostly business analyst, however that may be interpreted (as unfortunate as it is, I don’t think there’s a coherent view of what a BA does, or is responsible for, in our industry).

As the challenging project that I’ve been involved with for the past five years draws to a close, I’ve found myself reflecting on its early stages and contrasting it with things as they are now.  There’s this cliché that there’s 20/20 vision in hindsight, so this is the perfect opportunity to challenge those things that we look at now as being so clear in hindsight which were such murky choices early on.

A natural tendency, when "shown the truth" through experience, is to focus on the negative.  I want to make a conscious decision though, to focus on the positive.  Focusing on the positive, and recognising what we would normally term "failure" as being "opportunity for improvement", is what popular wisdom teaches us as being what we should strive for.  Why would I argue with that?

The Project

"The Project" as it will be known, in hindsight, was enormous, is enormous.  We didn’t fully recognise it as being enormous at the time.  I’m not at liberty to explain what it’s about, suffice to say it is in the financial industry (that shouldn’t be enough to incriminate me), and suffice to say that the business logic involved is highly complex.  I doubt whether I will ever again, in my life, encounter a bespoke development project involving business logic as complex as this.

In any case, we spent some time coming up with a detailed functional requirements document.  This is a story I’m sure you’ve heard before.  You might even be sighing internally as you read this … a sigh of pity; "more poor souls who didn’t embrace Agile" you might be saying in your head.  There are no excuses here, because I don’t think that specifying requirements in detail up-front was entirely the wrong thing to do in this case.  That’s because what was required was poorly understood, difficult to understand and prone to misinterpretation.  The subject matter was, and is, the domain of academia, being tackled by folk who for the most part didn’t have formal qualifications.  Due diligence, even if through up-front requirements specification and review, was justified, however you look at it.  The only way that interpretation of requirements could have been tackled efficiently in an agile setting in this case would have been if the product owner was uber-qualified: someone who knew the business backwards and who had a sufficient personal stake to take delivery personally.  No such person existed.

The positive: specifying requirements as a detailed document, up-front, was the best thing to do given the circumstances.

Following on from the requirements, as is dictated by a well-behaved waterfall implementation, is the architecture specification.  That was, and continues to be, my responsibility.

The guiding principles that led me to the architecture that shapes our lives:

  1. Live, Eat and Sleep Objects;
  2. Keep it Simple (S);
  3. You Aint Gonna Need It;
  4. If Microsoft Says it, it Must Be So;
  5. Don’t Repeat Yourself.

If you read between the lines from the above, there’s a theme, let’s call it "minimalism", engendered primarily by "Keep It Simple (S)" and "You Aint Gonna Need It", but arguably, also by "If Microsoft Says it, It Must Be So" and DRY.

Let’s back up a bit and start with …

Live, Eat and Sleep Objects

I have always, in my heart-of-hearts, believed that modelling a real-world problem with objects is the way to go.  It’s all parcelled up in some deep beliefs I have about the consequences of lying to oneself.  If you represent something as something other than what it is (or at least as something not as close as you can get to what it is, given the tools at hand), you’re on a slippery slope to inevitable doom.  As it turns out, through experience, and through confirmation by Martin Fowler in his PoEAA, there really was no other choice: when presented with a sufficiently complex domain, the choice of whether or not to use a rich domain model is a simple one … it is, for all intents and purposes, made for you.

The positive: modelling the domain using well-understood OO modelling techniques was far and away the right choice; with such a complex domain, anything else would have been death-by-duplication.

Keep It Simple (S)

What has been said about KISS has been said many times before and continues to be said—it’s one of those instances where one can be forgiven for repeating oneself.  Ironically one would think that one can’t go wrong with KISS, the benefits are simply obvious … matched by the implementation.  Sadly, this is not always the case.  The fact is, when confronted with a complex domain, there is a certain amount of “not-so-simple” that one must take on as necessary baggage in order for things to be manageable, and that’s where this little four-letter-bundle-of-wisdom has a sting in the tail.

Nevertheless, this particular post is all about the positive, so here are some positive things that KISS influenced me on:

Two tiers: a client tier and a database tier; that’s it.

And I have great news … this has served us well.  We did need to take on a persistent service running on an application server, simply because there are things that are global and shouldn’t be influenced by whether or not a client workstation is on or not, but that’s the extent of it.  No need for Windows Workflow Foundation, Windows Communication Foundation or Microsoft Message Queue, just a good ‘ol database.

The positive: KISS has its benefits.  We didn’t take on some unnecessary technology because we thought we just might need it or because it has a warm-and-fuzzy enterprisey feel to it.

You Aint Gonna Need It

This one is pretty closely related to KISS, and I have to admit, my approach at the outset was quite heavily influenced by the “last responsible moment” camp.  YAGNI at a details level is helpful, even necessary, but at an architectural level it has far-reaching consequences.  I will continue to give it its due, but not with the same rigour as DRY, for instance, and certainly not at macro (read architectural) level.  The problem is that YAGNI dictates that you only consider what you can see, and in doing so provides a false sense of peace that it’s Ok to forge forward with inadequate knowledge of the territory.  For example: “I’d like to use objects, but damn … all this talk about ORMs … all I need is LINQ-to-Sql, right?”.  Wrong.

The positive: as a meta-lesson, YAGNI shouldn’t be applied at an architectural level, and I’ll even go so far as to say that this whole “last responsible moment” talk is somewhat irresponsible.

If Microsoft Says It, It Must Be So

I feel sorry for the tens of thousands of developers out there whose lives are bound by the Redmond Reality Distortion Field.  I know I shouldn’t need to, because the vast majority of them are in bliss and will continue to be as long as ignorance will allow them to be.  When I started this project I had come out of what was effectively six years of Java and Unix and even though prior to that I had done Windows development, I was happy to accept any guidance from the Mothership that I could get.

Sadly, experience has proven that Microsoft makes best-of-breed product in some areas, but not others.

Things where Microsoft is best-of-breed:

  1. The .NET platform;
  2. Visual Studio.

Things where Microsoft is not best-of-breed:

  1. Test-driven-development, and associated tools (MSTest);
  2. Domain-driven-design and associated tools;
  3. Build tools (MSBuild);
  4. Source-control (TFS);
  5. ORMs (EF);
  6. IoC containers (Unity);
  7. Aspect orientation (Unity).

We adopted a few of the technologies above simply because it was the “Word from Redmond”, and didn’t adopt others of the technologies above because they didn’t yet exist and Microsoft didn’t have an answer to equivalents in the Java world that had been around for years.  I’m not going to labour on this one, because too much emotional energy has already gone into it; suffice to say that I’m disappointed.

The positive: the lesson is, don’t blindly follow what may, in fact, be akin to a faith.  Question, dig deep, analyse, consider, understand and criticise before you marry yourself to a technology—I’m much wiser for this experience.

Don’t Repeat Yourself

I don’t know how much I’ve repeated this, but as ironic as it is, I’ll continue to do so.  The problem with DRY is that it’s only when you don’t observe the principle that you discover how valuable it would have been if you had observed it.  Everything we’ve done has revolved around this, including:

  1. An aspect-orientation system for applying cross-cutting functionality in a type-independent way;
  2. A single view of data and generation of types from the data model that are used by the data access code;
  3. Generation of enumerated types that have an analogue as a table in the database from a single source;
  4. A philosophy of single-responsibility at project, type and method level engendered into how we develop and refactor as a team;
  5. A philosophy of writing code that is self-documenting.

Revisiting the statement I made about knowing the value that DRY has delivered, and referring to item 2 above, we have not once had a situation where we referenced a table or column at runtime that didn’t exist in the database.

The positive: By diligently observing DRY, both at an architecture level as well as design and code level, we’ve averted a lot of potential wasted time in correcting unnecessary inconsistencies.  The real plus though, is the durability of this benefit—the fact that such potential wastages in resources, time and money won’t ever be repeated.

In Conclusion

There’s a lot of wisdom bandied about on “core values” when it comes to software architecture.  In fairness, one needs to contextualise things before one arrives at a set of values—that is, when embarking on a project, figure out where you’re at in terms of project scale, complexity, and in particular, what would be considered as acceptable as “success”.

I’ve put myself in a position now where I feel obliged to distil all of this into a neat uber-tweet of wisdom.  Now here’s the real irony … a principle that I didn’t explicitly follow, but that arguably trumps all the others: figure out what’s really important, remembering to express it in business terms, and optimise for that (can you say 80/20?).

Slosh

Posted in Opinion by Eric Smith on May 28, 2010

… is not a word that you’ve likely encountered too often.  It happens to be an alias for the ubiquitous backslash.

\

Now as a programmer, I’ve come to react to certain ASCII characters instinctively:

#

“Smells like a comment”

/

“Looks like a path component delimiter”

.

“Wants to be in a namespace”

\

“Careful … things aren’t as they seem”

That last one is special … it transcends ordinary meaning.  That’s because its scope extends beyond just itself … to what follows it.  And that, with little argument, is generally accepted the world-over.  The backslash character is the universal escape character.  Why then, is it so difficult for Microsoft to get in line?

Larry Osterman sums up the origins of the gross misuse of the backslash character:

Many of the DOS utilities (except for command.com) were written by IBM, and they used the "/" character as the "switch" character for their utilities (the "switch" character is the character that’s used to distinguish command line switches – on *nix, it’s the "-" character, on most DEC operating systems (including VMS, the DECSystem-20 and DECSystem-10), it’s the "/" character" (note: I’m grey on whether the "/" character came from IBM or from Microsoft – several of the original MS-DOS developers were old-hand DEC-20 developers, so it’s possible that they carried it forward from their DEC background).

The fact that the "/" character conflicted with the path character of another relatively popular operating system wasn’t particularly relevant to the original developers – after all, DOS didn’t support directories, just files in a single root directory.

Then along came DOS 2.0.  DOS 2.0 was tied to the PC/XT, whose major feature was a 10M hard disk.  IBM asked the Microsoft to add support for hard disks, and the MS-DOS developers took this as an opportunity to add support for modern file APIs – they added a whole series of handle based APIs to the system (DOS 1.0 relied on an application controlled structure called an FCB).  They also had to add support for hierarchical paths.

Now historically there have been a number of different mechanisms for providing hierarchical paths.  The DecSystem-20, for example represented directories as: "<volume>:"<"<Directory>[.<Subdirectory>">"FileName.Extension[,Version]" ("PS:<SYSTEM>MONITR.EXE,4").   VMS used a similar naming scheme, but instead of < and > characters it used [ and ] (and VMS used ";" to differentiate between versions of files).  *nix defines hierarchical paths with a simple hierarchy rooted at "/" – in *nix’s naming hierarchy, there’s no way of differentiating between files and directories, etc (this isn’t bad, btw, it just is).

For MS-DOS 2.0, the designers of DOS chose a hybrid version – they already had support for drive letters from DOS 1.0, so they needed to continue using that.  And they chose to use the *nix style method of specifying a hierarchy – instead of calling the directory out in the filename (like VMS and the DEC-20), they simply made the directory and filename indistinguishable parts of the path.

But there was a problem.  They couldn’t use the *nix form of path separator of "/", because the "/" was being used for the switch character.

So what were they to do?  They could have used the "." character like the DEC machines, but the "." character was being used to differentiate between file and extension.  So they chose the next best thing – the "\" character, which was visually similar to the "/" character.

And that’s how the "\" character was chosen.

Everywhere else we instinctively know that a backslash escapes things … everywhere … except for Windows/DOS path names.
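Here is a small Ruby sketch of what that collision looks like in practice. The path is made up for illustration, and the behaviour shown is Ruby’s own string-literal handling, not anything specific to sqlcmd.exe.

```ruby
# In a double-quoted literal the backslash escapes what follows,
# so \t becomes a tab and \n becomes a newline.
mangled = "C:\temp\new"

# To get the path we meant, every backslash has to be doubled ...
escaped = "C:\\temp\\new"

# ... or the literal has to be single-quoted, where \t and \n are left alone.
single = 'C:\temp\new'

# File.join sidesteps the problem by always joining with a forward slash.
forward = File.join('C:', 'temp', 'new')

puts mangled.inspect
puts escaped
puts single == escaped
puts forward
```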

In defence of Windows, the OS is perfectly happy to interpret slash (“/”) as it would slosh … and for the most part application software observes this.  There are occasional heretical instances where this is not true though and the application software writer has (<<shudder>>), deliberately chosen the backslash as a categorisation delimiter.

So, today I battled through an instance of this, trying to call sqlcmd.exe from within a Bash script.  Now I realise that this is largely a function of the implementation (which was re-expanding parameters multiple times before they were eventually passed to sqlcmd.exe), but oh how this would so have been a non-issue if someone, somewhere had simply respected the good ‘ol escape character and left it to only ever escape and nothing more.

What was needed in the invocation of my Bash script:

[image: the invocation as written in the Bash script]

… to eventually get into the form that sqlcmd.exe desires:

[image: the form that sqlcmd.exe receives]

What a wonderful place the world would be if we could all agree on one thing: slosh is for escaping … nothing more, nothing less.

The Skill to Learn

Posted in Opinion, Recruitment by Eric Smith on May 25, 2010

I’ve been working now for some fifteen years, and I think that for someone who’s had a bit of experience, they can honestly look back and draw some valuable conclusions on “what’s important”.  By implication, there are also a whole bunch of things that are not important.

Today, sitting with a prospective hire, I was asked what I thought of courses.  I interpreted it as a question about the value of courses, in general.  That’s a tough one.  Any academic endeavour, ultimately, is only worth its level of recognition, presumably by a body or group of bodies whose opinion one values.  The only truly worthwhile “course”, in my opinion, is one that you get through a reputable university.  Ultimately, it then boils down to what is meant by “reputable”.

In IT, there will always be a market for courses.  At first this statement seems obvious.  After all, isn’t the way one learns how to administer an instance of Microsoft Exchange, or SharePoint, or Microsoft CRM through a course?  Is this the only way?  If you’re a corporate charged with maintaining a level of skill within the org, it sure seems like the only way … it sure is the most expedient (at least at first glance).

The course-peddlers know this, and they exploit it.  Corporates are easily wooed by the promise of the training silver bullet.  It’s money for jam … time away from the office and the shelling out of a few shekels brings you back shiny new Exchange-ready employees the following week.  There’s a problem with this though, it’s subtle—they’re Exchange-ready … yes, but they’re not really Anything-Ready, and that’s what we should really be looking for.

Anything-Ready

Anything-Ready is a meta-concept … it’s not about anything particular that you may know, or that you may need to know, but rather how good you are at getting to know what you need to know.  I’m not looking for Master Exchange Certified, or SCRUM Master Certified … I’m looking for Master UPSKILL Certified, All-Technologies-Applicable.  Now that is valuable.

Which brings me back to why I place value in a university degree.  When you go to university, you learn how to learn, and if you’re lucky enough to go on to do a post-graduate qualification, you get first exposure to Anything-Ready 101.  I’m probably being unfair in that this judgement really has been made through the lens of a technical degree … can’t vouch for a BA.  Ultimately though, there’s still a whole lot of learning to do before you graduate as “Anything-Ready” and that’s the job of industry.  True Anything-Ready graduates are the product of years of industry-grind-mill, post university degree.

So … what is Anything-Ready?

Anything-Ready

Apart from getting your daily dose of mental calisthenics, addressing the root cause means that you’re eliminating a fear—that’s because if you haven’t gotten to the nub, you will always remain in fear of what may pop up later on because you chose to shortcut grokking the problem.

You can spot a course-taker a mile away, she tends to shy away from anything not within the ambit of what has been explicitly learned—face your fear and become a polyglot.

A keen student of Anything-Ready knows that the “Ready” part needs constant attention: regular reading is a must to maintain that status.

I continue to be astounded by the number of programmers who haven’t mastered the bread-and-butter of their tools.  If I had a penny for every “senior programmer” I’ve met who can’t touch-type… Jeff Atwood says it far better than I could.  As for your editor … choose one, embrace it, get to grok it, because if you’re a programmer, it’s the canvas of your craft.

No man is an island.  Programming and Autism—two peas in a pod?   It doesn’t have to be that way.  Humans are computers … just really complex ones, and it does take a while, but every programmer eventually figures out that solving certain problems can be done an order of magnitude faster through a small human interaction than solitary toil.  So keep community, not so much for the selfishness of getting what you need quickly, but for the benefit of your own soul.

There’s no barrier to entry with programming.  Anyone with a laptop and an Internet connection can do it.  More alarming though—anyone with a laptop and an Internet connection can convince Joe Sales Exec that they can do it well.  That’s why the keeper of “Anything-Ready” faith knows how important always beats urgent.  The “sins of the fathers” is a biblical meme that applies everywhere, not least in programming—careful what decisions you make, because they form more of a legacy for those who come after you than you may think.

It’s incumbent on us to be vigilant about what’s important.  If we aren’t, we’re in danger of wasting time.  Distilling what’s important is about cutting to the chase and removing waste.  It’s another way of repeating DRY (the only time you get to repeat yourself and feel good about it).  Anything-Ready-practitioners are voracious distillers of what’s important.

Someone who is Anything-Ready knows that it’s a tenuous title.  They don’t call it the fire hose for nothing.  Do you know someone who can actually drink from a fire hose? (no spilling).  So keep that ego in check: there’s always someone around the corner way smarter than you.  Really.

Back to Courses

Courses aren’t all bad.  A good course will:

  • Provide a formal, structured means to refocus;
  • Expose you to people who you wouldn’t normally have had the chance to meet;
  • Put your boss’s mind at ease that the “skill issue” is being addressed (definitely a soft benefit);
  • Keep the paper mills afloat.

Senior Developer Assessment: Re-aligning Expectations

Posted in Algorithms, Development, Opinion, Recruitment by Eric Smith on March 5, 2010

Today we assessed our 20th candidate for the position of senior developer, using a small test that I drafted some time ago.  Since then I’ve spoken about how the test was dumbed down since one of the questions was perceived as being too difficult.  The assessment consists of three questions, which I’ve dubbed Fibonacci, Quicksort and University.  I’ve already discussed two of the questions, namely Quicksort and University.

Although not an earth-shattering sample, 20 is a number that we can start to draw pretty graphs with, so I’ve included a summary of the results alongside.

image

We can speculate about what it tells us:

  1. There’s something wrong with University; nobody has come up with a satisfactory answer;
  2. In the case of Fibonacci, there’s something wrong with the candidates; only five people have come up with a satisfactory answer.

Ok, my second conclusion above is possibly a little harsh.  I think it’s a little harsh, because upon inspection by others, it’s elicited use of some power words like “mathlete” and “math-wiz”.

Now Fibonacci is the only question that I haven’t posted for all to see, mostly because I thought it was too simplistic and therefore quite uninteresting.  Given these results though, I’ve been prompted to reword it completely.  I mean, if only math-wizzes have what it takes to do it then I can’t make any assumptions about the background that a candidate may have in the maths department.

What follows is the version of Fibonacci as it was up until now:

image

I’ve taken great pains to remove anything that may constitute an implicit (read “unfair”) assumption about mathematical background and I’ve reworded the question to provide a lot more hand-holding for the candidate.  This should also help in offsetting the nervousness factor.

Here’s my updated version:

image

image

Desperately Seeking Senior

Posted in Object Orientation, Recruitment by Eric Smith on February 27, 2010

Recently I chatted about appropriate coding assessment questions for senior developers, and came to the conclusion that Solver was a little too demanding for someone to do in around twenty minutes (under pressure), so I replaced it with Quicksort.

My assessment now consists of three questions:

  1. Fibonacci.  In short, “print out” the first thirty terms of the Fibonacci sequence, any which-way;
  2. Quicksort.  Sort a simple list of names into alphabetical order, applying a crude (read: not very efficient) implementation of the Quicksort algorithm;
  3. University.  Code up a set of objects and/or interfaces that describe a simple domain model, illustrating structural relationships between the domain objects.

Here’s University:

image

At first glance, this would seem fairly easy.  The tricky bit comes in due to the fourth bullet point in the problem description: “a lecturer might also be a student”.  Now C# doesn’t support multiple implementation inheritance, and this problem calls for a design involving multiple inheritance.

I’m going to present my solution to the problem.  I need to stress though, that I’m not wholly satisfied with it because it involves a StudentLecturer type (you guessed it … a hybrid) which just doesn’t sit well with me.

In any case, this is a classic case of the Diamond inheritance pattern (see the Diamond inheritance problem), and this is how the relationships might look:

University

I’ve deliberately left Course out because it isn’t really central to the real problem, and as a result, just creates clutter.  Within the bounds of the description of this problem, this solution might be acceptable, but it’s pretty tightly coupled and promises to turn into a bit of a nightmare should we need to extend the orthogonal roles to more than just Student and Lecturer (although, off the top of my head, I can’t think of how).  Ideally though, this sort of mixin scenario is more suited to a dynamic language like Python that allows types to be defined at runtime.  Nevertheless, it makes for a worthy brain teaser when done using a statically typed language that only supports multiple interface inheritance.
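To make the Python remark a little more concrete, here is a rough sketch of how the two roles might be mixed into a person at runtime.  It is not the C# solution described above: the class names, the attributes and the with_roles helper are all made up for illustration, and Course is again reduced to a plain string.

```python
class Person:
    def __init__(self, name):
        self.name = name


class StudentRole:
    """Mixin: behaviour that only makes sense for someone who is enrolled."""

    def enrol(self, course):
        # setdefault spares the mixin from needing its own __init__
        self.__dict__.setdefault("enrolled", []).append(course)


class LecturerRole:
    """Mixin: behaviour that only makes sense for someone who teaches."""

    def teach(self, course):
        self.__dict__.setdefault("taught", []).append(course)


def with_roles(person, *roles):
    """Hypothetical helper: build a new type at runtime and re-class the person."""
    name = "".join(role.__name__ for role in roles) + type(person).__name__
    person.__class__ = type(name, roles + (type(person),), {})
    return person


# A lecturer who is also a student, without a hand-written hybrid class.
alice = with_roles(Person("Alice"), StudentRole, LecturerRole)
alice.enrol("Algorithms")
alice.teach("Databases")
```

The hybrid type still exists, but it is generated on demand rather than declared up front, so adding a third role would not mean writing a class for every combination.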

All three questions comprising the assessment are required to be completed within an hour.  What I didn’t anticipate though, is the response to the questions.  After assessing twelve people, no-one has provided a satisfactory answer to University.  Is this because it’s particularly hard?  Are C# developers, in general, not as strong on the modelling front?  It’s a little perplexing.

Senior Developer Assessment Revisited

Posted in Algorithms, Development, Opinion by Eric Smith on February 20, 2010

This is really part two of the article I wrote “What is a Senior Developer?”.  I’ve received some shrill feedback on my choice of assessment problem:

  • Too math’y!
  • Standards too exacting!
  • A bit much to ask of your typical commercial developer.

So I’ve taken this all to heart and decided to revamp the Solver question.  Actually, I’ve decided to drop it completely and replace it with something a lot less “math’y” but possibly no more representative of real-world requirements.

A suggestion that was given me: “get them to sort stuff”.  Ok, so what would making someone jump through “sort algorithm” hoops prove?  After all, these days, sorting things amounts to a call to List<T>.Sort—I mean honestly, who ever needs to resort to first principles when sorting these days?  I’m willing to take a different tack—if I’m testing something slightly different, that is, not knowledge, but the ability to absorb and apply … well then that’s slightly different.  Besides … dealing with pointers is generally seen as unnecessary masochism, but some people still regard it as crucial background to being a good developer.

So this time, I’ve actually taken the time to capture the requirements in detail; this amounts to softening things up a little since the general consensus seems to be that the original assessment was too demanding (at least, the Solver question was).

image

Quicksort, above, doesn’t test ability to perform research independently, and offers a lot of hand-holding, but it is somewhat less daunting than Solver.  I can’t help thinking though that things are being dumbed-down a little too much.

The dumbing-down:

  • This is limited to System.String, but could easily be extended to be generic (bonus points if the candidate takes the initiative to do this!);
  • I haven’t specified any constraints in terms of efficiency issues (the naïve implementation is, of course, a horrible memory hog);
  • I don’t know if I could provide any more hand-holding than this, it’s practically paint-by-numbers.

I have been somewhat vague about one thing, namely choice of pivot.  I have arguably been a little tricky in this question because the example isn’t consistent in how the pivot is chosen.  The astute candidate will quickly realise that choice of pivot isn’t crucial.

Here’s my solution, coded up in approximately 20 minutes:


	using System;
	using System.Collections.Generic;
	using System.Linq;

	public static class QuickSorter
	{
		// Naive quicksort: the first element is the pivot, the rest are split
		// around it, and each partition is sorted recursively.
		public static IEnumerable<string> QuickSort(IEnumerable<string> jumbled)
		{
			// Zero or one element: already sorted.
			if (jumbled.Count() < 2)
				return jumbled;
			else
			{
				return
					QuickSort(AllLessThan(jumbled.ElementAt(0), jumbled.Skip(1)))
					.Concat(jumbled.Take(1))
					.Concat(
					QuickSort(AllGreaterThan(jumbled.ElementAt(0), jumbled.Skip(1))));
			}
		}

		// Strings that sort strictly before value.
		private static IEnumerable<string> AllLessThan(string value, IEnumerable<string> others)
		{
			return AllSatisfying(others, s => String.Compare(value, s) > 0);
		}

		// Strings that sort at or after value, so duplicates of the pivot end up on the right.
		private static IEnumerable<string> AllGreaterThan(string value, IEnumerable<string> others)
		{
			return AllSatisfying(others, s => String.Compare(value, s) <= 0);
		}

		private static IEnumerable<string> AllSatisfying(IEnumerable<string> others, Predicate<string> predicate)
		{
			return others.Where(s => predicate(s));
		}
	}

Things to notice about my implementation:

  • It’s pretty much declarative, thanks to Linq, of course;
  • It’s not very efficient … no in-place swapping; that’s what you get in twenty minutes.

I think that this provides a less jarring assessment experience for a would-be candidate than Solver, especially if our candidate isn’t a math-wiz.

Exposing your Application’s Guts using IronPython

Posted in Administration, Development by Eric Smith on February 14, 2010

Application guts (or indeed anyone’s guts) aren’t typically on one’s “list of things to see”.  Quite often though, when presented with some perplexing behaviour on live, you end up wishing that you’d added a key piece of logging code to get you to the point where you had just enough visibility to be able to solve the problem.

As it turns out, if you’re writing a .NET application, there isn’t a tremendous amount of difference between a Debug and a Release build.  That’s because, of course, the compiler isn’t spitting out real machine code, but rather MSIL, and the bulk of the optimisation happens at JIT time.  At the end of the day, if you’re talking .NET, the difference between being able to debug your application and not being able to boils down to the availability of some .pdb’s and an INI file … that’s it.

Nevertheless, you may not actually have a copy of Visual Studio, or WinDbg with Son-of-Strike, installed on the machine where your badly behaving live application is running.  That’s why you’d be particularly interested in employing the services of a worthy logging framework like Log4Net or the logging block from Entlib at the outset of your enterprise development stint.  Just increase your log level to “debug” and you’re a-for-away … right?

There are a couple of issues with this:

  • Typically, you need to restart your application or service to get the higher logging level into effect—maybe you have a situation where you don’t want to do that; you just want to view state?
  • What if, even on debug level, you’re not emitting the detail that you need?  At the end of the day, you’re at the mercy of the vigilance of the developer who wrote the debug entries—they may just not be enough.

How about bundling a little back-door into your app?  You could:

  • Get at it using something available on any PC, namely telnet;
  • Do anything that you would be able to do using code, only dynamically.

Enter PythonServer, available on github.  Let’s take it for a run using the (very) simple sample application:


static void Main()
{
	Console.WriteLine("Starting...");
	var theFibber = new Fibber();
	var pythonServer = new TheLimberLambda.Utils.PythonServer(2323,
		new [] { new NameBinding("fibber", theFibber) });
	pythonServer.Start();
	Console.ReadLine();
}

SimpleSample fires up a console, instantiates an instance of Fibber and starts an instance of PythonServer, listening on port 2323.

Fibber is just an implementation of IEnumerable<int> that spits out Fibonacci terms, but the key point is that it was instantiated in and lives inside the SimpleSample process.  We keep SimpleSample from exiting by waiting for input on the console.

Telnet’ing into localhost on port 2323 gives us an interactive Python command-line, so legal Python will execute as expected:

image

The real kicker here though is that we have access to our SimpleSample process, and anything that we decided to publish is available to us (in SimpleSample’s case, that would include our instance of Fibber).  Since Fibber implements IEnumerable, we can benefit from IronPython’s automatic recognition of anything IEnumerable as a Python iterator:

image

Here we’re using the itertools package that comes with Python (or in our case, IronPython) to grab the first 10 items of the Fibonacci series.

Because we’re referencing a single instance of Fibber, and because the state of “where we are” in the series is maintained, we can telnet in from a different spot, and ask for the next two items:

image
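The two telnet sessions boil down to something like the following, shown here as a plain Python sketch rather than a transcript.  The fibber generator is only a stand-in for the .NET Fibber instance, and starting the series at 1, 1 is an assumption.

```python
from itertools import islice


def fibber():
    """Stand-in for the .NET Fibber: an endless Fibonacci iterator."""
    a, b = 1, 1  # assumption: the series starts 1, 1
    while True:
        yield a
        a, b = b, a + b


fib = fibber()                     # one shared instance, as in SimpleSample
first_ten = list(islice(fib, 10))  # what the first session asks for
next_two = list(islice(fib, 2))    # a second session carries on from there
```

Because both calls pull from the same iterator, the second islice picks up at the eleventh term instead of starting over.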

Thus, we have a Python interface into potentially any .NET application.

Name Binding

Now, how did the name “fibber” become available to us, you may ask?  The key is the IEnumerable<NameBinding> that we passed to the PythonServer constructor.  At some point we need to provide some translation between the Python namespace and the object instances of interest.  Presently, PythonServer does this using a dead simple string-to-reference map provided up-front.

Going under the bonnet and taking a squiz at the code that gets executed when a connection is made to the server, we notice the introduction of a ScriptScope instance:


		private void InitialiseScriptRuntime(Socket socket)
		{
			_ScriptRuntime.IO.SetOutput(new SocketConverserStream(socket), Encoding.ASCII);
			_ScriptRuntime.IO.SetErrorOutput(new SocketConverserStream(socket), Encoding.ASCII);
			_ScriptScope = _ScriptRuntime.CreateScope("py");
		}

… and binding names is just a matter of setting ScriptScope variables, thusly:


		private void BindScriptScopeNames(IEnumerable<NameBinding> nameBindings)
		{
			foreach (var binding in nameBindings)
				_ScriptScope.SetVariable(binding.Name, binding.Target);
		}
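The same idea can be sketched in plain Python, where a scope is just a dictionary and eval accepts it as the namespace to run against.  This is an analogue of the mechanism, not the IronPython hosting API; bind_names and run_command are hypothetical helpers, and a list stands in for the live object.

```python
def bind_names(scope, bindings):
    """Copy (name, target) pairs into the namespace that commands run against."""
    for name, target in bindings:
        scope[name] = target


def run_command(scope, source):
    """Evaluate one expression, as typed by the remote user, inside that namespace."""
    return eval(source, scope)


scope = {}
bind_names(scope, [("fibber", [1, 1, 2, 3, 5])])  # a list stands in for the live object
fifth = run_command(scope, "fibber[4]")           # the bound name resolves to the object
```

Anything that hasn’t been bound simply isn’t visible to the remote session, which is about the only access control this approach offers.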

I will admit that PythonServer has a way to go, and could do with a whole bunch of things, including:

  • A solid security model.  At the moment PythonServer should really only be used in controlled environments since of course there is no authentication (or encryption) to speak of—SSH should fit nicely here;
  • Integration of the name binding interface into your favourite IoC container.

As a start though, this provides a great means of getting into your process in a relatively hassle-free way.

What is a Senior Developer?

Posted in Uncategorized by Eric Smith on February 9, 2010

So, at work we’re in this recruitment cycle again.  This time it’s aggressive, and we’re really after the cream-of-the-cream.  Those hard-to-find coding ninjas who generally don’t ever need to approach a recruitment agent, because of course, the second someone sniffs that they’re on the market, they’re wooed with shares and options and Wii’s and iPads and rubdowns.  It’s a mad scramble.  Did I mention that I’ve never had to approach a recruitment agent? ;)

The best way to finger these types is through someone you know who’s really good, who knows someone they worked with at some point who blew their socks off.  Unfortunately the network method has failed us—sadly it looks like all talent has gone deep underground, or left the country.  I’m partial to the latter because quite frankly, the quality of the meat that our corporate designated agent is passing our way has been found wanting … repeatedly.

But before I get ahead of myself—let’s approach this methodically, like we should a new project.

Step 1: Clearly define what we require

Enter the Senior Developer.  As expected, this is all too subjective … a quick zoot over to SO confirms our fears and leaves us unsatisfied: it depends.  I wish it was as easy as “Spanish male developer” (si señor!).

Being the “main technical peanut” using our PM’s terminology, the responsibility of defining the standard falls on my shoulders.  I do subscribe to Joel Spolsky’s “smart and gets things done”, but the definition somehow falls short … it just isn’t complete.

Drawing from that deep unknowable über-developer essence that I supposedly have access to, I therefore decree:

Senior Developer Quality 1: Professionalism

Now we all know that you can’t distil what makes a senior developer into one simple thing, but this is one of those undeniably big differentiators.  Can you do a good job, consistently?  If yes, then proceed to next assessment gate.

Senior Developer Quality 2: Intelligence

Where I come from, there seems to be this unspoken rule: “one doesn’t explicitly talk about smarts”.  Because of course, smarts is one of those things that if you don’t have, no amount of experience is ever going to improve the situation.  Let’s square up to this, for crying out loud: if you want to be a senior developer you absolutely must be smart.  Preferably, very smart.

Senior Developer Quality 3: Passion

Another big differentiator from the unwashed Mort-cast.  You gotta wanna learn, all the time, during meals, on the can, driving to work, driving from work, on the treadmill, aside the water cooler and anywhere else you care to mention.  Senior developers have technology in their veins, they live it and breathe it.  Excitement isn’t derived from the promise of a ticket to the Super 14 final, but rather the appearance of a postal collection note for that 200MB/s write-rate SSD from Newegg.

Senior Developer Quality 4: Humility

Getting to the point of truly grokking that no matter how good you think you are, there’s always someone else out there who’s better than you are is a watershed moment.  In all honesty though, no-one enjoys an arrogant git … it just isn’t conducive to greasing the cogs of the team dynamic.  The more numerous the alpha-geeks in a team, the more critical the quality of humility becomes.

Senior Developer Quality 5: Experience

There’s a bit of cross-over here with quality 1, so let’s say that in this case we’re particularly interested in the sort of experience that gives you that technical “gut feel”.  After a number of years, the neural pathways have been set up so that you can generally “smell” whether something sounds right or it doesn’t (when interacting with colleagues), and your hunches about where problems may lie tend to be more often right than wrong.

Step 2: Screen ‘em

When it comes to finding good people, and when you don’t have the luxury of a network-enabled direct route, it boils down to a numbers game.

Spolsky advocates the phone screen, but we’ve opted for a technical assessment.  Do we really want to waste our time sitting down to chat with someone if they don’t make the bar?  So to be sure, this is an effort to weed out those who think they represent our definition of Senior Developer, but who don’t.

A simple test should suffice.  I drafted one this morning, and I’m going to publish one of the questions (with sample answer).  Now you’ll notice that the problem posed isn’t very challenging (although some of my colleagues beg to differ), but you’d be surprised at how many people who sell themselves as senior developers can’t do it.

image

My lazy side originally opted to go for one of those shrink-wrapped multiple choice online assessments à la Brainbench.  I’m not going to mention the brand of assessment that is our corporate standard because I have nothing good to say about it: typographical errors, and code that wouldn’t compile, in almost every single question?  I wasn’t impressed either.  Suffice it to say, it’s not Brainbench.  Whatever the choice of assessor though, all of these tests suffer from the same problem; they tend to test stuff that we would naturally expect to Google these days.  I don’t rate that as being particularly useful at all.

Solver, above, may seem overly mathematical and unrepresentative of typical “throw-stuff-in-a-database-and-pull-it-out-again” business requirements, but what it does do is very quickly highlight the sort of person we don’t want.

  • Do we want someone who can’t understand the question because it contains “nasty unknowable symbols”?  Even if you didn’t do maths at university, surely you did algebra at school?  That’s all you need to know.
  • Do we want someone who can’t do research on the Internet?  Jeepers, I even provided the exact Wikipedia search.  The corresponding article, predictably, contains pseudo-code for various algorithms—would you honestly need more than that?
  • Do we want someone whose brain can’t be stretched to understanding an iterative algorithm to which they haven’t previously been introduced?
  • Do we want someone who doesn’t know enough to fire up the browser and at least try “+"root-finding" +C#” on Google?  (Yes, I do check the browser history afterwards :)

Of course I’m not expecting Newton’s method here, simple bisection will suffice.  Even if our prospective senior has never attended a calculus class, I would expect her to be able to fathom this one unassisted.

Here’s my bisection code, written up and tested in 20 minutes—too much time, I might add, for what I would consider a senior who lives and breathes code:

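	using System;

	class Program
	{
		static void Main(string[] args)
		{
			// The two sides of the equation; a root of diff is where they are equal.
			var lhs = (Func<double,double>) (x => x * x - 3);
			var rhs = (Func<double, double>)(x => x * Math.Log(x));
			var diff = (Func<double,double>) (x => rhs(x) - lhs(x));
			const double threshold = 1e-5; // stop once the bracket is this narrow

			// Declared before it is assigned so that the lambda can call itself.
			Func<double,double,double> findRoot = null;
			findRoot =
			((l, r) =>
			{
				if (Math.Abs(l-r) < threshold)
					return l;
				var mid = (r+l) / 2;
				// Same sign at l and mid: the crossing lies in [mid, r]; otherwise in [l, mid].
				return (Math.Sign(diff(l)) == Math.Sign(diff(mid))) ? findRoot(mid, r) : findRoot(l, mid);
			});

			Console.WriteLine(Math.Round(findRoot(1,50), 2));
			Console.ReadLine();
		}
	}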

Now the big problem with bisection, of course, is local minima, but that is a non-issue because I even provide a graph illustrating that there aren’t any local minima.

It’s time to wrap this post up, and defer description of how testing against qualities 1 to 5 should be done to another one.  Today we presented our first senior candidate with Solver, but you can probably guess what the outcome was when I tell you that he couldn’t nail Fibonacci, which was to print out the first 30 terms of the sequence of the same name.  Sad, indeed.
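For a sense of scale, here is roughly what a passing Fibonacci answer amounts to, sketched in Python for brevity.  The exact wording of the question isn’t reproduced here, so starting the sequence at 1, 1 is an assumption, as is the function name.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci terms, starting 1, 1 (an assumption)."""
    terms = []
    a, b = 1, 1
    for _ in range(count):
        terms.append(a)
        a, b = b, a + b  # slide the pair one step along the sequence
    return terms


for term in fibonacci(30):
    print(term)
```

The loop only ever holds two terms at a time; the list is there purely so that the terms can be printed any which-way.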
