A hectic geek week

This week has been characterised by watching lots of videos and playing with lots of things: Hadoop, Neo4j, Java and even some MSIL.

Why so many different things? For a few weeks I’ve focussed purely on R, as it seems to be quite a tricky language even for seasoned coders. However, a separate skill I wish to master is that of learning and applying languages and technologies quickly. Once you get past the basics of coding – values, data structures, flow control, functions, OOP and abstractions – to me the next thing, other than digesting APIs (a whole black art!), is language variety. So, having started with VB and SQL, I added F# and now R. The challenge is to take each skill to a useful level of competency. Once at this stage, skill transfer becomes easier, though it’s unrealistic to become expert in all of them in anything under a few years. Not all useful tasks need grand mastery, however!

This means using each language idiomatically rather than as a dialect of my first language:

  • F#: currying, higher order functions and pattern matching; it’s functional first, multiparadigm and has great data interconnectivity.
  • R: apply functions, vectors/matrices/data frames, the S3/S4 object systems and subsetting.
  • SQL: set-based thinking, with subqueries, unions, intersections etc.
  • C#/VB: OOP first, mixed paradigm, emphasising composition over inheritance etc.


Courses viewed this week:

Next Course – Starts Today

Some very nice F# blogs:

Polyglot Fizzbuzz
Second, an old CodingHorror post on candidates inspired me to mess about with FizzBuzz. It’s so silly I won’t even explain it except via code, and yes, I took Camtasia vids of this as I was in geek mode.

F# – my real favourite, and I doubt this is the most concise:

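// List.iter rather than List.map: we only want the printing side effect.
[1..100]
|> List.iter (fun x ->
    match x with
    | n when n % 3 = 0 -> printfn "fizz"
    | n when n % 5 = 0 -> printfn "buzz"   // a multiple of 15 is already caught by the case above
    | _ -> ())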

VB – no LINQ.

Sub Main
    For i = 1 To 100
        ' Each hit prints the word followed by the number, e.g. "fizz3".
        If i Mod 3 = 0 Then Console.WriteLine("fizz" & i)
        If i Mod 5 = 0 Then Console.WriteLine("buzz" & i)
    Next
End Sub

R – about 4 mins or so:

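# Prints "fizz <x>" and/or "buzz <x>" for a single number.
fizz <- function(x) {
  if (x %% 3 == 0) { print(paste("fizz", x)) }
  if (x %% 5 == 0) { print(paste("buzz", x)) }
}
x <- 1:100
# invisible() stops the console echoing lapply's list of return values.
invisible(lapply(x, fizz))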

My very first Java – under 4 mins.

public class fizzbuzz {
    public static void main(String[] args) {
        for (int x = 1; x <= 100; x = x + 1) {
            if (x % 3 == 0) { System.out.println("fizz"); }
            if (x % 5 == 0) { System.out.println("buzz"); }
        }
    }
}
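
For comparison, the classic statement of FizzBuzz also asks for "fizzbuzz" on multiples of both 3 and 5, and for the number itself otherwise. A minimal Java sketch of that full rule set might look like the following; the class name FizzBuzzSketch and the helper label are names made up for this illustration, not anything taken from the CodingHorror post.

```java
import java.util.stream.IntStream;

// Hypothetical class name, used only for this sketch.
class FizzBuzzSketch {

    // Returns the FizzBuzz label for a single number.
    static String label(int n) {
        if (n % 15 == 0) return "fizzbuzz"; // multiple of both 3 and 5
        if (n % 3 == 0) return "fizz";
        if (n % 5 == 0) return "buzz";
        return Integer.toString(n);         // everything else prints as itself
    }

    public static void main(String[] args) {
        // Print the labels for 1..100, one per line.
        IntStream.rangeClosed(1, 100)
                 .mapToObj(FizzBuzzSketch::label)
                 .forEach(System.out::println);
    }
}
```

Keeping the rule in a small pure function, separate from the printing loop, makes it easy to check individual numbers without reading 100 lines of output.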

Computation for Data Analysis
However, just before I start my next Coursera course, Algorithms 1, I want to review aspects of the course I’ve just completed: Computation for Data Analysis. Having taken the Signature Track, I’ll be getting a certificate for it soon. The nice thing about this course is that, within a short space of time, you have to dig into the R language sufficiently to curse at the gnarly bits and really get into some of the reasons why it’s so widely used for statistics work.

The language itself is an implementation of S (the language behind S-Plus), is built round the concept of vectors and is the only language I’ve encountered so far that appears to have three OOP systems (S3, S4 and reference classes).

Yet it appears to have functional idioms:

R higher order functions: Here we return a function that can be used just like any other.

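makeLog <- function(base) {
    # The returned function remembers `base` from the enclosing call.
    function(x) log(x, base = base)
}
> getlog<-makeLog(10) # Creates a function that takes base-10 logs
> getlog(5)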
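
The same idea, a function that builds and returns another function, carries over to Java. The sketch below is only an illustration under my own naming: LogFactorySketch and makeLog are hypothetical names, and the logarithm is computed with the change-of-base formula since Java's Math.log takes no base argument.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical class name, used only for this sketch.
class LogFactorySketch {

    // Returns a function that computes logarithms in the given base.
    static DoubleUnaryOperator makeLog(double base) {
        // Change of base: log_base(x) = ln(x) / ln(base).
        // The lambda captures `base`, much like the R closure does.
        return x -> Math.log(x) / Math.log(base);
    }

    public static void main(String[] args) {
        DoubleUnaryOperator getLog = makeLog(10); // creates a function
        System.out.println(getLog.applyAsDouble(5));
    }
}
```

The returned DoubleUnaryOperator can be stored, passed around and called like any other value, which is the point the R example makes.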

Finally, a graph – a three panel one, showing off a bit more R.

   par(mfcol=c(1,3)) #Set panels to be 3 columns, 1 row: 3 graphs in a line.

   buildPlots<-function(plots, title){
      #Coerce to numeric; entries that cannot be converted become NA and are dropped.
      Col <- suppressWarnings(na.omit(as.numeric(plots)))
      Med <- median(Col)
      Mn <- mean(Col)
      #Histogram with the mean shown in the x-axis label.
      PlotA<-hist(Col, main=paste(title), xlab = substitute(paste("Mean(", hat(x), ")=", nn), list(nn=Mn)))
   }

   #outcome: the course data frame, assumed to be loaded already.
   #Must be a list - if you vector this, it flattens!
   p<-list(outcome[,11], outcome[,17], outcome[,23])
   names<-list("Heart Attack 30 Day","Heart Failure 30 Day", "Pneumonia 30 Day")
   #This is how you get to put more than one variable into your function!
   mapply(function(x,y) buildPlots(x,y),p,names)


Polyglot demo project – First steps of a larger C#/F#/R Data exploration experiment:
This section is something new I’m starting so I’ll be keeping this page updated. If it’s successful, I’ll devote a category to it and document my findings.

Strictly speaking, I didn’t need a C# front end; however, the WPF designer supports C# better than F# at this time. Plus, the C# practice is useful. Note: you can talk to R via C#!

First the DLL that talks to R:

open FSharp.Data
open System.IO 
open RProvider
open RProvider.``base``

module DataRoutines =
    type region =
               | North  
               | South
               | MidlandsAndEast
               | London
               | NoRegion

    let regmatch sha =
            match sha with
            | "Q30"|"Q31"|"Q32" -> North , "North"
            | "Q33"|"Q34"|"Q35" -> MidlandsAndEast , "Midlands and East"
            | "Q37"|"Q38"|"Q39" -> South , "South"
            | "Q36" -> London , "London"
            | _ -> NoRegion , "Options are Q30 - Q39"

    let x = R.eval(R.parse(text="x<-rnorm(100)"))  //Here you can use R's higher order functions for one.
    let y = R.c([1..100])
    let v = R.data_frame(R.c(y),R.c(x)) //TODO: Pass in a Dictionary of names!
    let outputV = v.AsList


This post is mostly an exercise in reviewing my last 7 days as part of the reflection process when studying.

Oh, and I discovered MarkdownPad.

Khan Academy Goal: 150,400 / 22% since December 20th 2013 🙂

A few scribbled mind maps for my own future reference: R, F# and Neural Networks

R Course Map 1

Mind map review: F# types



Real Job Requirements – Case Studies


This is unashamedly my evolution of Lou Adler’s great piece about hiring the best performers:

Lou Adler – Hiring Performance Benchmarks

An old-fashioned job advert often cannot fully define what you really need, yet you still want to get the best people for it. What else can you try?

How about a Challenge Project based job application: to apply, you don’t need a degree, and you don’t need (N) years of experience.

The only way to apply for our job is to prove you can do it.

All you need to do is produce and present a project we set for you: a real challenge that is a direct simulation of the work we need doing.

This magically weeds out anyone who’s not able or committed enough to do it. No CV/Qualification/Keyword filters – just the project.

The act of authoring the project is likely to focus the department or company on what they really need instead of the lazy shortcut of listing requirements on a job board.

Potential candidates have to prove a lot of skills: research the project and learn the skills to do it, think logically, produce and organise the content, then communicate the results. Appropriate internal practices, formatting and rules can also be provided. Team working and other soft skills can be considered at interview; for high level jobs, a deeper challenge could be designed.

There is a downside to this approach, apart from having to organise and review projects: you have no guarantee that the best candidate has any degrees or even any years in the job.
The best candidate may simply have proven beyond doubt that they can give you what you want.


Fewer CVs to work through, fewer candidates to phone interview, and the ones you do talk to have already proven they can do the work within the deadline and demonstrated a number of key business skills: organisation and planning, research, problem solving and communication.

When you interview, you get them to take you through their project in detail and find points to critique; find out how they handle disagreement, countering opinions, alternative ideas or negative feedback: if they can listen, then use or debate the feedback objectively and productively, you’ve ticked off more soft skills. In addition, you can see for yourself if you and your team would work well with them.

Some projects would be suitable for a panel discussion, making an assessment of teamwork plausible. This point could be stretched for some levels of job: if the responsibility is large enough, you may choose a third stage where parts of the project are implemented with key members of staff, a real test of whether the candidate has the full stack of skills they will need. All this happens before they’ve had any chance to learn on the job, so the assessment of each candidate would have to reflect this.

So far, without any standard Interview Questions, you’ve found out everything important regarding communication, project planning, time keeping, research, ability to learn and more about one or more candidates. As a bonus, no-one has bleeding eyeballs from dozens of CVs or earache from listening to endless answers to cliché questions. If you’ve chosen to involve more staff, you’ve also seen how they handle the culture in your company.

Pretty reasonable so far?

Now, you and your company have a LOT more information upon which to base your gut reaction to each candidate. Gut instinct can be very powerful – its accuracy is proportional to the quantity of quality information it’s fed.

At this point, you can decide for yourself if you really needed the answers to the Standard Interview Questions.

One last thought: if commitment is what you are looking for, which of these involves more real-life commitment?
1) Submit a letter and CV
2) Submit a project challenge representing the real work you want done

The two case studies on the following pages are biased towards technology roles and are based around one or more simple questions: