A Library of Interview Question Algorithms

I recently spent some time revising for programming interview questions, and having done all of that work I wondered what I could do with it.

The result is a git repo of interview question algorithms that I’ve structured as a somewhat tongue-in-cheek utility; the kind of utility that i’d love to be able to use in a programming interview, but the kind would probably negate the point of the interview question in the first place.

The library solves programming questions that keep popping up, whether in books or online on sites such as HackerRank. This includes solutions to the following problems:

Strings

Arrays

Graphs

Lists

Math

Sorting / Search

 

Project Euler

As an extra bit of practice, I also created a repo of solutions to Project Euler problems. These were mostly based on the HackerRank version of the problems, which typically require the most efficient solution to the problem rather than a single solution to a given input.

Programming Resolutions for 2016

Inspired by Matt Might and others, I’ve decided to create some resolutions for 2016. I’m going to leave out the non-technical ones though, so while ‘Stay Healthy’ is out, it doesn’t mean i’m not trying!

  1. Write more. Starting now!
  2. Get better with the command line. This is the epitome of tools that I use all the time, but nowhere near as efficiently as I should.
  3. Learn Go. Of the all the new(-ish) trendy languages, this is the one that I feel i’m missing out on the most.
  4. Read more recent research. Since finishing my PhD I haven’t kept up-to-date with the latest and greatest from the research community.
  5. Make use of my spie charts. I have a couple of ideas for the spie chart implementation i wrote last year, but haven’t done anything with them.
  6. Secure this site. Since Let’s Encrypt entered open beta, i’ve been interested in trying it out (status: achieved).

And that’s it… for now. If everything goes to plan, you should see posts coming throughout the year; if it doesn’t, damn.

Writing a Huffman Encoder with Java 8 and Dagger 2

Last year I wrote a Huffman Encoder to try out Dagger 2, and to experiment with Java 8, as we’d just started using it at work.

As a dependency injection tool, the advantage of Dagger is that it generates dependencies at compile time, meaning it doesn’t require reflection and errors should be easier to spot up-front.

In practice, I had a hard time getting it to work until I found this great post on using annotation processors in eclipse. This makes sure eclipse does all of the annotation processing that Dagger requires at compile time, though I found I had to manually build the project to get this to work (rather than relying on save-and-build).

There aren’t many component parts to the program, so Dagger doesn’t have to do much. The project is possibly most useful as a basic skeleton showing Dagger 2 working with maven and JUnit tests.

Creating a WordBrain Solver

I became quietly obsessed with an iOS game called WordBrain over the holidays, and thought it’d be interesting to write some code to solve it’s puzzles (you can find the result here).

basic-example
A basic single-word game.

The game itself is similar to a typical word search puzzle where, given a grid of letters, you have to find the words in the grid.

It differs in that words don’t have to be found in straight lines, so individual letters can be joined across diagonals, and as each word is found, it is removed from the grid and the remaining letters fall down. This means that for harder puzzles it isn’t possible to connect the letters in the last word unless you have found the preceding word(s).

Before starting, you know how long each word is and the order in which words must be found (as the 5 word puzzle below illustrates).

A more elaborate multiple-word game.

Solution

My solution started out as a simple word finder — finding all words of a certain length — but evolved into a more complete puzzle solver.

First, given a grid and a set of word lengths, the solver comprehensively scans the the grid for words matching against the first word’s length. It then recursively scans for the subsequent words, based on the updated grid (which is created by removing the preceding word(s)). This creates a set of potential solutions to the puzzle.

To speed up search, I created a trie dictionary to allow the solver to terminate when a non-english prefix is found.

Then, to narrow down the set of possible solutions, I used a corpus of English word frequencies to rank them by the most commonly used set of words. This tends to match the expected solution, because WordBrain uses fairly common words.

Finally, I created a Swing UI to make it easier to enter problems and view solutions.

Analysis

The trie solution is significantly faster than the alternative, an in-memory set of words that didn’t allow early termination based on prefixes.

The solver itself could be slightly more space efficient, because it creates new a string through each recursive call rather than modifying a character array. However, as the search continues onto successive words, some of this copying is necessary (the updated grid, the words found in each solution, etc.).

The Swing UI isn’t particularly well written, but it does display everything required for a solution. First, potential solutions are shown next to the grid:

Potential Solutions
Potential Solutions

Next, for a particular solution, the highlighted word is shown in a decreasing shade to show the order in which letters must be selected:

Solution Illustration
Solution Illustration

Overall, this was a relatively interesting side-project, with a few things such as the trie and solution sorting which added a little depth to the implementation.

That said, I’d recommend completely ignoring this code and playing WordBrain unassisted. It’s a lot of fun!

Spie Charts Explained (+ Chart.JS Add-On)

I recently discovered spie charts, and thought it’d be interesting to try to create one in javascript… so here we are! This post introduces spie charts and discusses a spie chart extension I created for chart.js.

You can find the working code for the chart.js extension on GitHub.

What is a Spie Chart?

A spie chart is an overloaded version of a pie chart, where you can vary the height of each segment along with its width. This enhancement enables various things, such as the comparison of a dataset in two different states, or the addition of extra dimensions to a single dataset.

The next two sections use examples from the original paper on spie charts to illustrate these uses (though my data is totally made up).

Type A: Comparison of States

To compare a dataset in two states, the original dataset can be segmented in a regular pie chart, with the difference between this and the next state illustrated by the varying height of each segment.

Imagine government expenditure for two distinct years charted in separate pie charts. The change between the two charts is so marginal that it is difficult to determine what changed.

With a spie chart, expenditure in one year can be represented by the width (angle) of each segment in the chart, and the change in spending for the next year can represented by the height (radius) of each segment. This allows you to visualize both the relative weight of spending in each sector, and how that spending has changed over time:

In this example, some segments are broken down into multiple slices, where the outer slice illustrates how much spending has been reduced between the two reports.

Type B: Additional Dimensions

Alternatively, the height can be used to show a size relative to one value, while the width shows the same size relative to another.

In the following example (again based on one in the Feitelson paper), the spie chart conveys road casualties by age and sex. The height of each segment is used to indicate the number of casualties relative to the population size of that segment, while the width is used to indicate the proportion of casualties between age ranges (segments).

This example also shows how slices can be used to convey even more information, such as a breakdown of casualties in a segment between pedestrians, car riders, and (motor)cyclists.

The key characteristics of the spie chart in these examples are the ability to vary both the height and width of each segment, along with the ability to slice each segment into distinct sections.

Creating a Spie Chart

I am no expert with JavaScript, so creating the spie chart was an interesting process. If you want to skip that and jump straight to code, you can find it on GitHub, with examples here, and code here.

I decided to use chart.js as the basis for the chart, which proved to be a good decision, as it’s nicely written and extremely extendible.

Chart.JS Structure

Chart.js consists of a core file — named Chart.Core.js — and several others, with one for each type of chart supported. I decided to copy and extend the polar area chart, since it allows for pie-like segments that have different heights (though each segment has the same width).

Each of the chart type files has the same general structure, but i’ll discuss the Chart.PolarArea.js file here.

At the top of the file, a defaultConfig object provides defaults to be used for the chart’s display, including whether a scale is shown, whether animations are used, and the specifics of these features.

The majority of the code is contained within an extension of the Chart.Type object, which is the base class for all of the chart types. For a polar area chart, this contains an array of segments (the SegmentArc type), and set-up code for displaying a scale and tooltips.

The key functions in the PolarArea chart type are:

  • addData(), through which incoming data is passed from the calling function, and parsed into individual SegmentArc objects.
  • draw(), which is (somewhat obviously) called to draw the chart. When an animation is used for the chart, this function is called repeatedly, with the size of each segment gradually increasing as a result of a call to an easing function.

Each call to draw() adjusts the size of each segment by changing the start and end angle of the segment, and then the draw() function for each segment is called. This draw function is implemented in the Chart.Arc type in the core file and is common to pie charts and polar area charts.

Writing the Spie Chart

The basis of my spie chart is the existing polar area chart, modified to support variable width segments and slices. The Chart.Core.js file remains un-altered.

To begin, I updated the expected input format of the chart to allow for each of these changes.

The input format looked like this for a given segment:

But now looks like this:

I created a new Chart.Arc type, called SliceArc, and updated the addData() function to parse the incoming slices into this object (meaning the SegmentArc object contains many SliceArc objects).

The draw() function is significantly updated, with the main change being that each slice needs to be updated as the chart is drawn, rather than each segment (essentially, more looping).

The SegmentArc.draw() function is overridden, and now draws individual slices in a segment rather than the whole segment in one call. (I think this logic should be moved to the SliceArc call itself, which may make it possible to call the original Arc.draw() function without modifying it)

Tooltips were particuarly challenging to fix, because the existing inRange() and tooltipPosition() functions are very much designed around the location and position of the segment, and not slices. To avoid modifying the showTooltip() function (which is relatively large and in Chart.Core.js [called here]), I pass a slice object and make it look like a segment object (this is more than a little hacky).

Both share the same base class, so this in some ways reflects the extensible design of chart.js, but it involved adding some state to the slices that I don’t think would have otherwise been duplicated from the segment object.

Finally, I updated the tooltips template, which now iterates through slices as well as segments, and highlights slices on hover rather than segments (an implementation of which is shown in the spie-enhanced.html example).

Thoughts on Solution

Now that I have a better understanding of the structure of the codebase, I think I could do a better job modularizing certain functions and re-using existing core components. That said, i’m impressed how easy it was to create the new chart, and how little new code had to be written.

I thought about trying to allow the chart to behave like a multi-level pie chart, but there are already solutions for this, and adding an extra dimension would make things even messier.

Creating a Java REST API With Apache Jersey

In our latest development cycle we’ve been working on creating an official API to manage and control AetherStore. As part of this process I’ve been experimenting with Jersey 2, the Java reference implementation for JAX-RS, the Java API for RESTful Web Services.

This post discusses an example API using Jersey. The example is itself fairly well commented, explaining why certain pieces of code are needed, and how they relate to the project. This post covers the structure of the project and discusses some of its more interesting features.

The example is of an API representing a set, which allows:

  • Strings to be added as part of a GET request parameter
  • Strings to be added as part of a PUT request body.
  • The set of all strings stored to be returned.

If you’re using Jersey, it’s important to note that a lot of examples online use Jersey v1, which causes problems because this version of Jersey uses an entirely different namespace — v1 usescom.sun.jersey, whereas v2 uses org.glassfish.jersey, and a number of classes are either named differently, or are in different sub-packages.

The code on GitHub should work straight out of the box if you’re using Eclipse for Java EE.

Areas Covered

The code is useful if you’re interested in one of the following features in relation to jersey:

  • Running in eclipse.
  • Setting up appropriate Jersey Maven dependencies.
  • Setting up your web.xml to work with Jersey.
  • Creating a basic API.
  • Using JSON to wrap requests and responses.
  • Using an exception mapper for more readable error handling.
  • Injecting dependencies into resource classes (the API classes).
  • Unit testing Jersey.
  • Mocking calls used by our Jersey API.
  • Unmarshalling responses from a Jersey API.

Reading the Code

This section describes how the code is structured at a high level. There are some comments on specific lines of code, but I’d recommend looking at the comments in the code for a closer look at individual features.

The Apache Jersey Core

At the core of a Jersey application is the pom.xml file, which specifies all of the dependencies in the code — including Jersey itself — and the versions being used. If you’re using another example, it’s important to note the version of Jersey you’re using. Here we are using Jersey 2.6:

The web.xml file specifies how your servlet is named, and where the main Jersey Application class is. This Application class is used to start your Jersey servlet.

In this example our application class is called SetApplication. It registers bindings for a few classes, which we’ll discuss later. At this point the most relevant call to our application is:

This tells the servlet container where to look for the resource classes that form the API. In this case our resource class is called SetResource.

SetResource (API) Class

To recap, this project implements an API which supports two calls to add a string to a set (/set/add/<value> and /set/add), and a single call to get all entries in the set (/set/get). The class definition for this class annotated (shown below), to set the base path of the API call to be /set:

Then the methods in this class are further annotated to describe the API calls under this path. For example, the following code is executed when a /set/add/{value} request is made.

The full call to this method will be /set/add/{value}, where value is a variable that is mapped to the value parameter of the method as a result of the @PathParam annotation. We want the call to consume and produce a JSON response, so we specify this in the @Produces annotation. The marshalling to JSON is handled automatically, but we have to specify the marshaller dependency in thepom.xml file, as follows:

There are two methods which include the /add path, but they accept different numbers of inputs so there is no conflict. The class also contains an @Inject annotation, which tells the container to inject a dependency into the callHandler field. If we look back at the SetApplication class, we can see that this value is injected by registering it with the application through the register() call:

Unit Tests

I’ve written two unit test classes. One, SetApiTests provides what are essentially end-to-end integration tests, which call the API and check that its operations perform as expected. The second,MockedSetApiTests provides an example of using mocking to test just the API calls themselves. Both test classes extend JerseyTest, which handles the heavy lifting of setting up a servlet container and providing the API. To run correctly, JerseyTest requires a test framework provider, which is the servlet container used to run the test. In this example I’ve used the jetty container, where the dependency is specified in the pom.xml class with the following code:

In the MockedSetApiTests class, the SetCallHandler, which manages the logic behind the API call (and is injected), is mocked out:

The configure call is required by JerseyTest to properly configure the application, which in this case requires us to pass the mocked dependency so that it can be injected into the SetResource. In the tests themselves, the call to the API is relatively simple:

This makes the API call and returns the result (whether it is a success or failure to theresponseWrapper). This can then be queried to establish whether the call was successful (by getting the HTTP response code):

If successful, we can then obtain the returned value:

The test addMultipleSingleCall shows an example of an API request with a message body (in this case a set of Strings), also showing how to package up this request parameter in the test:

The examples in this post don’t include any parameters that are non-standard Java types, but doing so is relatively simple. By default only the public fields in the class are serialized, and a default constructor is required, but the class doesn’t have to implement serializable.

How To Run

To run this example in Eclipse for Java EE:

  1. Download / clone the code from GitHub.
  2. In Eclispe, go to File -> New -> Java Project
  3. Untick ‘Use default location‘ and navigate to the path of the jersey2-example repository.
  4. Re-tick the ‘Use default location‘ option, which sets up the project name as jersey2-example.
  5. Click finish to create the project.
  6. To run, either run the unit tests in JUnit, or right-click on the project and select Run as -> Run on Server.

If you want to create your own eclipse project, you can follow this example, but note that this is for Jersey v1, so you need to adjust the options used in steps 3 and 5 (i’d compare them to the example in my GitHub repo).

To run standalone (the previous steps are required):

  1. Right-click on the project, Export -> WAR File.
  2. Set the location to store the WAR file, and change the specified server runtime if necessary.
  3. Run the WAR in your favorite application server, or standalone with Jetty Runner.

Additional Resources

The following links are resources I found useful in writing this example. Where the examples refer to v1 of Jersey I’ve said so — examples are included because some part of them is useful, but be careful to note places where v1 specific code is used. This includes anywhere where a com.sun.jersey namespace is used.

This is a repost of a blog I wrote over on the AetherWorks Blog earlier this year.

CrossRef++ (A Microsoft Word Add-in)

This is a Word 2010 that replaces Word’s in-built ‘add cross-reference’ tool. Why? — because it has annoyed me greatly whilst writing my thesis!

Installer (EXE Version).

Source code as ZIP, on Github.

Please note that this only works in Word 2010, not on any earlier versions.

Why this is ‘needed’

The standard word tool (pictured below) quickly becomes tedious to use in large documents for a number of reasons:

  • You have to switch between references for things (like figures and numbered items) constantly, and you have to use a drop-down box to do this.
  • It doesn’t remember your last referenced item, so if you’re constantly referencing a figure that is two pages down the list, you constantly have to scroll down to that reference.
  • It doesn’t remember the size of your reference box, so even if you have a massive monitor, you can only ever use a tiny fraction of it to search for references.

What CrossRef++ Does Differently

  • References are displayed in a task pane, which typically stretches the length of a screen, but can be moved around as well.
  • It remembers (roughly) where your last used reference was for each type (figure, numbered item, etc.), so you don’t have to scroll as much as before.
  • It provides a few big buttons to at the top to change reference type, which makes it quicker to change.

What it Doesn’t Yet Do

  • It doesn’t support all types of references (for endnotes and other things you need to use the old tool).
  • It doesn’t handle re-sizing the task pane well.
  • It doesn’t allow you to search through references, though I’d like to do that eventually.

Using Windows Management Instrumentation (WMI) in .NET

A few weeks ago I posted an article about performance counters in .NET, which discussed how to get measurements on things such as CPU and memory utilization. Perhaps the biggest limitation of this class is the inability to get static information about the machine on which the program is running. For example, what if you want to find out how many processors a machine has, and the speed of each? Or if you need to get the IP address associated with a network interface?

Windows Management Instrumentation comprises a set of classes which allow you to query the system on anything from processors, to printers, to keyboard drivers.Essentially everything installed on the system can be queried as long as you know the correct command. This post provides some examples of WMI being used and links to the relevant MSDN sources which state what can be obtained in full.

Note: For the following examples to work make sure you import System.Management in your Visual Studio project.

MAKING A QUERY

In WMI all results are accesed through SQL-like queries. For example, to find how much phsyical memory a computer has a query object is created, as follows:

For a list of the attributes that can be selected from this look here.

Now we have a query object we can execute a query using ManagementObjectSearcher:

You can iterate over this collection as you would any other. The following code just prints out the amount of phyiscal memory on each piece of memory.

The ManagementObject has a number of properties (listed on the MSDN page) which can be retrieved. In this example the conversion to an integer value is completely unnecessary, but I’ve left it in because it becomes important when storing the value, rather than just printing it out.

WHAT QUERIES ARE POSSIBLE?

To see what classes can be queried have a look at the left of this MSDN page. Any part of the local configuration can be queried, making WMI a very powerful tool. However none of this is particularly efficient – you’ll notice a delay when executing some queries, so make sure you only do what’s necessary and cache the results.

If you’re looking for more dynamic results (such as CPU Utilization) I’d recommend looking at my post on Performance Counters.

.NET Performance Counters

This post discusses .NET Performance counters and their use in monitoring resource utilization.

.NET provides various classes for monitoring the resource utilization of the CPU, memory, disk and network cards. Sadly, the documentation is particularly vague with respect to what can be monitored and how this can be done. This article describes basic use of performance counters and gives a sample class (see end of text) which displays some of these on the command line.

PERFORMANCE COUNTERS
The PerformanceCounter class is used as the basis for all monitoring. This can be set up as follows:

This constructor takes three parameters:

  • categoryName refers to the type of resource to be monitored. The following values are permissible on all machines (you can also create your own custom counters):
    • Processor
    • System
    • Thread
    • Memory
    • LogicalDisk / PhysicalDisk
    • Network Interface
  • counterName is the name of a specific counter within that category. For example, if monitoring the processor % Processor Time specifies that we want to look at the percentage of time the processor was busy during the sample period. The following pages list counters for each of the resources provided by Microsoft:
  • InstanceName specifies the instance of the resource being monitored. For example, when monitoring disk utilization you may want to specify which drive is to be monitored. Alternatively, to monitor all instances just enter _Total.

SOME EXAMPLES

Measures the percentage of time the processor was busy during the sample period.

Shows the amount of physical memory available to the processor (in megabytes).

Shows the percentage of free disk space (over all logical disks).

An estimate of the current bandwidth of the network interface. The instanceName in this case must be the name of the network card being monitored.

If you don’t know the name of your network card the following code may be helpful. It returns a string array of all the network cards on your local machine:

The same method can be used to find the names of logical disks or other devices on the local machine – just replace the Network Interface string with the category of your choice.

OBTAINING MEASUREMENTS
Now that we’ve instantiated our monitors we need some method of obtaining samples. For example, the following code extracts a sample from our processor counter:

There are a number of other fields and methods that may be of interest. To get a human readable description of the counter being used use the CounterHelp field:

FURTHER READING
This article covers the basics of the PerformanceCounter class. To see it used in practise check out some of these examples: