Alternative GoPro File Importer

I’ve been obsessed with my GoPro ever since I bought one last year, but I was always a little frustrated with the process of importing and organizing files. The result is this little side project: an application that lets you import GoPro files with a little more control than the default application offers.

Download GoPro Organizer

More specifically, the program allows you to:

– Sort media by type so that photos, videos, and timelapses are in separate directories.
– Choose whether files are stored by date taken or in a single directory.
– Rename files to the time they were taken (and choose the time format used).
– ‘Import’ files from any directory structure, allowing you to move your existing imported GoPro files into a different format.

GoPro In Action
A gratuitous example of a photo, post-import.

GoPro Studio separates timelapses into different directories but doesn’t do the same for photos, which makes it more difficult to create movies when you have to sort through a set of photos.

GoPro files follow a naming convention that allows the importer to differentiate between photos and timelapses (amongst other things), but it isn’t particularly helpful when you’re browsing a directory of photos. My application allows you to rename the files as they are imported, using the time they were taken as the new name of the file.

If you’ve already imported files from your GoPro’s SD card, the organizer allows ‘imports’ from any location and directory structure (provided the file names have not been changed), making it easy to re-sort files into a different format.

This solves a problem for me, and I hope it does for you too! If you like it or have thoughts about what else it could do, please let me know in the comments.

Download GoPro Organizer

Switching to HTTPS with Let’s Encrypt

One of my more trivial resolutions for this year was to switch to using HTTPS with Let’s Encrypt, a free certificate authority that recently entered open beta. It ended up being extremely easy, so I’d recommend it to anyone looking to make the switch.

How To Set Up

The set-up process was fairly trivial for me, running Ubuntu 14.04 with Apache 2.4. Their set-up script requires Python 2.7 or above, which caused issues for me on another system.

The first step is to follow the initial instructions on their website, to clone the repo and run the letsencrypt command to download required dependencies:
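
At the time of writing, that looks something like this:

    git clone https://github.com/letsencrypt/letsencrypt
    cd letsencrypt
    ./letsencrypt-auto --help

The first run of letsencrypt-auto installs the dependencies it needs.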

I couldn’t use the automated apache installer with my system, but the standalone mode was just as simple.

First, you have to stop your web server, to allow the client to run an automatic domain validation tool (using the ACME protocol, if you’re interested). Next, run this command (changing the domain):
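
With Apache on Ubuntu, that is roughly:

    # Stop the web server so the standalone validator can bind to its ports.
    sudo service apache2 stop

    ./letsencrypt-auto certonly --standalone -d example.com -d www.example.com

    # Start the web server again once the certificate has been issued.
    sudo service apache2 start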

This took less than a minute, and my web server was up and running again. From there I updated my Apache configuration; the important part looks something like this (domain swapped for a generic example):
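
    <VirtualHost *:443>
        ServerName example.com
        DocumentRoot /var/www/html

        SSLEngine on
        SSLCertificateFile      /etc/letsencrypt/live/example.com/cert.pem
        SSLCertificateKeyFile   /etc/letsencrypt/live/example.com/privkey.pem
        SSLCertificateChainFile /etc/letsencrypt/live/example.com/chain.pem
    </VirtualHost>

The issued certificates live under /etc/letsencrypt/live/<your domain>/.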

And that’s it!


Fixing Common PowerMock Problems

PowerMock is a great tool for static / final mocking, but exhibits odd behavior in some cases. This post covers three of the most common PowerMock problems I’ve encountered.

PowerMock
A test class annotated for PowerMock.

Intro to PowerMock

PowerMock is useful in cases that a tool like Mockito can’t handle — namely mocking final classes and static methods.

There are many reasons why you shouldn’t aspire to mocking static methods, but sometimes it can’t be avoided.

To mock with PowerMock, you have to use the @PrepareForTest annotation, which sets up the class being mocked (it dynamically loads another class to wrap it). You can also use the @PowerMockIgnore annotation to specify classes that should be left to the system classloader rather than loaded by PowerMock.
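
As a quick illustration (the class here is made up), a test that mocks a static method looks something like this:

    import static org.junit.Assert.assertTrue;

    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.powermock.api.mockito.PowerMockito;
    import org.powermock.core.classloader.annotations.PrepareForTest;
    import org.powermock.modules.junit4.PowerMockRunner;

    @RunWith(PowerMockRunner.class)
    @PrepareForTest(StaticUtility.class) // the class whose static method we want to mock
    public class StaticUtilityTest {

        @Test
        public void staticMethodCanBeMocked() {
            // Replace all of StaticUtility's static methods with mocks.
            PowerMockito.mockStatic(StaticUtility.class);
            PowerMockito.when(StaticUtility.isEnabled()).thenReturn(true);

            assertTrue(StaticUtility.isEnabled());
        }
    }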

This post includes some examples of PowerMock in action, which you can find here.

Common PowerMock Problems

Mocking Core Java Classes

By default, a lot of Java system classes are ignored by PowerMock, meaning you can’t mock them the way you would any other class.

To show what happens when you do this type of mock, consider this example using the java.net.NetworkInterface class.

I have one other class in the test (creatively called OtherClass), whose call method takes a NetworkInterface object, calls its isUp() method, and returns the result:
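
Roughly:

    import java.net.NetworkInterface;
    import java.net.SocketException;

    public class OtherClass {
        // Returns whether the given interface is up.
        public boolean call(NetworkInterface networkInterface) throws SocketException {
            return networkInterface.isUp();
        }
    }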

My first test shows that the mock is created successfully, by calling the isUp() method from within the test method (code):
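
In outline (imports as in the earlier example):

    @RunWith(PowerMockRunner.class)
    @PrepareForTest(NetworkInterface.class)
    public class NetworkInterfaceTest {

        @Test
        public void mockWorksWithinTestMethod() throws Exception {
            NetworkInterface mockInterface = PowerMockito.mock(NetworkInterface.class);
            PowerMockito.when(mockInterface.isUp()).thenReturn(true);

            assertTrue(mockInterface.isUp()); // the mocked call behaves as expected
        }
    }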

This test passes, but if you try to use this same mock in a different class, it won’t work (code):
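
Something like:

    @Test
    public void mockFailsWhenUsedByOtherClass() throws Exception {
        NetworkInterface mockInterface = PowerMockito.mock(NetworkInterface.class);
        PowerMockito.when(mockInterface.isUp()).thenReturn(true);

        // Fails: OtherClass sees the real, un-mocked isUp() behaviour.
        assertTrue(new OtherClass().call(mockInterface));
    }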

The NetworkInterface class, as a standard Java class, is ignored by PowerMock, so the call to isUp() doesn’t result in the expected mocked call.

To fix this, you need to @PrepareForTest the class calling NetworkInterface#isUp(), OtherClass, rather than the class being mocked (code):
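
The change is just in the annotation:

    @RunWith(PowerMockRunner.class)
    @PrepareForTest(OtherClass.class) // prepare the caller, not the system class
    public class OtherClassTest {
        // ... same test as above ...
    }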

The test now passes, and the mock is successfully executed.

@PrepareForTest disables code coverage tools for the prepared class, which is hard to fix. The only way around this problem is not to mock Java system classes at all, which you could do by wrapping the system class and all calls to it.
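
A wrapper in this style (a sketch, not the code from the linked examples) gives you an ordinary class that a tool like Mockito can mock without any PowerMock involvement:

    import java.net.NetworkInterface;
    import java.net.SocketException;

    // A thin, mockable wrapper around the system class.
    public class NetworkInterfaceWrapper {
        private final NetworkInterface networkInterface;

        public NetworkInterfaceWrapper(NetworkInterface networkInterface) {
            this.networkInterface = networkInterface;
        }

        public boolean isUp() throws SocketException {
            return networkInterface.isUp();
        }
    }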

OutOfMemory PermGen Errors

PowerMock seems prone to the ‘java.lang.OutOfMemoryError: PermGen space’ error, because it does so much dynamic class loading. There are some useful blog posts on this (1), but they tend to be a little too optimistic about this problem being completely fixed.

There are a number of ways to fix a PermGen error:

  1. Increase your PermGen size (see the sketch after this list). Since it’s only an issue with tests, this may not be such a big deal (2)
  2. Pool mocks to limit the volume of classloading (3)
  3. Avoid using mocks entirely, instead using anonymous test classes (3)
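
For option 1, if you run your tests through Maven and Surefire, the increase can go in the test runner’s JVM arguments; something like:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <configuration>
        <!-- Give the test JVM a larger PermGen (Java 7 and earlier). -->
        <argLine>-XX:MaxPermSize=256m</argLine>
      </configuration>
    </plugin>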

Test Class Doesn’t Run: java.lang.VerifyError

I had this issue using PowerMock in unit test classes that had a superclass.

It happens because Oracle made Java’s bytecode checking stricter in Java 7, so it can be fixed by disabling this checking with -noverify as a JVM argument.

If you’re not the kind of person to take that as an acceptable answer (good call), I fixed this issue by updating my version of the JDK (in my case from 7u72 to 7u79). It’s not clear that one version of the JDK definitively fixed this problem (people seem to report it as being fixed before 7u72, for example), but it’s worth updating the JDK and your version of PowerMock before resorting to -noverify.
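
If you do end up resorting to it, -noverify can be passed to the test JVM in the same way as the PermGen setting above (again assuming Maven and Surefire):

    <configuration>
      <!-- Last resort: disable bytecode verification in the test JVM. -->
      <argLine>-noverify</argLine>
    </configuration>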

Other Resources

Continuous Integration for GitHub Java Projects

This post discusses how to set up a Java GitHub project with code coverage and continuous integration that runs automatically on each commit.

Working on the WordBrain Solver I wrote about last month, I thought it’d be interesting to set the project up to do this, along with a few extra reporting tasks. The project now covers:

  • Build. Managed by Maven.
  • Code coverage. Instrumented by JaCoCo.
  • Reporting. FindBugs and PMD reports.
  • Continuous integration. Every push automatically triggers a build on Travis CI.
  • Code coverage reports. Published on Coveralls.

The result is a set of badges displaying the state of the project on GitHub:

Result Badges

Maven

The majority of the setup for all of these tools is in the project’s pom.xml file.

This is broken into three key sections:

  • Dependencies, which includes each of the libraries used in the project. Libraries that are only used in testing are given the test scope, meaning they are not included in the release of the project.
  • Build, which describes where the code is located and how to build it. This includes the plugins used to run unit tests and instrument code coverage.
  • Reporting, which includes plugins that are used to generate analysis or reports of the code. This includes the generation of Javadoc, and plugins such as PMD and FindBugs.
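
In skeleton form, that looks something like this (versions and the full plugin lists trimmed):

    <project>
      <dependencies>
        <!-- Test-only libraries get the test scope. -->
        <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>4.12</version>
          <scope>test</scope>
        </dependency>
      </dependencies>

      <build>
        <plugins>
          <!-- surefire (unit tests), jacoco (coverage), ... -->
        </plugins>
      </build>

      <reporting>
        <plugins>
          <!-- findbugs, pmd, javadoc, ... -->
        </plugins>
      </reporting>
    </project>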

Instrumenting Code Coverage

Code coverage is included in the build section of the pom. The code coverage tool I’m using, JaCoCo, is executed in two phases:

The pre-unit-test phase attaches the JaCoCo agent to the JVM used to run the unit tests. In this execution, the surefireArgLine property is used to add the agent to the Surefire test runner as an extra JVM parameter.

The post-unit-test phase runs after the unit tests have completed, analyzing the JaCoCo output to produce a nice HTML report.
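
The relevant part of the pom looks something like this (a sketch of a standard JaCoCo setup, with version numbers and file paths trimmed):

    <plugin>
      <groupId>org.jacoco</groupId>
      <artifactId>jacoco-maven-plugin</artifactId>
      <executions>
        <execution>
          <id>pre-unit-test</id>
          <goals>
            <goal>prepare-agent</goal>
          </goals>
          <configuration>
            <!-- Exposes the agent's JVM argument as ${surefireArgLine}. -->
            <propertyName>surefireArgLine</propertyName>
          </configuration>
        </execution>
        <execution>
          <id>post-unit-test</id>
          <phase>test</phase>
          <goals>
            <goal>report</goal>
          </goals>
          <configuration>
            <outputDirectory>${project.build.directory}/jacoco-ut</outputDirectory>
          </configuration>
        </execution>
      </executions>
    </plugin>

Surefire then picks the agent up by including ${surefireArgLine} in its own argLine configuration.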

To generate code coverage results (created by default in target/jacoco-ut), run the test command:
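
    mvn test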

Reporting: FindBugs and PMD

Published Results

While the code coverage tools form part of the Maven build process, the Maven reporting process includes plugins such as FindBugs and PMD.

In this pom, I added reporting for: JDepend, FindBugs, PMD, JavaDoc, and Surefire.

Adding them is as simple as adding the Maven dependency, and running them is as simple as:
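
    mvn site

The generated reports end up under target/site.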

Continuous Integration: Travis CI

Travis CI Page

Travis CI is free for open source projects, so it’s a great tool to use if you have a project like this one. All it requires is a .travis.yml file specifying what the server should run (and signing up to the website):
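
For a Maven Java project, very little is needed; mine is along these lines:

    language: java
    jdk:
      - oraclejdk8
    script: mvn clean test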

If this file is set up correctly, you shouldn’t need anything else in your project. Travis automatically builds every time you push to your main branch.

Coverage Reporting: Coveralls

Coveralls Site

Coveralls automatically accepts coverage reports from Travis CI and displays them in a nice interface.

As with Travis, it’s free for open source projects, and as with Travis, it doesn’t require much configuration.

Once you’re signed up, the only addition to the project is a maven build plugin:
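
    <plugin>
      <groupId>org.eluder.coveralls</groupId>
      <artifactId>coveralls-maven-plugin</artifactId>
      <!-- The current version at the time of writing. -->
      <version>4.1.0</version>
    </plugin>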

Pushing updates from your local machine or another build server requires you to enter an authentication token here, but if you’re using Travis you don’t need any additional setup. The following line in the .travis.yml file creates and sends the coverage report with no additional configuration:
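
    after_success: mvn clean test jacoco:report coveralls:report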

Results

Result Badges

The result of this configuration is a pair of badges showing the current state of the build and the code coverage.

More importantly, it sets the project up for future work with a great set of tools for monitoring the state of development.

Are You A Good Estimator?

Producing accurate estimations for software projects is notoriously challenging, but why? It all starts with understanding what it takes to make a good estimate.

What is a good estimate?

An estimate is an approximation of something, implying that it is based on uncertainty. Clearly a good estimate is accurate, but since this isn’t always possible, it’s more useful if it at least encodes how uncertain we are.

If I say that a project will be completed in 4 months, I am removing an important piece of information: my confidence in the estimate. It’s unlikely that the project will take exactly 4 months, but is it a low-risk project which might take between 3 and 5 months, or is it based on so many unknowns that it could take over a year? A narrow range doesn’t make an estimate more useful if it is based on little to no understanding of the problem.

This is the point made by Steve McConnell in “Software Estimation: Demystifying the Black Art”, where he argues that the illusion of accuracy can be more dangerous for project estimation than a wide estimate. If we can acknowledge that the estimate is not solid, then we can at least start to improve our knowledge of the problem and begin to make it more accurate.

“Estimates don’t need to be perfectly accurate as much as they need to be useful.” – Steve McConnell.

How good are your estimates?

Perhaps unsurprisingly, most people overestimate their own ability to make accurate estimates.

To show this, McConnell provides a test (which you can try for yourself here), where you have to answer 10 estimation questions, giving for each a range that you are 90% confident contains the correct answer.

Try it, and come back here. How did you do?

Very few people answer these questions with 90% confidence, partly because we are conditioned to believe that a good estimate is a narrow estimate.

In fact, a lot of the comments on the answers page argue that the questions are poor, because you’d have to be an expert to produce any meaningful (accurate, narrow) estimates. But this is precisely the point!

If you can answer with 90% confidence, but with a very wide range, then you are at least acknowledging that you don’t have enough knowledge to accurately answer the question.

And that’s the first step to fixing the problem.

This is a repost of a blog I wrote over on the AetherWorks Blog earlier this year.

How Zero-Conf Works

When you connect a printer to your local network, how does your computer find it?

It has to be addressable, which means it needs an IP address (and ideally a hostname), and it needs to be discoverable so that you can find it from your computer. When that works, you see a window like this:

Searching for Devices

These tasks are covered by the zero-configuration protocol, which describes how to make this work even when DHCP and DNS servers, which assign IP addresses and hostnames, are not available.

The zero-conf specification covers three broad goals, which I discuss in this post:

  1. Allow devices to obtain an IP address even in the absence of a DHCP server.
  2. Allow devices to obtain a hostname, even in the absence of a DNS server.
  3. Allow devices to advertise and search for (discover) services on the local link.

How Zero-Conf Works

1. Obtaining an IP Address

For a device to route messages on a network, it needs an IP address. This is relatively trivial if a DHCP server is available, but when this isn’t the case, zero-conf uses link-local addressing to obtain one.

The address assigned by link-local addressing is in the 169.254.0.0/16 block, which is only useful within the link (because routers won’t forward packets from devices in this address space[1]).

To obtain an address the device sends ARP requests to establish whether its desired IP address (chosen at random from the 169.254 range[2]) is available. This is a two-step process:

  1. First, an ARP probe is sent asking for the MAC address of the machine with a given IP. This probe is sent a number of times to confirm that the given IP address is not in use (there will be no response if it isn’t).
  2. If no reply is received, the device sends out an ARP announcement saying that it is now the machine with the given IP.

There are various protocols for conflict resolution of addresses that I won’t discuss here[3].

2. Obtaining a Hostname

Once a device has an IP address it can be contacted through the network, but IP addresses are volatile, and for the device to be consistently discoverable it needs a hostname (which is less likely to change). Assigning a hostname is simple if you have a DNS server, but, if you don’t, zero-conf can assign hostnames using Multicast DNS.

Multicast DNS (mDNS) uses IP multicast to send broadcasts on the local link (something we wrote about recently).

To claim a local hostname with mDNS, a device sends DNS messages to probe for the uniqueness of the hostname[4]. Three queries are sent in 250ms intervals, and if no device reports using this hostname, the requesting device then sends an announce message to claim ownership.

In the case of a conflict (where two devices believe they own the same hostname), lexicographic ordering is used to determine a winner.

mDNS hostnames must use a local top-level domain, rather than a global one (.com, .org, .gov, etc.), to distinguish them from globally accessible hosts. Apple devices and many others use the .local domain.

3. Browsing for Services

Once a device has an IP address and hostname, it can be contacted, but only if we know the name of the specific device we are looking for. If we don’t, DNS Service Discovery (also known as DNS-SD) can be used to search for services available on the local link. Rather than looking for a printer called ‘jose’, we can look for all devices supporting a print protocol (and then select ‘jose,’ if available).

What this looks like

Services are advertised in the form:

ServiceType.Domain

For example, _ipp.example.com advertises devices supporting the Internet Printing Protocol in the example.com domain.

Individual services are identified by a specific instance name:

InstanceName.ServiceType.Domain

So our printer, ‘jose,’ would be identified as:

jose._ipp._tcp.local[5]

How this works

DNS-SD uses mDNS to announce the presence of services. To achieve this, it uses DNS PTR and SRV records.

PTR records (pointer records) are used in lookups[6] to search for a specific service type (_ipp._tcp.local in the above example) and return the name of the SRV record for each service supporting the specified protocol (there is one PTR record for each SRV record).

A query for _ipp._tcp.local will return jose._ipp._tcp.local and all other printers supporting the IPP protocol in the local domain.

SRV records (service records) record the protocol a service supports and its address. If the PTR record is used to find devices supporting a protocol, the SRV record is used to find a specific device’s hostname. For the printer ‘jose,’ the SRV record would contain the service type and domain, and the hostname for the printer itself:

_ipp._tcp.local | jose.local

At this point we have discovered and can connect to our printer.
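
If you’re on a Mac, you can watch this exchange with the bundled dns-sd tool; for example:

    # Browse for printers supporting IPP on the local link (a PTR query).
    dns-sd -B _ipp._tcp local.

    # Resolve a specific instance to its hostname and port (an SRV query).
    dns-sd -L jose _ipp._tcp local.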

In addition to this, there are various extensions to zero-conf that I don’t describe here. These include:

  • TXT records, which allow extra attributes to be recorded in the service announcement (for example, extra information needed to make a proper connection to a service).
  • Subtypes, which allow the same service to advertise different levels or types of support within a service.
  • Flagship Service Types, which enable applications to determine when the same device is supporting and announcing multiple protocols. This makes it possible to remove duplicate entries in a listing, where a device supports multiple protocols that perform the same function.

Implementations

The most commonly used implementation of zero-conf (or at least the most written about) is Apple’s Bonjour.

We have used this library as the basis for our own implementation, linking in with the provided dns_sd.jar wrapper. There are various native implementations that I haven’t yet tried.
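
To give a flavour of what service browsing looks like in code, here’s a minimal sketch using JmDNS, one of the pure-Java implementations (so, to be clear, not the dns_sd.jar wrapper we use):

    import java.net.InetAddress;

    import javax.jmdns.JmDNS;
    import javax.jmdns.ServiceEvent;
    import javax.jmdns.ServiceListener;

    public class PrinterBrowser {
        public static void main(String[] args) throws Exception {
            JmDNS jmdns = JmDNS.create(InetAddress.getLocalHost());

            // Browse for IPP printers on the local link (a PTR query under the hood).
            jmdns.addServiceListener("_ipp._tcp.local.", new ServiceListener() {
                public void serviceAdded(ServiceEvent event) {
                    // Ask for the SRV/TXT details behind the PTR record.
                    event.getDNS().requestServiceInfo(event.getType(), event.getName());
                }

                public void serviceRemoved(ServiceEvent event) {
                    System.out.println("Removed: " + event.getName());
                }

                public void serviceResolved(ServiceEvent event) {
                    System.out.println("Resolved: " + event.getInfo());
                }
            });

            Thread.sleep(30000); // browse for 30 seconds
            jmdns.close();
        }
    }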

If you’d like to read more on zero-configuration, I’d recommend the O’Reilly Zero Configuration Networking book, which provides an exhaustive description of everything I’ve touched on in this post.

Other sources are included below:


[1] This is good because it stops local information from potentially clogging upstream networks.

[2] Unless it has already been on this network, in which case it uses its previously chosen address.

[3] Edgar Danielyan discusses this in an article in the Internet Protocol Journal.

[4] This is the same process as with regular unicast DNS, but with some optimizations. For example, individual queries can include multiple requests and can receive multiple responses, to limit the number of multicast requests being made.

Unicast DNS packets are limited to 512 bytes, whereas mDNS packets can be up to 9,000 bytes (though you can only transmit 1,472 bytes without fragmenting the message into multiple Ethernet packets).

[5] The _tcp line specifies the service runs over TCP (rather than UDP).

[6] PTR records are similar to CNAME records but they only return the name, rather than performing further processing (source).

This is a repost of a blog I wrote over on the AetherWorks Blog earlier this year.

Eclipse Cheatsheet

I created a cheat sheet for the Eclipse IDE, which you can find here: eclipse cheat sheet [Google Docs]

It’s intended as an accompaniment to a refresher talk I’m giving to the incoming JH class on using Eclipse. They’ve been using it on and off for two years now, so the aim is to cover some of the features they might not know about rather than the basics.