Sunday, April 7, 2019

Hour Analysis for GPSTC

When playing around with GPS prototypes, I developed my own analysis software to get around limitations and bugs of commonly used programs like GPS Action Replay (GPSAR; version 5.28) and GPSResults (version 6.170). Eventually, I shared my software ("GPS Speedreader") with a few like-minded folks, so the question came up whether Speedreader is accurate enough for posting to the GPS Team Challenge (GPSTC).

That led to a validation study where I compared the results from more than 50 GPS files, most of them random downloads from ka72.com. The study produced a few observations that might be of interest to anyone who posts sessions to GPSTC. Especially for the "1 hour" category, I discovered a few surprising things.

One of the tracks that stood out was the one you can download from https://www.ka72.com/Track/t/381385

Here's what the speed track looks like if you open it in GPSResults:
The first surprise was that the number GPSResults gave for one hour was lower than the 9.082 knots that ka72.com reported - only 8.466 knots! I checked in GPSAR, and it also reported 8.47 knots. Then I remembered that GPSResults used a minimum speed filter of 5 knots by default. When I disabled it by setting it to 0 knots, GPSResults reported a 1-hour speed of 9.015 knots - much closer to ka72.com. So the first big surprise was:
GPS Action Replay uses a 5 knot minimum speed filter for 1 hour!
I have always used GPSAR for analyzing my files, and never realized that. Even worse, unlike GPSResults, GPSAR does not let you disable the minimum speed setting.

There are different views of the 5-knot minimum filter in speedsurfing. When the older analysis programs were originally written almost 2 decades ago, the GPS Team Challenge did not yet exist. Speedsurfing was about top speed, 5x10 second averages, and 500 meters speed. Results for longer distances, longer times, and total distances were perhaps calculated, but mostly ignored. Furthermore, the GPS units available back then were quite prone to report speeds of 1-2 knots even when stationary. In this context, a 5-knot minimum speed filter makes sense.

But things have changed since then. Some clever Australians developed the GPS Team Challenge to get some team competition going. Having only the established 2 second, 5x10 seconds, and 500 meter disciplines would have given teams with spots like Sandy Point an unfair advantage, so they added disciplines that are better suited for other spots to even things out: alpha 500; the nautical mile (speed over 1852 meters); one hour average speed; and total distance. To win the monthly or yearly ranking, teams must post good results from all disciplines, since the overall score is computed from the ranking in each category. With 6 quite different categories and a requirement for posts from 2 sailors before it counts, it really takes a team to do well!

Besides the increased emphasis on total distance and long-distance speed on the GPSTC, another change happened over time: GPS units became more accurate. Modern GPS units like the Motion GPS rarely report speeds above 0.1 knots when stationary. Taken together, this makes a 5-knot minimum speed filter a very questionable thing indeed. A good one-hour result is above 20 knots, so it requires sailing about 40 kilometers. At most sailing areas, this requires about 20-40 jibes - sometimes even more. Sure, some of the best speedsailors in great conditions can plane through 40 jibes in a row and sail without crashing for an hour, but most of the time, one-hour runs will include jibes where we lose most of our speed, or periods of slogging in lulls. Why on earth should those times not count toward the 1-hour average? In the example above, the slogging time was not very long, but still, including speeds below 5 knots increased the 1-hour average by more than 0.5 knots.
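
To make the effect of such a filter concrete, here is a minimal Python sketch. It assumes the filter counts every sample below the threshold as standing still while the clock keeps running; that is an assumption consistent with the numbers above, not documented behavior of GPSAR or GPSResults. The function name and the sample hour are made up for illustration.

```python
def average_speed(speeds_knots, min_speed_knots=0.0):
    """Average speed over a window of equally spaced speed samples (knots).

    Assumed filter behavior: samples below min_speed_knots add no distance,
    but their time still counts toward the window.
    """
    kept = sum(v for v in speeds_knots if v >= min_speed_knots)
    return kept / len(speeds_knots)


# Hypothetical hour at 1 Hz: 50 minutes at 10 knots, 10 minutes slogging at 3 knots.
hour = [10.0] * 3000 + [3.0] * 600

print(average_speed(hour))       # every sample counts
print(average_speed(hour, 5.0))  # slogging is treated as standing still
```

In this made-up hour, the unfiltered average is about 8.83 knots and the filtered one about 8.33 knots, so ten minutes of slogging cost 0.5 knots.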

My conclusion from this is that I will not use GPSAR again for posting to GPSTC if there is even the slightest chance that my distance numbers matter for the monthly rankings.

But the story does not end here. In the track above, my Speedreader gave an even higher number. It selected a slightly different region, which upon close inspection made more sense. So the suspicion arose that there might be undiscovered bugs in the old code for 1-hour speed calculations. When I looked at more files, I found several examples that supported this conclusion, and even provided hints about why the older algorithms sometimes fail.

For an example, let's look at Boz' track from May 15, 2017 (downloadable at https://www.ka72.com/Track/t/305912). Here's the region GPSResults picked for the best hour:
Boz had his GPS set to a minimum speed of 5 knots, so a lot of points where the speed was lower are missing from the track. The reported 1-hour speed is 8.372 knots, corresponding to a travel distance of 15.5 km. But GPS Speedreader gave a higher speed:
By selecting an hour about 10 minutes further back, Speedreader found a region with a travel distance of 17.3 km - about one nautical mile longer! Accordingly, the 1-hour speed is about one knot higher.

Theoretically, this could be some kind of bug in Speedreader, so let's look at this region in GPSAR. First, here is what GPSAR selected as the fastest 1 hour:
This is roughly the same region that GPSResults picked. But is that really the region where we travel the most distance in one hour? Let's look at the track points table in GPSAR for the first valid point in the region Speedreader picked:
And for the last point:

The difference in accumulated distance is about 17.7 km, traveled in less than one hour. So this region should have been picked for the best hour in GPSAR, too!

So why did GPSResults and GPSAR (and, most likely, ka72.com) pick the wrong regions? Simple answer - because the 1-hour algorithms were never designed for tracks with a large number of missing points. The typical search algorithm will start at every valid point, find the point one hour after it, and calculate the speed for this region. But look at the graph from Speedreader: the best hour starts in the middle of the missing points! So the programs never even consider that an hour could start there. They can handle missing regions at the end of a one-hour run, but not at the start!
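
One possible reading of the search described above can be sketched in Python. This is not the code of GPSResults, GPSAR, or ka72.com. It assumes 1 Hz logging, that each logged point contributes one second of travel at its speed, and that the speed of a candidate region is the distance between the two logged points divided by the time between them. How the real programs treat a region whose end falls in a gap is not known here. The function name and the toy track (with a 10-second window standing in for the hour) are made up.

```python
from bisect import bisect_left


def best_window_forward(points, window_s=3600):
    """points: time-sorted (time_s, speed_knots) pairs, logged points only.

    Returns (best_speed_knots, start_time_s). Only logged points are ever
    tried as the start of a window.
    """
    times = [t for t, _ in points]
    # cum[i] = knot-seconds accumulated by points[0:i] (one second per point)
    cum = [0.0]
    for _, v in points:
        cum.append(cum[-1] + v)
    best = (0.0, None)
    for i, t0 in enumerate(times):
        j = bisect_left(times, t0 + window_s)  # first logged point >= one window later
        if j == len(times):
            break  # no logged point that far ahead
        elapsed = times[j] - t0  # longer than window_s if the end falls in a gap
        speed = (cum[j] - cum[i]) / elapsed
        if speed > best[0]:
            best = (speed, t0)
    return best


# Toy track: 3 s at 6 knots, a gap, 8 s at 10 knots, a gap, 5 s at 6 knots.
track = (
    [(t, 6.0) for t in range(0, 3)]
    + [(t, 10.0) for t in range(8, 16)]
    + [(t, 6.0) for t in range(20, 25)]
)
print(best_window_forward(track, window_s=10))
```

For this toy track, the sketch reports about 6.67 knots for a window starting at t = 8. The fast stretch from t = 8 to t = 15 is bracketed by gaps, and since only logged points can start or end a region, its 80 knot-seconds get spread over 12 seconds instead of 10.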

So why does Speedreader find this region? Mostly by luck! Speedreader was originally designed for comparing two GPS files. But if files sometimes miss some points, that can make the comparison complicated, so Speedreader simply fills in missing areas with points of speed 0. That allows it to consider every "missing point" in large missing regions as a possible start, which is why it succeeds in finding the best hour in this example.
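
The zero-filling idea can be sketched in a few lines of Python. This is an illustration, not Speedreader's actual code: it assumes 1 Hz logging with integer timestamps, and all names and the toy track (with a 10-second window standing in for the hour) are made up.

```python
def fill_gaps_with_zeros(points):
    """points: time-sorted (time_s, speed_knots) pairs with integer timestamps.

    Returns (first_time_s, speeds) with one speed per second from the first
    to the last timestamp, using 0.0 where no point was logged.
    """
    logged = dict(points)
    first, last = points[0][0], points[-1][0]
    return first, [logged.get(t, 0.0) for t in range(first, last + 1)]


def best_window_zero_filled(points, window_s=3600):
    """Slide a fixed window over the zero-filled track, one second at a time."""
    first, speeds = fill_gaps_with_zeros(points)
    if len(speeds) < window_s:
        return 0.0, None
    total = sum(speeds[:window_s])
    best_total, best_start = total, first
    for k in range(1, len(speeds) - window_s + 1):
        # add the second entering the window, drop the one leaving it
        total += speeds[k + window_s - 1] - speeds[k - 1]
        if total > best_total:
            best_total, best_start = total, first + k
    return best_total / window_s, best_start


# Toy track: 3 s at 6 knots, a gap, 8 s at 10 knots, a gap, 5 s at 6 knots.
track = (
    [(t, 6.0) for t in range(0, 3)]
    + [(t, 10.0) for t in range(8, 16)]
    + [(t, 6.0) for t in range(20, 25)]
)
print(best_window_zero_filled(track, window_s=10))
```

For the toy track, the best 10-second window averages 8.0 knots and is reported as starting at t = 6, a second for which the track has no logged point.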

There are other ways to find the best hour without having to add the interpolated points. Perhaps the easiest is to also look for hours going backwards - from every valid end point, calculate the average 1-hour speed towards the front of the file. This would find the best hour in this example.
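
A backward search along these lines could look like the following Python sketch. It assumes 1 Hz logging, that each logged point contributes one second of travel at its speed, and that the window is always a full hour in which unlogged time adds no distance. The function name and the toy track (with a 10-second window standing in for the hour) are made up for illustration.

```python
from bisect import bisect_right


def best_window_backward(points, window_s=3600):
    """points: time-sorted (time_s, speed_knots) pairs, logged points only.

    Returns (best_speed_knots, end_time_s) for windows that end on a logged
    point and reach back window_s seconds, whether or not a point was logged
    at the start of the window.
    """
    times = [t for t, _ in points]
    # cum[i] = knot-seconds accumulated by points[0:i] (one second per point)
    cum = [0.0]
    for _, v in points:
        cum.append(cum[-1] + v)
    best = (0.0, None)
    for j, t_end in enumerate(times):
        if t_end - times[0] + 1 < window_s:
            continue  # window would reach back before the start of the file
        # first logged point inside the window (t_end - window_s, t_end]
        i = bisect_right(times, t_end - window_s)
        speed = (cum[j + 1] - cum[i]) / window_s
        if speed > best[0]:
            best = (speed, t_end)
    return best


# Toy track: 3 s at 6 knots, a gap, 8 s at 10 knots, a gap, 5 s at 6 knots.
track = (
    [(t, 6.0) for t in range(0, 3)]
    + [(t, 10.0) for t in range(8, 16)]
    + [(t, 6.0) for t in range(20, 25)]
)
print(best_window_backward(track, window_s=10))
```

For the toy track, the sketch returns 8.0 knots for the window ending at t = 15, which reaches back to t = 6, inside the first gap.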

But at least until other GPS analysis software has incorporated such a fix, the best thing to do is to set the minimum logging speed to zero. All the examples of incorrect 1-hour results I have found were from files with a non-zero logging speed. Note that a minimum speed above zero can also affect the nautical mile or alpha results: if those happen to have a point with a speed below the threshold, the run may not be counted due to the missing point.