Thursday, January 27, 2022

Measuring Speed Errors: Sampling Frequency

This is the first of several posts where I look at the accuracy of speed measurements and the sources of errors in detail. To get started, let's look at the tracks and speed graph from an impressive speedsurfing session (click on images for a larger version):

This is the track from a recent speedsurfing session in Tasmania, Australia, available at ka72.com. The section used for the top speed runs, right next to a sandbar, is quite short: the blue line in the image is only about 220 m long. But despite the short runs, the speedsurfer hit more than 40 knots in every speed run, and reached a top speed of 44.3 knots.

I am using this session as an example because it has some very rapid speed changes: each speed run took only about 40-50 seconds. With changes this rapid, we must sample the speed often enough to capture them, or our measurements will be off. In signal processing, this issue is described by the Nyquist theorem: sampling rates that are too low for the signal create "aliasing errors".
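To make the Nyquist point concrete, here is a minimal numeric sketch (a synthetic sine wave, not real track data): a speed oscillation at 0.6 Hz - faster than half of a 1 Hz sampling rate - shows up in the 1 Hz samples as a spurious slow oscillation.

```python
import numpy as np

# Minimal aliasing sketch (synthetic data, not a real track):
# a 0.6 Hz oscillation - think of speed wobble from chop - sampled
# at 1 Hz. The Nyquist rate for this signal is 2 * 0.6 = 1.2 Hz,
# so 1 Hz sampling cannot represent it correctly.
t10 = np.arange(0, 10, 0.1)              # 10 Hz time base, 10 seconds
signal = np.sin(2 * np.pi * 0.6 * t10)   # the "true" fast variation

t1 = t10[::10]                           # keep every 10th point -> 1 Hz
samples = signal[::10]

# The 1 Hz samples are indistinguishable from a slow 0.4 Hz wave:
alias = np.sin(2 * np.pi * -0.4 * t1)
print(np.allclose(samples, alias))       # True
```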

A zoom-in on the top 10 second speed run illustrates this:
The blue curve shows the original speed measurements, which were taken ten times per second (at 10 Hz). The red curve used only every 2nd point of the original data, corresponding to measuring the speed at 5 Hz. The green curve used just every 10th point, and shows what the data would look like at 1 Hz (the rate of the Locosys GT-31, the "gold standard" GPS device for many years).
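In code, "measuring at a lower rate" is just subsampling. A sketch, assuming the 10 Hz speeds have already been parsed into an array (the loading step is not shown, and the example values are made up):

```python
import numpy as np

def subsample(speeds_10hz, factor):
    """Simulate a slower GPS by keeping every `factor`-th 10 Hz point."""
    return np.asarray(speeds_10hz)[::factor]

# Made-up 10 Hz excerpt in knots, standing in for a parsed track:
speeds = np.array([41.2, 41.5, 41.9, 42.4, 42.1, 42.8, 43.0, 42.6,
                   42.9, 43.4, 43.1, 42.7, 42.2, 41.8, 41.3, 40.9])

print(subsample(speeds, 2))    # every 2nd point  -> the 5 Hz "red curve"
print(subsample(speeds, 10))   # every 10th point -> the 1 Hz "green curve"
```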

The graph shows clearly that measuring speed just once per second does not capture the details of speeding up and slowing down very well - but how much "aliasing error" do the slower sampling rates introduce? Here is what the calculated top speeds at 1 Hz, 5 Hz, and 10 Hz look like for the categories used in the GPS Team Challenge:

Interestingly, the differences relative to the 10 Hz numbers are quite small: 0.064 and 0.026 knots for the top 2 seconds; about 0.01 knots for the 5x10 second average; and about 0.02 knots or less for the other categories.
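For readers who want to check numbers like these themselves, here is a sketch of the core calculation for the time-based categories: a moving average over the window length, maximized over the run. The nautical mile and alpha 500 also depend on distance and position, so they are left out, and the data below is synthetic rather than from the actual session:

```python
import numpy as np

def best_window_avg(speeds, rate_hz, seconds):
    """Best average speed over any `seconds`-long window.

    speeds: one value per sample at rate_hz, in knots.
    """
    n = int(round(seconds * rate_hz))
    # cumulative-sum trick: window sums in O(len) instead of O(len * n)
    c = np.concatenate(([0.0], np.cumsum(speeds)))
    return (c[n:] - c[:-n]).max() / n

# Synthetic 10 Hz speed run (knots) standing in for real track data:
rng = np.random.default_rng(1)
t = np.arange(0, 60, 0.1)
run = 34 + 10 * np.exp(-((t - 30) / 8) ** 2) + rng.normal(0, 0.3, t.size)

for rate, data in [(10, run), (5, run[::2]), (1, run[::10])]:
    print(f"{rate:2d} Hz: best 2 s = {best_window_avg(data, rate, 2):6.3f} kn,"
          f" best 10 s = {best_window_avg(data, rate, 10):6.3f} kn")
```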

But looking at just one result for each category leaves us a bit at the mercy of chance - perhaps the differences were low for the fastest 2 second run, but larger for other speed runs? So let's have a look at the fastest 5 runs in the 2 second, 10 second, and nautical mile categories:
That's a lot of numbers, but we can just plug them into a spreadsheet, calculate the differences relative to the 10 Hz numbers, and then find the average and maximum differences. Here are the results for the speed session:

The average differences measured for the 5 Hz data range from about 0.06 knots for 2 second runs to 0.002 knots for the nautical mile. The largest observed difference is 0.131 knots for the 4th-fastest 2-second run. Here's a zoom-in of this region (again with blue = 10 Hz, red = 5 Hz, green = 1 Hz):
Basically, most of the 5 Hz points used in this region happened to be higher than the points that were skipped, so the 5 Hz average ended up higher than the 10 Hz average. Over longer periods, point-to-point variations will not consistently show the "up-down-up-down" pattern seen in this region, so it becomes less and less likely that the subsampled points are mostly higher (or mostly lower) than the skipped ones - which is why the observed differences drop for the longer categories.
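The spreadsheet step itself is simple enough to sketch in a few lines; the numbers below are placeholders for illustration, not the actual session values:

```python
import numpy as np

# Top-5 2-second speeds (knots) at each sampling rate.
# Placeholder values - not the real session data.
top5_2s = {
    10: [44.30, 43.90, 43.55, 43.20, 42.95],
     5: [44.28, 43.88, 43.52, 43.33, 42.93],
     1: [44.10, 43.70, 43.40, 43.05, 42.80],
}

ref = np.array(top5_2s[10])
for rate in (5, 1):
    diff = np.abs(np.array(top5_2s[rate]) - ref)
    print(f"2 s @ {rate} Hz: avg diff {diff.mean():.3f} kn, "
          f"max diff {diff.max():.3f} kn")
```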

The numbers above are for just one file. What happens if we look at more GPS tracks - from other spots, other people, other units? To find out, I repeated the analysis above with a total of 10 files, which include 40+ knot sessions from Albany and Lake George, as well as a number of slower sessions from other spots. Here are the results:

The numbers cover quite a range - from an average "aliasing error" of 0.004 knots when comparing 5 Hz and 10 Hz data for nautical mile results, to a maximum of 0.349 knots when comparing 1 Hz data to 10 Hz data for 2 second runs. Of the 10 files in this analysis, 4 showed differences near or above 0.2 knots for 2-second runs (when looking at the top 5 runs in each file). In other words, the chances that a 2-second result obtained from 1 Hz data is off by 0.2 knots or more are quite high.
In contrast, the observed differences between 5 Hz and 10 Hz are much smaller - typically around 0.02 knots. Only one of the fifty runs included in this analysis had a difference of 0.131 knots; all other runs had a measured difference below 0.1 knots.
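The trend of subsampling errors shrinking as the averaging window grows also falls out of a toy random-noise model. A quick Monte-Carlo sketch, assuming Gaussian point-to-point noise rather than real track data:

```python
import numpy as np

rng = np.random.default_rng(0)

def best_avg(x, n):
    """Best mean over any window of n consecutive samples."""
    c = np.concatenate(([0.0], np.cumsum(x)))
    return (c[n:] - c[:-n]).max() / n

for seconds in (2, 10, 60):
    diffs = []
    for _ in range(500):
        x = 40 + rng.normal(0, 0.3, 1200)     # 120 s of noisy 10 Hz "speed"
        d = best_avg(x, seconds * 10) - best_avg(x[::10], seconds)
        diffs.append(abs(d))
    print(f"{seconds:3d} s window: mean |10 Hz - 1 Hz diff| "
          f"= {np.mean(diffs):.3f} kn")
```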

But what do these numbers mean? To put them into perspective, I looked at the differences between runs that were recorded simultaneously with 2 GPS units, both logging at 10 Hz. I included a total of 10 files from 5 different sessions, recorded by 3 different speedsurfers. Here are the results:

The next graphs compare our measured estimates of "aliasing" errors to the measured differences between 2 units - first the average differences:

In each category, the measured "aliasing error" is at least twofold lower than the observed difference between two units. The picture for the observed maximum differences is similar:

The analysis above is the first actual measurement of "aliasing error" in speedsurfing. The results indicate that, compared to the current accuracy of the best GPSTC-approved units, the aliasing error is small for 5 Hz data. In contrast, the measured error for 1 Hz data is larger than the typical "2 unit difference". In absolute terms, this primarily affects 2 second results, and to a lesser extent 10 second runs.

What the analysis above did not address is how the sampling rate affects the accuracy of the final results when the speed measurements contain random and non-random errors from other sources. This is a topic that can appear very simple ("more is better"), but can actually be quite complex when looking at the underlying assumptions and error sources in detail. I plan to address some of these issues in future posts.