This week, we completed our event detection algorithm. Instead of pursuing a sine curve fitting model, we decided to use the established NOAA predicted tides as our ground truth. We fitted those NOAA predictions to our sensor data by estimating a horizontal (time) and vertical (height) shift. Using the residuals between the sensor data and the shifted NOAA predictions as our criterion for “interestingness”, we confirmed that high residual values indeed corresponded to established “interesting” events that Dr. Clark had told us about. To capture a variety of events, we also allow the test window to span 1 hour, 1 day, or 3 days, covering both short- and long-term events. Finally, test windows with fewer than a certain number of points (e.g., 50) are also flagged.
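As a rough illustration of the shift-fitting step, the sketch below aligns a predicted tide curve to sensor readings by searching over a time shift and a height offset that minimize the mean squared residual. The function names, the Nelder-Mead optimizer, and the synthetic sine-wave data are all our illustrative assumptions, not the project's actual implementation.

```python
import numpy as np
from scipy.optimize import minimize

def fit_noaa_shift(t, sensor, noaa_t, noaa_h):
    """Find horizontal (time) and vertical (height) shifts that best align
    the NOAA predicted tide to the sensor readings, by minimizing the
    mean squared residual between the two curves."""
    def residual_score(params):
        dt, dy = params
        # Shift the NOAA curve by dt in time and dy in height, then
        # resample it at the sensor timestamps.
        predicted = np.interp(t, noaa_t + dt, noaa_h) + dy
        return np.mean((sensor - predicted) ** 2)
    result = minimize(residual_score, x0=[0.0, 0.0], method="Nelder-Mead")
    dt, dy = result.x
    return dt, dy, result.fun

# Synthetic example: a semidiurnal tide, with the "sensor" lagging the
# prediction by 0.5 h and sitting 0.2 m higher (hypothetical numbers).
noaa_t = np.linspace(0, 24 * 7, 2000)            # hours over 7 days
noaa_h = np.sin(2 * np.pi * noaa_t / 12.42)      # ~12.42 h tidal period
t = noaa_t
sensor = np.sin(2 * np.pi * (t - 0.5) / 12.42) + 0.2

dt, dy, score = fit_noaa_shift(t, sensor, noaa_t, noaa_h)
```

On this synthetic data the recovered shifts come out close to the 0.5 h and 0.2 m we baked in, and the residual score is near zero, which is the "fit the training data relatively well" case described above.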
In the images below, the left plot shows the training set (7 days of data) and the right shows the test set (currently set to 1 day). The residual values are 9.4 and 4128.19, respectively. From this, we can hypothesize that the shifted NOAA curve fits the training data relatively well, while the spike in test residuals suggests an event in the past day. Indeed, the test plot shows a significant downward spike on that day.
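The flagging logic described above can be sketched as a small scoring function: compute the residual of a test window against the shifted prediction, flag it as a possible event if the residual is high, and flag it separately if it contains too few points. The threshold value and flag names here are hypothetical; the real cutoffs would be tuned against the known events.

```python
import numpy as np

# Hypothetical cutoffs for illustration; not the project's tuned values.
RESIDUAL_THRESHOLD = 100.0   # residual score above which a window looks "interesting"
MIN_POINTS = 50              # windows with fewer readings get a data-quality flag

def flag_window(sensor, predicted, window_name):
    """Score one test window (1 hour, 1 day, or 3 days of readings) by its
    mean squared residual against the shifted NOAA prediction, and attach
    flags for possible events and for sparse data."""
    sensor = np.asarray(sensor, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    flags = []
    if len(sensor) < MIN_POINTS:
        flags.append("too_few_points")
    residual = float(np.mean((sensor - predicted) ** 2))
    if residual > RESIDUAL_THRESHOLD:
        flags.append("possible_event")
    return {"window": window_name, "residual": residual, "flags": flags}

# A quiet window scores low; a window with one large downward spike
# (like the one in the test plot) scores high and gets flagged.
quiet = flag_window([0.0] * 100, [0.0] * 100, "1 day")
spiky = flag_window([0.0] * 99 + [-150.0], [0.0] * 100, "1 day")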
We also made some quality-of-life changes to our visualizations. Previously, the line plot would draw segments across gaps of missing or “fuzzy” data. We switched to a scatter plot in response; the eye fills in the line anyway. We also built a better sensor selector, so you can now create custom groups of whichever sensors you want to compare against each other.