A primary challenge last week was getting familiar with the project space, especially since multiple groups work in it, but we also gained much more clarity on the direction and goals of our project. After speaking with a few other people on the project, we were able to position ourselves and define our scope. Instead of creating a public-facing product, we’ve pivoted to creating visualizations, and perhaps a predictive model, that would (1) assess the robustness of the data by comparing it with astronomical tide predictions and (2) summarize statistics for each sensor (max, min, and average water level) over a longer time frame, in order to (a) assess long-term changes at these sensors and (b) predict whether an event is unusual or interesting enough to require further human assessment.
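To make goal (1) a bit more concrete, here is a minimal sketch in plain JavaScript of how a reading could be compared against the astronomical tide prediction. The function name `flagUnusualReadings`, the field names (`observed`, `predicted`), the threshold, and the sample values are all assumptions made up for illustration; they are not the project's actual data or a final method.

```javascript
// Hypothetical helper: compute the residual (observed minus predicted tide)
// for each reading and flag those whose residual exceeds a chosen threshold.
// Field names and units are assumptions for this sketch.
function flagUnusualReadings(readings, threshold) {
  return readings.map((r) => {
    const residual = r.observed - r.predicted;
    return { ...r, residual, unusual: Math.abs(residual) > threshold };
  });
}

// Made-up sample values, only to show the shape of the input.
const sample = [
  { time: "2019-06-01T00:00Z", observed: 1.1, predicted: 1.05 },
  { time: "2019-06-01T01:00Z", observed: 1.9, predicted: 1.2 },
];

const flagged = flagUnusualReadings(sample, 0.5);
console.log(flagged.filter((r) => r.unusual).map((r) => r.time));
```

A fixed threshold is the simplest possible rule; a predictive model, if we build one, would presumably replace it with something learned from the long-term statistics in goal (2).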
Now that we’ve got a better (but still not excellent) handle on D3, we’ve started finalizing some basic visualizations and making mock-ups of more complicated plots. One of our big successes has been the “Hurricane visualization” (created using D3 and Leaflet). We took data from temporary water level sensors that USGS deployed to monitor flooding during Hurricanes Irma and Matthew. These sensors reported the maximum height the water reached during the storms. There is also a permanent sea level sensor at Fort Pulaski that gauges water levels for the entire Georgia coast. Our visualization compares the water levels at Fort Pulaski with the water levels observed at the various temporary sensor locations. The plot aims to communicate the need for a permanent sensor network by showing that water levels vary from location to location.
Using D3, we’ve also started creating basic plots that visualize the sensor data over time in different ways (linear and radial). These will serve as the base for the exploratory plots mentioned earlier (in goal 2), but we’ve run into some challenges: loading the CSV in a useful format, and handling CSV data that is not in chronological order. This is where we’ll continue our efforts next week, and we hope to finish a multi-line plot that summarizes max/min/average water levels for all ~30 sensors over the course of a few months.
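The CSV challenges described above can be sketched roughly as follows, in plain JavaScript with no D3 dependency. The column names (`sensor`, `timestamp`, `level`), the function names, and the sample rows are hypothetical, since the real file layout may differ; this is one possible approach under those assumptions, not the finished loader.

```javascript
// Assumed (hypothetical) CSV layout: a header row "sensor,timestamp,level".
// Parses the text, drops rows that fail to parse, and sorts chronologically.
function parseReadings(csvText) {
  const [header, ...rows] = csvText.trim().split(/\r?\n/);
  const columns = header.split(",").map((c) => c.trim());
  return rows
    .filter((row) => row.trim() !== "")
    .map((row) => {
      const cells = row.split(",").map((c) => c.trim());
      const record = Object.fromEntries(columns.map((c, i) => [c, cells[i]]));
      return {
        sensor: record.sensor,
        time: new Date(record.timestamp),
        // Treat an empty or missing cell as missing rather than as 0.
        level: record.level ? Number(record.level) : NaN,
      };
    })
    .filter((r) => !Number.isNaN(r.time.getTime()) && Number.isFinite(r.level))
    .sort((a, b) => a.time - b.time); // fixes non-chronological rows
}

// Computes min, max, and mean water level for each sensor.
function summarizeBySensor(readings) {
  const totals = new Map();
  for (const { sensor, level } of readings) {
    const t = totals.get(sensor) || { min: Infinity, max: -Infinity, sum: 0, count: 0 };
    t.min = Math.min(t.min, level);
    t.max = Math.max(t.max, level);
    t.sum += level;
    t.count += 1;
    totals.set(sensor, t);
  }
  const summary = new Map();
  for (const [sensor, t] of totals) {
    summary.set(sensor, { min: t.min, max: t.max, mean: t.sum / t.count });
  }
  return summary;
}

// Made-up rows, deliberately out of order.
const csvText = [
  "sensor,timestamp,level",
  "A,2019-06-01T02:00:00Z,1.5",
  "A,2019-06-01T00:00:00Z,0.5",
  "B,2019-06-01T01:00:00Z,2.0",
  "A,2019-06-01T01:00:00Z,1.0",
].join("\n");

const readings = parseReadings(csvText);
const summary = summarizeBySensor(readings);
console.log(summary.get("A"));
```

For the multi-line plot, the same grouping could be keyed by sensor and by day or week instead of by sensor alone, which would give one max/min/average series per sensor.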
Finally, we got the opportunity to attend the 2019 MLSE conference, hosted by Georgia Tech (Columbia will host next year). The conference celebrates the interdisciplinary side of machine learning: data science isn’t just something for computer scientists to fawn over, but a tool for revolutionizing all fields. There were talks by scientists from STEM fields such as materials science and biomedical engineering, as well as by data scientists focused on public policy and social good. The conference also kicked off with a Women in Data Science Day, featuring talks and workshops focused on sharing experiences.