GwinNETTwork: Week 9

What an amazing experience this program has been! We’re so grateful for getting to meet new people and learn new things as part of this program.

We are Angela Lau (Cornell University ’22), Jason Chen (Purdue University ’20), and David Li (Stony Brook University ’19). Our team, nicknamed GwinNETTwork, worked on the Connected Vehicle Technology Master Plan based in Gwinnett County, the second largest county in the state of Georgia. Traffic in the Atlanta metro area can get extremely congested, so the long-term goal of the project is to connect vehicles and traffic signals to reduce that congestion. Our portion of the project focused on emergency vehicles under Gwinnett County Fire and Emergency Services: we analyzed data from on-board sensors in the vehicles themselves (location, time, speed, bearing, etc.) to identify where they might experience delays when responding to emergencies.

Upon receiving the collected data and doing initial visualizations of the data points, we wrote a series of Python scripts to filter out points that deviated from fire truck routes, points that were not within range of an intersection, and points receding from an intersection, while also accounting for cases where an emergency vehicle turns at an intersection. We then took some first steps on working with the data from the traffic signal sensors: another series of scripts obtained the signal status at the time of each intersection approach by an emergency vehicle, and identified the slowest approaches for each signal color (red, yellow, green) by average speed. In addition to consulting our advisor, Dr. Angshuman Guin from Civil and Environmental Engineering, we consulted Gwinnett County firefighters midway through the summer to get their perspective on how certain emergency scenarios are approached.
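To give a flavor of the distance-based filtering described above, here is a minimal sketch in Python; the intersection coordinates, radius, and function names are illustrative, not taken from our actual scripts:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def near_intersection(point, intersections, radius_m=100):
    """Keep a GPS point only if it lies within radius_m of some intersection."""
    return any(haversine_m(point[0], point[1], ix[0], ix[1]) <= radius_m
               for ix in intersections)

# Hypothetical intersection and GPS points
intersections = [(33.9526, -84.0807)]
points = [(33.9527, -84.0808), (34.1000, -84.3000)]
kept = [p for p in points if near_intersection(p, intersections)]  # only the nearby point survives
```

The same predicate can be inverted to drop points receding from an intersection once a heading check is added.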

To wrap up our program, we prepared a presentation and poster detailing what we achieved over these past ten weeks and where the project can go in the future, and gave our final presentations this Wednesday night. The results are displayed on our project website, where we use two Leaflet.js bubble maps to show average speed and delays at intersections throughout the county. In the coming weeks, we hope to continue associating the signal data with the GPS data before handing the project off to Gwinnett County. This project definitely has the potential to help solve traffic problems across the entire metropolitan Atlanta area, and we’re excited to see where it can go.

As we finish up this final blog post of the 2019 program, we would like to thank our advisor, Dr. Angshuman Guin, and the graduate student working on the Georgia Smart Communities Corp, Zixiu Fu, for their endless help and support this summer. We would also like to thank Gwinnett County for providing this project and for allowing us to visit one of the fire stations to gain insight into intersection approaches, which helped us when filtering data points. Lastly, we would like to give a big thanks to the directors of the Civic Data Science program, Ellen and Chris, for their support and for giving us the opportunity to participate in this experience. From working alongside each other every day to every project meeting to every intern outing, it’s safe to say it’s been a great ride these past ten weeks. Thank you all for such an amazing summer, and best of luck with the upcoming school year!

Fun times with our entire group!

FloodBud Week 9

Hello, potential CDS intern! We are the FloodBud team: Maddie Carlini (Colby ‘21), Kutub Gandhi (Rice ‘20), and Jade Wu (UNC Chapel Hill ‘20). We worked as part of the Smart Sea Level Sensors (SLS) team in the City of Savannah and Chatham County. The complex geography of Chatham County causes intricate differences in flooding patterns across the region, affecting local residents in different ways. So far, a single tide gauge at Ft. Pulaski has been used to monitor the entire Georgia coastline, but this one gauge does not provide enough granularity to capture trends in flooding across the region. The SLS project is working to quantify the intricacies of this tidal system.

Over the past year, the SLS team has deployed over 30 sensors across Chatham County, with a goal of over 100 installed by 2020. Each sensor records water level, air temperature, and barometric pressure measurements every five minutes. As the sensor network expands, there is a growing need to monitor and maintain the sensors. Currently, researchers must manually check the datastream for each sensor to assess its health, a process that is inefficient and time consuming. We aim to devise tools that will (1) streamline the monitoring of the sensor network for researchers and (2) make the sensor data more accessible to the public.

In order to assist with sensor maintenance, we built an anomaly detection model to flag sensors that are outputting anomalous data. For a primer on what our data looks like, refer to the plot below. The top graph displays normal, healthy sensor data, with the usual fluctuations between high and low tide. The highlighted chunks in the bottom plot show examples of anomalous data that we would want our model to detect.

The issues with the sensors fall into two main categories: issues with the sensor itself and environmental signals. Issues with the sensor involve fuzzy, incorrect, or missing data points. Environmental signals, on the other hand, are more complex and are caused by weather events. Our model aims to flag both kinds of anomalies.

The general idea of our model is to take Ft. Pulaski water level readings and apply vertical and phase shifts, based off of each sensor’s output, to create a unique fitted function for each sensor. We then use the past three days as our testing period, calculating the least-squares difference between the testing period data and the fitted prediction function for each sensor. These comparisons are made over one-hour, one-day, and three-day time windows, and we flag a sensor if its test error for any of the windows is above a set threshold.
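A rough sketch of that fitting-and-flagging logic, using a synthetic sine wave in place of real tide data and an illustrative threshold (our actual model and tuning differ):

```python
import numpy as np

def fit_shifts(reference, sensor, max_lag=12):
    """Find the phase (lag, in samples) and vertical offset that best align
    the reference tide signal with a sensor's readings, by least squares."""
    best = None
    for lag in range(-max_lag, max_lag + 1):
        shifted = np.roll(reference, lag)
        offset = np.mean(sensor - shifted)
        err = np.mean((sensor - (shifted + offset)) ** 2)
        if best is None or err < best[0]:
            best = (err, lag, offset)
    return best[1], best[2]

def window_errors(reference, sensor, lag, offset, windows):
    """Mean squared error between the fitted prediction and the sensor
    over the trailing w samples, for each window length w."""
    pred = np.roll(reference, lag) + offset
    return {w: float(np.mean((sensor[-w:] - pred[-w:]) ** 2)) for w in windows}

# Hypothetical demo: a sensor that is the reference shifted up by 0.5 m
t = np.linspace(0, 6 * np.pi, 288)      # ~one day of 5-minute samples
reference = np.sin(t)                    # stand-in for Ft. Pulaski water level
sensor = reference + 0.5
lag, offset = fit_shifts(reference, sensor)
errors = window_errors(reference, sensor, lag, offset, windows=(12, 288))
flagged = any(e > 0.05 for e in errors.values())  # threshold is illustrative
```

Here the fit recovers the 0.5 m offset exactly, so no window's error exceeds the threshold and the sensor is not flagged.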

We came up with two main deliverables to answer our research questions. To address sensor maintenance, we created an email alert service that notifies our research mentor of the sensors currently flagged as anomalous. The email lists the names of the sensors, grouped by type of anomaly. In addition, plots of the flagged time period for each sensor are attached so that a human can check what kind of event occurred.

Our second deliverable is a public facing website to host the data visualization and exploration tools that we created. 

The website connects to our anomaly detection model by displaying the current day’s flagged sensors. The user can also see what kind of anomaly each sensor was flagged for. 

Thank you for tuning in! Hopefully by the time you read this, our website will be up and running 🙂


Albany Hub: Week 9

Today marks the completion of the CDS program. Boy, does time fly.

It feels like just yesterday we met with our PI, Dr. Asensio, to go over initial designs of the database we constructed over the last 10 weeks. From the beginning, the objective of this project was to build a comprehensive database to help city officials and Georgia Tech evaluate the impacts of housing investment on utility consumption. The main challenge we faced was that city data were spread across many different departments and entities, many of which had different data entry practices. We also obtained a lot of the data from sources outside the city, such as the Census, NOAA, and a private real estate data company, since this information is not housed within Albany’s databases. Collecting this data turned out to be a bigger challenge than expected, as each dataset posed unique challenges related to access, standardization, or volume.

To wrangle all these disparate datasets into a workable structure, much of our work this summer focused on using automatic processing methods to merge data and evaluate performance in new ways that were not previously possible. This involved standardizing housing addresses within Albany (spelling, street endings, cardinal directions), geocoding all those addresses, parsing data from HUD reports, converting datasets to time series format, and then linking all of these datasets into a relational database structure. In the end, we were able to build a SQL Server database hosted on Azure that links information on utilities, taxes, each housing project, Census data at the block group and tract levels, weather, and real estate information. We used Python to clean and merge the data, ArcGIS for some spatial exploration, and RStudio for preliminary analysis. While we didn’t come away with many tangible insights to share with the city, we created the infrastructure necessary to transition into the analysis phase of the project. The data have come a long way, and we can’t thank everyone involved with the CDS program enough for giving us the opportunity to work with real data that will be used to make a significant impact.
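As an illustration of the address standardization step, here is a simplified sketch; the abbreviation tables and function are stand-ins for illustration, not the rules we actually used:

```python
import re

# Illustrative expansion tables; the real project's rules may differ
STREET_ENDINGS = {"st": "street", "ave": "avenue", "rd": "road", "dr": "drive"}
DIRECTIONS = {"n": "north", "s": "south", "e": "east", "w": "west"}

def standardize(address):
    """Lowercase, strip punctuation, and expand common abbreviations so the
    same address always maps to the same key before merging datasets."""
    tokens = re.sub(r"[^\w\s]", "", address.lower()).split()
    expanded = [STREET_ENDINGS.get(t, DIRECTIONS.get(t, t)) for t in tokens]
    return " ".join(expanded)

standardize("123 N. Jefferson St.")  # -> "123 north jefferson street"
```

The point of normalizing to a canonical key like this is that records entered as "123 N. Jefferson St." in one department and "123 N Jefferson Street" in another can be joined reliably.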

We presented our work in its final form to ESRI and Albany city officials on Wednesday ahead of the CDS end-of-program presentations that same day. They were excited to see the work we had done, and were interested in scheduling a meeting with city officials. We can’t wait to hand the database off to see what kinds of stories will be told. All of our scripts are commented, our process has been documented, and we have constructed a visual schema and data dictionary for the database. This will allow the city to more easily maintain the database and add data in the future. Hopefully, the database will help the city make better-informed policy decisions and initiate conversations between the city and its citizens in the future regarding energy efficiency, housing investment, and neighborhood blight.

GwinNETTwork: Week 8

Time flies—our last week is coming in soon (next week!), but we’re not ready to leave yet 🙁

This week, we wrapped up working on the histogram and began associating signal data with the fire truck data.

To wrap up the program, we designed this year’s final t-shirts and stickers.

Our t-shirt logo/program sticker design
The designs for all of the projects to be put on the back of the shirt

Meanwhile, we have been filtering through the most updated data on the server to update our website maps (speed and delay) and to generate our new histogram. In the process, we’ve run into many bugs and problems that have made the process lengthier than expected. The good news is that we finally finished these this week!

This weekend, we’ll be taking a group trip to Sweetwater Creek State Park for a picnic and hike!

FloodBud Week 8

Things are coming together!


Our visualizations are going on their own website, and we have an email system that updates the maintenance crew on the sensors that are causing trouble. While they aren’t fully polished, they are extremely close. In addition, we have cleaned up our visualizations and added further interactivity to the plots. Our webpage now includes a Leaflet map that displays circle icons for each sensor for added spatial information. The user can click on a sensor’s icon to toggle plotting its data, and the icons change from outlines to solid to indicate which are currently plotted. The tool can be used by Dr. Clark and citizens alike, with exploration features that can further highlight and inform people of coastal flooding in Georgia.

Our main challenge is putting our materials on a server to be accessible to the public – our final product will look something like what’s pictured above. 


With our final week coming up, we are planning to make our final presentation and poster for the showcase Wednesday afternoon. After that, we hope to have only documentation and commenting left to clean up!


Albany Hub: Week 8

This week has centered around preparing the database to be shipped off to ESRI and the Albany team. We’ve written documentation, commented our code, and compiled our many cleaning scripts into larger chunks that anyone could run to reproduce our results. We were lucky enough to meet with Clayton Feustel, a Ph.D. student working with Amanda and Ellen on a similar housing database for Atlanta. He reviewed our current database and suggested some future directions in terms of documentation and databasing best practices.

We also added to the Census table, per Albany’s request. Our database now contains data as far back as 2009, as opposed to a single year, as well as demographic measures for each block group such as race and age. Including data over multiple years will allow us to evaluate the success of the housing programs over time and improve the accuracy of our analysis.

To accurately conduct analysis, we must also normalize utility consumption for fluctuations in weather. If we take the data at face value, there may be instances of high consumption of gas (for heating) or electricity (for cooling) that are simply the result of a particularly cold or hot month. Extremes or anomalies in weather must therefore be accounted for; without this, we would misinterpret our data and could not say confidently whether a housing project actually made a difference in utility consumption. At the moment, we’re still working on a weather normalization process. Many utility management companies test changes in a single house’s consumption by setting a base year for weather and consumption, then applying adjustments to the consumption data of the future years they’d like to compare. This is the strategy we’ve been considering; we just need to generalize it across the database.
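One common way to implement that base-year idea is with heating and cooling degree days. The sketch below uses made-up numbers and is only one plausible version of the adjustment, not our finalized procedure:

```python
def degree_days(daily_mean_temps, base=65.0):
    """Heating and cooling degree days from daily mean temperatures (deg F).
    Days below the base temperature accrue HDD; days above it accrue CDD."""
    hdd = sum(max(base - t, 0) for t in daily_mean_temps)
    cdd = sum(max(t - base, 0) for t in daily_mean_temps)
    return hdd, cdd

def normalize_usage(usage, actual_dd, base_year_dd):
    """Scale a period's consumption to base-year weather so an unusually
    cold or hot month does not masquerade as a change in efficiency."""
    if actual_dd == 0:
        return usage
    return usage * (base_year_dd / actual_dd)

# Hypothetical example: three cold days, then a heating month that was
# 20% colder than the base year
hdd, cdd = degree_days([30, 40, 50])                     # 75 HDD, 0 CDD
adjusted = normalize_usage(1200.0, actual_dd=600, base_year_dd=500)
```

Under this scheme, the 1200 units consumed during the colder-than-usual month are scaled down to 1000, the level we would expect under base-year weather.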

In the final week of the CDS program, we will conduct preliminary analysis now that the database is largely complete and prepare a poster that details some of our findings as well as the process we went through to construct the database. Our goal with the preliminary analysis is to show what kinds of programs receive the most funding and the return, if any, of housing investment on utility consumption. We look forward to finally using the database to tell a story about the effectiveness of federal housing policies in Albany.

Missing our time in Albany!

GwinNETTwork: Week 7

While the other teams were taking overnight visits to their sites earlier this week, we continued working back at Georgia Tech.

In the first half of the week, we met a couple of times with Zixiu Fu, the master’s student working on the project, to get more information on the signal data. Currently, we are processing the data to understand how the traffic light data works and to see if we can tell when a fire truck passed through an intersection based off of the traffic light’s behavior. Memory errors and slow runtimes, due to the size of the data, required us to write more scripts to break these files into smaller chunks before accessing the data stored inside.
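The chunking idea can be sketched as a simple generator; the file layout and chunk size here are illustrative, not our actual signal-data format:

```python
import csv

def process_in_chunks(path, chunk_size=100_000):
    """Stream a large CSV in fixed-size chunks of rows instead of loading
    the whole file into memory at once, yielding (header, rows) pairs."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) >= chunk_size:
                yield header, chunk
                chunk = []
        if chunk:  # flush the final partial chunk
            yield header, chunk
```

Each chunk can then be filtered or aggregated independently, keeping peak memory proportional to the chunk size rather than the file size.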

We also received emergency logs from Dr. Guin, showing us where the fire trucks are located. We geocoded the fire truck locations based off of the given addresses to figure out where the logged emergencies occurred. With this, we hope to connect the routes that we already analyzed to emergency responses—helping us determine when a fire truck is responding to an emergency.

On Thursday, we took a day trip to one of the Gwinnett County fire stations with Dr. Guin. At the station, we hoped to better understand the firefighters’ perspectives and procedures when responding to emergencies. We asked about how they handle approaching an intersection, how fast they generally travel when on the road, and how timing can affect their response to an emergency call. Overall, the emergency personnel all favored this project because it would greatly benefit them in the long run.

Next week, we will collaborate with Zixiu to integrate our GPS data with the signal data. Specifically, we hope to start optimizing a query that will analyze a traffic intersection and obtain the signal status so that we can see how it matches up with the existing data that we have. We want to visualize the number of different fire trucks meeting at certain intersections, since it’s common for different stations to cross paths during emergency responses.

FloodBud Week 7

This week we were able to take a trip to Savannah and Chatham County to explore the area. Mainly, we helped our mentor, Dr. Clark, carry out firmware updates on some of the sensors, as well as scout locations for new sensors and gateways. Visiting the area gave us context to better understand the sources of the data we have been working with all summer: we had known these sensors by names and IDs, and now we actually got to see where they were placed and why those locations were chosen. In addition, we had the opportunity to meet with Kate Ferencsik and Nick Deffely from the Savannah Office of Sustainability to learn more about the community engagement side of the Smart Sea Level Sensors project.

With the end of the program approaching, we tied up some loose ends and gave some thought to what we want our final deliverables to look like. For one thing, we’d been working on the visualization (front-end) and event detection (back-end) separately, so an immediate to-do was connecting the two components. Now, the front-end has an option to display an anomaly layer, which shows the anomalous sensors for a specific day. We had also been working off a static CSV to feed into the visualization, which we’ve now updated to pull from the live Sea Level Sensors API.
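A sketch of swapping the static CSV for a live pull; the endpoint URL and JSON payload shape below are placeholders, not the actual Sea Level Sensors API schema:

```python
import json
from urllib.request import urlopen

API_URL = "https://example.org/sensors/observations"  # placeholder endpoint

def parse_observations(payload):
    """Flatten an assumed JSON payload into (sensor, timestamp, value) rows."""
    data = json.loads(payload)
    return [(obs["sensor"], obs["time"], obs["value"])
            for obs in data["observations"]]

def fetch_observations(url=API_URL):
    """Fetch the latest observations from the live API instead of a static file."""
    with urlopen(url) as resp:
        return parse_observations(resp.read().decode())
```

Keeping the parsing separate from the fetching makes it easy to test against canned payloads, and the same rows can feed both the anomaly layer and the plots.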


In terms of final deliverables, we’re thinking of two complementary parts: one public-facing website and one tool intended for Dr. Clark. The former will be a primer on the context of the project and will give a brief description of our research question/methods, with a final pane displaying the front-end aspect of our visualization. For Dr. Clark, we’ll be sending him a more detailed anomaly report through an automated email, informing him of which sensors were anomalous for that day as well as their respective error values. 


Albany Hub: Week 7

What a busy weekend it was! Last week, we had Thursday and Friday off to celebrate the Fourth. We all went to different events in Atlanta to celebrate — some of us went to Ponce City Market and others went to Centennial Olympic Park, all to watch some incredible fireworks displays. If you go to Centennial to watch fireworks next year, make sure to arrive early! On Sunday, Dr. Asensio took us to an Atlanta United soccer game. It was a close finish, but in the end, the game was a draw, 3-3. It was some of our first times going to an MLS game, so we were really happy to have had the opportunity to go.

The next day, we got ourselves together, started our data dictionary, and finally drove down to Albany. After our three-hour drive down south, we checked into the hotel and grabbed some dinner at Harvest Moon. The next morning, before our busy day of meetings, we stopped in The Cookie Shoppe for breakfast, mostly featuring plenty of biscuits. Afterward, we walked to our meeting location. Here’s the basic schedule of our day: 

  • 9 – 10:30 Albany’s DCED department, a discussion of the housing projects
  • 10:30 – 12 conversations with Albany’s utilities department
  • 1 – 2:30 Georgia Smart Communities working meeting
  • 2:30 – 4 meeting with Fight Albany Blight (FAB)

Through these meetings, we feel that we’ve gained a much better understanding of the context of the project and the places and people we are working with. It was lovely to meet everyone in Albany. Following our busy day of meetings, Ms. Shuronda Hawkins was kind enough to give us a tour of downtown Albany. Before we left for Atlanta, we grabbed dinner at The Flint with our team and Ms. Hawkins. 

Since returning to Atlanta, we’ve come back feeling refreshed and excited to finish up our work on the project. We look forward to finalizing our database so it can be passed onto Albany and they can continue on with their incredible work.

GwinNETTwork: Week 6

It was a shorter week than usual with Independence Day yesterday, but we were still able to work efficiently. We revised our website to display two bubble maps: one showing the speed around intersections, and a new one showing the delay at each intersection. We also gave the maps a new visual look, including easier-to-read color-coding for each of the data points, limited our dataset to points near the intersections, and got new data onto the maps.

A screenshot of the newest speed map

Currently, we are working to improve our scripts to identify and group the types of movements (left turn, right turn, through) at each intersection. This way, we can make the map even more precise and specific.

We also obtained log files for fire truck routes from four fire stations, and are currently working to export them into CSV files so that we can make use of them in our project.

Finally, we started working on the signal data with Zixiu Fu, the Georgia Smart Communities Corp grad student, who is also working on this project! Hopefully, we can start getting the algorithm for connecting the signals to the vehicles soon.

Hope everyone had a great Fourth of July!