Now that we’re halfway through the Data Science for Social Good program, it’s a good time to talk about what we’ve done and what we still plan to finish. To recap, we have three deliverables we want to complete by the end of the program.
- Visualize distribution of trees by species: The first interactive visualization shows the variety of trees throughout Atlanta; we are drawing inspiration from the NYC Street Trees by Species. The second interactive visualization shows the tree population broken down by neighborhood, similar to An Interactive Visualization of NYC Street Trees.
- Identify tracts of forest within a city: The Atlanta Tree Commission can use the Tree Trust Fund to plant new trees, relocate trees, and/or purchase existing forested land. Identifying tracts of forested land with acreage and cost information will help them make more informed decisions. Using ArcGIS, we want to identify areas with dense tree cover in relation to cost and continuity using tree canopy data and parcel data.
- Prioritization of potential planting sites: Finding ideal planting locations can be challenging and time-consuming. Using multiple types of data, including parcel data and impervious surfaces, we want to develop a statistical model to prioritize potential planting sites. In addition, we will develop an interactive application that lets users visualize planting sites and dynamically reweight the factors the model is based on.
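
To make the idea of dynamically prioritizing factors concrete, here is a minimal sketch in Python. It is not our statistical model. It is a plain weighted average that re-ranks candidate sites whenever a user changes the weights. The factor names (`pervious_fraction`, `canopy_gap`), the `Site` class, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Site:
    """A candidate planting site with hypothetical factor scores in [0, 1]."""
    site_id: str
    factors: Dict[str, float]


def rank_sites(sites: List[Site], weights: Dict[str, float]) -> List[Site]:
    """Return sites sorted by the weighted average of their factors, best first."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    def score(site: Site) -> float:
        # Factors missing from a site contribute 0 to its score.
        return sum(w * site.factors.get(name, 0.0) for name, w in weights.items()) / total

    return sorted(sites, key=score, reverse=True)


# Made-up example data, not real Atlanta sites.
sites = [
    Site("A", {"pervious_fraction": 0.9, "canopy_gap": 0.2}),
    Site("B", {"pervious_fraction": 0.4, "canopy_gap": 0.9}),
]

# A user who cares mostly about existing canopy gaps sees a different order
# than one who cares mostly about pervious ground.
by_ground = rank_sites(sites, {"pervious_fraction": 1.0, "canopy_gap": 0.0})
by_gap = rank_sites(sites, {"pervious_fraction": 0.0, "canopy_gap": 1.0})
```

In an interactive application, the weights would come from user controls such as sliders, and the ranking would be recomputed on each change.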

The visualizations of trees by species and by neighborhood are nearly complete. They were built using D3 and Leaflet, and are currently being tested with different users.
We’ve developed maps by analyzing several datasets, including the Urban Tree Canopy, impervious surfaces, parks, and parcels. Example maps include contiguous tracts of forest and impervious surfaces. Using this information, we have identified potential planting sites.

Our next step is to develop two applications: one for identifying tracts of forest and one for prioritizing potential planting sites. Though these are separate applications, they will share the same backend. Our main constraint is the size of the geospatial data we are trying to visualize. The CSV file that contains the parcel points alone is approximately 2 GB. This is not surprising, since there are about 160,000 parcels in Atlanta. Processing that much data in the browser is not feasible, so we will use a database.
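
As a rough illustration of why a database helps here, the sketch below streams a parcel CSV into a table in batches, so the whole file is never held in memory, and then returns only the points inside a map's current bounding box. We have not settled on a database yet. SQLite, the column names (`parcel_id`, `lon`, `lat`), and the sample rows are assumptions made for the example.

```python
import csv
import io
import sqlite3


def load_parcels(conn, csv_file, batch_size=10_000):
    """Stream rows from a CSV file object into SQLite in fixed-size batches."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parcels "
        "(parcel_id TEXT PRIMARY KEY, lon REAL, lat REAL)"
    )
    batch = []
    for row in csv.DictReader(csv_file):
        batch.append((row["parcel_id"], float(row["lon"]), float(row["lat"])))
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", batch)
            batch.clear()
    if batch:  # flush the final partial batch
        conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", batch)
    # Index the coordinates so bounding-box lookups avoid a full table scan.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_parcels_lon_lat ON parcels (lon, lat)")
    conn.commit()


def parcels_in_view(conn, west, south, east, north):
    """Return (parcel_id, lon, lat) rows that fall inside the bounding box."""
    cur = conn.execute(
        "SELECT parcel_id, lon, lat FROM parcels "
        "WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ? "
        "ORDER BY parcel_id",
        (west, east, south, north),
    )
    return cur.fetchall()


# Tiny in-memory demo with made-up rows standing in for the real parcel file.
sample = io.StringIO(
    "parcel_id,lon,lat\n"
    "P1,-84.39,33.75\n"
    "P2,-84.35,33.80\n"
    "P3,-84.50,33.60\n"
)
conn = sqlite3.connect(":memory:")
load_parcels(conn, sample)
visible = parcels_in_view(conn, west=-84.40, south=33.70, east=-84.30, north=33.85)
```

With this arrangement, the browser only ever receives the parcels in the current map view. A database with native spatial indexing would likely be a better fit for parcel polygons than the plain coordinate index used here.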