Last week's visit with the fire inspectors was enlightening, as it gave us the perspective of the people who will be using our final deliverables. For our first deliverable, we will be putting together a list of properties that require permits, based on a set of criteria in the City of Atlanta Fire Ordinance code. Some of those properties (roughly 2,600) are already being inspected and have already been issued permits, but many other businesses in the city are not being inspected, for a variety of reasons. We wanted to find other businesses in the city of the same types as those currently being inspected. If motor vehicle repair places need inspection, for example, then AFRD would want to know how many motor vehicle repair businesses in the city they are and aren’t inspecting. Below is a histogram of the top 20 currently inspected business types, shown in blue, with the non-inspected businesses of the same types shown in orange.
These are grouped by SIC (Standard Industrial Classification) code, which gives us a consistent way to classify the type of business across multiple datasets. We obtained these classifications from a database of Business Licenses in the City of Atlanta, which has the geo-coordinates, names, and addresses of 20,000 businesses in the city (among other information). It does not include buildings such as schools and day cares, which are inspected but do not have business licenses. To find the SIC code for the businesses being inspected, we first matched by geo-coordinates, but this returned too many businesses with the same geo-coordinates as entries in the Inspection database (5,000 matches, although only 2,600 businesses in total have been inspected), because many businesses can share the same address (if they are in a mall, for instance). We then filtered by business name to narrow these down, using a string search method that allows “fuzzy matches” (e.g., MCDONALDS vs. MCDONALD’S).
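As a rough illustration of the two-step matching described above, here is a minimal Python sketch. It is not our actual pipeline: the record layout (`coords`, `name`, `sic`), the `match_sic` helper, the 0.85 similarity threshold, and the sample records are all assumptions made for the example, and it uses the standard library's `difflib.SequenceMatcher` as a stand-in for whichever fuzzy matcher is used in practice.

```python
from difflib import SequenceMatcher

def normalize(name):
    # Uppercase and keep only letters and digits, so that
    # "MCDONALD'S" and "MCDONALDS" normalize to the same string.
    return "".join(ch for ch in name.upper() if ch.isalnum())

def name_similarity(a, b):
    # Similarity ratio in [0, 1] between two normalized business names.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def match_sic(inspections, licenses, threshold=0.85):
    # Step 1: index license records by geo-coordinates.
    by_coords = {}
    for lic in licenses:
        by_coords.setdefault(lic["coords"], []).append(lic)

    # Step 2: among records at the same coordinates, keep the best
    # fuzzy name match, and only if it clears the threshold.
    matches = {}
    for insp in inspections:
        candidates = by_coords.get(insp["coords"], [])
        best = max(
            candidates,
            key=lambda lic: name_similarity(insp["name"], lic["name"]),
            default=None,
        )
        if best and name_similarity(insp["name"], best["name"]) >= threshold:
            matches[insp["name"]] = best["sic"]
    return matches

# Made-up sample records; coordinates and names are illustrative only.
mall = (33.7490, -84.3880)
shop = (33.7701, -84.3655)
licenses = [
    {"coords": mall, "name": "MCDONALD'S", "sic": "5812"},
    {"coords": mall, "name": "FOOT LOCKER", "sic": "5661"},
    {"coords": shop, "name": "JOE'S AUTO REPAIR INC", "sic": "7538"},
]
inspections = [
    {"coords": mall, "name": "MCDONALDS"},
    {"coords": shop, "name": "JOES AUTO REPAIR"},
    {"coords": mall, "name": "ACME STORAGE"},
]
matches = match_sic(inspections, licenses)
```

Here the two businesses sharing the mall's coordinates are told apart by name, and ACME STORAGE is left unmatched because no license at that location has a similar enough name. Real coordinates would likely need rounding or a distance tolerance before they can be compared for equality.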
Below is the spatial distribution of the top 5 inspected types of businesses, along with the businesses of the same types from the Business License database that have NOT been inspected.
Top 5 inspected business types.
Non-inspected buildings, of the same types as the top 5 inspected types.
Next, we mapped the number and percentage of inspections of businesses of these types, aggregated by their location in the city, using the NPU (Neighborhood Planning Unit) as the unit of analysis for visualization purposes.
# of Top 5 Inspected Business Types, by Neighborhood (NPU).
# of Non-inspected Businesses, of the same types as the top 5 most Inspected, grouped by Neighborhood (NPU).
Percentage of Top 5 Inspected Business Types, grouped by Neighborhood (NPU).
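The aggregation behind maps like these can be sketched in a few lines of Python. This is only an illustration under assumptions: the `npu` and `inspected` field names, the `inspection_rates` helper, and the sample rows are hypothetical, and the percentage is taken to mean inspected businesses as a share of all businesses of those types in the NPU.

```python
from collections import Counter

def inspection_rates(businesses):
    # Count all businesses and inspected businesses per NPU.
    total = Counter()
    inspected = Counter()
    for b in businesses:
        total[b["npu"]] += 1
        if b["inspected"]:
            inspected[b["npu"]] += 1

    # Derive the three quantities mapped per NPU.
    return {
        npu: {
            "inspected": inspected[npu],
            "not_inspected": total[npu] - inspected[npu],
            "pct_inspected": 100.0 * inspected[npu] / total[npu],
        }
        for npu in total
    }

# Made-up rows: one record per business of a top-5 type.
businesses = [
    {"npu": "M", "inspected": True},
    {"npu": "M", "inspected": True},
    {"npu": "M", "inspected": True},
    {"npu": "M", "inspected": False},
    {"npu": "E", "inspected": True},
    {"npu": "E", "inspected": False},
    {"npu": "E", "inspected": False},
    {"npu": "E", "inspected": False},
]
rates = inspection_rates(businesses)
```

With these sample rows, NPU M comes out at 75% inspected and NPU E at 25%, which is the kind of gap the percentage map is meant to surface.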
These maps and visualizations can help the Atlanta Fire Rescue Department make more informed decisions about which businesses in the city should be inspected and see gaps in their inspection process, both in the types of businesses they inspect and in the locations where they inspect.
On the other front (for our second deliverable), we are creating a fire risk model using the AFRD information we have on where fires have occurred in Atlanta over the last five years, combined with the CoStar Real Estate property assessment data for the commercial properties in the city. For the last several weeks, we have been joining data from various sources, cleaning the 240 variables we have for each building in the city, and beginning to build a regression model to determine which characteristics of a building are most predictive of fires. Below is a visualization of the intersections of our various datasets:
Currently, we are building our model using the 371 CoStar properties that are in the AFRD Fire Incidents database (meaning they had fires, shown above as #1 and #3) as our positive examples. The remaining 6,604 CoStar properties, for which we have similar information but which did not have fires, serve as the negative examples. After we build this model, we will join the FSAF Inspection dataset with the CoStar dataset, so that we can take the businesses from Deliverable #1 that are already being inspected (or that need to be) and run them through the fire risk model to prioritize their inspections by fire risk score.
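To make the positive/negative setup concrete, here is a toy Python sketch of fitting a model on labeled properties and then ranking other properties by score. Everything specific in it is an assumption for illustration: we have not settled on logistic regression, the two features (scaled building age and floor area) are hypothetical stand-ins for the 240 CoStar variables, and the eight rows are invented rather than real data.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    # Batch gradient descent on the average log loss.
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = sigmoid(z) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def risk_score(x, weights, bias):
    # Predicted probability of the positive (fire) class.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Invented training rows: [scaled building age, scaled floor area].
rows = [
    [0.9, 0.2], [0.8, 0.7], [0.7, 0.4],              # had fires
    [0.2, 0.5], [0.1, 0.3], [0.3, 0.8],              # no fires
    [0.15, 0.6], [0.25, 0.1],
]
labels = [1, 1, 1, 0, 0, 0, 0, 0]
weights, bias = fit_logistic(rows, labels)

# Score hypothetical properties awaiting inspection, highest risk first.
candidates = {"Property A": [0.85, 0.3], "Property B": [0.2, 0.3]}
ranked = sorted(
    candidates.items(),
    key=lambda item: risk_score(item[1], weights, bias),
    reverse=True,
)
```

In the real data the classes are far from balanced (371 positives against 6,604 negatives, roughly 5% positive), so the actual model will probably need class weighting or resampling, which this sketch leaves out.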
Other blog posts from our team:
Week 1 – Hello from the DSSG-ATL fire team!
Week 2 – Update from the Fire Team
Week 3 – A Day with Fire Inspectors
Week 4 – Understanding Fire Inspections