This week we put the final touches on the database. This involved cleaning addresses, cleaning the census data, and pulling in housing data from ATTOM’s data API. The ATTOM dataset contains attributes of each property in Albany, such as square footage, number of rooms, flooring style, and the date of the most recent major improvements. We hope to use these fields to identify reference groups across Albany: groups of households whose characteristics are similar to those of the households that received project funding. These reference groups will allow us to analyze the difference in means between households that did and did not receive funding.
To begin this analysis, we constructed tables for each of our utility types (gas, electric, water, and sewage) and looked at the number of projects funded, the number of unique addresses with each utility type, and mean consumption by block group. These tables give a preliminary view of potential differences in utility consumption between funded and nonfunded homes. We hope to investigate these tables further by looking at outliers, normalizing by square footage, and running t-tests between the two groups of houses.
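As a rough illustration of the comparison we have in mind, here is a minimal Python sketch that normalizes consumption by square footage and computes a Welch's t-statistic between funded and nonfunded homes. The records, field layout, and function names (`per_sqft`, `welch_t`) are made up for illustration and are not our actual data or pipeline. Choosing Welch's version of the test, which does not assume equal variances, is also just an assumption at this stage.

```python
from math import sqrt
from statistics import mean, variance


def per_sqft(consumption, square_feet):
    """Normalize a household's consumption by its square footage."""
    return consumption / square_feet


def welch_t(a, b):
    """Return Welch's t-statistic and approximate degrees of freedom."""
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy records, NOT real data: (funded?, monthly consumption, square feet)
households = [
    (True, 900.0, 1500.0),
    (True, 800.0, 1600.0),
    (True, 1000.0, 2000.0),
    (False, 1200.0, 1500.0),
    (False, 1100.0, 1600.0),
    (False, 1300.0, 2000.0),
]

funded = [per_sqft(c, s) for f, c, s in households if f]
nonfunded = [per_sqft(c, s) for f, c, s in households if not f]

t_stat, dof = welch_t(funded, nonfunded)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A negative t-statistic here would mean the funded group uses less per square foot on average than the nonfunded group. In practice the comparison would be run within each reference group rather than across all of Albany.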
Finally, we geared up for our two days in Albany. We met with Amanda Meng, a research scientist working on the open government data aspect of our project. She will be bringing us to Albany, where we can ask staff clarifying questions about the programs (eligibility requirements, monitoring of programs, direction and motivation of the programs) while she conducts interviews with staff and participants of the housing projects.

Cheeeeeers from Albany! (in a few days)