Atlanta Map Room: Week 5 (part 2)

Muniba: This week, I continued sifting through the code shared with us by the St. Louis Map Room in the hopes that it could be repurposed for the Atlanta Map Room. However, I soon realized that given the current issues in their applications (like the reliance on Mapzen, which has since been shut down), as well as the considerable modifications our concept would require, it may be in our best interest to create our own, simpler proof-of-concept application. In considering which mapping API to use, I ultimately moved forward with MapBox GL, because it supports map rotation, which is important for creating dynamic visualizations of the BeltLine. With the projector (relatively) set up, I had the opportunity to experiment with different base maps to see what displays best for our use. I created a basic map in MapBox GL with toggleable data layers and a BeltLine outline (below). In the upcoming week, I’m planning to continue developing this interface with new types of data, and to create an interface for selecting a portion of the map.

Atlanta Map Room: Week 5

Annabel: This week I focused on obtaining the datasets to use for our layers – I’m working with data from Trees Atlanta, building permits and demographics for the City of Atlanta, and restaurants around the BeltLine, as well as the tax assessment data I started work on last week. My main focus right now is geolocating the data – much of it has incomplete addresses, and there is a fairly large number of data points – and working with the geocoding and Google Places APIs to get the restaurant locations. My struggles there mostly involve the APIs’ limits, especially on the number of queries, because there is so much data. I’m also starting to think about how to visualize these data points in more unified ways – for example, there are thousands of building permits, and I’m trying to find ways to show how the dataset evolves over time, which is challenging when there are multiple factors at play but, realistically, only one point on the map per permit.
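For those curious about the mechanics, below is a minimal sketch of the kind of throttled geocoding loop I’ve been writing. The file name, column names, and delay are hypothetical placeholders – the real scripts depend on each dataset and on the query limits of the API tier we’re using.

```python
import csv
import time
import requests

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"
API_KEY = "YOUR_API_KEY"  # placeholder

def geocode(address):
    """Return (lat, lng) for an address, or None if the lookup fails."""
    resp = requests.get(GEOCODE_URL, params={"address": address, "key": API_KEY})
    results = resp.json().get("results", [])
    if not results:
        return None
    loc = results[0]["geometry"]["location"]
    return loc["lat"], loc["lng"]

# Hypothetical input: permits.csv with permit_id and address columns.
with open("permits.csv") as f, open("permits_geocoded.csv", "w", newline="") as out:
    writer = csv.writer(out)
    for row in csv.DictReader(f):
        coords = geocode(row["address"] + ", Atlanta, GA")
        if coords:
            writer.writerow([row["permit_id"], *coords])
        time.sleep(0.1)  # crude throttle to stay under per-second query limits
```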

Featured below, my inspiration of the moment (Yanni’s suggestion) for the demographic data: one of László Moholy-Nagy’s abstract paintings –

Image credit: Wiki Art

Electric Vehicle Infrastructure: Week 5

This week we focused on two major items: finishing our survey IRB, and continuing to make progress on sentiment analysis. We’ve gotten a lot done, and are happy to have submitted our IRB, along with our survey! After long discussions, we have come to the conclusion that we will most likely have to proceed with either the nudge experiment – aimed at detecting bias against electric vehicle owners – or the topic classification. This will primarily depend on how well the topic classification can be done, which we are still investigating. At this point we are waiting to hear back from the IRB, and also from PlugInsights, to see if we will be able to use them as a data source.

We’ve made a lot of progress with sentiment analysis this week. We’ve spent a lot of time learning about different kinds of neural nets, and have done a lot of literature review to determine which models are likely to work best for our particular problem. Currently, we have trained a recurrent neural network using the Keras API with TensorFlow. It seems to be outperforming our past bag-of-words approach, and we are investigating it further to determine just how well it works and how we can improve upon it. One of the methods we plan to use to improve performance is Bayesian optimization for hyper-parameter tuning of our model. We will also construct a convolutional neural network and see which model is better suited to our sentiment task. While we’ve been improving our sentiment classification, we’ve also started analyses of sentiment based on our SVM’s predictions. The idea here is that we can work on both aspects of the problem at the same time; once our sentiment predictions improve, we’ll update the inputs to our analyses and get a more accurate picture of the EV infrastructure.
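For the curious, here is a minimal sketch of the recurrent architecture we’re working with, using the Keras API in TensorFlow. The vocabulary size, sequence length, and layer sizes below are placeholder values, not our tuned ones – finding good values is exactly what the Bayesian optimization step is for.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 10000  # placeholder: depends on how the reviews are tokenized
MAX_LEN = 100       # placeholder: reviews padded/truncated to this length

model = Sequential([
    Embedding(VOCAB_SIZE, 128, input_length=MAX_LEN),  # learned word vectors
    LSTM(64),                        # recurrent layer reads the review sequence
    Dense(1, activation="sigmoid"),  # positive vs. negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, validation_split=0.1, epochs=5)
```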

We’re very excited to get the survey kicked off, and to see some results from our sentiment analysis!

RatWatch: Week 5

As of Friday last week, the chatbot has been deployed and is fully functional. Five days into the collection period, we already have about 30 reports. However, most of them are on the eastside of Atlanta, not the westside. This has prompted us to rethink our marketing strategy on the westside in order to get more reports from the area. We plan on stepping up our advertising efforts on the westside over the next couple of days to maximize the number of reports. In the meantime, we are continuing to monitor incoming reports in order to address issues with the software. We are also working diligently to make the app even more useful to the community by providing visual maps and statistical information about the reports we are gathering. This information will be viewable on our new website in the coming weeks, so stay tuned for more information!

In addition, historical rat sighting records, code violations, and other environmental data are being used to build a model that helps identify key areas that may be especially prone to rats. After geocoding these data, we computed the intersection areas between buffers around the sightings and the other environmental layers, created random dummy samples across the city of Atlanta, and fit a multivariate logistic regression model to assess which features mattered most. Currently, the model includes land use, restaurants, and bodies of water, with plans to incorporate real estate, census data, and tree cover. According to the current model, high-density and multi-residential land, as well as restaurants, are associated with higher log odds of a rat sighting. This makes sense, although we have to look further to make sure the two are not confounded (denser residential areas may have more restaurants).
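As a rough illustration of the model-fitting step, here is a minimal sketch using statsmodels. The file and column names are hypothetical stand-ins for our actual feature table, in which each row is a real sighting buffer or a random dummy sample along with its intersection areas.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical feature table: one row per buffer, real sightings plus dummies.
df = pd.read_csv("rat_features.csv")
X = sm.add_constant(df[["residential_area", "restaurant_count", "water_area"]])
y = df["sighting"]  # 1 = reported rat sighting, 0 = random dummy sample

model = sm.Logit(y, X).fit()
print(model.summary())  # coefficients are log odds per unit of each feature
```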

Seeing Like a Bike: Week 5

During the last week, we mainly focused on getting the air quality sensor, the PMS5003, to send data every second to the Raspberry Pi via the Purple Air’s wifi chip, the ESP8266. The ESP8266 technically runs Arduino code, so this should have been fairly straightforward, but it definitely did not end up that way.

We were given firmware directly from Purple Air, so Nic worked on disassembling these binary files into assembly code and modifying it to change the output resolution. Meanwhile, I worked with existing code from the internet that claimed to run on the ESP8266. After four days on these tasks, we had not made any significant progress. We consulted with April and Chris, and decided to take the drastic step of taking apart the Purple Air and hooking the PMS5003 directly up to an Arduino Uno instead of the ESP8266.

However, the Purple Air uses very non-standard wiring connectors, so we had to wire the PMS5003 to the Arduino by finding spare wires lying around the lab, sticking them straight into the pins of the PMS and the Arduino, and using electrical tape to hold everything together. After uploading some simple code we found on the internet to the Arduino, we plugged in the sensor, opened the serial monitor on my laptop, and found 3-second-resolution readings! Ideally we would have 1-second resolution, but after dealing with 80-second resolution for the last three weeks, 3-second resolution was a godsend. When we attempted to replicate these results with the Raspberry Pi instead of my laptop, the wiring came undone, and we couldn’t get it working for the rest of the day.

This morning, Nic came in early and rewired the entire system with a different set of wires with slightly thicker pins. This proved much sturdier, staying in place far longer. Throughout our work today, the wiring only came undone once, and it was a much easier fix than on Friday. However, for the actual placement on the bike, we will need a far better and more permanent solution, as biking in the real world can be pretty bumpy.

With the new wiring, we got the system running fairly quickly on the Pi, and after writing some basic Python code, we could translate the sensor’s serial output into a CSV file to compare against the CSV file from the GRIMM, our research-grade air quality sensor. Our next steps for the week are to compare the data generated by the two sensors, and to start calibrating the data from the sensors on our test run to Piedmont Park.
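For anyone trying to replicate this, below is a minimal sketch of the kind of Python we ran on the Pi. It assumes the sensor is readable on /dev/serial0 at 9600 baud (the device path depends on your wiring) and follows the 32-byte frame layout from the PMS5003 datasheet.

```python
import csv
import struct
import time
import serial  # pyserial

port = serial.Serial("/dev/serial0", baudrate=9600, timeout=2)

with open("pms5003.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "pm1_0", "pm2_5", "pm10"])
    while True:
        # Every frame starts with the two-byte header 0x42 0x4D.
        if port.read(1) != b"\x42" or port.read(1) != b"\x4D":
            continue
        frame = port.read(30)  # length field + 13 data words + checksum
        if len(frame) < 30:
            continue
        # Bytes 8-13 of the body hold atmospheric PM1.0/PM2.5/PM10, big-endian.
        pm1, pm25, pm10 = struct.unpack(">HHH", frame[8:14])
        writer.writerow([time.time(), pm1, pm25, pm10])
        f.flush()
```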

As an aside, we added a new member to our team Friday morning. Welcome, Urvi Latnekar, a Computer Science student at Bennett University near Delhi, India! She will be working with us for the rest of the summer, and brings the team her experience working with Arduinos and with air pollution data in the farmlands of India.

Map Room: Week 4 (part 2)

We worked on two different components of the Map Room this week –

Muniba: I primarily considered the design of the interface that will let map room users choose and project different areas of the BeltLine and different data layers. Initially, we hoped to unwind the BeltLine and create a flat strip map, so the user could draw the path as one would walk it. However, after further consideration, we decided to instead let the user select a rectangular area of a fixed size to zoom in on and map. After looking into different libraries, I ultimately decided to move forward with the MapBox API and p5, for creating maps and drawing respectively. To display a user’s selection of the map, we will need the coordinates of the central point, the window’s dimensions, and the rotational angle of the rectangular box (see the sketch below the image). After considering these design questions and potential tools, I started looking at the repositories shared with us by the St. Louis Map Room for their projection interface.

Image – Example of a rotated map of the BeltLine I created using the MapBox API.
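To make the selection geometry concrete, here is a minimal sketch of the corner calculation (in Python, though the real interface will likely be JavaScript). Treating the window dimensions as plain degrees is a simplification – longitude degrees shrink with latitude, so the final version will need a proper projection.

```python
import math

def selection_corners(center_lng, center_lat, width, height, angle_deg):
    """Corners of a rotated selection box, from its center point,
    dimensions (in degrees, for simplicity), and rotation angle."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    for dx, dy in [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]:
        x, y = dx * width, dy * height
        # Rotate each offset around the center, then translate.
        corners.append((center_lng + x * cos_t - y * sin_t,
                        center_lat + x * sin_t + y * cos_t))
    return corners

# Rough example: a box near the BeltLine's Eastside Trail, rotated 30 degrees.
print(selection_corners(-84.365, 33.772, 0.02, 0.01, 30))
```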

Map Room: Week 4

We worked on two different components of the Map Room this week –

Annabel: My focus this week has been creating a prototype of the Map Room, to model the interaction between the participant input layer and the historical/social commentary data we’re overlaying on it via the projector. I’m using the tax assessment data from 2010-2017 in Fulton County for the prototype, focusing on the recently completed section of the Southwest Trail, near Adair Park. Cleaning the dataset was a big chunk of my week. I’ve also been working on a guide to the tax assessment data – covering how to obtain the data, the standards I’ve used in working with it, and a variety of ethical questions the dataset raises – and I’m pulling some of the most pertinent information from that guide to contextualize the projected points for visitors to the Map Room. I’m doing this by creating two context panels, one above and one below the projected layer. My current plan is for the panel above to explore the data over the collection period in a timeline, and the panel below to explain the appeals system, especially its lack of transparency depending on access level.

(the prototype)

Electric Vehicle Infrastructure: Week 4

While week three was filled with quick wins, week four has been a slow crawl toward progress on critical objectives. A lot of time this week went into realignment with our faculty mentor, Professor Asensio, on what is necessary for our review-categorization ML training set. After long discussions, we’ve decided to pivot away from MTurk to two different tools: Qualtrics and PlugInsights. Qualtrics offers a crowdsourcing platform with higher participant demographic fidelity than MTurk. PlugInsights is a spin-off crowdsourcing platform from PlugShare, the company that provided us with the original dataset. The bonus of PlugInsights is that all of its participants are EV drivers, immediately lending them credibility in understanding the nuances of the reviews we would ask them to classify.

In terms of sentiment analysis, after additional feature augmentation and hyper-parameter tuning, we’ve reached the peak of feasible performance with SVMs. Time was spent this week exploring neural-network-based learning algorithms and understanding how one could be properly implemented for our domain-specific problem.

We also tried an SVM for the review classification problem on a small training set of 1,300 reviews. We reached around 50% accuracy, which is reasonable considering the difficulties of multi-label data and the size of our training set, but here again we’ve decided to look toward other methods. We’re excited to see where our new plans will lead us!
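For reference, below is a minimal sketch of the approach we tried, with toy reviews and made-up topic labels standing in for the 1,300-review training set: a TF-IDF representation with one binary SVM per label.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Toy stand-ins; the real set has 1,300 labeled reviews.
reviews = ["charger was broken again", "easy to find, fast charge"]
labels = [["functionality"], ["location", "charging speed"]]  # hypothetical labels

y = MultiLabelBinarizer().fit_transform(labels)
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    OneVsRestClassifier(LinearSVC()),  # one binary SVM per topic label
)
clf.fit(reviews, y)
print(clf.predict(["the station was blocked and not working"]))
```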

RatWatch: Week 4

We are now midway through week four, and time is flying by. This whole REU is centered on civic data, and last week we really got to explore more of the civic side. With the app deployed to the server and ready for use, we are actively trying to get the word out to communities on the westside of Atlanta this week by distributing flyers in public spaces and making announcements on neighborhood listservs. Last Saturday, we attended the Proctor Creek Stewardship Council meeting to promote RatWatch to the members present. During the meeting, members from different neighborhoods passionately discussed issues surrounding the Proctor Creek watershed. Watching the council leverage and debate the involvement of local government and organizations, we truly felt that we were witnessing civic engagement at its core.

We also got a chance to present our projects at a civic hack night, which is an event held by Code for Atlanta where technologists and civic-minded people come together to tackle problems the city of Atlanta currently faces. We received a lot of positive feedback from the civic hackers, especially in terms of improving the user interface of the app. We were also able to interact with the civic hacking community in Atlanta by networking with a very diverse set of individuals, and sharing our ideas and passions with other civic-driven people. With one day left until data collection, we will have to work very quickly to implement the feedback we received during testing. We are very happy with the progress we’re making and look forward to seeing the results of our work in the form of crowdsourced data over the coming weeks.

Seeing Like a Bike: Week 4

Last week, our main goal was to get the Purple Air and Raspberry Pi on the same network and communicating with each other, so we could read the sensor data in real time. After a lot of work, we were able to connect the two, but we are still stuck on obtaining meaningful data from our sensors.

Both the Purple Air and the Raspberry Pi can broadcast their own networks for other devices to connect to. Based on the guides we were reading, we needed to connect the Purple Air to the Raspberry Pi’s wifi network, and we were stuck for half a week trying to get the two to talk this way, setting up the Raspberry Pi’s ad-hoc wifi capabilities while keeping it simultaneously connected to the internet via ethernet.

Frustrated, we reached out to the author of an online project on GitHub, an affiliate of the University of Michigan who had worked with the Purple Air before. He told us that we needed to work the other way around and connect the Pi to the Purple Air instead. Once we followed the steps he outlined, we were able to connect the two devices fairly quickly.

However, as we learned, the Purple Air reads data every second but only outputs it every 80 seconds, in aggregated form. This is far too slow for our application, as a bike can travel a long way in 80 seconds. We know that, technically, the underlying sensor, the PMS5003 laser particle counter, can output data every second; we are currently working on actually obtaining this data.

Our setup with the Purple Air on the left, the Raspberry Pi on the right, and the monitor at the top.

To solve this, we are trying two different approaches. The first is to directly modify the firmware that the Purple Air runs on, which is what Nic is currently working on. The second approach is to bypass the Purple Air completely, and attempt to talk to the underlying sensor, the PMS5003, which is what I am working on. Both Nic and I have been making progress and facing challenges while trying to obtain the mystical 1-second data output.

The Purple Air connects the PMS5003 to its broadcasting capability through a wifi chip called the ESP8266, which runs Arduino code. Because I don’t have any specific Arduino experience, I played around with the Arduino IDE and tried things out to immerse myself in this new technology. I tried uploading a simple script to one of our two Purple Airs, and it overwrote everything on the device, meaning all of its functionality was lost.

Fortunately, we were able to get in contact with Purple Air’s support team, and they gave us a binary file containing the firmware, from which we can re-flash the Purple Air and get going again. Nic has been working on modifying this file to change the format of the output, so that when we re-flash the sensor it no longer aggregates the data. In the meantime, I found a very relevant project on GitHub designed for the PMS5003, and I am working on getting it set up.

Tonight, we visited Civic Hack Night, run by the awesome Code for Atlanta organization, and they graciously allowed us to present our projects to the greater community! We were able to connect with some very talented members of the tech community in Atlanta, and work together to solve the issues our project has been facing recently. Code for Atlanta runs Civic Hack nights twice a month, and you should definitely check them out if you are in the area.