Seeing Like a Bike: Cycling into the Sunset

The main focus of our project is to provide valid data to improve cycling conditions in the City of Atlanta. Our target group is cyclists who may not feel comfortable riding in peak rush-hour traffic.

To do this we had to measure the stress of cyclists in different situations and tag each situation based on a Level of Traffic Stress (LTS) model. To measure the stress of a cyclist we needed to take into consideration environmental factors such as traffic, infrastructure and pollution.

Traditionally, information about these factors would be gathered through surveys and by reporting incidents as they occur. The Seeing Like a Bike team, though, chose a sensor-based approach to data collection.

The data collected would help us determine the LTS of a given road segment. An LTS of 1 describes a path where a less confident rider would be comfortable, while an LTS of 4 is suitable only for well-seasoned riders.

While the overarching problem is an urban and social one, on the sensor side it can be reduced to five main engineering problems. The problems, their solutions and the system design we came up with can be seen below:


The front Master box was 3D printed, while the back Slave box was a laser cut ABS box. The sensors were screwed into place as can be seen in the video below:

Next came the data collection for which we thank Jeremy, Mariam, and Jihwan for helping out by taking the boxes out to collect data. The data visualizations can be seen below:

The above shows the proximity sensors during a ride from TSRB to Home Park. Here we can see the value deflection while passing cars and other obstacles.

The above shows the PM sensors along the vertical axis during a ride from Midtown to Downtown. Here the PM sensors show really high peaks at the exact same time when the rider crossed large clouds of smoke and dust.

Along with the data we also collected video with a GoPro mounted on the bikes. The video was used to tag different instances in the sensor data, and in this way several 5-second signatures were marked and saved. The next step was to build a classifier for the right and left side sensors. For this we went through the video and tagged every obstacle that was apparent. One such video can be seen below:

This tagging was done to train the machine learning model. The time-based data was first translated into the distance domain to remove the bias caused by objects moving at different speeds relative to the bike. The data was then interpolated and smoothed to arrive at the proximity pattern features.
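The time-to-distance step can be illustrated with a small sketch. This is not our actual pipeline: it assumes each sample carries a timestamp, a bike speed and a proximity reading, and the function names, the 0.5 m grid step and the moving-average window are all made up for illustration.

```python
from bisect import bisect_right


def resample_by_distance(times, speeds, values, step_m=0.5):
    """Re-index a time series of sensor values onto an even distance grid.

    times  : sample timestamps in seconds (increasing)
    speeds : bike speed at each sample in m/s (assumed to be available)
    values : sensor reading at each sample
    """
    # Cumulative distance travelled, by trapezoidal integration of speed.
    dist = [0.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        dist.append(dist[-1] + 0.5 * (speeds[i] + speeds[i - 1]) * dt)

    grid, out = [], []
    n_steps = int(dist[-1] / step_m)
    for k in range(n_steps + 1):
        d = k * step_m
        # Index of the sample interval that contains distance d.
        j = min(max(bisect_right(dist, d), 1), len(dist) - 1)
        d0, d1 = dist[j - 1], dist[j]
        # Linear interpolation; guard against a stationary bike (d0 == d1).
        w = 0.0 if d1 == d0 else (d - d0) / (d1 - d0)
        grid.append(d)
        out.append(values[j - 1] + w * (values[j] - values[j - 1]))
    return grid, out


def moving_average(xs, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(xs)):
        chunk = xs[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

With the readings on a distance grid, a parked car looks the same length in the signal whether the rider passes it slowly or quickly, which is the bias the conversion is meant to remove.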

Two machine learning algorithms, Support Vector Machine (SVM) and Random Forest, were used to predict the class of each data segment. To compare the prediction models against the baseline (randomly predicting classes), we plotted them together over varying train-test splits. Accuracy was around 50%, which was encouraging but not enough. We attribute this mainly to needing more data: the model was learning and improving quickly, and was already well above the baseline score of around 18%.
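To make the idea of a random baseline concrete, here is a small sketch of how a chance-level accuracy can be computed from the tagged labels alone. The three strategies and the label names in the usage note are assumptions for illustration; the post does not record exactly how our baseline of around 18% was generated.

```python
from collections import Counter


def chance_accuracy(labels, strategy="stratified"):
    """Expected accuracy of a guesser that never looks at the sensor data.

    uniform    : picks each class with equal probability
    stratified : picks classes in proportion to how often they occur
    majority   : always picks the most common class
    """
    counts = Counter(labels)
    n = len(labels)
    if strategy == "uniform":
        return 1.0 / len(counts)
    if strategy == "stratified":
        return sum((c / n) ** 2 for c in counts.values())
    if strategy == "majority":
        return max(counts.values()) / n
    raise ValueError(f"unknown strategy: {strategy}")


def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted class matches the tag."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)
```

For example, `chance_accuracy(["car", "car", "car", "pole"])` gives 0.625 under the stratified strategy, so a trained model only demonstrates real predictive power on those labels if its accuracy is clearly above that figure.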

In the future we would like to see the prediction power of the model improve significantly through the collection of more training data. Some feature engineering should also improve the accuracy. The model would then be used to set well-defined, data-driven boundaries for the LTS model, which would help policy makers make the city safer for cyclists of all comfort levels.

Our overarching goal is to identify the environmental factors that raise bike riders’ stress levels. To do this, the environment has to be identified first, since sensors cannot detect semantic-level objects on their own. Once we tune and refine the prediction model for detecting environmental factors through feature engineering and modeling, we can move on to the real questions: how do bicycle infrastructure and environmental factors affect riders’ stress levels, and how can these relationships be used to construct the Level of Traffic Stress (LTS) model?


SEEING LIKE A BIKE: Calibration and system de-bugging

Sensor calibration was the group’s main focus for the sixth week of the program. The gas sensors and the GPS proved tougher to tame than the others.

The gas sensors were producing a wide range of values with no consistency. To calibrate them we need to measure the voltage of each sensor in an atmosphere devoid of the gas whose concentration is to be measured. This voltage serves as the base value that we feed into the code and treat as the “zero”.
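The zero-point idea can be sketched in a few lines. The function names and the example voltages are hypothetical, and the sketch assumes the sensor output is simply offset by a constant voltage when none of the target gas is present.

```python
from statistics import mean


def zero_baseline(clean_air_volts):
    """Average the readings taken with none of the target gas present."""
    return mean(clean_air_volts)


def subtract_baseline(volts, v_zero):
    """Express each reading relative to the calibrated zero voltage."""
    return [v - v_zero for v in volts]
```

For instance, if the readings in gas-free air were 0.40 V, 0.42 V and 0.41 V, the zero would be about 0.41 V, and a later reading of 0.91 V would count as roughly 0.50 V above zero.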

The first method we tried was to gently release helium over the sensors from helium balloons bought at the local store. This proved inefficient: the helium escaped faster than the sensors could respond, so there was never enough time for the sensors to record the voltages corresponding to helium before the balloon ran out of gas. Another downside was that helium balloons usually still contain about 5% normal air, which would affect the calibration.

To avoid this, the next thought was to use a vacuum chamber, which we could fill with helium to give the sensors ample time to adjust to the change in environment while also getting rid of interference from the outside air. We went on a hunt for a vacuum chamber but turned up empty-handed.

The option that opened up to us next was a godsend. We are going to be given access to the nearby sensing station, where we will co-locate our sensors with the existing ones, detect any variances in the values, and adjust our sensors accordingly.
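One common way to use co-located readings is to fit a straight line that maps our raw values onto the reference values. The sketch below assumes a linear relationship and paired, time-aligned readings, neither of which we have confirmed yet; the function names are made up for illustration.

```python
def fit_gain_offset(raw, reference):
    """Least-squares line reference ~= gain * raw + offset."""
    n = len(raw)
    mean_raw = sum(raw) / n
    mean_ref = sum(reference) / n
    sxx = sum((x - mean_raw) ** 2 for x in raw)
    sxy = sum((x - mean_raw) * (y - mean_ref) for x, y in zip(raw, reference))
    gain = sxy / sxx  # fails if all raw readings are identical
    offset = mean_ref - gain * mean_raw
    return gain, offset


def apply_calibration(raw, gain, offset):
    """Convert raw readings to the reference station's scale."""
    return [gain * x + offset for x in raw]
```

The fitted gain and offset would then be stored in the code and applied to every reading collected on a ride.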

Another issue that arose was cross-sensitivity: each gas, when present in a sufficient concentration, also affects the sensors meant for the other gases.


The matrix situated at the front has an LED array to indicate the status of the different sensors.
The color “Green” means the sensor is working and receiving data.
The color “Blue” means the sensor is working but not receiving correct data.
The color “Red” means the sensor is not working at all.
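The three states can be summarized as a small decision function. This is only a sketch: the names are hypothetical, and treating “correct data” as “a reading inside an expected range” is an assumption, not a description of the check our firmware performs.

```python
from enum import Enum


class Status(Enum):
    GREEN = "working and receiving data"
    BLUE = "working but not receiving correct data"
    RED = "not working at all"


def sensor_status(responding, reading, valid_range):
    """Pick the LED color for one sensor.

    responding  : True if the sensor answered at all
    reading     : latest value, or None if nothing usable arrived
    valid_range : (low, high) bounds assumed to define a plausible reading
    """
    if not responding:
        return Status.RED
    low, high = valid_range
    if reading is None or not (low <= reading <= high):
        return Status.BLUE
    return Status.GREEN
```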

During the final setup and mounting of the system on a bike, we realized that the LEDs for the sensors connected to the Arduino were red. This posed a huge problem, as a major portion of the system had now decided it just didn’t want to work!

At first we thought the cause was that the Raspberry Pi had a certain delay on startup which the Arduino didn’t have, so the two boards were out of sync. We turned out to be right, partially.

It turned out that the Arduino initially sent out garbage “nack” values. These values followed a particular sequence before the actual data transmission began, and this was the cause of the time lapse between data transmission and collection. So we coded the Raspberry Pi to act as a Master device that controls the Slave Arduino. It can now detect when the last character in the garbage sequence appears and then reset the Arduino for data transmission, so that the two boards are in sync.
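A minimal sketch of that start-up handshake is shown below. It assumes the garbage sequence ends in a single known byte that does not occur earlier in the sequence, and it takes the reset action as a callback; `sync_with_slave`, `last_garbage_byte` and `reset_slave` are hypothetical names, and `port` is any object with a `read(n)` method, such as an open serial port.

```python
import io


def sync_with_slave(port, last_garbage_byte, reset_slave, max_bytes=1024):
    """Discard start-up bytes until the final garbage byte, then reset.

    Returns the number of bytes discarded. Raises RuntimeError if the
    end of the start-up sequence is not seen within max_bytes.
    """
    for discarded in range(1, max_bytes + 1):
        byte = port.read(1)
        if not byte:
            break  # the port returned nothing, so give up
        if byte == last_garbage_byte:
            reset_slave()  # hypothetical hook that resets the Arduino
            return discarded
    raise RuntimeError("end of start-up sequence not seen")


if __name__ == "__main__":
    # Stand-in for a serial port: three junk bytes, a marker, then data.
    fake_port = io.BytesIO(b"\x15\x15\x00#DATA")
    n = sync_with_slave(fake_port, b"#", lambda: print("reset sent"))
    print(n, fake_port.read())
```

Passing the reset as a callback keeps the sketch independent of how the reset is actually wired between the two boards.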


Yesterday saw the first 15 min. test run of the fully functional system, though with some sensors not yet calibrated.
Running a preliminary data visualization on the pilot data yielded the following charts.


In the first row we see the Left Sonar Data and Right Sonar Data respectively.
The second row shows us the Gas Sensor Data.
The Left LIDAR shows fewer peaks than the Right LIDAR. This is expected, since the right is the sidewalk side, which has many more obstacles than the road side.

The gas sensor data was plotted only to check that the sensors were producing data in real-world conditions. The values mean nothing yet, as the sensors have not been calibrated.

For some reason, 3 min into the test, the GPS stopped writing to the JSON file, so we were limited to 3 min of test data.
We aren’t sure why this problem arose, which brings me to what we plan to do in the coming week.


Next week we will use the sensing station for gas sensor calibration. We will also make adjustments for the effect of each gas on the other sensors.
De-linking the data collection frequency of the other sensors from the GPS time will also be a task. This would allow us to use data even when the GPS fails.
Speaking of failing GPS, we will have to understand and troubleshoot why the GPS stopped writing to the JSON file.
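One possible shape for the de-linking, sketched under assumptions: each record is stamped with the board’s own clock and carries the latest GPS fix only if one exists, so the other sensors keep logging when the GPS drops out. The field names and the one-record-per-line JSON layout are hypothetical and not our current file format.

```python
import json
import time


def make_record(sensor_values, gps_fix=None, clock=time.time):
    """Build one log record stamped with the local clock.

    sensor_values : dict of readings, e.g. {"sonar_left": 120}
    gps_fix       : dict with position data, or None while the GPS is down
    clock         : callable returning the current time in seconds
    """
    return {"t": clock(), "sensors": sensor_values, "gps": gps_fix}


def to_json_line(record):
    """Serialize a record as a single line so a partial file stays readable."""
    return json.dumps(record) + "\n"
```

Records written while the GPS is down would have `"gps": null`, and their position could later be estimated from the neighboring records that do have a fix.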