Seeing Like a Bike: Iterations for Better Data Quality

Our team has been refining the sensor boxes as we collect data. Our friendly colleagues volunteered to ride a bike, and each time they came back with a handful of data along with a list of things to fix in the hardware, the software, and the data itself. The technical challenges that come from physical shocks and vibrations can hardly be solved perfectly given the nature of the electrical parts used, but many issues have been resolved by making small changes in an iterative way.

Trial and Error: Data Collection and Sensor Box Refinement

Without the help of our awesome colleagues, this would not have been possible.

Whenever the LEDs indicated a sensor malfunction or we found wrong data, we unpacked the box and examined the flaws. The minor (?) issues that we identified and fixed are as follows:

  • Occasional hiccups in the communication between the Pi and the Arduino -> resolved by implementing timeout and reset functionality.
  • Impedance issue: all of a sudden, an Arduino board stopped working and sent out “NACK” signals, and it never came back to normal after we reset the board -> resolved by removing some of the solder from our custom bridge. Too much solder on the PCB prevents weak signals from getting in and out of the Arduino because of the low impedance.
  • Cable order: some cables for the sonar sensor turned out to have their pins in reversed order. This did not raise an error, but the data was wrong -> we examined all the cables, bridges, and sensor pins.
  • Broken wires: some cables looked fine from the outside, but a wire was broken inside the socket. This can be prevented in the future by using stronger cables and sockets that can withstand bike shocks.
  • Hardware errors: some Arduino boards, USB-to-TTL connectors, and sensors turned out to be damaged and out of order -> these are the hardest problems to identify. After finding them, we had to replace the parts.
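
The timeout-and-reset idea from the first item can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code running on the boxes: `read_with_timeout`, `FlakyLink`, the dummy reading, and the timeout and retry values are all hypothetical, and `FlakyLink` only stands in for the real serial connection between the Pi and the Arduino.

```python
import time


class FlakyLink:
    """Hypothetical stand-in for the Pi-to-Arduino link.

    It stays silent until it has been reset `resets_needed` times,
    which imitates a communication hiccup.
    """

    def __init__(self, resets_needed):
        self.resets_needed = resets_needed
        self.reset_count = 0

    def poll(self):
        """Return a reading, or None if nothing has arrived yet."""
        if self.reset_count < self.resets_needed:
            return None
        return "sonar=123"  # dummy payload, not real sensor output

    def reset(self):
        self.reset_count += 1


def read_with_timeout(link, timeout_s=0.05, max_resets=3, poll_interval_s=0.005):
    """Poll the link until data arrives; reset it whenever a read times out."""
    for attempt in range(max_resets + 1):
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            reading = link.poll()
            if reading is not None:
                return reading
            time.sleep(poll_interval_s)
        # Timed out: reset the link before the next attempt, if any remain.
        if attempt < max_resets:
            link.reset()
    raise TimeoutError("no data from the Arduino after %d resets" % max_resets)
```

Using a monotonic clock for the deadline keeps the timeout correct even if the Pi's wall clock is adjusted during a ride.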

Gas Calibration Data Collected

With the help of Raj, a Ph.D. student from the department of Environmental Science, we were able to co-locate our gas sensors at the official gas sensing station 10 minutes away from Georgia Tech. By comparing our sensor data with the official data from the station, we expect to adjust the gas sensors to some degree. Since the temporal resolution of the official data is one hour, it will be hard to adjust them very precisely. Even so, we expect this to improve our gas sensor accuracy greatly.
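
One way such a comparison could work is sketched below. Because the reference data is hourly, the sensor readings first have to be averaged down to one value per hour; a gain and an offset are then fitted by ordinary least squares. The linear model, the function names, and the data layout are assumptions made for illustration, not a description of the calibration we will end up using.

```python
from collections import defaultdict


def hourly_means(samples):
    """Average (timestamp_seconds, value) samples into hourly buckets."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[int(ts // 3600)].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in buckets.items()}


def fit_linear(xs, ys):
    """Ordinary least squares fit of ys ~ gain * xs + offset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    return gain, offset


def calibrate(sensor_samples, official_hourly):
    """Fit gain and offset using only the hours present in both data sets.

    official_hourly maps an hour index (timestamp // 3600) to the
    reference station's value for that hour.
    """
    sensor_hourly = hourly_means(sensor_samples)
    hours = sorted(set(sensor_hourly) & set(official_hourly))
    xs = [sensor_hourly[h] for h in hours]
    ys = [official_hourly[h] for h in hours]
    return fit_linear(xs, ys)
```

With only one reference point per hour, a short co-location yields few points for the fit, which is why the adjustment cannot be very precise.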


Environmental Signatures and Ground Truth Data

If we can identify what objects are around the bike only by looking at the sensor data, it becomes possible to use the sensor data for semantic-level analyses. Without a guaranteed connection between the sensor data and real-world objects, models of environmental factors built on sensor data would hardly convince an audience, given the noisy nature of the sensors. Our strategy for analyzing the sensor data begins with creating semantic-level signatures and classifying each segment of the data streaming from the bikes. To do that, we recorded environmental information in video and voice using a GoPro and voice recorders. These qualitative data provide ground-truth information for the sensor data.

Based on the Level of Traffic Stress (LTS) model, we listed possible obstacles and objects along the biking routes. After aligning the GoPro video and the sensor streams by time, we qualitatively tagged each segment of the video (only when the circumstances were not too complex). For example, when a vehicle passes the bike and there are no other objects around in a video segment, we assume that the corresponding sensor data is a typical signature of a passing vehicle. After a test ride in downtown Atlanta, we had collected a set of ground-truth data.
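
The alignment and tagging step can be pictured with a small sketch like the one below. The function name, the tuple layouts, and the idea of a single known clock offset between the video and the sensor box are assumptions for illustration; samples that fall outside any tagged interval simply stay unlabeled, which mirrors our choice to tag only the clear-cut segments.

```python
def label_samples(sensor_samples, video_tags, clock_offset_s=0.0):
    """Attach a video tag to each sensor sample inside a tagged interval.

    sensor_samples: list of (timestamp_s, value) on the sensor clock.
    video_tags: list of (start_s, end_s, label) on the video clock,
        with the end of each interval exclusive.
    clock_offset_s: video time minus sensor time (assumed to be known).
    """
    labeled = []
    for ts, value in sensor_samples:
        video_ts = ts + clock_offset_s
        label = None  # stays None when no tag covers this moment
        for start, end, tag in video_tags:
            if start <= video_ts < end:
                label = tag
                break
        labeled.append((ts, value, label))
    return labeled
```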


Here are two examples of creating signatures: (1) a narrow street with cars parked in parallel, and (2) a city road with a car passing the rider.


The temporal pattern of the Lidar data corresponding to this video segment is as follows:


Since the frequency content of the proximity values might indicate objects better than their temporal pattern does, we converted this signal into the frequency domain using the Discrete Cosine Transform.
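
For readers who want to see the transform itself, here is a minimal pure-Python sketch of the DCT-II, the most common variant of the Discrete Cosine Transform. Which variant and normalization to use, whether to remove the mean first, and how many coefficients to keep are assumptions made here for illustration; `dct2` and `signature` are hypothetical names.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal))
        for k in range(n_samples)
    ]


def signature(signal, n_coeffs=10):
    """Magnitudes of the first few DCT coefficients after removing the mean."""
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]
    return [abs(c) for c in dct2(centered)[:n_coeffs]]
```

Removing the mean keeps the coefficient at index 0 from dominating the signature, so the shape of the variation matters more than the absolute distance to the nearest object.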

This frequency signature can be used to classify similar environmental factors in the data. Similarly, the case where a car passes the rider is as follows.

These two cases each show a distinct pattern to some degree. The graph for the street with cars parked in parallel shows a regular change in Lidar values, which results in high values at mid-level frequencies (around 4 to 7). Meanwhile, the case where a car passes the rider shows a higher value at a low frequency (around 2 to 3), since the Lidar value changes sharply all at once. Of course, these are exploratory signatures, and more ground-truth data and data from other sensors need to be merged in to provide robust signatures.
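
As a rough picture of how signatures like these could drive a classification, the sketch below assigns a segment to the reference signature it most resembles, using cosine similarity. This nearest-signature approach is only one possible choice and is not necessarily the prediction model we will settle on; the function names are hypothetical, and the reference vectors in the example are made-up shapes that only echo the two patterns described above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(segment_signature, reference_signatures):
    """Return the label whose reference signature is most similar to the segment."""
    return max(
        reference_signatures,
        key=lambda label: cosine_similarity(
            segment_signature, reference_signatures[label]
        ),
    )


# Made-up reference shapes (not measured data): energy at indices 4 to 7
# for parked cars, and at indices 2 to 3 for a passing car.
references = {
    "parked_cars": [0, 0, 0, 0, 1, 1, 1, 1, 0, 0],
    "car_passing": [0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
}
segment = [0, 0, 0.1, 0.2, 0.8, 0.9, 0.7, 0.6, 0, 0]
predicted = classify(segment, references)
```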

We are working on generating more ground-truth data. The classification performance for data segments depends on (1) the quality of the signatures, (2) the quality of the ground-truth data, and (3) the prediction model (feature engineering). We hope to finish the first round of classification of the sensor data in a few days with high prediction performance.

We are reporting our final results at the DSSG final presentation on Monday (July 24th, 2017).
