Authors: Kimanii Daniel, Kaushal Mehta, Gregory Bell
Abstract
All major cities, our schools, and even theme parks like Disney World use some form of mass transit to move people from one location to another. A bus system usually publishes a printed schedule of arrival times at certain stops. These schedules are best guesses, and creating accurate schedules from live feeds is an expense few transportation departments can afford. Our project uses crowd-sourced GPS locations to estimate the time it will take for a bus to arrive at a given bus stop. A bus route was driven to record GPS points, called breadcrumbs, for the route in the project. Tracking starts as soon as one person on the bus is running the software, with the bus driver as the first passenger. Additional GPS data from passengers adjusts the weight of each breadcrumb location. The system returns estimates to future passengers based on the bus's location and direction relative to the breadcrumb points. The method improved the accuracy of arrival times over the scheduled time by an average of 2 minutes.
Problem
Real-time tracking of movement and arrival estimation are becoming more and more popular as devices using the Global Positioning System (GPS) become more readily available. There is a multitude of proposed uses for this technology, from tracking pets and loved ones to guiding planes to their destinations. We decided to leverage this technology to solve the difficult problem of tracking the location of a bus on a bus route and estimating its time of arrival at a bus stop. This issue has plagued passengers for years, and many organizations have proposed various solutions. None of these solutions is perfectly accurate. Furthermore, they require a high initial investment by the transportation board or private transportation agency to install GPS devices on vehicles to increase the accuracy of estimations.
We propose an alternative, more cost-effective approach that uses crowd-sourced GPS devices in concert with official bus schedules to better estimate the location of the bus and its time of arrival at a bus stop. The key element of this solution is relying on the altruistic nature of passengers to share information for the benefit of all. It works using the GPS-enabled phones of the passengers and a centralized server. The idea is to build a mobile app that allows passengers to look up the ETA of the bus and also to let the server know when they have boarded the bus. The mobile app will then begin transmitting the location of the device to the server every 10 seconds. Using known GPS locations on the route and the GPS readings from the devices, the server will build a probability distribution by measuring each GPS reading against the known locations and modifying the distribution accordingly. We refer to the known locations on the bus route as breadcrumbs.
Design Methodology
We defined our solution to solve two problems:
1. Localization of the bus which includes position and direction
2. Estimation of arrival time to a specified bus stop
Localizing the bus was pivotal to our estimations of arrival time; after all, if you don't know where the bus is, it is impossible to estimate how long it will take to arrive. To help with localization we employed a modified particle filter concept and Bayes' theorem. A known trail of GPS points on the route, which we called the breadcrumbs, represents the weighted particles in the filter. The bus is tracked by these breadcrumbs, meaning we estimate which breadcrumb the bus is closest to and use that breadcrumb's exact known location as the position of the bus. The breadcrumbs on the route were created by driving the route and using a Google location tracker app for Android called “My Tracks”. This app allowed us to take GPS points every second, which were then used to form a breadcrumb trail of GPS locations on the route. The breadcrumb points closest to real bus stops were then tagged as bus stop breadcrumbs. These tags are used to estimate the distance from the localized bus location to the bus stop in question. We also calculated the distance between breadcrumb points using their GPS longitude and latitude, and these calculations were stored to speed up overall estimation processing. This distance forms part of the equation to calculate the Estimated Time of Arrival.
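The stored breadcrumb-to-breadcrumb distances can be sketched as follows. This is a minimal illustration, not the project's actual code: the text does not say which distance formula was used, so the haversine great-circle formula is an assumption here, and the names haversine_m and segment_distances are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points.

    Assumption: haversine is used; the report only says the distance was
    computed from longitude and latitude.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def segment_distances(breadcrumbs):
    """Distance from each breadcrumb to the next, computed once and stored.

    breadcrumbs is an ordered list of (latitude, longitude) tuples.
    """
    return [
        haversine_m(*breadcrumbs[i], *breadcrumbs[i + 1])
        for i in range(len(breadcrumbs) - 1)
    ]
```

Computing these segments once, when the route is recorded, means that later ETA requests only need to sum stored values.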
This solution is heavily dependent on crowd-sourced GPS readings from devices traveling on the bus. The idea is that users will share their location with our system once they are on the bus. We poll and store these locations every 10 seconds. Localizing the bus starts by creating a uniform probability distribution over all the breadcrumb locations, setting the weight of each breadcrumb point to 1/count(breadcrumbs). Our sample dataset has 31 breadcrumb points on one route, so each breadcrumb point starts with a weight of 0.0322581. We then apply a weighting formula to adjust the uniform distribution: the prior weight of a breadcrumb particle is multiplied by the distance between the device GPS reading and that particle, and the result is subtracted from the prior weight. We do this iteratively for each device and each breadcrumb point, slowly changing the probability distribution. This is done cumulatively using this formula:
Posterior weight = Prior weight – (Prior weight * breadcrumb distance)
This formula ensures that breadcrumbs closer to the GPS readings are weighted higher. The GPS data is re-sampled at each new position and checked against the breadcrumbs to re-weight the locations as the bus travels the route. As the system iterates through all the readings, applying this formula cumulatively, our belief of where the bus is moves toward the breadcrumb that the readings cluster around. Over time, and with more independent device GPS readings, the estimate of the bus location and the ETA improves.
If you look at the sample data above, you will see the breadcrumb data points and the probability distribution. The distribution starts with a uniform weight of 0.0322581, and the distance is calculated between each passenger GPS point and each breadcrumb GPS point. This distance is then used to determine the posterior weight of that breadcrumb point (using the formula above). For passenger 2, the distance of their device from every breadcrumb point is calculated, and the weight of each point is updated using the posterior weight from passenger 1 as the prior. This is repeated until there are no more GPS device readings for the period. The breadcrumb point with the highest weight is taken to be the point closest to the bus.
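The cumulative update can be sketched in a few lines. This is an illustration under stated assumptions rather than the project's implementation: the distances are assumed to be already scaled to the range 0 to 1 (the text does not give the distance units, and raw meters would drive the weights negative), and the function names and the three-breadcrumb sample values are hypothetical.

```python
def uniform_weights(count):
    """Start every breadcrumb with the same weight, 1 / count."""
    return [1.0 / count] * count


def update_weights(prior, distances):
    """Apply posterior = prior - (prior * distance) to every breadcrumb.

    Assumption: each distance is already scaled to [0, 1].
    """
    return [w - w * d for w, d in zip(prior, distances)]


# Hypothetical example: 3 breadcrumbs, 2 passenger readings.
# Each inner list holds one passenger's scaled distance to each breadcrumb.
readings = [
    [0.9, 0.1, 0.6],  # passenger 1
    [0.8, 0.2, 0.7],  # passenger 2
]

weights = uniform_weights(3)
for distances in readings:
    # The posterior from one passenger becomes the prior for the next.
    weights = update_weights(weights, distances)

# The breadcrumb with the highest weight is taken as the bus position.
closest = max(range(len(weights)), key=lambda i: weights[i])
```

With 31 breadcrumbs, uniform_weights(31) gives the starting weight of 0.0322581 quoted in the text.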
The direction the bus is traveling is very important for ETA calculations. Since the breadcrumb points are ordered 1 to 31 on the route, we can look at the order of the last five localizations, each being the breadcrumb with the highest weight at that time. This tells us whether the bus is moving toward or away from the bus stop in question.
Estimation of arrival time now becomes a matter of rearranging the physics formula for average speed:
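One way to read the direction from recent localizations is sketched below. This is an interpretation, not a documented part of the system: it assumes the last few localized breadcrumb numbers are kept in time order, and it uses a simple majority vote over the steps between them. The function name direction is hypothetical.

```python
def direction(recent_indices):
    """Return +1 if breadcrumb numbers are rising, -1 if falling, 0 if unclear.

    recent_indices holds the most recent localized breadcrumb numbers,
    oldest first (for example, the last five localizations).
    """
    steps = [b - a for a, b in zip(recent_indices, recent_indices[1:])]
    rising = sum(1 for s in steps if s > 0)
    falling = sum(1 for s in steps if s < 0)
    net = rising - falling
    return (net > 0) - (net < 0)
```

For example, direction([3, 4, 5, 6, 7]) returns 1 and direction([9, 8, 8, 7, 6]) returns -1. Voting on steps rather than comparing only the first and last values makes a single noisy localization, or one wrap-around from the last breadcrumb to the first on a cyclic route, less likely to flip the result.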
Time (ETA) = Total Distance / Average Speed
Distance – Summation of breadcrumb intra-distances
Average Speed – The average speed on all the GPS device readings
Here, distance was calculated by summing the stored distances between consecutive breadcrumb points from the localized bus location to the bus stop, taking into consideration the direction of the bus and the composition of the route. The composition of the route proved to be an important factor in our estimation, because the calculation for a cyclic route differs from that for a non-cyclic one. On a cyclic route the bus ends at the same location where it started but uses non-overlapping roads, whereas a non-cyclic or linear route uses the same roads to go out and come back to the main station. Therefore, if the bus is going away from the bus stop in question, the breadcrumbs leading away from the bus stop need to be summed before the breadcrumbs leading back toward it. Because these distances are stored, a simple query can yield this result.
That leaves the problem of average speed. We solved this by tracking not only the GPS location of the devices but also the time each reading was taken. The speed was determined using a rearrangement of the same formula:
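The difference between the two route types can be sketched as below. This is a simplified illustration with assumptions that go beyond the text: breadcrumbs are numbered from 0, a cyclic route is always driven in increasing breadcrumb order, and on a linear route a bus heading away from the stop continues to the end of the line before turning back. The function name and parameters are hypothetical.

```python
def remaining_distance(segments, bus, stop, heading, cyclic):
    """Distance along the route from breadcrumb `bus` to breadcrumb `stop`.

    segments[i] is the stored distance from breadcrumb i to breadcrumb i + 1.
    On a cyclic route the last entry closes the loop back to breadcrumb 0.
    heading is +1 when breadcrumb numbers are increasing, -1 when decreasing
    (only used for linear routes).
    """
    if cyclic:
        count = len(segments)  # one segment per breadcrumb on a loop
        total, i = 0.0, bus
        while i != stop:
            total += segments[i]
            i = (i + 1) % count  # wrap past the last breadcrumb
        return total

    def span(a, b):
        # Sum of stored segments between two breadcrumbs, in either order.
        lo, hi = sorted((a, b))
        return sum(segments[lo:hi])

    if (stop - bus) * heading >= 0:
        return span(bus, stop)  # stop is ahead of the bus
    # Stop is behind: go to the end of the line first, then come back.
    end = len(segments) if heading > 0 else 0
    return span(bus, end) + span(end, stop)
```

With segments of 100, 200, and 300 meters on a linear route, a bus at breadcrumb 1 heading toward breadcrumb 3 is 500 meters from a stop at breadcrumb 3, but 1100 meters from a stop at breadcrumb 0, because it must first reach the end of the line and return.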
Average Speed = Total Distance / Total Time
For example, if the bus took 10 minutes (600 seconds) to travel 1000 meters, then the average speed would be 1.67 meters per second. The ETA updates and becomes more accurate over time and as more devices share their locations on the bus.
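The two formulas combine into a short calculation, sketched here with hypothetical function names. Returning None when no usable speed is available is an assumption of this sketch, not something the text specifies.

```python
def average_speed(total_distance_m, total_time_s):
    """Average speed in meters per second; None if no time has elapsed."""
    if total_time_s <= 0:
        return None
    return total_distance_m / total_time_s


def eta_seconds(remaining_distance_m, speed_m_per_s):
    """Estimated time of arrival in seconds; None if the speed is unusable."""
    if not speed_m_per_s or speed_m_per_s <= 0:
        return None
    return remaining_distance_m / speed_m_per_s


# The example from the text: 1000 meters in 600 seconds.
speed = average_speed(1000, 600)  # about 1.67 m/s
# Hypothetical follow-on: a stop 500 meters ahead at that speed.
eta = eta_seconds(500, speed)     # about 300 seconds
```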
Proposed Usage
The main interface for this system will be a mobile app. Users choose the bus stop they want to go to and the bus stop they are currently at. This is sent to a server, which uses the information to estimate where the bus is and its estimated time of arrival. The app then displays the time of arrival to the user. If the server has no GPS device to collect readings from (no one is sharing their location or no one is on the bus), it falls back to the established bus route schedule. Once the bus arrives and the user boards, the app detects that the bus is in close proximity to the user and asks whether the user is on the bus. If the user answers in the affirmative, the app starts sending the location of the device to the server, which uses it to further localize the bus for other passengers waiting for it.
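The fallback to the printed schedule can be sketched as a single decision on the server. The function name, its parameters, and the returned source label are hypothetical and only illustrate the behavior described above.

```python
def arrival_estimate(readings, scheduled_eta_s, live_eta):
    """Pick the live estimate when readings exist, else the schedule.

    readings: GPS readings currently shared from the bus (may be empty).
    scheduled_eta_s: seconds until arrival according to the printed schedule.
    live_eta: a function that turns readings into an ETA in seconds.
    Returns (eta_in_seconds, source) where source is "live" or "schedule".
    """
    if not readings:
        # No one is sharing a location, so use the established schedule.
        return scheduled_eta_s, "schedule"
    return live_eta(readings), "live"
```

Returning the source alongside the ETA would let the app tell the user whether the time shown is a live estimate or the scheduled time.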
Considerations and Limitations
There were a few issues we had to take into consideration when designing this system. The system was designed to track one bus; we have not tested the viability of tracking multiple buses using this system.
Conclusion
The tracking project provided a very practical approach to a real-world problem that passengers experience when using public transportation. We learned that by using crowd-sourced GPS points and known GPS locations on the bus route, we were able to transform a uniform probability distribution into an estimate of how long the bus will take to arrive. This addresses a significant problem in the transportation industry. We have noticed pedestrians at bus stops checking their watches or running after a bus that was missed by seconds. Our approach is on a small scale, but if transportation agencies applied it at a larger scale, it would benefit their customers and the public opinion of the agency. This project determined that the solution is viable and proposed a more accurate, low-cost estimation system primarily for passengers.