Extremely Fast “Solution” to the Large-Scale and Very Large-Scale Vehicle Routing Problem

: A solution to the vehicle routing problem (VRP) is presented that takes only quadratic space, O(n 2 ), and quadratic time, O(n 2 ) , if n is the number of stops on a route. The input is assumed to be a list of stops of length n in longitude, latitude format. The output is an origin-destination (OD) matrix of size O(n 2 ) , which takes O(n 2 ) time to build. The element (i, j) in the matrix is the approximate driving distance between stop i and stop j on the route. Each approximate driving distance takes constant or O(1) time to compute. (The approximate driving distance appears in previous work by the author, published in URISA GIS-Pro ‘19 and CalGIS 2020.) This OD matrix is well-suited for solving large-scale and very large-scale VRP problems, since computing approximate driving distances is lightning fast. For instance, using real-world data, it took less than one (1) second to produce a route with 5,156 stops. The OD matrix can be used with any exact or approximation algorithm to find a route, including the nearest-neighbor approximation algorithm: Starting at an origin, the next closest stop is visited repeatedly, ending at the destination once all stops have been visited. Determining the next stop to visit takes linear or O(n) time to compute, and this is done O(n) times. This solution to the VRP is a polynomial-time, O(n 2 ) , approximation; it is not exact, but is extremely fast.


Introduction/Literature Review
The vehicle routing problem (VRP) is defined as follows: Beginning at a depot, a route is formed by visiting all stops before returning to the depot [1]. The VRP is a generalization of the famous "traveling salesman" problem (TSP) [4]. Finding an optimal route in which travel distance is minimized is known to be NP-hard [5]. Both the VRP and TSP are NP-complete problems [7]. An approximate solution is presented which uses the polynomial-time (P-time) nearest-neighbor approximation algorithm [12]. The contribution in this work is that an origin-destination (OD) matrix is built on the fly with approximate driving distances [8][9][10]. The OD matrix is usually a pre-computed input to the VRP algorithm.
Here are several related works in the VRP published by the Transportation Research Record (TRR): Wang et al. (2013) consider the vehicle routing problem of simultaneous deliveries and pickups with split loads and time windows (VRPSDPSLTW) [15]. They formulate the problem as a mixed-integer programming problem and use a hybrid heuristic algorithm. Ripplinger (2005) considers the case of the rural school vehicle routing problem. Ripplinger develops a mathematic model, and a new heuristic. His results are superior to existing VRP for the pickup and drop-off of students in rural areas [11]. Tang and Miller-Hooks (2006) define the VRP with solution shape constraints. They propose an interactive heuristic which when coupled with effective shape measures, produces solutions with significantly improved layout [14]. Figliozzi (2010) considers the case of VRP for emissions minimization (EVRP). He provides both a formulation and solution to EVRP [2].
Large-scale routes have hundreds of stops, such as the routes for package carriers like UPS, FedEx, DHL, OnTrac, Amazon, and USPS. Very large-scale routes have thousands of stops, such as the routes for garbage and recycling trucks. The presented algorithm is a GIS solution to a GIS problem. The math is close to trivial. A distance or an approximate distance is really more of a concept or an idea than a mathematic number or equation.
The presented algorithm makes possible solutions to the VRP which were previously impossible. Computing routes with a large or very large number of stops in a fraction of a second forever changes route planning. Stops can be inserted and deleted, and a new route produced in real-time. Feedback from the client in the City of Chamblee, Georgia, indicates that the new routes are superior to previous routes, which were done partly "by hand." Adding a stop changes the route, but so can deleting one or more stops (it can change the order of the remaining stops).
The rest of the paper is organized as follows: 1.

Approximate Driving Distance
Each line of the input file has a longitude, latitude pair (in degrees) of a stop in the route. First, convert these from degrees to radians. Then, compute the x, y, z coordinates of each stop: x = longitude * r * Cos(ϕ 0 ) y = latitude * r z = elevation where r is the radius of the Earth, and ϕ 0 is a centrally located latitude in the dataset. This forms an equirectangular projection [3].
Let (x i , y i , z i ) and (x j , y j , z j ) be two stops in a route. The approximate driving distance between them is: distance(i, j) = Abs(x i -x j ) + Abs(y i -y j ) + Abs(z i -z j ) This approximate distance, the Manhattan distance [13], is both a better approximation of the actual driving distance than the Euclidean distance, and an order of magnitude faster to compute than the Euclidean [8][9][10]. See Zeager and Stitz (2016) for a description of Euclidean distance [16].
The Manhattan distance is extremely fast to compute: A pilot study was performed to determine how accurate Manhattan distances are compared to actual driving distances. A nonrepresentative study of 200 green taxi cabs rides in New York City on January 1, 2016, starting at 12 AM EST, had the following distribution: It was confirmed that this distribution is normal. On average, the Manhattan distance is about 80% of the actual driving distance, and the Manhattan sometimes overestimates actual driving distance.

OD Matrix and the Nearest-Neighbor Approximation Algorithm
As mentioned in the Abstract and Introduction, an OD matrix is built using approximate driving distances. This is important because computing approximate driving distances is lightning fast [8][9][10]. Building the OD matrix takes O(n 2 ) space and O(n 2 ) time, where n is the number of stops on the route.
Using this OD matrix, any exact or approximate algorithm can be used to find a route but for the sake of simplicity, the standard nearest-neighbor approximation algorithm is used. First, find the first stop (the one closest to the depot), and then successively the next closest stop, returning to the depot after the final stop is visited. This takes O(n 2 ) time, where n is the number of stops on the route.

Case Study
Using data from the City of Chamblee, Georgia, a route with 5,156 stops was computed in under one (1) second. It was, however, necessary to expand the stack size of the executable, because of the size of the OD matrix which was over 26.5 million cells (5,156 rows, and 5,156 columns). The client was satisfied with the route computed.
If actual driving distances were used instead in the case study, the computation would take around 308 virtual days instead of less than one second (1s), if we assume the computation of each actual driving distance takes around one virtual second:

Results
The map on the following page shows the produced route for the case study. The truck depot is located at the red star, stops are shown as blue dots, and the visitation order starting at the depot is given in unclassed colors (yellow to orange to red): No truck could visit all 5,156 stops in one day. From what is understood, daily routes for each truck visit around 1,100 to 1,200 stops.
But the master route of 5,156 stops is a good test case for the presented algorithm. It is amazing that it completes in less than one second (<1s).

Limitations/Discussion
The main advantage of this solution to the large-scale or very large-scale VRP is its improved speed. Its main drawback is its 20% lower accuracy. Approximate driving distances are retained only to find the order of stops on a route. Once the route is determined, it is important to use other tools, such as driving directions in ArcGIS, Waze, or Google Maps to fine tune the route.
Li et al. (2016) introduce the "six Vs" of Geospatial Big Data: volume, variety, velocity, veracity, visualization, and visibility [6]. In a world filled with Big Data, where the volume of data points to compute distances between, and the velocity at which these distances are expected to be computed, are both extremely high, a fast algorithm for computing approximate distances may be the only choice. This introduces the issue of veracity: How reliably accurate these approximate distances are.
A number of optimizations were skipped in the case study because they were unnecessary. Since the client's route was produced in less than one (1) second, compiler optimization flags were not used. Also, software floating-point operations were used, not hardware which are faster. There might be one or two optimizations in the code itself that were not implemented because it was deemed unnecessary.

Conclusion/Future Work
It is impossible to gauge how "good" the presented solution to the VRP is because it cannot be compared to the exact solution, which would require Factorial (5,156) or 5,156! time. This amount of time triggers an Overflow in the Calculator program built into Windows 10 Pro. The exact solution would likely take millions of years to compute. However, additional quantitative and qualitative data can be collected, such as: 1. Gallons of fuel saved using the new routes 2. Time saved using the new routes 3. Interviews with managers, truck drivers, and other employees Such data has not yet been collected, but if the opportunity arises it will be collected in the future.
A serial algorithm has been developed to "solve" the VRP (see Appendix). A next step could be to parallelize this serial algorithm to get good speedup on larger problems than the one presented here, which computed a "master route" with 5,156 stops in under one second (<1s).
It should be possible to "solve" other NP-complete problems using this solution to the VRP. All it takes is a translation or transformation from another NP-complete problem space to this VRP problem space, and back again after a solution to the VRP is computed.
The accuracy of the VRP algorithm should be explored, perhaps with a smaller problem with a known exact solution.
Finally, the VRP algorithm can be implemented as a Python toolkit within ArcGIS Pro.