Mapping Geospatial Networks

Summary

What if you had to find the quickest route with multiple stops? Questions start to add up:

  • What if you had multiple distribution locations?

  • Which warehouse should deliver to which customer?

  • Which delivery should be first?

  • What roads should be taken?

Businesses face this problem daily. Online stores, delivery services like Amazon, and even the restaurant industry are tasked to find the best route at the lowest cost.

I randomly picked 3 warehouse locations and 95 customers in Broward County, Florida. This analysis & distribution network can be duplicated and scaled for any business and location in the world.

LIBRARIES

library(osmdata)
library(osrm)

library(sf)
library(nngeo)
library(mapview)

library(tidyverse)
library(tidyquant)

Data

Bounding Box

A bounding box shows the minimum & maximum coordinates of a given location. Let’s see what the Broward County bounding box has.

broward_bbox <- getbb("Broward County, FL")

Customers & Distributor Data

First, we read in our random sample set of customers…

customer_sample_set <- read_rds("00_data/04_networks_broward_customer_sample_set.rds")

customer_sample_set
Simple feature collection with 96 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -80.43749 ymin: 25.98425 xmax: -80.08362 ymax: 26.30103
Geodetic CRS:  WGS 84
# A tibble: 96 × 3
   id                 type                 geometry
 * <chr>              <chr>             <POINT [°]>
 1 44664082           customer (-80.09194 26.22955)
 2 580514975591931520 customer (-80.11535 26.01145)
 3 53972487           customer (-80.12721 26.14451)
 4 29658273           customer (-80.12039 26.12528)
 5 661419175246481024 customer (-80.17177 26.11879)
 6 702164021465964672 customer  (-80.14492 26.0297)
 7 602170703800375680 customer  (-80.2359 26.10441)
 8 610699636025092096 customer (-80.17715 26.06205)
 9 580481794091230720 customer (-80.10324 26.12856)
10 53646829           customer  (-80.15227 26.1426)
# … with 86 more rows

And we need our random distributor sample set…

distributor_sample_set <- read_rds("00_data/04_networks_broward_distributor_sample_set.rds")

distributor_sample_set
Simple feature collection with 3 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -80.28177 ymin: 26.13737 xmax: -80.10083 ymax: 26.1581
Geodetic CRS:  WGS 84
# A tibble: 3 × 3
  id                 type                  geometry
* <chr>              <chr>              <POINT [°]>
1 661685640819565568 warehouse (-80.13461 26.13737)
2 42036239           warehouse (-80.10083 26.15479)
3 24283637           warehouse  (-80.28177 26.1581)

Highways & Medium-Sized Streets

With the bounding box coordinates we can get all OSM (Open Street Map) data for highways and other streets. Residential streets would take too long, but they can be added later.

# highways
broward_highways_sf <- opq(broward_bbox) %>%
    add_osm_feature(
        key   = "highway", 
        value = c("motorway", "primary", "motorway_link", "primary_link")
    ) %>%
    osmdata_sf() 

# medium streets
broward_medium_streets_sf <- opq(broward_bbox) %>%
    add_osm_feature(
        key   = "highway", 
        value = c("secondary", "tertiary", "secondary_link", "tertiary_link")
    ) %>%
    osmdata_sf()



# visualize
mapview(
    broward_medium_streets_sf$osm_lines, 
    color = "darkgreen",
    layer.name = "Streets"
) +
    mapview(
        broward_highways_sf$osm_lines,
        layer.name = "Highways",
        color = "purple"
    )

Customers & Distribution Centers

Let’s add a row id and visualize the 3 distribution centers and 95 customers.

  • Customers are blue

  • Distributors are magenta

broward_customers_sf <- customer_sample_set %>%
    rowid_to_column(var = "customer_id")


broward_distributors_sf <- distributor_sample_set %>%
    rowid_to_column(var = "distributor_id")

Nearest Neighbors

I decided to use nearest-neighbors to get the distance between locations. Alternatively we could use sfnetworks.

The Process

  1. First, connect each distributors to each customer, no matter the distance. The map looks overwhelming.

  2. Next, calculate the distance from each distributor to each customer. Even the farthest customer will be connected to a distributor.

  3. Finally, group by the the customer and keep the shortest distance to any distributor.

Now we have the best distributor for each customer.

Visualize Trip Points

The map below color-codes each distributor to their customer.

broward_route_points_sf <- broward_distributors_sf %>%
    
  bind_rows(broward_customers_sf) %>%
  select(type, distributor_id, customer_id, everything()) %>%
  
  # Adding in the distributor that the customer belongs to
  left_join(
      shortest_broward_network_sf %>% 
          select(distributor_id, customer_id) %>%
          as_tibble() %>%
          rename(distributor_to = distributor_id) %>%
          select(-geometry),
      by   = "customer_id"
  ) %>%
  
  # Cleanup distributor_to
  mutate(distributor_to = ifelse(is.na(distributor_to), distributor_id, distributor_to)) %>%
  mutate(distributor_to = as.factor(distributor_to))

Planning the Route

We use osrmTrip() to get the best route between each location for that distributor. This is our final product.

# visualize routes
mapview(
  broward_customers_sf,
  col.region = "blue",
  color      = "white",
  layer.name = "Customers",
  cex        = 12
) +
  mapview(
      broward_distributors_sf,
      col.region = "magenta",
      color      = "white",
      layer.name = "Warehouses",
      cex        = 20
  ) +
  mapview(
      warehouse_trips_sf,
      zcol = "distributor_to",
      color = tidyquant::palette_dark()[c(1,2,4)],
      layer.name = "Trip"
  )

Estimating trip cost

The National Private Truck Council (NPTC) averages $2.90 cost per mile including gas, maintenance, insurance, etc.

My estimated variables can be adjusted, but below is what I used.

  • $300 paid for each delivery truck driver

  • $2.90 cost per mile

# A tibble: 3 × 6
  distributor_to duration_min distance_miles driver_cost cost_per_mile total_c…¹
  <fct>                 <dbl>          <dbl>       <dbl>         <dbl>     <dbl>
1 1                      216.           77.9         300           2.9      526.
2 2                      107.           41.5         300           2.9      420.
3 3                      181.           90.9         300           2.9      564.
# … with abbreviated variable name ¹​total_cost

Any questions can be sent through the contact page. I’m happy to help.

Thanks for reading!