Section 01

The Problem

At LifeSeasons, warehouse workers pick products for multiple customer orders at the same time. A group of orders assigned to one cart is called a batch. A worker loads a cart, walks from shelf to shelf picking items across multiple orders, then returns to the packing station. The route a worker takes — and how far they walk — depends entirely on which orders were grouped together.

Batch: a group of orders assigned to one worker + cart
WMS: the current system; auto-assigns orders to batches
Goal: less walking = faster picks = lower labor cost

The LifeSeasons Warehouse Management System (WMS) groups orders into batches automatically — but nobody knew if it was doing a good job. The goal of this project: find smarter ways to group orders so workers walk less, saving time and labor cost.

Every extra step a worker takes is time not spent packing. Across dozens of workers, hundreds of batches, and thousands of orders per week, even a 20% reduction in walking distance compounds into significant labor savings. The right batching algorithm treats this as a routing problem: group orders so the combined route is as short as possible.

Section 02

The Data We Had

Lines in history file: 10,004
Actual pick events: 2,815
Unique orders: 1,017
Usable orders: 930 (87 dropped for missing inventory match)
Shelf locations mapped: 1,563
Max pick sequence: 72,775 (physical walking order index)

Data Sources

Order history: every pick event, with order number, batch assignment, shelf location, and units picked. This tells us exactly which shelf locations each order visited.

Shelf locations: each location has a name like 21-8-5-1 (Row 21, Bay 8, Level 5, Slot 1) and a pick sequence number (1 to 72,775). The pick sequence represents physical walking order: a location with sequence 1,000 comes before a location with sequence 2,000 on the walk path.

Batch assignments: which orders the WMS grouped together (e.g., BATCH6174, BATCH6180).

Shipping type: FedEx (urgent) vs UPS Ground (normal), extracted from order description text.
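As a rough illustration of how the order history and shelf-location data fit together, the sketch below maps each pick event to a pick sequence and drops orders that cannot be matched. The function name, the input shapes, the sample sequence values, and the rule that a single unmatched location disqualifies the whole order are all assumptions; the source only says that 87 orders were dropped for a missing inventory match.

```python
def usable_orders(pick_events, pick_sequence):
    """Group pick events by order and keep only fully matched orders.

    pick_events   : iterable of (order_id, location_name) pairs
    pick_sequence : dict mapping location_name -> pick sequence number
    Returns a dict mapping order_id -> set of pick sequence numbers.
    """
    orders = {}
    unmatched = set()
    for order_id, location in pick_events:
        if location in pick_sequence:
            orders.setdefault(order_id, set()).add(pick_sequence[location])
        else:
            # Assumption: one unmatched location makes the order unusable.
            unmatched.add(order_id)
    return {oid: seqs for oid, seqs in orders.items() if oid not in unmatched}


# Made-up sample data, for illustration only.
events = [
    ("O1", "1-1-1-1"),
    ("O1", "21-8-5-1"),
    ("O2", "1-1-1-1"),
    ("O2", "99-9-9-9"),  # not present in the inventory mapping
]
sequences = {"1-1-1-1": 10, "21-8-5-1": 70000}
print(usable_orders(events, sequences))
```

Here order O2 is dropped because one of its locations has no inventory match, which mirrors the filtering step that reduced 1,017 unique orders to 930 usable ones.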

Warehouse Structure

Location names encode the physical layout: ROW - BAY - LEVEL - SLOT

The inventory data includes a zone field that LifeSeasons' own WMS assigns to every shelf location. Two zones appear in the data: Speed Cell and BackStock. These are LifeSeasons' own terms — not labels we assigned.

Speed Cell: Rows 1–3. High-frequency items picked constantly; low pick sequence numbers, closest to the packing station.
Rows 4–9: Not in data. Zero inventory entries for these rows; purpose unknown (we have asked LifeSeasons).
BackStock: Rows 10–21. Slower-moving bulk storage; high pick sequence numbers, further from the packing station.
Bays per Row: up to 15 (horizontal columns within each row)
Levels: 5 high (vertical shelves within each bay)
Slots per Bay: up to 11 (horizontal positions within a bay level)
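The ROW - BAY - LEVEL - SLOT naming scheme can be unpacked with a few lines of code. The sketch below is illustrative only: `ShelfLocation`, `parse_location`, and `zone_for_row` are hypothetical names, and deriving the zone from the row number is an assumption based on the row ranges reported above. In the real data the zone is a field assigned by the WMS.

```python
from typing import NamedTuple


class ShelfLocation(NamedTuple):
    row: int
    bay: int
    level: int
    slot: int


def parse_location(name: str) -> ShelfLocation:
    """Split a name like '21-8-5-1' into row, bay, level, and slot."""
    parts = name.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected ROW-BAY-LEVEL-SLOT, got {name!r}")
    return ShelfLocation(*(int(p) for p in parts))


def zone_for_row(row: int) -> str:
    """Approximate the zone from the row ranges seen in the data (assumption)."""
    if 1 <= row <= 3:
        return "Speed Cell"
    if 10 <= row <= 21:
        return "BackStock"
    return "Unknown"  # rows 4-9 have no inventory entries in the data


loc = parse_location("21-8-5-1")
print(loc, zone_for_row(loc.row))
```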

~10 days of order data were used in this analysis. The shipping type (FedEx vs UPS) was parsed from free-text order descriptions. 366 FedEx orders and 564 UPS orders were identified in the usable dataset.

Section 03

How We Measure Route Cost

To compare algorithms, we need a single number that represents "how much walking does this batch require?" We define route cost as:

Route Cost  =  2 × max(pick sequence in batch)

The worker departs from the depot at sequence 0, walks to the farthest shelf, and returns. Lower cost = less walking = better.

For example, if a batch has pick locations at sequences 1,200, 4,500, and 8,000, the route cost is 2 × 8,000 = 16,000. The worker must travel to position 8,000 and back, regardless of how many stops are in between.
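The route cost definition translates directly into code. This is a minimal sketch; the function name `route_cost` and the choice to return 0 for an empty batch are assumptions made for illustration.

```python
def route_cost(pick_sequences):
    """Round trip from the depot (sequence 0) to the farthest pick and back."""
    if not pick_sequences:
        return 0  # assumption: an empty batch requires no walking
    return 2 * max(pick_sequences)


print(route_cost([1200, 4500, 8000]))  # 16000, matching the worked example
```

Note that adding a stop at any sequence below 8,000 leaves the cost unchanged, which is exactly the simplification discussed under Limitations.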

Lower
= Better
Fewer steps for the worker
Higher
= Worse
More distance walked

Important caveat: This formula is a simplification. Pick sequence is an index — not actual meters. The real impact of this limitation is covered in detail in the Limitations section.

Section 04

Priority / Delivery Type Handling

Not all orders are equal. FedEx orders are urgent — they must ship out fast. UPS Ground orders have more flexibility. Every algorithm in this study uses wave splitting to handle this.

FedEx: Wave 1 (Priority)  |  366 orders  |  Picked first, always
UPS: Wave 2 (Normal)  |  564 orders  |  Picked after Wave 1

Wave Splitting Flow
Identical across all methods; runs before batching begins.

  • All Orders In: 930 orders from history
  • Classify: FedEx vs UPS, from description
  • Wave 1: 366 FedEx, batched first
  • Wave 2: 564 UPS, batched second
  • Workers Pick: Wave 1 fully done before Wave 2 starts

Zero mixing — no urgent FedEx order ever shares a cart with a normal UPS order. Wave splitting is identical across all algorithms. It is a pre-processing step that runs before any batching algorithm begins.
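A minimal sketch of the wave-splitting step is shown below. The keyword rule (any description containing "fedex" is urgent, everything else is treated as UPS Ground) and the sample descriptions are assumptions; the actual parsing rules used in the study are not given in this report.

```python
def classify_shipping(description: str) -> str:
    """Classify an order from its free-text description (assumed keyword rule)."""
    return "FedEx" if "fedex" in description.lower() else "UPS"


def split_waves(orders):
    """Split orders into Wave 1 (FedEx) and Wave 2 (UPS).

    orders: dict mapping order_id -> free-text description
    Returns (wave1_ids, wave2_ids), preserving input order.
    """
    wave1, wave2 = [], []
    for order_id, description in orders.items():
        if classify_shipping(description) == "FedEx":
            wave1.append(order_id)
        else:
            wave2.append(order_id)
    return wave1, wave2


# Invented descriptions, for illustration only.
sample = {
    "1001": "FedEx 2Day - 3 items",
    "1002": "UPS Ground - 1 item",
    "1003": "FEDEX Overnight",
}
wave1, wave2 = split_waves(sample)
print(wave1, wave2)
```

Each wave is then passed to the batching algorithm separately, so a batch can never contain orders from both waves.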

Section 05

The 4 Methods

We tested four batching methods against the existing WMS and a random baseline: three distinct algorithms plus a consolidation variant of Clarke-Wright. All four run on top of the wave splitting described above.

Method 1

Jaccard Greedy

Groups orders that share the same shelf locations. Simple and fast — completely blind to actual walking distance.

  • Seed = order with the highest pick sequence
  • Fill batch by Jaccard similarity score (shared locations / total locations)
  • Keeps filling until batch is full or no similar orders remain
  • Analogy: grouping people going to similar neighborhoods
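The Jaccard Greedy steps above can be sketched as follows. Orders are represented as sets of pick sequence numbers, one per location visited. Comparing each candidate against the batch's combined location set (not just the seed) is an assumption, as are the function names; the report does not specify these details.

```python
def jaccard(a: set, b: set) -> float:
    """Shared locations divided by total distinct locations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_greedy(orders, capacity=15):
    """orders: dict order_id -> non-empty set of pick sequence numbers."""
    remaining = dict(orders)
    batches = []
    while remaining:
        # Seed = order that reaches the highest pick sequence.
        seed = max(remaining, key=lambda o: max(remaining[o]))
        batch = [seed]
        batch_locs = set(remaining.pop(seed))
        while len(batch) < capacity and remaining:
            best = max(remaining, key=lambda o: jaccard(batch_locs, remaining[o]))
            if jaccard(batch_locs, remaining[best]) == 0.0:
                break  # no similar orders remain
            batch.append(best)
            batch_locs |= remaining.pop(best)
        batches.append(batch)
    return batches


# Invented sample orders with a small capacity to keep the output short.
sample = {"A": {100, 200}, "B": {100, 200, 300}, "C": {9000}, "D": {9000, 9100}}
print(jaccard_greedy(sample, capacity=2))
```

On this sample the two far-away orders (D, C) end up together, as do the two nearby ones (B, A), purely because of shared locations.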
Method 2

Clarke-Wright Savings (CW)

Calculates savings for every possible pair of orders. Never merges orders that increase total walking.

  • Savings = cost(A alone) + cost(B alone) − cost(A+B together)
  • All pairs ranked by savings, highest first
  • Greedily merges pairs with the greatest savings
  • Analogy: finding which two routes overlap most and combining them
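The savings idea above can be sketched as follows, using the 2 × max route cost from Section 03. This is a simplified batching adaptation, not the classic vehicle-routing formulation: savings are computed once per order pair, and a merge is skipped if the combined batch would exceed capacity. The function names and data shapes are assumptions.

```python
from itertools import combinations


def route_cost(seqs):
    return 2 * max(seqs) if seqs else 0


def clarke_wright(orders, capacity=15):
    """orders: dict order_id -> set of pick sequence numbers."""
    # Every order starts in its own batch.
    batch_of = {oid: i for i, oid in enumerate(orders)}
    batches = {i: [oid] for oid, i in batch_of.items()}

    # Savings = cost(A alone) + cost(B alone) - cost(A+B together)
    savings = []
    for a, b in combinations(orders, 2):
        together = route_cost(orders[a] | orders[b])
        savings.append((route_cost(orders[a]) + route_cost(orders[b]) - together, a, b))
    savings.sort(key=lambda item: item[0], reverse=True)

    for saving, a, b in savings:
        ia, ib = batch_of[a], batch_of[b]
        if saving <= 0 or ia == ib:
            continue  # no benefit, or already in the same batch
        if len(batches[ia]) + len(batches[ib]) > capacity:
            continue  # merged batch would not fit on one cart
        batches[ia].extend(batches[ib])
        for oid in batches.pop(ib):
            batch_of[oid] = ia
    return list(batches.values())


# Invented sample orders.
sample = {"A": {8000}, "B": {7900}, "C": {100}, "D": {150}}
print(clarke_wright(sample, capacity=2))
```

Under this cost model the saving for a pair equals twice the smaller of the two orders' farthest sequences, so the highest-ranked pairs are those where both orders travel deep into the warehouse.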
Method 3

RAG (Routing-Aware Greedy)

Seed-based. Picks one order, then keeps adding the order with minimum extra walking. Industry standard — used by Zalando.

  • Seed one order, then find neighbor with lowest marginal cost
  • Two-stage: Jaccard screen (top 15 candidates) then pick by min marginal cost
  • Called "DGA" (Distance-Greedy Algorithm) at Zalando
  • Analogy: a GPS that picks the next stop adding least distance
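The two-stage selection above can be sketched as follows. Seeding with the first remaining order is an assumption (the report does not state RAG's seed rule), as are the function names; `top_k` corresponds to the 15-candidate Jaccard screen.

```python
def route_cost(seqs):
    return 2 * max(seqs) if seqs else 0


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rag_batches(orders, capacity=15, top_k=15):
    """orders: dict order_id -> set of pick sequence numbers."""
    remaining = dict(orders)
    batches = []
    while remaining:
        seed = next(iter(remaining))  # assumption: seed = first remaining order
        batch = [seed]
        locs = set(remaining.pop(seed))
        while len(batch) < capacity and remaining:
            # Stage 1: shortlist the top_k most similar orders by Jaccard score.
            shortlist = sorted(
                remaining, key=lambda o: jaccard(locs, remaining[o]), reverse=True
            )[:top_k]
            # Stage 2: pick the shortlisted order that adds the least route cost.
            best = min(
                shortlist,
                key=lambda o: route_cost(locs | remaining[o]) - route_cost(locs),
            )
            batch.append(best)
            locs |= remaining.pop(best)
        batches.append(batch)
    return batches


# Invented sample orders.
sample = {"S": {1000, 1100}, "X": {1000, 9000}, "Y": {1000, 1200}, "Z": {5000}}
print(rag_batches(sample, capacity=2, top_k=2))
```

In the sample, X and Y are equally similar to the seed S, but Y adds far less walking, so stage 2 chooses Y.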
Method 4

CW + Consolidation

Clarke-Wright plus a second pass that combines small batches to ensure fuller carts.

  • Runs full CW first
  • Then iterates: merge smallest batches into nearby batches
  • Only merges same priority type (no FedEx + UPS mixing)
  • Ensures cart capacity is used as efficiently as possible
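One way the consolidation pass could work is sketched below. The batch representation (a dict with `wave`, `orders`, and `max_seq` keys) and the definition of "nearby" as the closest maximum pick sequence are assumptions made for illustration; the report does not define either.

```python
def consolidate(batches, capacity=15):
    """Merge small batches into same-wave batches that still have room."""
    # Work on copies so the caller's batches are not modified.
    batches = [dict(b, orders=list(b["orders"])) for b in batches]
    merged = True
    while merged:
        merged = False
        # Try the smallest batches first.
        for small in sorted(batches, key=lambda b: len(b["orders"])):
            hosts = [
                b for b in batches
                if b is not small
                and b["wave"] == small["wave"]  # never mix FedEx and UPS
                and len(b["orders"]) + len(small["orders"]) <= capacity
            ]
            if not hosts:
                continue
            # Assumption: "nearby" = closest farthest-pick sequence.
            host = min(hosts, key=lambda b: abs(b["max_seq"] - small["max_seq"]))
            host["orders"].extend(small["orders"])
            host["max_seq"] = max(host["max_seq"], small["max_seq"])
            batches = [b for b in batches if b is not small]
            merged = True
            break  # restart with the updated batch list
    return batches


# Invented sample batches.
sample = [
    {"wave": "FedEx", "orders": ["A", "B"], "max_seq": 8000},
    {"wave": "FedEx", "orders": ["C"], "max_seq": 7500},
    {"wave": "UPS", "orders": ["D"], "max_seq": 7600},
    {"wave": "UPS", "orders": ["E", "F"], "max_seq": 300},
]
for batch in consolidate(sample, capacity=3):
    print(batch)
```

In the sample, order D joins the UPS batch even though a FedEx batch is closer in pick sequence, because merges are restricted to the same wave.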
Section 06

Results

All numbers below are from real LifeSeasons data (~10 days, 930 orders). Batch capacity set to 15 orders per cart. Lower average route cost = better.

Average Route Cost Comparison
Bar chart: lower bar = less walking = better. Bars are scaled relative to the Random baseline (81,649); the value for each method appears in the table that follows.
Method  |  Avg Route Cost  |  vs Random  |  vs WMS  |  Status
Random (no system)  |  81,649  |  baseline  |  -  |  No optimization
WMS (current)  |  51,901  |  +36.4% better  |  baseline  |  In production
Jaccard Greedy  |  38,659  |  +52.7% better  |  +25.5% better  |  Tested
RAG (Routing-Aware Greedy)  |  38,659  |  +52.7% better  |  +25.5% better  |  Tested
Clarke-Wright Savings  |  22,407  |  +72.6% better  |  +56.8% better  |  Top Performer
CW + Consolidation  |  22,407  |  +72.6% better  |  +56.8% better  |  Top Performer
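The percentages in the table follow from the average route costs as (baseline − candidate) / baseline. The short check below reproduces them; the function name `improvement_pct` is an illustrative choice.

```python
def improvement_pct(baseline, candidate):
    """Percentage reduction in average route cost relative to a baseline."""
    return round(100 * (baseline - candidate) / baseline, 1)


RANDOM, WMS = 81_649, 51_901
costs = {"WMS": 51_901, "Jaccard / RAG": 38_659, "CW / CW + Consolidation": 22_407}

for name, cost in costs.items():
    print(name, improvement_pct(RANDOM, cost), improvement_pct(WMS, cost))
```

For example, Clarke-Wright against the WMS gives (51,901 − 22,407) / 51,901 ≈ 56.8%.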

Results by Delivery Type

Breaking down performance by wave shows where each algorithm excels:

FedEx (Wave 1 — Urgent)

  • Clarke-Wright / CW+Consol.: 20,082
  • Jaccard / RAG: 23,577
  • CW is 14.8% better than Jaccard / RAG on urgent orders

UPS Ground (Wave 2 — Normal)

  • Clarke-Wright / CW+Consol.: 23,936
  • Jaccard / RAG: 48,581
  • CW is 50.7% better than Jaccard / RAG on standard orders

Clarke-Wright is the clear winner under this cost model. It outperforms the current WMS by 56.8% on average, and its advantage over Jaccard / RAG is largest on UPS Ground orders, at just over 50%. The consolidation pass does not change the average route cost; it is intended to fill carts more fully and so reduce the total number of trips needed.

Important disclaimer on these numbers: The 56.8% improvement figure is based on our current simplified distance model (pick sequence, not real walking distance). Once we receive the actual warehouse floor plan and physical dimensions, these numbers will change — they could be higher or lower. Do not treat 56.8% as a guaranteed outcome. It is a strong directional signal that CW significantly outperforms the current WMS, but the exact magnitude will only be confirmed once real distances are plugged in.

Section 07

Limitations

The Distance Problem — Most Important Limitation

Route cost is currently calculated as 2 × max pick sequence. This assumes location 10,000 is exactly double the distance of location 5,000. That is not true in a real warehouse.

In reality, warehouses have aisles. Location 5,000 might be at the end of aisle 10, and location 10,000 at the end of aisle 20. A worker picking both could walk aisle 10 and aisle 20 in one continuous trip — not double the distance. Two locations could have very different pick sequence numbers but be physically close to each other. The pick sequence index doesn't capture cross-aisle travel, backtracking, or the actual geometry of the floor.

This means all four batching methods are currently making decisions, or being scored, on a simplified number that doesn't fully capture real walking distance. CW's savings calculation, RAG's marginal cost, and Jaccard's seed selection all depend on it, as does the route cost used to compare them. The rankings between algorithms are still meaningful, but the absolute cost numbers are approximations.

Why Not Manhattan Distance?

Manhattan distance is a well-known distance metric: the distance between two shelf locations is the horizontal steps plus the vertical steps on the floor plan, like navigating city blocks. It suits warehouse routing because workers move along aisles and cross-aisles, not in straight lines, and it was considered for this project.

However, Manhattan distance requires knowing the actual X-Y coordinates of every shelf location — how far apart the aisles are, how long each aisle is, where the depot/start point sits. Without those physical measurements, Manhattan distance cannot be calculated.

What we currently have is only the pick sequence number — which tells us the walking order of locations (1st, 2nd, 3rd...) but not the actual physical distances between them. Using Manhattan without real coordinates would mean guessing measurements, which would produce less reliable results than what we have now. Manhattan is the goal once the warehouse floor plan arrives.
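For reference, this is roughly what a Manhattan-based route length would look like once coordinates exist. The (x, y) values below are invented; no real coordinates are available yet, and the function names are illustrative. The sketch also ignores shelving that blocks direct movement between aisles.

```python
def manhattan(a, b):
    """Horizontal plus vertical distance between two (x, y) floor positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def tour_length(depot, stops):
    """Distance walked from the depot, through stops in order, and back."""
    path = [depot, *stops, depot]
    return sum(manhattan(p, q) for p, q in zip(path, path[1:]))


# Hypothetical coordinates in meters, for illustration only.
depot = (0, 0)
stops = [(0, 10), (5, 10)]
print(tour_length(depot, stops))  # 10 + 5 + 15 = 30
```

Unlike 2 × max(pick sequence), this cost changes with every stop added, which is why real coordinates would make the batching decisions more informative.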

Other Limitations

Cart capacity guessed at 15

Real capacity varies by cart type: Red 2×5 = 10 orders, Green 5×9 = 45 orders, Blue 4×7 = 28 orders. Using the wrong capacity changes how many orders can share a cart and directly affects batch groupings.
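The effect of cart type on batch count is easy to quantify. In the sketch below, the grid dimensions come from the figures above; reading them as a rows × columns layout of order slots is an assumption, as are the names `CART_GRID`, `cart_capacity`, and `min_batches`.

```python
import math

# Cart layouts as listed above; capacity = product of the two dimensions.
CART_GRID = {"Red": (2, 5), "Green": (5, 9), "Blue": (4, 7)}


def cart_capacity(cart_type: str) -> int:
    rows, cols = CART_GRID[cart_type]
    return rows * cols


def min_batches(n_orders: int, capacity: int) -> int:
    """Fewest batches that can hold n_orders at the given capacity."""
    return math.ceil(n_orders / capacity)


for cart in CART_GRID:
    cap = cart_capacity(cart)
    # 366 FedEx and 564 UPS orders are batched separately (wave splitting).
    print(cart, cap, min_batches(366, cap) + min_batches(564, cap))
```

At the assumed capacity of 15, the two waves need at least 25 + 38 = 63 batches, so a different real capacity would shift the groupings considerably.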

Only ~10 days of order data

More order history would reveal seasonal patterns and product velocity trends, and would give the algorithms more pairs and groups to optimize over. Results may change with more data, and they should become more representative.

Shipping type extracted from text

FedEx vs UPS classification relies on parsing free-text order descriptions. This could miss edge cases, abbreviations, or unusual formatting — causing an order to be misclassified into the wrong wave.

Section 08

What's Next / How to Fix

To move from approximate results to exact route optimization, we need additional information from LifeSeasons: the warehouse floor plan with physical dimensions, the real cart capacities by cart type, and more order history. With the floor plan, we can replace the pick-sequence proxy with Manhattan distances computed from real shelf coordinates, a widely used way to estimate warehouse walking distances. That should make every algorithm's cost estimate more realistic and let us confirm the true improvement over the WMS with measured distances instead of a proxy.

Until we have these inputs, all improvement figures remain directional estimates, not confirmed results.

With these inputs, we can build route cost calculations based on real distances. The CW algorithm's 56.8% improvement over WMS is a strong directional signal even with approximate distances. With real warehouse geometry, that number could move higher or lower, and we will be able to give LifeSeasons a dollar estimate of labor cost savings per day.

Summary: Current State vs Future State

Current State
  • Pick-sequence proxy for distance
  • ~10 days of historical data
  • Capacity = 15 (assumed)
  • CW: 56.8% better than WMS (proxy-based estimate)

After Data Collection
  • Exact aisle-level distances
  • More historical order data
  • Real cart capacities per type
  • Improvement vs WMS confirmed with real distances (could be higher or lower)