The Problem
At LifeSeasons, warehouse workers pick products for multiple customer orders at the same time. A group of orders assigned to one cart is called a batch. A worker loads a cart, walks from shelf to shelf picking items across multiple orders, then returns to the packing station. The route a worker takes — and how far they walk — depends entirely on which orders were grouped together.
The LifeSeasons Warehouse Management System (WMS) groups orders into batches automatically — but nobody knew if it was doing a good job. The goal of this project: find smarter ways to group orders so workers walk less, saving time and labor cost.
Every extra step a worker takes is time not spent packing. Across dozens of workers, hundreds of batches, and thousands of orders per week, even a 20% reduction in walking distance compounds into significant labor savings. The right batching algorithm treats this as a routing problem: group orders so the combined route is as short as possible.
The Data We Had
Data Sources
Order history: every pick event, with order number, batch assignment, shelf location, and units picked. This tells us exactly which shelf locations each order visited.
Shelf locations: each location has a name like 21-8-5-1 (Row 21, Bay 8, Level 5, Slot 1) and a pick sequence number (1 to 72,775). The pick sequence represents physical walking order: location 1,000 comes before location 2,000 on the walk path.
Batch assignments: which orders the WMS grouped together (e.g., BATCH6174, BATCH6180).
Shipping type: FedEx (urgent) vs UPS Ground (normal), extracted from order description text.
Warehouse Structure
Location names encode the physical layout: ROW - BAY - LEVEL - SLOT
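As a small illustration of the naming scheme, the sketch below splits a location name into its four parts. The `ShelfLocation` type and `parse_location` function are hypothetical names introduced here, and the code assumes every name has exactly four numeric, hyphen-separated fields like the 21-8-5-1 example.

```python
from typing import NamedTuple


class ShelfLocation(NamedTuple):
    """Hypothetical container for the ROW-BAY-LEVEL-SLOT parts of a location name."""
    row: int
    bay: int
    level: int
    slot: int


def parse_location(name: str) -> ShelfLocation:
    """Split a name such as '21-8-5-1' into its four numeric parts."""
    parts = name.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected ROW-BAY-LEVEL-SLOT, got {name!r}")
    row, bay, level, slot = (int(p) for p in parts)
    return ShelfLocation(row, bay, level, slot)


loc = parse_location("21-8-5-1")
print(loc.row, loc.bay, loc.level, loc.slot)  # 21 8 5 1
```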
The inventory data includes a zone field that LifeSeasons' own WMS assigns to every shelf location. Two zones appear in the data: Speed Cell and BackStock. These are LifeSeasons' own terms — not labels we assigned.
~10 days of order data were used in this analysis. The shipping type (FedEx vs UPS) was parsed from free-text order descriptions. 366 FedEx orders and 564 UPS orders were identified in the usable dataset.
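The exact parsing rules used in the study are not documented here, so the following is only a sketch of how a free-text classifier of this kind might look. The sample descriptions, the regular expressions, and the "Unknown" fallback are all assumptions made for illustration.

```python
import re

# Assumed patterns; real order descriptions may need more variants.
FEDEX_PATTERN = re.compile(r"\bfed\s*ex\b", re.IGNORECASE)
UPS_PATTERN = re.compile(r"\bups\b", re.IGNORECASE)


def classify_shipping(description: str) -> str:
    """Return 'FedEx', 'UPS Ground', or 'Unknown' for a free-text description."""
    if FEDEX_PATTERN.search(description):
        return "FedEx"
    if UPS_PATTERN.search(description):
        return "UPS Ground"
    return "Unknown"  # surfaced explicitly so it can be reviewed by hand


for text in ["Ship via FedEx 2Day", "UPS Ground - residential", "Customer pickup"]:
    print(text, "->", classify_shipping(text))
```

Returning an explicit "Unknown" instead of a default carrier keeps unparseable descriptions visible rather than silently placing them in a wave.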
How We Measure Route Cost
To compare algorithms, we need a single number that represents "how much walking does this batch require?" We define route cost as:
Route cost = 2 × (highest pick sequence number in the batch)
For example, if a batch has pick locations at sequences 1,200 — 4,500 — 8,000, the route cost is 2 × 8,000 = 16,000. The worker must travel to position 8,000 and back, regardless of how many stops are in between.
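The worked example above can be reproduced in a few lines. This is a minimal sketch; the function name `route_cost` and the choice to return 0 for an empty batch are assumptions.

```python
def route_cost(pick_sequences):
    """Route cost of a batch: out to the farthest pick sequence and back."""
    if not pick_sequences:
        return 0  # assumption: an empty batch requires no walking
    return 2 * max(pick_sequences)


print(route_cost([1200, 4500, 8000]))  # 16000
```

Adding a stop at sequence 6,000 to this batch leaves the cost unchanged, which is why the number of intermediate stops does not matter under this model.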
Important caveat: This formula is a simplification. Pick sequence is an index — not actual meters. The real impact of this limitation is covered in detail in the Limitations section.
Priority / Delivery Type Handling
Not all orders are equal. FedEx orders are urgent — they must ship out fast. UPS Ground orders have more flexibility. Every algorithm in this study uses wave splitting to handle this.
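Wave-splitting flow: shipping type is parsed from the order description; Wave 1 (FedEx) is batched first, Wave 2 (UPS Ground) is batched second, and Wave 1 is handled before Wave 2 starts.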
Zero mixing — no urgent FedEx order ever shares a cart with a normal UPS order. Wave splitting is identical across all algorithms. It is a pre-processing step that runs before any batching algorithm begins.
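A minimal sketch of wave splitting as a pre-processing step is shown below. The order records, the `shipping_type` field name, and the toy `chunk` batcher are hypothetical; any batching algorithm could be passed in its place.

```python
def split_into_waves(orders):
    """Return (wave1, wave2): FedEx orders first, UPS Ground orders second."""
    wave1 = [o for o in orders if o["shipping_type"] == "FedEx"]
    wave2 = [o for o in orders if o["shipping_type"] == "UPS Ground"]
    return wave1, wave2


def batch_by_wave(orders, batch_fn):
    """Run the same batching function on each wave separately, Wave 1 first."""
    wave1, wave2 = split_into_waves(orders)
    return batch_fn(wave1) + batch_fn(wave2)


def chunk(orders, size=2):
    """Toy stand-in for a batching algorithm: fixed-size groups in input order."""
    return [orders[i:i + size] for i in range(0, len(orders), size)]


orders = [
    {"id": 1, "shipping_type": "FedEx"},
    {"id": 2, "shipping_type": "UPS Ground"},
    {"id": 3, "shipping_type": "FedEx"},
    {"id": 4, "shipping_type": "UPS Ground"},
    {"id": 5, "shipping_type": "FedEx"},
]
batches = batch_by_wave(orders, chunk)
print([[o["id"] for o in b] for b in batches])  # [[1, 3], [5], [2, 4]]
```

Because the batcher only ever sees one wave at a time, no batch can contain both shipping types.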
The 4 Methods
We tested three purpose-built algorithms, plus a consolidation variant of Clarke-Wright, against the existing WMS and a random baseline. All of them run on top of the wave splitting described above.
Jaccard Greedy
Groups orders that share the same shelf locations. Simple and fast — completely blind to actual walking distance.
- Seed = order with the highest pick sequence
- Fill batch by Jaccard similarity score (shared locations / total distinct locations across both orders)
- Keeps filling until batch is full or no similar orders remain
- Analogy: grouping people going to similar neighborhoods
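The steps above can be sketched as follows. This is an illustrative version, not the study's code: it assumes each order is a non-empty set of pick sequence numbers and that similarity is measured against the locations already in the batch (the source does not say whether the seed alone or the whole batch is used).

```python
def jaccard(a, b):
    """Shared locations divided by total distinct locations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_greedy(orders, capacity):
    """orders: dict of order id -> non-empty set of pick sequence numbers."""
    remaining = dict(orders)
    batches = []
    while remaining:
        # Seed with the order that reaches the highest pick sequence.
        seed = max(remaining, key=lambda oid: max(remaining[oid]))
        batch = [seed]
        batch_locs = set(remaining.pop(seed))
        while remaining and len(batch) < capacity:
            best = max(remaining, key=lambda oid: jaccard(batch_locs, remaining[oid]))
            if jaccard(batch_locs, remaining[best]) == 0:
                break  # no similar orders remain
            batch.append(best)
            batch_locs |= remaining.pop(best)
        batches.append(batch)
    return batches


orders = {"A": {100, 200}, "B": {100, 200, 300}, "C": {9000}, "D": {9000, 9500}}
print(jaccard_greedy(orders, capacity=2))  # [['D', 'C'], ['B', 'A']]
```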
Clarke-Wright Savings (CW)
Calculates savings for every possible pair of orders. Never merges orders that increase total walking.
- Savings = cost(A alone) + cost(B alone) − cost(A+B together)
- All pairs ranked by savings, highest first
- Greedily merges the highest-savings pairs first
- Analogy: finding which two routes overlap most and combining them
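A compact sketch of savings-based merging is given below, using the 2 × max pick sequence cost model. It is a simplified illustration under assumptions: savings are computed once from the individual orders, a merge is skipped when the combined batch would exceed capacity, and names such as `clarke_wright` are hypothetical.

```python
from itertools import combinations


def route_cost(locs):
    return 2 * max(locs) if locs else 0


def clarke_wright(orders, capacity):
    """orders: dict of order id -> set of pick sequence numbers."""
    batch_of = {oid: [oid] for oid in orders}  # each order starts in its own batch
    savings = []
    for a, b in combinations(orders, 2):
        s = route_cost(orders[a]) + route_cost(orders[b]) - route_cost(orders[a] | orders[b])
        savings.append((s, a, b))
    savings.sort(key=lambda t: t[0], reverse=True)  # highest savings first
    for s, a, b in savings:
        ba, bb = batch_of[a], batch_of[b]
        if s <= 0 or ba is bb or len(ba) + len(bb) > capacity:
            continue  # never merge without a saving or beyond cart capacity
        ba.extend(bb)
        for oid in bb:
            batch_of[oid] = ba
    unique, seen = [], set()
    for batch in batch_of.values():
        if id(batch) not in seen:
            seen.add(id(batch))
            unique.append(batch)
    return unique


orders = {"A": {1200, 4500}, "B": {8000}, "C": {300, 7900}, "D": {500}}
batches = clarke_wright(orders, capacity=2)
total = sum(route_cost(set().union(*(orders[o] for o in b))) for b in batches)
print(batches, total)  # [['A', 'D'], ['B', 'C']] 25000
```

In this toy example the four orders cost 41,800 when picked separately and 25,000 after merging, because the two far-reaching orders (B and C) share one long trip.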
RAG (Routing-Aware Greedy)
Seed-based. Picks one order, then keeps adding the order with minimum extra walking. Industry standard — used by Zalando.
- Seed one order, then find neighbor with lowest marginal cost
- Two-stage: Jaccard screen (top 15 candidates) then pick by min marginal cost
- Called "DGA" (Distance-Greedy Algorithm) at Zalando
- Analogy: a GPS that picks the next stop adding least distance
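The two-stage selection can be sketched as below. This is an illustration under stated assumptions rather than the study's or Zalando's implementation: the seed is simply the first remaining order, the screen keeps the 15 candidates with the highest Jaccard similarity to the batch, and marginal cost uses the 2 × max pick sequence model.

```python
def route_cost(locs):
    return 2 * max(locs) if locs else 0


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rag(orders, capacity, screen_size=15):
    """orders: dict of order id -> set of pick sequence numbers."""
    remaining = dict(orders)
    batches = []
    while remaining:
        seed = next(iter(remaining))  # assumption: seed is the first remaining order
        batch, locs = [seed], set(remaining.pop(seed))
        while remaining and len(batch) < capacity:
            # Stage 1: Jaccard screen keeps the most similar candidates.
            shortlist = sorted(
                remaining, key=lambda oid: jaccard(locs, remaining[oid]), reverse=True
            )[:screen_size]
            # Stage 2: pick the candidate that adds the least route cost.
            best = min(
                shortlist,
                key=lambda oid: route_cost(locs | remaining[oid]) - route_cost(locs),
            )
            batch.append(best)
            locs |= remaining.pop(best)
        batches.append(batch)
    return batches


orders = {"A": {3000}, "B": {9000}, "C": {100, 3200}, "D": {8800}}
print(rag(orders, capacity=2))  # [['A', 'C'], ['B', 'D']]
```

One property of this cost model is worth noting: if the seed were the order with the highest pick sequence, every candidate would have zero marginal cost, and the choice would fall back to the Jaccard ordering.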
CW + Consolidation
Clarke-Wright plus a second pass that combines small batches to ensure fuller carts.
- Runs full CW first
- Then iterates: merge smallest batches into nearby batches
- Only merges same priority type (no FedEx + UPS mixing)
- Ensures cart capacity is used as efficiently as possible
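The consolidation pass might look like the sketch below. It assumes "nearby" means the target batch whose route cost grows least after the merge, and it is meant to be called once per wave so that FedEx and UPS Ground batches never mix; both choices are assumptions, as is the function name `consolidate`.

```python
def route_cost(locs):
    return 2 * max(locs) if locs else 0


def consolidate(batches, orders, capacity):
    """Merge small batches into other batches of the same wave when they fit."""
    batches = [list(b) for b in batches]

    def locs(batch):
        return set().union(*(orders[o] for o in batch))

    merged = True
    while merged and len(batches) > 1:
        merged = False
        batches.sort(key=len)  # try the smallest batches first
        for i, small in enumerate(batches):
            targets = [
                j for j, b in enumerate(batches)
                if j != i and len(b) + len(small) <= capacity
            ]
            if not targets:
                continue
            # Pick the target whose route cost increases least.
            j = min(
                targets,
                key=lambda k: route_cost(locs(batches[k]) | locs(small))
                - route_cost(locs(batches[k])),
            )
            batches[j].extend(small)
            del batches[i]
            merged = True
            break
    return batches


orders = {"A": {8000}, "B": {7500}, "C": {900}, "D": {1000}, "E": {7800}}
start = [["A", "B"], ["C"], ["D"], ["E"]]
print(consolidate(start, orders, capacity=3))  # [['D', 'C'], ['A', 'B', 'E']]
```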
Results
All numbers below are from real LifeSeasons data (~10 days, 930 orders). Batch capacity set to 15 orders per cart. Lower average route cost = better.
| Method | Avg Route Cost | vs Random | vs WMS | Status |
|---|---|---|---|---|
| Random (no system) | 81,649 | baseline | — | No optimization |
| WMS (current) | 51,901 | +36.4% better | baseline | In production |
| Jaccard Greedy | 38,659 | +52.7% better | +25.5% better | Tested |
| RAG (Routing-Aware Greedy) | 38,659 | +52.7% better | +25.5% better | Tested |
| Clarke-Wright Savings | 22,407 | +72.6% better | +56.8% better | Top Performer |
| CW + Consolidation | 22,407 | +72.6% better | +56.8% better | Top Performer |
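The percentage columns follow directly from the average route costs. The short check below recomputes them as (baseline − cost) / baseline; the dictionary keys are shortened labels chosen for this sketch.

```python
avg_cost = {
    "Random": 81649,
    "WMS": 51901,
    "Jaccard Greedy": 38659,
    "RAG": 38659,
    "Clarke-Wright": 22407,
    "CW + Consolidation": 22407,
}


def improvement(baseline, cost):
    """Percent reduction in average route cost relative to a baseline."""
    return round(100 * (baseline - cost) / baseline, 1)


for method, cost in avg_cost.items():
    vs_random = improvement(avg_cost["Random"], cost)
    vs_wms = improvement(avg_cost["WMS"], cost)
    print(f"{method}: {vs_random}% vs Random, {vs_wms}% vs WMS")
```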
Results by Delivery Type
Breaking down performance by wave shows where each algorithm excels:
Panels: FedEx (Wave 1, Urgent) and UPS Ground (Wave 2, Normal)
Clarke-Wright is the clear winner. Its average route cost is 56.8% lower than the current WMS, and the gain is especially large on UPS Ground orders (over 50% improvement). The consolidation pass doesn't change the average cost but ensures carts are fuller, reducing the total number of trips needed.
Important disclaimer on these numbers: The 56.8% improvement figure is based on our current simplified distance model (pick sequence, not real walking distance). Once we receive the actual warehouse floor plan and physical dimensions, these numbers will change — they could be higher or lower. Do not treat 56.8% as a guaranteed outcome. It is a strong directional signal that CW significantly outperforms the current WMS, but the exact magnitude will only be confirmed once real distances are plugged in.
Limitations
Route cost is computed from the pick sequence index, not from measured distances. This means all three algorithms are currently making decisions based on a simplified number that doesn't fully capture real walking distance. CW's savings calculation, RAG's marginal cost, and Jaccard's grouping are all affected. The rankings between algorithms are still meaningful, but the absolute cost numbers are approximations.
Why Not Manhattan Distance?
Manhattan distance is a well-known distance metric that measures the walking distance between two shelf locations as horizontal steps plus vertical steps, like navigating city blocks. It is well suited to warehouse routing and was considered for this project.
However, Manhattan distance requires knowing the actual X-Y coordinates of every shelf location — how far apart the aisles are, how long each aisle is, where the depot/start point sits. Without those physical measurements, Manhattan distance cannot be calculated.
What we currently have is only the pick sequence number — which tells us the walking order of locations (1st, 2nd, 3rd...) but not the actual physical distances between them. Using Manhattan without real coordinates would mean guessing measurements, which would produce less reliable results than what we have now. Manhattan is the goal once the warehouse floor plan arrives.
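For reference, this is roughly what the calculation would look like once coordinates exist. The depot position and shelf coordinates below are invented purely for illustration; no real LifeSeasons measurements are used, and plain Manhattan distance still ignores where aisles can actually be crossed.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def route_length(depot, stops):
    """Walk from the depot through the stops in the given order and back."""
    path = [depot, *stops, depot]
    return sum(manhattan(p, q) for p, q in zip(path, path[1:]))


depot = (0, 0)                    # hypothetical start/end point
stops = [(0, 5), (2, 5), (2, 1)]  # hypothetical shelf coordinates, in pick order
print(route_length(depot, stops))  # 14
```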
Other Limitations
Cart capacity: the analysis assumed a fixed capacity of 15 orders per cart, but real capacity varies by cart type: Red 2×5 = 10 orders, Green 5×9 = 45 orders, Blue 4×7 = 28 orders. Using the wrong capacity changes how many orders can share a cart and directly affects batch groupings.
Data volume: only ~10 days of order data were used. More order history would reveal seasonal patterns and product velocity trends, and would give the algorithms more pairs and groups to optimize. Results may improve with more data.
Shipping type parsing: FedEx vs UPS classification relies on parsing free-text order descriptions. This could miss edge cases, abbreviations, or unusual formatting, causing an order to be misclassified into the wrong wave.
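To give a feel for how much the cart capacity assumption matters, the arithmetic below computes the minimum number of cart loads needed for the 366 FedEx and 564 UPS Ground orders at each capacity. This is only a lower bound that assumes every cart is filled and the whole dataset is batched wave by wave; it is not a result from the study.

```python
import math

WAVE_SIZES = {"FedEx": 366, "UPS Ground": 564}
CAPACITIES = {"Red 2x5": 10, "Study setting": 15, "Blue 4x7": 28, "Green 5x9": 45}


def min_cart_loads(capacity):
    """Fewest cart loads if each wave is packed into completely full carts."""
    return sum(math.ceil(n / capacity) for n in WAVE_SIZES.values())


for name, capacity in CAPACITIES.items():
    print(f"{name} ({capacity} orders): at least {min_cart_loads(capacity)} cart loads")
```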
What's Next / How to Fix
To move from approximate results to exact route optimization, we need the information below from LifeSeasons. With it, we can replace the pick-sequence proxy with Manhattan distances computed from real coordinates, which should make every algorithm's cost estimates more accurate and let us confirm the true improvement over the WMS. Until we have these inputs, all improvement figures remain directional estimates, not confirmed results.
1. Warehouse Dimensions
- How long is one aisle?
- How wide is the gap between two aisles?
- Where is the start/end point for pickers?
2. Carts
- Which cart types do pickers actually use day to day?
- How is it decided which cart a picker uses for a given batch?
- Is it one order per tote/bin, or can one tote hold multiple orders?
3. Current Batching Logic
- How does the WMS currently decide which orders go into which batch?
4. Warehouse Map
- Can you share a floor plan or map of the warehouse?
5. Warehouse Zones: Speed Cell vs BackStock
- Your inventory data labels every shelf as either Speed Cell (rows 1–3) or BackStock (rows 10–21). Can you confirm what each zone means operationally (e.g., are Speed Cell items picked more frequently because they are top-sellers)?
- Rows 4–9 have no inventory entries in the data provided. What are those rows used for: staging, packing, office space, or something else?
With these inputs, we can build exact route cost calculations. The CW algorithm's 56.8% improvement over WMS is a strong signal even with approximate distances. With real warehouse geometry, that number may move up or down, but we will be able to give LifeSeasons a confirmed figure and a dollar estimate of labor cost savings per day.