The Problem
At LifeSeasons, warehouse workers pick products for multiple customer orders at the same time. A group of orders assigned to one cart is called a batch. A worker loads a cart, walks from shelf to shelf picking items across multiple orders, then returns to the packing station. The route a worker takes — and how far they walk — depends entirely on which orders were grouped together.
The LifeSeasons Warehouse Management System (WMS) groups orders into batches automatically — but nobody knew if it was doing a good job. The goal of this project: find smarter ways to group orders so workers walk less, saving time and labor cost.
Every extra step a worker takes is time not spent packing. Across dozens of workers, hundreds of batches, and thousands of orders per week, even a 20% reduction in walking distance compounds into significant labor savings. The right batching algorithm treats this as a routing problem: group orders so the combined route is as short as possible.
The Data We Had
Data Sources
Order history: every pick event, with order number, batch assignment, shelf location, and units picked. This tells us exactly which shelf locations each order visited.
Shelf locations: each location has a name like 21-8-5-1 (Row 21, Bay 8, Level 5, Slot 1) and a pick sequence number (1 to 72,775). The pick sequence represents physical walking order, so location 1,000 comes before location 2,000 on the walk path.
Batch assignments: which orders the WMS grouped together (e.g., BATCH6174, BATCH6180).
Shipping type: FedEx (urgent) vs UPS Ground (normal), extracted from order description text.
Warehouse Structure
Location names encode the physical layout: ROW - BAY - LEVEL - SLOT
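As a minimal sketch of how a location name could be decoded, assuming every name follows the four-part ROW-BAY-LEVEL-SLOT pattern with integer parts (the `ShelfLocation` type and `parse_location` function are hypothetical names, not part of the WMS):

```python
from typing import NamedTuple


class ShelfLocation(NamedTuple):
    row: int
    bay: int
    level: int
    slot: int


def parse_location(name: str) -> ShelfLocation:
    """Split a 'ROW-BAY-LEVEL-SLOT' name into its four integer parts."""
    parts = name.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected ROW-BAY-LEVEL-SLOT, got {name!r}")
    row, bay, level, slot = (int(p) for p in parts)
    return ShelfLocation(row, bay, level, slot)


loc = parse_location("21-8-5-1")
print(loc.row, loc.bay, loc.level, loc.slot)
```

Names that do not match the pattern raise an error rather than being silently guessed at.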
The inventory data includes a zone field that LifeSeasons' own WMS assigns to every shelf location. Two zones appear in the data: Speed Cell and BackStock. These are LifeSeasons' own terms — not labels we assigned.
~10 days of order data were used in this analysis. Priority Wave 1 includes DTC orders (identified by hyphenated order numbers, e.g. 4046858-3990825) and FedEx orders — 570 priority orders and 360 standard orders were identified in the usable dataset.
Brand Breakdown
Item numbers in the inventory data carry brand prefixes that let us identify brand directly — no guessing required. The algorithms now enforce that orders from different brands are never grouped into the same batch.
HC: e.g. HC006090. Grouped together, never mixed with NGVC.
PL-NGVC: Natural Grocers Vitamin Cottage products, stored on Speed Cell 3.
How We Measure Route Cost
To compare algorithms, we need a single number that represents "how much walking does this batch require?" We define route cost as:
Route cost = 2 × (highest pick sequence number in the batch)
For example, if a batch has pick locations at sequences 1,200 — 4,500 — 8,000, the route cost is 2 × 8,000 = 16,000. The worker must travel to position 8,000 and back, regardless of how many stops are in between.
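The worked example above can be written as a short function. This is a sketch of the simplified cost model only; the name `route_cost` is ours, and the input is assumed to be the pick sequence numbers of every location in the batch.

```python
def route_cost(pick_sequences):
    """Simplified route cost: out to the farthest pick position and back."""
    if not pick_sequences:
        return 0  # an empty batch requires no walking
    return 2 * max(pick_sequences)


print(route_cost([1200, 4500, 8000]))  # 16000
```

Intermediate stops do not change the result, which is why only the maximum matters.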
Important caveat: This formula is a simplification. Pick sequence is an index — not actual meters. The real impact of this limitation is covered in detail in the Limitations section.
Priority / Delivery Type Handling
Not all orders are equal. FedEx orders are urgent — they must ship out fast. UPS Ground orders have more flexibility. Every algorithm in this study uses wave splitting to handle this.
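Wave splitting flow: shipping type is read from the order description; Wave 1 (urgent) is batched first; Wave 2 (normal) is batched second; Wave 1 is completed before Wave 2 starts.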
Zero mixing — no urgent FedEx order ever shares a cart with a normal UPS order. Wave splitting is identical across all algorithms. It is a pre-processing step that runs before any batching algorithm begins.
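A minimal sketch of the wave split, under two assumptions that still need confirmation: DTC orders are recognized by a hyphen in the order number, and FedEx orders by a case-insensitive "fedex" substring in the description. The function names, the order ID "0350217", and the description strings are illustrative only.

```python
def is_dtc(order_number: str) -> bool:
    """Assumption: DTC orders carry hyphenated order numbers."""
    return "-" in order_number


def assign_wave(order_number: str, description: str) -> int:
    """Return 1 for priority (DTC or FedEx), 2 for standard."""
    # Assumption: the free-text description mentions "FedEx" for urgent orders.
    is_fedex = "fedex" in description.lower()
    return 1 if (is_dtc(order_number) or is_fedex) else 2


def split_waves(orders):
    """Split (order_number, description) pairs into Wave 1 and Wave 2 lists."""
    wave1, wave2 = [], []
    for order_number, description in orders:
        target = wave1 if assign_wave(order_number, description) == 1 else wave2
        target.append(order_number)
    return wave1, wave2


sample = [
    ("4046858-3990825", "UPS Ground"),  # hyphenated, treated as DTC
    ("0350216", "FedEx 2Day"),          # hypothetical description text
    ("0350217", "UPS Ground"),          # hypothetical order number
]
wave1, wave2 = split_waves(sample)
print(wave1, wave2)
```

Each wave is then handed to the batching algorithm separately, so the two lists never share a cart.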
The 4 Methods
We tested three purpose-built algorithms, plus a consolidation variant of Clarke-Wright, against the existing WMS and a random baseline. All of them run on top of the wave splitting described above.
Jaccard Greedy
Groups orders that share the same shelf locations. Simple and fast — completely blind to actual walking distance.
- Seed = order with the highest pick sequence
- Fill batch by Jaccard similarity score (shared locations / total locations)
- Keeps filling until batch is full or no similar orders remain
- Analogy: grouping people going to similar neighborhoods
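The steps above can be sketched as follows. This is an illustration, not the project's implementation: each order is assumed to be a non-empty set of pick sequence numbers, similarity is measured against the batch's combined locations (the source does not say whether the seed or the whole batch is used), and `capacity` is the number of orders per cart.

```python
def jaccard(a: set, b: set) -> float:
    """Shared locations divided by total distinct locations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_greedy(orders: dict, capacity: int) -> list:
    remaining = dict(orders)
    batches = []
    while remaining:
        # Seed: the order that reaches the highest pick sequence.
        seed = max(remaining, key=lambda o: max(remaining[o]))
        batch = [seed]
        batch_locs = set(remaining.pop(seed))
        while remaining and len(batch) < capacity:
            best = max(remaining, key=lambda o: jaccard(batch_locs, remaining[o]))
            if jaccard(batch_locs, remaining[best]) == 0:
                break  # no similar orders remain
            batch.append(best)
            batch_locs |= remaining.pop(best)
        batches.append(batch)
    return batches


orders = {"A": {100, 200}, "B": {200, 300}, "C": {9000}, "D": {9000, 9100}}
print(jaccard_greedy(orders, capacity=2))
```

Note that walking distance never enters the fill step; only location overlap does.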
Clarke-Wright Savings (CW)
Calculates savings for every possible pair of orders. Never merges orders that increase total walking.
- Savings = cost(A alone) + cost(B alone) − cost(A+B together)
- All pairs ranked by savings, highest first
- Greedily merges the pairs with the greatest savings, as long as the combined batch fits on the cart
- Analogy: finding which two routes overlap most and combining them
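A compact sketch of the savings idea, using the simplified route cost (twice the highest pick sequence). It assumes orders are non-empty sets of pick sequence numbers and computes pairwise savings once, without recomputing them after each merge; the actual implementation may differ. All names are illustrative.

```python
from itertools import combinations


def route_cost(locs):
    return 2 * max(locs)


def clarke_wright(orders: dict, capacity: int) -> list:
    # Start with every order alone in its own batch.
    batch_of = {o: i for i, o in enumerate(orders)}
    batches = {i: [o] for o, i in batch_of.items()}

    # Savings = cost(A alone) + cost(B alone) - cost(A+B together).
    savings = []
    for a, b in combinations(orders, 2):
        s = route_cost(orders[a]) + route_cost(orders[b]) - route_cost(orders[a] | orders[b])
        savings.append((s, a, b))
    savings.sort(key=lambda t: t[0], reverse=True)  # highest savings first

    for s, a, b in savings:
        ia, ib = batch_of[a], batch_of[b]
        if ia == ib or s <= 0:
            continue  # already together, or merging would not help
        if len(batches[ia]) + len(batches[ib]) > capacity:
            continue  # combined batch would not fit on the cart
        for o in batches[ib]:
            batch_of[o] = ia
        batches[ia].extend(batches.pop(ib))
    return list(batches.values())


orders = {"A": {1200, 4500}, "B": {8000}, "C": {7900, 300}, "D": {1000}}
result = clarke_wright(orders, capacity=2)
print(result)
```

In this toy input, B and C both travel deep into the walk path, so pairing them saves the most; A and D are paired afterwards.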
RAG (Routing-Aware Greedy)
Seed-based. Picks one order, then keeps adding the order with minimum extra walking. Industry standard — used by Zalando.
- Seed one order, then find neighbor with lowest marginal cost
- Two-stage: Jaccard screen (top 15 candidates) then pick by min marginal cost
- Called "DGA" (Distance-Greedy Algorithm) at Zalando
- Analogy: a GPS that picks the next stop adding least distance
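The two-stage selection can be sketched as below. Assumptions: orders are non-empty sets of pick sequence numbers, route cost is the simplified 2 × max model, and the seed is simply the first remaining order (the source does not specify how the seed is chosen). Function and parameter names are hypothetical.

```python
def route_cost(locs):
    return 2 * max(locs)


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rag_batches(orders: dict, capacity: int, screen_size: int = 15) -> list:
    remaining = dict(orders)
    batches = []
    while remaining:
        seed = next(iter(remaining))  # assumption: first remaining order
        batch = [seed]
        batch_locs = set(remaining.pop(seed))
        while remaining and len(batch) < capacity:
            # Stage 1: keep the candidates most similar to the batch so far.
            candidates = sorted(
                remaining,
                key=lambda o: jaccard(batch_locs, remaining[o]),
                reverse=True,
            )[:screen_size]
            # Stage 2: among them, add the order with the least extra walking.
            base = route_cost(batch_locs)
            best = min(candidates, key=lambda o: route_cost(batch_locs | remaining[o]) - base)
            batch.append(best)
            batch_locs |= remaining.pop(best)
        batches.append(batch)
    return batches


orders = {"S": {5000, 1200}, "X": {9000, 1200, 5000}, "Y": {4000, 1200}, "Z": {300}}
print(rag_batches(orders, capacity=2))
```

Here X is the most similar to the seed S but would extend the route to 9,000, so the marginal-cost step prefers Y, which adds no extra distance.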
CW + Consolidation
Clarke-Wright plus a second pass that combines small batches to ensure fuller carts.
- Runs full CW first
- Then iterates: merge smallest batches into nearby batches
- Only merges same priority type (no FedEx + UPS mixing)
- Ensures cart capacity is used as efficiently as possible
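A rough sketch of what such a consolidation pass might look like. The source does not define "nearby", so this version assumes it means the smallest gap between the two batches' highest pick sequences; the batch structure (a dict with `orders`, `locs`, and `wave` keys) is also an assumption made for illustration.

```python
def consolidate(batches: list, capacity: int) -> list:
    """Merge the smallest batches into the nearest same-wave batch with room."""
    # Work on copies so the caller's batches are left untouched.
    work = [
        {"orders": list(b["orders"]), "locs": set(b["locs"]), "wave": b["wave"]}
        for b in batches
    ]
    merged = True
    while merged:
        merged = False
        for small in sorted(work, key=lambda b: len(b["orders"])):
            targets = [
                t for t in work
                if t is not small
                and t["wave"] == small["wave"]  # never mix priority types
                and len(t["orders"]) + len(small["orders"]) <= capacity
            ]
            if not targets:
                continue
            far = max(small["locs"])
            # Assumption: "nearby" = closest highest pick sequence.
            target = min(targets, key=lambda t: abs(max(t["locs"]) - far))
            target["orders"].extend(small["orders"])
            target["locs"] |= small["locs"]
            work = [b for b in work if b is not small]
            merged = True
            break
    return work


cw_batches = [
    {"orders": ["A", "B"], "locs": {8000, 7900}, "wave": 1},
    {"orders": ["C"], "locs": {7000}, "wave": 1},
    {"orders": ["D"], "locs": {500}, "wave": 2},
    {"orders": ["E", "F"], "locs": {900, 300}, "wave": 2},
]
out = consolidate(cw_batches, capacity=3)
print([b["orders"] for b in out])
```

The pass stops once no batch can be merged without exceeding capacity or mixing waves.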
Results
All numbers below are from real LifeSeasons data (~10 days, 930 orders). Each order is assigned to a cart type using the rules described in the LifeSeasons operations meeting, and batched using that cart's confirmed or estimated capacity. Brand grouping is enforced (HC with HC, NGVC with NGVC). DTC orders (hyphenated order numbers) are priority Wave 1. Lower average route cost = better.
| Method | Avg Route Cost | vs Random | vs WMS | Status |
|---|---|---|---|---|
| Random (no system) | 53,563 | baseline | — | No optimization |
| WMS (current) | 51,901 | +3.1% better | baseline | In production |
| Jaccard Greedy | 39,602 | +26.1% better | +23.7% better | Tested |
| RAG (Routing-Aware Greedy) | 39,602 | +26.1% better | +23.7% better | Tested |
| Clarke-Wright Savings | 30,784 | +42.5% better | +40.7% better | Top Performer |
| CW + Consolidation | 32,855 | +38.7% better | +36.7% better | Tested |
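The percentage columns follow directly from the average route costs. A short check, using only the figures in the table (the `improvement` helper is our own naming):

```python
avg_cost = {
    "Random": 53563,
    "WMS": 51901,
    "Jaccard Greedy": 39602,
    "RAG": 39602,
    "Clarke-Wright": 30784,
    "CW + Consolidation": 32855,
}


def improvement(baseline: float, candidate: float) -> float:
    """Percent reduction in average route cost relative to a baseline."""
    return 100 * (baseline - candidate) / baseline


for name, cost in avg_cost.items():
    vs_random = improvement(avg_cost["Random"], cost)
    vs_wms = improvement(avg_cost["WMS"], cost)
    print(f"{name:20s} vs Random {vs_random:5.1f}%   vs WMS {vs_wms:5.1f}%")
```

For Clarke-Wright against the WMS this gives (51,901 − 30,784) / 51,901 ≈ 40.7%.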
Results by Delivery Type
Breaking down performance by wave shows where each algorithm excels:
[Figure panels: Priority / DTC (Wave 1, Urgent) and Standard (Wave 2, Normal)]
Clarke-Wright is the clear winner. It outperforms the current WMS by 40.7% on average, with the biggest gains on standard (Wave 2) orders — nearly 29% better than Jaccard/RAG. These numbers now use per-cart-type capacities (blue: 28, red: 10, green/pack-one: ~20 est.) rather than a single flat capacity, making them significantly more realistic.
Note on CW + Consolidation: With the smaller per-cart capacities now applied, the consolidation pass actually performs worse than CW alone (32,855 vs 30,784). This happens because the pass merges small batches from different shelf areas, increasing travel distance. CW alone is the recommended approach.
Important disclaimer: The 40.7% improvement figure uses a simplified distance model (pick sequence, not real walking distance). Once we receive the actual warehouse floor plan, these numbers will be refined — the relative ranking of algorithms is reliable, but the exact magnitude will be confirmed with real geometry.
Limitations
Pick sequence is an index, not a measured distance. This means all three algorithms are currently making decisions based on a simplified number that doesn't fully capture real walking distance. CW's savings calculation, RAG's marginal cost, and Jaccard's grouping are all affected. The rankings between algorithms are still meaningful, but the absolute cost numbers are approximations.
Why Not Manhattan Distance?
Manhattan distance is a well-known distance metric that measures the walking distance between two shelf locations as horizontal steps plus vertical steps, like navigating city blocks. It is well suited to warehouse routing and was considered for this project.
However, Manhattan distance requires knowing the actual X-Y coordinates of every shelf location — how far apart the aisles are, how long each aisle is, where the depot/start point sits. Without those physical measurements, Manhattan distance cannot be calculated.
What we currently have is only the pick sequence number — which tells us the walking order of locations (1st, 2nd, 3rd...) but not the actual physical distances between them. Using Manhattan without real coordinates would mean guessing measurements, which would produce less reliable results than what we have now. Manhattan is the goal once the warehouse floor plan arrives.
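For reference, the calculation itself is simple once coordinates exist. The sketch below is illustrative only: `location_xy` and its spacing parameters are hypothetical placeholders, not measurements of the LifeSeasons warehouse, and the mapping from row and bay to X and Y is an assumption about the layout.

```python
def manhattan(a, b):
    """Horizontal steps plus vertical steps between two (x, y) points."""
    (x1, y1), (x2, y2) = a, b
    return abs(x1 - x2) + abs(y1 - y2)


def location_xy(row, bay, aisle_spacing=3, bay_length=1):
    """Hypothetical mapping from row/bay to floor coordinates.

    The spacing values are placeholders; real ones must come from the floor plan.
    """
    return (row * aisle_spacing, bay * bay_length)


depot = (0, 0)
print(manhattan(depot, (3, 4)))                               # 7
print(manhattan(location_xy(21, 8), location_xy(1, 1)))       # 67
```

The second result changes with every guess for `aisle_spacing` and `bay_length`, which is exactly why the calculation is deferred until real measurements are available.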
Other Limitations
Real capacity varies by cart type: Red 2×5 = 10 orders, Green 5×9 = 45 orders, Blue 4×7 = 28 orders. Using the wrong capacity changes how many orders can share a cart and directly affects batch groupings.
More order history would reveal seasonal patterns, product velocity trends, and give algorithms more pairs/groups to optimize. Results should improve with more data.
FedEx vs UPS classification relies on parsing free-text order descriptions. This could miss edge cases, abbreviations, or unusual formatting — causing an order to be misclassified into the wrong wave.
What's Next / How to Fix
To move from approximate results to exact route optimization, we need the following information from LifeSeasons. Once we have it, we can replace the pick-sequence proxy with real Manhattan distances, a standard way to measure warehouse walking distance. This will make every algorithm more accurate, and the true improvement over the WMS can be confirmed with real numbers rather than estimates. Until then, all improvement figures remain directional estimates, not confirmed results.
1. Warehouse Dimensions
- How long is one aisle?
- How wide is the gap between two aisles?
- Where is the start/end point for pickers?
2. Carts
- Which cart types do pickers actually use day to day?
- How is it decided which cart a picker uses for a given batch?
- Is it one order per tote/bin, or can one tote hold multiple orders?
3. Current Batching Logic
- How does the WMS currently decide which orders go into which batch?
4. Warehouse Map
- Can you share a floor plan or map of the warehouse?
5. DTC / SLA Order Identifier
- From the data, DTC orders appear to have hyphenated order numbers (e.g. 4046858-3990825), while non-DTC orders have plain numeric IDs (e.g. 0350216). Can you confirm this is the correct way to identify same-day SLA orders?
- Is there a Shopify store field or another flag we should use instead?
6. Full SKU Restrictions List
- We know items 272 and 284 (large boxes) must go on blue (4×7) carts and cannot be auto-bagged.
- We know item 505 (My Best Heart, crushable) cannot go in the auto bagger.
- Can you provide the complete list of SKUs with size or packing restrictions so these rules can be built into the automation? These items do not appear in our current 10k sample.
7. Warehouse Zones: Speed Cell vs BackStock
- Your inventory data labels every shelf as either Speed Cell (rows 1–3) or BackStock (rows 10–21). Can you confirm what each zone means operationally — e.g., are Speed Cell items picked more frequently because they are top-sellers?
- Rows 4–9 have no inventory entries in the data provided. What are those rows used for — staging, packing, office space, or something else?
8. Cart Assignment Rules: How Does an Order Get Assigned to a Cart Type?
This is one of the most critical inputs for the algorithm. Right now, we have classified every order into a cart type (pack-one, green, blue, red) using estimated rules based on what was mentioned in the meeting. These rules are not confirmed and may be wrong, which would affect all our results. We need LifeSeasons to confirm the exact logic.
- What determines which cart type an order goes on? Is it based on the number of unique shelf locations in the order, the total item quantity, the physical size/weight of the items, or a combination? For example: does a 1-location order always go on a pack-one cart, or are there exceptions?
- Can you confirm the capacity (number of orders per cart) for each cart type? From the meeting we understood: blue = 28 orders, red = 10 bins. We do not have confirmed numbers for pack-one and green carts — we assumed 20 for both. Are these correct?
- Are these rules fixed, or does the picker or Sherry make a judgment call? For instance, can the same order ever go on either a green or a blue cart depending on availability that day? Or is the assignment always deterministic based on order properties?
- Are there cart types we are missing? We are currently aware of four types: pack-one, green, blue, and red. Are there any other cart types in use at the warehouse?
Why this matters: Our algorithm splits all orders into groups by cart type before batching. If the cart assignment rules are wrong, the groups are wrong, and so are all the improvement numbers. Getting this confirmed is essential before we treat any results as reliable.
With these inputs, we can build exact route cost calculations. The CW algorithm's 40.7% improvement over WMS is already impressive using approximate distances and per-cart-type capacities. With real warehouse geometry, that number will be refined further — and we'll be able to give LifeSeasons a precise dollar figure on labor cost savings per day.