Grow Fragrance · Parquet Lake 2023–2025 · April 10, 2026
The −50% floor protects collapsing fragrances from being underforecast by imposing a minimum YoY value, but there is no analogous upper bound. This asymmetric constraint is a systematic bias generator: it prevents one class of error at the cost of introducing the opposite one.
Backtest case — Coastal Tide (Y4): Actual decline was only −17%, but the −50% cap was applied, causing a 38% underforecast. The cap forces a pessimistic forecast on fragrances that don't deserve it.
This cannot be fixed by tuning the floor value. It requires either removing the floor entirely and letting the pooled decay curves carry the distribution, or implementing a symmetric upper bound so the constraint is directionally neutral.
| Fragrance | Actual YoY | Applied Cap | Bias Direction |
|---|---|---|---|
| Pomelo | −71.2% | −50% floor | Underdecays ← floor binds |
| Coastal Tide | −17% | −50% floor | Overdecays ← floor misapplied |
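The asymmetry can be stated in two lines. A minimal sketch, assuming a hypothetical symmetric bound (the `CEILING` name and its 0.50 value are illustrative, not part of FM-01):

```python
FLOOR = -0.50    # current FM-01 max_decay floor
CEILING = 0.50   # hypothetical symmetric upper bound (value illustrative)

def clamp_asymmetric(yoy: float) -> float:
    """Current behaviour: only the downside is bounded."""
    return max(yoy, FLOOR)

def clamp_symmetric(yoy: float) -> float:
    """Directionally neutral: both tails are bounded."""
    return min(max(yoy, FLOOR), CEILING)
```

Pomelo's −71.2% actual is pulled up to −50% under either rule; only the symmetric version would also bound an equally extreme upside move.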
Fragrance-level MAPE at n=5 is not a stable estimate. With only 5 fragrances, each contributes 20% of the metric — removing any single fragrance shifts the headline number materially. MAPE is also sensitive to small-denominator effects: fragrances with small actuals inflate percentage error regardless of forecast quality.
The headline fragrance-level MAPE must always be reported with this instability context — not as a precise model performance number. sMAPE (symmetric MAPE = 2 × |forecast − actual| / (forecast + actual)) is less sensitive to these effects and should be the primary reported metric.
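The small-denominator effect is easy to demonstrate with toy numbers (per-point definitions, positive values assumed):

```python
def mape_point(forecast: float, actual: float) -> float:
    """Per-point MAPE: |f - a| / |a| (blows up as the actual shrinks)."""
    return abs(forecast - actual) / abs(actual)

def smape_point(forecast: float, actual: float) -> float:
    """Per-point sMAPE: 2|f - a| / (f + a), bounded for positive values."""
    return 2 * abs(forecast - actual) / (forecast + actual)

# Same 10-unit miss, shrinking actuals:
# actual 20 -> MAPE 50%, sMAPE 40%
# actual  5 -> MAPE 200%, sMAPE 100%
```

The same absolute miss inflates MAPE without bound as the actual shrinks, while sMAPE stays bounded.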
The deeper issue: MAPE measures forecast accuracy, but inventory decisions are profit/loss decisions under uncertainty. The correct instrument is Newsvendor — it optimizes stocking quantities given a cost-of-underage vs cost-of-overage ratio, which is the actual business decision.
| Metric | Value | Context |
|---|---|---|
| Fragrance MAPE | ~high | Unstable — each fragrance = 20% of metric |
| sMAPE | 56.8% | Directionally consistent with MAPE; less denominator-sensitive |
| Remove Pomelo | Shifts materially | Single outlier drives the headline number |
| Right instrument | Newsvendor | Optimizes inventory under cost asymmetry — not MAPE |
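A sketch of the Newsvendor quantity on a simulated demand distribution. The costs and the normal demand distribution here are illustrative stand-ins, not fitted values:

```python
import numpy as np

def newsvendor_q(demand_samples, cost_underage, cost_overage):
    """Q* is the critical-fractile quantile: CR = cu / (cu + co)."""
    cr = cost_underage / (cost_underage + cost_overage)
    return float(np.quantile(demand_samples, cr))

rng = np.random.default_rng(0)
demand = rng.normal(1_000, 200, size=10_000)  # illustrative demand distribution

# Lost-sale margin 3x the overstock cost -> CR = 0.75: stock above the mean.
q_star = newsvendor_q(demand, cost_underage=3.0, cost_overage=1.0)
```

The cost ratio, not forecast accuracy, determines how far above or below the central forecast to stock.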
Confidence interval coverage in the backtest was far below nominal. CIs built on decay rate uncertainty alone showed only 1 of 5 actuals inside the 70% CI (expected: 3–4) and 3 of 5 inside the 90% CI (expected: 4–5). Coastal Tide's actual fell above the 90% upper bound; Pomelo's fell below the 90% lower bound — complete misses at the widest interval level.
Structural cause: Layer 3 growth rate uncertainty accounts for 63% of actual forecast error but is entirely absent from CI construction. Intervals will remain too narrow (overconfident) until growth uncertainty is added.
Monte Carlo path fixes this by drawing jointly from the decay distribution and the growth rate distribution — producing intervals that reflect the true uncertainty structure of the model.
| CI Level | Expected Coverage | Actual Coverage | Status |
|---|---|---|---|
| 70% CI | 3–4 of 5 | 1 of 5 | Severely overconfident (too narrow) |
| 90% CI | 4–5 of 5 | 3 of 5 | Still overconfident |
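A hedged sketch of the joint draw. The discrete decay values and the growth-rate normal below are stand-ins for the pooled decay distribution and Layer 3 growth uncertainty, not fitted parameters:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 10_000
baseline_units = 1_000  # illustrative prior-year volume

# Stand-in distributions; the real ones come from the pooled decay curve
# and from Layer 3 growth-rate uncertainty.
decay = rng.choice([-0.712, -0.31, -0.17, 0.05, 0.12], size=N)
growth = rng.normal(0.285, 0.05, size=N)

forecast = baseline_units * (1 + decay) * (1 + growth)

# Intervals now reflect both uncertainty sources, not decay alone.
lo70, hi70 = np.quantile(forecast, [0.15, 0.85])
lo90, hi90 = np.quantile(forecast, [0.05, 0.95])
```

Because each iteration multiplies a decay draw by a growth draw, the resulting quantiles widen to cover both error sources at once.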
The backtest does not test the pooled decay curve approach. In backtest mode (train on 2024 only), there are zero year-over-year transitions — so the pooled curve is empty and the engine falls back to the −50% cap for all returning fragrances.
The backtest is answering: "what happens if you apply a −50% decay cap to everything at 28.5% growth?" It is not a test of the pooled approach at all.
Any claim that this backtest validates the model design requires this disclosure. The pooled decay curves — the core innovation — have never been tested out-of-sample. Historical depth (more years of YoY data) is the prerequisite for a real test.
| What the backtest tests | What it does NOT test |
|---|---|
| −50% floor applied to 28.5% growth | Pooled empirical decay curves |
| Fragrance-level allocation sensitivity | Out-of-sample decay distribution |
| Total volume accuracy | Pooled curve shape or breadth |
The 28.5% growth target is accepted as the planning input. What it currently lacks is a documented validation process — a set of leading indicators that could confirm or refine the target before each production run.
Given the engine's 239 unit/percentage-point sensitivity to the growth rate input, establishing a repeatable validation process is the highest-leverage reliability improvement available without any additional engineering. A 1pp error in the growth target propagates to 239 units of misallocation across the fragrance mix.
The fragrance collapse signals identified in this analysis — pre-season velocity cliffs, within-season trajectory divergence, low repeat retention — are exactly the leading indicators that could confirm or invalidate the growth target each cycle.
| Leading Indicator | What It Validates | Source |
|---|---|---|
| Klaviyo list engagement YoY | Demand baseline trend | Klaviyo MCP |
| Prior-season reorder rate | Retention signal for returning fragrances | Shopify / Amazon orders |
| Ad spend trajectory | Canopy investment direction | Amazon Ads / Meta Ads |
| Pre-season velocity cliff | Collapse risk — SSN pattern | Weekly order data |
| Within-season divergence | Collapse risk — Pomelo pattern | Mid-season YoY |
Y1→Y2 transitions are excluded from the collapse trigger. Launch years often see distorted volumes (over- or under-stocking, limited distribution, promotional spikes) that make YoY comparisons unreliable as a health signal. Collapse criteria engage from the second repeat season onward, when demand should be stabilising around a true run rate.
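The exclusion rule reduces to a one-line eligibility check (function name is illustrative):

```python
def collapse_trigger_eligible(from_year: int) -> bool:
    """Only YoY transitions starting at Y2 or later feed the collapse trigger;
    Y1 -> Y2 is excluded because launch-year volumes distort the comparison."""
    return from_year >= 2
```

So `collapse_trigger_eligible(1)` is False (Y1→Y2 excluded) and `collapse_trigger_eligible(2)` is True (Y2→Y3, the first eligible transition).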
| Fragrance | Classification | Confidence | Primary Worst YoY | Worst Season | Units Lost | Severe Seasons (≥30% down) | Summer YoY |
|---|---|---|---|---|---|---|---|
⚠ Scent families elsewhere in this report are inferred from fragrance names only — not from analytical composition clustering. Treat as a directional hypothesis pending proper GC-MS or note-level tagging.
Bar height = count of fragrances in each YoY% bucket. Two distributions shown side-by-side: the growth/launch phase (Y1→Y2) and the maturity/decay phase (Y2→Y3). The −50% floor line shows where the current FM-01 max_decay sits relative to the actual distribution. Smoothed KDE curves are on the next tab.
Smoothed probability density — height = relative likelihood of landing at that decay rate. KDE uses Gaussian kernels with Silverman's rule bandwidth. Collapse and stable sub-populations shown as separate curves so you can see whether they separate or overlap. The −50% floor cuts through the combined curve, not just the tail.
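A numpy-only sketch of the construction, with illustrative YoY values (Silverman's rule of thumb as the bandwidth selector):

```python
import numpy as np

def silverman_bw(x: np.ndarray) -> float:
    """Silverman's rule of thumb: h = 0.9 * min(std, IQR/1.34) * n^(-1/5)."""
    n = len(x)
    iqr = float(np.subtract(*np.percentile(x, [75, 25])))
    return 0.9 * min(float(np.std(x, ddof=1)), iqr / 1.34) * n ** (-0.2)

def kde(x: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Density estimate: average of Gaussian kernels centred on each point."""
    h = silverman_bw(x)
    z = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(x) * h * np.sqrt(2 * np.pi))

yoy = np.array([-0.712, -0.45, -0.31, -0.17, 0.05, 0.12, 0.30])  # illustrative
grid = np.linspace(-1.2, 0.8, 400)
density = kde(yoy, grid)   # relative likelihood at each decay rate
```

Evaluating `kde(yoy, np.array([-0.50]))` gives the density at the floor itself, i.e. how much of the combined curve the −50% line cuts through.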
SSN Q1 2026: 257 units in January → 1 in February → 0 in March. Same cliff pattern as Feb 2025 (47→2 units). SSN is repeating its collapse signal at the same point in the calendar year. Whether this is an active inventory decision or organic demand loss is unknown from order data alone — but the pattern is identical.
Monthly unit curves for 2023, 2024, 2025 reveal whether collapse was cliff-like mid-season (SSN) or a uniform compression across all months (Pomelo). Switch season windows to see pre-season vs in-season patterns.
Two signals correlate with collapse: (1) Amazon ad spend cutoff — Pomelo and SSN had spend cut to ~zero in 2025 after declining in 2024. Causality direction unknown. (2) Retention rates for collapse fragrances (3.5–3.9%) run below stable fragrances (5.7–7.6%) but the gap is narrow. Data gaps for higher-signal covariates listed below.
| Source | Signal | Priority |
|---|---|---|
| Amazon listing health | Review count, avg stars, buy box %, suppression events | High |
| Klaviyo per-fragrance | Campaign open/CVR tagged by featured fragrance | High |
| Google Trends | Search demand for scent names / scent families | Medium |
| Fragrance composition | Top/mid/base notes, GC-MS family (not name-based) | Medium |
| Repeat purchase cohorts | Formal fragrance-level retention series | Available |
| Inventory / stockouts | Was demand suppressed by OOS, not lost? | Medium |
Replace FM-01's fixed max_decay point estimate with a random draw from the empirical distribution below. Each Monte Carlo iteration samples a decay rate; 10K iterations produce a demand distribution. The Newsvendor CR then finds Q* on that distribution rather than a single number.
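Putting the pieces together, a hedged sketch of that pipeline. The pooled observations, costs, and baseline volume are illustrative placeholders for the empirical inputs:

```python
import numpy as np

rng = np.random.default_rng(7)

# Pooled YoY decay observations across fragrances (illustrative values).
pooled_decay = np.array([-0.712, -0.45, -0.31, -0.17, -0.05, 0.05, 0.12])

N = 10_000
baseline_units = 1_000
decay_draws = rng.choice(pooled_decay, size=N)   # replaces the fixed max_decay
demand = baseline_units * (1 + decay_draws)      # Monte Carlo demand distribution

# Newsvendor: Q* is the critical-ratio quantile of the simulated demand.
cu, co = 3.0, 1.0                # assumed underage/overage costs
q_star = float(np.quantile(demand, cu / (cu + co)))
```

Q* now responds to the shape of the pooled decay distribution (including its collapse tail) rather than to a single clamped point estimate.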