Grow Fragrance · Parquet Lake 2023–2025 · April 10, 2026
The −50% floor protects collapsing fragrances from being underforecast by imposing a minimum YoY value, but there is no analogous upper bound. This asymmetric constraint is a systematic bias generator: it prevents one class of error at the cost of introducing the opposite one.
Backtest case — Coastal Tide (Y4): Actual decline was only −17%, but the −50% cap was applied, causing a 38% underforecast. The cap forces a pessimistic forecast on fragrances that don't deserve it.
This cannot be fixed by tuning the floor value. It requires either removing the floor entirely and letting the pooled decay curves carry the distribution, or implementing a symmetric upper bound so the constraint is directionally neutral.
| Fragrance | Actual YoY | Applied Cap | Bias Direction |
|---|---|---|---|
| Pomelo | −71.2% | −50% floor | Underdecays ← floor binds |
| Coastal Tide | −17% | −50% floor | Overdecays ← floor misapplied |
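The asymmetry can be stated in two lines. A minimal sketch, assuming a hypothetical symmetric bound (the `CEILING` name and its 0.50 value are illustrative, not part of FM-01):

```python
FLOOR = -0.50    # current FM-01 max_decay floor
CEILING = 0.50   # hypothetical symmetric upper bound (value illustrative)

def clamp_asymmetric(yoy: float) -> float:
    """Current behaviour: only the downside is bounded."""
    return max(yoy, FLOOR)

def clamp_symmetric(yoy: float) -> float:
    """Directionally neutral: both tails are bounded."""
    return min(max(yoy, FLOOR), CEILING)
```

Pomelo's −71.2% actual is pulled up to −50% under either rule; only the symmetric version would also bound an equally extreme upside move.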
Fragrance-level MAPE at n=5 is not a stable estimate. With only 5 fragrances, each contributes 20% of the metric — removing any single fragrance shifts the headline number materially. MAPE is also sensitive to small-denominator effects: fragrances with small actuals inflate percentage error regardless of forecast quality.
The headline fragrance-level MAPE must always be reported with this instability context — not as a precise model performance number. sMAPE (symmetric MAPE = 2 × |forecast − actual| / (forecast + actual)) is less sensitive to these effects and should be the primary reported metric.
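The small-denominator effect is easy to demonstrate with toy numbers (per-point definitions, positive values assumed):

```python
def mape_point(forecast: float, actual: float) -> float:
    """Per-point MAPE: |f - a| / |a| (blows up as the actual shrinks)."""
    return abs(forecast - actual) / abs(actual)

def smape_point(forecast: float, actual: float) -> float:
    """Per-point sMAPE: 2|f - a| / (f + a), bounded for positive values."""
    return 2 * abs(forecast - actual) / (forecast + actual)

# Same 10-unit miss, shrinking actuals:
# actual 20 -> MAPE 50%, sMAPE 40%
# actual  5 -> MAPE 200%, sMAPE 100%
```

The same absolute miss inflates MAPE without bound as the actual shrinks, while sMAPE stays bounded.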
The deeper issue: MAPE measures forecast accuracy, but inventory decisions are profit/loss decisions under uncertainty. The correct instrument is Newsvendor — it optimizes stocking quantities given a cost-of-underage vs cost-of-overage ratio, which is the actual business decision.
| Metric | Value | Context |
|---|---|---|
| Fragrance MAPE | ~high | Unstable — each fragrance = 20% of metric |
| sMAPE | 56.8% | Directionally consistent with MAPE; less denominator-sensitive |
| Remove Pomelo | Shifts materially | Single outlier drives the headline number |
| Right instrument | Newsvendor | Optimizes inventory under cost asymmetry — not MAPE |
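A sketch of the Newsvendor quantity on a simulated demand distribution. The costs and the normal demand distribution here are illustrative stand-ins, not fitted values:

```python
import numpy as np

def newsvendor_q(demand_samples, cost_underage, cost_overage):
    """Q* is the critical-fractile quantile: CR = cu / (cu + co)."""
    cr = cost_underage / (cost_underage + cost_overage)
    return float(np.quantile(demand_samples, cr))

rng = np.random.default_rng(0)
demand = rng.normal(1_000, 200, size=10_000)  # illustrative demand distribution

# Lost-sale margin 3x the overstock cost -> CR = 0.75: stock above the mean.
q_star = newsvendor_q(demand, cost_underage=3.0, cost_overage=1.0)
```

The cost ratio, not forecast accuracy, determines how far above or below the central forecast to stock.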
Confidence interval coverage in the backtest was far below nominal. CIs built on decay rate uncertainty alone showed only 1 of 5 actuals inside the 70% CI (expected: 3–4) and 3 of 5 inside the 90% CI (expected: 4–5). Coastal Tide's actual fell above the 90% upper bound; Pomelo's fell below the 90% lower bound — complete misses at the widest interval level.
Structural cause: Layer 3 growth rate uncertainty accounts for 63% of actual forecast error but is entirely absent from CI construction. Intervals will remain too narrow (overconfident) until growth uncertainty is added.
Monte Carlo path fixes this by drawing jointly from the decay distribution and the growth rate distribution — producing intervals that reflect the true uncertainty structure of the model.
| CI Level | Expected Coverage | Actual Coverage | Status |
|---|---|---|---|
| 70% CI | 3–4 of 5 | 1 of 5 | Severely overconfident (too narrow) |
| 90% CI | 4–5 of 5 | 3 of 5 | Still overconfident |
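A hedged sketch of the joint draw. The discrete decay values and the growth-rate normal below are stand-ins for the pooled decay distribution and Layer 3 growth uncertainty, not fitted parameters:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 10_000
baseline_units = 1_000  # illustrative prior-year volume

# Stand-in distributions; the real ones come from the pooled decay curve
# and from Layer 3 growth-rate uncertainty.
decay = rng.choice([-0.712, -0.31, -0.17, 0.05, 0.12], size=N)
growth = rng.normal(0.285, 0.05, size=N)

forecast = baseline_units * (1 + decay) * (1 + growth)

# Intervals now reflect both uncertainty sources, not decay alone.
lo70, hi70 = np.quantile(forecast, [0.15, 0.85])
lo90, hi90 = np.quantile(forecast, [0.05, 0.95])
```

Because each iteration multiplies a decay draw by a growth draw, the resulting quantiles widen to cover both error sources at once.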
The backtest does not test the pooled decay curve approach. In backtest mode (train on 2024 only), there are zero year-over-year transitions — so the pooled curve is empty and the engine falls back to the −50% cap for all returning fragrances.
The backtest is answering: "what happens if you apply a −50% decay cap to everything at 28.5% growth?" It is not a test of the pooled approach at all.
Any claim that this backtest validates the model design requires this disclosure. The pooled decay curves — the core innovation — have never been tested out-of-sample. Historical depth (more years of YoY data) is the prerequisite for a real test.
| What the backtest tests | What it does NOT test |
|---|---|
| −50% floor applied to 28.5% growth | Pooled empirical decay curves |
| Fragrance-level allocation sensitivity | Out-of-sample decay distribution |
| Total volume accuracy | Pooled curve shape or breadth |
The 28.5% growth target is accepted as the planning input. What it currently lacks is a documented validation process — a set of leading indicators that could confirm or refine the target before each production run.
Given the engine's 239 unit/percentage-point sensitivity to the growth rate input, establishing a repeatable validation process is the highest-leverage reliability improvement available without any additional engineering. A 1pp error in the growth target propagates to 239 units of misallocation across the fragrance mix.
The fragrance collapse signals identified in this analysis — pre-season velocity cliffs, within-season trajectory divergence, low repeat retention — are exactly the leading indicators that could confirm or invalidate the growth target each cycle.
| Leading Indicator | What It Validates | Source |
|---|---|---|
| Klaviyo list engagement YoY | Demand baseline trend | Klaviyo MCP |
| Prior-season reorder rate | Retention signal for returning fragrances | Shopify / Amazon orders |
| Ad spend trajectory | Canopy investment direction | Amazon Ads / Meta Ads |
| Pre-season velocity cliff | Collapse risk — SSN pattern | Weekly order data |
| Within-season divergence | Collapse risk — Pomelo pattern | Mid-season YoY |
Y1→Y2 transitions are excluded from the collapse trigger. Launch years often see distorted volumes (over- or under-stocking, limited distribution, promotional spikes) that make YoY comparisons unreliable as a health signal. Collapse criteria engage from the second repeat season onward, when demand should be stabilising around a true run rate.
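The exclusion rule reduces to a one-line eligibility check (function name is illustrative):

```python
def collapse_trigger_eligible(from_year: int) -> bool:
    """Only YoY transitions starting at Y2 or later feed the collapse trigger;
    Y1 -> Y2 is excluded because launch-year volumes distort the comparison."""
    return from_year >= 2
```

So `collapse_trigger_eligible(1)` is False (Y1→Y2 excluded) and `collapse_trigger_eligible(2)` is True (Y2→Y3, the first eligible transition).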
| Fragrance | Classification | Confidence | Primary Worst YoY | Worst Season | Units Lost | Severe Seasons (≥30% down) | Summer YoY |
|---|---|---|---|---|---|---|---|
⚠ Scent families elsewhere in this report are inferred from fragrance names only — not from analytical composition clustering. Treat as a directional hypothesis pending proper GC-MS or note-level tagging.
Bar height = count of fragrances in each YoY% bucket. Two distributions shown side-by-side: the growth/launch phase (Y1→Y2) and the maturity/decay phase (Y2→Y3). The −50% floor line shows where the current FM-01 max_decay sits relative to the actual distribution. Smoothed KDE curves are on the next tab.
Smoothed probability density — height = relative likelihood of landing at that decay rate. KDE uses Gaussian kernels with Silverman's rule bandwidth. Collapse and stable sub-populations shown as separate curves so you can see whether they separate or overlap. The −50% floor cuts through the combined curve, not just the tail.
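A numpy-only sketch of the construction, with illustrative YoY values (Silverman's rule of thumb as the bandwidth selector):

```python
import numpy as np

def silverman_bw(x: np.ndarray) -> float:
    """Silverman's rule of thumb: h = 0.9 * min(std, IQR/1.34) * n^(-1/5)."""
    n = len(x)
    iqr = float(np.subtract(*np.percentile(x, [75, 25])))
    return 0.9 * min(float(np.std(x, ddof=1)), iqr / 1.34) * n ** (-0.2)

def kde(x: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Density estimate: average of Gaussian kernels centred on each point."""
    h = silverman_bw(x)
    z = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(x) * h * np.sqrt(2 * np.pi))

yoy = np.array([-0.712, -0.45, -0.31, -0.17, 0.05, 0.12, 0.30])  # illustrative
grid = np.linspace(-1.2, 0.8, 400)
density = kde(yoy, grid)   # relative likelihood at each decay rate
```

Evaluating `kde(yoy, np.array([-0.50]))` gives the density at the floor itself, i.e. how much of the combined curve the −50% line cuts through.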
SSN Q1 2026: 257 units in January → 1 in February → 0 in March. Same cliff pattern as Feb 2025 (47→2 units). SSN is repeating its collapse signal at the same point in the calendar year. Whether this is an active inventory decision or organic demand loss is unknown from order data alone — but the pattern is identical.
Monthly unit curves for 2023, 2024, 2025 reveal whether collapse was cliff-like mid-season (SSN) or a uniform compression across all months (Pomelo). Switch season windows to see pre-season vs in-season patterns.
Two signals correlate with collapse: (1) Amazon ad spend cutoff — Pomelo and SSN had spend cut to ~zero in 2025 after declining in 2024. Causality direction unknown. (2) Retention rates for collapse fragrances (3.5–3.9%) run below stable fragrances (5.7–7.6%) but the gap is narrow. Data gaps for higher-signal covariates listed below.
| Source | Signal | Priority |
|---|---|---|
| Amazon listing health | Review count, avg stars, buy box %, suppression events | High |
| Klaviyo per-fragrance | Campaign open/CVR tagged by featured fragrance | High |
| Google Trends | Search demand for scent names / scent families | Medium |
| Fragrance composition | Top/mid/base notes, GC-MS family (not name-based) | Medium |
| Repeat purchase cohorts | Formal fragrance-level retention series | Available |
| Inventory / stockouts | Was demand suppressed by OOS, not lost? | Medium |
Replace FM-01's fixed max_decay point estimate with a random draw from the empirical distribution below. Each Monte Carlo iteration samples a decay rate; 10K iterations produce a demand distribution. The Newsvendor CR then finds Q* on that distribution rather than a single number.
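Putting the pieces together, a hedged sketch of that pipeline. The pooled observations, costs, and baseline volume are illustrative placeholders for the empirical inputs:

```python
import numpy as np

rng = np.random.default_rng(7)

# Pooled YoY decay observations across fragrances (illustrative values).
pooled_decay = np.array([-0.712, -0.45, -0.31, -0.17, -0.05, 0.05, 0.12])

N = 10_000
baseline_units = 1_000
decay_draws = rng.choice(pooled_decay, size=N)   # replaces the fixed max_decay
demand = baseline_units * (1 + decay_draws)      # Monte Carlo demand distribution

# Newsvendor: Q* is the critical-ratio quantile of the simulated demand.
cu, co = 3.0, 1.0                # assumed underage/overage costs
q_star = float(np.quantile(demand, cu / (cu + co)))
```

Q* now responds to the shape of the pooled decay distribution (including its collapse tail) rather than to a single clamped point estimate.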