174 changes: 0 additions & 174 deletions examples/07_tutorials/ai-reco-of-recos/.gitignore

This file was deleted.

21 changes: 0 additions & 21 deletions examples/07_tutorials/ai-reco-of-recos/LICENSE

This file was deleted.

2 changes: 1 addition & 1 deletion examples/07_tutorials/ai-reco-of-recos/README.md
@@ -187,7 +187,7 @@ The system consists of several components:

## Credits

This project uses the [Recommenders](https://github.com/microsoft/recommenders) library as a reference for recommendation algorithms.
This project uses the [Recommenders](https://github.com/recommenders-team/recommenders) library as a reference for recommendation algorithms.

## License

63 changes: 30 additions & 33 deletions examples/07_tutorials/ai-reco-of-recos/prompts/action_plan.txt
@@ -18,36 +18,39 @@ This action plan will be recommended to run on Azure, please propose an Azure stack

**Sections to Produce (exact headers)**
Based on the suggested candidate recommender algorithms above, please generate the following:
1. **Candidate Generators** – list 1-3 recall models; include data needed & batch cadence.
2. **Re-Ranker / Final Model** – architecture, loss, label definition, and latency target.
3. **Training Stack** – ETL tech, framework (PyTorch Lightning / Spark / LightGBM), HPO strategy, expected hardware.
1. **Architecture** – select the architecture (batch, real-time, hybrid) and justify it against data volume and latency requirements
2. **Model selection** – select a model from the Recommenders library; include the features needed, loss, label definition, objective, and latency target
3. **Training Stack** – ETL tech, expected hardware (CPU, GPU, or Spark), HPO strategy, training time.
4. **Serving Path** – storage (Feature Store / Redis / Cosmos DB), ANN / cache layer, model runtime (ONNX-RT, Triton, etc.), P99 latency.
5. **Metrics & Roll-out** – offline metrics, guard-rail KPIs, A/B or bandit schedule, traffic % ramp.
5. **Metrics & Roll-out** – offline metrics, online metrics, guard-rail KPIs, A/B or bandit schedule, traffic % ramp.
6. **Docs & IaC** – artifacts to generate (README, dashboards), IaC spec (Bicep / Terraform), monitoring hooks.

---

**Output Example - the following is an example ONLY**

```
# 1 · Candidate Generators
· ALS-Spark on last-90-day interactions (nightly)
· TF-IDF on product titles & descriptions for cold SKUs (hourly)

# 2 · Re-Ranker / Final Model
· xDeepFM combining ID embeddings + price + brand + dense CTR features
# 1 · Architecture
· Hybrid architecture: high-recall candidate generation plus high-precision real-time reranking
· Suited to a large dataset of 20M interactions per day and a latency requirement of <100 ms
· Retraining cadence (e.g., hourly, daily)

# 2 · Model selection
· SASRec on user-item interactions, together with LightGBM combining text embeddings + price + brand + dense CTR features
· VW with logistic regression for real-time reranking
· Objective: binary add-to-cart, AUCPR optimised, 50 ms budget

# 3 · Training Stack
· Azure Databricks → Delta Lake → PyTorch Lightning on 4×A100
· Azure Databricks → Delta Lake → AzureML on a multi-GPU server with 4× NVIDIA A100
· Hyperdrive Bayesian sweep (LR, layer dims)

# 4 · Serving Path
· Feature Store (Redis) → Faiss HNSW recall (≤20 ms)
· ONNX-Runtime on AKS for xDeepFM (≤30 ms)
· ONNX-Runtime on AKS for real-time scoring (≤30 ms)

# 5 · Metrics & Roll-out
· Offline: nDCG@20, MAP@20; Online: ATC rate, GMV lift
· Offline: nDCG@20, MAP@20
· Online: add-to-cart rate, revenue lift
· Progressive rollout 1% ➔ 5% ➔ 25% over 2 weeks with sequential testing

# 6 · Docs & IaC
@@ -59,29 +62,27 @@ Fill each section thoroughly; bullet form is fine.
**Expected JSON Response**

{{
"candidate_generators": [
"architecture": [
{{
/* Name and flavour of the recall model that surfaces a
*large* pool of items (e.g., "Spark-ALS (implicit, 90-day)",
"Popularity-biased TF-IDF", "LightGCN Graph Recall"). */
"model": "Recommendation model name/type used for generating initial candidates",
/* Name and flavour of the architecture (e.g., "Batch architecture",
"Real-time architecture", "Hybrid architecture (recall-rerank)"). */
"type": "Recommendation architecture used",

/* All raw sources this generator needs: IDs, interaction logs,
/* All raw sources this architecture needs: IDs, interaction logs,
catalogue metadata, embeddings store paths, etc.
Mention format + retention window if relevant. */
"data_needed": "Data sources and types required for this generator",
"data_needed": "Data sources and types required for this architecture",

/* How often the recall index or scores are refreshed.
/* How often the models are retrained.
Typical values: 'hourly', 'nightly', 'near-real-time (Kafka stream)'. */
"batch_cadence": "How frequently this generator runs (e.g., hourly, daily)"
"retraining_cadence": "How frequently this generator runs (e.g., hourly, daily)"
}}
],

"reranker": {{
/* Architecture that scores the relatively small candidate set
returned by the generator(s)—e.g., 'xDeepFM', 'Transformer
cross-encoder', 'Wide & Deep + price/brand features'. */
"architecture": "Model architecture used for reranking candidates",
"models": {{
/* Algorithm selection from the Recommenders library—e.g., 'xDeepFM', 'Transformer
cross-encoder', 'Wide & Deep + price/brand features', 'SASRec', 'BPR'. */
"algorithms": "Algorithm or combination of algorithms",

/* Training objective—point-wise BCE, pair-wise hinge, listwise
LambdaRank, AUCPR, etc. */
@@ -101,16 +102,12 @@ Fill each section thoroughly; bullet form is fine.
'PySpark → Delta Lake', 'Azure Data Factory', 'dbt + Snowpark', etc. */
"etl": "Data extraction, transformation, loading technology",

/* Core model-training framework(s):
'PyTorch Lightning', 'LightGBM-CLI', 'TensorFlow-Keras', etc. */
"framework": "ML framework used for model development",

/* Hyper-parameter search strategy:
'AzureML HyperDrive (Bayesian)', 'Optuna + ASHA', 'manual grid'. */
"hpo_strategy": "Hyperparameter optimization approach",

/* Compute requested for a *single* full training run—
node type / count / GPUs, or 'CPU-only', etc. */
node type / count / GPUs, Spark, or 'CPU-only', etc. */
"hardware": "Compute resources required for training"
}},

@@ -138,7 +135,7 @@ Fill each section thoroughly; bullet form is fine.
"offline_metrics": ["Metrics used to evaluate model quality offline"],

/* Business KPIs tracked in production:
['CTR','GMV','Watch Time','Add-to-Cart Rate'], etc. */
['CTR','GMV','Watch Time','Add-to-Cart Rate', 'ARPU'], etc. */
"online_kpis": ["Business KPIs monitored during deployment"],

/* Roll-out pattern:
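
The remainder of the JSON schema is collapsed in this diff. As a minimal sketch of how a caller might validate an LLM response against the fields visible above (the function and constant names are illustrative assumptions, not part of this repo):

```python
import json

# Validation sketch for the "Expected JSON Response" contract above. It only
# checks the fields visible in this hunk ("architecture", "models",
# "training_stack"); the collapsed tail of the schema is not covered.
REQUIRED_ARCH_KEYS = {"type", "data_needed", "retraining_cadence"}

def validate_action_plan(raw):
    """Return a list of human-readable schema violations (empty means OK)."""
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ["response is not valid JSON: %s" % exc]

    errors = []
    archs = plan.get("architecture")
    if not isinstance(archs, list) or not archs:
        errors.append("'architecture' must be a non-empty list")
    else:
        for i, arch in enumerate(archs):
            if not isinstance(arch, dict):
                errors.append("architecture[%d] must be an object" % i)
                continue
            missing = REQUIRED_ARCH_KEYS - set(arch)
            if missing:
                errors.append("architecture[%d] is missing keys: %s"
                              % (i, sorted(missing)))

    models = plan.get("models")
    if not isinstance(models, dict) or "algorithms" not in models:
        errors.append("'models' must be an object with an 'algorithms' field")

    if "training_stack" not in plan:
        errors.append("missing 'training_stack' object")
    return errors
```
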
32 changes: 18 additions & 14 deletions examples/07_tutorials/ai-reco-of-recos/prompts/reco_selection.txt
@@ -12,12 +12,11 @@ Output a short, ordered list (`ranked_algos`) plus a 2-sentence rationale for each.
**Table A – Canonical Algorithms**

```
• Explicit-rating CF …… ALS, Surprise-SVD, LightFM, Wide&Deep, xDeepFM
• Explicit-rating CF …… ALS, Surprise-SVD, LightFM, Wide&Deep, xDeepFM, SAR
• Implicit-feedback CF …… BPR, SAR, LightGCN, NCF, RBM, ALS-implicit, xDeepFM, LightFM
• Sequential / session …… SASRec, Caser, GRU, NextItNet, A2SVD, SLi-Rec, SUM, SSEPT
• Content / hybrid …… LightGBM-GBT, TF-IDF, DKN, NAML, NPA, NRMS, LSTUR, VW, Wide&Deep, xDeepFM, LightFM
• Generative / VAE …… BiVAE, Multinomial-VAE, Standard-VAE
• Geo / Riemannian …… GeoIMC, RLRMC
```

---
@@ -26,32 +25,37 @@

1. **Interaction_Type**

* `explicit_ratings`
* `explicit_ratings` (e.g., star ratings)
* `implicit_events` (clicks, views, purchases)
* `sequential_events` (time-ordered)
* `content_driven` (news, ads, cold items)
2. **Cold_Start_Severity** (`high` | `medium` | `low`)
3. **Scale_and_Latency**
3. **Architecture**

* `>1 B events` → prefer Spark/graph models (ALS-Spark, LightGCN)
* `online_<100 ms` → prefer lightweight or two-stage (TF-IDF, VW, SAR similarities)
4. **Business_Metric** (`CVR` | `WatchTime` | `ARPU` | `Engagement`)
5. **Regulatory_or_Explainability_Needed** (`yes` | `no`)
6. **Available_Features** (`ids_only` | `rich_context` | `text_KG` | `geo`)
7. **Compute_Budget** (`cpu_only` | `gpu_ok` | `distributed_spark`)
* `batch` → lowest latency (5 ms-20 ms), cheapest; models are trained once a day (so real-time interactions are not captured) and served by a look-up against a fast store (Redis, SQL Server); see the look-up sketch after this list
* `real_time` → high latency (100 ms-200 ms); the model is deployed and scores requests in real time
* `hybrid` → also called recall-rerank or two-step recommender; mid latency (20 ms-100 ms); batch candidate generation plus real-time reranking
4. **Scale_and_Latency**

* `>1 B events` → prefer Spark or lightweight models (ALS-Spark, SARplus, xLearn)
* `<20ms` → prefer batch with heavy models (SASRec, BiVAE, LightGCN)
* `online_<100 ms` → prefer a hybrid architecture: heavy, high-recall models for candidate generation (SASRec, BiVAE, LightGCN) plus lightweight models for real-time reranking (VW, LightGBM)
5. **Business_Metric** (`CVR` | `CTR` | `WatchTime` | `ARPU` | `Engagement`)
6. **Regulatory_or_Explainability_Needed** (`yes` | `no`)
7. **Available_Features** (`ids_only` | `rich_context` | `text_KG`)
8. **Compute_Budget** (`cpu_only` | `gpu_ok` | `distributed_spark`)
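
To make the `batch` option concrete, here is a minimal serving sketch (the client setup, key layout, and function names are assumptions for illustration, not part of this prompt): recommendations are precomputed by the training job and served with a single key look-up, which is what makes the 5 ms-20 ms band attainable.

```python
import json

import redis  # assumes the redis-py client is installed

# Batch serving sketch: recommendations are precomputed offline (e.g., by a
# nightly training job) and then served with a single O(1) key look-up.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def publish_batch_recos(user_to_items, ttl_seconds=86_400):
    """Write one key per user once the batch training job finishes."""
    with r.pipeline() as pipe:
        for user_id, items in user_to_items.items():
            pipe.set("recos:%s" % user_id, json.dumps(items), ex=ttl_seconds)
        pipe.execute()

def get_recos(user_id, fallback=()):
    """Serving path: one look-up; fall back to, e.g., popular items for cold users."""
    cached = r.get("recos:%s" % user_id)
    return json.loads(cached) if cached else list(fallback)

# Usage: publish_batch_recos({"u1": ["sku-42", "sku-7"]}); get_recos("u1")
```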

---

**Selection Rules (compressed)**

* *Explicit + ids_only* → ALS / SVD.
* *Explicit + ids_only* → ALS, SVD, SAR (similarity).
* *Implicit + ids_only* → BPR (pairwise), SAR (similarity), LightGCN (graph), NCF (deep).
* *Sequential* → Transformer (SASRec/SSEPT) if gpu_ok, else Caser/GRU.
* *Sequential* → Transformer (SASRec/SSEPT) or Caser/GRU if gpu_ok, else LightGBM.
* *Content_driven & high_cold_start* → LightGBM-GBT, TF-IDF, DKN/NRMS/LSTUR family.
* *Need_CVR / ROI* → point-wise GBT (LightGBM) or xDeepFM / Wide&Deep.
* *Need_Interpretability* → GBT, TF-IDF, attention-based (DKN, NRMS), VW with per-feature weights.
* *Geo features* → GeoIMC, RLRMC.
* *Massive_scale* → Spark-ALS, SAR-Plus, LightGCN with mini-batch sampling.
* *Massive_scale* → Spark-ALS, SAR-Plus, LightGBM-Spark, LightGCN with mini-batch sampling.
* *Real-time bandit / low_latency* → pre-compute candidates + VW or LightGBM re-rank.
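
As an illustrative sketch only (the function and its inputs are assumptions, not part of the prompt contract), the compressed rules above can be encoded as a small dispatch function that returns a ranked shortlist:

```python
# Sketch of the compressed selection rules above; inputs mirror the decision
# dimensions, and each branch returns a ranked shortlist, not a full policy.
def rank_algos(interaction_type, features="ids_only", gpu_ok=False):
    if interaction_type == "explicit_ratings" and features == "ids_only":
        return ["ALS", "SVD", "SAR"]
    if interaction_type == "implicit_events" and features == "ids_only":
        return ["BPR", "SAR", "LightGCN", "NCF"]
    if interaction_type == "sequential_events":
        return ["SASRec", "SSEPT", "Caser", "GRU"] if gpu_ok else ["LightGBM"]
    # content_driven (including high cold-start) falls through to content models
    return ["LightGBM-GBT", "TF-IDF", "DKN", "NRMS", "LSTUR"]

# Example: implicit click logs, IDs only, CPU-only budget
# rank_algos("implicit_events", "ids_only", gpu_ok=False)
# -> ["BPR", "SAR", "LightGCN", "NCF"]
```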

---