Online Learning for CTR Models

Continuous model improvement through automatic retraining on new click data.

Overview
1. What is Online Learning?
2. Why Online Learning?
Architecture
Configuration
1. Enable Online Learning
2. Training Trigger Logic
Model Management
Usage Guide
1. Setup Online Learning
2. Monitor Online Learning
Best Practices
Troubleshooting
Advanced Topics
1. Incremental vs. Full Training
2. Multi-Model Strategy
Related Resources

Overview

What is Online Learning?

Online Learning enables CTR models to continuously improve by automatically retraining on newly collected click data. Unlike traditional batch training, which runs only when someone starts it manually, online learning retrains as soon as enough new samples have accumulated, so the model follows changing user behavior with little delay.

Key Concepts:

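Incremental Training: Model is updated whenever enough new data arrives, with no manually scheduled retraining (see Advanced Topics for how the current implementation retrains)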
Automatic Triggers: Training starts automatically when enough new data arrives
Model Versioning: Separate checkpoints for online and offline models
Hot Reloading: Seamless model updates without service interruption

Why Online Learning?

Business Benefits:

Real-time Adaptation: Model adapts to current user preferences
Improved Accuracy: Continuous learning improves prediction quality
Reduced Staleness: Model stays current with latest trends
Automated Pipeline: No manual intervention needed

Technical Benefits:

Data Efficiency: Leverage every user interaction immediately
Incremental Updates: No need to retrain from scratch
Lower Latency: Smaller training batches complete faster
Easy Rollback: Keep multiple checkpoint versions

Architecture

Training Modes

The system supports two training modes that can run in parallel:

Mode	Trigger	Data	Model Path	Checkpoints	Use Case
Offline	Manual	Full dataset	`models/ctr_offline/`	All versions	Baseline, major updates
Online	Automatic	Incremental	`models/ctr_online/`	Last 5 only	Continuous improvement

Workflow

sequenceDiagram
    participant User
    participant System
    participant DataService
    participant ModelService
    
    User->>System: Search & Click
    System->>DataService: Record click event
    DataService->>DataService: Accumulate data
    
    Note over DataService: Check trigger threshold
    
    alt Threshold reached
        DataService->>ModelService: Trigger online training
        ModelService->>ModelService: Train incremental model
        ModelService->>ModelService: Save checkpoint
        ModelService->>ModelService: Clean old checkpoints
        ModelService->>System: Hot reload model
    end
    
    System->>User: Use updated model

Key Components

1. Data Service:

Monitors incoming click events
Counts new samples since last training
Triggers online training automatically

2. Model Service:

Maintains separate online/offline model directories
Manages checkpoint numbering and cleanup
Handles hot reloading of updated models

3. Configuration:

Enable/disable online learning
Set training trigger threshold
Configure checkpoint retention policy

Configuration

Enable Online Learning

In Web UI:

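Navigate to the “📊 第六部分：数据回收训练” (Part 6: Data Collection and Training) tab
Find the “🔄 在线学习配置” (Online Learning Configuration) section
Toggle the “启用在线学习” (Enable Online Learning) checkbox
Set “触发阈值” (Trigger Threshold; default: 100 new samples)
The system now trains automatically each time the threshold is reached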

Configuration Options:

{
    "online_learning_enabled": True,      # Enable/disable online learning
    "online_training_threshold": 100,     # Number of new samples to trigger training
    "checkpoint_retention": 5,            # Keep last N online checkpoints
    "model_prefix": "ctr_model_online"    # Online model file prefix
}
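One way to keep these options validated in a single place is a small dataclass. This is a minimal sketch, not the project's actual configuration class: the `OnlineLearningConfig` name and the `from_dict` helper are assumptions, and the default for `online_learning_enabled` is a guess, since only the threshold (100) and retention (5) defaults are stated above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OnlineLearningConfig:
    """Hypothetical container for the options listed above."""
    online_learning_enabled: bool = False   # assumed default, not documented
    online_training_threshold: int = 100
    checkpoint_retention: int = 5
    model_prefix: str = "ctr_model_online"

    def __post_init__(self):
        # Reject values that would make the trigger or cleanup meaningless.
        if self.online_training_threshold < 1:
            raise ValueError("online_training_threshold must be >= 1")
        if self.checkpoint_retention < 1:
            raise ValueError("checkpoint_retention must be >= 1")

    @classmethod
    def from_dict(cls, raw):
        # Ignore unknown keys so configs with extra entries still load.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})
```

Validating at construction time means a threshold of 0, which would trigger training on every click, is rejected before the service starts.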

Training Trigger Logic

Automatic Trigger:

DataService tracks last_training_data_count, the sample count at the time of the most recent training run
On each new click, it checks whether current_count - last_training_data_count >= threshold
If the threshold is reached, it triggers ModelService.train_model() in a background thread

Code Example:

def record_click(self, query, doc_id, position, clicked):
    """Record a click event and trigger online training if needed."""
    # Save the click data
    sample = self._create_sample(query, doc_id, position, clicked)
    self.samples.append(sample)

    # Check the online learning trigger
    if self.online_learning_enabled:
        current_count = len(self.samples)
        new_samples = current_count - self.last_training_data_count
        if new_samples >= self.online_training_threshold:
            # Log message: "Online training triggered (N new samples)"
            print(f"🔄 触发在线训练 (新增 {new_samples} 条样本)")
            # Advance the baseline first so later clicks do not re-trigger
            # while the background training run is still in progress
            self.last_training_data_count = current_count
            self._trigger_online_training()
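The body of `_trigger_online_training()` is not shown in this guide. A minimal sketch of one way to run training in a background thread while refusing overlapping runs follows; the `OnlineTrainingTrigger` class and its method names are assumptions for illustration, not the project's code.

```python
import threading


class OnlineTrainingTrigger:
    """Hypothetical helper that runs a training callable in the background."""

    def __init__(self, train_fn):
        # train_fn could be, for example,
        # lambda: model_service.train_model(is_online=True)
        self._train_fn = train_fn
        self._lock = threading.Lock()   # held while a training run is active
        self._thread = None

    def trigger(self):
        """Start training unless a run is already active. Returns True if started."""
        if not self._lock.acquire(blocking=False):
            return False                # a previous run is still training
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return True

    def _run(self):
        try:
            self._train_fn()
        finally:
            self._lock.release()        # allow the next trigger

    def wait(self, timeout=None):
        """Block until the current run finishes (useful in tests)."""
        if self._thread is not None:
            self._thread.join(timeout)
```

Skipping a trigger while a run is active keeps click recording non-blocking; the skipped samples are still picked up by the next run.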

Model Management

Directory Structure

models/
├── ctr_offline/                    # Offline models (manual training)
│   ├── ctr_model_v1.pkl
│   ├── ctr_model_v2.pkl
│   └── ctr_model_v3.pkl
│
└── ctr_online/                     # Online models (automatic training)
    ├── ctr_model_online_001.pkl    # Oldest
    ├── ctr_model_online_002.pkl
    ├── ctr_model_online_003.pkl
    ├── ctr_model_online_004.pkl
    └── ctr_model_online_005.pkl    # Latest (active)

Checkpoint Numbering

Automatic Numbering:

Online checkpoints use sequential numbers: 001, 002, 003, …
Numbers auto-increment with each training
Format: ctr_model_online_{number:03d}.pkl

Cleanup Policy:

Keep only the last N checkpoints (default: 5)
Older checkpoints are automatically deleted
Prevents disk space exhaustion
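The numbering and cleanup rules above can be sketched with `pathlib`. The function names here are hypothetical; only the file name format `ctr_model_online_{number:03d}.pkl` and the default retention of 5 come from this guide.

```python
import re
from pathlib import Path

CHECKPOINT_RE = re.compile(r"^ctr_model_online_(\d+)\.pkl$")


def list_checkpoints(directory):
    """Return (number, path) pairs for online checkpoints, oldest first."""
    found = []
    for path in Path(directory).glob("ctr_model_online_*.pkl"):
        match = CHECKPOINT_RE.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found)


def next_checkpoint_path(directory):
    """Path for the next checkpoint: highest existing number plus one."""
    existing = list_checkpoints(directory)
    number = existing[-1][0] + 1 if existing else 1
    return Path(directory) / f"ctr_model_online_{number:03d}.pkl"


def cleanup_old_checkpoints(directory, retention=5):
    """Delete all but the newest `retention` checkpoints; return deleted names."""
    existing = list_checkpoints(directory)
    stale = existing[:-retention] if retention > 0 else existing
    for _, path in stale:
        path.unlink()
    return [path.name for _, path in stale]
```

Sorting by the parsed integer rather than by file name keeps the order correct once the counter passes 999 and the names grow to four digits.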

Hot Reloading

Seamless Model Updates:

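def train_model(self, model_type='logistic', is_online=False):
    """Train a model, save it, and hot reload it as the active model."""
    # Load features and labels; _load_training_data is an assumed helper name,
    # since the original snippet does not show where X and y come from
    X, y = self._load_training_data()

    # Train the new model
    new_model = self._train(X, y)

    # Save to the appropriate directory
    if is_online:
        model_path = self._save_online_checkpoint(new_model)
        self._cleanup_old_checkpoints()
    else:
        model_path = self._save_offline_model(new_model)

    # Hot reload: make the new model the active one
    self.model = new_model
    self.current_model_path = model_path

    # Log message: "Model updated and loaded automatically: <path>"
    print(f"✅ 模型已更新并自动加载: {model_path}")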
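When training runs in a background thread while request threads read the active model, two details help keep the swap seamless: writing the checkpoint so that a partial file is never visible, and replacing the in-memory reference under a lock. The sketch below illustrates both under those assumptions; `ModelHolder` and `save_atomically` are hypothetical names, not part of the documented ModelService.

```python
import os
import pickle
import tempfile
import threading


class ModelHolder:
    """Hypothetical wrapper that lets request threads read the active model safely."""

    def __init__(self, model=None):
        self._lock = threading.Lock()
        self._model = model

    def get(self):
        with self._lock:
            return self._model

    def swap(self, new_model):
        # Replace the reference in one step; requests that already fetched
        # the old object keep using it until they finish.
        with self._lock:
            old, self._model = self._model, new_model
        return old


def save_atomically(model, path):
    """Write the pickle to a temp file in the same directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(model, handle)
        os.replace(tmp_path, path)   # the rename is atomic on POSIX file systems
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern a crash during training or saving leaves the previous checkpoint and the previous in-memory model untouched.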

Usage Guide

Setup Online Learning

Step 1: Enable Feature

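Open the Web UI → “数据回收训练” (Data Collection and Training) tab
Expand the “🔄 在线学习配置” (Online Learning Configuration) section
Check “启用在线学习” (Enable Online Learning)
Set the threshold (e.g., 100 samples)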

Step 2: Collect Data

Navigate to the “🔍 在线检索与排序” (Online Retrieval and Ranking) tab
Perform searches and interact with the results
The system records clicks automatically
Monitor the sample count in the “数据管理” (Data Management) section

Step 3: Automatic Training

Training starts automatically when the threshold is reached

Check terminal output for training logs:

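🔄 触发在线训练 (新增 102 条样本)    # Online training triggered (102 new samples)
📊 开始训练 CTR 模型...    # Starting CTR model training...
✅ 模型训练完成: models/ctr_online/ctr_model_online_006.pkl    # Model training finished
🗑️ 清理旧检查点: ctr_model_online_001.pkl    # Removing old checkpoint
✅ 模型已更新并自动加载    # Model updated and loaded automatically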

Step 4: Verify Model

New searches automatically use the updated model
Compare rankings before/after training
Monitor performance in evaluation metrics

Monitor Online Learning

Training Status:

Check terminal logs for training triggers
Monitor model version in use
Track performance metrics over time

Sample Count:

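📊 数据统计 (Data statistics)
- 总样本数: 1,234 (total samples)
- 正样本数: 156 (点击) (positive samples, clicked)
- 负样本数: 1,078 (未点击) (negative samples, not clicked)
- 上次训练: 1,200 样本 (sample count at last training)
- 距离触发: 66 样本 (阈值: 100) (samples until next trigger; threshold: 100)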
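The "samples until next trigger" figure follows directly from the trigger rule: it is the threshold minus the number of samples recorded since the last training run. A minimal sketch, with a hypothetical function name:

```python
def samples_until_trigger(total_samples, last_training_count, threshold):
    """Samples still needed before the next automatic training (0 means due now)."""
    new_samples = total_samples - last_training_count
    return max(threshold - new_samples, 0)


# 40 new samples since the last run, threshold 100: 60 more are needed.
print(samples_until_trigger(540, 500, 100))
```

If the panel shows a remaining count that does not match this arithmetic, the last-training counter is probably not being updated after each run.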

Best Practices

Threshold Selection

Choosing the Right Threshold:

Threshold	Training Frequency	Use Case
50-100	High (frequent)	High-traffic systems, rapid adaptation
100-500	Medium	Balanced approach, general use
500-1000	Low (stable)	Low-traffic systems, stable patterns

Considerations:

Too Low: Frequent training, may overfit to noise
Too High: Slow adaptation, miss recent trends
Recommended: Start with 100, adjust based on traffic
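To pick a starting value, it can help to estimate how many training runs a given threshold produces at your traffic level. The helper below is a rough sketch with a hypothetical name; it assumes every recorded sample counts toward the threshold and ignores runs skipped because a previous one is still in progress.

```python
def estimated_runs_per_day(samples_per_day, threshold):
    """Approximate number of automatic training runs triggered per day."""
    if threshold < 1:
        raise ValueError("threshold must be >= 1")
    return samples_per_day // threshold


# Compare candidate thresholds for a system recording 2,000 samples per day.
for threshold in (50, 100, 500, 1000):
    print(threshold, estimated_runs_per_day(2_000, threshold))
```

If the estimate is higher than the number of runs that can finish in a day, raise the threshold; if it is below one, lower it or expect multi-day gaps between updates.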

Data Quality

Ensure Quality Training Data:

Sufficient Volume: At least 50-100 samples per training
Balance: Mix of positive and negative samples
Diversity: Represent different queries and documents
Freshness: Recent data reflects current patterns
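These checks can be automated before a training run starts. The sketch below assumes each sample is a dict with `query` and `clicked` keys, which this guide does not confirm, and the ratio and query-count limits are illustrative values rather than recommendations from the project.

```python
def check_batch_quality(samples, min_samples=50, min_positive_ratio=0.01, min_queries=5):
    """Return a list of problems; an empty list means the batch looks usable."""
    problems = []
    if len(samples) < min_samples:
        problems.append(f"only {len(samples)} samples (need {min_samples})")
    if samples:
        positives = sum(1 for s in samples if s["clicked"])
        ratio = positives / len(samples)
        # A batch that is almost all clicks or almost all non-clicks
        # gives the model very little to learn from.
        if ratio < min_positive_ratio or ratio > 1 - min_positive_ratio:
            problems.append(f"unbalanced labels (positive ratio {ratio:.2%})")
        if len({s["query"] for s in samples}) < min_queries:
            problems.append("too few distinct queries")
    return problems
```

A training trigger could skip the run and log the returned problems instead of fitting a model on a batch that fails these checks.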

Model Monitoring

Track Key Metrics:

Training Frequency: How often models are updated
Model Performance: AUC, accuracy trends over time
Checkpoint Count: Ensure old checkpoints are cleaned
Disk Usage: Monitor online model directory size
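Two of these metrics are easy to compute without extra dependencies. The sketch below shows a pairwise AUC, suitable for small evaluation sets only because it compares every positive with every negative, and a checkpoint count with total size for the online model directory. Both function names are hypothetical.

```python
from pathlib import Path


def auc(labels, scores):
    """Probability that a random clicked sample outscores a random unclicked one."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def checkpoint_stats(directory):
    """Return (checkpoint count, total size in bytes) for .pkl files in a directory."""
    files = list(Path(directory).glob("*.pkl"))
    return len(files), sum(f.stat().st_size for f in files)
```

Recording both values after every training run gives a simple time series for spotting accuracy regressions and cleanup failures.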

Troubleshooting

Training Not Triggering

Problem: Online learning enabled but no automatic training

Solutions:

Check Threshold: Verify threshold is reasonable for traffic level
Check Sample Count: Ensure enough new samples accumulated
Check Logs: Look for error messages in terminal
Manual Test: Try manual training to verify system works

Model Not Updating

Problem: Training completes but model not updating

Solutions:

Check Hot Reload: Verify model service reloads after training
Check File Path: Ensure model file saved correctly
Restart Service: Reload model service if needed
Check Logs: Look for hot reload success messages

Disk Space Issues

Problem: Too many checkpoints consuming disk space

Solutions:

Reduce Retention: Lower checkpoint retention count
Manual Cleanup: Delete old online checkpoints manually
Monitor Usage: Set up disk space alerts
Archive: Move old models to archive storage

Advanced Topics

Incremental vs. Full Training

Current Implementation: Full retraining on all accumulated data, including the latest samples

Simple and reliable
No catastrophic forgetting
Good for moderate data volumes

Future Enhancement: True incremental training

Update model with only new data
Faster training for large datasets
Requires careful handling of model state
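To make the contrast concrete, here is a minimal sketch of what true incremental training could look like for a logistic model: each new sample adjusts the existing weights with one stochastic gradient step instead of refitting from scratch. This is an illustration under stated assumptions, not the project's planned design; the class and its parameters are hypothetical. Libraries such as scikit-learn offer a comparable interface through `SGDClassifier.partial_fit`.

```python
import math


class OnlineLogisticRegression:
    """Tiny logistic regression updated one sample at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(min(z, 35.0), -35.0)   # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, x, clicked):
        """One gradient step on the log loss for a single (features, label) pair."""
        error = self.predict_proba(x) - (1.0 if clicked else 0.0)
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error
```

The model state (weights and bias) must be persisted between updates, which is the "careful handling of model state" noted above; older patterns can also fade if recent data dominates the updates.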

Multi-Model Strategy

A/B Testing:

Keep multiple online model versions
Route traffic to different models
Compare performance metrics
Deploy best-performing model
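Routing for such a test is often done by hashing a stable identifier, so that the same user always sees the same model. A minimal sketch, assuming a user ID is available; the function name and the salt value are hypothetical:

```python
import hashlib


def assign_variant(user_id, variants, salt="ctr-ab-test"):
    """Deterministically map a user to one of the model variants."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


variants = ["ctr_model_online_004", "ctr_model_online_005"]
print(assign_variant("user-42", variants))
```

Changing the salt reshuffles the assignment, which is useful when starting a new experiment with the same user base.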

Ensemble Learning:

Combine predictions from multiple models
Offline model (stable) + Online model (adaptive)
Weighted average or stacking
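The weighted-average variant is a one-line blend of the two predicted click probabilities. A minimal sketch with a hypothetical function name; the weight of 0.5 is a placeholder that would normally be tuned on held-out data:

```python
def blend_ctr(offline_score, online_score, online_weight=0.5):
    """Weighted average of the offline (stable) and online (adaptive) predictions."""
    if not 0.0 <= online_weight <= 1.0:
        raise ValueError("online_weight must be between 0 and 1")
    return (1.0 - online_weight) * offline_score + online_weight * online_score
```

A low `online_weight` limits the damage if a freshly trained online model is worse than expected, while still letting recent data influence the ranking.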

Related Resources

CTR Prediction Models: Model architectures
Model Evaluation: Performance metrics
Data Collection: Sample management
