Automatic AFK detection for visual novel reading statistics

💡 This post is AI generated

A simple character-based heuristic outperforms fixed timers and requires zero configuration. The formula threshold = max(5, min(chars × 1.2, 120)) handles 80%+ of cases without any user tuning, while an adaptive variant using exponential moving averages can personalize detection after a brief warm-up period. For applications requiring maximum robustness, the Modified Z-Score method using Median Absolute Deviation provides statistically rigorous outlier detection that remains stable even with contaminated data.

The core problem with fixed AFK timers like the current 60-second approach is that they ignore text length entirely. A 4-character line like 「ああ...」 legitimately takes 2-3 seconds to read, making a 60-second threshold absurdly generous: it would count up to 57 seconds of idle time toward reading statistics. Conversely, a 200-character passage might genuinely require 90+ seconds for a learner, yet would be incorrectly flagged as AFK.
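To make the mismatch concrete, here is a small illustrative comparison. The "realistic" reading times (3 seconds for the interjection, 90 seconds for the long passage) are assumed figures for the example, not measurements.

python

# Illustrative only: how a fixed 60-second timer misjudges both extremes,
# compared with the character-scaled threshold introduced below.
FIXED_TIMER = 60

def scaled_threshold(chars: int) -> float:
    return max(5, min(chars * 1.2, 120))

cases = [("「ああ...」", 4, 3.0), ("200-char passage", 200, 90.0)]
for label, chars, realistic_time in cases:
    fixed_flag = realistic_time > FIXED_TIMER
    scaled_flag = realistic_time > scaled_threshold(chars)
    print(f"{label}: fixed timer flags AFK={fixed_flag} "
          f"(idle allowance {max(0, FIXED_TIMER - realistic_time):.0f}s); "
          f"scaled threshold {scaled_threshold(chars):.0f}s flags AFK={scaled_flag}")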

Each approach below solves the same problem with increasing sophistication. Choose based on your implementation constraints and accuracy requirements.

Algorithm 1: Multi-Tier Character Heuristic requires no historical data and works immediately. Algorithm 2: EMA Adaptive Baseline learns individual reading speeds after 5-10 text boxes. Algorithm 3: Modified Z-Score with MAD provides the most statistically robust detection but requires maintaining a rolling history window.

| Approach | Lines of Code | Accuracy | Adapts to User | Cold Start |
| --- | --- | --- | --- | --- |
| Character Heuristic | ~10 | Good (80%) | No | Instant |
| EMA Adaptive | ~40 | Very Good (90%) | Yes | 5-10 samples |
| Modified Z-Score | ~60 | Excellent (95%) | Yes | 10-20 samples |

Algorithm 1: Multi-tier character heuristic

This approach requires zero configuration and no warm-up period. It works by scaling the AFK threshold proportionally to text length, bounded by sensible minimum and maximum values.

python

def is_afk(time_seconds: float, char_count: int) -> bool:
    """
    Simple heuristic that works without any learning.
    Returns True if the reading time indicates user was likely AFK.
    """
    # Minimum threshold: even "ああ" needs reaction time
    MIN_THRESHOLD = 5
    
    # Maximum threshold: beyond this is definitely AFK
    MAX_THRESHOLD = 120
    
    # Time allowance per character (accounts for reading + processing)
    # 1.2 sec/char ≈ learner reading at 50 char/min + thinking time
    SECONDS_PER_CHAR = 1.2
    
    threshold = max(MIN_THRESHOLD, min(char_count * SECONDS_PER_CHAR, MAX_THRESHOLD))
    return time_seconds > threshold

Why these specific values? Japanese reading speeds vary dramatically: native speakers read 500-1,200 characters per minute (8-20 char/sec), while intermediate learners read 150-300 char/min (2.5-5 char/sec). The 1.2 seconds per character works out to roughly 50 char/min, which is the slower intermediate rate (150 char/min, or 0.4 sec/char) with a 3× buffer for dictionary lookups, re-reading, and processing time. The 5-second minimum handles reaction time for clicking through dialogue, while the 120-second cap prevents absurdly long thresholds for text walls.

Edge case behavior:

  • 「ああ...」(4 chars): threshold = max(5, 4.8) = 5 seconds
  • Standard dialogue (30 chars): threshold = 36 seconds
  • Long narration (100 chars): threshold = 120 seconds (capped)
  • Very long passage (200 chars): threshold = 120 seconds (capped)
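A quick sanity check of those thresholds with the is_afk function above; the elapsed times are invented for illustration.

python

# Elapsed times are made up; thresholds match the list above.
print(is_afk(12, 4))     # True:  12 s on 「ああ...」 exceeds the 5 s floor
print(is_afk(30, 30))    # False: 30 s on 30 chars is under the 36 s threshold
print(is_afk(100, 100))  # False: 100 s is under the 120 s cap
print(is_afk(150, 200))  # True:  anything over the 120 s cap counts as AFK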

Algorithm 2: Exponential moving average adaptive threshold

This approach learns the user's personal reading speed over time, providing increasingly accurate detection as more data accumulates. It falls back to the simple heuristic during the warm-up period.

python

class AdaptiveAFKDetector:
    def __init__(self):
        self.alpha = 0.2              # EMA smoothing factor
        self.ema_time_per_char = None # Learned baseline
        self.sample_count = 0
        
        # Warm-up settings
        self.MIN_SAMPLES = 5
        self.FALLBACK_TIME_PER_CHAR = 1.2
        
        # Detection settings
        self.ANOMALY_MULTIPLIER = 3.0
        self.ABSOLUTE_MIN = 5
        self.ABSOLUTE_MAX = 180
    
    def record_reading(self, time_seconds: float, char_count: int) -> None:
        """Call after user advances to next line (confirmed not AFK)."""
        if char_count < 2:  # Skip very short lines
            return
        
        time_per_char = time_seconds / char_count
        # Clamp extreme values to avoid polluting baseline
        time_per_char = max(0.1, min(time_per_char, 5.0))
        
        if self.ema_time_per_char is None:
            self.ema_time_per_char = time_per_char
        else:
            # EMA: new = α × current + (1-α) × old
            self.ema_time_per_char = (
                self.alpha * time_per_char + 
                (1 - self.alpha) * self.ema_time_per_char
            )
        self.sample_count += 1
    
    def is_afk(self, time_seconds: float, char_count: int) -> bool:
        """Returns True if reading time indicates AFK."""
        if self.sample_count < self.MIN_SAMPLES:
            # Warm-up: use generous fallback
            base = self.FALLBACK_TIME_PER_CHAR
        else:
            base = self.ema_time_per_char
        
        threshold = char_count * base * self.ANOMALY_MULTIPLIER
        threshold = max(self.ABSOLUTE_MIN, min(threshold, self.ABSOLUTE_MAX))
        
        return time_seconds > threshold

Why EMA over simple moving average? EMA adapts faster to changes in reading speed (user improving over time or switching between easy and hard games), requires no fixed-size buffer, and uses a single recursive formula. The α=0.2 value means the most recent reading carries ~20% weight while the accumulated baseline carries ~80%, providing stability while still responding to sustained speed changes.
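To see what that 20/80 split means in practice: the k-th most recent confirmed reading carries weight α(1-α)^k, so the last ten readings together account for roughly 89% of the baseline.

python

# Effective weights under an EMA with alpha = 0.2.
alpha = 0.2
weights = [alpha * (1 - alpha) ** k for k in range(10)]
print(f"most recent sample: {weights[0]:.0%}")          # 20%
print(f"last 10 samples combined: {sum(weights):.0%}")  # ~89%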

Batch calculation variant: For after-the-fact analysis where all readings are available, first filter out obvious outliers using the simple heuristic, then compute a baseline time-per-character (a simple mean is sufficient here) from the remaining "clean" readings:

python

def batch_detect_afk(readings: list[tuple[float, int]]) -> list[bool]:
    """
    Batch AFK detection for after-the-fact analysis.
    readings: list of (time_seconds, char_count) tuples
    """
    # First pass: rough filter using simple heuristic
    def rough_filter(time, chars):
        return time <= max(5, min(chars * 2.0, 180))
    
    clean_readings = [(t, c) for t, c in readings if rough_filter(t, c) and c >= 2]
    
    if len(clean_readings) < 5:
        # Not enough clean data, use simple heuristic
        return [time > max(5, min(chars * 1.2, 120)) for time, chars in readings]
    
    # Compute baseline from clean readings
    time_per_char_values = [t / c for t, c in clean_readings]
    baseline = sum(time_per_char_values) / len(time_per_char_values)
    
    # Second pass: detect outliers
    results = []
    for time, chars in readings:
        threshold = max(5, min(chars * baseline * 3.0, 180))
        results.append(time > threshold)
    
    return results
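Example run with invented timings, where one obvious AFK reading sits among otherwise normal ones:

python

# One clearly anomalous reading (300 s) among normal ones.
readings = [(8.0, 20), (12.0, 35), (6.0, 18), (15.0, 40), (9.0, 25), (300.0, 30)]
flags = batch_detect_afk(readings)
print(flags)  # [False, False, False, False, False, True]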

Algorithm 3: Modified Z-Score with Median Absolute Deviation

This method provides the most statistically rigorous outlier detection. Unlike standard Z-scores (which assume normal distributions and are sensitive to outliers), the Modified Z-Score uses medians throughout, making it robust to the right-skewed distribution typical of reading times.

python

from collections import deque
import statistics

class RobustAFKDetector:
    def __init__(self, window_size: int = 20):
        self.window_size = window_size
        self.time_per_char_history = deque(maxlen=window_size)
        
        # Modified Z-score threshold (Iglewicz & Hoaglin recommend 3.5)
        self.THRESHOLD = 3.5
        self.K = 0.6745  # Scaling constant for MAD
        
        self.ABSOLUTE_MIN = 5
        self.ABSOLUTE_MAX = 180
        self.FALLBACK_TIME_PER_CHAR = 1.2
    
    def record_reading(self, time_seconds: float, char_count: int) -> None:
        """Record a confirmed reading (not AFK)."""
        if char_count < 2:
            return
        time_per_char = max(0.1, min(time_seconds / char_count, 5.0))
        self.time_per_char_history.append(time_per_char)
    
    def is_afk(self, time_seconds: float, char_count: int) -> bool:
        """Detect if current reading time is anomalous."""
        if char_count < 1:
            return time_seconds > self.ABSOLUTE_MIN
        
        # Hard limit check
        if time_seconds > self.ABSOLUTE_MAX:
            return True
        
        # Need minimum samples for statistical detection
        if len(self.time_per_char_history) < 5:
            threshold = char_count * self.FALLBACK_TIME_PER_CHAR * 3
            return time_seconds > max(self.ABSOLUTE_MIN, threshold)
        
        # Calculate MAD-based detection
        data = list(self.time_per_char_history)
        median = statistics.median(data)
        abs_deviations = [abs(x - median) for x in data]
        mad = statistics.median(abs_deviations)
        
        # Handle edge case: MAD = 0 (all values nearly identical)
        if mad < 0.01:
            mad = 0.1
        
        # Modified Z-score: M = 0.6745 × (x - median) / MAD
        time_per_char = time_seconds / char_count
        modified_z = self.K * (time_per_char - median) / mad
        
        return modified_z > self.THRESHOLD

Why Modified Z-Score? Reading time distributions are right-skewed: most readings cluster near the normal speed, with a long tail of increasingly rare AFK events. Standard Z-scores using mean and standard deviation are pulled by these outliers, causing "masking", where extreme values inflate the baseline and prevent detection of moderate outliers. The Modified Z-Score using median and MAD has a 50% breakdown point, meaning it remains accurate even if nearly half the data are outliers.

The 0.6745 constant makes the Modified Z-Score comparable to standard Z-scores under normal distributions (σ ≈ 1.4826 × MAD). The 3.5 threshold is the academic standard from Iglewicz and Hoaglin's 1993 research on robust outlier detection.
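A quick numeric check of that constant: for a standard normal distribution the MAD equals the 75th-percentile z-value, about 0.6745, whose reciprocal is the 1.4826 factor quoted above.

python

from statistics import NormalDist

# MAD of a standard normal = inverse CDF at 0.75 ≈ 0.6745; 1/0.6745 ≈ 1.4826.
mad_of_standard_normal = NormalDist().inv_cdf(0.75)
print(mad_of_standard_normal, 1 / mad_of_standard_normal)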

Recommended parameter values

| Parameter | Recommended Value | Justification |
| --- | --- | --- |
| MIN_THRESHOLD | 5 seconds | Reaction-time floor; handles clicking through very short dialogue |
| MAX_THRESHOLD | 120-180 seconds | Beyond this is definitively AFK; 2-3 minutes is generous |
| SECONDS_PER_CHAR | 1.2 (heuristic) | ≈50 char/min, i.e. a 3× processing buffer over a 150 char/min learner |
| EMA_ALPHA | 0.2 | 80/20 split between stability and responsiveness |
| ANOMALY_MULTIPLIER | 3.0 | Approximately 3 standard deviations from baseline |
| MODIFIED_Z_THRESHOLD | 3.5 | Academic standard for MAD-based outlier detection |
| WARM_UP_SAMPLES | 5-10 | Minimum for stable baseline estimation |
| WINDOW_SIZE | 20 | Rolling window captures recent reading patterns |

Critical edge cases to handle

Very short sentences (「ああ...」「はい」「うん」): These 2-5 character lines legitimately take 1-3 seconds. The MIN_THRESHOLD of 5 seconds provides a generous floor while still being much better than a 60-second fixed timer. Consider flagging times under 0.3× expected as "skipped" rather than read.

Very long passages (200+ characters): The MAX_THRESHOLD cap prevents unreasonable thresholds. Even slow learners shouldn't need more than 2-3 minutes for a single text box. If they do, they're likely AFK or the game has unusually long passages that should be segmented.

Dialogue choices: When the game presents multiple options, users pause to consider choices. If detectable (multiple text options, menu state), multiply threshold by 1.5-2×.

Voice-over pacing: When audio is playing, the minimum reading time equals audio duration—users can't advance faster than the voice. If audio duration is available: threshold = max(audio_duration × 1.5, normal_threshold).
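A sketch combining the two adjustments above (choice menus and voice-over). Whether the client can actually report a choice menu or the voice clip's duration is an assumption, so both inputs are optional here.

python

# Hypothetical adjustment helper; is_choice and audio_duration availability
# depend on what the hooking client exposes.
def adjusted_threshold(base_threshold: float,
                       is_choice: bool = False,
                       audio_duration: float | None = None) -> float:
    threshold = base_threshold
    if is_choice:
        threshold *= 1.5  # users pause to weigh dialogue options
    if audio_duration is not None:
        # Can't advance faster than the voice line plays.
        threshold = max(audio_duration * 1.5, threshold)
    return threshold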

Cold start / new game: During warm-up when adaptive methods lack data, the simple heuristic provides reasonable defaults. Store per-game baselines to accelerate future sessions with the same title.
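One possible sketch of per-game baseline persistence, assuming a simple JSON file keyed by game title; the filename and schema are illustrative only.

python

import json
from pathlib import Path

# Hypothetical storage location for learned baselines.
BASELINE_FILE = Path("afk_baselines.json")

def load_baseline(game_title: str) -> float | None:
    """Return a previously learned seconds-per-character baseline, if any."""
    if BASELINE_FILE.exists():
        return json.loads(BASELINE_FILE.read_text()).get(game_title)
    return None

def save_baseline(game_title: str, ema_time_per_char: float) -> None:
    """Persist the current baseline so future sessions skip the warm-up."""
    data = json.loads(BASELINE_FILE.read_text()) if BASELINE_FILE.exists() else {}
    data[game_title] = ema_time_per_char
    BASELINE_FILE.write_text(json.dumps(data, indent=2))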

"Too fast" detection: Times significantly below expected (less than 0.3× expected) indicate the user clicked through without reading. This is relevant for reading statistics accuracy but orthogonal to AFK detection.

python

def classify_reading(time_seconds: float, char_count: int, baseline_per_char: float) -> str:
    expected = char_count * baseline_per_char
    ratio = time_seconds / expected if expected > 0 else 0
    
    if ratio < 0.3:
        return "skipped"
    elif ratio > 3.0:
        return "afk"
    else:
        return "normal"

Final recommendation for implementation

Start with Algorithm 1 (character heuristic). It requires approximately 10 lines of code, zero configuration, no warm-up period, and handles the majority of cases correctly. The formula max(5, min(chars × 1.2, 120)) eliminates the fundamental problem of fixed timers ignoring text length.

Add Algorithm 2 (EMA adaptive) if users report inaccurate detection after extended use. This requires storing a single floating-point baseline per game and updating it after each valid reading. The warm-up period is brief (5-10 text boxes), and the improvement in accuracy is substantial for users whose reading speed differs significantly from the assumed default.

Consider Algorithm 3 (Modified Z-Score) only if you observe systematic accuracy problems with EMA—for instance, if users frequently have contaminated sessions where they were AFK multiple times, polluting the baseline. The MAD-based approach handles this gracefully but adds implementation complexity and requires maintaining a rolling window of historical readings.

For batch/after-the-fact calculation as specified in the requirements, the two-pass batch detection variant of Algorithm 2 is ideal: use the simple heuristic to identify clean readings, compute a baseline from those, then classify all readings against that baseline. This approach combines the robustness of having all data available with the simplicity of the character-based method.