Automatic AFK detection for visual novel reading statistics
A simple character-based heuristic outperforms fixed timers and requires zero configuration. The formula threshold = max(5, min(chars × 1.2, 120)) handles 80%+ of cases without any user tuning, while an adaptive variant using exponential moving averages can personalize detection after a brief warm-up period. For applications requiring maximum robustness, the Modified Z-Score method using Median Absolute Deviation provides statistically rigorous outlier detection that remains stable even with contaminated data.
The core problem with fixed AFK timers like the current 60-second approach is that they ignore text length entirely. A 4-character line like 「ああ...」 legitimately takes 2-3 seconds to read, making a 60-second threshold absurdly generous: it would count up to 57 seconds of idle time toward reading statistics. Conversely, a 200-character passage might genuinely require 90+ seconds for a learner, yet would be incorrectly flagged as AFK.
Three recommended algorithms ranked by complexity
Each approach below solves the same problem with increasing sophistication. Choose based on your implementation constraints and accuracy requirements.
Algorithm 1 (character-based heuristic) requires no historical data and works immediately. Algorithm 2 (EMA adaptive baseline) learns individual reading speeds after 5-10 text boxes. Algorithm 3 (Modified Z-Score with MAD) provides the most statistically robust detection but requires maintaining a rolling history window.
| Approach | Lines of Code | Accuracy | Adapts to User | Cold Start |
|---|---|---|---|---|
| Character Heuristic | ~10 | Good (80%) | No | Instant |
| EMA Adaptive | ~40 | Very Good (90%) | Yes | 5-10 samples |
| Modified Z-Score | ~60 | Excellent (95%) | Yes | 10-20 samples |
Algorithm 1: Character-based heuristic (recommended starting point)
This approach requires zero configuration and no warm-up period. It works by scaling the AFK threshold proportionally to text length, bounded by sensible minimum and maximum values.
```python
def is_afk(time_seconds: float, char_count: int) -> bool:
    """
    Simple heuristic that works without any learning.
    Returns True if the reading time indicates the user was likely AFK.
    """
    # Minimum threshold: even "ああ" needs reaction time
    MIN_THRESHOLD = 5
    # Maximum threshold: beyond this is definitely AFK
    MAX_THRESHOLD = 120
    # Time allowance per character (accounts for reading + processing)
    # 1.2 sec/char ≈ learner reading at 50 char/min + thinking time
    SECONDS_PER_CHAR = 1.2

    threshold = max(MIN_THRESHOLD, min(char_count * SECONDS_PER_CHAR, MAX_THRESHOLD))
    return time_seconds > threshold
```
Why these specific values? Japanese reading speeds vary dramatically: native speakers read 500-1,200 characters per minute (8-20 char/sec), while intermediate learners read 150-300 char/min (2.5-5 char/sec). The 1.2 seconds per character accommodates the slowest learners (~50 char/min), which builds in roughly a 3× buffer over the intermediate pace for dictionary lookups, re-reading, and processing time. The 5-second minimum handles reaction time for clicking through dialogue, while the 120-second cap prevents absurdly long thresholds for text walls.
Edge case behavior:
- 「ああ...」(4 chars): threshold = max(5, 4.8) = 5 seconds
- Standard dialogue (30 chars): threshold = 36 seconds
- Long narration (100 chars): threshold = 120 seconds (capped)
- Very long passage (200 chars): threshold = 120 seconds (capped)
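To make the scaling concrete, here is a small sanity check of those edge cases; it assumes the is_afk function from the listing above is in scope and uses 45 seconds of idle time as an arbitrary test value:
```python
# Sanity check of the edge cases above, assuming is_afk() from the previous listing
for chars in (4, 30, 100, 200):
    # 45 seconds elapsed: AFK for short lines, plausible for long narration
    status = "AFK" if is_afk(45, chars) else "reading"
    print(f"{chars} chars after 45 s -> {status}")
# 4 and 30 chars are flagged (thresholds 5 s and 36 s);
# 100 and 200 chars are not (thresholds sit at the 120 s cap)
```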
Algorithm 2: Exponential moving average adaptive threshold
This approach learns the user's personal reading speed over time, providing increasingly accurate detection as more data accumulates. It falls back to the simple heuristic during the warm-up period.
```python
class AdaptiveAFKDetector:
    def __init__(self):
        self.alpha = 0.2  # EMA smoothing factor
        self.ema_time_per_char = None  # Learned baseline
        self.sample_count = 0
        # Warm-up settings
        self.MIN_SAMPLES = 5
        self.FALLBACK_TIME_PER_CHAR = 1.2
        # Detection settings
        self.ANOMALY_MULTIPLIER = 3.0
        self.ABSOLUTE_MIN = 5
        self.ABSOLUTE_MAX = 180

    def record_reading(self, time_seconds: float, char_count: int) -> None:
        """Call after the user advances to the next line (confirmed not AFK)."""
        if char_count < 2:  # Skip very short lines
            return
        time_per_char = time_seconds / char_count
        # Clamp extreme values to avoid polluting the baseline
        time_per_char = max(0.1, min(time_per_char, 5.0))
        if self.ema_time_per_char is None:
            self.ema_time_per_char = time_per_char
        else:
            # EMA: new = α × current + (1 - α) × old
            self.ema_time_per_char = (
                self.alpha * time_per_char +
                (1 - self.alpha) * self.ema_time_per_char
            )
        self.sample_count += 1

    def is_afk(self, time_seconds: float, char_count: int) -> bool:
        """Returns True if the reading time indicates AFK."""
        if self.sample_count < self.MIN_SAMPLES:
            # Warm-up: use a generous fallback
            base = self.FALLBACK_TIME_PER_CHAR
        else:
            base = self.ema_time_per_char
        threshold = char_count * base * self.ANOMALY_MULTIPLIER
        threshold = max(self.ABSOLUTE_MIN, min(threshold, self.ABSOLUTE_MAX))
        return time_seconds > threshold
```
Why EMA over a simple moving average? EMA adapts faster to changes in reading speed (a user improving over time or switching between easy and hard games), requires no fixed-size buffer, and uses a single recursive update formula. The α = 0.2 value means the most recent reading carries ~20% weight while the accumulated baseline carries ~80%, providing stability while still responding to sustained speed changes.
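Here is a minimal usage sketch; the on_text_advanced hook and its call site are hypothetical, and the AdaptiveAFKDetector class above is assumed to be in scope:
```python
detector = AdaptiveAFKDetector()

def on_text_advanced(elapsed_seconds: float, text: str) -> bool:
    """Hypothetical hook fired when the user advances to the next text box.
    Returns True if the elapsed interval should be excluded from reading stats."""
    chars = len(text)
    if detector.is_afk(elapsed_seconds, chars):
        return True  # Likely AFK: exclude from statistics, do not update the baseline
    detector.record_reading(elapsed_seconds, chars)  # Confirmed reading: update the EMA
    return False
```
Only confirmed (non-AFK) readings feed record_reading, which keeps a single AFK episode from dragging the learned baseline upward.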
Batch calculation variant: For after-the-fact analysis where all readings are available, first filter out obvious outliers using the simple heuristic, then compute a baseline (mean seconds per character) from the remaining "clean" readings:
```python
def batch_detect_afk(readings: list[tuple[float, int]]) -> list[bool]:
    """
    Batch AFK detection for after-the-fact analysis.
    readings: list of (time_seconds, char_count) tuples
    """
    # First pass: rough filter using the simple heuristic
    def rough_filter(time, chars):
        return time <= max(5, min(chars * 2.0, 180))

    clean_readings = [(t, c) for t, c in readings if rough_filter(t, c) and c >= 2]
    if len(clean_readings) < 5:
        # Not enough clean data; fall back to the simple heuristic
        return [time > max(5, min(chars * 1.2, 120)) for time, chars in readings]

    # Compute the baseline from clean readings
    time_per_char_values = [t / c for t, c in clean_readings]
    baseline = sum(time_per_char_values) / len(time_per_char_values)

    # Second pass: detect outliers against the learned baseline
    results = []
    for time, chars in readings:
        threshold = max(5, min(chars * baseline * 3.0, 180))
        results.append(time > threshold)
    return results
```
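A small illustration with made-up readings, assuming batch_detect_afk from the listing above:
```python
# (time_seconds, char_count) pairs from one hypothetical session
readings = [
    (12.0, 28),   # normal dialogue
    (3.0, 6),     # short line
    (240.0, 30),  # user stepped away mid-line
    (45.0, 110),  # long narration
    (20.0, 40),
    (8.0, 15),
]
flags = batch_detect_afk(readings)
print(flags)  # [False, False, True, False, False, False]
# Total reading time = sum of times where the flag is False
```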
Algorithm 3: Modified Z-Score with Median Absolute Deviation
This method provides the most statistically rigorous outlier detection. Unlike standard Z-scores (which assume normal distributions and are sensitive to outliers), the Modified Z-Score uses medians throughout, making it robust to the right-skewed distribution typical of reading times.
```python
from collections import deque
import statistics

class RobustAFKDetector:
    def __init__(self, window_size: int = 20):
        self.window_size = window_size
        self.time_per_char_history = deque(maxlen=window_size)
        # Modified Z-score threshold (Iglewicz & Hoaglin recommend 3.5)
        self.THRESHOLD = 3.5
        self.K = 0.6745  # Scaling constant for MAD
        self.ABSOLUTE_MIN = 5
        self.ABSOLUTE_MAX = 180
        self.FALLBACK_TIME_PER_CHAR = 1.2

    def record_reading(self, time_seconds: float, char_count: int) -> None:
        """Record a confirmed reading (not AFK)."""
        if char_count < 2:
            return
        time_per_char = max(0.1, min(time_seconds / char_count, 5.0))
        self.time_per_char_history.append(time_per_char)

    def is_afk(self, time_seconds: float, char_count: int) -> bool:
        """Detect whether the current reading time is anomalous."""
        if char_count < 1:
            return time_seconds > self.ABSOLUTE_MIN
        # Hard limit check
        if time_seconds > self.ABSOLUTE_MAX:
            return True
        # Need a minimum number of samples for statistical detection
        if len(self.time_per_char_history) < 5:
            threshold = char_count * self.FALLBACK_TIME_PER_CHAR * 3
            return time_seconds > max(self.ABSOLUTE_MIN, threshold)
        # MAD-based detection
        data = list(self.time_per_char_history)
        median = statistics.median(data)
        abs_deviations = [abs(x - median) for x in data]
        mad = statistics.median(abs_deviations)
        # Handle edge case: MAD = 0 (all values nearly identical)
        if mad < 0.01:
            mad = 0.1
        # Modified Z-score: M = 0.6745 × (x - median) / MAD
        time_per_char = time_seconds / char_count
        modified_z = self.K * (time_per_char - median) / mad
        return modified_z > self.THRESHOLD
```
Why Modified Z-Score? Reading time distributions are right-skewed: most readings cluster near the normal speed, with a long tail of increasingly rare AFK events. Standard Z-scores using mean and standard deviation are pulled by these outliers, causing "masking," where extreme values inflate the baseline and prevent detection of moderate outliers. The Modified Z-Score using median and MAD has a 50% breakdown point, meaning it remains accurate even if half the data are outliers.
The 0.6745 constant makes the Modified Z-Score comparable to standard Z-scores under normal distributions (σ ≈ 1.4826 × MAD). The 3.5 threshold is the academic standard from Iglewicz and Hoaglin's 1993 research on robust outlier detection.
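A quick worked example with made-up numbers shows these constants in action; it assumes the RobustAFKDetector class above is in scope:
```python
detector = RobustAFKDetector()

# Seed the window with ten ordinary readings around 0.5 s/char (invented values)
for tpc in [0.45, 0.5, 0.55, 0.48, 0.52, 0.5, 0.47, 0.53, 0.49, 0.51]:
    detector.record_reading(tpc * 30, 30)  # 30-character lines

# Median ≈ 0.5 s/char, MAD ≈ 0.02. A 30-character line that took 120 s gives
# time_per_char = 4.0 and modified z ≈ 0.6745 * (4.0 - 0.5) / 0.02 ≈ 118, far above 3.5.
print(detector.is_afk(120, 30))  # True
print(detector.is_afk(18, 30))   # False: 0.6 s/char scores about 3.4, just under the threshold
```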
Recommended default parameters with justification
| Parameter | Recommended Value | Justification |
|---|---|---|
| MIN_THRESHOLD | 5 seconds | Reaction time floor; handles clicking through very short dialogue |
| MAX_THRESHOLD | 120-180 seconds | Beyond this is definitively AFK; 2-3 minutes is generous |
| SECONDS_PER_CHAR | 1.2 for heuristic | Accommodates 50 char/min readers with 3× processing buffer |
| EMA_ALPHA | 0.2 | 80/20 split between stability and responsiveness |
| ANOMALY_MULTIPLIER | 3.0 | Approximately 3 standard deviations from baseline |
| MODIFIED_Z_THRESHOLD | 3.5 | Academic standard for MAD-based outlier detection |
| WARM_UP_SAMPLES | 5-10 | Minimum for stable baseline estimation |
| WINDOW_SIZE | 20 | Rolling window captures recent reading patterns |
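If it helps to keep these defaults in one place, they can be grouped into a small config object; the class and field names below are illustrative rather than part of any existing API:
```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AFKConfig:
    # Recommended defaults from the table above (names are illustrative)
    min_threshold_s: float = 5.0
    max_threshold_s: float = 120.0      # 180.0 is reasonable for the adaptive variants
    seconds_per_char: float = 1.2
    ema_alpha: float = 0.2
    anomaly_multiplier: float = 3.0
    modified_z_threshold: float = 3.5
    warm_up_samples: int = 5
    window_size: int = 20
```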
Critical edge cases to handle
Very short sentences (「ああ...」「はい」「うん」): These 2-5 character lines legitimately take 1-3 seconds. The MIN_THRESHOLD of 5 seconds provides a generous floor while still being much better than a 60-second fixed timer. Consider flagging times under 0.3× expected as "skipped" rather than read.
Very long passages (200+ characters): The MAX_THRESHOLD cap prevents unreasonable thresholds. Even slow learners shouldn't need more than 2-3 minutes for a single text box. If they do, they're likely AFK or the game has unusually long passages that should be segmented.
Dialogue choices: When the game presents multiple options, users pause to consider choices. If detectable (multiple text options, menu state), multiply threshold by 1.5-2×.
Voice-over pacing: When audio is playing, the minimum reading time equals audio duration—users can't advance faster than the voice. If audio duration is available: threshold = max(audio_duration × 1.5, normal_threshold).
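A sketch of how the choice and voice-over adjustments might compose with a base threshold; has_choices and audio_duration are assumed signals from the game hook and may not be available in every title:
```python
def adjusted_threshold(base_threshold: float,
                       has_choices: bool = False,
                       audio_duration: float | None = None) -> float:
    """Apply the choice and voice-over adjustments described above.
    Both extra inputs are hypothetical signals from the text hook."""
    threshold = base_threshold
    if has_choices:
        threshold *= 1.5  # users pause to weigh dialogue options
    if audio_duration is not None:
        # Users cannot advance faster than the voiced line plays
        threshold = max(audio_duration * 1.5, threshold)
    return threshold
```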
Cold start / new game: During warm-up when adaptive methods lack data, the simple heuristic provides reasonable defaults. Store per-game baselines to accelerate future sessions with the same title.
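One lightweight way to persist per-game baselines, assuming the EMA detector's learned value is available at session end (the file name and JSON layout are arbitrary choices for illustration):
```python
import json
from pathlib import Path

BASELINE_FILE = Path("afk_baselines.json")  # arbitrary location

def load_baseline(game_title: str) -> float | None:
    """Return the stored seconds-per-character baseline for a game, if any."""
    if not BASELINE_FILE.exists():
        return None
    return json.loads(BASELINE_FILE.read_text(encoding="utf-8")).get(game_title)

def save_baseline(game_title: str, ema_time_per_char: float) -> None:
    """Persist the learned baseline so the next session can skip the warm-up."""
    data = json.loads(BASELINE_FILE.read_text(encoding="utf-8")) if BASELINE_FILE.exists() else {}
    data[game_title] = ema_time_per_char
    BASELINE_FILE.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
```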
"Too fast" detection: Times significantly below expected (less than 0.3× expected) indicate the user clicked through without reading. This is relevant for reading statistics accuracy but orthogonal to AFK detection.
```python
def classify_reading(time_seconds: float, char_count: int, baseline_per_char: float) -> str:
    """Classify one reading interval relative to the learned seconds-per-character baseline."""
    expected = char_count * baseline_per_char
    ratio = time_seconds / expected if expected > 0 else 0
    if ratio < 0.3:
        return "skipped"
    elif ratio > 3.0:
        return "afk"
    else:
        return "normal"
```
Final recommendation for implementation
Start with Algorithm 1 (character heuristic). It requires approximately 10 lines of code, zero configuration, no warm-up period, and handles the majority of cases correctly. The formula max(5, min(chars × 1.2, 120)) eliminates the fundamental problem of fixed timers ignoring text length.
Add Algorithm 2 (EMA adaptive) if users report inaccurate detection after extended use. This requires storing a single floating-point baseline per game and updating it after each valid reading. The warm-up period is brief (5-10 text boxes), and the improvement in accuracy is substantial for users whose reading speed differs significantly from the assumed default.
Consider Algorithm 3 (Modified Z-Score) only if you observe systematic accuracy problems with EMA—for instance, if users frequently have contaminated sessions where they were AFK multiple times, polluting the baseline. The MAD-based approach handles this gracefully but adds implementation complexity and requires maintaining a rolling window of historical readings.
For batch/after-the-fact calculation as specified in the requirements, the two-pass batch detection variant of Algorithm 2 is ideal: use the simple heuristic to identify clean readings, compute a baseline from those, then classify all readings against that baseline. This approach combines the robustness of having all data available with the simplicity of the character-based method.