Changes in version 0.3.0 (2026-03-13)                  

  - Language auto-detection: transcribe() now defaults to language =
    NULL, which detects the spoken language from the audio before
    decoding. New exported function detect_language() for standalone
    language identification. Breaking: previous default was language =
    "en". Code relying on the default now auto-detects instead of
    assuming English. Pass language = "en" explicitly to restore old
    behavior.
  - Segment-level and word-level timestamps via DTW alignment
  - Beam search decoding with temperature sampling and fallback
  - SDPA attention (FlashAttention on GPU)
  - whisper_pipeline() for cached model reuse across multiple
    transcriptions
  - Hardcoded special token table (eliminates added_tokens.json
    download)
  - Fixed invalid multibyte string crash in BPE decoder
  - Fixed DTW boundary guards and seek loop in transcribe_chunk()

                 Changes in version 0.1.0 (2026-02-06)                  

  - Initial CRAN submission
  - Native R torch implementation of OpenAI Whisper
  - Support for all model sizes: tiny, base, small, medium, large-v3
  - Automatic model download from HuggingFace
  - Model-specific special token handling for large-v3 compatibility
  - KV caching for efficient autoregressive decoding
  - Long audio chunking for files longer than 30 seconds
  - Optional timestamp and segment extraction