Package 'stt.api'

Title: 'OpenAI' Compatible Speech-to-Text API Client
Description: A minimal-dependency R client for 'OpenAI'-compatible speech-to-text APIs (see <https://platform.openai.com/docs/api-reference/audio>) with optional local fallbacks. Supports 'OpenAI', local servers, and the 'whisper' package for local transcription.
Authors: Troy Hernandez [aut, cre] (ORCID: <https://orcid.org/0009-0005-4248-604X>), Cornball AI [cph]
Maintainer: Troy Hernandez <[email protected]>
License: MIT + file LICENSE
Version: 0.2.1
Built: 2026-05-19 10:28:37 UTC
Source: https://github.com/cornball-ai/stt.api


Clear native whisper model cache

Description

Removes cached native whisper models from memory. Call this to free GPU memory or RAM once batch processing is complete.

Usage

clear_native_whisper_cache()

Value

No return value, called for side effects (frees memory by removing cached models and triggers garbage collection).

Examples

clear_native_whisper_cache()
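
The batch pattern mentioned in the description could look roughly like the sketch below. batch_transcribe() is a hypothetical helper, not part of the package; it takes the transcription and cleanup functions as arguments so the sketch runs without 'stt.api' installed, and the fake_* functions are stand-ins for illustration only.

```r
# Hypothetical helper: transcribe several files, then always run a cleanup step.
# In real use, `transcribe` would be stt.api::stt and `cleanup` would be
# stt.api::clear_native_whisper_cache.
batch_transcribe <- function(files, transcribe, cleanup) {
  # on.exit() runs the cleanup even if one transcription throws an error
  on.exit(cleanup(), add = TRUE)
  vapply(files, function(f) transcribe(f)$text, character(1))
}

# Stand-ins so the sketch is runnable on its own
cleaned <- FALSE
fake_stt <- function(file) list(text = paste("transcript of", file))
fake_cleanup <- function() cleaned <<- TRUE

texts <- batch_transcribe(c("a.wav", "b.wav"), fake_stt, fake_cleanup)
```

Registering the cleanup with on.exit() means the cache is released once per batch rather than once per file, and also when the batch stops early.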

Set the API Base URL

Description

Sets the base URL for OpenAI-compatible STT endpoints.

Usage

set_stt_base(url)

Arguments

url

Character string. The base URL (e.g., "http://localhost:4123" or "https://api.openai.com").

Value

Invisibly returns the previous base URL setting.

Examples

set_stt_base("http://localhost:4123")
getOption("stt.api_base")
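
To switch the base URL only temporarily, something like the following sketch may help. It uses base R only and assumes that set_stt_base() stores the URL in the "stt.api_base" option, as the getOption() call above suggests; with_stt_base() is a hypothetical helper, not a package function.

```r
# Hypothetical helper: evaluate `code` with a temporary base URL.
# Assumption: the package reads the URL from the "stt.api_base" option.
with_stt_base <- function(url, code) {
  old <- options(stt.api_base = url)  # options() returns the prior values
  on.exit(options(old), add = TRUE)   # restore even if `code` errors
  force(code)                         # `code` is lazy, so it is evaluated here
}

before <- getOption("stt.api_base")
seen <- with_stt_base("http://localhost:4123", getOption("stt.api_base"))
after <- getOption("stt.api_base")
```

After the call, the option holds whatever it held before, including being unset.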

Set the API Key

Description

Sets the API key for hosted STT services (e.g., OpenAI). Local servers typically ignore this.

Usage

set_stt_key(key)

Arguments

key

Character string. The API key.

Value

Invisibly returns the previous API key setting.

Examples

set_stt_key("test-key-123")
getOption("stt.api_key")
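
In scripts it is usually safer to read the key from an environment variable than to hard-code it. The sketch below shows one way to fail early when the variable is missing; key_from_env() and the STT_DEMO_KEY variable are hypothetical names used for illustration.

```r
# Hypothetical helper: fetch an API key from an environment variable and
# stop with a clear message if it is missing or empty.
key_from_env <- function(var = "OPENAI_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Demonstration with a throwaway variable
Sys.setenv(STT_DEMO_KEY = "test-key-123")
key <- key_from_env("STT_DEMO_KEY")
Sys.unsetenv("STT_DEMO_KEY")

# With the package loaded, the result would be passed on:
# set_stt_key(key_from_env())
```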

Speech to Text

Description

Convert an audio file to text using a local whisper backend or an OpenAI-compatible API.

Usage

stt(file, model = NULL, language = NULL,
    response_format = c("json", "text", "verbose_json"),
    backend = c("auto", "whisper", "openai"), prompt = NULL)

Arguments

file

Path to the audio file to convert.

model

Model name to use for transcription. For the API backend, the name is passed through unchanged (e.g., "whisper-1"). For the whisper backend, it is the model size (e.g., "tiny", "base", "small", "medium", "large"). If NULL, the backend's default is used.

language

Language code (e.g., "en", "es", "fr"). Optional hint to improve transcription accuracy.

response_format

Response format for the API backend. One of "json" (default), "text", or "verbose_json". Ignored by the whisper backend.

backend

Which backend to use: "auto" (default), "whisper", or "openai". In "auto" mode, the whisper backend is tried first, then the OpenAI-compatible API (if configured).

prompt

Optional text to guide the transcription. For the API backend, this is passed as initial_prompt to help with the spelling of names, acronyms, or domain-specific terms. Ignored by the whisper backend.

Value

A list with components:

text

The transcribed text as a single string.

segments

A data.frame of segments with timing info, or NULL.

language

The detected or specified language code.

backend

Which backend was used ("api" or "whisper").

raw

The raw response from the backend.

Examples

## Not run: 
# Using OpenAI API
set_stt_base("https://api.openai.com")
set_stt_key(Sys.getenv("OPENAI_API_KEY"))
result <- stt("speech.wav", model = "whisper-1")
result$text

# Using local server
set_stt_base("http://localhost:4123")
result <- stt("speech.wav")

## End(Not run)
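
Because segments can be NULL, code that consumes the result should check for it before treating it as a data.frame. The sketch below works on a mock list shaped like the documented return value; summarise_stt() is a hypothetical helper and the mock contents are invented for illustration.

```r
# Hypothetical helper: one-line summary of a result shaped like the
# documented return value of stt().
summarise_stt <- function(result) {
  # `segments` is either a data.frame or NULL, so guard before nrow()
  n_seg <- if (is.null(result$segments)) 0L else nrow(result$segments)
  sprintf("[%s] %d segment(s), %d character(s)",
          result$backend, n_seg, nchar(result$text))
}

# Mock result for demonstration only (not real output)
mock <- list(
  text = "hello world",
  segments = NULL,
  language = "en",
  backend = "api",
  raw = NULL
)
summary_line <- summarise_stt(mock)
```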

Check STT Backend Health

Description

Checks whether a transcription backend is available and working.

Usage

stt_health()

Value

A list with components:

ok

Logical. TRUE if a backend is available.

backend

Character. The available backend ("api" or "whisper"), or NULL if none available.

message

Character. Status message with details.

Examples

## Not run: 
h <- stt_health()
if (h$ok) {
  message("STT ready via ", h$backend)
}

## End(Not run)
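
In non-interactive scripts it can be useful to turn a failed health check into an error before any audio is processed. The sketch below assumes only the documented ok, backend, and message components; require_stt() is a hypothetical helper and the mock lists stand in for real stt_health() output.

```r
# Hypothetical helper: stop early when no backend is available.
require_stt <- function(h) {
  if (!isTRUE(h$ok)) {
    stop("No STT backend available: ", h$message, call. = FALSE)
  }
  invisible(h$backend)
}

# Mock health results for demonstration only
healthy <- list(ok = TRUE, backend = "whisper", message = "whisper available")
unhealthy <- list(ok = FALSE, backend = NULL, message = "nothing configured")

used <- require_stt(healthy)
failure <- tryCatch(require_stt(unhealthy), error = function(e) conditionMessage(e))

# With the package loaded: require_stt(stt_health())
```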