| Title: | 'OpenAI' Compatible Speech-to-Text API Client |
|---|---|
| Description: | A minimal-dependency R client for 'OpenAI'-compatible speech-to-text APIs (see <https://platform.openai.com/docs/api-reference/audio>) with optional local fallbacks. Supports 'OpenAI', local servers, and the 'whisper' package for local transcription. |
| Authors: | Troy Hernandez [aut, cre] (ORCID: <https://orcid.org/0009-0005-4248-604X>), Cornball AI [cph] |
| Maintainer: | Troy Hernandez <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.1 |
| Built: | 2026-05-19 10:28:37 UTC |
| Source: | https://github.com/cornball-ai/stt.api |
Removes cached native whisper models from memory. Call this to free GPU/RAM after batch processing is complete.
clear_native_whisper_cache()
No return value, called for side effects (frees memory by removing cached models and triggers garbage collection).
clear_native_whisper_cache()
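The note about freeing memory after batch processing can be sketched as a small wrapper. This is an illustration only: `transcribe_batch` is a hypothetical helper, not part of the package, and its default arguments assume the package is installed under the name `stt.api` (inferred from the source URL). The transcriber and the cleanup step are passed in as functions, so the pattern can be tried without a model or audio file.

```r
# Sketch: transcribe several files, then release cached models even if a
# file fails. `transcribe_batch` is hypothetical; the defaults assume the
# package is installed as `stt.api`.
transcribe_batch <- function(files,
                             transcribe = function(f) stt.api::stt(f, backend = "whisper"),
                             cleanup = function() stt.api::clear_native_whisper_cache()) {
  # Register cleanup first so it runs whether the loop finishes or errors.
  on.exit(cleanup(), add = TRUE)
  results <- lapply(files, function(f) {
    tryCatch(
      transcribe(f),
      error = function(e) {
        # Report the failure and keep going; the failed entry stays NULL.
        warning("Skipping ", f, ": ", conditionMessage(e), call. = FALSE)
        NULL
      }
    )
  })
  names(results) <- files
  results
}
```

Registering the cleanup with `on.exit()` means the cache is cleared once, at the end of the batch, rather than after every file, so the model stays loaded while it is still being reused.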
Sets the base URL for OpenAI-compatible STT endpoints.
set_stt_base(url)

| Argument | Description |
|---|---|
| url | Character string. The base URL (e.g., "http://localhost:4123" or "https://api.openai.com"). |
Invisibly returns the previous value.
set_stt_base("http://localhost:4123")
getOption("stt.api_base")
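Because the setter returns the previous value, a base URL can be switched temporarily and then put back. The sketch below shows that save-and-restore pattern using only base R `options()`. It assumes the base URL is stored in the `stt.api_base` option, as the example above suggests. `with_stt_base` is a hypothetical helper, not a package function.

```r
# Sketch: evaluate `code` with a temporary base URL, then restore the old one.
# Assumes the base URL lives in the "stt.api_base" option.
with_stt_base <- function(url, code) {
  old <- options(stt.api_base = url)  # options() returns the previous values
  on.exit(options(old), add = TRUE)   # restore on exit, even after an error
  force(code)                         # `code` is lazy, so it runs under the new URL
}

# Usage: the option is only changed while the expression is evaluated.
with_stt_base("http://localhost:4123", getOption("stt.api_base"))
```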
Sets the API key for hosted STT services (e.g., OpenAI). Local servers typically ignore this.
set_stt_key(key)

| Argument | Description |
|---|---|
| key | Character string. The API key. |
Invisibly returns the previous value.
set_stt_key("test-key-123")
getOption("stt.api_key")
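In practice a key is usually read from the environment rather than written into a script. A minimal sketch of that follows. `stt_key_from_env` is a hypothetical helper, not a package function. The variable name `OPENAI_API_KEY` is a common convention, not something the package requires.

```r
# Hypothetical helper: read an API key from an environment variable and
# fail early with a clear message if it is missing or empty.
stt_key_from_env <- function(var = "OPENAI_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Usage, assuming the package is attached:
# set_stt_key(stt_key_from_env())
```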
Convert an audio file to text using a local whisper backend or an OpenAI-compatible API.
stt(file, model = NULL, language = NULL,
    response_format = c("json", "text", "verbose_json"),
    backend = c("auto", "whisper", "openai"), prompt = NULL)

| Argument | Description |
|---|---|
| file | Path to the audio file to convert. |
| model | Model name to use for transcription. For API backends, this is passed directly (e.g., "whisper-1"). For whisper, this is the model size (e.g., "tiny", "base", "small", "medium", "large"). If NULL, uses the backend's default. |
| language | Language code (e.g., "en", "es", "fr"). Optional hint to improve transcription accuracy. |
| response_format | Response format for the API backend. One of "json", "text", or "verbose_json". Ignored for the whisper backend. |
| backend | Which backend to use: "auto" (default), "whisper", or "openai". Auto mode tries whisper first, then the OpenAI-compatible API (if configured). |
| prompt | Optional text to guide the transcription. For the API backend, this is passed as initial_prompt to help with the spelling of names, acronyms, or domain-specific terms. Ignored for the whisper backend. |
A list with components:
The transcribed text as a single string.
A data.frame of segments with timing info, or NULL.
The detected or specified language code.
Which backend was used ("api" or "whisper").
The raw response from the backend.
## Not run:
# Using OpenAI API
set_stt_base("https://api.openai.com")
set_stt_key(Sys.getenv("OPENAI_API_KEY"))
result <- stt("speech.wav", model = "whisper-1")
result$text

# Using local server
set_stt_base("http://localhost:4123")
result <- stt("speech.wav")
## End(Not run)
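The documented order for `backend = "auto"` (whisper first, then the API if configured) can be pictured with the rough sketch below. It is not the package's actual implementation. `choose_backend` is a hypothetical function, and it assumes "configured" means a non-empty base URL in the `stt.api_base` option. The labels "whisper" and "api" are the ones the return value uses to report the backend.

```r
# Sketch of the documented "auto" order; NOT the package's actual code.
# `whisper_available` would come from some availability check (an assumption);
# "configured" is assumed to mean a non-empty base URL option.
choose_backend <- function(whisper_available,
                           api_base = getOption("stt.api_base")) {
  if (isTRUE(whisper_available)) {
    return("whisper")                 # the local backend is tried first
  }
  if (is.character(api_base) && length(api_base) == 1L && nzchar(api_base)) {
    return("api")                     # fall back to the configured API
  }
  stop("No transcription backend available", call. = FALSE)
}

# Usage with explicit inputs:
choose_backend(whisper_available = FALSE, api_base = "http://localhost:4123")
```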
Checks whether a transcription backend is available and working.
stt_health()
A list with components:
Logical. TRUE if a backend is available.
Character. The available backend ("api" or "whisper"), or NULL if none available.
Character. Status message with details.
## Not run:
h <- stt_health()
if (h$ok) {
  message("STT ready via ", h$backend)
}
## End(Not run)
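A health result can also act as a guard at the top of a script, so that a long job stops before any audio is processed. The sketch below is one way to do that. `require_stt` is a hypothetical helper, not a package function. It uses only the `ok` and `backend` components shown in the example above, and it takes the health list as an argument so it can be tried with a hand-built list.

```r
# Hypothetical guard: stop early unless a health result reports a usable
# backend. Expects a list with `ok` and `backend` components.
require_stt <- function(health) {
  if (!isTRUE(health$ok)) {
    stop("No speech-to-text backend is available", call. = FALSE)
  }
  invisible(health$backend)  # return the backend name for optional logging
}

# Usage, assuming the package is attached:
# backend <- require_stt(stt_health())
```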