Title: | Client for Microsoft's 'Cognitive Services Text to Speech REST' API |
---|---|
Description: | Convert text into synthesized speech and get a list of supported voices for a region. Microsoft's 'Cognitive Services Text to Speech REST' API <https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-text-to-speech?tabs=streaming> supports neural text to speech voices, which support specific languages and dialects that are identified by locale. |
Authors: | Howard Baek [aut, cre, cph] , John Muschelli [aut] |
Maintainer: | Howard Baek <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0.9000 |
Built: | 2024-10-12 05:11:09 UTC |
Source: | https://github.com/fhdsl/conrad |
If the 'voice' argument is not supplied to ms_synthesize(), obtain the full list of voices for a specified region and, by default, use the first voice in that list.
ms_choose_voice( api_key = NULL, gender = c("Female", "Male"), language = "en-US", region = "westus" )
api_key |
Microsoft Azure Cognitive Services API key |
gender |
Sex of speaker |
language |
Language to be spoken |
region |
Subscription region for API key. For more info, see https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/regions |
List of gender, language, and full voice name
## Not run:
# Default voice whose gender is Female, language is English, and region is 'westus'
ms_choose_voice(gender = "Female", language = "en-US", region = "westus")
## End(Not run)
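The default selection described above (filter the voice list, then take the first match) can be sketched without any API call. This is only an illustration: `choose_voice_sketch()`, the `voices` table, its column names, and the voice names in it are all hypothetical and are not taken from the package or from the service.

```r
# Hypothetical stand-in for a voice list; the column names and the
# voice names are assumptions made for this sketch only.
voices <- data.frame(
  name = c(
    "Microsoft Server Speech Text to Speech Voice (en-US, ExampleVoiceA)",
    "Microsoft Server Speech Text to Speech Voice (en-US, ExampleVoiceB)",
    "Microsoft Server Speech Text to Speech Voice (fr-FR, ExampleVoiceC)"
  ),
  gender = c("Male", "Female", "Male"),
  locale = c("en-US", "en-US", "fr-FR"),
  stringsAsFactors = FALSE
)

# Keep the voices that match gender and language, then return the first one.
choose_voice_sketch <- function(voices, gender = c("Female", "Male"),
                                language = "en-US") {
  gender <- match.arg(gender)
  keep <- voices$gender == gender & voices$locale == language
  if (!any(keep)) {
    stop("No voice matches the requested gender and language.")
  }
  first <- voices[which(keep)[1], ]
  list(gender = first$gender, language = first$locale, full_name = first$name)
}

picked <- choose_voice_sketch(voices, gender = "Female", language = "en-US")
print(picked$full_name)
```

Because `gender` goes through `match.arg()`, leaving it at its default selects "Female", which mirrors the `c("Female", "Male")` default shown in the usage.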
Create Speech Synthesis Markup Language (SSML)
ms_create_ssml( script, voice = NULL, gender = c("Female", "Male"), language = "en-US", escape = FALSE )
script |
A character vector of lines to be spoken |
voice |
Full voice name, |
gender |
Sex of the Speaker |
language |
Language to be spoken |
escape |
Should non-standard characters be substituted? Should not
be used if |
A character string of the text and SSML markup
ms_create_ssml("hey I really like things & dogs", escape = TRUE)
ms_create_ssml("hey I really like things")
ms_create_ssml('hey I <emphasis level="strong">really like</emphasis> things')
ms_create_ssml('hey I <emphasis level="strong">really like</emphasis> things', escape = TRUE)
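To make the effect of `escape` concrete, here is a minimal sketch of wrapping a script in an SSML envelope. It is not the package's implementation: `escape_xml_sketch()` and `build_ssml_sketch()` are hypothetical helpers, the envelope is a simplified one, and "ExampleVoice" is a placeholder voice name.

```r
# Hypothetical helper: replace characters that are special in XML.
# The ampersand must be handled first so later entities are not re-escaped.
escape_xml_sketch <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

# Hypothetical helper: wrap the script in a simplified <speak>/<voice> envelope.
build_ssml_sketch <- function(script, voice, language = "en-US",
                              escape = FALSE) {
  if (escape) {
    script <- escape_xml_sketch(script)
  }
  paste0(
    "<speak version='1.0' xml:lang='", language, "'>",
    "<voice name='", voice, "'>",
    paste(script, collapse = " "),
    "</voice></speak>"
  )
}

plain  <- build_ssml_sketch("things & dogs", voice = "ExampleVoice", escape = TRUE)
markup <- build_ssml_sketch('I <emphasis level="strong">really</emphasis> do',
                            voice = "ExampleVoice", escape = TRUE)
print(plain)
print(markup)
```

In this sketch, escaping makes a bare `&` safe for XML, but it also turns inline tags such as `<emphasis>` into literal text, so scripts that already contain SSML markup would be passed with `escape = FALSE`.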
Determines whether options(ms_tts_key) is set or the key is stored in an environment variable (MS_TTS_API_KEY, MS_TTS_API_KEY1, MS_TTS_API_KEY2). If no key is found, stops with an error. If a key is found, returns its value.
ms_fetch_key(api_key = NULL, error = TRUE) ms_exist_key(api_key = NULL) ms_set_key(api_key) ms_valid_key(api_key = NULL, region = "westus")
api_key |
Microsoft Cognitive Services API key |
error |
Should the function error if |
region |
Subscription region for API key. For more info, see https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/regions |
API key
Logical vector, indicating whether user has API key.
Logical vector, indicating whether API key is valid.
ms_exist_key(): Does user have API key?
ms_set_key(): Set API Key as a global option
ms_valid_key(): Check whether API key is valid
You can either set the API key using options(ms_tts_key) or have it accessible by api_key = Sys.getenv("MS_TTS_API_KEY"), api_key = Sys.getenv("MS_TTS_API_KEY1"), or api_key = Sys.getenv("MS_TTS_API_KEY2").
res = ms_fetch_key(api_key = NULL, error = FALSE)
# Don't provide api key but fetch it programmatically
ms_exist_key(api_key = NULL)
# Provide api key XXX
ms_exist_key(api_key = "XXX")
# Set api key XXX
ms_set_key(api_key = "XXX")
# Check whether API key is valid in westus
ms_valid_key(region = "westus")
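The lookup described in this section (an explicit argument, then a global option, then environment variables) can be sketched as follows. `fetch_key_sketch()` is a hypothetical function written for illustration, and the order in which the sources are checked is an assumption rather than documented behavior.

```r
# Hypothetical sketch of a key lookup; the precedence order is assumed.
fetch_key_sketch <- function(api_key = NULL, error = TRUE) {
  # 1. Fall back to the global option when no key is passed in.
  if (is.null(api_key)) {
    api_key <- getOption("ms_tts_key")
  }
  # 2. Fall back to the first non-empty environment variable.
  if (is.null(api_key)) {
    for (var in c("MS_TTS_API_KEY", "MS_TTS_API_KEY1", "MS_TTS_API_KEY2")) {
      value <- Sys.getenv(var, unset = "")
      if (nzchar(value)) {
        api_key <- value
        break
      }
    }
  }
  # 3. Stop only when the caller asked for an error.
  if (is.null(api_key) && error) {
    stop("API key not found in options or environment variables.")
  }
  api_key
}

# An explicitly supplied key is returned unchanged.
print(fetch_key_sketch(api_key = "XXX"))
```

With `error = FALSE`, the sketch returns `NULL` when nothing is found, which lets a caller test for a key without wrapping the call in `tryCatch()`.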
Get Microsoft Text To Speech (TTS) or Cognitive Services Token from API Key
Check if token has expired
ms_get_token(api_key = NULL, region = "westus") ms_token_expired(token = NULL)
api_key |
Microsoft Azure Cognitive Services API key |
region |
Subscription region for API key. For more info, see https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/regions |
token |
An authentication of class |
A list of the request and token
Logical vector, indicating whether token has expired
# Get token where region is westus
token = ms_get_token(region = "westus")
# Check if token XXX has expired
ms_token_expired(token = "XXX")
Obtains a full list of voices for a specific region.
ms_list_voice(api_key = NULL, region = "westus")
api_key |
Microsoft Azure Cognitive Services API key |
region |
Subscription region for API key. For more info, see https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/regions |
For more info, see Get a list of voices from the Microsoft documentation.
A data.frame of the names and their long names.
# List voices for westus
ms_list_voice(region = "westus")
# List voices for eastus
ms_list_voice(region = "eastus")
If the region is supported, this function returns the region. If it is not supported, the function issues a warning.
ms_region(region = conrad::region)
region |
Subscription region for API key. For more info, see https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/regions |
Character vector of region
# Check if westus is supported
ms_region(region = "westus")
# Check if eastus is supported
ms_region(region = "eastus")
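A region check of this kind can be sketched as a membership test against a vector of identifiers. The names below are hypothetical: `supported_regions_sketch` holds only a small illustrative subset, and returning the input after warning is an assumption, since the text above does not say what is returned for an unsupported region.

```r
# Illustrative subset only; the package ships its own 'region' vector.
supported_regions_sketch <- c("westus", "eastus", "westeurope")

# Hypothetical sketch: warn when the region is not in the list.
# Returning the input after the warning is an assumption of this sketch.
check_region_sketch <- function(region) {
  if (!region %in% supported_regions_sketch) {
    warning("Region '", region, "' is not in the supported list.")
  }
  region
}

print(check_region_sketch("westus"))
```

Using a warning rather than an error keeps the call non-fatal, so a caller can still decide whether to continue with a region that is missing from the list.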
Convert text to speech by using Speech Synthesis Markup Language (SSML)
ms_synthesize( script, region = "westus", api_key = NULL, token = NULL, gender = c("Female", "Male"), language = "en-US", voice = NULL, escape = FALSE )
script |
A character vector of text to be converted to speech |
region |
Subscription region for API key. For more info, see https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/regions |
api_key |
Microsoft Azure Cognitive Services API key |
token |
An authentication token |
gender |
Sex of the speaker |
language |
Language to be spoken |
voice |
Full voice name |
escape |
Should non-standard characters be substituted? |
For more info, see Section Convert text to speech of the Microsoft documentation.
An HTTP response in hexadecimal representation of binary data
# Convert text to speech
res <- ms_synthesize(script = "Hello world, this is a talking computer testing test", region = "westus", gender = "Female")
# Returns hexadecimal representation of binary data
# Create temporary file to store audio output
output_path <- tempfile(fileext = ".wav")
# Write binary data to output path
writeBin(res, con = output_path)
# Play audio in browser
# play_audio(audio = output_path)
# Delete temporary file
file.remove(output_path)
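The "hexadecimal representation of binary data" can be tried out offline. The sketch below assumes the returned value is an R raw vector, as the `writeBin()` call in the example suggests. `audio_bytes` is a four-byte placeholder (the ASCII codes for "RIFF"), not a playable audio file.

```r
# Placeholder bytes standing in for a synthesized response (assumption:
# the real value is a raw vector). These four bytes spell "RIFF".
audio_bytes <- as.raw(c(0x52, 0x49, 0x46, 0x46))

# A raw vector prints as two-digit hexadecimal values.
print(audio_bytes)

# Write the bytes to a temporary file and read them back unchanged.
output_path <- tempfile(fileext = ".wav")
writeBin(audio_bytes, con = output_path)
read_back <- readBin(output_path, what = "raw", n = file.size(output_path))
same_bytes <- identical(audio_bytes, read_back)
print(same_bytes)

# Clean up the temporary file.
removed <- file.remove(output_path)
```

`writeBin()` stores a raw vector byte for byte, which is why the example above can pass the response straight to a file without any conversion.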
Verify whether given voice is compatible with specific region. If it is, provide the gender, full voice name, and language associated with given voice.
ms_use_voice(voice, api_key = NULL, region = "westus")
voice |
Full voice name ("Microsoft Server Speech Text to Speech Voice (XX, YY)") |
api_key |
Microsoft Azure Cognitive Services API key |
region |
Subscription region for API key. For more info, see https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/regions |
List of gender, language, and full voice name
## Not run:
# Retrieve gender, full name, and language
ms_use_voice(voice = "Microsoft Server Speech Text to Speech Voice (en-US, JacobNeural)", region = "westus")
## End(Not run)
This uses HTML5 audio tags to play audio in your browser. Borrowed from googleLanguageR::gl_talk_player().
play_audio(audio = "output.wav", html = "player.html")
audio |
The file location of the audio file. Must be supported by HTML5 |
html |
The location of the HTML file that will be created to host the audio |
For more info, see this Mozilla documentation detailing the <audio> HTML element.
No return value, called for side effects
## Not run:
# Opens a browser with embedded audio
play_audio(audio = "output.wav")
## End(Not run)
This character vector contains region identifiers that support text to speech.
region
A character vector
https://learn.microsoft.com/en-us/azure/cognitive-services/Speech-Service/regions