RPPT Tensor Operations - Audio Augmentations.
Functions
| RppStatus | rppt_non_silent_region_detection (RppPtr_t srcPtr, RpptDescPtr srcDescPtr, Rpp32s *srcLengthTensor, Rpp32s *detectedIndexTensor, Rpp32s *detectionLengthTensor, Rpp32f cutOffDB, Rpp32s windowLength, Rpp32f referencePower, Rpp32s resetInterval, rppHandle_t rppHandle, RppBackend executionBackend) |
| | Non Silent Region Detection augmentation on HIP/HOST backend. |
| RppStatus | rppt_to_decibels (RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, RpptImagePatchPtr srcDims, Rpp32f cutOffDB, Rpp32f multiplier, Rpp32f referenceMagnitude, rppHandle_t rppHandle, RppBackend executionBackend) |
| | To Decibels augmentation on HIP/HOST backend. |
| RppStatus | rppt_pre_emphasis_filter (RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32s *srcLengthTensor, Rpp32f *coeffTensor, RpptAudioBorderType borderType, rppHandle_t rppHandle, RppBackend executionBackend) |
| | Pre Emphasis Filter augmentation on HIP/HOST backend. |
| RppStatus | rppt_down_mixing (RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32s *srcDimsTensor, bool normalizeWeights, rppHandle_t rppHandle, RppBackend executionBackend) |
| | Down Mixing augmentation on HIP/HOST backend. |
| RppStatus | rppt_spectrogram (RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32s *srcLengthTensor, bool centerWindows, bool reflectPadding, Rpp32f *windowFunction, Rpp32s nfft, Rpp32s power, Rpp32s windowLength, Rpp32s windowStep, rppHandle_t rppHandle, RppBackend executionBackend) |
| | Produces a spectrogram from a 1D audio buffer on HIP/HOST backend. |
| RppStatus | rppt_mel_filter_bank (RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32s *srcDims, Rpp32f maxFreq, Rpp32f minFreq, RpptMelScaleFormula melFormula, Rpp32s numFilter, Rpp32f sampleRate, bool normalize, rppHandle_t rppHandle, RppBackend executionBackend) |
| | Mel filter bank augmentation on HIP/HOST backend. |
| RppStatus | rppt_resample (RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32f *inRateTensor, Rpp32f *outRateTensor, Rpp32s *srcDimsTensor, RpptResamplingWindow &window, rppHandle_t rppHandle, RppBackend executionBackend) |
| | Resample augmentation on HIP/HOST backend. |
Function Documentation
◆ rppt_down_mixing()
RppStatus rppt_down_mixing (
    RppPtr_t srcPtr,
    RpptDescPtr srcDescPtr,
    RppPtr_t dstPtr,
    RpptDescPtr dstDescPtr,
    Rpp32s * srcDimsTensor,
    bool normalizeWeights,
    rppHandle_t rppHandle,
    RppBackend executionBackend
)
Down Mixing augmentation on HIP/HOST backend.
Down Mixing augmentation for audio data
- Parameters
    [in] srcPtr  source tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] srcDescPtr  source tensor descriptor (Restrictions - numDims = 2 or 3 (for single-channel or multi-channel audio tensor), offsetInBytes >= 0, dataType = F32)
    [out] dstPtr  destination tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] dstDescPtr  destination tensor descriptor (Restrictions - numDims = 2, offsetInBytes >= 0, dataType = F32)
    [in] srcDimsTensor  source audio buffer length and number of channels (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize * 2)
    [in] normalizeWeights  bool flag to specify if normalization of weights is needed
    [in] rppHandle  RPP HIP/HOST handle created with rppCreate()
- Returns
    A RppStatus enumeration.
- Return values
    RPP_SUCCESS  Successful completion.
    RPP_ERROR*  Unsuccessful completion.
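To illustrate what down mixing computes conceptually, here is a minimal standalone sketch that averages interleaved channels into a mono signal. It does not call RPP: `downMixToMono` is a hypothetical helper, and the interleaved layout and equal per-channel weights are assumptions made for this illustration rather than documented behavior of `rppt_down_mixing`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an RPP function: mixes interleaved multi-channel
// audio down to mono using equal per-channel weights (an assumption).
std::vector<float> downMixToMono(const std::vector<float>& interleaved, int channels)
{
    assert(channels > 0);
    assert(interleaved.size() % static_cast<std::size_t>(channels) == 0);

    const std::size_t length = interleaved.size() / channels;
    // Equal weights of 1 / channels already sum to 1, so they are normalized.
    const float weight = 1.0f / static_cast<float>(channels);

    std::vector<float> mono(length);
    for (std::size_t i = 0; i < length; ++i) {
        float sum = 0.0f;
        for (int c = 0; c < channels; ++c)
            sum += interleaved[i * channels + c];
        mono[i] = sum * weight;
    }
    return mono;
}
```

In this sketch a stereo buffer `{L0, R0, L1, R1, ...}` becomes `{(L0 + R0) / 2, (L1 + R1) / 2, ...}`, which matches the 2D (single-channel) destination shape required by dstDescPtr.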
◆ rppt_mel_filter_bank()
RppStatus rppt_mel_filter_bank (
    RppPtr_t srcPtr,
    RpptDescPtr srcDescPtr,
    RppPtr_t dstPtr,
    RpptDescPtr dstDescPtr,
    Rpp32s * srcDims,
    Rpp32f maxFreq,
    Rpp32f minFreq,
    RpptMelScaleFormula melFormula,
    Rpp32s numFilter,
    Rpp32f sampleRate,
    bool normalize,
    rppHandle_t rppHandle,
    RppBackend executionBackend
)
Mel filter bank augmentation on HIP/HOST backend.
Mel filter bank augmentation for audio data
- Parameters
    [in] srcPtr  source tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] srcDescPtr  source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32, layout - NFT)
    [out] dstPtr  destination tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] dstDescPtr  destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32, layout - NFT)
    [in] srcDims  source audio buffer length and number of channels (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize * 2)
    [in] maxFreq  maximum frequency; if not provided, maxFreq = sampleRate / 2
    [in] minFreq  minimum frequency
    [in] melFormula  formula used to convert frequencies from hertz to mel and from mel to hertz (SLANEY / HTK)
    [in] numFilter  number of mel filters
    [in] sampleRate  sampling rate of the audio
    [in] normalize  boolean flag that determines whether to normalize the weights
    [in] rppHandle  RPP HIP/HOST handle created with rppCreate()
- Returns
    A RppStatus enumeration.
- Return values
    RPP_SUCCESS  Successful completion.
    RPP_ERROR*  Unsuccessful completion.
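The melFormula parameter selects between two hertz/mel conversions. The sketch below shows the commonly published HTK and Slaney formulas as standalone functions. `MelFormula`, `hzToMel`, `melToHz`, and `defaultMaxFreq` are hypothetical names introduced for this example (they are not RPP symbols), and it is an assumption that RPP uses exactly these constants.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the SLANEY / HTK choice; not the RPP enum.
enum class MelFormula { Slaney, Htk };

// Constants of the widely used Slaney-style scale: linear below 1 kHz,
// logarithmic above it.
static const double kLinearStepHz = 200.0 / 3.0;
static const double kBreakHz = 1000.0;
static const double kBreakMel = kBreakHz / kLinearStepHz;  // 15 mel
static const double kLogStep = std::log(6.4) / 27.0;

double hzToMel(double hz, MelFormula formula)
{
    assert(hz >= 0.0);
    if (formula == MelFormula::Htk)
        return 2595.0 * std::log10(1.0 + hz / 700.0);
    if (hz < kBreakHz)
        return hz / kLinearStepHz;
    return kBreakMel + std::log(hz / kBreakHz) / kLogStep;
}

double melToHz(double mel, MelFormula formula)
{
    assert(mel >= 0.0);
    if (formula == MelFormula::Htk)
        return 700.0 * (std::pow(10.0, mel / 2595.0) - 1.0);
    if (mel < kBreakMel)
        return mel * kLinearStepHz;
    return kBreakHz * std::exp((mel - kBreakMel) * kLogStep);
}

// Default upper band edge described for maxFreq: the Nyquist frequency.
double defaultMaxFreq(double sampleRate)
{
    assert(sampleRate > 0.0);
    return sampleRate / 2.0;
}
```

A filter bank with numFilter filters is typically built by spacing numFilter + 2 points evenly in mel between `hzToMel(minFreq)` and `hzToMel(maxFreq)` and converting them back to hertz with `melToHz`.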
◆ rppt_non_silent_region_detection()
RppStatus rppt_non_silent_region_detection (
    RppPtr_t srcPtr,
    RpptDescPtr srcDescPtr,
    Rpp32s * srcLengthTensor,
    Rpp32s * detectedIndexTensor,
    Rpp32s * detectionLengthTensor,
    Rpp32f cutOffDB,
    Rpp32s windowLength,
    Rpp32f referencePower,
    Rpp32s resetInterval,
    rppHandle_t rppHandle,
    RppBackend executionBackend
)
Non Silent Region Detection augmentation on HIP/HOST backend.
Non Silent Region Detection augmentation for 1D audio buffer.
Finds the starting index and length of the non-silent region in the audio buffer by comparing the calculated short-term power with the cutoff value passed.
- Parameters
    [in] srcPtr  source tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] srcDescPtr  source tensor descriptor (Restrictions - numDims = 2, offsetInBytes >= 0, dataType = F32)
    [in] srcLengthTensor  source audio buffer length (1D tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize)
    [out] detectedIndexTensor  beginning index of non-silent region (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize)
    [out] detectionLengthTensor  length of non-silent region (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize)
    [in] cutOffDB  cutoff in dB below which the signal is considered silent
    [in] windowLength  window length used for computing short-term power of the signal
    [in] referencePower  reference power that is used to convert the signal to dB
    [in] resetInterval  number of samples after which the moving average is recalculated to avoid precision loss
    [in] rppHandle  RPP HIP/HOST handle created with rppCreate()
- Returns
    A RppStatus enumeration.
- Return values
    RPP_SUCCESS  Successful completion.
    RPP_ERROR*  Unsuccessful completion.
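The idea of comparing short-term power against a dB cutoff can be sketched as follows. This is a standalone illustration, not the RPP kernel: `detectNonSilentRegion` is a hypothetical helper, and the index convention (region runs from the first loud window's start to the last loud window's end) is an assumption of this sketch. It recomputes every window from scratch, so it needs no resetInterval; a running-sum implementation would use that parameter to limit accumulated rounding error.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical helper, not an RPP function. Returns {beginIndex, length} of
// the non-silent region, or {0, 0} when no window reaches the cutoff.
std::pair<int, int> detectNonSilentRegion(const std::vector<float>& samples,
                                          float cutOffDB,
                                          int windowLength,
                                          float referencePower)
{
    assert(windowLength > 0);
    assert(referencePower > 0.0f);

    const int n = static_cast<int>(samples.size());
    if (n < windowLength)
        return {0, 0};

    // 10 * log10(power / referencePower) >= cutOffDB, rearranged for power.
    const double threshold = referencePower * std::pow(10.0, cutOffDB / 10.0);

    int first = -1;
    int last = -1;
    for (int start = 0; start + windowLength <= n; ++start) {
        double sumSquares = 0.0;
        for (int k = 0; k < windowLength; ++k) {
            const double s = samples[start + k];
            sumSquares += s * s;
        }
        const double power = sumSquares / windowLength;  // mean square of the window
        if (power >= threshold) {
            if (first < 0)
                first = start;
            last = start;
        }
    }
    if (first < 0)
        return {0, 0};
    return {first, last + windowLength - first};
}
```

The two values of the returned pair correspond to what the API writes per batch element into detectedIndexTensor and detectionLengthTensor.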
◆ rppt_pre_emphasis_filter()
RppStatus rppt_pre_emphasis_filter (
    RppPtr_t srcPtr,
    RpptDescPtr srcDescPtr,
    RppPtr_t dstPtr,
    RpptDescPtr dstDescPtr,
    Rpp32s * srcLengthTensor,
    Rpp32f * coeffTensor,
    RpptAudioBorderType borderType,
    rppHandle_t rppHandle,
    RppBackend executionBackend
)
Pre Emphasis Filter augmentation on HIP/HOST backend.
Pre Emphasis Filter augmentation for audio data
- Parameters
    [in] srcPtr  source tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] srcDescPtr  source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32)
    [out] dstPtr  destination tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] dstDescPtr  destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32)
    [in] srcLengthTensor  source audio buffer length (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize)
    [in] coeffTensor  preemphasis coefficient (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize)
    [in] borderType  border value policy
    [in] rppHandle  RPP HIP/HOST handle created with rppCreate()
- Returns
    A RppStatus enumeration.
- Return values
    RPP_SUCCESS  Successful completion.
    RPP_ERROR*  Unsuccessful completion.
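A pre-emphasis filter is conventionally the first-order difference y[n] = x[n] - coeff * x[n - 1], with the border policy deciding what stands in for x[-1]. The sketch below shows that textbook form as a standalone function. `BorderType` and `preEmphasis` are hypothetical names, and the three policies (zero, clamp, reflect) are assumed for illustration; the values of `RpptAudioBorderType` are not listed in this section.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the border policy; not the RPP enum.
enum class BorderType { Zero, Clamp, Reflect };

// Hypothetical helper, not an RPP function: y[n] = x[n] - coeff * x[n - 1].
std::vector<float> preEmphasis(const std::vector<float>& x, float coeff, BorderType border)
{
    assert(std::isfinite(coeff));

    std::vector<float> y(x.size());
    if (x.empty())
        return y;

    // Choose the value that stands in for the missing sample x[-1].
    float previous = 0.0f;                           // Zero: treat x[-1] as 0
    if (border == BorderType::Clamp)
        previous = x[0];                             // Clamp: repeat the first sample
    else if (border == BorderType::Reflect)
        previous = (x.size() > 1) ? x[1] : x[0];     // Reflect: mirror around index 0

    y[0] = x[0] - coeff * previous;
    for (std::size_t i = 1; i < x.size(); ++i)
        y[i] = x[i] - coeff * x[i - 1];
    return y;
}
```

Only the first output sample depends on the border policy; all later samples use the true previous input.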
◆ rppt_resample()
RppStatus rppt_resample (
    RppPtr_t srcPtr,
    RpptDescPtr srcDescPtr,
    RppPtr_t dstPtr,
    RpptDescPtr dstDescPtr,
    Rpp32f * inRateTensor,
    Rpp32f * outRateTensor,
    Rpp32s * srcDimsTensor,
    RpptResamplingWindow & window,
    rppHandle_t rppHandle,
    RppBackend executionBackend
)
Resample augmentation on HIP/HOST backend.
Resample augmentation for audio data
- Parameters
    [in] srcPtr  source tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] srcDescPtr  source tensor descriptor (Restrictions - numDims = 2 or 3 (for single-channel or multi-channel audio tensor), offsetInBytes >= 0, dataType = F32)
    [out] dstPtr  destination tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] dstDescPtr  destination tensor descriptor (Restrictions - numDims = 2 or 3 (for single-channel or multi-channel audio tensor), offsetInBytes >= 0, dataType = F32)
    [in] inRateTensor  input sampling rate (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize)
    [in] outRateTensor  output sampling rate (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize)
    [in] srcDimsTensor  source audio buffer length and number of channels (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize * 2)
    [in] window  resampling window (struct of type RpptResamplingWindow)
    [in] rppHandle  RPP HIP/HOST handle created with rppCreate()
- Returns
    A RppStatus enumeration.
- Return values
    RPP_SUCCESS  Successful completion.
    RPP_ERROR*  Unsuccessful completion.
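To show how the input and output rates relate to buffer lengths and sample positions, the sketch below resamples a mono buffer with plain linear interpolation. This is deliberately a simpler technique swapped in for the windowed filter implied by `RpptResamplingWindow`; `resampleLinear` is a hypothetical helper, and the output length formula ceil(srcLength * outRate / inRate) is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an RPP function: linear-interpolation resampling
// of a mono buffer from inRate to outRate.
std::vector<float> resampleLinear(const std::vector<float>& x, float inRate, float outRate)
{
    assert(inRate > 0.0f && outRate > 0.0f);
    if (x.empty())
        return {};

    // Number of input samples advanced per output sample.
    const double scale = static_cast<double>(inRate) / outRate;
    const std::size_t outLength = static_cast<std::size_t>(
        std::ceil(x.size() * static_cast<double>(outRate) / inRate));

    std::vector<float> y(outLength);
    const std::size_t lastIndex = x.size() - 1;
    for (std::size_t i = 0; i < outLength; ++i) {
        const double pos = i * scale;  // position in input samples
        const std::size_t i0 = std::min(static_cast<std::size_t>(pos), lastIndex);
        const std::size_t i1 = std::min(i0 + 1, lastIndex);  // clamp at the end
        const double frac = pos - static_cast<double>(i0);
        y[i] = static_cast<float>(x[i0] * (1.0 - frac) + x[i1] * frac);
    }
    return y;
}
```

Doubling the rate of `{0, 1, 2, 3}` yields eight samples with values halfway between neighbors, while halving the rate keeps every second sample. A production resampler would replace the two-point interpolation with a windowed-sinc kernel to suppress aliasing.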
◆ rppt_spectrogram()
RppStatus rppt_spectrogram (
    RppPtr_t srcPtr,
    RpptDescPtr srcDescPtr,
    RppPtr_t dstPtr,
    RpptDescPtr dstDescPtr,
    Rpp32s * srcLengthTensor,
    bool centerWindows,
    bool reflectPadding,
    Rpp32f * windowFunction,
    Rpp32s nfft,
    Rpp32s power,
    Rpp32s windowLength,
    Rpp32s windowStep,
    rppHandle_t rppHandle,
    RppBackend executionBackend
)
Produces a spectrogram from a 1D audio buffer on HIP/HOST backend.
Spectrogram for 1D audio buffer
- Parameters
    [in] srcPtr  source tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] srcDescPtr  source tensor descriptor (Restrictions - numDims = 2, offsetInBytes >= 0, dataType = F32)
    [out] dstPtr  destination tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] dstDescPtr  destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32, layout - NFT / NTF)
    [in] srcLengthTensor  source audio buffer length (1D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize)
    [in] centerWindows  indicates whether extracted windows should be padded so that the window function is centered at multiples of windowStep
    [in] reflectPadding  indicates the padding policy when sampling outside the bounds of the signal
    [in] windowFunction  samples of the window function that will be multiplied with each extracted window when calculating the Short Time Fourier Transform (STFT). If windowFunction is a nullptr, the required windowFunction values will be generated inside the kernel
    [in] nfft  size of the FFT
    [in] power  exponent of the magnitude of the spectrum
    [in] windowLength  window size in number of samples
    [in] windowStep  step between the STFT windows in number of samples
    [in] rppHandle  RPP HIP/HOST handle created with rppCreate()
- Returns
    A RppStatus enumeration.
- Return values
    RPP_SUCCESS  Successful completion.
    RPP_ERROR*  Unsuccessful completion.
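The parameters nfft, power, windowLength, windowStep, and centerWindows together determine the output shape and the per-frame computation. The sketch below illustrates both with a naive DFT. It is a standalone illustration under stated assumptions: `SpectrogramShape`, `spectrogramShape`, and `framePowerSpectrum` are hypothetical names, the window-count formulas are the ones commonly used for STFTs (not confirmed here as RPP's), and power is assumed to be 1 (magnitude) or 2 (squared magnitude).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type, not an RPP struct.
struct SpectrogramShape {
    int numBins;     // frequency bins per window (F)
    int numWindows;  // number of STFT windows (T)
};

SpectrogramShape spectrogramShape(int length, int nfft, int windowLength,
                                  int windowStep, bool centerWindows)
{
    assert(nfft > 0 && windowLength > 0 && windowStep > 0);
    // Centered windows may start before and end after the signal (padding),
    // so every multiple of windowStep up to the length gets a window.
    const int usable = centerWindows ? length : length - windowLength;
    const int numWindows = (usable < 0) ? 0 : usable / windowStep + 1;
    return {nfft / 2 + 1, numWindows};
}

// Naive DFT of one windowed frame, zero-padded to nfft; returns
// |X[k]|^power for k = 0 .. nfft / 2.
std::vector<double> framePowerSpectrum(const std::vector<float>& frame,
                                       const std::vector<float>& window,
                                       int nfft, int power)
{
    assert(frame.size() == window.size());
    assert(static_cast<int>(frame.size()) <= nfft);
    assert(power == 1 || power == 2);

    const double pi = std::acos(-1.0);
    std::vector<double> out(nfft / 2 + 1);
    for (int k = 0; k <= nfft / 2; ++k) {
        double re = 0.0;
        double im = 0.0;
        for (std::size_t n = 0; n < frame.size(); ++n) {
            const double v = static_cast<double>(frame[n]) * window[n];
            const double angle = -2.0 * pi * k * static_cast<double>(n) / nfft;
            re += v * std::cos(angle);
            im += v * std::sin(angle);
        }
        const double magnitudeSquared = re * re + im * im;
        out[k] = (power == 2) ? magnitudeSquared : std::sqrt(magnitudeSquared);
    }
    return out;
}
```

A real implementation would use an FFT instead of the O(nfft^2) loop; the naive form is used here only because it is easy to verify by hand.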
◆ rppt_to_decibels()
RppStatus rppt_to_decibels (
    RppPtr_t srcPtr,
    RpptDescPtr srcDescPtr,
    RppPtr_t dstPtr,
    RpptDescPtr dstDescPtr,
    RpptImagePatchPtr srcDims,
    Rpp32f cutOffDB,
    Rpp32f multiplier,
    Rpp32f referenceMagnitude,
    rppHandle_t rppHandle,
    RppBackend executionBackend
)
To Decibels augmentation on HIP/HOST backend.
To Decibels augmentation for 1D/2D audio buffer. Converts magnitude values to decibel values.
- Parameters
    [in] srcPtr  source tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] srcDescPtr  source tensor descriptor (Restrictions - numDims = 2 or 3 (for single-channel or multi-channel/2D audio tensor with 1 channel), offsetInBytes >= 0, dataType = F32)
    [out] dstPtr  destination tensor in HIP memory (for HIP backend) or HOST memory (for HOST backend)
    [in] dstDescPtr  destination tensor descriptor (Restrictions - numDims = 2 or 3 (for single-channel or multi-channel/2D audio tensor with 1 channel), offsetInBytes >= 0, dataType = F32)
    [in] srcDims  source tensor sizes for each element in batch (2D tensor in pinned memory (for HIP backend) or HOST memory (for HOST backend), of size batchSize * 2)
    [in] cutOffDB  minimum or cut-off ratio in dB
    [in] multiplier  factor by which the logarithm is multiplied
    [in] referenceMagnitude  reference magnitude; if not provided, the maximum value of the input is used as reference
    [in] rppHandle  RPP HIP/HOST handle created with rppCreate()
- Returns
    A RppStatus enumeration.
- Return values
    RPP_SUCCESS  Successful completion.
    RPP_ERROR*  Unsuccessful completion.
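The relationship between cutOffDB, multiplier, and referenceMagnitude can be sketched with the usual decibel formula, out = multiplier * log10(max(x / reference, minRatio)), where minRatio = 10^(cutOffDB / multiplier). The functions below are hypothetical helpers written for this illustration (not RPP calls), and it is an assumption that RPP clamps in exactly this way.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an RPP function: converts magnitudes to decibels
// relative to an explicit reference magnitude.
std::vector<float> toDecibels(const std::vector<float>& magnitudes,
                              float cutOffDB, float multiplier,
                              float referenceMagnitude)
{
    assert(multiplier > 0.0f);
    assert(referenceMagnitude > 0.0f);

    // Smallest ratio that is still reported; anything lower is clamped so the
    // output never falls below cutOffDB.
    const double minRatio = std::pow(10.0, cutOffDB / multiplier);

    std::vector<float> out(magnitudes.size());
    for (std::size_t i = 0; i < magnitudes.size(); ++i) {
        const double ratio =
            std::max(static_cast<double>(magnitudes[i]) / referenceMagnitude, minRatio);
        out[i] = static_cast<float>(multiplier * std::log10(ratio));
    }
    return out;
}

// Variant for the "reference not provided" case: the maximum input value
// serves as the reference, so the loudest sample maps to 0 dB.
std::vector<float> toDecibelsMaxReference(const std::vector<float>& magnitudes,
                                          float cutOffDB, float multiplier)
{
    assert(!magnitudes.empty());
    const float reference = *std::max_element(magnitudes.begin(), magnitudes.end());
    return toDecibels(magnitudes, cutOffDB, multiplier, reference);
}
```

With this convention a multiplier of 10 suits power-like inputs and a multiplier of 20 suits amplitude-like inputs, following the standard definitions of the decibel.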