Definition of Acoustic Features:

Acoustic features are measurable properties of sound signals or acoustic waves. They are derived from audio data using signal processing techniques and serve as the inputs to many speech and audio processing applications.

Acoustic features describe the spectral, temporal, and perceptual aspects of sound, so that different acoustic properties can be extracted and quantified. These features can be used to analyze and distinguish sounds in applications such as speech recognition, speaker identification, music genre classification, and sound event detection.

The most commonly used acoustic features include:

  1. Power Spectral Density (PSD): This feature represents the distribution of signal power across the frequency components of the audio signal. It is commonly estimated by dividing the signal into short overlapping frames and calculating the squared magnitude of the Fourier Transform of each frame.
  2. Mel-Frequency Cepstral Coefficients (MFCC): MFCCs are coefficients that capture the spectral envelope of the audio signal. They are computed by applying a series of signal processing operations: framing, windowing, Fourier Transform, mel-filterbank, logarithm, and finally a discrete cosine transform (DCT) of the log mel energies. MFCCs are widely used in speech recognition and speaker identification tasks.
  3. Zero Crossing Rate (ZCR): ZCR measures the rate at which the audio signal changes sign, that is, how often it crosses zero within a frame or per unit of time. It provides information about the temporal variations in the signal and is used in tasks like music genre classification and voice activity detection.
  4. Temporal Centroid: This feature represents the center of mass of the audio signal’s energy distribution over time. It is the energy-weighted average time of the signal, indicating where in time the signal energy is concentrated, and is useful in audio classification and event detection.
  5. Spectral Roll-Off: Spectral roll-off denotes the frequency below which a certain percentage (e.g., 85%) of the audio signal’s spectral energy lies. It helps in differentiating between harmonic and non-harmonic sounds.
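
The five features above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function names, the Hann window, the 512-sample frames with a hop of 256, the 26-filter mel bank, the 13 coefficients, and the mel formula 2595 * log10(1 + f / 700) are all assumptions chosen for the example, and the power spectrum is left unnormalized.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def power_spectrum(frames, sr):
    """Squared magnitude of the Hann-windowed FFT of each frame (unnormalized)."""
    window = np.hanning(frames.shape[1])
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return freqs, np.abs(np.fft.rfft(frames * window, axis=1)) ** 2

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs with opposite sign, per frame."""
    negative = np.signbit(frames)
    return np.mean(negative[:, 1:] != negative[:, :-1], axis=1)

def temporal_centroid(x, sr):
    """Energy-weighted mean time of the whole signal, in seconds."""
    energy = x ** 2
    times = np.arange(len(x)) / sr
    return np.sum(times * energy) / np.sum(energy)

def spectral_rolloff(freqs, power, fraction=0.85):
    """Lowest bin frequency at which cumulative energy reaches `fraction`."""
    cumulative = np.cumsum(power, axis=1)
    reached = cumulative >= fraction * cumulative[:, -1:]
    return freqs[np.argmax(reached, axis=1)]

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters with edges equally spaced on an assumed mel scale."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        lo, center, hi = hz_points[i:i + 3]
        rising = (freqs - lo) / (center - lo)
        falling = (hi - freqs) / (hi - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc(power, bank, n_coeffs=13):
    """Log mel energies followed by an unnormalized DCT-II."""
    log_mel = np.log(power @ bank.T + 1e-10)  # small offset avoids log(0)
    n_bands = log_mel.shape[1]
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n_bands)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n_bands)
    return log_mel @ basis.T

# Demo on a synthetic signal: one second of a 440 Hz sine sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

frames = frame_signal(tone, frame_len=512, hop=256)
freqs, power = power_spectrum(frames, sr)
zcr = zero_crossing_rate(frames)
centroid = temporal_centroid(tone, sr)
rolloff = spectral_rolloff(freqs, power)
coeffs = mfcc(power, mel_filterbank(26, 512, sr))
```

On this test tone the spectral peak and the roll-off frequency both fall within a few FFT bins of 440 Hz, the zero crossing rate is close to 2 * 440 / 8000 = 0.11, and the temporal centroid is close to 0.5 s because the energy is spread evenly over the one-second signal. Real recordings would normally be loaded from a file and processed with an established audio library rather than with hand-written routines like these.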

These are just a few examples of the extensive range of acoustic features that can be extracted from audio signals. The selection and combination of these features depend on the specific application requirements and desired analysis.