birdnet_tiny_forge.features.microspeech package¶

class birdnet_tiny_forge.features.microspeech.MicroSpeechExtractor(params)¶

Bases: FeatureExtractorBase

run(sample_rate, audio_slice)¶: Run feature extraction on raw audio data and return features.

Submodules¶

birdnet_tiny_forge.features.microspeech.extractor module¶

MicroSpeechExtractor uses the feature extraction chain from the TensorFlow Lite Micro (TFLM) micro_speech example.

class birdnet_tiny_forge.features.microspeech.extractor.MicroSpeechExtractor(params)¶

Bases: FeatureExtractorBase

run(sample_rate, audio_slice)¶: Run feature extraction on raw audio data and return features.

birdnet_tiny_forge.features.microspeech.tflite_micro_frontend module¶

class birdnet_tiny_forge.features.microspeech.tflite_micro_frontend.AudioPreprocessor(params: FeatureParams, detail: str = 'unknown')¶

Bases: object

Audio Preprocessor

Args:
    params: FeatureParams, an immutable object supplying parameters for the AudioPreprocessor instance
    detail: str, optional label used only in debug output

generate_feature_using_tflm(audio_frame: tensorflow.Tensor) → tensorflow.Tensor¶

Generate a single feature for a single audio frame. Uses TensorFlow graph execution and the TensorFlow model converter to generate a TFLM compatible model. This model is then used by the TFLM MicroInterpreter to execute a single inference operation.

Args:
    audio_frame: tf.Tensor, a single audio frame (self.params.window_size_ms milliseconds long) with shape (1, audio_samples_count)

Returns:
    tf.Tensor, a tensor containing a single audio feature with shape (self.params.filter_bank_number_of_channels,)
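
The frame shape above follows from the window parameters. As a rough illustration, the sketch below derives audio_samples_count and the number of frames in a clip from the documented FeatureParams defaults (16000 Hz, 30 ms window, 20 ms stride). The helper names samples_per_frame and split_into_frames are hypothetical and not part of this package, and plain Python lists stand in for tensors.

```python
def samples_per_frame(sample_rate=16000, window_size_ms=30):
    """Number of audio samples covered by one analysis window."""
    return sample_rate * window_size_ms // 1000


def split_into_frames(audio, sample_rate=16000, window_size_ms=30,
                      window_stride_ms=20):
    """Cut a 1-D sequence of samples into overlapping frames.

    Each frame is wrapped in an outer list so that it has the
    (1, audio_samples_count) layout expected for a single frame.
    Trailing samples that do not fill a whole window are dropped.
    """
    size = samples_per_frame(sample_rate, window_size_ms)
    stride = sample_rate * window_stride_ms // 1000
    starts = range(0, len(audio) - size + 1, stride)
    return [[list(audio[s:s + size])] for s in starts]


one_second = [0] * 16000  # one second of silence at 16 kHz
frames = split_into_frames(one_second)
print(samples_per_frame(), len(frames))  # prints: 480 49
```

With the defaults, one second of audio yields 49 frames of 480 samples. If each frame produces one feature of filter_bank_number_of_channels = 40 values, the result is a 49 x 40 feature matrix.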

generate_tflite_file() → Path¶

Create a .tflite model file

The type of the model's output tensor depends on the FeatureParams.use_float_output parameter: float if True, otherwise int8.

Returns:
    Path object for the created model file

reset_tflm()¶

Reset TFLM interpreter state

Re-initializes TFLM interpreter state and the internal state of all TFLM kernel operators. Useful for resetting Signal library operator noise estimation and other internal state.
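
To illustrate why a reset matters, here is a toy, float-domain sketch of a stateful noise estimate. It is an assumption-laden stand-in, not the Signal library's fixed-point implementation: the class ToyNoiseEstimator and its update rule are hypothetical, and only the default values are taken from FeatureParams. It shows that identical input produces different output once the estimator has seen earlier frames, and that a reset restores the initial behaviour.

```python
class ToyNoiseEstimator:
    """Toy per-channel running noise estimate (illustrative only)."""

    def __init__(self, channels=40, even_smoothing=0.025,
                 odd_smoothing=0.06, min_signal_remaining=0.05):
        self.channels = channels
        self.even_smoothing = even_smoothing
        self.odd_smoothing = odd_smoothing
        self.min_signal_remaining = min_signal_remaining
        self.reset()

    def reset(self):
        # Forget everything learned from earlier frames.
        self.estimate = [0.0] * self.channels

    def process(self, energies):
        """Subtract the running noise estimate from one frame of energies."""
        out = []
        for i, signal in enumerate(energies):
            # Even and odd channels use different smoothing coefficients.
            s = self.even_smoothing if i % 2 == 0 else self.odd_smoothing
            self.estimate[i] = s * signal + (1.0 - s) * self.estimate[i]
            # Never remove more than (1 - min_signal_remaining) of the signal.
            floor = signal * self.min_signal_remaining
            out.append(max(signal - self.estimate[i], floor))
        return out


estimator = ToyNoiseEstimator(channels=2)
first = estimator.process([100.0, 100.0])
for _ in range(50):
    estimator.process([100.0, 100.0])
later = estimator.process([100.0, 100.0])  # differs from `first`
estimator.reset()
after_reset = estimator.process([100.0, 100.0])  # equals `first` again
```

In this sketch, skipping the reset between unrelated clips would let the noise estimate from one clip leak into the features of the next.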

class birdnet_tiny_forge.features.microspeech.tflite_micro_frontend.FeatureParams(*, sample_rate: int = 16000, window_size_ms: int = 30, window_stride_ms: int = 20, window_scaling_bits: int = 12, filter_bank_number_of_channels: int = 40, filter_bank_lower_band_limit_hz: float = 125.0, filter_bank_upper_band_limit_hz: float = 7500.0, filter_bank_scaling_bits: int = tflite_micro.python.tflite_micro.signal.ops.filter_bank_ops.FILTER_BANK_WEIGHT_SCALING_BITS, filter_bank_alignment: int = 4, filter_bank_channel_block_size: int = 4, filter_bank_post_scaling_bits: int = 6, filter_bank_spectral_subtraction_bits: int = 14, filter_bank_smoothing_bits: int = 10, filter_bank_even_smoothing: float = 0.025, filter_bank_odd_smoothing: float = 0.06, filter_bank_min_signal_remaining: float = 0.05, filter_bank_clamping: bool = False, pcan_strength: float = 0.95, pcan_offset: float = 80.0, pcan_gain_bits: int = 21, pcan_smoothing_bits: int = 10, legacy_output_scaling: float = 25.6, use_float_output: bool = False)¶

Bases: BaseModel

Feature generator parameters

Defaults are configured to work with the micro_speech_quantized.tflite model

filter_bank_alignment: int¶: filter bank alignment, updates filter bank constant

filter_bank_channel_block_size: int¶: filter bank channel block size, updates filter bank constant

filter_bank_clamping: bool¶: filter bank noise reduction clamping

filter_bank_even_smoothing: float¶: filter bank noise reduction even smoothing

filter_bank_lower_band_limit_hz: float¶: filter bank lower band limit

filter_bank_min_signal_remaining: float¶: filter bank noise reduction minimum signal remaining

filter_bank_number_of_channels: int¶: filter bank channel count

filter_bank_odd_smoothing: float¶: filter bank noise reduction odd smoothing

filter_bank_post_scaling_bits: int¶: filter bank output log-scaling bits

filter_bank_scaling_bits: int¶: filter bank weight scaling bits, updates filter bank constant

filter_bank_smoothing_bits: int¶: filter bank noise reduction smoothing bits

filter_bank_spectral_subtraction_bits: int¶: filter bank noise reduction spectral subtraction bits

filter_bank_upper_band_limit_hz: float¶: filter bank upper band limit
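
As a rough picture of what the band limits and channel count control, the sketch below spaces band edges evenly on a mel scale between the lower and upper limits. It assumes the common formula mel = 1127 * ln(1 + f / 700); the exact channel placement used by the Signal library may differ. The function names are hypothetical.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (assumed mel formula)."""
    return 1127.0 * math.log1p(hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * math.expm1(mel / 1127.0)


def mel_band_edges(channels=40, lower_hz=125.0, upper_hz=7500.0):
    """Return channels + 2 edge frequencies, evenly spaced in mels.

    Channel k would span edges[k] .. edges[k + 2], centred on edges[k + 1].
    """
    lo, hi = hz_to_mel(lower_hz), hz_to_mel(upper_hz)
    step = (hi - lo) / (channels + 1)
    return [mel_to_hz(lo + i * step) for i in range(channels + 2)]


edges = mel_band_edges()
print(len(edges))  # prints: 42
```

Because the spacing is uniform in mels rather than in Hz, the low-frequency channels are narrow and the high-frequency channels are wide.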

legacy_output_scaling: float¶: Final output scaling, legacy from training

model_config: ClassVar[ConfigDict] = {}¶: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

pcan_gain_bits: int¶: PCAN gain control bits

pcan_offset: float¶: PCAN gain control offset

pcan_smoothing_bits: int¶: PCAN gain control smoothing bits

pcan_strength: float¶: PCAN gain control strength

sample_rate: int¶: audio sample rate

use_float_output: bool¶: Use float output if True, otherwise int8 output

window_scaling_bits: int¶: input window shaping: scaling bits

window_size_ms: int¶: input window size in milliseconds

window_stride_ms: int¶: input window stride in milliseconds
