When setting up the production cluster, there are several things to consider. For an overview, see Install and run Speech containers.

First, setting up a single language in multiple containers on the same machine shouldn't be a large issue. If you're experiencing problems, it may be a hardware-related issue, so we would first look at resources, that is, CPU and memory specifications.

Consider for a moment the ja-JP container and the latest model. The acoustic model is the most demanding piece CPU-wise, while the language model demands the most memory. When we benchmarked the use, it takes about 0.6 CPU cores to process a single speech-to-text request when audio is flowing in at real-time, for example, from a microphone. If you're feeding audio faster than real-time, for example, from a file, that usage can double (1.2x cores).

Meanwhile, the memory in this context is operating memory for decoding speech. It doesn't take into account the actual full size of the language model, which will reside in the file cache. For ja-JP that's an extra 2 GB; for en-US, it may be more (6-7 GB). If you have a machine where memory is scarce and you're trying to deploy multiple languages on it, it's possible that the file cache is full and the OS is forced to page models in and out. For a running transcription, that could be disastrous and may lead to slowdowns and other performance implications.

Furthermore, we prepackage executables for machines with the advanced vector extension (AVX2) instruction set. A machine with the AVX512 instruction set requires code generation for that target, and starting 10 containers for 10 languages may temporarily exhaust CPU. A message like this one appears in the docker logs:

16:46:54.981118943 Cannot find Scan4_llvm_mcpu_skylake_avx512 in cache, using JIT.

You can set the number of decoders you want inside a single container using the DECODER_MAX_COUNT variable.
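As a sketch of how that variable might be passed, here is a hypothetical `docker run` invocation; the resource limits and decoder count are illustrative, and the `{ENDPOINT_URI}` and `{API_KEY}` placeholders must be filled in with your own values:

```shell
# Illustrative example: cap the container at 4 cores / 8 GB and
# limit it to 4 concurrent decoders via the DECODER_MAX_COUNT variable.
docker run --rm -it -p 5000:5000 \
  --cpus 4 --memory 8g \
  -e DECODER_MAX_COUNT=4 \
  mcr.microsoft.com/azure-cognitive-services/speechservices/speech-to-text \
  Eula=accept \
  Billing={ENDPOINT_URI} \
  ApiKey={API_KEY}
```

Keeping `--cpus`/`--memory` in line with the decoder count avoids promising the container more concurrency than the host can actually serve.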
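The benchmark figures above lend themselves to a back-of-envelope capacity calculation. This sketch assumes the numbers from the text (0.6 cores per real-time stream, roughly double for faster-than-real-time input) and uses integer arithmetic in tenths of a core; the stream count is an arbitrary example:

```shell
# Sizing sketch: cores needed for N concurrent speech-to-text streams.
# Figures are expressed in tenths of a core to stay in integer math.
REALTIME_TENTHS=6     # ~0.6 cores per stream at real-time (microphone)
FAST_TENTHS=12        # ~1.2 cores per stream faster than real-time (file)
STREAMS=8             # example workload

TENTHS=$((STREAMS * REALTIME_TENTHS))
CORES=$(( (TENTHS + 9) / 10 ))   # round up to whole cores
echo "$STREAMS real-time streams -> reserve at least $CORES cores"
```

For faster-than-real-time input, substitute `FAST_TENTHS` into the same formula, and remember this covers decoding only: the language model's footprint in file cache (2 GB for ja-JP, 6-7 GB for en-US) must be budgeted separately.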