The default 4 MiB cap on encoded/decoded messages is too small for
the multimodal Generate path: Qwen3.6-VL high-res patches put 5–8 MiB
of pre-encoded image bytes inline in a single Generate request, and
Done events carrying full per-token readout vectors can also exceed
4 MiB on long runs. This surfaced as "ResourceExhausted: Received
message larger than max (5799108 vs. 4194304)" from the salience
server.

Bump both encode and decode caps on every cloned SalienceClient. The
matching server-side bump is in vllm/entrypoints/salience/server.py.
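
For illustration only, a minimal Python sketch of raising both caps,
assuming the salience service speaks gRPC (as the ResourceExhausted
status suggests). The 64 MiB value and the helper names are
hypothetical and not taken from this change; the client touched here
may use a different gRPC binding with its own setters.

```python
# Hypothetical sketch: ASSUMED_CAP and the helper names are assumptions
# made for illustration, not values or APIs from this commit.
MIB = 1024 * 1024
DEFAULT_CAP = 4 * MIB    # 4194304, the limit quoted in the error
ASSUMED_CAP = 64 * MIB   # illustrative replacement cap only


def message_size_options(cap_bytes: int = ASSUMED_CAP) -> list[tuple[str, int]]:
    """Build gRPC options that raise both the send and receive caps."""
    return [
        ("grpc.max_send_message_length", cap_bytes),
        ("grpc.max_receive_message_length", cap_bytes),
    ]


def fits(message_bytes: int, cap_bytes: int) -> bool:
    """A message is accepted only if it is not larger than the cap."""
    return message_bytes <= cap_bytes


# With grpcio installed, the same list can be passed on both sides:
#   grpc.insecure_channel(target, options=message_size_options())
#   grpc.server(executor, options=message_size_options())
```

The 5799108-byte message from the error above fails fits() against
DEFAULT_CAP and passes against any cap of 8 MiB or more, which covers
the 5-8 MiB image payloads described earlier.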

Co-Authored-By: Proof of Concept <poc@bcachefs.org>