deepseek-vision
Use whenever the user references an image (local file path or http/https URL — screenshot, photo, diagram, UI capture, chart, error dialog) and you need to know what's in it to answer or act. Calls a vision model (Qwen3.6-Flash by default) via DashScope and returns a text description you can reason over. Especially important when running on a text-only backend like DeepSeek V4, but also useful as a dedicated OCR / detail extractor even when the main model is multimodal.
Changelog: Source: GitHub https://github.com/Agents365-ai/dsclaude
Loading comments...