LobeChat
Ctrl K
Back to Discovery
DeepSeek

DeepSeek VL2

deepseek-ai/deepseek-vl2
DeepSeek-VL2 is a mixture of experts (MoE) visual language model developed based on DeepSeekMoE-27B, employing a sparsely activated MoE architecture that achieves outstanding performance while activating only 4.5 billion parameters. This model excels in various tasks, including visual question answering, optical character recognition, document/table/chart understanding, and visual localization.
4K

Providers Supporting This Model

DeepSeek
SiliconCloudSiliconCloud
DeepSeekdeepseek-ai/deepseek-vl2
Maximum Context Length
4K
Maximum Output Length
--
Input Price
$0.14
Output Price
$0.14

Model Parameters

Randomness
temperature

This setting affects the diversity of the model's responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. When set to 0, the model always gives the same response to a given input. View Documentation

Type
FLOAT
Default Value
1.00
Range
0.00 ~ 2.00
Nucleus Sampling
top_p

This setting limits the model's selection to a certain proportion of the most likely vocabulary: only selecting those top words whose cumulative probability reaches P. Lower values make the model's responses more predictable, while the default setting allows the model to choose from the entire range of vocabulary. View Documentation

Type
FLOAT
Default Value
1.00
Range
0.00 ~ 1.00
Topic Freshness
presence_penalty

This setting aims to control the reuse of vocabulary based on its frequency in the input. It attempts to use less of those words that appear more frequently in the input, with usage frequency proportional to occurrence frequency. Vocabulary penalties increase with frequency of occurrence. Negative values encourage vocabulary reuse. View Documentation

Type
FLOAT
Default Value
0.00
Range
-2.00 ~ 2.00
Frequency Penalty
frequency_penalty

This setting adjusts the frequency at which the model reuses specific vocabulary that has already appeared in the input. Higher values reduce the likelihood of such repetition, while negative values have the opposite effect. Vocabulary penalties do not increase with frequency of occurrence. Negative values encourage vocabulary reuse. View Documentation

Type
FLOAT
Default Value
0.00
Range
-2.00 ~ 2.00
Single Response Limit
max_tokens

This setting defines the maximum length that the model can generate in a single response. Setting a higher value allows the model to produce longer replies, while a lower value restricts the length of the response, making it more concise. Adjusting this value appropriately based on different application scenarios can help achieve the desired response length and level of detail. View Documentation

Type
INT
Default Value
--
Reasoning Intensity
reasoning_effort

This setting controls the intensity of reasoning the model applies before generating a response. Low intensity prioritizes response speed and saves tokens, while high intensity provides more comprehensive reasoning but consumes more tokens and slows down response time. The default value is medium, balancing reasoning accuracy with response speed. View Documentation

Type
STRING
Default Value
--
Range
low ~ high

Related Models

DeepSeek

DeepSeek R1

deepseek-ai/DeepSeek-R1
DeepSeek-R1 is a reinforcement learning (RL) driven inference model that addresses issues of repetitiveness and readability within the model. Prior to RL, DeepSeek-R1 introduced cold start data to further optimize inference performance. It performs comparably to OpenAI-o1 in mathematical, coding, and reasoning tasks, and enhances overall effectiveness through meticulously designed training methods.
64K
DeepSeek

DeepSeek V3

deepseek-ai/DeepSeek-V3
DeepSeek-V3 is a mixture of experts (MoE) language model with 671 billion parameters, utilizing multi-head latent attention (MLA) and the DeepSeekMoE architecture, combined with a load balancing strategy that does not rely on auxiliary loss, optimizing inference and training efficiency. Pre-trained on 14.8 trillion high-quality tokens and fine-tuned with supervision and reinforcement learning, DeepSeek-V3 outperforms other open-source models and approaches leading closed-source models in performance.
64K
DeepSeek

DeepSeek V2.5

deepseek-ai/DeepSeek-V2.5
DeepSeek V2.5 combines the excellent features of previous versions, enhancing general and coding capabilities.
32K
Qwen

QVQ 72B Preview

Qwen/QVQ-72B-Preview
QVQ-72B-Preview is a research-oriented model developed by the Qwen team, focusing on visual reasoning capabilities, with unique advantages in understanding complex scenes and solving visually related mathematical problems.
32K
Qwen

QwQ 32B Preview

Qwen/QwQ-32B-Preview
QwQ-32B-Preview is Qwen's latest experimental research model, focusing on enhancing AI reasoning capabilities. By exploring complex mechanisms such as language mixing and recursive reasoning, its main advantages include strong analytical reasoning, mathematical, and programming abilities. However, it also faces challenges such as language switching issues, reasoning loops, safety considerations, and differences in other capabilities.
32K