
Learn the Most Important AI Models
DeepSeek: Open Source and Strength in Coding
Reading time: approx. 8 min
After navigating through the major commercial AI models like ChatGPT, Claude, Gemini and Grok, it is time to look at another important category of AI models: the open source models. Among these, DeepSeek has emerged as an interesting and powerful player, especially known for its impressive performance in coding, mathematics and general reasoning. The DeepSeek family spans everything from Coder-V2 (128K context, top-tier coding performance) to the newer DeepSeek-V3 (a 671B MoE model trained on 14.8T tokens, with 128K context). Image generation is handled by the sister model DeepSeek-Janus-Pro-7B.
What you will learn
- What an open source model is and its advantages.
- Which DeepSeek models are relevant and their unique strengths.
- How DeepSeek can be used for coding, problem-solving and creative tasks in teaching.
- DeepSeek's language handling in Swedish and limitations.
- Important considerations regarding country of origin, data protection and ethical risks when using open source models in schools.
The Basics: What is DeepSeek?
DeepSeek is a family of large language models developed by DeepSeek-AI. Unlike models like ChatGPT or Gemini, which are primarily available via cloud services with closed source code, DeepSeek models are often released openly, under the MIT license for code together with model-specific licenses for the weights. This means that researchers, developers and even technically skilled schools can download and run the models on their own hardware, or adapt them for specific purposes.
Advantages of Open Source
An open source model offers several advantages:
- Transparency: The code is public, which enables review and understanding of how the model works.
- Adaptability: Users can fine-tune the model with their own data, which can be relevant for specific pedagogical needs.
- Potential for local execution: The ability to run the model on your own infrastructure can offer greater control over data protection, as information does not need to be sent to an external cloud service. However, it should be noted that even the 16B-Lite variants often require at least one 80 GB GPU, while the larger 236B/671B models need several powerful A100/H100 cards; in practice, this often means you still have to rely on cloud-based services to run them.
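The hardware figures above follow from a simple rule of thumb: the weights alone occupy roughly (parameter count) × (bytes per weight), plus overhead for activations and the KV cache. A minimal sketch of that back-of-the-envelope estimate (the 1.2 overhead factor is an illustrative assumption, not a measured value):

```python
def vram_estimate_gb(params_billion: float, bytes_per_param: int = 2,
                     overhead: float = 1.2) -> float:
    """Rough VRAM needed to serve a model: weights at the given
    precision (2 bytes = fp16/bf16) times a flat overhead factor
    for activations and KV cache. Illustrative, not exact."""
    return params_billion * bytes_per_param * overhead

print(round(vram_estimate_gb(16), 1))   # 16B-Lite at fp16: ~38.4 GB of weights
print(round(vram_estimate_gb(236), 1))  # 236B at fp16: ~566.4 GB, several 80 GB cards
```

Real deployments need more than the bare weight footprint (batching, long-context KV caches), which is why the stated requirements above are higher than this estimate. Note also that MoE models activate only a fraction of their parameters per token, but all weights must still be resident in memory.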
Relevant DeepSeek Models
The DeepSeek family consists of several models, often optimized for different tasks:
- DeepSeek-Coder-V2: This model specializes in coding and is based on a "Mixture-of-Experts" (MoE) architecture. It was trained by continuing pre-training from the V2 checkpoint with an additional 6 trillion tokens, giving a total training volume of around 7 to 7.5 trillion tokens. It supports over 300 programming languages and has a long context window of 128K tokens. The larger 236B variant of Coder-V2 (21B active parameters) has been reported to match or surpass GPT-4o on the HumanEval benchmark, demonstrating exceptional capability.
- DeepSeek-MoE: A more general base model based on MoE architecture, which excels in general reasoning and mathematical problem-solving, in addition to coding.
- DeepSeek-V3: The latest and most advanced version, a 671B MoE model with 37 billion activated parameters per token. It performs strongly on tasks such as generating program code, writing novels and articles, and financial analysis. It also has a context window of 128K tokens. However, V3 is primarily a text model: it has no built-in image decoder, and image understanding is handled via external tooling rather than native multimodality.
- DeepSeek-R1: This series (launched January 2025) is specially optimized for reasoning, trained with reinforcement learning to produce explicit chains of thought before answering.
- DeepSeek-Prover: A model series whose April 2025 upgrade further strengthens DeepSeek's capabilities in formal mathematics and theorem proving.
- DeepSeek-Janus-Pro-7B: This is DeepSeek's dedicated model for image generation (launched January 2025), which is a sister model rather than part of the V3 core. DeepSeek-Janus-Pro-7B has been shown to beat DALL-E 3 in certain test benchmarks for image generation.
Strengths: What is DeepSeek Good At?
- Outstanding coding capability: Especially DeepSeek-Coder-V2 ranks very highly in coding benchmarks and outperforms many other models in generating, debugging and explaining code.
- Practical example: "Write a Python function that calculates the Fibonacci sequence iteratively" or "Find and explain the bug in this JavaScript code."
- Mathematics and reasoning: DeepSeek-MoE, the R1 series and Prover show strong results in mathematical problems and complex reasoning tasks, making them useful for STEM subjects.
- Practical example: "Explain the concept of derivative for a high school student and give a practical example."
- Long context windows: With context windows up to 128K tokens, DeepSeek can handle and reason about very large texts or code files.
- High-quality image generation (via Janus-Pro): DeepSeek-Janus-Pro-7B offers impressive image generation capability.
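The first practical prompt above, iterative Fibonacci, is exactly the kind of function a coding model is expected to produce. A reference version to compare a model's answer against:

```python
def fibonacci(n: int) -> list[int]:
    """Return the first n Fibonacci numbers, computed iteratively."""
    if n <= 0:
        return []
    seq = [0]
    a, b = 0, 1
    for _ in range(n - 1):
        seq.append(b)
        a, b = b, a + b
    return seq

print(fibonacci(8))  # [0, 1, 1, 2, 3, 5, 8, 13]
```

Having a known-correct version at hand also makes the debugging prompts more concrete: students can compare the model's explanation of a bug against behavior they can verify themselves.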
Swedish Language Handling and Image Generation
- Swedish: DeepSeek models are primarily trained on English and code. Although they can understand and generate text in Swedish, community tests have shown inconsistent fluency in smaller languages such as Swedish and German. Some users have reported problems with "language panic", where the model mixes in other languages or fails to stay in the requested language.
- Tips: To improve Swedish performance, especially in DeepSeek-Coder-V2 (Instruct) and R1, it is recommended to use a system prompt like "You are a Swedish assistant" and set the temperature to less than or equal to 0.7. However, results can still be unstable. For critical tasks, it is often best to prompt in English and then translate the output.
- Image generation: DeepSeek-Janus-Pro-7B is the model for image generation within the DeepSeek family.
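The tips above translate directly into an API request. DeepSeek's hosted API follows the OpenAI chat-completions format; the sketch below only composes the request payload, without sending it (the model name "deepseek-chat" is an assumption to verify against current documentation):

```python
def build_request(user_prompt: str, temperature: float = 0.7) -> dict:
    """Compose a chat request with a Swedish system prompt and a
    capped temperature, per the tips above. Payload only; sending it
    requires an API key and the provider's endpoint."""
    if temperature > 0.7:
        raise ValueError("keep temperature <= 0.7 for more stable Swedish output")
    return {
        "model": "deepseek-chat",  # assumed model name; check current docs
        "messages": [
            {"role": "system", "content": "You are a Swedish assistant."},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
    }

req = build_request("Explain the concept of derivative for a high school student.")
print(req["temperature"])  # 0.7
```

Keeping the system prompt and temperature in one helper makes it easy to enforce the same settings across a whole class's experiments, and to switch to English prompting for critical tasks as suggested above.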
Country of Origin and Security Risks
DeepSeek is developed by the Chinese company Hangzhou DeepSeek AI (杭州深度求索), founded in 2023 in Hangzhou, Zhejiang. The company is partly funded by the Chinese hedge fund High-Flyer. Nvidia CEO Jensen Huang recently highlighted DeepSeek as a "world-class model" during an event in Beijing.
However, several European and North American authorities have warned against using Chinese AI models in sensitive environments, including the public sector and education, due to potential security and privacy risks:
- Risk of data sharing under Chinese law: Chinese intelligence and cybersecurity legislation (e.g. Articles 7, 10 and 35) can compel Chinese companies to hand over data to the state, even when the data is stored outside China. The Czech Republic's cybersecurity authority (NÚKIB) has, for example, banned the use of DeepSeek in the public sector on these grounds.
- National security: The US House Select Committee on the CCP (Chinese Communist Party) has described DeepSeek as a "serious threat" to national security. The British government takes a similar line and "monitors potential threats" from Chinese AI actors.
- Privacy gaps and weak protection filters: Independent reviews and technical reports point to fewer built-in protection filters and potentially weaker privacy and security routines compared to Western counterparts. The hosted services can log prompts and responses for extended periods. CSIS (Center for Strategic and International Studies) has also found that the lack of protection filters makes it easy to generate harmful code, such as ransomware.
- Potential for misuse: The open weight design combined with weaker "guardrails" can potentially lead to the model being misused to create ransomware, deepfakes or directed propaganda.
- Geopolitical uncertainty: Several countries have already implemented (or are considering) blocking or strict regulations. Sudden policy changes or export controls can make it difficult to get updates or support in the future.
What does this mean for schools? If a school uses DeepSeek via the company's own cloud services, there is a risk that student or staff data ends up on servers in China and becomes subject to Chinese legislation. If the model is run entirely locally on the school's own servers, the risk of data transfer is mitigated, but the school must then take full responsibility for safety filters, moderation, patching and infrastructure. Before any pilot with DeepSeek, a thorough DPIA (Data Protection Impact Assessment) should be carried out.
Practical Examples in the Classroom
- Programming teaching: Use DeepSeek-Coder-V2 to help students understand complex code snippets, debug their programs or generate example code for specific tasks. It can be a "programming tutor" for both teachers and students.
- Mathematical problem-solving: For high school students, DeepSeek-MoE, the R1 series and Prover can be used to explore different ways to solve mathematical problems or explain complex concepts step by step.
- Content generation for teachers: Use DeepSeek to quickly generate drafts for text-based teaching materials, assignments or test questions that require strong logical reasoning or coding.
- Visual projects (via Janus-Pro): For creative projects where image generation is needed, DeepSeek-Janus-Pro-7B can offer high-quality images.
Next Steps
Now that we have covered a representative of the open source models and their unique challenges, the next module takes a closer look at Meta Llama, another significant player in the open source AI landscape. Meta has taken a leading role in making powerful AI models available to researchers and developers globally, driving innovation and transparency in the industry.
