Getting Started with Qwen API
This page will help you get started with Qwen API. You'll be up and running in a jiffy!
Qwen API User Guide
A comprehensive guide to understanding and using the Qwen API Proxy effectively.
Table of Contents
- Getting Started
- Understanding Authentication
- Choosing the Right Model
- Feature Guide
- File Upload Guidelines
- Best Practices
- Common Use Cases
- Error Handling
- Troubleshooting
- FAQ
Getting Started
What You Need
Before you begin using the Qwen API, make sure you have:
- A Qwen Account - Sign up at chat.qwen.ai if you don't have one
- Your Authentication Token - Extract it using the browser script provided in the README
- An HTTP Client - Any tool that can make HTTP requests (cURL, Postman, your application, etc.)
First Steps
- Get Your Token: Visit the Qwen website, run the token extraction script in your browser console, and copy the generated token
- Validate Your Token: Use the
/v1/validateendpoint to ensure your token is working - List Available Models: Check
/v1/modelsto see all models you can use - Make Your First Request: Try a simple chat completion to get familiar with the API
API Base URL
All requests should be sent to: https://qwen.aikit.club
Understanding Authentication
What is the Token?
The authentication token is a compressed string that contains your Qwen credentials. It's like a password that proves you have access to use the Qwen AI services.
Token Format
- Starts with
H4sIAAAAAAAAA... - Is compressed using gzip for security
- Contains both your access token and session cookie
- Typically several hundred characters long
Token Lifecycle
Valid Token → Make Requests → Token Expires → Refresh Token → Continue Using
When to Refresh
Your token will eventually expire. You'll know it's expired when you start getting authentication errors. Use the /v1/refresh endpoint to get a new token without logging in again.
Security Tips
- ✅ Store tokens securely (environment variables, secure storage)
- ✅ Never commit tokens to version control
- ✅ Use HTTPS only (never HTTP)
- ✅ Rotate tokens periodically
- ❌ Don't share tokens publicly
- ❌ Don't hardcode tokens in your application
Choosing the Right Model
Quick Model Selector
Need general conversation?
→ Use qwen-max-latest (best all-around)
Need fast responses?
→ Use qwen2.5-turbo (optimized for speed)
Need web search?
→ Use qwen2.5-max (includes web search capability)
Need deep analysis?
→ Use QVQ-Max or QWQ-32B (advanced reasoning)
Need comprehensive research?
→ Use qwen-deep-research (multi-phase research with citations)
Need code generation?
→ Use qwen3-coder-plus (specialized for coding)
Need web development?
→ Use qwen-web-dev (frontend UI/UX specialist)
Need full application?
→ Use qwen-full-stack (complete stack development)
Need vision analysis?
→ Use qwen2.5-max or qwen-max-latest (both support images)
Need very long context?
→ Use qwen2.5-14b-1m (supports 1 million tokens)
Need multimodal (audio/video)?
→ Use qwen2.5-omni-7b (handles multiple media types)
Model Capabilities Overview
Each model has different strengths. Check the Model Capabilities table in the README to see which features each model supports:
- 👁️ Vision - Can analyze images and visual content
- 💡 Reasoning - Advanced thinking and problem-solving
- 🌐 Web Search - Can search the web for current information
- 🔧 Tool Calling - Can use external tools and functions
Feature Guide
1. Basic Chat Conversations
What it does: Have text-based conversations with AI
When to use: General questions, content generation, explanations, creative writing
Best models: qwen-max-latest, qwen2.5-plus, qwen2.5-turbo
Tips:
- Keep messages clear and specific
- Provide context when needed
- Use system messages to set behavior
- Enable streaming for real-time responses
2. Web Search Integration
What it does: AI can search the web and use current information
When to use: Questions about current events, recent developments, real-time data
Requirements:
- Must use a model with web search capability (check table)
- Include
tools: [{ type: "web_search" }]in your request
Tips:
- Ask about current topics ("What happened today...")
- Request recent information ("Latest news about...")
- Specify time frames ("In the last week...")
3. Thinking Mode (Reasoning)
What it does: AI shows its reasoning process and thinks through complex problems
When to use: Math problems, logic puzzles, complex analysis, step-by-step solutions
Requirements:
- Enable with
enable_thinking: true - Set a thinking budget (recommended: 30000 tokens)
Tips:
- Use for problems that need careful reasoning
- Ask for step-by-step explanations
- Works best with reasoning-capable models
4. Vision and Image Analysis
What it does: AI can see and understand images, photos, screenshots, diagrams
When to use: Image description, visual analysis, OCR, chart interpretation
Supported formats: JPG, PNG, GIF, WebP (most common)
Tips:
- Provide clear, high-quality images
- Ask specific questions about what you want to know
- Combine multiple images for comparison (up to 5)
- Works with URLs or base64-encoded images
5. Document Analysis
What it does: AI can read and analyze PDF documents, text files, and other documents
When to use: Document summarization, information extraction, content analysis
Supported formats: PDF, TXT, MD (most common), DOC, DOCX
Tips:
- Keep documents under 20MB
- Can combine with images (e.g., PDF + photo)
- Ask specific questions about document content
- Request summaries, key points, or specific information
6. Deep Research
What it does: Comprehensive multi-phase research with web search and citations
When to use: Academic research, market analysis, in-depth investigations
Model: Must use qwen-deep-research
Features:
- Multi-phase research process
- Web search integration
- Source validation
- Automatic citations
- Comprehensive reports
- Export to PDF/Markdown
Tips:
- Ask broad research questions
- Let the AI complete all research phases
- Review citations for accuracy
- Download reports for offline use
7. Image Generation
What it does: Create new images from text descriptions
When to use: Creating art, illustrations, concept designs, visual content
Options:
- Size:
1024x1024,1792x1024,1024x1792 - Quality: Standard or HD
- Style: Natural or vivid
Tips:
- Be descriptive in your prompts
- Specify style, mood, colors
- Mention composition and perspective
- Iterate with edits for refinement
8. Image Editing
What it does: Modify existing images using text instructions
When to use: Changing image elements, style adjustments, adding/removing objects
Input: Original image + editing instructions
Tips:
- Provide clear editing instructions
- Use simple, direct commands
- One change at a time works best
- Can use generated images as input
9. Video Generation
What it does: Create videos from text descriptions
When to use: Video content creation, animations, visual storytelling
Options:
- Size:
1280x720,1920x1080 - Duration: 5-60 seconds
Tips:
- Describe motion and action clearly
- Specify scene changes
- Keep descriptions focused
- Longer videos take more time
10. Code Generation
What it does: Generate code, debug, explain programming concepts
Model: Best with qwen3-coder-plus
Features:
- Multi-language support
- Function/tool calling
- Code explanation
- Bug fixing
- Code optimization
Tips:
- Specify programming language
- Describe functionality clearly
- Ask for comments and documentation
- Request error handling
11. Web Development
What it does: Create web components, UI elements, responsive designs
Model: Use qwen-web-dev
Output: HTML, CSS, JavaScript
Features:
- Responsive design
- Modern CSS
- Interactive components
- Framework support (React, Vue)
- Accessibility considerations
Tips:
- Describe desired functionality
- Specify design preferences
- Mention framework if needed
- Ask for responsive design
12. Full-Stack Development
What it does: Build complete applications with frontend, backend, database
Model: Use qwen-full-stack
Output: Complete application code
Features:
- Frontend frameworks
- Backend APIs
- Database schemas
- Authentication
- Deployment-ready code
Tips:
- Describe full requirements
- Specify tech stack preferences
- Mention scalability needs
- Request security features
File Upload Guidelines
Understanding File Categories
Files are grouped into two main categories:
Media Files (Image, Audio, Video):
- All considered the same category
- Cannot combine different media types
- Examples: JPG + MP4 = ❌ Invalid
Documents (PDF, Text files):
- Separate category from media
- Can combine with media files
- Examples: JPG + PDF = ✅ Valid
What You Can Upload
Valid Single File Types:
- ✅ One or more images (up to 5)
- ✅ One audio file
- ✅ One video file
- ✅ One or more documents (up to 5)
Valid Combinations:
- ✅ Multiple images only
- ✅ Image(s) + Document(s)
- ✅ Audio + Document(s)
- ✅ Video + Document(s)
- ✅ Single media file only
Invalid Combinations:
- ❌ Image + Audio
- ❌ Image + Video
- ❌ Audio + Video
- ❌ Multiple videos
- ❌ Multiple audio files
Size and Duration Limits
Images:
- Maximum: 10MB per image
- Count: Up to 5 images
- Formats: JPG, PNG, GIF, WebP recommended
Audio:
- Maximum: 100MB
- Duration: Up to 3 minutes (180 seconds)
- Count: 1 file only
- Formats: MP3, WAV, M4A, AAC recommended
Video:
- Maximum: 500MB
- Duration: Up to 10 minutes (600 seconds)
- Count: 1 file only
- Formats: MP4, MOV, AVI, MKV recommended
Documents:
- Maximum: 20MB per document
- Count: Up to 5 documents
- Formats: PDF, TXT, MD recommended
Upload Methods
Method 1: URL Provide a direct URL to the file hosted online
Method 2: Base64 Encode the file as base64 data and include it directly
Method 3: File Upload Use multipart form data to upload the file directly
Best Practices
Request Optimization
Stream for Real-Time:
- Enable streaming for chatbot-like experiences
- Provides instant feedback to users
- Better for long responses
Non-Stream for Complete Data:
- Use when you need the full response at once
- Better for batch processing
- Easier error handling
Message Management
System Messages:
- Set AI behavior and personality
- Define response format
- Establish context
User Messages:
- Ask clear, specific questions
- Provide necessary context
- Keep conversations focused
Assistant Messages:
- Include previous AI responses for context
- Maintain conversation history
- Build on previous exchanges
Context Management
Keep Relevant History:
- Include important previous messages
- Don't send entire conversation every time
- Focus on recent relevant context
Token Awareness:
- Longer conversations use more tokens
- Summarize old conversations if needed
- Reset context when changing topics
Rate Limiting
Respect Limits:
- Don't spam requests rapidly
- Implement delays between requests
- Use queuing for bulk operations
Handle Failures:
- Retry with exponential backoff
- Don't retry immediately
- Log errors for debugging
Error Recovery
Graceful Degradation:
- Have fallback options
- Inform users of issues
- Don't crash on errors
Token Refresh:
- Detect expiration automatically
- Refresh before making request
- Store new token securely
Common Use Cases
1. Building a Chatbot
Goal: Create an interactive conversational AI
Steps:
- Initialize with system message
- Maintain conversation history
- Stream responses for real-time feel
- Handle context window limits
- Reset conversation when needed
Best Practices:
- Keep last 10-20 messages
- Use streaming for better UX
- Implement typing indicators
- Handle connection issues
2. Document Summarization
Goal: Extract key information from documents
Steps:
- Upload document (PDF, TXT)
- Ask for summary
- Request specific sections
- Extract key points
Best Practices:
- Use clear, focused questions
- Specify desired length
- Ask for structured output
- Verify important information
3. Image Analysis
Goal: Understand and describe visual content
Steps:
- Provide image URL or file
- Ask specific questions
- Request detailed analysis
- Compare multiple images
Best Practices:
- Use high-quality images
- Ask focused questions
- Specify what to look for
- Combine with text context
4. Research Assistant
Goal: Conduct comprehensive research
Steps:
- Use
qwen-deep-researchmodel - Provide research topic
- Wait for all phases to complete
- Review sources and citations
- Download report if needed
Best Practices:
- Start with clear research question
- Allow time for completion
- Verify citations
- Use non-streaming mode
5. Content Generation
Goal: Create written content
Steps:
- Describe content type
- Specify tone and style
- Set length requirements
- Iterate and refine
Best Practices:
- Provide examples if possible
- Be specific about requirements
- Review and edit output
- Regenerate if needed
6. Code Assistant
Goal: Help with programming tasks
Steps:
- Use
qwen3-coder-plusmodel - Describe functionality
- Specify language/framework
- Request explanations
- Ask for improvements
Best Practices:
- Include context about project
- Specify error handling needs
- Request code comments
- Test generated code
7. Visual Content Creation
Goal: Generate images and videos
Steps:
- Write detailed prompts
- Specify size and style
- Generate initial content
- Edit and refine
- Download final result
Best Practices:
- Be descriptive
- Iterate on results
- Use editing for refinement
- Save successful prompts
8. Multi-Language Support
Goal: Translate or work in multiple languages
Steps:
- Specify source and target languages
- Provide context
- Request translation or generation
- Verify accuracy
Best Practices:
- Specify language explicitly
- Provide cultural context
- Verify translations
- Use native speakers for review
Error Handling
Common Errors and Solutions
Authentication Errors (401)
Symptoms: "Invalid API key" or "Unauthorized"
Solutions:
- Validate your token using
/v1/validate - Refresh token using
/v1/refresh - Regenerate token from browser
- Check token format is correct
Invalid Request (400)
Symptoms: "Bad request" or "Invalid parameters"
Solutions:
- Check request format matches API spec
- Verify all required fields are present
- Validate JSON syntax
- Check parameter values are valid
Rate Limiting (429)
Symptoms: "Too many requests"
Solutions:
- Slow down request rate
- Implement request queuing
- Add delays between requests
- Use exponential backoff
File Too Large (413)
Symptoms: "File size exceeds limit"
Solutions:
- Compress images before upload
- Use smaller file sizes
- Split large documents
- Check size limits for file type
Unsupported Format (415)
Symptoms: "Unsupported media type"
Solutions:
- Use supported file formats
- Convert files to compatible format
- Check format is in allowed list
- Verify MIME type is correct
Server Error (500)
Symptoms: "Internal server error"
Solutions:
- Retry request after delay
- Check if service is operational
- Report persistent errors
- Use different endpoint if available
Error Response Structure
Every error response includes:
- message: Human-readable error description
- type: Error category
- code: Specific error code
- param: Which parameter caused error (if applicable)
Retry Strategies
Immediate Retry (Network issues):
- Retry 1-2 times immediately
- For temporary connection problems
- Don't retry on auth errors
Exponential Backoff (Rate limits):
- Wait 1s, 2s, 4s, 8s between retries
- For 429 or temporary 500 errors
- Give up after 3-5 attempts
No Retry (Client errors):
- Don't retry 400, 401, 403, 415
- Fix the request instead
- These indicate client-side problems
Troubleshooting
Token Issues
Problem: Token not working
Diagnostic Steps:
- Check token format (starts with H4sIAAAAAAAAA)
- Validate using
/v1/validateendpoint - Check if token is expired
- Verify you're logged into Qwen
- Regenerate token from browser
Common Causes:
- Token copied incorrectly
- Session expired
- Account issues
- Wrong token format
Streaming Problems
Problem: Stream cuts off or doesn't work
Diagnostic Steps:
- Verify
stream: trueis set - Check you're handling SSE format
- Look for
[DONE]marker - Test with non-streaming first
- Check network/proxy settings
Common Causes:
- Incorrect SSE parsing
- Network interruptions
- Proxy buffering
- Client timeout too short
File Upload Failures
Problem: Can't upload files
Diagnostic Steps:
- Check file size against limits
- Verify file format is supported
- Test with smaller file
- Check file is not corrupted
- Verify URL is accessible (if using URL)
Common Causes:
- File too large
- Unsupported format
- Mixing incompatible file types
- Network issues
Model Not Working
Problem: Specific model gives errors
Diagnostic Steps:
- Check model name spelling
- Verify model exists (
/v1/models) - Check model supports requested feature
- Try with default model
- Check capability matrix
Common Causes:
- Typo in model name
- Model doesn't support feature
- Model deprecated or unavailable
- Feature flag not set
Slow Responses
Problem: Requests take too long
Diagnostic Steps:
- Check if large file uploads
- Verify network connection
- Try simpler request
- Use faster model (turbo)
- Enable streaming
Common Causes:
- Large files being processed
- Complex thinking/research tasks
- Network latency
- Server load
- Long generation requested
Debug Logs Not Showing
Problem: Can't see Request/Response IDs
Diagnostic Steps:
- Check you're looking at response content
- Verify logs are at the end
- Look for collapsible section
- Check streaming vs non-streaming
- Verify response completed
Common Causes:
- Stream interrupted
- Looking in wrong place
- Response cut off
- Client filtering logs
FAQ
General Questions
Q: Is this service free to use?
A: Yes, the proxy itself is free. However, you need a Qwen account, and any usage limits depend on your Qwen account tier. Free Qwen accounts have certain limitations, while premium accounts have higher limits.
Q: Are there rate limits?
A: The proxy doesn't impose its own rate limits. All limitations come from your Qwen account. Free accounts have lower limits than paid accounts.
Q: Can I use this in production?
A: Yes, but keep in mind this is an unofficial proxy. For production use, implement proper error handling, token refresh logic, and have fallback strategies.
Q: How do I get support?
A: Open an issue on GitHub for bugs or feature requests. Check existing issues and discussions for common questions.
Q: Can I self-host this?
A: Yes! The code is open source. You can deploy it to your own Cloudflare Workers account or adapt it for other platforms.
Authentication
Q: How long do tokens last?
A: Tokens expire based on Qwen's session management. You'll need to refresh or regenerate them periodically.
Q: Can I use the same token from multiple applications?
A: Yes, one token can be used across multiple applications, but be aware of rate limits.
Q: What if my token gets leaked?
A: Regenerate immediately from the Qwen website and update it in your applications.
Q: Do I need to refresh tokens manually?
A: You can implement automatic refresh using the /v1/refresh endpoint when you detect expiration.
Models
Q: Which model should I use?
A: For general use, start with qwen-max-latest. For specific needs, refer to the Model Selection guide above.
Q: Can I switch models mid-conversation?
A: Yes, but you'll need to create a new chat. Models don't share conversation context.
Q: Are all models always available?
A: Availability depends on your Qwen account. Some models may require premium access.
Q: What's the difference between Max and Turbo?
A: Max is more powerful and thorough, Turbo is faster but may be less detailed.
Features
Q: Can I use web search with any model?
A: No, only models with web search capability (check the capabilities table) support this feature.
Q: How do I enable thinking mode?
A: Add enable_thinking: true to your request and use a reasoning-capable model.
Q: Can I combine multiple features?
A: Yes! For example, you can use web search + thinking mode + vision together if the model supports all three.
Q: What's the maximum file size?
A: It varies by type: Images (10MB), Audio (100MB), Video (500MB), Documents (20MB).
Technical
Q: Is streaming faster than non-streaming?
A: Streaming provides faster first response but same total time. It's better for user experience.
Q: Can I cancel a request in progress?
A: You can close the connection, but the server may continue processing.
Q: How do I handle long conversations?
A: Keep only recent relevant messages to avoid hitting context limits. Summarize old conversations if needed.
Q: Can I use this with OpenAI SDKs?
A: Yes! Just change the base URL to https://qwen.aikit.club and use your Qwen token.
Troubleshooting
Q: Why am I getting "Invalid API key" errors?
A: Your token may be expired, invalid, or incorrectly formatted. Try validating or regenerating it.
Q: Files aren't uploading properly
A: Check file size, format, and that you're not mixing incompatible file types (e.g., image + video).
Q: Responses are cut off
A: This could be token limits, connection issues, or streaming problems. Try non-streaming mode first.
Q: Can I get request logs for debugging?
A: Yes, response content includes debug logs with Request ID and Response ID at the end.
Updated about 1 month ago
