Getting Started with Qwen API

This page will help you get started with Qwen API. You'll be up and running in a jiffy!


Qwen API User Guide

A comprehensive guide to understanding and using the Qwen API Proxy effectively.


Table of Contents


Getting Started

What You Need

Before you begin using the Qwen API, make sure you have:

  1. A Qwen Account - Sign up at chat.qwen.ai if you don't have one
  2. Your Authentication Token - Extract it using the browser script provided in the README
  3. An HTTP Client - Any tool that can make HTTP requests (cURL, Postman, your application, etc.)

First Steps

  1. Get Your Token: Visit the Qwen website, run the token extraction script in your browser console, and copy the generated token
  2. Validate Your Token: Use the /v1/validate endpoint to ensure your token is working
  3. List Available Models: Check /v1/models to see all models you can use
  4. Make Your First Request: Try a simple chat completion to get familiar with the API

API Base URL

All requests should be sent to: https://qwen.aikit.club


Understanding Authentication

What is the Token?

The authentication token is a compressed string that contains your Qwen credentials. It's like a password that proves you have access to use the Qwen AI services.

Token Format

  • Starts with H4sIAAAAAAAAA...
  • Is compressed using gzip for security
  • Contains both your access token and session cookie
  • Typically several hundred characters long

Token Lifecycle

Valid TokenMake RequestsToken ExpiresRefresh TokenContinue Using

When to Refresh

Your token will eventually expire. You'll know it's expired when you start getting authentication errors. Use the /v1/refresh endpoint to get a new token without logging in again.

Security Tips

  • ✅ Store tokens securely (environment variables, secure storage)
  • ✅ Never commit tokens to version control
  • ✅ Use HTTPS only (never HTTP)
  • ✅ Rotate tokens periodically
  • ❌ Don't share tokens publicly
  • ❌ Don't hardcode tokens in your application

Choosing the Right Model

Quick Model Selector

Need general conversation? → Use qwen-max-latest (best all-around)

Need fast responses? → Use qwen2.5-turbo (optimized for speed)

Need web search? → Use qwen2.5-max (includes web search capability)

Need deep analysis? → Use QVQ-Max or QWQ-32B (advanced reasoning)

Need comprehensive research? → Use qwen-deep-research (multi-phase research with citations)

Need code generation? → Use qwen3-coder-plus (specialized for coding)

Need web development? → Use qwen-web-dev (frontend UI/UX specialist)

Need full application? → Use qwen-full-stack (complete stack development)

Need vision analysis? → Use qwen2.5-max or qwen-max-latest (both support images)

Need very long context? → Use qwen2.5-14b-1m (supports 1 million tokens)

Need multimodal (audio/video)? → Use qwen2.5-omni-7b (handles multiple media types)

Model Capabilities Overview

Each model has different strengths. Check the Model Capabilities table in the README to see which features each model supports:

  • 👁️ Vision - Can analyze images and visual content
  • 💡 Reasoning - Advanced thinking and problem-solving
  • 🌐 Web Search - Can search the web for current information
  • 🔧 Tool Calling - Can use external tools and functions

Feature Guide

1. Basic Chat Conversations

What it does: Have text-based conversations with AI

When to use: General questions, content generation, explanations, creative writing

Best models: qwen-max-latest, qwen2.5-plus, qwen2.5-turbo

Tips:

  • Keep messages clear and specific
  • Provide context when needed
  • Use system messages to set behavior
  • Enable streaming for real-time responses

2. Web Search Integration

What it does: AI can search the web and use current information

When to use: Questions about current events, recent developments, real-time data

Requirements:

  • Must use a model with web search capability (check table)
  • Include tools: [{ type: "web_search" }] in your request

Tips:

  • Ask about current topics ("What happened today...")
  • Request recent information ("Latest news about...")
  • Specify time frames ("In the last week...")

3. Thinking Mode (Reasoning)

What it does: AI shows its reasoning process and thinks through complex problems

When to use: Math problems, logic puzzles, complex analysis, step-by-step solutions

Requirements:

  • Enable with enable_thinking: true
  • Set a thinking budget (recommended: 30000 tokens)

Tips:

  • Use for problems that need careful reasoning
  • Ask for step-by-step explanations
  • Works best with reasoning-capable models

4. Vision and Image Analysis

What it does: AI can see and understand images, photos, screenshots, diagrams

When to use: Image description, visual analysis, OCR, chart interpretation

Supported formats: JPG, PNG, GIF, WebP (most common)

Tips:

  • Provide clear, high-quality images
  • Ask specific questions about what you want to know
  • Combine multiple images for comparison (up to 5)
  • Works with URLs or base64-encoded images

5. Document Analysis

What it does: AI can read and analyze PDF documents, text files, and other documents

When to use: Document summarization, information extraction, content analysis

Supported formats: PDF, TXT, MD (most common), DOC, DOCX

Tips:

  • Keep documents under 20MB
  • Can combine with images (e.g., PDF + photo)
  • Ask specific questions about document content
  • Request summaries, key points, or specific information

6. Deep Research

What it does: Comprehensive multi-phase research with web search and citations

When to use: Academic research, market analysis, in-depth investigations

Model: Must use qwen-deep-research

Features:

  • Multi-phase research process
  • Web search integration
  • Source validation
  • Automatic citations
  • Comprehensive reports
  • Export to PDF/Markdown

Tips:

  • Ask broad research questions
  • Let the AI complete all research phases
  • Review citations for accuracy
  • Download reports for offline use

7. Image Generation

What it does: Create new images from text descriptions

When to use: Creating art, illustrations, concept designs, visual content

Options:

  • Size: 1024x1024, 1792x1024, 1024x1792
  • Quality: Standard or HD
  • Style: Natural or vivid

Tips:

  • Be descriptive in your prompts
  • Specify style, mood, colors
  • Mention composition and perspective
  • Iterate with edits for refinement

8. Image Editing

What it does: Modify existing images using text instructions

When to use: Changing image elements, style adjustments, adding/removing objects

Input: Original image + editing instructions

Tips:

  • Provide clear editing instructions
  • Use simple, direct commands
  • One change at a time works best
  • Can use generated images as input

9. Video Generation

What it does: Create videos from text descriptions

When to use: Video content creation, animations, visual storytelling

Options:

  • Size: 1280x720, 1920x1080
  • Duration: 5-60 seconds

Tips:

  • Describe motion and action clearly
  • Specify scene changes
  • Keep descriptions focused
  • Longer videos take more time

10. Code Generation

What it does: Generate code, debug, explain programming concepts

Model: Best with qwen3-coder-plus

Features:

  • Multi-language support
  • Function/tool calling
  • Code explanation
  • Bug fixing
  • Code optimization

Tips:

  • Specify programming language
  • Describe functionality clearly
  • Ask for comments and documentation
  • Request error handling

11. Web Development

What it does: Create web components, UI elements, responsive designs

Model: Use qwen-web-dev

Output: HTML, CSS, JavaScript

Features:

  • Responsive design
  • Modern CSS
  • Interactive components
  • Framework support (React, Vue)
  • Accessibility considerations

Tips:

  • Describe desired functionality
  • Specify design preferences
  • Mention framework if needed
  • Ask for responsive design

12. Full-Stack Development

What it does: Build complete applications with frontend, backend, database

Model: Use qwen-full-stack

Output: Complete application code

Features:

  • Frontend frameworks
  • Backend APIs
  • Database schemas
  • Authentication
  • Deployment-ready code

Tips:

  • Describe full requirements
  • Specify tech stack preferences
  • Mention scalability needs
  • Request security features

File Upload Guidelines

Understanding File Categories

Files are grouped into two main categories:

Media Files (Image, Audio, Video):

  • All considered the same category
  • Cannot combine different media types
  • Examples: JPG + MP4 = ❌ Invalid

Documents (PDF, Text files):

  • Separate category from media
  • Can combine with media files
  • Examples: JPG + PDF = ✅ Valid

What You Can Upload

Valid Single File Types:

  • ✅ One or more images (up to 5)
  • ✅ One audio file
  • ✅ One video file
  • ✅ One or more documents (up to 5)

Valid Combinations:

  • ✅ Multiple images only
  • ✅ Image(s) + Document(s)
  • ✅ Audio + Document(s)
  • ✅ Video + Document(s)
  • ✅ Single media file only

Invalid Combinations:

  • ❌ Image + Audio
  • ❌ Image + Video
  • ❌ Audio + Video
  • ❌ Multiple videos
  • ❌ Multiple audio files

Size and Duration Limits

Images:

  • Maximum: 10MB per image
  • Count: Up to 5 images
  • Formats: JPG, PNG, GIF, WebP recommended

Audio:

  • Maximum: 100MB
  • Duration: Up to 3 minutes (180 seconds)
  • Count: 1 file only
  • Formats: MP3, WAV, M4A, AAC recommended

Video:

  • Maximum: 500MB
  • Duration: Up to 10 minutes (600 seconds)
  • Count: 1 file only
  • Formats: MP4, MOV, AVI, MKV recommended

Documents:

  • Maximum: 20MB per document
  • Count: Up to 5 documents
  • Formats: PDF, TXT, MD recommended

Upload Methods

Method 1: URL Provide a direct URL to the file hosted online

Method 2: Base64 Encode the file as base64 data and include it directly

Method 3: File Upload Use multipart form data to upload the file directly


Best Practices

Request Optimization

Stream for Real-Time:

  • Enable streaming for chatbot-like experiences
  • Provides instant feedback to users
  • Better for long responses

Non-Stream for Complete Data:

  • Use when you need the full response at once
  • Better for batch processing
  • Easier error handling

Message Management

System Messages:

  • Set AI behavior and personality
  • Define response format
  • Establish context

User Messages:

  • Ask clear, specific questions
  • Provide necessary context
  • Keep conversations focused

Assistant Messages:

  • Include previous AI responses for context
  • Maintain conversation history
  • Build on previous exchanges

Context Management

Keep Relevant History:

  • Include important previous messages
  • Don't send entire conversation every time
  • Focus on recent relevant context

Token Awareness:

  • Longer conversations use more tokens
  • Summarize old conversations if needed
  • Reset context when changing topics

Rate Limiting

Respect Limits:

  • Don't spam requests rapidly
  • Implement delays between requests
  • Use queuing for bulk operations

Handle Failures:

  • Retry with exponential backoff
  • Don't retry immediately
  • Log errors for debugging

Error Recovery

Graceful Degradation:

  • Have fallback options
  • Inform users of issues
  • Don't crash on errors

Token Refresh:

  • Detect expiration automatically
  • Refresh before making request
  • Store new token securely

Common Use Cases

1. Building a Chatbot

Goal: Create an interactive conversational AI

Steps:

  1. Initialize with system message
  2. Maintain conversation history
  3. Stream responses for real-time feel
  4. Handle context window limits
  5. Reset conversation when needed

Best Practices:

  • Keep last 10-20 messages
  • Use streaming for better UX
  • Implement typing indicators
  • Handle connection issues

2. Document Summarization

Goal: Extract key information from documents

Steps:

  1. Upload document (PDF, TXT)
  2. Ask for summary
  3. Request specific sections
  4. Extract key points

Best Practices:

  • Use clear, focused questions
  • Specify desired length
  • Ask for structured output
  • Verify important information

3. Image Analysis

Goal: Understand and describe visual content

Steps:

  1. Provide image URL or file
  2. Ask specific questions
  3. Request detailed analysis
  4. Compare multiple images

Best Practices:

  • Use high-quality images
  • Ask focused questions
  • Specify what to look for
  • Combine with text context

4. Research Assistant

Goal: Conduct comprehensive research

Steps:

  1. Use qwen-deep-research model
  2. Provide research topic
  3. Wait for all phases to complete
  4. Review sources and citations
  5. Download report if needed

Best Practices:

  • Start with clear research question
  • Allow time for completion
  • Verify citations
  • Use non-streaming mode

5. Content Generation

Goal: Create written content

Steps:

  1. Describe content type
  2. Specify tone and style
  3. Set length requirements
  4. Iterate and refine

Best Practices:

  • Provide examples if possible
  • Be specific about requirements
  • Review and edit output
  • Regenerate if needed

6. Code Assistant

Goal: Help with programming tasks

Steps:

  1. Use qwen3-coder-plus model
  2. Describe functionality
  3. Specify language/framework
  4. Request explanations
  5. Ask for improvements

Best Practices:

  • Include context about project
  • Specify error handling needs
  • Request code comments
  • Test generated code

7. Visual Content Creation

Goal: Generate images and videos

Steps:

  1. Write detailed prompts
  2. Specify size and style
  3. Generate initial content
  4. Edit and refine
  5. Download final result

Best Practices:

  • Be descriptive
  • Iterate on results
  • Use editing for refinement
  • Save successful prompts

8. Multi-Language Support

Goal: Translate or work in multiple languages

Steps:

  1. Specify source and target languages
  2. Provide context
  3. Request translation or generation
  4. Verify accuracy

Best Practices:

  • Specify language explicitly
  • Provide cultural context
  • Verify translations
  • Use native speakers for review

Error Handling

Common Errors and Solutions

Authentication Errors (401)

Symptoms: "Invalid API key" or "Unauthorized"

Solutions:

  • Validate your token using /v1/validate
  • Refresh token using /v1/refresh
  • Regenerate token from browser
  • Check token format is correct

Invalid Request (400)

Symptoms: "Bad request" or "Invalid parameters"

Solutions:

  • Check request format matches API spec
  • Verify all required fields are present
  • Validate JSON syntax
  • Check parameter values are valid

Rate Limiting (429)

Symptoms: "Too many requests"

Solutions:

  • Slow down request rate
  • Implement request queuing
  • Add delays between requests
  • Use exponential backoff

File Too Large (413)

Symptoms: "File size exceeds limit"

Solutions:

  • Compress images before upload
  • Use smaller file sizes
  • Split large documents
  • Check size limits for file type

Unsupported Format (415)

Symptoms: "Unsupported media type"

Solutions:

  • Use supported file formats
  • Convert files to compatible format
  • Check format is in allowed list
  • Verify MIME type is correct

Server Error (500)

Symptoms: "Internal server error"

Solutions:

  • Retry request after delay
  • Check if service is operational
  • Report persistent errors
  • Use different endpoint if available

Error Response Structure

Every error response includes:

  • message: Human-readable error description
  • type: Error category
  • code: Specific error code
  • param: Which parameter caused error (if applicable)

Retry Strategies

Immediate Retry (Network issues):

  • Retry 1-2 times immediately
  • For temporary connection problems
  • Don't retry on auth errors

Exponential Backoff (Rate limits):

  • Wait 1s, 2s, 4s, 8s between retries
  • For 429 or temporary 500 errors
  • Give up after 3-5 attempts

No Retry (Client errors):

  • Don't retry 400, 401, 403, 415
  • Fix the request instead
  • These indicate client-side problems

Troubleshooting

Token Issues

Problem: Token not working

Diagnostic Steps:

  1. Check token format (starts with H4sIAAAAAAAAA)
  2. Validate using /v1/validate endpoint
  3. Check if token is expired
  4. Verify you're logged into Qwen
  5. Regenerate token from browser

Common Causes:

  • Token copied incorrectly
  • Session expired
  • Account issues
  • Wrong token format

Streaming Problems

Problem: Stream cuts off or doesn't work

Diagnostic Steps:

  1. Verify stream: true is set
  2. Check you're handling SSE format
  3. Look for [DONE] marker
  4. Test with non-streaming first
  5. Check network/proxy settings

Common Causes:

  • Incorrect SSE parsing
  • Network interruptions
  • Proxy buffering
  • Client timeout too short

File Upload Failures

Problem: Can't upload files

Diagnostic Steps:

  1. Check file size against limits
  2. Verify file format is supported
  3. Test with smaller file
  4. Check file is not corrupted
  5. Verify URL is accessible (if using URL)

Common Causes:

  • File too large
  • Unsupported format
  • Mixing incompatible file types
  • Network issues

Model Not Working

Problem: Specific model gives errors

Diagnostic Steps:

  1. Check model name spelling
  2. Verify model exists (/v1/models)
  3. Check model supports requested feature
  4. Try with default model
  5. Check capability matrix

Common Causes:

  • Typo in model name
  • Model doesn't support feature
  • Model deprecated or unavailable
  • Feature flag not set

Slow Responses

Problem: Requests take too long

Diagnostic Steps:

  1. Check if large file uploads
  2. Verify network connection
  3. Try simpler request
  4. Use faster model (turbo)
  5. Enable streaming

Common Causes:

  • Large files being processed
  • Complex thinking/research tasks
  • Network latency
  • Server load
  • Long generation requested

Debug Logs Not Showing

Problem: Can't see Request/Response IDs

Diagnostic Steps:

  1. Check you're looking at response content
  2. Verify logs are at the end
  3. Look for collapsible section
  4. Check streaming vs non-streaming
  5. Verify response completed

Common Causes:

  • Stream interrupted
  • Looking in wrong place
  • Response cut off
  • Client filtering logs

FAQ

General Questions

Q: Is this service free to use?

A: Yes, the proxy itself is free. However, you need a Qwen account, and any usage limits depend on your Qwen account tier. Free Qwen accounts have certain limitations, while premium accounts have higher limits.

Q: Are there rate limits?

A: The proxy doesn't impose its own rate limits. All limitations come from your Qwen account. Free accounts have lower limits than paid accounts.

Q: Can I use this in production?

A: Yes, but keep in mind this is an unofficial proxy. For production use, implement proper error handling, token refresh logic, and have fallback strategies.

Q: How do I get support?

A: Open an issue on GitHub for bugs or feature requests. Check existing issues and discussions for common questions.

Q: Can I self-host this?

A: Yes! The code is open source. You can deploy it to your own Cloudflare Workers account or adapt it for other platforms.

Authentication

Q: How long do tokens last?

A: Tokens expire based on Qwen's session management. You'll need to refresh or regenerate them periodically.

Q: Can I use the same token from multiple applications?

A: Yes, one token can be used across multiple applications, but be aware of rate limits.

Q: What if my token gets leaked?

A: Regenerate immediately from the Qwen website and update it in your applications.

Q: Do I need to refresh tokens manually?

A: You can implement automatic refresh using the /v1/refresh endpoint when you detect expiration.

Models

Q: Which model should I use?

A: For general use, start with qwen-max-latest. For specific needs, refer to the Model Selection guide above.

Q: Can I switch models mid-conversation?

A: Yes, but you'll need to create a new chat. Models don't share conversation context.

Q: Are all models always available?

A: Availability depends on your Qwen account. Some models may require premium access.

Q: What's the difference between Max and Turbo?

A: Max is more powerful and thorough, Turbo is faster but may be less detailed.

Features

Q: Can I use web search with any model?

A: No, only models with web search capability (check the capabilities table) support this feature.

Q: How do I enable thinking mode?

A: Add enable_thinking: true to your request and use a reasoning-capable model.

Q: Can I combine multiple features?

A: Yes! For example, you can use web search + thinking mode + vision together if the model supports all three.

Q: What's the maximum file size?

A: It varies by type: Images (10MB), Audio (100MB), Video (500MB), Documents (20MB).

Technical

Q: Is streaming faster than non-streaming?

A: Streaming provides faster first response but same total time. It's better for user experience.

Q: Can I cancel a request in progress?

A: You can close the connection, but the server may continue processing.

Q: How do I handle long conversations?

A: Keep only recent relevant messages to avoid hitting context limits. Summarize old conversations if needed.

Q: Can I use this with OpenAI SDKs?

A: Yes! Just change the base URL to https://qwen.aikit.club and use your Qwen token.

Troubleshooting

Q: Why am I getting "Invalid API key" errors?

A: Your token may be expired, invalid, or incorrectly formatted. Try validating or regenerating it.

Q: Files aren't uploading properly

A: Check file size, format, and that you're not mixing incompatible file types (e.g., image + video).

Q: Responses are cut off

A: This could be token limits, connection issues, or streaming problems. Try non-streaming mode first.

Q: Can I get request logs for debugging?

A: Yes, response content includes debug logs with Request ID and Response ID at the end.