Multi-key Rotation: Automatic API key rotation with load balancing
Multi-Target Load Balancing: Supports round-robin load balancing across multiple upstream API targets
Intelligent Blacklisting: Distinguishes between permanent and temporary errors for smart key management
Real-time Monitoring: Comprehensive statistics, health checks, and blacklist management
Flexible Configuration: Environment-based configuration with .env file support
CORS Support: Full cross-origin request support
Structured Logging: Detailed logging with response times and key information
Optional Authentication: Project-level Bearer token authentication
High Performance: Zero-copy streaming, concurrent processing, and atomic operations
Production Ready: Graceful shutdown, error recovery, and memory management
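
The rotation and blacklisting behavior listed above can be illustrated with a minimal sketch. It is not gpt-load's actual key manager: the `KeyRing` type, its method names, and the rule that a permanent error blacklists a key immediately while temporary errors count toward a threshold are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// ErrNoKeys is returned when every key has been blacklisted.
var ErrNoKeys = errors.New("no usable keys")

// KeyRing is a hypothetical type for this sketch, not the project's real key manager.
type KeyRing struct {
	keys      []string
	next      atomic.Uint64 // rotation counter, advanced without locks
	failures  sync.Map      // key -> *atomic.Int64 (temporary error count)
	blocked   sync.Map      // key -> struct{} (blacklisted keys)
	threshold int64         // plays the role of BLACKLIST_THRESHOLD
}

func NewKeyRing(keys []string, threshold int64) *KeyRing {
	return &KeyRing{keys: keys, threshold: threshold}
}

// Next returns the next key in round-robin order, skipping blacklisted keys.
// It tries each key at most once per call.
func (r *KeyRing) Next() (string, error) {
	for range r.keys {
		i := r.next.Add(1) - 1
		key := r.keys[i%uint64(len(r.keys))]
		if _, bad := r.blocked.Load(key); !bad {
			return key, nil
		}
	}
	return "", ErrNoKeys
}

// ReportFailure records a failed request for key.
// Assumption: permanent errors blacklist at once, temporary errors
// blacklist only after the threshold is reached.
func (r *KeyRing) ReportFailure(key string, permanent bool) {
	if permanent {
		r.blocked.Store(key, struct{}{})
		return
	}
	v, _ := r.failures.LoadOrStore(key, new(atomic.Int64))
	if v.(*atomic.Int64).Add(1) >= r.threshold {
		r.blocked.Store(key, struct{}{})
	}
}

func main() {
	ring := NewKeyRing([]string{"sk-a", "sk-b", "sk-c"}, 2)
	ring.ReportFailure("sk-b", true) // e.g. an invalid key
	for i := 0; i < 4; i++ {
		key, err := ring.Next()
		fmt.Println(key, err)
	}
}
```

With `sk-b` blacklisted, the loop alternates between `sk-a` and `sk-c`. A real implementation would also need a way to clear the blacklist, which is the role the `/reset-keys` endpoint plays in this project.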

Quick Start

Prerequisites

Go 1.21+ (for building from source)
Docker (for containerized deployment)

Option 1: Using Docker (Recommended)

# Pull the latest image
docker pull ghcr.io/tbphp/gpt-load:latest

# Create keys.txt file with your API keys (one per line)
echo "sk-your-api-key-1" > keys.txt
echo "sk-your-api-key-2" >> keys.txt

# Run the container
docker run -d -p 3000:3000 \
  -v "$(pwd)/keys.txt:/app/keys.txt:ro" \
  --name gpt-load \
  ghcr.io/tbphp/gpt-load:latest

Option 2: Using Docker Compose

# Start the service
docker-compose up -d

# Stop the service
docker-compose down

Option 3: Build from Source

# Clone and build
git clone https://github.com/tbphp/gpt-load.git
cd gpt-load
go mod tidy

# Create configuration
cp .env.example .env
echo "sk-your-api-key" > keys.txt

# Run
make run

Configuration

Supported API Providers

This proxy server works with any OpenAI-compatible API, including:

OpenAI: https://api.openai.com
Azure OpenAI: https://your-resource.openai.azure.com
Anthropic Claude: https://api.anthropic.com (with compatible endpoints)
Third-party Providers: Any service implementing OpenAI API format

Environment Variables

Copy the example configuration file and modify as needed:

cp .env.example .env

Key Configuration Options

Setting	Environment Variable	Default	Description
Server Port	`PORT`	3000	Server listening port
Server Host	`HOST`	0.0.0.0	Server binding address
Keys File	`KEYS_FILE`	keys.txt	API keys file path
Start Index	`START_INDEX`	0	Starting key index for rotation
Blacklist Threshold	`BLACKLIST_THRESHOLD`	1	Error count before blacklisting
Max Retries	`MAX_RETRIES`	3	Maximum retry attempts with different keys
Upstream URL	`OPENAI_BASE_URL`	`https://api.openai.com`	OpenAI-compatible API base URL. Supports multiple, comma-separated URLs for load balancing.
Max Concurrent Requests	`MAX_CONCURRENT_REQUESTS`	100	Maximum number of concurrent requests
Enable Gzip	`ENABLE_GZIP`	true	Enable Gzip compression for responses
Auth Key	`AUTH_KEY`	-	Optional authentication key
CORS	`ENABLE_CORS`	true	Enable CORS support
Allowed Origins	`ALLOWED_ORIGINS`	*	CORS allowed origins (comma-separated, * for all)
Allowed Methods	`ALLOWED_METHODS`	GET,POST,PUT,DELETE,OPTIONS	CORS allowed HTTP methods
Allowed Headers	`ALLOWED_HEADERS`	*	CORS allowed headers (comma-separated, * for all)
Allow Credentials	`ALLOW_CREDENTIALS`	false	CORS allow credentials
Log Level	`LOG_LEVEL`	info	Logging level (debug, info, warn, error)
Log Format	`LOG_FORMAT`	text	Log format (text, json)
Enable File Logging	`LOG_ENABLE_FILE`	false	Enable logging to file
Log File Path	`LOG_FILE_PATH`	logs/app.log	Log file path
Enable Request Logging	`LOG_ENABLE_REQUEST`	true	Enable request logging (set to false in production for performance)
Server Read Timeout	`SERVER_READ_TIMEOUT`	120	HTTP server read timeout in seconds
Server Write Timeout	`SERVER_WRITE_TIMEOUT`	1800	HTTP server write timeout in seconds
Server Idle Timeout	`SERVER_IDLE_TIMEOUT`	120	HTTP server idle timeout in seconds
Graceful Shutdown	`SERVER_GRACEFUL_SHUTDOWN_TIMEOUT`	60	Graceful shutdown timeout in seconds
Request Timeout	`REQUEST_TIMEOUT`	30	Request timeout in seconds
Response Timeout	`RESPONSE_TIMEOUT`	30	Response timeout in seconds (TLS handshake & response header)
Idle Connection Timeout	`IDLE_CONN_TIMEOUT`	120	Idle connection timeout in seconds

Configuration Examples

OpenAI (Default)

OPENAI_BASE_URL=https://api.openai.com
# Use OpenAI API keys: sk-...

Azure OpenAI

OPENAI_BASE_URL=https://your-resource.openai.azure.com
# Use Azure API keys and adjust endpoints as needed

Third-party Provider

OPENAI_BASE_URL=https://api.your-provider.com
# Use provider-specific API keys

Multi-Target Load Balancing

# Use a comma-separated list of target URLs
OPENAI_BASE_URL=https://gateway.ai.cloudflare.com/v1/.../openai,https://api.openai.com/v1,https://api.another-provider.com/v1

API Key Validation

The project includes a high-performance API key validation tool:

# Validate keys automatically
make validate-keys

# Or run directly
./scripts/validate-keys.py

Monitoring Endpoints

Endpoint	Method	Description
`/health`	GET	Health check and basic status
`/stats`	GET	Detailed statistics
`/blacklist`	GET	Blacklist information
`/reset-keys`	GET	Reset all key states
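
The read-only endpoints can be polled from a small client. The sketch below assumes the server runs on `http://localhost:3000` (the default port) and makes no assumption about the response format, so it prints the raw body. The `endpointURL` and `fetch` helpers are hypothetical names. `/reset-keys` is left out on purpose because it changes server state.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// endpointURL joins a base URL and a path with exactly one slash between them.
func endpointURL(base, path string) string {
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(path, "/")
}

// fetch performs a GET request and returns the status code and up to
// 64 KiB of the response body.
func fetch(ctx context.Context, client *http.Client, url string) (int, string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(io.LimitReader(resp.Body, 64<<10))
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	base := "http://localhost:3000" // assumption: default HOST/PORT
	client := &http.Client{Timeout: 5 * time.Second}
	for _, path := range []string{"/health", "/stats", "/blacklist"} {
		status, body, err := fetch(context.Background(), client, endpointURL(base, path))
		if err != nil {
			fmt.Printf("%s: request failed: %v\n", path, err)
			continue
		}
		fmt.Printf("%s: HTTP %d\n%s\n", path, status, body)
	}
}
```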

Development

Available Commands

# Build
make build      # Build binary
make build-all  # Build for all platforms
make clean      # Clean build files

# Run
make run        # Run server
make dev        # Development mode with race detection

# Test
make test       # Run tests
make coverage   # Generate coverage report
make bench      # Run benchmarks

# Code Quality
make lint       # Code linting
make fmt        # Format code
make tidy       # Tidy dependencies

# Management
make health     # Health check
make stats      # View statistics
make reset-keys # Reset key states
make blacklist  # View blacklist

# Help
make help       # Show all commands

Project Structure

/
├── cmd/
│   └── gpt-load/
│       └── main.go          # Main entry point
├── internal/
│   ├── config/
│   │   └── manager.go       # Configuration management
│   ├── errors/
│   │   └── errors.go        # Custom error types
│   ├── handler/
│   │   └── handler.go       # HTTP handlers
│   ├── keymanager/
│   │   └── manager.go       # Key manager
│   ├── middleware/
│   │   └── middleware.go    # HTTP middleware
│   └── proxy/
│       └── server.go        # Proxy server core
├── pkg/
│   └── types/
│       └── interfaces.go    # Common interfaces and types
├── scripts/
│   └── validate-keys.py     # Key validation script
├── .github/
│   └── workflows/
│       └── docker-build.yml # GitHub Actions CI/CD
├── build/                   # Build output directory
├── .env.example             # Configuration template
├── Dockerfile               # Docker build file
├── docker-compose.yml       # Docker Compose configuration
├── Makefile                 # Build scripts
├── go.mod                   # Go module file
├── LICENSE                  # MIT License
└── README.md                # Project documentation

Architecture

Performance Features

Concurrent Processing: Leverages Go's goroutines for high concurrency
Memory Efficiency: Zero-copy streaming with minimal memory allocation
Connection Pooling: HTTP/2 support with optimized connection reuse
Atomic Operations: Lock-free concurrent operations
Pre-compiled Patterns: Regex patterns compiled at startup

Security & Reliability

Memory Safety: Go's built-in memory safety prevents buffer overflows
Concurrent Safety: Uses sync.Map and atomic operations for thread safety
Error Handling: Comprehensive error handling and recovery mechanisms
Resource Management: Automatic cleanup prevents resource leaks

CDN acceleration and security protection for this project are sponsored by Tencent EdgeOne.

License

MIT License - see LICENSE file for details.

