When scaling matters
Most Discord bots never need to worry about scaling. A bot serving one guild or a handful of communities runs comfortably on minimal resources. Scaling becomes relevant when your bot joins enough guilds that resource consumption, API rate limits, or Discord's gateway connection limits start affecting performance.
Discord requires bots in more than 2,500 guilds to use sharding. This is not optional. Beyond that threshold, Discord will not let your bot connect with a single gateway connection. Planning for this before you hit the limit saves emergency refactoring later.
Understanding Discord's limits
| Limit | Threshold | What happens |
|---|---|---|
| Gateway connection | 2,500 guilds per shard | Must implement sharding |
| Global rate limit | 50 requests per second | Requests queued or rejected |
| Identify limit | 1 identify per 5 seconds | Shard startup must be staggered |
| Message content intent | 100+ guilds | Bot must be verified and approved for the privileged intent (smaller bots can enable it in the Developer Portal) |
| Guild member cache | Scales with guild count | Memory usage increases linearly |
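The first and third rows of the table can be turned into quick arithmetic. The following is a minimal sketch, not an official formula: `min_shards` and `startup_seconds` are hypothetical helper names, and `max_concurrency` is assumed to be 1, the value most bots receive from Discord. Guilds are not spread perfectly evenly across shards, so treat the result as a lower bound and prefer Discord's recommended shard count in production.

```python
import math

GUILDS_PER_SHARD = 2500   # per-shard ceiling from the table above
IDENTIFY_INTERVAL_S = 5   # one identify per 5 seconds


def min_shards(guild_count: int) -> int:
    """Smallest shard count that keeps every shard at or under the ceiling."""
    return max(1, math.ceil(guild_count / GUILDS_PER_SHARD))


def startup_seconds(shard_count: int, max_concurrency: int = 1) -> int:
    """Earliest time (in seconds) at which the last shard may identify.

    Assumes the first batch identifies at t=0 and each later batch waits
    one full identify interval.
    """
    batches = math.ceil(shard_count / max_concurrency)
    return (batches - 1) * IDENTIFY_INTERVAL_S


if __name__ == "__main__":
    shards = min_shards(12_000)
    print(shards, startup_seconds(shards))  # prints "5 20"
```

A bot in 12,000 guilds therefore needs at least five shards, and the last of them cannot begin connecting until about 20 seconds after the first.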
Resource planning
Before adding infrastructure, understand what your bot actually consumes. Resource usage depends on your bot's features, caching strategy, and the number of events it processes.
Memory estimation
A rough guide for memory planning:
| Guild count | Estimated RAM (Node.js) | Estimated RAM (Python) | Hosting recommendation |
|---|---|---|---|
| 1–50 | 50–100 MB | 60–120 MB | Free hosting (MonkeyBytes) |
| 50–500 | 100–300 MB | 120–400 MB | Free hosting (MonkeyBytes) |
| 500–2,500 | 300–800 MB | 400 MB–1 GB | Free hosting or budget VPS |
| 2,500–10,000 | 800 MB–2 GB | 1–3 GB | VPS ($5–$10/mo) |
| 10,000+ | 2–8 GB+ | 3–10 GB+ | Dedicated server or multi-VPS |
These are estimates based on typical bot configurations with standard caching. Bots that cache member lists or message histories use significantly more memory. Bots with minimal caching use less.
MonkeyBytes provides 1 GB of dedicated RAM per bot instance, which comfortably handles bots up to roughly 500–2,500 guilds depending on your caching configuration. For cost comparisons at each tier, see our hosting cost breakdown.
Sharding
Sharding splits your bot's guild list across multiple gateway connections. Each shard handles a subset of guilds independently. Discord assigns guilds to shards using a simple formula: shard_id = (guild_id >> 22) % num_shards.
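The formula is easy to check by hand. Guild IDs are snowflakes whose lowest 22 bits hold worker, process, and sequence fields, so shifting right by 22 leaves the creation timestamp, which is then reduced modulo the shard count. A small sketch (the function name `shard_for_guild` and the synthetic ID are illustrative assumptions, not part of any library):

```python
def shard_for_guild(guild_id: int, num_shards: int) -> int:
    """Return the shard that receives events for guild_id."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return (guild_id >> 22) % num_shards


if __name__ == "__main__":
    # Synthetic ID: timestamp part 5, low 22 bits set to 123.
    example_id = (5 << 22) | 123
    print(shard_for_guild(example_id, 4))  # prints 1, because 5 % 4 == 1
```

Because the mapping depends on `num_shards`, changing the shard count moves guilds between shards, so every shard must be restarted with the new count.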
Internal sharding
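The simplest approach keeps every shard on a single machine, and both discord.js and discord.py support this natively. discord.py's AutoShardedBot runs all shards inside one process. The discord.js ShardingManager shown here instead spawns one child process per shard by default; for true single-process sharding in discord.js, pass shards: 'auto' to the Client constructor.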
discord.js ShardingManager:
```js
// shard.js (launcher file)
const { ShardingManager } = require('discord.js');

const manager = new ShardingManager('./bot.js', {
  token: process.env.DISCORD_TOKEN,
  totalShards: 'auto', // Discord determines shard count
});

manager.on('shardCreate', shard => {
  console.log(`Launched shard ${shard.id}`);
});

manager.spawn();
```
discord.py AutoShardedBot:
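```python
import os

import discord
from discord.ext import commands

# In discord.py, AutoShardedBot lives in discord.ext.commands and requires
# a command_prefix ('!' here is only a placeholder).
bot = commands.AutoShardedBot(
    command_prefix='!',
    intents=discord.Intents.default(),
    shard_count=None,  # None: the library uses Discord's recommended count
)


@bot.event
async def on_ready():
    print(f'Ready with {bot.shard_count} shards')


bot.run(os.getenv('DISCORD_TOKEN'))
```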
External sharding
For large bots (10,000+ guilds), external sharding runs each shard as a separate process or on separate servers. This allows horizontal scaling across multiple machines. External sharding is more complex to set up but provides better isolation and resource distribution.
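One piece of that setup is deciding which shard IDs each process or server owns. The sketch below splits the IDs into contiguous, near-equal ranges; `assign_shards` is a hypothetical helper, and a real deployment would also need a coordinator to stagger identifies across workers.

```python
from typing import List


def assign_shards(total_shards: int, num_workers: int) -> List[List[int]]:
    """Split shard IDs 0..total_shards-1 into contiguous per-worker lists."""
    if total_shards < 1 or num_workers < 1:
        raise ValueError("total_shards and num_workers must be at least 1")
    base, extra = divmod(total_shards, num_workers)
    assignments: List[List[int]] = []
    start = 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional shard each.
        size = base + (1 if worker < extra else 0)
        assignments.append(list(range(start, start + size)))
        start += size
    return assignments


if __name__ == "__main__":
    print(assign_shards(8, 3))  # prints [[0, 1, 2], [3, 4, 5], [6, 7]]
```

In discord.py, each worker could then pass its list as `shard_ids` together with the overall `shard_count` when constructing its sharded client, so that every process agrees on the total.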
Optimisation before scaling
Before adding more hardware or shards, optimise what you have. Many bots scale prematurely when the real problem is inefficient code.
Cache management
The biggest memory consumer in most Discord bots is the guild and member cache. By default, Discord libraries cache most of what they receive. Reduce memory usage by only caching what you need:
```js
// discord.js - selective caching
const { Client, GatewayIntentBits, Options } = require('discord.js');

const client = new Client({
  intents: [GatewayIntentBits.Guilds],
  makeCache: Options.cacheWithLimits({
    MessageManager: 50,      // Cache last 50 messages per channel
    GuildMemberManager: 200, // Cache 200 members per guild
    PresenceManager: 0,      // Don't cache presence data
    ReactionManager: 0,      // Don't cache reactions
  }),
});
```
Database optimisation
If your bot queries a database on every command, slow queries become bottlenecks under load. Index your most-queried columns, use connection pooling, and cache frequently accessed data in memory with a TTL (time to live) to avoid stale data.
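An in-memory cache with a TTL can be very small. The following is a minimal sketch using only the standard library; the `TTLCache` class, its method names, and the `prefix:<guild id>` key are illustrative assumptions, and the cache has no size limit, so a long-running bot would want eviction as well.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Dictionary-backed cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: Hashable, default: Any = None) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Stale: drop it so the caller falls back to the database.
            del self._entries[key]
            return default
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=60)
    cache.set("prefix:1234", "!")
    print(cache.get("prefix:1234"))  # prints "!" while the entry is fresh
```

A command handler would call `get` first and only query the database, then `set`, when the result is `None`.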
Rate limit handling
Both discord.js and discord.py handle rate limits automatically, but wasteful API calls still slow your bot. Batch operations where possible, use bulk endpoints for mass actions, and avoid making API calls in tight loops.
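Batching usually comes down to splitting a long list of IDs into groups that fit one bulk request. A minimal sketch follows; `chunked` is a hypothetical helper, and the batch size of 100 reflects the per-call cap on Discord's bulk message delete endpoint, so check the limit of whichever endpoint you actually use.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of items, each at most size long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


if __name__ == "__main__":
    message_ids = list(range(250))  # stand-in for real message IDs
    batches = list(chunked(message_ids, 100))
    print([len(batch) for batch in batches])  # prints [100, 100, 50]
```

Deleting 250 messages this way costs three API calls instead of 250 individual ones.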
For more optimisation techniques, read our performance optimisation guide.
Infrastructure scaling path
- Start free: Deploy on MonkeyBytes with 1 GB RAM. Suitable for most bots up to roughly 500 guilds, and often more with lean caching.
- Optimise code: Reduce caching, fix memory leaks, optimise database queries. This often doubles your capacity without changing hosting.
- Add sharding: When approaching 2,500 guilds, implement internal sharding. This runs on the same hosting instance.
- Upgrade hosting: When internal sharding maxes out your RAM, move to a VPS with more resources.
- External sharding: For very large bots, distribute shards across multiple servers with a coordinator process.
When to upgrade
Upgrade your hosting when you observe these signals:
- RAM usage consistently above 80% of available memory
- Increasing frequency of out-of-memory crashes
- Noticeable delay in command responses under normal load
- Discord forcing you to add shards (>2,500 guilds)
- Database queries taking more than 100ms consistently
Do not upgrade preemptively. Monitor your actual resource usage with the tools described in our uptime monitoring guide and upgrade based on data, not assumptions.
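The measurable signals in the list above can be checked mechanically against monitoring data. This is a rough sketch under stated assumptions: the `Metrics` fields and the `upgrade_signals` function are hypothetical, the inputs are expected to be averages over a monitoring window rather than single samples, and command latency is left out because the list gives no numeric threshold for it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metrics:
    ram_used_mb: float            # average over the monitoring window
    ram_total_mb: float
    oom_crashes_this_week: int
    oom_crashes_last_week: int
    guild_count: int
    shard_count: int
    typical_db_query_ms: float    # e.g. a median over the window


def upgrade_signals(m: Metrics) -> List[str]:
    """Return the names of the upgrade signals that the metrics trigger."""
    signals: List[str] = []
    if m.ram_used_mb > 0.8 * m.ram_total_mb:
        signals.append("ram_above_80_percent")
    if m.oom_crashes_this_week > m.oom_crashes_last_week:
        signals.append("oom_crashes_increasing")
    if m.guild_count > 2500 * m.shard_count:
        signals.append("more_shards_required")
    if m.typical_db_query_ms > 100:
        signals.append("slow_database_queries")
    return signals


if __name__ == "__main__":
    busy = Metrics(900, 1024, 0, 0, 1800, 1, 40)
    print(upgrade_signals(busy))  # prints ['ram_above_80_percent']
```

An empty list means none of the thresholds has been crossed and the current hosting is still adequate.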
For hosting options at each scale, compare costs in our hosting cost breakdown or evaluate VPS options in our VPS vs free hosting comparison.