Why monitoring matters
Your Discord bot can go offline without you knowing. Server crashes, memory leaks, network issues, and Discord API outages can all take your bot down silently. Without monitoring, you only find out when users complain. By then, your bot may have been down for hours.
Monitoring solves this by continuously checking your bot's status and alerting you when something goes wrong. Even if your hosting platform has automatic restarts, monitoring tells you when restarts are happening, how often, and whether there is an underlying problem that needs fixing.
Types of monitoring
Process monitoring
The most basic form of monitoring checks whether your bot process is running. On MonkeyBytes, the platform provides automatic crash detection and restart. The dashboard shows your bot's current status and the real-time console displays output. On a VPS, a process manager such as PM2 (common for Node.js bots) or systemd (which can supervise any process, including Python bots) handles process monitoring and restarts. Read our Node.js deployment guide or Python deployment guide for setup instructions.
Application-level monitoring
Process monitoring tells you if your bot is running, but not if it is functioning correctly. Your bot process might be alive but stuck in an error loop, disconnected from Discord's gateway, or unable to respond to commands. Application-level monitoring checks whether your bot is actually working, not just running.
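As a rough illustration, an application-level check can inspect connection state rather than process state. The sketch below is a minimal example under stated assumptions: `BotSnapshot`, `check_health`, and the 2-second latency threshold are hypothetical names and values, not part of any library. With discord.py, the fields could plausibly be filled from `bot.is_ready()`, `bot.is_closed()`, and `bot.latency`.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class BotSnapshot:
    """Hypothetical container for values read from the bot at check time."""
    is_ready: bool          # gateway session established
    is_closed: bool         # client has been shut down
    latency_seconds: float  # gateway heartbeat latency


def check_health(snapshot: BotSnapshot, max_latency: float = 2.0) -> List[str]:
    """Return a list of problems; an empty list means the bot looks healthy."""
    problems = []
    if snapshot.is_closed:
        problems.append("client is closed")
    if not snapshot.is_ready:
        problems.append("not connected to the gateway")
    # NaN or infinite latency means no usable heartbeat reading is available.
    if not math.isfinite(snapshot.latency_seconds):
        problems.append("no gateway latency reading")
    elif snapshot.latency_seconds > max_latency:
        problems.append(f"high gateway latency: {snapshot.latency_seconds:.2f}s")
    return problems
```

A process that is alive but disconnected would pass a process check yet return a non-empty list here, which is the gap application-level monitoring is meant to close.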
External uptime monitoring
External monitoring services check your bot from outside your hosting environment. They can detect issues that internal monitoring misses, like network problems between your host and Discord's servers, or hosting platform outages that affect both your bot and its monitoring.
Built-in monitoring on MonkeyBytes
MonkeyBytes provides several monitoring features without additional setup:
- Real-time console: View your bot's stdout and stderr output live in the dashboard
- Auto-restart: If your bot process crashes, the platform automatically restarts it
- Status indicators: The dashboard shows whether your bot is running, stopped, or in maintenance mode
- Resource monitoring: Track RAM and CPU usage to identify resource issues before they cause crashes
These built-in features handle the basics. For more comprehensive monitoring, combine them with the external tools described below.
Monitoring tools compared
| Tool | Free tier | Check interval | Alert methods | Best for |
|---|---|---|---|---|
| UptimeRobot | 50 monitors, 5-min intervals | 5 min (free), 1 min (paid) | Email, SMS, Slack, Discord webhook | HTTP endpoint monitoring |
| Better Stack (formerly Better Uptime) | 10 monitors, 3-min intervals | 3 min (free), 30 sec (paid) | Email, SMS, Slack, PagerDuty | Incident management + monitoring |
| HetrixTools | 15 monitors, 1-min intervals | 1 min | Email, Telegram, Discord, Slack | Server and uptime monitoring |
| Cronitor | 5 monitors | 1 min | Email, Slack, PagerDuty | Cron job and heartbeat monitoring |
| Discord bot (self-built) | Free | Custom | Discord messages | Custom status checks |
Setting up heartbeat monitoring
Heartbeat monitoring is usually the most practical approach for Discord bots, because a typical bot connects out to Discord and does not expose an HTTP endpoint for a service to poll. Instead, your bot sends a periodic signal to a monitoring service. If the signal stops, the service alerts you.
How heartbeat monitoring works
- You register a heartbeat URL with a monitoring service
- Your bot sends an HTTP request to that URL at regular intervals
- If the monitoring service does not receive a request within the expected window, it marks your bot as down and sends an alert
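The decision the monitoring service makes in the last step can be sketched in a few lines. This is an illustration of the idea only, not how any particular service implements it: `is_overdue`, the 5-minute interval, and the 1-minute grace period are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_heartbeat: datetime, now: datetime,
               interval: timedelta = timedelta(minutes=5),
               grace: timedelta = timedelta(minutes=1)) -> bool:
    """Return True when no heartbeat arrived within interval + grace."""
    return now - last_heartbeat > interval + grace


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_overdue(now - timedelta(minutes=4), now))  # False: within the window
print(is_overdue(now - timedelta(minutes=7), now))  # True: heartbeat missed
```

The grace period absorbs small delays, such as a slow network request, so that a single late heartbeat does not trigger a false alert.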
Node.js heartbeat implementation
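const https = require('https');

// Replace with the heartbeat URL issued by your monitoring service
const HEARTBEAT_URL = 'https://heartbeat.example.com/your-monitor-id';

// Send a heartbeat every 5 minutes
setInterval(() => {
  https.get(HEARTBEAT_URL, (res) => {
    res.resume(); // Discard the response body so the socket is released
    if (res.statusCode >= 400) {
      console.error(`Heartbeat rejected: HTTP ${res.statusCode}`);
    }
  }).on('error', (err) => {
    console.error('Heartbeat failed:', err.message);
  });
}, 5 * 60 * 1000);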
Python heartbeat implementation
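import asyncio

import aiohttp

# Replace with the heartbeat URL issued by your monitoring service
HEARTBEAT_URL = 'https://heartbeat.example.com/your-monitor-id'
heartbeat_task = None  # Set once so reconnects do not start duplicate loops

async def send_heartbeat():
    async with aiohttp.ClientSession() as session:
        while True:
            try:
                # The context manager releases the connection after each request
                async with session.get(HEARTBEAT_URL) as resp:
                    resp.raise_for_status()
            except Exception as e:
                print(f'Heartbeat failed: {e}')
            await asyncio.sleep(300)  # Every 5 minutes

# Start heartbeat as a background task in your bot's on_ready event.
# on_ready can fire again after a reconnect, so start the task only once.
@bot.event
async def on_ready():
    global heartbeat_task
    if heartbeat_task is None:
        heartbeat_task = asyncio.create_task(send_heartbeat())
    print(f'Logged in as {bot.user}')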
Building a self-monitoring bot
For a simple, free monitoring solution, you can use a second lightweight bot that checks whether your main bot is online in Discord.
How it works
- Deploy a minimal monitoring bot on a separate hosting instance
- The monitoring bot checks your main bot's Discord presence status at regular intervals
- If the main bot appears offline, the monitoring bot sends an alert to a designated channel
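This approach is free and uses Discord itself as the monitoring mechanism. To read the main bot's status, the monitoring bot must share a server with it and have the privileged Presence intent enabled. The limitation is that it only detects when your bot is completely offline from Discord, not when it is running but malfunctioning.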
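The alerting logic of such a monitoring bot might look like the sketch below. `OfflineDetector` and its threshold are hypothetical; the class only decides when to alert and leaves reading the status and sending the message to your bot code. With discord.py, the status string could plausibly come from `str(member.status)`, which is an assumption about your setup rather than a requirement.

```python
from typing import Optional


class OfflineDetector:
    """Alert once after `threshold` consecutive offline checks (hypothetical helper)."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self.offline_streak = 0
        self.alerted = False

    def update(self, status: str) -> Optional[str]:
        """Feed one status reading; return an alert message or None."""
        if status == "offline":
            self.offline_streak += 1
            if self.offline_streak >= self.threshold and not self.alerted:
                self.alerted = True
                return f"Main bot offline for {self.offline_streak} consecutive checks"
            return None
        # Any non-offline status (online, idle, dnd) counts as up.
        was_alerted = self.alerted
        self.offline_streak = 0
        self.alerted = False
        return "Main bot is back online" if was_alerted else None
```

Requiring two consecutive offline readings avoids alerting on a brief reconnect, and alerting only on state changes keeps the channel from filling with repeated messages during a long outage.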
What to monitor
Essential metrics
- Online/offline status: Is the bot process running and connected to Discord?
- Response time: How quickly does the bot respond to commands?
- Memory usage: Is RAM consumption stable or gradually increasing (memory leak)?
- Restart frequency: How often is the bot crashing and restarting?
- Gateway connection: Is the bot maintaining its WebSocket connection to Discord?
Warning signs
- Increasing memory usage over time: Indicates a memory leak that will eventually crash your bot
- Frequent restarts: More than one restart per day suggests an unhandled error or resource issue
- Slow command responses: Could indicate CPU overload, rate limiting, or blocking operations on the event loop
- Gateway disconnections: Frequent disconnects may indicate network issues or a misconfigured intent, such as requesting a privileged intent that is not enabled in the Discord Developer Portal
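Two of these warning signs are easy to check numerically once you record samples. The following is a minimal sketch, assuming you already collect memory readings and restart timestamps somewhere; `memory_growth_per_hour` and `restarts_in_last_day` are hypothetical helper names.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def memory_growth_per_hour(samples: List[Tuple[float, float]]) -> float:
    """Least-squares slope of (hours, megabytes) samples, in MB per hour."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        return 0.0
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return covariance / variance


def restarts_in_last_day(restart_times: List[datetime], now: datetime) -> int:
    """Count restarts recorded within the 24 hours before `now`."""
    cutoff = now - timedelta(days=1)
    return sum(1 for t in restart_times if cutoff <= t <= now)
```

A slope that stays positive across many hours of samples points to a leak, while a slope near zero suggests stable usage. A restart count above one in the last day matches the threshold described above.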
Logging for monitoring
Good logging is the foundation of effective monitoring. Without logs, you cannot diagnose why your bot went down.
What to log
// Node.js - structured logging
// Run these inside your ready handler; client.user is null before the bot is ready
console.log(`[${new Date().toISOString()}] Bot ready: ${client.user.tag}`);
console.log(`[${new Date().toISOString()}] Guilds: ${client.guilds.cache.size}`);
console.log(`[${new Date().toISOString()}] Memory: ${Math.round(process.memoryUsage().heapUsed / 1024 / 1024)}MB`);
# Python - structured logging
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s [%(levelname)s] %(message)s'
)

@bot.event
async def on_ready():
    logging.info(f'Bot ready: {bot.user}')
    logging.info(f'Guilds: {len(bot.guilds)}')
On MonkeyBytes, all console output is visible in the real-time dashboard console. On a VPS, logs are captured by PM2 or journalctl depending on your process manager.
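If you want log lines that external tools can parse reliably, one option is to emit each entry as JSON. The sketch below uses only Python's standard `logging` and `json` modules; `JsonFormatter`, the field names, and the logger name `bot` are choices made for this example, not a required schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object (illustrative helper)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


handler = logging.StreamHandler()  # writes to stderr by default
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("bot")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Guilds: %d", 12)
```

Each line is then a self-contained object with `time`, `level`, and `message` keys, which makes it easier to filter for errors or extract the last entry before a crash.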
Responding to outages
- Check your hosting platform: Is the server running? On MonkeyBytes, check the dashboard. On a VPS, SSH in and check your process manager.
- Check Discord's status: Visit status.discord.com to see if Discord itself is having issues. If Discord is down, your bot cannot connect regardless of your hosting.
- Check your logs: Look for error messages that explain why the bot went down. Common causes include token invalidation, unhandled exceptions, and out-of-memory crashes.
- Fix and redeploy: Once you identify the cause, fix the issue and redeploy. For help with common errors, see our troubleshooting guide.
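For the log-checking step, a small script can flag the common causes listed above. This is a rough sketch: the patterns are illustrative guesses at typical error text and should be adjusted to match what your bot actually logs, and `likely_causes` is a hypothetical helper name.

```python
import re
from typing import List

# Illustrative patterns only; adjust them to match what your bot actually logs.
CAUSE_PATTERNS = {
    "token invalidation": re.compile(
        r"improper token|invalid token|401 unauthorized", re.I),
    "out of memory": re.compile(
        r"MemoryError|out of memory|heap limit", re.I),
    "unhandled exception": re.compile(
        r"Traceback \(most recent call last\)|unhandled|uncaught exception", re.I),
}


def likely_causes(log_lines: List[str]) -> List[str]:
    """Return the cause labels whose pattern matches any log line."""
    found = []
    for label, pattern in CAUSE_PATTERNS.items():
        if any(pattern.search(line) for line in log_lines):
            found.append(label)
    return found
```

Running this over the last few hundred log lines gives a starting point for investigation; it does not replace reading the log around the time of the crash.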
Monitoring best practices
- Do not over-monitor. Checking every 30 seconds is unnecessary for most bots. A 5-minute interval catches real outages without creating noise.
- Set up meaningful alerts. An alert should tell you what is wrong and point you to where to investigate. A vague "bot is down" alert is less useful than "bot offline for 10 minutes, last log entry: MemoryError".
- Monitor from outside your host. If your hosting platform goes down, monitoring running on the same platform will not alert you. Use an external service.
- Track trends, not just incidents. A bot that uses 200 MB today and 400 MB next week has a memory leak that will eventually cause a crash. Catching the trend prevents the incident.
- Test your alerts. Intentionally stop your bot and confirm you receive an alert within the expected timeframe. An alert system you have never tested is an alert system you cannot trust.
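The "meaningful alerts" practice can be sketched as a small message builder that combines the outage duration with the last log line. `build_alert` and its parameters are hypothetical names for this example; how you obtain the last log line depends on your host.

```python
from datetime import timedelta
from typing import Optional


def build_alert(bot_name: str, offline_for: timedelta,
                last_log_line: Optional[str] = None) -> str:
    """Compose an alert that says what is wrong and where to look first."""
    minutes = int(offline_for.total_seconds() // 60)
    message = f"{bot_name} offline for {minutes} minutes"
    if last_log_line:
        message += f", last log entry: {last_log_line.strip()}"
    return message


print(build_alert("bot", timedelta(minutes=10), "MemoryError"))
# bot offline for 10 minutes, last log entry: MemoryError
```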
For related topics, read our performance optimisation guide to prevent many common causes of downtime, or visit the features page to learn about MonkeyBytes' built-in monitoring capabilities.