MongoDB on Linux: Complete NoSQL Database Deployment (Linux Mastery Series)
Deploy MongoDB on Linux in 5 Minutes
MongoDB on Linux provides a high-performance, document-oriented NoSQL database solution that scales horizontally across distributed systems. Unlike traditional relational databases, MongoDB stores data in flexible, JSON-like BSON documents, making it ideal for modern applications requiring rapid development and schema flexibility.
Quick Start Command:
# Install MongoDB Community Edition (Ubuntu/Debian)
# (apt-key is deprecated; import the key into a dedicated keyring instead)
wget -qO - https://www.mongodb.org/static/pgp/server-7.0.asc | sudo gpg --dearmor -o /usr/share/keyrings/mongodb-server-7.0.gpg
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu $(lsb_release -sc)/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
sudo apt update && sudo apt install -y mongodb-org
sudo systemctl start mongod && sudo systemctl enable mongod
Verify Installation:
mongosh --eval 'db.runCommand({ connectionStatus: 1 })'
This quick deployment establishes a working MongoDB instance with systemd integration and automatic startup. The database listens on localhost:27017 by default, but authentication is not yet enabled, so complete the security steps below before exposing it beyond localhost.
Table of Contents
- What is MongoDB and Why Choose it on Linux?
- How to Install MongoDB on Different Linux Distributions?
- What are the Essential MongoDB Configuration Settings?
- How to Secure MongoDB Authentication and Authorization?
- What is Sharding and How to Implement it?
- How to Configure MongoDB Replica Sets for High Availability?
- What are the Best Performance Tuning Practices?
- How to Backup and Restore MongoDB Databases?
- FAQ: Common MongoDB on Linux Questions
- Troubleshooting Common MongoDB Issues
- Additional Resources
What is MongoDB and Why Choose it on Linux?
MongoDB represents a departure from traditional relational database management systems (RDBMS). As a document-oriented NoSQL database, it stores data in flexible BSON (Binary JSON) documents rather than rigid table structures. This architectural decision enables developers to iterate rapidly without complex schema migrations.
Key Advantages of MongoDB on Linux
Native Linux Integration: MongoDB’s development team optimizes the database specifically for Linux environments. Moreover, the database engine takes advantage of Linux features such as filesystem caching and efficient I/O scheduling to deliver strong performance. (Note that MongoDB recommends disabling transparent huge pages; see the tuning section below.)
Horizontal Scalability: Unlike vertical scaling limitations in traditional databases, MongoDB implements automatic sharding to distribute data across multiple servers. As a result, your database can grow seamlessly from gigabytes to petabytes.
Schema Flexibility: Documents within the same collection can have different structures. As a result, you can evolve your data model without downtime or complex ALTER TABLE operations.
Rich Query Language: MongoDB provides a powerful query language supporting secondary indexes, aggregation pipelines, and geospatial queries. Additionally, the database includes native support for text search and JSON-style documents.
When to Deploy MongoDB on Linux
Consider MongoDB on Linux for these specific use cases:
- Content Management Systems: Variable document structures accommodate different content types naturally
- Real-Time Analytics: High write throughput and horizontal scaling handle massive event streams
- Mobile Applications: Flexible schemas adapt to rapidly evolving mobile app requirements
- Internet of Things (IoT): Time-series collections efficiently store sensor data at scale
- Catalog Systems: Nested documents model complex product hierarchies without JOIN operations
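The document model behind these use cases is easy to picture in plain Python. The snippet below is illustrative only (the collection and field names are invented): two documents in the same hypothetical catalog carry different fields, something a fixed relational schema would not allow without NULL-heavy columns or extra tables.

```python
# Two documents in the same hypothetical "products" collection:
# each carries only the fields that make sense for it.
book = {
    "_id": 1,
    "type": "book",
    "title": "Linux Mastery",
    "authors": ["A. Admin"],               # array field
    "isbn": "978-0-00-000000-0",
}
sensor_kit = {
    "_id": 2,
    "type": "iot-kit",
    "title": "Sensor Starter Kit",
    "specs": {"voltage": 5, "ports": 4},   # nested document instead of a JOIN
}
catalog = [book, sensor_kit]

# Queries simply match on whatever fields exist in each document.
book_titles = [d["title"] for d in catalog if d.get("type") == "book"]
print(book_titles)  # ['Linux Mastery']
```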
According to the MongoDB documentation, the database powers applications at companies like eBay, MetLife, and The Weather Company, processing billions of operations daily.
How to Install MongoDB on Different Linux Distributions?
MongoDB installation varies slightly across Linux distributions. Nevertheless, the process follows consistent patterns regardless of your chosen distribution.
Installing MongoDB on Ubuntu/Debian Systems
Ubuntu and Debian systems require adding the official MongoDB repository before installation. Specifically, this ensures you receive the latest stable releases with security updates.
# Import MongoDB public GPG key
wget -qO - https://www.mongodb.org/static/pgp/server-7.0.asc | sudo gpg --dearmor -o /usr/share/keyrings/mongodb-server-7.0.gpg
# Create repository list file
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu $(lsb_release -sc)/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
# Update package database
sudo apt update
# Install MongoDB packages
sudo apt install -y mongodb-org mongodb-org-server mongodb-org-shell mongodb-org-mongos mongodb-org-tools
# Start and enable MongoDB service
sudo systemctl start mongod
sudo systemctl enable mongod
sudo systemctl status mongod
Important: The installation creates a default configuration file at /etc/mongod.conf with security-conscious defaults including localhost-only binding.
MongoDB Installation on RHEL/CentOS/Fedora
Red Hat-based distributions utilize YUM/DNF package managers. Therefore, the repository configuration differs from Debian-based systems.
# Create MongoDB repository file
sudo tee /etc/yum.repos.d/mongodb-org-7.0.repo > /dev/null <<EOF
[mongodb-org-7.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/\$releasever/mongodb-org/7.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-7.0.asc
EOF
# Install MongoDB
sudo dnf install -y mongodb-org
# SELinux: the default port 27017 is already permitted by the mongod policy.
# For a non-default port, add it explicitly (requires policycoreutils-python-utils):
# sudo semanage port -a -t mongod_port_t -p tcp <custom-port>
# Start MongoDB service
sudo systemctl start mongod
sudo systemctl enable mongod
SELinux Consideration: Red Hat systems require explicit SELinux policy configuration to allow MongoDB network operations. Without proper SELinux context, the database may fail to bind to network interfaces.
Installing MongoDB on Arch Linux
Arch Linux users benefit from the Arch User Repository (AUR) for MongoDB installation. However, package management differs significantly from other distributions.
# MongoDB is not in the official Arch repositories; build from the AUR
git clone https://aur.archlinux.org/mongodb-bin.git
cd mongodb-bin
makepkg -si
# Or use an AUR helper
yay -S mongodb-bin
# Create MongoDB data directory
sudo mkdir -p /var/lib/mongodb
sudo chown -R mongodb:mongodb /var/lib/mongodb
# Start MongoDB
sudo systemctl start mongodb
sudo systemctl enable mongodb
Verifying Your MongoDB Installation
After installation on any distribution, verify the deployment with these diagnostic commands:
# Check MongoDB version
mongod --version
# Verify service status
sudo systemctl status mongod
# Test database connection
mongosh --eval 'db.runCommand({ buildInfo: 1 })'
# Display running processes
ps aux | grep mongod
# Check listening ports
sudo netstat -tlnp | grep mongod
# or with ss command
sudo ss -tlnp | grep mongod
Expected Output: A successful installation shows MongoDB version 7.0 or later, an active systemd service, and the database listening on port 27017.
What are the Essential MongoDB Configuration Settings?
MongoDB’s primary configuration file /etc/mongod.conf uses YAML format. Consequently, proper indentation is critical for valid configuration. The default configuration provides security-conscious settings, but production deployments require additional optimization.
Core Configuration File Structure
# /etc/mongod.conf - Production Configuration Example

# Storage configuration
# (Journaling is always enabled in MongoDB 6.1+; the storage.journal option was removed)
storage:
  dbPath: /var/lib/mongodb
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8
      directoryForIndexes: true

# System logging
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
  logRotate: reopen

# Network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,192.168.1.50
  maxIncomingConnections: 65536

# Security settings
security:
  authorization: enabled
  keyFile: /etc/mongodb/keyfile

# Replication configuration (replica set members only)
replication:
  replSetName: rs0
  oplogSizeMB: 2048

# Sharding configuration (shard members only)
sharding:
  clusterRole: shardsvr

# Process management
# (leave fork disabled when mongod runs under systemd)
processManagement:
  timeZoneInfo: /usr/share/zoneinfo

# Operation profiling
operationProfiling:
  mode: slowOp
  slowOpThresholdMs: 100
Understanding Storage Engine Options
MongoDB on Linux supports multiple storage engines. However, WiredTiger provides the best performance for most workloads. Specifically, this engine offers:
- Document-level concurrency control: Multiple operations modify different documents simultaneously
- Compression: Snappy block compression typically reduces the storage footprint substantially (the exact ratio is workload-dependent)
- Checkpointing: Ensures data durability with configurable checkpoint intervals
# Configure WiredTiger cache size
# Default: the larger of 50% of (RAM - 1 GB) or 256 MB
# For a dedicated server with 16GB RAM, ~8GB is a reasonable starting point:
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8

# Enable compression
storage:
  wiredTiger:
    collectionConfig:
      blockCompressor: snappy
    indexConfig:
      prefixCompression: true
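The default cache-size rule is easy to get wrong from memory, so here is the arithmetic as a small sketch. The helper name is invented for illustration; the formula is WiredTiger's documented default of the larger of 50% of (RAM - 1 GB) or 256 MB.

```python
def default_wiredtiger_cache_gb(total_ram_gb: float) -> float:
    """Approximate WiredTiger's default cache size:
    the larger of 50% of (RAM - 1 GB) or 256 MB."""
    return max(0.5 * (total_ram_gb - 1.0), 0.25)

# A 16 GB host defaults to about 7.5 GB of cache; raise
# cacheSizeGB explicitly only on dedicated database servers.
for ram in (2, 4, 16, 64):
    print(f"{ram} GB RAM -> {default_wiredtiger_cache_gb(ram)} GB cache")
```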
Network Configuration Best Practices
By default, MongoDB binds only to localhost for security. However, production deployments require external access. Therefore, configure binding appropriately:
# Bind to specific interfaces
net:
  bindIp: 127.0.0.1,10.0.1.50,10.0.1.51

# Alternative: bind to all interfaces (NOT recommended for production)
net:
  bindIpAll: true

# Configure maximum connections
net:
  maxIncomingConnections: 65536
Security Warning: Never expose MongoDB directly to the public internet without authentication and firewall protection. According to NIST cybersecurity guidelines, database servers should exist behind multiple security layers.
Applying Configuration Changes
After modifying /etc/mongod.conf, restart the service to apply changes:
# Print the resolved configuration without starting the server (MongoDB 4.4+)
mongod --config /etc/mongod.conf --outputConfig
# Restart MongoDB service
sudo systemctl restart mongod
# Verify configuration loaded successfully
mongosh --eval 'db.adminCommand( { getCmdLineOpts: 1 } )'
# Check for configuration errors in logs
sudo tail -f /var/log/mongodb/mongod.log
How to Secure MongoDB Authentication and Authorization?
MongoDB deployments without authentication face critical security vulnerabilities. Consequently, implementing robust security measures protects your data from unauthorized access. The MongoDB security model implements role-based access control (RBAC) with fine-grained privileges.
Enabling Authentication
By default, MongoDB installations allow unauthenticated access from localhost. Therefore, you must explicitly enable authentication before production deployment.
# Step 1: Create administrative user
mongosh admin --eval '
db.createUser({
user: "adminUser",
pwd: passwordPrompt(),
roles: [
{ role: "userAdminAnyDatabase", db: "admin" },
{ role: "readWriteAnyDatabase", db: "admin" },
{ role: "dbAdminAnyDatabase", db: "admin" },
{ role: "clusterAdmin", db: "admin" }
]
})'
# Step 2: Enable authentication in configuration
# (if mongod.conf already contains a security: section, edit it instead of appending)
sudo tee -a /etc/mongod.conf > /dev/null <<EOF
security:
  authorization: enabled
EOF
# Step 3: Restart MongoDB
sudo systemctl restart mongod
# Step 4: Authenticate and verify
mongosh -u adminUser -p --authenticationDatabase admin
Creating Application-Specific Users
Rather than using administrative credentials in applications, create dedicated users with minimal privileges. This follows the principle of least privilege.
// Connect as admin
use admin
db.auth("adminUser", "securePassword")
// Create database-specific user
use myAppDatabase
db.createUser({
user: "appUser",
pwd: "strongApplicationPassword",
roles: [
{ role: "readWrite", db: "myAppDatabase" }
]
})
// Create read-only analyst user
db.createUser({
user: "analystUser",
pwd: "analystPassword",
roles: [
{ role: "read", db: "myAppDatabase" }
]
})
Implementing Replica Set Key Authentication
Replica sets require shared secret key files for internal authentication between cluster members. Therefore, generate and distribute key files securely:
# Generate keyfile with appropriate permissions
sudo mkdir -p /etc/mongodb
openssl rand -base64 756 | sudo tee /etc/mongodb/keyfile > /dev/null
sudo chmod 400 /etc/mongodb/keyfile
sudo chown mongodb:mongodb /etc/mongodb/keyfile
# Configure keyfile in mongod.conf
security:
  authorization: enabled
  keyFile: /etc/mongodb/keyfile
# Distribute keyfile to all replica set members
# IMPORTANT: Use secure transfer methods (scp with SSH keys)
scp -i ~/.ssh/mongodb_key /etc/mongodb/keyfile user@replica-member-2:/tmp/
ssh user@replica-member-2 "sudo mv /tmp/keyfile /etc/mongodb/ && sudo chmod 400 /etc/mongodb/keyfile && sudo chown mongodb:mongodb /etc/mongodb/keyfile"
TLS/SSL Encryption Configuration
Encrypting network traffic prevents eavesdropping on database communications. However, TLS implementation requires proper certificate management.
# Generate self-signed certificate (for testing only)
sudo openssl req -newkey rsa:2048 -new -x509 -days 365 -nodes \
  -out /etc/mongodb/mongodb-cert.crt \
  -keyout /etc/mongodb/mongodb-cert.key
# Combine certificate and key
sudo cat /etc/mongodb/mongodb-cert.key /etc/mongodb/mongodb-cert.crt | sudo tee /etc/mongodb/mongodb.pem > /dev/null
sudo chmod 400 /etc/mongodb/mongodb.pem
sudo chown mongodb:mongodb /etc/mongodb/mongodb.pem
# Configure TLS in mongod.conf
net:
  tls:
    mode: requireTLS
    certificateKeyFile: /etc/mongodb/mongodb.pem
# Connect with TLS enabled
mongosh --tls --tlsCAFile /etc/mongodb/mongodb-cert.crt --host mongodb.example.com
Production Recommendation: Obtain certificates from trusted certificate authorities like Let’s Encrypt rather than self-signed certificates. The Let’s Encrypt documentation provides comprehensive guidance.
Firewall Configuration for MongoDB
Operating system firewalls provide an additional security layer. Consequently, restrict MongoDB port access to trusted systems only.
# UFW (Ubuntu/Debian)
sudo ufw allow from 192.168.1.0/24 to any port 27017
sudo ufw enable
# firewalld (RHEL/CentOS/Fedora)
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="27017" protocol="tcp" accept'
sudo firewall-cmd --reload
# iptables (traditional approach)
sudo iptables -A INPUT -p tcp -s 192.168.1.0/24 --dport 27017 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 27017 -j DROP
sudo iptables-save | sudo tee /etc/iptables/rules.v4
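All three firewall variants above encode the same decision: accept port 27017 traffic only from the trusted 192.168.1.0/24 subnet. That decision can be sanity-checked with the standard-library `ipaddress` module; the helper name here is invented for illustration.

```python
import ipaddress

# Trusted subnet from the firewall rules above
ALLOWED = ipaddress.ip_network("192.168.1.0/24")

def client_allowed(ip: str) -> bool:
    """Mirror the firewall decision: accept only addresses inside ALLOWED."""
    return ipaddress.ip_address(ip) in ALLOWED

print(client_allowed("192.168.1.42"))   # inside the /24 -> accepted
print(client_allowed("203.0.113.9"))    # outside -> dropped
```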
For comprehensive security guidance, consult the MongoDB Security Checklist maintained by MongoDB, Inc.
What is Sharding and How to Implement it?
MongoDB sharding distributes data across multiple servers to handle massive datasets and high-throughput operations. Unlike replica sets that duplicate data for redundancy, sharding partitions data horizontally for scalability. Consequently, sharded clusters can store petabytes of data and handle millions of operations per second.
Understanding Sharding Architecture
A MongoDB sharded cluster consists of three critical components:
- Shard Servers: Store subsets of your data collection
- Config Servers: Maintain cluster metadata and routing information
- mongos Routers: Direct client requests to appropriate shards
Setting Up Configuration Servers
Configuration servers maintain the authoritative metadata for the sharded cluster. Therefore, deploy them as a replica set for high availability.
# Deploy config server replica set on three servers
# On config-server-1 (192.168.1.10)
mongod --configsvr --replSet configReplSet --dbpath /var/lib/mongodb/config --port 27019 --bind_ip localhost,192.168.1.10
# On config-server-2 (192.168.1.11)
mongod --configsvr --replSet configReplSet --dbpath /var/lib/mongodb/config --port 27019 --bind_ip localhost,192.168.1.11
# On config-server-3 (192.168.1.12)
mongod --configsvr --replSet configReplSet --dbpath /var/lib/mongodb/config --port 27019 --bind_ip localhost,192.168.1.12
# Initialize config server replica set
mongosh --host 192.168.1.10 --port 27019
rs.initiate({
_id: "configReplSet",
configsvr: true,
members: [
{ _id: 0, host: "192.168.1.10:27019" },
{ _id: 1, host: "192.168.1.11:27019" },
{ _id: 2, host: "192.168.1.12:27019" }
]
})
Deploying Shard Servers
Each shard operates as an independent replica set. Consequently, you deploy them using the same procedures as standalone replica sets:
# Shard 1 - Three member replica set
# On shard1-member1 (192.168.1.20)
mongod --shardsvr --replSet shard1 --dbpath /var/lib/mongodb/shard1 --port 27018 --bind_ip localhost,192.168.1.20
# On shard1-member2 (192.168.1.21)
mongod --shardsvr --replSet shard1 --dbpath /var/lib/mongodb/shard1 --port 27018 --bind_ip localhost,192.168.1.21
# On shard1-member3 (192.168.1.22)
mongod --shardsvr --replSet shard1 --dbpath /var/lib/mongodb/shard1 --port 27018 --bind_ip localhost,192.168.1.22
# Initialize shard replica set
mongosh --host 192.168.1.20 --port 27018
rs.initiate({
_id: "shard1",
members: [
{ _id: 0, host: "192.168.1.20:27018" },
{ _id: 1, host: "192.168.1.21:27018" },
{ _id: 2, host: "192.168.1.22:27018" }
]
})
# Repeat process for shard2, shard3, etc.
Deploying mongos Query Routers
The mongos process routes client requests to appropriate shards. Moreover, multiple mongos instances provide high availability for client connections.
# Start mongos on application servers
mongos --configdb configReplSet/192.168.1.10:27019,192.168.1.11:27019,192.168.1.12:27019 \
--bind_ip localhost,192.168.1.50 \
--port 27017
# Add shards to cluster
mongosh --host 192.168.1.50 --port 27017
sh.addShard("shard1/192.168.1.20:27018,192.168.1.21:27018,192.168.1.22:27018")
sh.addShard("shard2/192.168.1.30:27018,192.168.1.31:27018,192.168.1.32:27018")
sh.addShard("shard3/192.168.1.40:27018,192.168.1.41:27018,192.168.1.42:27018")
# Verify cluster status
sh.status()
Enabling Sharding for Databases and Collections
After establishing the cluster infrastructure, enable sharding for specific databases and collections. However, choose your shard key carefully as it significantly impacts performance.
// Enable sharding on database
sh.enableSharding("myDatabase")
// Shard collection using hashed shard key (good for even distribution)
sh.shardCollection("myDatabase.orders", { user_id: "hashed" })
// Shard collection using range-based shard key (good for range queries)
sh.shardCollection("myDatabase.timeseries", { timestamp: 1, sensor_id: 1 })
// Verify sharding configuration
db.orders.getShardDistribution()
// Check chunk distribution
sh.status()
Choosing Effective Shard Keys
The shard key determines how MongoDB distributes documents across shards. Consequently, poor shard key selection causes performance bottlenecks:
Good Shard Key Characteristics:
- High cardinality (many unique values)
- Even distribution of queries across shards
- Avoids monotonically increasing values (prevents hotspots)
Recommended Shard Key Patterns:
// Hashed shard key - Even distribution
{ user_id: "hashed" }
// Compound shard key - Targeted queries with even distribution
{ country: 1, user_id: 1 }
// Time-based with additional field - Prevents hotspots
{ created_month: 1, order_id: 1 }
// BAD: Monotonically increasing key - Creates hotspot on one shard
{ _id: 1 } // ObjectId increases monotonically
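The contrast between the good and bad patterns above can be simulated. MongoDB's real hashed index uses its own internal 64-bit hash; the sketch below substitutes MD5 purely to illustrate the principle that any uniform hash spreads monotonically increasing keys evenly across shards, whereas range-sharding on such keys sends every new document to one "hot" shard.

```python
import hashlib

NUM_SHARDS = 4

def shard_for(user_id: int) -> int:
    # Stand-in for MongoDB's internal hash function (illustration only)
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

counts = [0] * NUM_SHARDS
for user_id in range(10_000):       # monotonically increasing ids
    counts[shard_for(user_id)] += 1

# Each shard receives roughly 2,500 of the 10,000 documents,
# unlike range-sharding on an increasing _id.
print(counts)
```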
For detailed sharding strategies, review the MongoDB Sharding Documentation.
How to Configure MongoDB Replica Sets for High Availability?
MongoDB replica sets provide data redundancy and high availability through automatic failover. When the primary node fails, replica set members automatically elect a new primary. Consequently, applications experience minimal downtime during failures.
Replica Set Architecture
A typical replica set contains three members:
- Primary: Accepts all write operations
- Secondary (2x): Replicate primary’s data and can serve read operations
- Optional Arbiter: Participates in elections but doesn’t store data
Deploying a Three-Member Replica Set
# On member1 (192.168.1.60)
sudo mkdir -p /var/lib/mongodb/rs0
sudo chown mongodb:mongodb /var/lib/mongodb/rs0
mongod --replSet rs0 --dbpath /var/lib/mongodb/rs0 --port 27017 \
--bind_ip localhost,192.168.1.60
# On member2 (192.168.1.61)
mongod --replSet rs0 --dbpath /var/lib/mongodb/rs0 --port 27017 \
--bind_ip localhost,192.168.1.61
# On member3 (192.168.1.62)
mongod --replSet rs0 --dbpath /var/lib/mongodb/rs0 --port 27017 \
--bind_ip localhost,192.168.1.62
# Initialize replica set from member1
mongosh --host 192.168.1.60
rs.initiate({
_id: "rs0",
members: [
{ _id: 0, host: "192.168.1.60:27017", priority: 2 },
{ _id: 1, host: "192.168.1.61:27017", priority: 1 },
{ _id: 2, host: "192.168.1.62:27017", priority: 1 }
]
})
# Verify replica set status
rs.status()
rs.conf()
Understanding Replica Set Member Priorities
Member priority determines election preferences during failover. Higher priority members become primary more readily:
// Reconfigure member priorities
cfg = rs.conf()
cfg.members[0].priority = 3 // Preferred primary
cfg.members[1].priority = 1 // Standard secondary
cfg.members[2].priority = 0.5 // Backup secondary
rs.reconfig(cfg)
// Create hidden member (won't become primary, can't receive client reads)
cfg.members[2].priority = 0
cfg.members[2].hidden = true
rs.reconfig(cfg)
// Create delayed member (for recovery from accidental deletions)
cfg.members[2].priority = 0
cfg.members[2].hidden = true
cfg.members[2].secondaryDelaySecs = 3600 // 1 hour delay
rs.reconfig(cfg)
Testing Automatic Failover
Validate your replica set’s failover behavior before production deployment:
# Identify current primary
mongosh --host 192.168.1.60 --eval "rs.status().members.find(m => m.state === 1).name"
# Simulate primary failure (on primary server)
sudo systemctl stop mongod
# Monitor election from secondary
mongosh --host 192.168.1.61
rs.status()
# Observe new primary election (typically completes in 10-12 seconds)
# Expected output shows different member as PRIMARY state
# Restart original primary (becomes secondary)
sudo systemctl start mongod
Read Preference Configuration
By default, applications read from the primary. However, read preferences distribute read load across secondaries:
// In application connection string
mongodb://192.168.1.60:27017,192.168.1.61:27017,192.168.1.62:27017/myDatabase?replicaSet=rs0&readPreference=secondaryPreferred
// Read preference options:
// - primary: Default, all reads from primary
// - primaryPreferred: Primary if available, otherwise secondary
// - secondary: Only from secondaries
// - secondaryPreferred: Secondaries if available, otherwise primary
// - nearest: Lowest network latency member
Important: Reading from secondaries may return stale data due to replication lag. Therefore, use secondary reads only for eventually consistent queries.
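Real drivers implement server selection with latency windows and staleness checks; the simplified model below (all names invented) captures just the core preference logic described above, which can help when reasoning about where reads will land.

```python
from typing import Optional

def pick_member(members: list, preference: str) -> Optional[dict]:
    """Simplified model of driver server selection.
    members: dicts with "host", "is_primary", "latency_ms"."""
    primaries = [m for m in members if m["is_primary"]]
    secondaries = [m for m in members if not m["is_primary"]]
    if preference == "primary":
        return primaries[0] if primaries else None
    if preference == "secondaryPreferred":
        pool = secondaries or primaries
        return min(pool, key=lambda m: m["latency_ms"]) if pool else None
    if preference == "nearest":
        return min(members, key=lambda m: m["latency_ms"]) if members else None
    raise ValueError(f"unsupported preference: {preference}")

rs = [
    {"host": "192.168.1.60", "is_primary": True,  "latency_ms": 12},
    {"host": "192.168.1.61", "is_primary": False, "latency_ms": 3},
    {"host": "192.168.1.62", "is_primary": False, "latency_ms": 7},
]
print(pick_member(rs, "secondaryPreferred")["host"])  # 192.168.1.61
print(pick_member(rs, "nearest")["host"])             # 192.168.1.61
```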
Monitoring Replication Lag
Excessive replication lag indicates performance problems. Consequently, monitor this metric continuously:
// Check replication lag
rs.printSecondaryReplicationInfo()
// Expected output shows lag in seconds:
// source: 192.168.1.61:27017
// syncedTo: Mon Oct 26 2025 10:15:30 GMT+0000 (UTC)
// 0 secs (0 hrs) behind the primary
// Monitor oplog size
db.printReplicationInfo()
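The lag figure printed above is simply the gap between the primary's and the secondary's last applied operation times. A minimal sketch of that arithmetic (function name invented):

```python
from datetime import datetime, timedelta

def replication_lag_secs(primary_optime: datetime,
                         secondary_optime: datetime) -> float:
    """Lag = how far the secondary's last applied op trails the primary's."""
    return max((primary_optime - secondary_optime).total_seconds(), 0.0)

primary_ts = datetime(2025, 10, 26, 10, 15, 30)
secondary_ts = primary_ts - timedelta(seconds=4)
lag = replication_lag_secs(primary_ts, secondary_ts)
print(f"{lag:.0f} secs behind the primary")  # 4 secs behind the primary
```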
For comprehensive replica set guidance, review the MongoDB Replication Documentation and the Red Hat MongoDB Administration Guide.
What are the Best Performance Tuning Practices?
MongoDB performance optimization requires systematic analysis of workload patterns, index strategies, and system resources. However, premature optimization often wastes effort on non-critical bottlenecks. Therefore, always measure before optimizing.
Index Strategy Development
Indexes dramatically accelerate query performance. Conversely, excessive indexes slow write operations and consume storage. Consequently, design indexes strategically:
// Create single-field index
db.users.createIndex({ email: 1 })
// Compound index for multiple query patterns
db.orders.createIndex({ user_id: 1, created_at: -1, status: 1 })
// Unique index with sparse option (omits documents lacking the field)
db.users.createIndex({ username: 1 }, { unique: true, sparse: true })
// Text index for full-text search
db.articles.createIndex({ title: "text", content: "text" })
// Geospatial index for location queries
db.locations.createIndex({ coordinates: "2dsphere" })
// TTL index for automatic document expiration
db.sessions.createIndex({ created_at: 1 }, { expireAfterSeconds: 3600 })
// Partial index (indexes subset of documents)
db.products.createIndex(
{ sku: 1, warehouse: 1 },
{ partialFilterExpression: { inStock: true } }
)
Analyzing Query Performance
MongoDB’s explain() method reveals query execution details:
// Analyze query execution
db.orders.explain("executionStats").find({
user_id: 12345,
created_at: { $gte: ISODate("2025-01-01") }
}).sort({ created_at: -1 })
// Key metrics to examine:
// - executionTimeMillis: Total query time
// - totalKeysExamined: Index entries scanned
// - totalDocsExamined: Documents examined
// - nReturned: Documents returned
//
// Ideal: totalDocsExamined ≈ nReturned (minimal document scanning)
// Identify missing indexes
db.orders.find({
user_id: 12345,
created_at: { $gte: ISODate("2025-01-01") }
}).explain("executionStats").executionStats.executionStages.stage
// If output shows "COLLSCAN", add appropriate index
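The "look for COLLSCAN" check above is mechanical enough to automate. The sketch below walks a simplified executionStages tree of the kind explain() returns (real plans carry many more fields; the helper names are invented):

```python
def plan_stages(stage: dict):
    """Yield every stage name in an explain() executionStages tree."""
    yield stage.get("stage")
    child = stage.get("inputStage")
    if child:
        yield from plan_stages(child)
    for child in stage.get("inputStages", []):
        yield from plan_stages(child)

def needs_index(execution_stages: dict) -> bool:
    """True if the plan falls back to a full collection scan."""
    return "COLLSCAN" in set(plan_stages(execution_stages))

# Shapes mimic (heavily simplified) explain("executionStats") output
indexed_plan = {"stage": "FETCH", "inputStage": {"stage": "IXSCAN"}}
full_scan = {"stage": "COLLSCAN"}
print(needs_index(indexed_plan), needs_index(full_scan))  # False True
```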
WiredTiger Cache Configuration
The WiredTiger cache significantly impacts performance. Therefore, configure it appropriately for your workload:
# /etc/mongod.conf
storage:
  wiredTiger:
    engineConfig:
      # Default: the larger of 50% of (RAM - 1 GB) or 256 MB
      # For a dedicated MongoDB server with 32GB RAM: ~15GB
      cacheSizeGB: 15
      # Journal compression (reduces I/O)
      journalCompressor: snappy
Monitoring Cache Performance:
// Check cache utilization
db.serverStatus().wiredTiger.cache
// Key metrics:
// - "bytes currently in the cache": Current cache size
// - "maximum bytes configured": Configured limit
// - "pages read into cache": Cache misses
// - "pages requested from the cache": Total requests
// Calculate cache hit ratio
var stats = db.serverStatus().wiredTiger.cache
var hitRatio = (stats["pages requested from the cache"] - stats["pages read into cache"]) / stats["pages requested from the cache"] * 100
print("Cache hit ratio: " + hitRatio.toFixed(2) + "%")
// Target: >90% for production workloads
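The same hit-ratio arithmetic the mongosh snippet performs, as a standalone sketch with sample serverStatus-style numbers (the figures are made up for illustration):

```python
def cache_hit_ratio(pages_requested: int, pages_read_in: int) -> float:
    """Hit ratio %: requests served without reading a page from disk."""
    if pages_requested == 0:
        return 100.0
    return (pages_requested - pages_read_in) / pages_requested * 100.0

stats = {"pages requested from the cache": 1_000_000,
         "pages read into cache": 42_000}
ratio = cache_hit_ratio(stats["pages requested from the cache"],
                        stats["pages read into cache"])
print(f"Cache hit ratio: {ratio:.2f}%")  # Cache hit ratio: 95.80%
```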
Connection Pool Optimization
Connection pooling reduces connection overhead. However, excessive connections consume system resources:
# /etc/mongod.conf
net:
  maxIncomingConnections: 65536  # System maximum

# MongoDB automatically manages internal connection pools.
# For application configuration, set appropriate pool sizes:

// Node.js driver example:
const client = new MongoClient(uri, {
  maxPoolSize: 100,
  minPoolSize: 10,
  maxIdleTimeMS: 30000
});

# Python PyMongo example:
client = MongoClient(uri,
                     maxPoolSize=100,
                     minPoolSize=10,
                     maxIdleTimeMS=30000)
Aggregation Pipeline Optimization
Aggregation pipelines process complex data transformations. Consequently, optimize them for performance:
// INEFFICIENT: No indexes, examines all documents
db.orders.aggregate([
{ $match: { status: "completed" } },
{ $group: { _id: "$user_id", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } },
{ $limit: 10 }
])
// OPTIMIZED: Index on status, early filtering, memory-efficient sort
db.orders.createIndex({ status: 1, amount: -1 })
db.orders.aggregate([
{ $match: { status: "completed" } }, // Uses index
{ $sort: { amount: -1 } }, // Index-supported sort
{ $group: { _id: "$user_id", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } },
{ $limit: 10 }
], { allowDiskUse: true }) // Enable disk storage for large datasets
// Check aggregation explain plan
db.orders.explain("executionStats").aggregate([...])
Operating System Tuning
Linux kernel parameters affect MongoDB performance. Therefore, optimize them for database workloads:
# Disable transparent huge pages (THP) - causes latency spikes
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag
# Make persistent across reboots
sudo tee /etc/systemd/system/disable-thp.service > /dev/null <<EOF
[Unit]
Description=Disable Transparent Huge Pages (THP)
After=sysinit.target local-fs.target
[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'
[Install]
WantedBy=basic.target
EOF
sudo systemctl enable disable-thp
sudo systemctl start disable-thp
# Increase file descriptor limits
sudo tee -a /etc/security/limits.conf > /dev/null <<EOF
mongodb soft nofile 64000
mongodb hard nofile 64000
mongodb soft nproc 64000
mongodb hard nproc 64000
EOF
# Set an appropriate I/O scheduler (mq-deadline or none for SSDs on modern kernels)
echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler
# Disable NUMA zone reclaim
echo 0 | sudo tee /proc/sys/vm/zone_reclaim_mode
For comprehensive tuning guidance, consult the MongoDB Production Notes and Linux kernel documentation.
How to Backup and Restore MongoDB Databases?
MongoDB backup strategies balance recovery point objectives (RPO) with backup performance impact. Consequently, production environments typically implement multiple backup tiers. For comprehensive backup planning, review our earlier guide on Database Backup and Recovery Strategies (Post #59).
Logical Backups with mongodump
mongodump creates BSON exports of database contents. Subsequently, mongorestore recreates databases from these exports:
# Complete database backup
mongodump --uri="mongodb://username:password@localhost:27017/myDatabase" \
--out=/backup/mongodb/$(date +%Y%m%d-%H%M%S)
# Backup specific collection
mongodump --uri="mongodb://username:password@localhost:27017/myDatabase" \
--collection=orders \
--out=/backup/mongodb/orders-$(date +%Y%m%d)
# Compressed backup with gzip
mongodump --uri="mongodb://username:password@localhost:27017" \
--gzip \
--archive=/backup/mongodb/full-backup-$(date +%Y%m%d).archive
# Backup with query filter (partial backup)
mongodump --uri="mongodb://username:password@localhost:27017/myDatabase" \
--collection=logs \
--query='{"created_at": {"$gte": {"$date": "2025-01-01T00:00:00Z"}}}' \
--out=/backup/mongodb/recent-logs
Restoring from mongodump Backups
# Restore complete backup
mongorestore --uri="mongodb://username:password@localhost:27017" \
/backup/mongodb/20251026-100000/
# Restore specific database
mongorestore --uri="mongodb://username:password@localhost:27017" \
--nsInclude="myDatabase.*" \
/backup/mongodb/20251026-100000/
# Restore from compressed archive
mongorestore --uri="mongodb://username:password@localhost:27017" \
--gzip \
--archive=/backup/mongodb/full-backup-20251026.archive
# Restore to different database name
mongorestore --uri="mongodb://username:password@localhost:27017" \
--nsFrom="myDatabase.*" \
--nsTo="myDatabase_restored.*" \
/backup/mongodb/20251026-100000/
Filesystem Snapshots for Large Databases
For multi-terabyte databases, filesystem snapshots provide faster backups. However, this approach requires coordinated snapshots across replica set members:
# LVM snapshot approach
# 1. Lock database writes
mongosh --eval 'db.fsyncLock()'
# 2. Create LVM snapshot
sudo lvcreate --size 50G --snapshot --name mongodb_snap /dev/vg0/mongodb_lv
# 3. Unlock database
mongosh --eval 'db.fsyncUnlock()'
# 4. Mount snapshot and copy data
sudo mkdir -p /mnt/mongodb_backup
sudo mount /dev/vg0/mongodb_snap /mnt/mongodb_backup
sudo rsync -av /mnt/mongodb_backup/ /backup/mongodb/snapshot-$(date +%Y%m%d)/
sudo umount /mnt/mongodb_backup
# 5. Remove snapshot
sudo lvremove -f /dev/vg0/mongodb_snap
Point-in-Time Recovery with Oplog
Replica set oplogs enable point-in-time recovery between backups. Consequently, combine filesystem snapshots with oplog tailing:
#!/bin/bash
# Continuous oplog backup script
BACKUP_DIR="/backup/mongodb/oplog"
mkdir -p "$BACKUP_DIR"
while true; do
    TIMESTAMP=$(date +%Y%m%d-%H%M%S)
    mongodump --uri="mongodb://username:password@localhost:27017/local" \
        --collection=oplog.rs \
        --query='{"ts": {"$gte": {"$timestamp": {"t": '$(date -d '1 hour ago' +%s)', "i": 1}}}}' \
        --out="$BACKUP_DIR/oplog-$TIMESTAMP"
    sleep 3600  # Run hourly
done
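Because the loop above always dumps exactly the last hour, a delayed or skipped run can leave a gap between windows. A sketch that builds the $timestamp query with a small overlap margin (the window and margin values are illustrative, not MongoDB requirements):

```shell
#!/bin/bash
# Sketch: build the oplog --query with an overlap margin so consecutive
# runs never leave a gap between dumped windows.
WINDOW_SECONDS=3600   # dump the last hour of oplog entries
MARGIN_SECONDS=300    # 5-minute overlap between consecutive runs
SINCE=$(( $(date +%s) - WINDOW_SECONDS - MARGIN_SECONDS ))
QUERY="{\"ts\": {\"\$gte\": {\"\$timestamp\": {\"t\": $SINCE, \"i\": 1}}}}"
echo "$QUERY"

# mongodump --uri="mongodb://username:password@localhost:27017/local" \
#   --collection=oplog.rs --query="$QUERY" --out="/backup/mongodb/oplog/..."
```

Overlapping windows cost a little duplicate data but make the replay safe: oplog entries are idempotent when applied with mongorestore's --oplogReplay.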
Automated Backup Script
Production deployments require automated, tested backup procedures:
#!/bin/bash
# /usr/local/bin/mongodb-backup.sh
set -euo pipefail
# Configuration
BACKUP_BASE="/backup/mongodb"
RETENTION_DAYS=30
MONGODB_URI="mongodb://backupuser:password@localhost:27017/?authSource=admin"
BACKUP_TIMESTAMP=$(date +%Y%m%d-%H%M%S)
BACKUP_DIR="${BACKUP_BASE}/${BACKUP_TIMESTAMP}"
LOG_FILE="${BACKUP_BASE}/backup-${BACKUP_TIMESTAMP}.log"
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Execute backup
echo "[$(date)] Starting MongoDB backup" | tee -a "$LOG_FILE"
mongodump --uri="$MONGODB_URI" \
--gzip \
--out="$BACKUP_DIR" \
2>&1 | tee -a "$LOG_FILE"
# Verify backup integrity
if mongorestore --uri="$MONGODB_URI" \
--dryRun \
--gzip \
--dir="$BACKUP_DIR" 2>&1 | tee -a "$LOG_FILE"; then
echo "[$(date)] Backup verification successful" | tee -a "$LOG_FILE"
else
echo "[$(date)] ERROR: Backup verification failed!" | tee -a "$LOG_FILE"
exit 1
fi
# Compress backup
tar -czf "${BACKUP_DIR}.tar.gz" -C "$BACKUP_BASE" "$BACKUP_TIMESTAMP"
rm -rf "$BACKUP_DIR"
# Upload to remote storage (optional)
# aws s3 cp "${BACKUP_DIR}.tar.gz" s3://my-backup-bucket/mongodb/
# Clean old backups
find "$BACKUP_BASE" -name "*.tar.gz" -mtime +$RETENTION_DAYS -delete
echo "[$(date)] Backup completed successfully" | tee -a "$LOG_FILE"
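The find -mtime retention rule in the script is easy to get off by one. A self-contained way to sanity-check it against back-dated dummy files in a throwaway directory (touches nothing under /backup; GNU touch and find are assumed):

```shell
#!/bin/bash
# Sketch: verify the retention rule (find -mtime +N -delete) against
# fake backup files with back-dated timestamps, in a temp directory.
set -euo pipefail
RETENTION_DAYS=30
TEST_DIR=$(mktemp -d)

touch -d '40 days ago' "$TEST_DIR/old-backup.tar.gz"    # should be deleted
touch -d '5 days ago'  "$TEST_DIR/recent-backup.tar.gz" # should survive

find "$TEST_DIR" -name "*.tar.gz" -mtime +$RETENTION_DAYS -delete

ls "$TEST_DIR"
# rm -rf "$TEST_DIR" when done
```

Running a rehearsal like this before deploying the cron job is cheap insurance against a retention expression that deletes too much or nothing at all.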
Schedule Automated Backups
# Add to crontab for nightly backups at 2 AM
sudo crontab -e
0 2 * * * /usr/local/bin/mongodb-backup.sh
# Or create systemd timer for more control
sudo tee /etc/systemd/system/mongodb-backup.service > /dev/null <<EOF
[Unit]
Description=MongoDB Backup Service
After=mongod.service
[Service]
Type=oneshot
User=root
ExecStart=/usr/local/bin/mongodb-backup.sh
EOF
sudo tee /etc/systemd/system/mongodb-backup.timer > /dev/null <<EOF
[Unit]
Description=MongoDB Backup Timer
Requires=mongodb-backup.service
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
EOF
sudo systemctl enable mongodb-backup.timer
sudo systemctl start mongodb-backup.timer
For enterprise backup solutions, consider MongoDB Atlas automated backups or Percona Backup for MongoDB.
FAQ: Common MongoDB on Linux Questions
What’s the difference between MongoDB and traditional SQL databases?
MongoDB stores data in flexible JSON-like documents rather than rigid tables. Consequently, you can modify schemas without migrations, nest related data together, and scale horizontally more easily. MongoDB has supported multi-document ACID transactions since version 4.0, but they carry a performance cost, so transaction-heavy applications may still be better served by traditional SQL databases like PostgreSQL (Post #57).
How much RAM does MongoDB require?
MongoDB’s WiredTiger storage engine defaults its cache to 50% of (RAM − 1 GB), with a 256 MB floor. Therefore, a server with 32GB RAM allocates approximately 15.5GB to the WiredTiger cache. Reserve additional RAM for the operating system, connections, and other processes. Minimum recommended RAM: 4GB for development, 16GB+ for production.
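That default can be sketched as simple shell arithmetic (values in MB; 256 MB is the documented cache floor, and the 32 GB figure is just an example):

```shell
#!/bin/bash
# Sketch: compute the default WiredTiger cache size for a given amount
# of physical RAM, following the max(0.5 * (RAM - 1 GB), 256 MB) rule.
ram_mb=32768   # example: 32 GB server

cache_mb=$(( (ram_mb - 1024) / 2 ))
if [ "$cache_mb" -lt 256 ]; then
  cache_mb=256   # WiredTiger's minimum cache size
fi

echo "RAM: ${ram_mb} MB -> default WiredTiger cache: ${cache_mb} MB"
```

Plugging in your own ram_mb shows why small VMs hit the 256 MB floor while large servers devote roughly half their memory to the cache.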
Can I run MongoDB on small Linux servers?
MongoDB operates efficiently on resource-constrained systems. However, performance degrades when the working set exceeds available RAM. For small deployments, consider 4GB RAM minimum, 2 CPU cores, and SSD storage. Alternatively, explore Redis (Post #58) for simpler caching requirements.
Should I use MongoDB for time-series data?
MongoDB 5.0+ includes specialized time-series collections optimized for temporal data. These collections compress historical data automatically and optimize query patterns for time-based operations. Consequently, MongoDB handles IoT sensor data, financial tick data, and application metrics efficiently. However, evaluate specialized time-series databases like InfluxDB or TimescaleDB for extreme-scale scenarios.
How do I monitor MongoDB performance on Linux?
MongoDB provides comprehensive monitoring through:
- mongostat: Real-time server statistics
- mongotop: Collection-level operation tracking
- db.serverStatus(): Detailed metrics API
- MongoDB Cloud Manager: Enterprise monitoring solution
- Prometheus + Grafana: Open-source monitoring stack
For advanced monitoring, review our Prometheus and Grafana setup guide (Post #46).
What’s the recommended Linux distribution for MongoDB?
MongoDB performs well on all major distributions. However, Red Hat Enterprise Linux and Ubuntu LTS receive the most extensive testing. Additionally, these distributions provide long-term support matching enterprise deployment requirements. For production, choose:
- Ubuntu 22.04 LTS or later
- RHEL 8.x / Rocky Linux 8.x or later
- Debian 11+ (if your organization prefers Debian)
Can MongoDB replace my existing MySQL database?
Migration from relational databases to MongoDB requires careful planning. Specifically, applications relying on complex JOINs, strict transaction requirements, or normalized schemas may not benefit from MongoDB. However, applications with flexible schemas, horizontal scaling requirements, or document-oriented data models naturally fit MongoDB. Evaluate your specific use case before migration. For MySQL administration guidance, see Post #56.
Troubleshooting Common MongoDB Issues
Even well-configured deployments encounter operational challenges. A systematic troubleshooting methodology identifies and resolves issues efficiently.
MongoDB Won’t Start After Installation
Symptom: systemctl start mongod fails with error messages.
Diagnostic Steps:
# Check systemd service status
sudo systemctl status mongod
# Review MongoDB logs
sudo tail -n 100 /var/log/mongodb/mongod.log
# Verify the configuration file parses (prints the resolved config and exits)
mongod --config /etc/mongod.conf --outputConfig
# Check file permissions
ls -la /var/lib/mongodb
ls -la /var/log/mongodb
# Verify port availability
sudo ss -tlnp | grep 27017
Common Causes:
- Permission Issues: MongoDB process can’t access data directory
  sudo chown -R mongodb:mongodb /var/lib/mongodb
  sudo chown -R mongodb:mongodb /var/log/mongodb
- Port Already In Use: Another process occupies port 27017
  sudo lsof -i :27017
  # Kill the conflicting process or change the MongoDB port
- Invalid Configuration: Syntax errors in mongod.conf
  # Use yamllint to validate YAML syntax
  sudo apt install yamllint
  yamllint /etc/mongod.conf
Connection Refused Errors
Symptom: Applications can’t connect to MongoDB with “Connection refused” errors.
Diagnostic Steps:
# Verify MongoDB is running
sudo systemctl status mongod
# Check binding configuration
grep bindIp /etc/mongod.conf
# Test local connection
mongosh --host 127.0.0.1
# Test network connection from client
telnet mongodb.example.com 27017
# Check firewall rules
sudo iptables -L -n | grep 27017
sudo firewall-cmd --list-all
Solutions:
# Bind to all interfaces (only after implementing authentication!)
# /etc/mongod.conf
net:
  bindIp: 0.0.0.0

# Or bind to specific interfaces
net:
  bindIp: 127.0.0.1,192.168.1.50
# Open firewall port
sudo firewall-cmd --permanent --add-port=27017/tcp
sudo firewall-cmd --reload
# Restart MongoDB
sudo systemctl restart mongod
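Before digging through firewall rules, a quick reachability probe using bash's built-in /dev/tcp needs no extra tooling (the host and port below are placeholders; "closed" can mean refused, filtered, or timed out):

```shell
#!/bin/bash
# Sketch: probe whether a MongoDB port accepts TCP connections, using
# bash's /dev/tcp pseudo-device. Prints "open" or "closed".
probe() {
  local host=$1 port=$2
  if timeout 2 bash -c ">/dev/tcp/$host/$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

probe 127.0.0.1 27017   # local mongod, if running
```

If the probe reports open locally but closed from the application host, the problem is bindIp or the firewall, not mongod itself.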
Slow Query Performance
Symptom: Queries taking seconds or minutes to complete.
Diagnostic Steps:
// Enable slow query logging
db.setProfilingLevel(1, { slowms: 100 })
// View slow queries
db.system.profile.find().sort({ ts: -1 }).limit(10).pretty()
// Analyze specific query
db.collection.explain("executionStats").find({ field: value })
// Check for missing indexes
db.collection.getIndexes()
// Monitor current operations
db.currentOp()
Optimization Strategies:
// Create appropriate indexes
db.collection.createIndex({ commonly_queried_field: 1 })
// Use compound indexes for multiple query fields
db.collection.createIndex({ field1: 1, field2: -1 })
// Analyze index usage
db.collection.aggregate([
{ $indexStats: {} }
])
// Remove unused indexes (consume resources without benefit)
db.collection.dropIndex("unused_index_name")
Replication Lag Issues
Symptom: Secondary members fall behind primary by minutes or hours.
Diagnostic Steps:
// Check replication status
rs.status()
// View replication lag details
rs.printSecondaryReplicationInfo()
// Monitor oplog size
db.getReplicationInfo()
// Check secondary member health
rs.status().members.filter(m => m.state === 2)
Common Causes and Solutions:
- Insufficient Secondary Resources: Secondary lacks CPU/RAM
  # Monitor system resources
  htop
  iostat -x 1
- Network Latency: High latency between replica set members
  # Test network latency
  ping -c 10 secondary-member.example.com
  mtr secondary-member.example.com
- Small Oplog: Oplog fills before the secondary catches up
  // Increase oplog size to 4GB (run in mongosh)
  db.adminCommand({ replSetResizeOplog: 1, size: 4096 })
High Memory Usage
Symptom: MongoDB consuming excessive system memory.
Diagnostic Steps:
// Check WiredTiger cache usage
db.serverStatus().wiredTiger.cache
// Monitor memory by collection
db.stats()
db.collection.stats()
// Check connection count
db.serverStatus().connections
Memory Optimization:
# Reduce WiredTiger cache if competing with other services
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8  # Adjust based on available RAM

# Limit maximum connections
net:
  maxIncomingConnections: 1000
Authentication Failures After Enabling Security
Symptom: “Authentication failed” errors after enabling authorization.
Diagnostic Steps:
# Verify authentication configuration
grep authorization /etc/mongod.conf
# Test authentication with explicit credentials
mongosh -u adminUser -p --authenticationDatabase admin
# Check user existence and roles
mongosh admin --eval 'db.getUsers()'
# Review authentication logs
sudo grep "Authentication failed" /var/log/mongodb/mongod.log
Solutions:
// Create missing administrative user (requires temporary security disable)
// 1. Stop MongoDB
// 2. Comment out security section in /etc/mongod.conf
// 3. Start MongoDB
// 4. Create user
use admin
db.createUser({
user: "adminUser",
pwd: "securePassword",
roles: ["root"]
})
// 5. Re-enable security and restart
// Grant additional privileges to existing user
db.grantRolesToUser("username", [
{ role: "readWrite", db: "myDatabase" }
])
For complex issues, consult the MongoDB documentation or engage with the MongoDB Community Forums.
Additional Resources
Official MongoDB Documentation
- MongoDB Manual – Comprehensive official documentation
- MongoDB University – Free courses and certifications
- MongoDB Server Documentation
Linux System Administration
- Red Hat MongoDB Documentation – Enterprise deployment guides
- Ubuntu Server Guide – Ubuntu-specific configurations
- Linux Foundation Resources – Advanced Linux administration
Performance and Optimization
Security Resources
- MongoDB Security Checklist
- NIST Cybersecurity Framework – Security standards
- CIS Benchmarks – Security configuration baselines
Community and Support
- MongoDB Community Forums – Active community discussions
- Stack Overflow MongoDB Tag – Technical Q&A
- MongoDB GitHub Repository – Source code and issues
Related LinuxTips.pro Articles
- Post #56: MySQL/MariaDB Administration on Linux – Relational database comparison
- Post #57: PostgreSQL Setup and Optimization – Alternative database solution
- Post #58: Redis In-Memory Data Store Configuration – Caching layer complement
- Post #59: Database Backup and Recovery Strategies – Comprehensive backup planning
- Post #61: Docker Fundamentals – Containerizing MongoDB deployments
Previous Article: Database Backup and Recovery Strategies (Post #59)
Next Article: Docker Fundamentals: Containers vs Virtual Machines (Post #61)
Linux Mastery Series Navigation: View All 100 Articles | Database Administration Chapter
Last Updated: October 26, 2025 | Author: LinuxTips.pro Editorial Team