In this post, we will compare ElasticSearch and MongoDB based on their performance, accuracy, and resource consumption when used for searching movie data. Instead of lengthy discussions, this article presents direct comparisons through tests, results, and insights to help you decide when to use each technology.
Disclaimer: The results presented are based on my personal experience. They might not be perfect, so feel free to share corrections or additional comparisons in the comments.
Table of Contents ⬇️
- Key Takeaways
- Dataset
- Installation & Setup
- Application Features
- Performance Comparison
- Search Performance Comparison
- With Fuzzy Search
- Conclusion
Key Takeaways
- RAM Consumption: Which database is more resource-efficient?
- Speed & Accuracy: Which one provides faster and more accurate search results?
- Use Cases: When to use ElasticSearch vs. MongoDB?
Dataset
The dataset used for comparison is the Wikipedia Movie Plots dataset, which contains summaries of 34,886 movies across multiple genres, languages, and origins.
Metadata Attributes
- Title: Name of the movie.
- Year: Release year.
- Origin/Ethnicity: Cultural or national background.
- Director(s) & Actor(s): Creators and cast information.
- Genre: Categories like drama, comedy, action, etc.
- Plot: A detailed summary of the movie's storyline.
Installation & Setup
- Download the dataset from Kaggle.
- Run MongoDB and ElasticSearch using Docker.
- Index the data as described in my repo: mrzaizai2k/elasticsearch_big_data.
- Watch the theoretical comparison
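The indexing step can be sketched roughly as follows. This is a minimal sketch, not the code from the repo: the column names are assumed to match the Kaggle CSV header, the snake_case field names and the `movies` index name are hypothetical, and the sample row is made up to stand in for the real file.

```python
import csv
import io

# Assumed mapping from the Kaggle CSV header to document fields
# (the field names on the right are hypothetical, chosen for illustration).
COLUMN_TO_FIELD = {
    "Title": "title",
    "Release Year": "year",
    "Origin/Ethnicity": "origin",
    "Director": "director",
    "Cast": "cast",
    "Genre": "genre",
    "Plot": "plot",
}


def rows_to_documents(csv_file):
    """Yield one plain dict per movie from an open CSV file object."""
    for row in csv.DictReader(csv_file):
        doc = {field: (row.get(column) or "").strip()
               for column, field in COLUMN_TO_FIELD.items()}
        # Store the year as an integer when it parses cleanly.
        doc["year"] = int(doc["year"]) if doc["year"].isdigit() else None
        yield doc


def to_bulk_actions(documents, index_name="movies"):
    """Wrap documents in the action format used by Elasticsearch bulk helpers."""
    for doc in documents:
        yield {"_index": index_name, "_source": doc}


# Tiny in-memory sample standing in for the downloaded CSV (made-up row).
sample_csv = io.StringIO(
    "Release Year,Title,Origin/Ethnicity,Director,Cast,Genre,Wiki Page,Plot\n"
    "1932,Lawyer Man,American,,William Powell,,,Sample plot text\n"
)
documents = list(rows_to_documents(sample_csv))
actions = list(to_bulk_actions(documents))
print(documents[0]["title"], documents[0]["year"])
```

With the official Python clients, `documents` could then be passed to `collection.insert_many(...)` from `pymongo`, and `actions` to `elasticsearch.helpers.bulk(client, actions)`.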
Application Features
- Real-time movie search.
- Query performance display.
- Highlighted search matches.
- Detailed movie information (title, plot, etc.).
- Top 10 most relevant results per query.
- Runs on: http://localhost:8501/ (the web UI is served on this port).
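The query timing, top-10 cut-off, and match highlighting listed above can be sketched like this. It is only an illustration under assumptions: `fake_search` and its sample titles are stand-ins, the `**` marker is arbitrary, and the timing is client-side wall-clock time, which may differ from how the numbers in this post were actually collected.

```python
import re
import time


def timed_search(search_fn, query, limit=10):
    """Run search_fn(query) and return (top results, elapsed milliseconds)."""
    start = time.perf_counter()
    results = list(search_fn(query))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return results[:limit], elapsed_ms


def highlight(text, query, marker="**"):
    """Wrap each case-insensitive occurrence of a query word in markers."""
    words = [re.escape(w) for w in query.split() if w]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)


# Stand-in search function; a real one would query MongoDB or ElasticSearch.
def fake_search(query):
    titles = ["Lawyer Man", "The Thin Man", "Man of the World"]
    return [t for t in titles if query.lower() in t.lower()]


results, elapsed_ms = timed_search(fake_search, "man")
print([highlight(t, "man") for t in results], f"{elapsed_ms:.2f} ms")
```

Swapping `fake_search` for a function that calls either database keeps the measurement code identical for both, which makes the latencies easier to compare.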
Performance Comparison
Data Upload Performance
- MongoDB: Completed in ~5 minutes.
- ElasticSearch: Took ~30-40 minutes.
Resource Usage (Docker Containers)
- ElasticSearch: ~8.6 GB RAM.
- MongoDB: ~213 MB RAM.
Insight: ElasticSearch requires significantly more resources: roughly 40 times the RAM of MongoDB in this setup, and 6 to 8 times longer to load the same data. MongoDB is better suited to write-heavy applications, while ElasticSearch is geared toward search-intensive ones.
Search Performance Comparison
Without Fuzzy Search
Search by Movie Title: "Harry Porter"
- Performance: ElasticSearch (16ms) is faster than MongoDB (22.7ms).
- Accuracy:
  - ElasticSearch: Produced incorrect results.
  - MongoDB: Ranked "Harry Porter" at positions 3 and 4.
Search for Typo: "hary poter"
- ElasticSearch: No matching results.
- MongoDB: Returned results but not the intended movie.
Search by Cast Name: "William Powell"
- Performance: ElasticSearch was 2.5x faster than MongoDB.
- Accuracy: Both databases returned correct results.
Full-Text Search Query
Query: "But Virginia soon calls him; she wants to drop the charges. He responds with anger, which Virginia records."
- Performance:
  - ElasticSearch: 10ms.
  - MongoDB: 8x slower.
- Accuracy:
  - ElasticSearch: Correct movie at position 1.
  - MongoDB: Correct movie at position 2.
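Insight: Without fuzzy matching, ElasticSearch is faster and ranks exact full-text queries correctly, but it returned nothing for the misspelled query and mis-ranked the title query. MongoDB is slower but still returns results for misspelled input, though not necessarily the intended movie.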
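For reference, non-fuzzy queries of this kind could be built roughly as below. This is a sketch under assumptions, not the exact queries used in these tests: the field names (`title`, `cast`, `plot`) are hypothetical, the MongoDB side assumes a text index exists on those fields, and the ElasticSearch side assumes a plain `multi_match` query.

```python
def mongo_text_search_args(query, limit=10):
    """Build keyword arguments for a MongoDB $text search ranked by textScore."""
    score = {"$meta": "textScore"}
    return {
        "filter": {"$text": {"$search": query}},
        "projection": {"score": score},
        "sort": [("score", score)],
        "limit": limit,
    }


def es_match_body(query, fields=("title", "cast", "plot"), limit=10):
    """Build an Elasticsearch multi_match request body without fuzziness."""
    return {
        "size": limit,
        "query": {"multi_match": {"query": query, "fields": list(fields)}},
        # Ask for highlighted fragments on every searched field.
        "highlight": {"fields": {field: {} for field in fields}},
    }


mongo_args = mongo_text_search_args("hary poter")
es_body = es_match_body("hary poter")
print(mongo_args["filter"])
print(es_body["query"])
```

With `pymongo`, the first result could be used as `collection.find(**mongo_args)`; the second is a body for the `_search` endpoint of the index. A `multi_match` query without `fuzziness` only matches terms that exist in the index after analysis, which is consistent with the misspelled query returning no hits above.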
With Fuzzy Search
Search by Movie Title: "Harry Porter"
- Performance: ElasticSearch took 2x longer than MongoDB.
- Accuracy:
  - ElasticSearch: All correct results.
  - MongoDB: Correct but less comprehensive.
Search for Typo: "hary poter"
- Performance: ElasticSearch was 2x slower than MongoDB.
- Accuracy:
  - ElasticSearch: Results identical to correct query.
  - MongoDB: Significant accuracy degradation.
Search by Cast Name: "William Powell"
- Performance:
  - MongoDB: 23ms.
  - ElasticSearch: 33ms.
- Accuracy: Both returned correct results.
Search with Typo in Cast Name: "Wiliam Powell"
- Performance: ElasticSearch took 3x longer than MongoDB.
- Accuracy:
  - ElasticSearch: Correct result.
  - MongoDB: Failed to find the cast.
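To see why the misspelled queries above are within reach of fuzzy matching, it helps to count edits per term. The sketch below uses plain Levenshtein distance and the thresholds documented for Elasticsearch's `AUTO` fuzziness (no edits for terms of 1-2 characters, one edit for 3-5, two edits for longer terms). It is an illustration only: Elasticsearch by default also counts a transposition as a single edit, and the intended spellings paired with each typo are my assumption.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def auto_fuzziness(term):
    """Edits allowed for a term under the default AUTO thresholds."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


# (typed term, assumed intended term)
pairs = [("hary", "harry"), ("poter", "potter"), ("wiliam", "william")]
for typo, intended in pairs:
    distance = levenshtein(typo, intended)
    print(typo, intended, distance, distance <= auto_fuzziness(typo))
```

Each of these typos is one edit away from the assumed intended term, so all three fall within the `AUTO` allowance for their length.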
Full-Text Search
Search query: "But Virginia soon calls him; she wants to drop the charges. He responds with anger, which Virginia records."
Movie: Lawyer Man (1932)
- Performance:
  - ElasticSearch was twice as fast as MongoDB.
- Accuracy:
  - ElasticSearch correctly identified the movie as the top result.
  - MongoDB placed the correct movie in the second position, along with additional less relevant results.
Full-Text Search Query with Missing Words (Bold Words)
Query: "but Virginia soon calls him she wants to drop the charges. He responds with anger, which Virginia records."
- Performance: ElasticSearch was slower than MongoDB.
- Accuracy:
  - ElasticSearch: Lawyer Man ranked 1st.
  - MongoDB: Lawyer Man ranked 2nd, with additional irrelevant results.
Insight: With fuzzy search enabled, ElasticSearch is generally slower (up to 3x on the title and cast queries above) but more accurate, especially for misspelled queries. MongoDB is faster on those queries but loses accuracy when the query contains typos.
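On the ElasticSearch side, fuzzy matching of this kind is typically switched on with the `fuzziness` option of a `match` or `multi_match` query. The sketch below is an assumption about how such a query could look, not the exact query from these tests: the field names are hypothetical, and `prefix_length` and `max_expansions` are shown at their documented defaults.

```python
def es_fuzzy_body(query, fields=("title", "cast", "plot"), limit=10,
                  prefix_length=0, max_expansions=50):
    """Build an Elasticsearch multi_match request body with AUTO fuzziness."""
    return {
        "size": limit,
        "query": {
            "multi_match": {
                "query": query,
                "fields": list(fields),
                # Allow 0, 1, or 2 edits per term depending on its length.
                "fuzziness": "AUTO",
                # Leading characters that must match exactly.
                "prefix_length": prefix_length,
                # Upper bound on candidate terms each fuzzy term expands to.
                "max_expansions": max_expansions,
            }
        },
    }


fuzzy_body = es_fuzzy_body("hary poter")
print(fuzzy_body["query"]["multi_match"]["fuzziness"])
```

Each fuzzy term is expanded into a set of candidate index terms, which is extra work compared with an exact lookup and fits the slower timings observed here. Raising `prefix_length` or lowering `max_expansions` is the usual way to trade some recall for speed.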
Conclusion
When to Use MongoDB?
✅ Low resource consumption (suitable for limited RAM environments).
✅ Faster for exact lookups.
✅ Better for applications with frequent data writes.
❌ Struggles with search accuracy for typos.
When to Use ElasticSearch?
✅ Highly optimized for text search.
✅ Better accuracy for fuzzy search queries.
✅ Faster for complex search operations.
❌ Requires more memory and indexes data more slowly.
Final Thought: If your application relies heavily on searching large text datasets, ElasticSearch is the better choice. If you need a lightweight, general-purpose database with occasional search capabilities, MongoDB is the way to go.
🔥 What's your experience with MongoDB & ElasticSearch? Let me know in the comments!