Amazon OpenSearch Service - Cheat Sheet

Amazon OpenSearch Service Cheat Sheet for AWS Certified Data Engineer - Associate (DEA-C01)

Core Concepts and Building Blocks

Amazon OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale OpenSearch clusters in the AWS Cloud. OpenSearch is a distributed, open-source search and analytics engine used for workloads such as log analytics, real-time application monitoring, and clickstream analysis.

Key Components:

  1. Domain - A logical container for an OpenSearch cluster
  2. Cluster - Collection of one or more data nodes, optional dedicated master nodes, and optional UltraWarm nodes
  3. Node - An instance in the cluster (data, master, or UltraWarm)
  4. Index - Collection of documents with similar characteristics
  5. Shard - Horizontal partition of an index (primary or replica)
  6. Document - Basic unit of information that can be indexed
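
To make these building blocks concrete, here is a minimal sketch using the opensearch-py client (the endpoint, credentials, and index name are placeholders, not values from this article) that creates an index with explicit primary and replica shard counts and stores a single document in it:

```python
# pip install opensearch-py
from opensearchpy import OpenSearch

# Hypothetical domain endpoint and credentials -- substitute your own.
client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("admin", "my-master-password"),
    use_ssl=True,
)

# An index is a collection of documents; shards partition it across data nodes.
client.indices.create(
    index="app-logs",
    body={"settings": {"number_of_shards": 3, "number_of_replicas": 1}},
)

# A document is the basic unit of information that gets indexed.
client.index(
    index="app-logs",
    id="1",
    body={"timestamp": "2024-01-01T00:00:00Z", "level": "ERROR", "message": "timeout"},
)
```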

OpenSearch Service Features and Details

| Feature | Description |
| --- | --- |
| Deployment Options | Single-AZ or Multi-AZ with standby |
| Instance Types | T2, T3, M5, C5, R5, I3, etc. |
| Storage Types | EBS (gp2, gp3, io1) or Instance Store |
| Data Tiers | Hot (active data), UltraWarm (less frequently accessed), Cold (archive) |
| Security | Fine-grained access control, encryption at rest and in transit, VPC support |
| Backup | Automated snapshots to S3 |
| Scaling | Horizontal (add nodes) and vertical (change instance types) |
| Monitoring | CloudWatch metrics, logs, and dashboards |
| Integrations | Kinesis Data Firehose, CloudWatch Logs, Lambda, etc. |
| API Compatibility | Compatible with OpenSearch and legacy Elasticsearch APIs |
| Visualization | OpenSearch Dashboards (formerly Kibana) |

Important Pointers for Amazon OpenSearch Service

  1. OpenSearch Service is the successor to Amazon Elasticsearch Service, offering compatibility with both OpenSearch and Elasticsearch APIs.

  2. OpenSearch is an open-source, distributed search and analytics engine derived from Elasticsearch 7.10.2.

  3. A domain is the fundamental unit in OpenSearch Service, representing a cluster with its configuration, instance types, instance count, and storage options.

  4. OpenSearch Service supports both development and production instance types, with production types offering dedicated resources.

  5. For production workloads, it's recommended to use at least three dedicated master nodes for cluster stability.

  6. Multi-AZ deployment with standby is recommended for production environments to ensure high availability.

  7. OpenSearch Service supports three data tiers: Hot (active data), UltraWarm (less frequently accessed), and Cold (archive).

  8. UltraWarm nodes use Amazon S3 for storage, offering cost-effective storage for less frequently accessed data at about 1/10th the cost of hot storage.

  9. Cold storage is even more cost-effective than UltraWarm, designed for data that is accessed infrequently and can tolerate slightly higher access latency.

  10. The maximum domain storage limit is 3 PB for hot tier (EBS), 20 PB for UltraWarm tier, and unlimited for cold tier.

  11. OpenSearch Service supports encryption at rest using AWS KMS and encryption in transit using HTTPS.

  12. Fine-grained access control allows you to control who can access specific data within your OpenSearch cluster.

  13. OpenSearch Service integrates with Amazon Cognito for OpenSearch Dashboards authentication.

  14. VPC support allows you to isolate your OpenSearch domain within your VPC and use security groups for access control.

  15. OpenSearch Service supports both automated and manual snapshots for backup and recovery.

  16. Automated snapshots are stored in S3 at no additional charge and retained for 14 days by default.

  17. Manual snapshots can be stored indefinitely and used for cluster migration or backup.

  18. OpenSearch Service supports cross-region replication for disaster recovery and global deployments.

  19. The maximum number of nodes per cluster is 80 for most instance types.

  20. The maximum number of shards per node is 1,000, including both primary and replica shards.

  21. The recommended maximum shard size is 50 GB for optimal performance.

  22. The maximum JVM heap size is 32 GB, regardless of instance size, due to JVM limitations.

  23. OpenSearch Service supports both index-level and document-level security.

  24. OpenSearch Service integrates with AWS CloudTrail for auditing API calls.

  25. OpenSearch Service supports SQL queries in addition to the Query DSL.

  26. OpenSearch Service supports anomaly detection for identifying unusual patterns in your data.

  27. OpenSearch Service supports alerting to notify you when data meets certain conditions.

  28. OpenSearch Service supports asynchronous search for long-running queries.

  29. OpenSearch Service supports cross-cluster search to query multiple domains.

  30. OpenSearch Service supports index state management for automating index lifecycle tasks.

  31. OpenSearch Service supports k-NN (k-nearest neighbor) search for vector-based similarity searches.

  32. OpenSearch Service supports learning to rank (LTR) for improving search relevance.

  33. OpenSearch Service supports custom packages for adding plugins and dictionaries.

  34. OpenSearch Service supports auto-tune to optimize domain performance automatically.

  35. OpenSearch Service supports index rollover to manage index size and age.

  36. OpenSearch Service supports index transforms to create a new index with transformed data.

  37. OpenSearch Service supports index aliases for abstracting indices from clients.

  38. OpenSearch Service supports index templates for defining settings and mappings for new indices.

  39. OpenSearch Service supports snapshot lifecycle management for automating snapshot creation and deletion.

  40. OpenSearch Service supports data streams for time-series data.

  41. OpenSearch Service supports ingest pipelines for preprocessing documents before indexing.

  42. OpenSearch Service supports field mappings to define how fields are indexed and stored.

  43. OpenSearch Service supports dynamic mappings to automatically detect field types.

  44. OpenSearch Service supports explicit mappings to define field types explicitly.

  45. OpenSearch Service supports nested fields for indexing arrays of objects.

  46. OpenSearch Service supports parent-child relationships for related documents.

  47. OpenSearch Service supports aggregations for data analysis and visualization.

  48. OpenSearch Service supports scripting for custom logic in queries and aggregations.

  49. OpenSearch Service supports percolator queries for matching documents against stored queries.

  50. OpenSearch Service supports highlighting to emphasize matching search terms in results.

  51. OpenSearch Service supports suggesters for search-as-you-type functionality.

  52. OpenSearch Service supports completion suggesters for auto-complete functionality.

  53. OpenSearch Service supports phrase suggesters for correcting misspelled phrases.

  54. OpenSearch Service supports term suggesters for correcting misspelled terms.

  55. OpenSearch Service supports fuzzy queries for approximate matching.

  56. OpenSearch Service supports wildcard queries for pattern matching.

  57. OpenSearch Service supports regular expression queries for pattern matching.

  58. OpenSearch Service supports range queries for numeric and date ranges.

  59. OpenSearch Service supports geo-spatial queries for location-based searches.

  60. OpenSearch Service supports boolean queries for combining multiple queries.

  61. OpenSearch Service supports function score queries for custom scoring.

  62. OpenSearch Service supports script score queries for custom scoring with scripts.

  63. OpenSearch Service supports decay functions for boosting by distance, date, or numeric value.

  64. OpenSearch Service supports field collapsing for grouping results by field.

  65. OpenSearch Service supports search templates for parameterized queries.

  66. OpenSearch Service supports rank evaluation for evaluating search quality.

  67. OpenSearch Service supports profile API for analyzing query performance.

  68. OpenSearch Service supports explain API for explaining query scoring.

  69. OpenSearch Service supports validate API for validating queries.

  70. OpenSearch Service supports count API for counting documents without returning them.

  71. OpenSearch Service supports the bulk API for batch operations (see the sketch after this list).

  72. OpenSearch Service supports multi-get API for retrieving multiple documents.

  73. OpenSearch Service supports multi-search API for executing multiple searches.

  74. OpenSearch Service supports update API for updating documents.

  75. OpenSearch Service supports update-by-query API for updating documents matching a query.

  76. OpenSearch Service supports delete-by-query API for deleting documents matching a query.

  77. OpenSearch Service supports reindex API for copying documents from one index to another.

  78. OpenSearch Service supports scroll API for retrieving large result sets.

  79. OpenSearch Service supports search_after for deep pagination.

  80. OpenSearch Service supports point-in-time search for consistent results across multiple searches.

  81. OpenSearch Service supports field capabilities API for retrieving field information.

  82. OpenSearch Service supports cat APIs for compact and aligned text output.

  83. OpenSearch Service supports cluster APIs for managing cluster settings.

  84. OpenSearch Service supports index APIs for managing indices.

  85. OpenSearch Service supports snapshot APIs for managing snapshots.

  86. OpenSearch Service supports task management APIs for managing long-running tasks.

  87. OpenSearch Service supports node stats API for retrieving node statistics.

  88. OpenSearch Service supports cluster stats API for retrieving cluster statistics.

  89. OpenSearch Service supports index stats API for retrieving index statistics.

  90. OpenSearch Service supports shard stats API for retrieving shard statistics.

  91. OpenSearch Service supports hot threads API for identifying busy threads.

  92. OpenSearch Service supports thread pool API for monitoring thread pools.

  93. OpenSearch Service supports circuit breaker API for monitoring memory usage.

  94. OpenSearch Service supports cluster health API for monitoring cluster health.

  95. OpenSearch Service supports cluster allocation explain API for explaining shard allocation.

  96. OpenSearch Service supports cluster reroute API for manually allocating shards.

  97. OpenSearch Service supports cluster update settings API for updating cluster settings.

  98. OpenSearch Service supports index recovery API for monitoring index recovery.

  99. OpenSearch Service supports index segments API for retrieving segment information.

  100. OpenSearch Service supports index shard stores API for retrieving shard store information.

  101. OpenSearch Service supports index flush API for flushing indices to disk.

  102. OpenSearch Service supports index refresh API for refreshing indices.

  103. OpenSearch Service supports index force merge API for merging segments.

  104. OpenSearch Service supports index clear cache API for clearing caches.

  105. OpenSearch Service supports index analyze API for analyzing text.
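
Several of the APIs above come up constantly in practice, the bulk API in particular. The sketch below (using the opensearch-py client; the endpoint, index name, and documents are made up for illustration) batches many documents into one _bulk request, then runs a boolean query against them:

```python
from opensearchpy import OpenSearch, helpers

client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("admin", "my-master-password"),
    use_ssl=True,
)

# Bulk indexing (pointer 71): one HTTP request carries many index operations.
actions = (
    {"_index": "app-logs", "_id": str(i), "_source": {"level": "ERROR", "code": i}}
    for i in range(1000)
)
helpers.bulk(client, actions)

# Boolean query (pointer 60): combine a full-text clause with a range filter.
response = client.search(
    index="app-logs",
    body={
        "query": {
            "bool": {
                "must": [{"match": {"level": "ERROR"}}],
                "filter": [{"range": {"code": {"lt": 100}}}],
            }
        }
    },
)
print(response["hits"]["total"])
```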

Comparison of OpenSearch Service Instance Types

| Instance Type | vCPU | Memory (GiB) | Use Case |
| --- | --- | --- | --- |
| t3.small | 2 | 2 | Development and testing |
| t3.medium | 2 | 4 | Development and testing |
| m5.large | 2 | 8 | Production - balanced workloads |
| m5.xlarge | 4 | 16 | Production - balanced workloads |
| c5.large | 2 | 4 | Production - compute-intensive workloads |
| c5.xlarge | 4 | 8 | Production - compute-intensive workloads |
| r5.large | 2 | 16 | Production - memory-intensive workloads |
| r5.xlarge | 4 | 32 | Production - memory-intensive workloads |
| i3.large | 2 | 15.25 | Production - storage-intensive workloads |
| i3.xlarge | 4 | 30.5 | Production - storage-intensive workloads |

Comparison of OpenSearch Service Storage Types

| Storage Type | Performance | Use Case | Cost |
| --- | --- | --- | --- |
| EBS gp2 | Medium | General purpose | Medium |
| EBS gp3 | Medium-High | General purpose with customizable IOPS | Medium |
| EBS io1 | High | I/O-intensive workloads | High |
| Instance Store | Very High | High-performance workloads | Included with instance |
| UltraWarm | Medium | Less frequently accessed data | Low |
| Cold Storage | Low | Rarely accessed data | Very Low |

Data Ingestion Methods and Throughput Characteristics

| Ingestion Method | Throughput | Latency | Replayability | Rate Limiting |
| --- | --- | --- | --- | --- |
| Direct API | High | Low | Manual | Per-domain limits |
| Kinesis Data Firehose | Medium-High | Medium | Automatic with S3 backup | Configurable |
| Logstash | Medium | Medium | Configurable | Configurable |
| Fluentd | Medium | Medium | Configurable | Configurable |
| Lambda | Medium | Medium-High | Depends on source | Lambda concurrency limits |
| CloudWatch Logs | Medium | Medium-High | Limited | CloudWatch Logs limits |
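
As an example of buffered ingestion, the snippet below uses boto3 to put a record onto a Kinesis Data Firehose delivery stream that has been configured (outside this snippet) with an OpenSearch destination and S3 backup; the stream name and region are placeholders:

```python
import json

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

# Hypothetical delivery stream already pointed at an OpenSearch domain;
# Firehose buffers and retries, and can back up raw records to S3 for replay.
record = {"timestamp": "2024-01-01T00:00:00Z", "level": "ERROR", "message": "timeout"}
firehose.put_record(
    DeliveryStreamName="app-logs-to-opensearch",
    Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
)
```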

OpenSearch vs Elasticsearch Comparison

| Feature | OpenSearch | Elasticsearch |
| --- | --- | --- |
| License | Apache 2.0 | Elastic License (not fully open-source) |
| Development | Community-driven | Elastic N.V. |
| AWS Support | Full support | Limited to older versions |
| Security Features | Included | Requires paid subscription in newer versions |
| Visualization | OpenSearch Dashboards | Kibana |
| Machine Learning | Basic capabilities | Advanced capabilities in paid tiers |
| Alerting | Included | Requires paid subscription |
| SQL Support | Included | Requires paid subscription |
| Anomaly Detection | Included | Requires paid subscription |

Important CloudWatch Metrics for Monitoring

| Metric | Description | Threshold Recommendation |
| --- | --- | --- |
| ClusterStatus.red | Indicates one or more primary shards are missing | Should be 0 |
| ClusterStatus.yellow | Indicates one or more replica shards are missing | Should be 0 in steady state |
| CPUUtilization | CPU usage percentage | <80% |
| JVMMemoryPressure | JVM heap usage percentage | <80% |
| FreeStorageSpace | Available storage space | >25% of total storage |
| SearchLatency | Time to complete search requests | Depends on application requirements |
| IndexingLatency | Time to complete indexing requests | Depends on application requirements |
| KibanaHealthyNodes | Number of healthy OpenSearch Dashboards nodes | Equal to number of nodes |
| MasterCPUUtilization | CPU usage of master nodes | <50% |
| MasterJVMMemoryPressure | JVM heap usage of master nodes | <80% |
| AutomatedSnapshotFailure | Indicates failed automated snapshots | Should be 0 |
| ThreadpoolSearchQueue | Number of queued search threads | Should be near 0 in steady state |
| ThreadpoolWriteQueue | Number of queued write threads | Should be near 0 in steady state |
| Shards.active | Number of active shards | Should be stable |
| Nodes | Number of nodes in the cluster | Should match expected count |
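
A few of these metrics are worth alarming on from day one. Below is a sketch using boto3 that creates an alarm on ClusterStatus.red (the domain name, account ID, and SNS topic ARN are placeholders; OpenSearch Service publishes metrics under the legacy AWS/ES namespace with DomainName and ClientId dimensions):

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Alarm when any primary shard is missing (ClusterStatus.red >= 1).
cloudwatch.put_metric_alarm(
    AlarmName="opensearch-cluster-status-red",
    Namespace="AWS/ES",  # OpenSearch Service keeps the legacy namespace
    MetricName="ClusterStatus.red",
    Dimensions=[
        {"Name": "DomainName", "Value": "my-domain"},   # placeholder
        {"Name": "ClientId", "Value": "123456789012"},  # AWS account ID
    ],
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # placeholder
)
```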

Mind Map: Amazon OpenSearch Service Components

Amazon OpenSearch Service
├── Domain Management
│   ├── Creation and Configuration
│   ├── Scaling (Vertical and Horizontal)
│   ├── Version Management
│   └── Deletion and Backup
├── Node Types
│   ├── Data Nodes
│   ├── Dedicated Master Nodes
│   ├── UltraWarm Nodes
│   └── Cold Storage
├── Storage Options
│   ├── EBS Volumes (gp2, gp3, io1)
│   ├── Instance Store
│   ├── UltraWarm (S3)
│   └── Cold Storage (S3)
├── Security
│   ├── Fine-grained Access Control
│   ├── Encryption at Rest
│   ├── Encryption in Transit
│   ├── VPC Access
│   ├── IAM Authentication
│   └── Cognito Integration
├── Data Management
│   ├── Indexing
│   ├── Sharding
│   ├── Replication
│   ├── Snapshots
│   └── Index Lifecycle Management
├── Search and Analytics
│   ├── Full-text Search
│   ├── Aggregations
│   ├── SQL Support
│   ├── PPL Support
│   └── Visualization with OpenSearch Dashboards
├── Advanced Features
│   ├── Anomaly Detection
│   ├── Alerting
│   ├── k-NN Search
│   ├── Cross-cluster Search
│   └── Asynchronous Search
└── Monitoring and Management
    ├── CloudWatch Integration
    ├── Auto-Tune
    ├── Index State Management
    ├── Performance Analyzer
    └── CloudTrail Integration

Example Calculations

Shard Calculation

For a 200 GB index with expected growth to 300 GB:

Target shard size = 50 GB
Number of primary shards = 300 GB / 50 GB = 6 primary shards
With 1 replica: Total shards = 6 primary * (1 + 1) = 12 shards
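
The same arithmetic generalizes to any projected index size. A small helper (the function name and defaults are mine, not from any SDK) makes the rounding explicit:

```python
import math

def shard_plan(projected_size_gb: float, target_shard_gb: float = 50, replicas: int = 1):
    """Primary shard count sized for projected data, plus replica overhead."""
    primaries = math.ceil(projected_size_gb / target_shard_gb)
    total = primaries * (1 + replicas)
    return primaries, total

print(shard_plan(300))  # -> (6, 12), matching the calculation above
```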

Node Calculation

For a cluster with 500 GB of data, using r5.large.search instances (16 GB RAM):

JVM heap size = 8 GB (50% of instance memory)
Storage per node = ~2 TB (EBS)
Nodes needed for storage = 500 GB / 2 TB = 0.25, rounded up to 1 node (minimum)
For high availability: at least 2 data nodes + 3 dedicated master nodes

Throughput Calculation

For a cluster with 5 data nodes, each capable of handling 10,000 requests/second:

Theoretical max throughput = 5 nodes * 10,000 requests/second = 50,000 requests/second
With 30% headroom: Recommended max throughput = 50,000 * 0.7 = 35,000 requests/second

Implementing Throttling and Overcoming Rate Limits

  1. OpenSearch Service has service quotas that limit the number of domains, instances, and storage per account and region.

  2. To overcome API rate limits, implement exponential backoff and jitter in your client applications (a sketch follows this list).

  3. Use bulk API operations instead of single document operations to reduce the number of API calls.

  4. For high-throughput ingestion, consider using a buffer like Kinesis Data Firehose or SQS to smooth out traffic spikes.

  5. Monitor the 4xx and 5xx error rates in CloudWatch to detect rate limiting issues.

  6. Implement client-side throttling to prevent overwhelming the OpenSearch cluster.

  7. Use connection pooling in your client applications to reduce connection overhead.

  8. Consider using the _bulk API with optimal batch sizes (typically 5-15 MB per batch) for efficient indexing.

  9. Distribute indexing workloads evenly across all data nodes by using a consistent routing strategy.

  10. Use the refresh_interval setting to control how frequently OpenSearch makes new documents available for search.
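
A minimal sketch of point 2 above: capped exponential backoff with full jitter around any callable. The exception type is a placeholder for whatever your client raises on HTTP 429/503 responses:

```python
import random
import time

class ThrottledError(Exception):
    """Placeholder: substitute the throttling exception your client raises."""

def with_backoff(call, max_retries=5, base_delay=0.2, max_delay=10.0):
    # Full jitter: sleep a random duration in [0, capped exponential delay],
    # which spreads retries from many clients instead of synchronizing them.
    for attempt in range(max_retries):
        try:
            return call()
        except ThrottledError:
            if attempt == max_retries - 1:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))

# Usage (hypothetical): with_backoff(lambda: client.bulk(body=payload))
```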

Data Ingestion Pipeline Replayability

  1. When using Kinesis Data Firehose for ingestion, enable S3 backup to store raw data for replay capability.

  2. For direct API ingestion, maintain source data in S3 or another durable store for potential replay.

  3. Use DynamoDB or another database to track ingestion state and progress for resumability.

  4. Implement idempotent processing so data can be replayed safely without creating duplicates, for example by deriving document IDs from record content (see the sketch after this list).

  5. Consider using SQS dead-letter queues to capture and retry failed ingestion attempts.

  6. For Logstash pipelines, use persistent queues to prevent data loss during service disruptions.

  7. Implement checkpointing in your ingestion pipelines to track progress and enable resumption.

  8. Use Lambda destinations to capture failed events for later reprocessing.

  9. Consider using Apache Kafka as a buffer before OpenSearch for enhanced replay capabilities.

  10. Implement a circuit breaker pattern in your ingestion pipeline to handle temporary OpenSearch unavailability.
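
To illustrate point 4 above, deriving the document ID deterministically from the record's content means that replaying the same record overwrites the existing document instead of creating a duplicate. A sketch, assuming records are JSON-serializable dicts:

```python
import hashlib
import json

def doc_id(record: dict) -> str:
    # Canonicalize the record (sorted keys) so identical content always
    # hashes to the same ID, making re-ingestion idempotent.
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

event = {"order_id": "o-123", "status": "shipped"}
# Replaying this event writes to the same _id, e.g. (client is hypothetical):
# client.index(index="orders", id=doc_id(event), body=event)
print(doc_id(event))
```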
