Amazon QuickSight Cheat Sheet for AWS Certified Data Engineer - Associate (DEA-C01)
Core Concepts and Building Blocks
Amazon QuickSight is a cloud-powered business intelligence service that makes it easy to deliver insights to everyone in your organization. It provides capabilities for connecting to various data sources, preparing data, creating visualizations, building interactive dashboards, and sharing insights.
Key components include:
- Datasets: Collections of data prepared for analysis
- Analyses: Interactive workspaces where you create visualizations
- Dashboards: Collections of visualizations that can be shared
- Stories: Guided narratives that combine visualizations
- SPICE: Super-fast, Parallel, In-memory Calculation Engine
- Q: Natural language query capability
- Embedding: Capability to embed dashboards in applications
QuickSight Components Mind Map
Amazon QuickSight
├── Data Management
│ ├── Data Sources
│ │ ├── AWS Services (S3, RDS, Redshift, etc.)
│ │ ├── Third-party databases
│ │ ├── SaaS applications
│ │ └── File uploads
│ ├── Datasets
│ │ ├── Direct Query
│ │ └── SPICE
│ └── Data Preparation
│ ├── Calculated fields
│ ├── Filters
│ └── Joins
├── Visualization
│ ├── Visual Types
│ ├── Formatting
│ ├── Interactivity
│ └── Parameters
├── Analysis & Dashboards
│ ├── Analyses
│ ├── Dashboards
│ ├── Stories
│ └── Insights
├── Sharing & Collaboration
│ ├── User management
│ ├── Dashboard sharing
│ ├── Email reports
│ └── Embedding
└── Advanced Features
├── ML Insights
├── Q (Natural language)
├── Paginated reporting
└── Custom actions
Important Pointers for AWS QuickSight
QuickSight offers two editions: Standard and Enterprise, with Enterprise providing additional features like row-level security, AD integration, and encryption.
SPICE (Super-fast, Parallel, In-memory Calculation Engine) is QuickSight's in-memory engine that provides fast analytics.
Standard Edition provides 10GB of SPICE capacity per user, while Enterprise Edition provides 10GB of SPICE capacity per user with option to purchase additional capacity.
SPICE data is automatically encrypted at rest and in transit.
QuickSight can connect to various AWS data sources including Amazon RDS, Aurora, Redshift, Athena, S3, and OpenSearch.
It also supports external databases like MySQL, PostgreSQL, SQL Server, and third-party applications like Salesforce, Jira, and ServiceNow.
QuickSight supports direct querying of data sources or importing data into SPICE for faster performance.
Direct Query mode is useful for real-time data or when data exceeds SPICE capacity limits.
SPICE mode provides faster query performance and is recommended for frequently accessed dashboards.
QuickSight datasets can be refreshed manually or on a schedule (hourly, daily, weekly, or monthly).
The maximum size for a single SPICE dataset is 250GB.
File uploads are limited to 1GB per file.
CSV, TSV, CLF, ELF, JSON, and XLSX file formats are supported for direct uploads.
QuickSight supports row-level security to restrict data access based on user attributes.
Column-level security can be implemented using calculated fields and row-level security.
QuickSight Enterprise Edition supports private VPC connections for secure data access.
QuickSight supports IAM authentication and integration with Microsoft Active Directory.
QuickSight Q is an ML-powered natural language query feature that allows users to ask questions about their data.
ML Insights provides anomaly detection, forecasting, and narrative insights automatically.
QuickSight supports embedding dashboards in applications with two options: user-based embedding and anonymous embedding.
Anonymous embedding uses capacity pricing based on session duration rather than per-user pricing.
QuickSight supports custom themes for branding dashboards.
Dashboard email reports can be scheduled to be sent on a recurring basis.
QuickSight supports parameters that can be used to create interactive dashboards.
Custom actions allow users to create interactive elements like drill-downs and URL actions.
QuickSight supports up to 20 datasets in a single analysis.
A single analysis can have up to 20 sheets.
Each sheet can have up to 30 visuals.
QuickSight supports up to 5,000 users in an account.
QuickSight Enterprise Edition supports reader sessions, which are cost-effective for users who only need to view dashboards.
QuickSight supports cross-region SPICE datasets, allowing data replication across regions.
QuickSight supports data preparation with transformations like joins, filters, and calculated fields.
Calculated fields support various functions including string, date, numeric, and conditional operations.
QuickSight supports Level-Aware Calculations (LAC) for complex aggregations.
QuickSight supports window functions for advanced analytics like running totals and moving averages.
QuickSight supports custom SQL queries when connecting to database sources.
QuickSight supports incremental refreshes for certain data sources to optimize refresh performance.
QuickSight supports IP restrictions to control access from specific IP ranges.
QuickSight Enterprise Edition supports multi-factor authentication (MFA).
QuickSight supports AWS CloudTrail for auditing user activities.
QuickSight supports AWS PrivateLink for private connectivity.
QuickSight supports data source permissions to control who can access specific data sources.
QuickSight supports dataset sharing across users in the same account.
QuickSight supports folder organization for analyses and dashboards.
QuickSight supports versioning for dashboards, allowing you to revert to previous versions.
QuickSight supports dashboard publishing workflows with review stages.
QuickSight supports data source substitution for migrating between environments.
QuickSight supports custom number formatting for visualizations.
QuickSight supports conditional formatting based on data values.
QuickSight supports visual-level filters, sheet-level filters, and analysis-level filters.
QuickSight supports cascading controls where one filter affects the values available in another.
QuickSight supports geospatial visualizations with built-in maps.
QuickSight supports custom geocoding for mapping custom geographic hierarchies.
QuickSight supports time series forecasting with configurable parameters.
QuickSight supports anomaly detection with ML-powered insights.
QuickSight supports narrative insights that automatically generate text descriptions of data trends.
QuickSight supports what-if analysis with parameters and calculated fields.
QuickSight supports dynamic dashboard layouts that adapt to different screen sizes.
QuickSight supports mobile-optimized dashboards for viewing on mobile devices.
QuickSight supports dashboard embedding with SDK for JavaScript.
QuickSight supports custom visual embedding with SDK for embedding specific visuals.
QuickSight supports dashboard linking for navigation between dashboards.
QuickSight supports URL actions for integration with external systems.
QuickSight supports filter actions for interactive cross-filtering between visuals.
QuickSight supports API operations for programmatic management of resources.
QuickSight supports AWS CloudFormation for infrastructure as code deployment.
QuickSight supports AWS Lake Formation integration for data lake security.
QuickSight supports AWS Glue Data Catalog integration for metadata.
QuickSight supports AWS IAM Identity Center (formerly SSO) for single sign-on.
QuickSight supports paginated reports for pixel-perfect reporting (in preview).
QuickSight supports data source refresh failures notifications.
QuickSight supports dashboard usage metrics to track user engagement.
QuickSight supports dashboard sharing with specific users or groups.
QuickSight supports dashboard embedding with multi-tenancy for SaaS applications.
QuickSight supports custom content delivery networks (CDNs) for embedded dashboards.
QuickSight supports custom domains for embedded dashboards.
QuickSight supports row-level security with tags for dynamic data filtering.
QuickSight supports dataset parameters for dynamic data loading.
QuickSight supports direct connection to Amazon OpenSearch Service.
QuickSight supports direct connection to Amazon Timestream for time series data.
QuickSight supports direct connection to Amazon Kinesis Data Firehose for streaming data.
QuickSight supports direct connection to Amazon MSK (Managed Streaming for Apache Kafka).
QuickSight supports direct connection to Amazon DocumentDB.
QuickSight supports direct connection to Amazon Neptune.
QuickSight supports direct connection to Amazon Redshift Spectrum for data lake queries.
QuickSight supports direct connection to Amazon Athena for serverless queries.
QuickSight supports direct connection to Amazon S3 with manifest files for multiple files.
QuickSight supports direct connection to Snowflake.
QuickSight supports direct connection to Teradata.
QuickSight supports direct connection to Oracle.
QuickSight supports direct connection to Microsoft SQL Server.
QuickSight supports direct connection to MySQL and MariaDB.
QuickSight supports direct connection to PostgreSQL.
QuickSight supports direct connection to Presto and Apache Spark.
QuickSight supports direct connection to Exasol and IBM DB2.
QuickSight supports SAML 2.0 for identity federation.
QuickSight supports custom permissions for analyses and dashboards.
QuickSight supports data source credentials vaulting for secure connection management.
QuickSight supports data source connection testing before creating datasets.
QuickSight supports data source connection pooling for improved performance.
QuickSight supports data source query timeout settings to prevent long-running queries.
QuickSight supports data source SSL/TLS encryption for secure connections.
QuickSight supports data source connection via SSH tunneling for additional security.
QuickSight supports data source connection via IAM roles for AWS services.
QuickSight supports data source connection via cross-account IAM roles.
QuickSight supports data source connection via VPC endpoints for private connectivity.
QuickSight supports data source connection via customer-managed VPC for network isolation.
QuickSight supports data source connection via AWS Direct Connect for dedicated connectivity.
QuickSight supports data source connection via AWS Transit Gateway for centralized connectivity.
QuickSight supports data source connection via AWS Client VPN for secure remote access.
QuickSight supports throttling mechanisms to prevent overloading data sources.
QuickSight implements automatic retry logic for transient connection failures.
QuickSight implements connection pooling to optimize database connections.
QuickSight implements query result caching to improve performance.
QuickSight implements progressive loading of visuals for faster dashboard rendering.
QuickSight implements asynchronous loading of visuals to prevent blocking.
QuickSight implements data sampling for large datasets to improve performance.
QuickSight implements query optimization for direct query mode.
QuickSight implements query folding to push operations to the data source.
QuickSight implements query parallelization for improved performance.
QuickSight Editions Comparison
Feature | Standard Edition | Enterprise Edition |
---|---|---|
Pricing | $9/user/month (annual) | $18/user/month (annual) |
SPICE Capacity | 10GB per user | 10GB per user (additional available) |
User Management | IAM, email invitation | IAM, email invitation, AD integration |
Row-Level Security | No | Yes |
Private VPC Access | No | Yes |
Email Reports | Yes | Yes |
Embedding | No | Yes (user-based and anonymous) |
Reader Sessions | No | Yes |
ML Insights | Limited | Full |
Encryption at rest | Yes | Yes with CMK option |
MFA | No | Yes |
Column-level security | No | Yes |
Paginated reporting | No | Yes (preview) |
QuickSight Visual Types
Visual Type | Best Used For | Key Features |
---|---|---|
Bar Chart | Comparing values across categories | Horizontal/vertical orientation, stacked/grouped options |
Line Chart | Showing trends over time | Multiple lines, area options, forecasting |
Pie/Donut Chart | Showing composition of a whole | Percentage visualization, limited categories |
Table | Showing detailed data | Pagination, sorting, conditional formatting |
Pivot Table | Summarizing data with multiple dimensions | Row/column hierarchies, value calculations |
KPI | Highlighting a single metric | Comparison to target, trend indicator |
Gauge | Showing progress toward a goal | Min/max range, thresholds |
Heat Map | Showing patterns in dense data | Color intensity, size options |
Tree Map | Hierarchical data with size relationships | Nested rectangles, color coding |
Geospatial Map | Geographic data visualization | Points, heat maps, choropleth maps |
Scatter Plot | Relationship between two variables | Correlation analysis, clustering |
Box Plot | Distribution and outliers | Quartiles, median, outliers |
Histogram | Data distribution | Binning, frequency analysis |
Word Cloud | Text frequency analysis | Word sizing by frequency |
Sankey Diagram | Flow visualization | Connection width proportional to flow quantity |
Combo Chart | Multiple metrics with different scales | Bar and line combinations |
Example Calculations in QuickSight
Basic Calculated Field Examples:
- Simple Calculation:
revenue - cost
- Percentage Calculation:
(revenue - cost) / revenue
- Conditional Logic:
ifelse(revenue > 1000, 'High Value', 'Standard')
- Date Manipulation:
dateDiff(shipDate, orderDate, 'DD')
- String Manipulation:
left(customerName, 1) + '. ' + split(customerName, ' ', 2)
Level-Aware Calculation Examples:
- Year-to-Date Sales:
runningSum(sum(sales), [date])
- Percent of Total:
sum(sales) / windowSum(sum(sales))
- Rank by Sales:
rank(sum(sales), [category], 'desc')
- Moving Average:
movingAvg(sum(sales), 3, [date])
- Year-over-Year Growth:
(sum(sales) - previous(sum(sales), 1, [date])) / previous(sum(sales), 1, [date])
QuickSight Service Limits and Performance Considerations
Resource | Limit | Notes |
---|---|---|
SPICE capacity | 10GB per user | Additional capacity can be purchased |
Maximum SPICE dataset size | 250GB | For larger datasets, use direct query |
File upload size | 1GB | Per file |
Rows per dataset | No hard limit | Performance may degrade with very large datasets |
Columns per dataset | 2,000 | Best practice is to use fewer columns |
Users per account | 5,000 | Enterprise Edition |
Datasets per analysis | 20 | |
Sheets per analysis | 20 | |
Visuals per sheet | 30 | |
Scheduled refreshes | Hourly, daily, weekly, monthly | |
API request rate | 10 requests per second | Throttling applies |
Dashboard embedding | 100 dashboards per AWS account | Can be increased via support |
Dashboard viewers | No limit | Performance may vary with many concurrent users |
Custom VPC connections | 10 per AWS account | |
Data source connections | 1,000 per AWS account |
Data Ingestion and Throughput Characteristics
-
SPICE Ingestion Performance:
- Throughput varies based on data source and complexity
- Typically 10-50 MB/s for simple datasets
- Complex transformations may reduce throughput
- Parallel ingestion for improved performance
-
Direct Query Performance:
- Depends on underlying data source performance
- Network latency between QuickSight and data source
- Query complexity and data volume
- Connection pooling optimizes performance
-
Throttling and Rate Limits:
- API requests limited to 10 requests per second
- SPICE ingestion has service quotas based on account
- Implement exponential backoff for API calls
- Use batch operations where possible
-
Overcoming Rate Limits:
- Implement retry logic with exponential backoff
- Batch operations for efficiency
- Schedule operations during off-peak hours
- Request quota increases for persistent needs
-
Replayability of Data Ingestion:
- SPICE refreshes can be scheduled or triggered via API
- Failed refreshes can be retried manually or programmatically
- Incremental refreshes supported for certain data sources
- Logging and monitoring for tracking refresh status
CloudWatch Metrics for Monitoring QuickSight
Metric | Description | Recommended Alarm |
---|---|---|
SPICEIngestionFailure | Tracks failed SPICE ingestion jobs | > 0 for critical datasets |
SPICEIngestionLatency | Time taken to complete SPICE ingestion | > historical baseline + 20% |
DashboardViewCount | Number of dashboard views | < historical baseline - 30% (potential access issues) |
DashboardRenderingLatency | Time taken to render dashboards | > 5 seconds |
UserConcurrentAccess | Number of concurrent users | Near account limits |
APIRequestCount | Number of API requests | Near throttling limits |
DirectQueryExecutionTime | Time taken for direct queries | > 10 seconds |
SPICEQueryExecutionTime | Time taken for SPICE queries | > 2 seconds |
DatasetRefreshFailure | Failed dataset refresh attempts | > 0 for critical datasets |
EmbeddedDashboardAccess | Access count for embedded dashboards | Monitor for usage patterns |
Open Source Components and Integration
QuickSight doesn't directly expose open source components to users, but it leverages several open source technologies internally:
Presto/Trino: QuickSight's SPICE engine is built on technology similar to Presto for distributed SQL query processing.
D3.js: QuickSight's visualization layer likely uses D3.js or similar libraries for rendering charts.
Apache Parquet: SPICE likely uses columnar storage formats similar to Parquet for efficient data storage.
Comparison with open source alternatives:
Feature | QuickSight | Tableau | Power BI | Apache Superset |
---|---|---|---|---|
Deployment | Fully managed SaaS | Self-hosted or cloud | Self-hosted or cloud | Self-hosted |
Pricing | Pay-per-user or capacity | Upfront license | Subscription | Open source (free) |
AWS Integration | Native | Limited | Limited | Limited |
Scalability | Serverless, auto-scaling | Manual scaling | Limited scaling | Manual scaling |
Setup complexity | Low | Medium | Medium | High |
Customization | Limited | Extensive | Extensive | Extensive |
Community | AWS community | Large community | Large community | Open source community |
Enterprise features | Built-in | Extensive | Extensive | Limited |
Top comments (0)