I'm writing this short post while solving a challenge on the KC7 platform, where one question hit me: when should I use "distinct" and when should I use "project" in KQL (Kusto Query Language)?
So let's figure it out.
1. distinct:
Use distinct when you want to eliminate duplicate values from a specific column or set of columns. When you list several columns, it returns each unique combination of their values.
It helps you find unique values in your dataset.
Example: If you want to find all unique IP addresses from a log table, you would use distinct.
OutboundNetworkEvents
| distinct src_ip
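If you don't have the KC7 data at hand, here is a minimal sketch you can paste into any KQL query window. The SampleEvents table and its rows are made up for illustration and only stand in for OutboundNetworkEvents.

```kusto
// Hypothetical sample rows standing in for OutboundNetworkEvents
let SampleEvents = datatable(src_ip: string, url: string)
[
    "10.0.0.1", "http://example.com/a",
    "10.0.0.1", "http://example.com/b",
    "10.0.0.2", "http://example.org/a",
    "10.0.0.1", "http://example.com/a"
];
SampleEvents
// Four input rows, but only two different src_ip values come back
| distinct src_ip
```

The result has a single src_ip column with two rows. Row order is not guaranteed, so add a sort if you need one.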
2. project:
Use project to select specific columns from your dataset. It allows you to control which columns are included in the result.
It’s useful when you want to create a new table with only the columns you need, possibly with some transformations.
Example: If you want to display only the source IP and URL from the logs, you would use project.
OutboundNetworkEvents
| project src_ip, url
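The "transformations" part is easier to see with a small sketch. Here project renames one column and adds a calculated one. The SampleEvents table and the names source and url_length are my own placeholders, not part of the KC7 dataset.

```kusto
// Hypothetical sample rows standing in for OutboundNetworkEvents
let SampleEvents = datatable(src_ip: string, url: string)
[
    "10.0.0.1", "http://example.com/a",
    "10.0.0.2", "http://example.org/a"
];
SampleEvents
// Rename src_ip to source, keep url, and add a computed column
| project source = src_ip, url, url_length = strlen(url)
```

Note that project keeps every row. It only changes which columns you see, so duplicates are still there.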
Can we combine both?
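Yep, of course!
- You can use project to select specific columns first, and then use distinct to find unique values within those columns. Keep in mind that distinct already returns only the columns you list, so the project step is optional when it selects the same columns. It is most useful when it renames or calculates a column first.
- Example: If you want to find unique combinations of source IP address and URL, you could use both project and distinct.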
OutboundNetworkEvents
| project src_ip, url
| distinct src_ip, url
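Because distinct returns only the columns it names, combining the two pays off mostly when project reshapes a column before deduplication. Here is a rough sketch, assuming you care about the host rather than the full URL. The SampleEvents table and the host column name are made up for illustration.

```kusto
// Hypothetical sample rows standing in for OutboundNetworkEvents
let SampleEvents = datatable(src_ip: string, url: string)
[
    "10.0.0.1", "http://example.com/a",
    "10.0.0.1", "http://example.com/b",
    "10.0.0.2", "http://example.org/a"
];
SampleEvents
// Extract the host part of each URL as a plain string
| project src_ip, host = tostring(parse_url(url).Host)
// Deduplicate on the transformed columns
| distinct src_ip, host
```

With this sample data, distinct on src_ip and url would return all three rows, while the version above returns two, because both URLs for 10.0.0.1 share the same host.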
Conclusion:
- Use distinct to remove duplicates and get unique values.
- Use project to select and transform specific columns.