DEV Community

Alberto Robertson
Alberto Robertson

Posted on

Running SQL on files with the esProc is very convenient, on par with duckDB

esProc SPL, to query CSV, JSON, Excel and other files, you can use it directly after installation, which is very convenient.

source

Query orders with sales volume greater than 1 from CSV:

$select * from D:/data/orders.csv where quantity>1
Enter fullscreen mode Exit fullscreen mode

The results of each step are saved in a cell, and the next step can be based on the above results to continue querying. For example, if you also want to calculate the sales of each product with a sales volume greater than 1:

$select product,sum(amount) from {A1} group by product
Enter fullscreen mode Exit fullscreen mode

The result of the previous step can be referenced using {cell name}.

Of course, a single SQL statement can also implement this task, but what if both of these results are what you want? What if we need to calculate the number of orders with a unit price of 100 or more in orders with sales volume greater than 1? Moreover, for some complex calculations, the excessively long SQL may appear blurry. With this ’ SQL step-by-step mechanism ’, it is easy to reuse and simplify calculations.

In addition to being used in IDEs, it also offer the JDBC driver and can be integrated in applications to serve as an embedded database.

A slight drawback is that the SQL provided by esProc is not comprehensive enough, it is only a subset of SQL92. There are no problems with regular ones, including subqueries and WITH. However, EXISTS is not yet supported, and there is no window function available.

For tasks that require more complex SQL to support, esProc provides a more powerful native language SPL.

For example, this SQL statement uses the KEEP function unique to ORACLE, and using SQL of other databases would be more cumbersome:

SELECT department_id, 
MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY commission_pct) worst, 
MAX(salary) KEEP (DENSE_RANK LAST ORDER BY commission_pct) best 
FROM employees 
GROUP BY department_id 
ORDER BY department_id
Enter fullscreen mode Exit fullscreen mode

SPL has good support for order-related calculations, and it is relatively simple to write:

SPL code

Symbols like ~ often appear in SPL code, which can be confusing at first glance. This is a unique syntax of SPL, and once understood, it becomes very convenient.

In esProc, SQL and SPL syntax can also be used in combination. Regardless of how the results are calculated earlier, they are saved in the cells and can continue to be calculated using SQL later. The more common mixed usage pattern is to handle irregular formats and data sources.

For example, colon separated text:

$select * from {file("d:/Orders.txt").import@t(;":")}
where Amount>=100
Enter fullscreen mode Exit fullscreen mode

Writing SPL code into {} can be used as a table.
Read the specified sheet in Excel:

$select * from {file("D:/Orders.xlsx").xlsimport@t(;"sheet3")}
where Amount>=100
Enter fullscreen mode Exit fullscreen mode

Read Restful data:

$select * from {json(httpfile("http://127.0.0.1:6868/api/getData").read())}
where Amount>=100
Enter fullscreen mode Exit fullscreen mode

If SQL can handle it, use SQL. If SQL can’t do it or you can’t work it out, use SPL. It’s a completely seamless mix, which is superior than DuckDB, which relies on Python for complex calculations.

esProc is open source, and its source code is available here: esProcSPL

Top comments (0)