Design Philosophy
The Wvlet query language is designed around the following principles:
Core
The syntax should follow a left-to-right, top-to-bottom typing order as much as possible, so that users can type a query without being pulled away from the flow of data processing.
The language also needs to incorporate software engineering best practices so that queries are reusable (modularity) and composable, and can be combined into more complex queries and data processing pipelines.
Each relational operator processes the input table data and returns new table data, but the returned data is not limited to a simple table format. It can be nested objects with additional metadata that enriches the query results.
Syntax Design Choices
Use Lowercase for Keywords
All keywords in Wvlet must be lowercase to reduce typing effort and keep the syntax consistent. SQL allows keywords in either case, such as select or SELECT, but this makes query formatting inconsistent between users, or even within a single query. It also adds unnecessary complexity to the parser implementation, which has to handle upper-case keywords as well.
Consistent String Quotations
Use '...' (single quotes) and "..." (double quotes) for the convenience of writing string literals, and use `...` (back quotes) for describing column or table names, which might contain special characters or spaces.
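The quoting rule can be summarized with a small Python sketch. This is only an illustration of the rule stated above, not Wvlet's lexer; the function name classify_quoted is hypothetical.

```python
# Hypothetical helper that illustrates the quoting rule; not part of Wvlet.
def classify_quoted(token: str) -> str:
    """Return 'string' for '...' or "..." and 'identifier' for `...`."""
    # A quoted token must open and close with the same quote character.
    if len(token) < 2 or token[0] != token[-1]:
        raise ValueError(f"not a quoted token: {token!r}")
    quote = token[0]
    if quote in ("'", '"'):
        return "string"      # string literal
    if quote == "`":
        return "identifier"  # column or table name
    raise ValueError(f"unknown quote character: {quote!r}")


print(classify_quoted("'hello'"))      # prints: string
print(classify_quoted('"hello"'))      # prints: string
print(classify_quoted("`user name`"))  # prints: identifier
```

Because both single and double quotes always mean a string literal, back quotes are the only way to write a name such as `user name` that contains a space.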
Break Down SELECT
The SELECT statement in SQL is quite a complex operator that can perform many operations at once: aggregating, adding, removing, or renaming columns, annotating columns with aliases, changing the column order, and so on. Wvlet breaks this functionality down into separate operators such as agg, add, exclude, transform, and shift. With these operators, users don't need to enumerate all columns as in a SELECT statement, which makes the query more readable and easier to maintain.
SELECT
  sum(c1),
  -- Rename c2 with an alias
  c2 as c2_new,
  -- skip c3 for exclusion
  -- Add a new computed column
  c4 + c5 as c101,
  -- Shift c6 and c7 to the end
  c8,
  ...
  ...,
  c100,
  c6,
  c7
FROM tbl
In Wvlet, you can write the same query as follows:
from tbl
-- Add a simple aggregation
add c1.sum
-- Rename c2 with an alias
transform c2 as c2_new
-- Exclude c3
exclude c3
-- Add a new computed column
add c4 + c5 as c101
-- Shift c6 and c7 to the end
shift to right c6, c7
As tables of log data can have hundreds of columns, it is not practical to enumerate all columns in the SELECT statement. By breaking down the SELECT statement into multiple operators, users can focus on the data processing logic rather than the column enumeration.
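To make the effect of the column-level operators concrete, the following Python sketch models a table schema as an ordered list of column names. It is a toy model, not Wvlet's implementation: the function names are hypothetical, the aggregation step is left out, and the assumed semantics (add appends on the right, shift to right moves the listed columns to the end in the order given) are assumptions of this sketch.

```python
# Hypothetical model of column-level operators; not Wvlet's implementation.
# A "schema" here is just an ordered list of column names.

def add(columns, name):
    """Append a new computed column at the right end of the schema."""
    return columns + [name]

def exclude(columns, *names):
    """Drop the named columns, keeping the order of the rest."""
    return [c for c in columns if c not in names]

def rename(columns, old, new):
    """Rename one column in place, keeping its position."""
    return [new if c == old else c for c in columns]

def shift_to_right(columns, *names):
    """Move the named columns to the end, in the order given."""
    rest = [c for c in columns if c not in names]
    return rest + list(names)

# Start from a wide table with columns c1 .. c100.
columns = [f"c{i}" for i in range(1, 101)]
columns = rename(columns, "c2", "c2_new")      # rename c2
columns = exclude(columns, "c3")               # drop c3
columns = add(columns, "c101")                 # computed column c4 + c5
columns = shift_to_right(columns, "c6", "c7")  # move c6, c7 to the end

print(columns[:3])   # prints: ['c1', 'c2_new', 'c4']
print(columns[-3:])  # prints: ['c101', 'c6', 'c7']
```

Each step names only the columns it changes, so the other columns never have to be listed, however wide the table is.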