Hierarchical Queries and Oracle
Not everything is as fun as the first time you discover the power of monads and for-comprehensions, or realize the state machine you wrote with pattern matching is super easy to read and follow! Some days its business as usual, which as a software developer means you’re usually trying to bend a framework to your will (or cursing the estimate you were forced to give). Today it would happen to be Slick and native SQL queries, namely hierarchical queries.
AFAICT, Slick doesn’t provide a way to express hierarchical queries1 (or something that would generate a hierarchical SQL query) so I resorted to “static SQL”: Slick provides the ability to express native SQL via string interpolation, which delegates to a macro for generating code that builds a query. Now this in itself is pretty neat and demonstrates the power of macros.
Let’s say we have a self-referencing table in Oracle:
create table "EMPLOYEE" ("ID" RAW(32) NOT NULL PRIMARY KEY,"NAME" VARCHAR(100) NOT NULL,"MANAGER_ID" RAW(32)) alter table "EMPLOYEE" add constraint "MANAGER_FK" foreign key("MANAGER_ID") references "EMPLOYEE"("ID");
For any given employee, we'd like to determine who they report to, who their manager reports to, etc (i.e. the chain of command). Oracle SQL supports two syntaxes for hierarchical queries: CONNECT BY and recursive WITH clauses. (Although the latter is preferred as its ANSI SQL compliant, the examples here use CONNECT BY
.)
- For a single employee, this returns only the chain of managers (note the
LEVEL > 1
):
SELECT id, name, LEVEL FROM employee WHERE LEVEL > 1 START WITH id=<employee id> CONNECT BY PRIOR manager_id = id
- For multiple employees, this returns a flattened hierarchy which can then be easily reconstructed in Scala code (the LEVEL is returned so leaf nodes can be determined):
SELECT id, name, manager_id, LEVEL FROM employee START WITH id IN (<set of employee ids>) CONNECT BY PRIOR manager_id = id ORDER BY LEVEL ASC
Today we'll focus on the first query.
Slick Static SQL
A plain SQL statement can be written in Slick and turned into a DBIOAction
via the sql
interpolator (see this for more info on Scala interpolation). Values can be injected by prefixing them with $
(more on that later).
val query = sql""" SELECT id, name, LEVEL FROM employee WHERE LEVEL > 1 START WITH id=$employeeId CONNECT BY PRIOR manager_id = id""".as[Employee]
The sql
interpolator delegates parsing to a macro and passes the result to a slick.jdbc. SQLActionBuilder
to build the query. The query results need to be converted to a collection of domain objects. The SQLActionBuilder
has an as[R](implicit rconv: GetResult[R])
method which takes a type parameter representing the target type to convert to.
Binding custom types in Static SQL
The as
method requires an instance of GetResult[R]
be available implicitly. GetResult[R]
is essentially a wrapper around a java.sql.ResultSet
that allows one to get results and convert them to a custom type. In fact GetResult[R]
is actually defined as trait GetResult[+T] extends (PositionedResult => T)
. Remember functions are types in Scala, so you can extend even a Function
type and implement an apply
method.
A slick.jdbc.PositionedResult
allows one to iterate through a ResultSet
via nextXXX
methods like nextInt
, nextBytes
, etc. Below is our implementation:
import java.util.UUID import slick.jdbc.GetResult import driver.api._ // For reference (this would be defined in a different file) case class Employee(id: UUID, name: String) // driver.columnTypes.uuidJdbcType is an UUIDJdbcType instance // UUIDJdbcType.fromBytes takes a byte array and returns an UUID implicit val getEmployeeResult = GetResult[Employee](r => Employee(driver.columnTypes.uuidJdbcType.fromBytes(r.nextBytes()), r.nextString()))
Out of the box, Slick provides implicit implementations of GetResult[R]
for BigDecimal, Boolean, Byte, Date, Float, Int, Long, Short, String, Time, Timestamp
. It does not provide one for UUID
.
Binding parameters in Static SQL
After implementing this, compilation still fails! The compiler spits out:
could not find implicit value for parameter e: slick.jdbc.SetParameter[java.util.UUID]
Well gee if you know exactly what’s missing Mr. Scalac, can’t you just friggin' generate it for me? It turns out the line START WITH id=$employeeId
is what causes the error. employeeId
is an UUID type. Slick doesn't automatically know how to deal with UUID
types when setting parameters on a static query. SetParameter[R]
is a wrapper around java.sql.PreparedStatement
and provides helper methods to set parameters on it. Below is one for UUID
:
implicit object SetUUID extends SetParameter[UUID] { def apply(v: UUID, pp: PositionedParameters) { pp.setBytes(driver.columnTypes.uuidJdbcType.toBytes(v)) } }
Now the code compiles and the query runs successfully. I'm a little disappointed because for some reason I expected UUID
s to just work since they're handled properly in schemas (column[UUID]
) and Slick's query DSL. You have to jump through a few more hoops to handle custom types in static SQL than you do in Slick's query DSL via MappedColumnType
.
Aside: Frameworks and expressivity
Good languages are designed to be expressive. You're free to disagree with me, but SQL, given its intended purpose, is an expressive language. I inherently have an issue with ORMs and libraries like Slick that try to optimize for expressivity at a higher layer, while losing on the expressivity provided by the substrate. Maybe I'm old-fashioned, but this is not always a desirable tradeoff. I've never written an application that didn't require dropping down to native SQL (for various reasons)… Maybe I should check out Anorm, which doesn't try to be an ORM. I was the kid that never liked pre-determined Lego kits. I liked rummaging through the box of random blocks I accumulated over the years and letting my creativity run wild. Maybe that explains a lot about me…
-
If there is a known maximum depth to any hierarchy, one could write a query as a series of left self joins. This is expressible via Slick's functional query DSL. For larger depths, this probably isn't preferrable (but I haven't compared query plans to know for sure). ↩