Skip to content Skip to sidebar Skip to footer

Sqlalchemy: Find Difference Between Array Columns

I have a table called UnitOfWork which has 3 columns cases_identified, cases_completed and cases_double_check, all of which are Postgresql arrays of integers. Is it possible to wri

Solution 1:

For this answer I'll assume that cls.cases_identified, cls.cases_completed, and cls.cases_double_check are of postgresql.ARRAY(Integer) type in Python side.

Your @todo.expression should return just that: an SQL expression. Currently it is trying to return a python list. The exception is raised since a postgresql.ARRAY does not support the in operator, though it does have a method contains, which maps to the @> operator in Postgresql and tests "if elements are a superset of the elements of the argument array expression". This on the other hand is not what you want. You're lucky that you had the if x not in ... in there, as plain

[x for x in cls.cases_identified]

seems to result in an infinite loop instead of an exception.

Getting the difference between arrays in Postgresql has been covered here extensively, but here's how you would apply that using SQLAlchemy, first using an array constructor:

from sqlalchemy import func

...

classUnitOfWork(...):

    @todo.expressiondeftodo(cls):
        # Force correlation of `cls` from outer FROM objects. Use# the `select()` shorthand method of `FunctionElement`.
        identified = func.unnest(cls.cases_identified).select().correlate(cls)
        completed = func.unnest(cls.cases_completed).select().correlate(cls)
        double_check = func.unnest(cls.cases_double_check).select().correlate(cls)
        # Create the subquery statement
        stmt = identified.except_(completed.union(double_check))
        # Uses the `func` generic for ARRAY constructor from# subquery syntax. The subquery statement has to be# converted to a "scalar subquery" in the eyes of SQLA# (really just a subquery returning 1 col) in order# to remove it from the FROM clause hierarchy.return func.array(stmt.as_scalar())

This has a downside of not providing any FROM objects (as it correlates everything from the enclosing query), so you'd have to issue the original query like this:

test = Session.query(UnitOfWork.todo).select_from(UnitOfWork).first()

You could also use the Postgresql intarray module that provides special functions and operators for null-free arrays of integers:

classUnitOfWork(...):

    @todo.expressiondeftodo(cls):
        return (cls.cases_identified - cls.cases_completed - 
                cls.cases_double_check)

Note that you have to install the extension first in Postgresql:

CREATE EXTENSION intarray;

Post a Comment for "Sqlalchemy: Find Difference Between Array Columns"