What is Predicate Filtering?

It is the ability to skip reading unnecessary data at storage level from a data set.

Easy description: It filters the WHERE of your SELECT at Storage Level, so it does NOT need to load the full data set into RAM to filter rows or columns.

It matters very much when you are dealing with TB-size databases and it makes a big difference for Reporting and Data Warehousing environments.

 

Without Predicate Filtering 

Databases using this method:

  • Oracle Enterprise Edition
  • PostgreSQL
  • MariaDB
  • etc

 

* While no indexes present. (FULL SCAN)

With Predicate Filtering

Databases using this method:

  • Tibero Database
  • Oracle Exadata
  • IBM Netezza (Manual config required)
  • ?

 

* While no indexes present. (STORAGE SCAN)

Predicate Filtering – in action –

We do not need a huge table to spot the difference.

In the following example we use a table of simple 93 Gigabytes. If bigger, the difference will be much higher.

Let’s suppose your developer makes a query on a column that there is no index.

In this case, your explain plan may look like this. Use real-world production infrastructure, here is the comparison:

Oracle Database Enterprise Edition 19c 19.3 APR2019 (24 vCPUs, 512 GB RAM)

---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |    20 |  3330K  (1)|          |
|   1 |  SORT AGGREGATE    |            |     1 |    20 |            |          |
|*  2 |   TABLE ACCESS FULL| H_LINEITEM |    14M|   267M|  3330K  (1)|          |
---------------------------------------------------------------------------------
Elapsed: 114 seconds

In Tibero, the Predicate filtering is included and basic feature for any high available configuration and the SQL Plan seems like this:

Tibero Enterprise Edition 6 FS07(1 vCPUs, 8 GB RAM)

Execution Plan
--------------------------------------------------------------------------------
   1  COLUMN PROJECTION (Cost:2695456, %%CPU:0, Rows:1)
   2    SORT AGGR (Cost:2695456, %%CPU:0, Rows:1)
   3      TSCAN FULL WITH STORAGE SCAN: H_LINEITEM (Cost:2691305, %%CPU:0, Rows: 42931592)
Total elapsed time 5 seconds