How to create an Oracle materialized view that performs well on fast refresh ?
Recently, working for a client, we faced an interesting challenge using Oracle database. It's been quite painful to gather scattered piece of information to get to the goal, so I thought I could summarize and share it here.
Let's imagine you have 20 million addresses, and you want to enrich them with the country they belong to. You have a country look up table. And you want to keep track of how the country has been enriched, so you need a dedicated table to store addresses' country (or you would violate normalization rules).
The core problem have been this: how to allow displaying AND modifying data stored in quite big tables (milions of records) with acceptable performance for the end user.
Let's go into it and create our model first!
We create and populate the address table
Then we create and populate the country look up table:
Finally we create and populate the enrichment table:
Then, selecting data is a bit slow:
Takes ~9s with my set up.
Important remark: all addresses may not have been enriched, but you want to see them all, because the whole point of showing addresses in this context is to enrich the ones that have not been enriched yet. So you have to use outer joins in your select. Moreover, in real life, we have more columns to select and more tables to join so the performance is far from being acceptable.
So we can take advantage of materialized views:
Which takes ~25s with my personnal setup.
But for selecting:
50 records are fetched within 0.026s.
When the enrichment process is automated and executes overnight, it's perfectly acceptable. But in our case, we need to have a web application that allows user to manually enrich addresses, which is the reason why we want to display and modify data with good performance. Then 25s to refresh the materialized view is even worse than 5s to show the 50 first records.
Luckily for us, Oracle implemented a "fast refresh" mode for materialized views. In order to activate fast refresh, we have to create materialized view logs on the underlying tables. Let's create them:
Then we can enable fast refresh:
But then, when we try to refresh:
This error is thrown:
Since we created the logs, why can't they be used ?
The reason is "the rowids of all the detail tables must appear in the SELECT list of the materialized view query definition", so we have to add them:
Which leads to another error:
This one has been a bit tricky to investigate ! This is due to the fact that fast-refresh does not support the Ansi join syntax (as shown on Ask Tom for instance). So we have to re-write our materialized view creation statement:
Now, let's update one record and refresh the materialized view:
The refresh actions takes ~11 seconds on my laptop. Better, but still not satisfying.
Let's have a look at what happened. When looking at the trace we can see that the statement is this one:
And the explain plan says:
We'll address the full table access on the table country_adrdress later, but first we need to add indexes on enriched_addresses:
And then we update a value again and refresh the materialized view:
~5s. Better, but still too much.
The statement is the same, but the explain plan became:
This is the other weird behaviour which took me a while to understand. The Oracle knowledge database says "This problem is caused by unpublished Bug 4196039, "Inefficient plan causing poor incremental refresh for Materialized View", and for those not having access, the Ask Tom version, Alberto Dell’Era’s blog version or Dominic Brooks’ one.
So we gather materialized view stats while it’s empty and lock them:
And being connected as sys, we set the _mv_refresh_use_stats parameter to true:
Let's update data again:
Which finally takes 0.099s. Here we are ! 🚀
Becareful with the "_mv_refresh_use_stats”, apparently it can lead to problems in some cases, as mentionned on Dominique Brook’s blog.