Friday, January 27, 2012

Oracle GoldenGate

I always laugh at Oracle’s sales pitch for Oracle GoldenGate. The conversation goes like this:
We need to implement an ETL process …
Oracle GoldenGate is just the thing for that.
We need a long-distance disaster recovery solution …
Oracle GoldenGate is great for it.
We have RAC but face downtime because patches cannot be applied in a rolling fashion …
Oracle GoldenGate is a must.
We need to pull data from other vendors’ databases for real-time BI …
Oracle GoldenGate, of course.

However, after some research on Oracle GoldenGate, I have started to like it more. First, it supports replication across a wide range of databases, from MySQL to DB2. MySQL is quite popular for software development nowadays, yet few replication products support it. Second, it uses transaction logs (the redo log for Oracle, the binlog for MySQL) for Change Data Capture (CDC), which gives a performance advantage and requires no changes to the databases involved, compared with other CDC methods such as trigger-based capture. Third, it has a rich set of built-in functions for data transformation and mapping on the fly. In conclusion, it may not be the solution for every database problem, but it can meet many IT challenges if used properly.
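
For instance, here is a minimal sketch of a Replicat parameter file that maps a table and derives columns on the fly with built-in functions; the process name, credentials, schema, table, and column names are assumptions for illustration only:

REPLICAT rord
USERID ogguser, PASSWORD oggpwd
ASSUMETARGETDEFS
-- map the source table into a reporting schema and compute columns during apply
MAP sales.orders, TARGET rpt.orders,
COLMAP (USEDEFAULTS,
order_total = @COMPUTE (qty * unit_price),
load_ts = @DATENOW ());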

MySQL replication

MySQL replication is easy to set up but very basic compared with Oracle replication. My biggest complaint is that it lets users set up master-to-master replication. Because MySQL replication is asynchronous, the ACID properties of a modern relational database go down the drain with MySQL master-to-master replication. The correct way to do MySQL master-to-master is a MySQL Cluster setup.

This example shows the problem with MySQL master-to-master replication. If master A executes "update a_table set col1 = 1 where id = 1;" and master B executes "update a_table set col1 = 2 where id = 1;" at about the same time, there are three possible outcomes:
1. Master A ends with a_table col1 = 2 and master B ends with col1 = 1 for id = 1; the two masters are out of sync.
2. Both masters end with a_table col1 = 1 for id = 1.
3. Both masters end with a_table col1 = 2 for id = 1.
There is no way to predict which of the three outcomes you will get with MySQL multi-master replication.
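
To spell out how outcome 1 happens, here is a sketch of the timeline (a_table and its columns are assumed for illustration); each master applies its own statement immediately and replays the other's later:

-- t0 on master A (applied locally right away):
UPDATE a_table SET col1 = 1 WHERE id = 1;
-- t0 on master B (applied locally right away):
UPDATE a_table SET col1 = 2 WHERE id = 1;
-- t1: master A replays B's statement from its relay log and ends with col1 = 2
-- t1: master B replays A's statement from its relay log and ends with col1 = 1
-- The masters have silently diverged, yet replication keeps running without any error.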

Change the database name

We all know how to change the database name by recreating the control file. However, I consider this a risky operation, especially on a large database that would take many hours to recover. Starting with Oracle 10g, we can use a utility called nid (DBNEWID) to do the same thing without a resetlogs operation, and the DBID is not changed.

servername$ nid target=sys/ dbname=newsidname setname=yes
DBNEWID: Release 11.2.0.1.0 - Production on Wed Jan 25 10:43:08 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to database sidname (DBID=3903245474)
Connected to server version 11.2.0
Control Files in database:
/ora01/app/oracle/oradata/oldsidname/control01.ctl
/ora02/app/oracle/oradata/oldsidname/control02.ctl
Change database name of database oldsidname to newsidname? (Y/[N]) => Y
Proceeding with operation
Changing database name from oldsidname to newsidname
Control File /ora01/app/oracle/oradata/oldsidname/control01.ctl - modified
Control File /ora02/app/oracle/oradata/oldsidname/control02.ctl - modified
Datafile /ora01/app/oracle/oradata/oldsidname/system01.db - wrote new name
Datafile /ora01/app/oracle/oradata/oldsidname/sysaux01.db - wrote new name
Datafile /ora01/app/oracle/oradata/oldsidname/undotbs01.db - wrote new name
Datafile /ora02/app/oracle/oradata/oldsidname/users01.db - wrote new name
Datafile /ora01/app/oracle/oradata/oldsidname/temp01.db - wrote new name
Control File /ora01/app/oracle/oradata/oldsidname/control01.ctl - wrote new name
Control File /ora02/app/oracle/oradata/oldsidname/control02.ctl - wrote new name
Instance shut down
Database name changed to newsidname.
Modify parameter file and generate a new password file before restarting.
Succesfully changed database name.
DBNEWID - Completed succesfully.
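
The remaining steps are straightforward. Here is a minimal sketch, assuming an spfile is in use; the password and file locations are placeholders:

servername$ sqlplus / as sysdba
SQL> STARTUP NOMOUNT;
SQL> ALTER SYSTEM SET db_name = newsidname SCOPE = SPFILE;
SQL> SHUTDOWN IMMEDIATE;
SQL> EXIT
servername$ orapwd file=$ORACLE_HOME/dbs/orapwnewsidname password=<new_sys_password>
servername$ sqlplus / as sysdba
SQL> STARTUP;

Because the DBID is unchanged, the database opens normally with no OPEN RESETLOGS, and existing RMAN backups remain usable.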

Monday, September 5, 2011

Oracle Compression

This post is about Oracle table compression. As we know, Oracle offers basic table compression and advanced table compression. Let us compare them.

Licensing: basic compression is included with the Oracle EE license. Advanced Compression is a separately licensed option at $11,500 per CPU. Assuming a 2x compression ratio and valuing storage at roughly $1 per GB, you need about a 22 TB database (measured before compression) to save enough storage, roughly 11 TB or about $11,000, to justify the license cost of Advanced Compression on a one-CPU server.

Compression ratio: both basic and advanced compression work at the block level, so repeated values are compressed whether they recur within a row or down a column. The general rules for a good compression ratio are few distinct values per block, larger values (big numbers or long strings), and a large block size. Sorted rows compress better because a given block then holds fewer distinct values. Because some of the data in an advanced-compression block is left uncompressed, the advanced compression ratio is a little lower than basic compression's.

Concurrent access: basic compression is only effective with bulk (direct-path) inserts. Additionally, a direct-path load locks the table, and the loading transaction must commit before the table can be read or modified again; otherwise Oracle raises "ORA-12838: cannot read/modify an object after modifying it in parallel". Advanced compression has no such restriction.
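
As a quick illustration (the table names are made up), basic compression only kicks in for a direct-path load, and the loading transaction must commit before the table can be touched again:

CREATE TABLE sales_hist COMPRESS BASIC AS SELECT * FROM sales WHERE 1 = 0;
INSERT /*+ APPEND */ INTO sales_hist SELECT * FROM sales;  -- direct-path insert, rows are compressed
SELECT COUNT(*) FROM sales_hist;  -- in the same transaction this raises ORA-12838
COMMIT;  -- commit first, then the table can be queried and modified again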

Performance: compression uses some CPU but needs less disk I/O. Based on my testing, insert and delete performance is relatively good. Updating a compressed table was about 10 times slower than updating the uncompressed table, and the compressed table grew to about twice the size of the uncompressed one. Update performance with basic compression is worse than with advanced compression when there is a large number of chained rows.

Calculating the compression ratio: to measure the compression ratio, count the blocks that hold rows with the following SQL: "select count(distinct dbms_rowid.rowid_block_number(rowid)) from a_table;", where a_table is either the compressed or the uncompressed table; the ratio of the two counts is the compression ratio.
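
For example, a rough way to measure it (big_table is an assumed name) is to build an uncompressed copy and a basic-compressed copy of the same data and compare their block counts:

CREATE TABLE big_table_nocomp AS SELECT * FROM big_table;
CREATE TABLE big_table_comp COMPRESS BASIC AS SELECT * FROM big_table;
SELECT COUNT(DISTINCT dbms_rowid.rowid_block_number(rowid)) AS used_blocks FROM big_table_nocomp;
SELECT COUNT(DISTINCT dbms_rowid.rowid_block_number(rowid)) AS used_blocks FROM big_table_comp;
-- compression ratio = used_blocks of the uncompressed copy / used_blocks of the compressed copy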

Friday, July 29, 2011

RMAN backup test without downtime

As a responsible DBA, you must periodically test database restore and recovery to ensure the disaster recovery plan works. Unfortunately, most restore and recovery procedures require scheduled downtime, either at the database level or at the tablespace level. For systems with high availability requirements, such scheduled downtime is hard to come by. To test the reliability of restore and recovery on such systems, it is possible to carry out a procedure very close to a full-scale restore/recovery without taking the database offline.

Option 1: RMAN validate:

RMAN can validate a restore at the database, backup set, datafile, and data block level. RMAN validation checks that the RMAN backup files are accessible and free of physical and/or logical corruption. The following RMAN commands are used to perform validation:
• VALIDATE …
• BACKUP ... VALIDATE
• RESTORE ... VALIDATE
For example, “VALIDATE DATABASE;” checks whether the current database files have intra-block corruption. “BACKUP DATABASE VALIDATE;” scans the database as if it were creating a backup set and checks the online database files for block-level corruption, without producing any backup pieces. “RESTORE DATABASE VALIDATE;” performs the database restore without actually writing the database files to disk. VALIDATE can be used for backup validation as well, such as “VALIDATE BACKUPSET ...;”. A DBA can similarly check a backup set with “RESTORE ... VALIDATE” by issuing “SET UNTIL TIME ...” right after the backup set is created. In short, both “VALIDATE ...” and “RESTORE ... VALIDATE” can be used for RMAN backup validation. Note that RMAN VALIDATE cannot detect inter-block corruption in a datafile; that is a job for dbverify.
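
For example, here is a sketch that validates a restore of the whole database as of a point in time without writing anything to disk (the timestamp is just an example value):

RUN
{
  SET UNTIL TIME "to_date('2011-07-15:19:00:00','YYYY-MM-DD:hh24:mi:ss')";
  RESTORE DATABASE VALIDATE;
}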

Option 2: Restore to a new location

You can go the whole nine yards of a restore without database or tablespace downtime. Use “SET UNTIL TIME ...” to pick the point in time of the restore and “SET NEWNAME FOR ...” to pick the location of the restore. For example:
RUN
{
  SET UNTIL TIME "to_date('2011-07-15:19:00:00','YYYY-MM-DD:hh24:mi:ss')";
  SET NEWNAME FOR DATAFILE '/nas01/app/oracle/oradata/orcl/users01.dbf' TO '/tmp/users01.dbf';
  RESTORE DATAFILE 4;
}

Monday, March 14, 2011

Oracle statistics collection

We all know the good old “analyze table” SQL statement for collecting statistics on a table. In newer Oracle releases the dbms_stats package has become the preferred method, and it is much more powerful than “analyze table”. Starting with Oracle 10g, Oracle uses a dedicated background job to collect database statistics, and it is on by default. However, because Oracle treats column cardinalities as independent of each other, the optimizer can generate a bad execution plan when the columns’ data distributions are correlated, such as a state column and a zip code column. To make Oracle do the right thing, we can use the dynamic sampling hint to ask Oracle to collect statistics on the fly, just for the query being run, and obtain the correct cardinality. Dynamic sampling does add some load to the database, and Oracle does not store the good execution plan generated with dynamic sampling. Starting with Oracle 10g, a SQL profile captures auxiliary statistics for a query and stores them so the optimizer produces that good execution plan whenever the query runs later. It is particularly useful that you can modify a SQL profile manually to make a query use the execution plan you like, and with the “force match” option the profile can be applied to similar queries that do not use bind variables. Adding hints to SQL becomes very much a thing of yesterday with SQL profiles.
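
Here is a short sketch of both approaches (the schema, table, and column names are assumptions): dbms_stats gathers and stores the statistics, while the dynamic_sampling hint samples at parse time for that one query only:

-- preferred over ANALYZE TABLE: gather and store optimizer statistics
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'APP', tabname => 'CUSTOMERS', cascade => TRUE);

-- sample on the fly for this query only, to catch the correlation between state and zip code
SELECT /*+ dynamic_sampling(c 4) */ COUNT(*)
FROM app.customers c
WHERE c.state = 'CA'
AND c.zip_code LIKE '9%';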

Saturday, March 12, 2011

Many faces of an Oracle data block

There are all kinds of data blocks in an Oracle database, such as table segment blocks, index blocks, undo segment blocks, and datafile header blocks, to name a few. Let’s just talk about the table data block for now.

For a single table data block, Oracle can hold many copies and many versions of it. The data block can be an image on disk or an image in the buffer cache, and Oracle can update the block in memory without writing the change to disk immediately; in other words, the versions of the block in memory and on disk can differ. Things get more complicated in RAC, where the current version of a data block can exist in multiple Oracle instances. To provide read consistency across instances, the instance holding the block builds the older, read-consistent image in memory and ships it to the remote instance, because the remote instance has no access to the local undo segments needed to construct it. In addition, when a dirty current block is shipped to another instance, the sending instance keeps an older copy called a Past Image (PI), so it is possible to have many Past Image versions of a data block in a single instance.
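
You can actually see these versions in the buffer cache. As a sketch (APP.A_TABLE is an assumed object), v$bh shows one row per buffer, with STATUS values such as xcur (current), cr (consistent read) and, in RAC, pi (past image):

SELECT status, COUNT(*)
FROM v$bh
WHERE objd = (SELECT data_object_id
              FROM dba_objects
              WHERE owner = 'APP' AND object_name = 'A_TABLE')
GROUP BY status;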