Monday, September 5, 2011

Oracle Compression

This blog is about Oracle table compression. As we know Oracle offers table compression basic and table compression advanced. Let us compare them.

Licensing: Basic compression is included with Oracle EE license. Advanced compression is $11,500 per CPU. Assume 2x compression ratio, you need 22 TB database before the compression to justify the cost of license for advanced table compression with a one CPU server.

Compression ratio: Both basic and advanced compressions do block level compression. So it can make compression for repeat values both in rows and in columns. General rules for great compression ratio are few distinct values in a block, larger size of values in term of big number or long string, and large block size. Sorted rows will have better compression ratio because few distinct values in a given block. Because some of the data in advanced compression block is not compressed., advanced compression ratio is a little less than the basic compression.

Concurrent access: Basic compression is only effective with bulk insert. Additionally, only one operation can be done at a time with basic compression. Otherwise, Oracle would give “ORA-12838: cannot read/modify an object after modifying it in parallel” error. Advanced compression does not have such restriction.

Performance: Compression uses some CPU resource but needs fewer disks IO. Based on the testing, the performance for insert, delete is relatively good. The update of compressed table is about 10 times slower than the no compressed table and the compressed table grows to the twice the size of no compressed table. The performance for updating with basic compression is worse than that of advanced compression with large number of chained rows.

Calculate the compression ratio: To calculate the compression rate, we count the blocks with rows by following SQL: “Select count(distinct(dbms_rowid.rowid_block_number(rowid))) from a_table;” and a_table is either compressed or not compressed table.

Friday, July 29, 2011

RMAN backup test without downtime

As a responsible DBA, periodically checking the database restore and recovery is a must to ensure the disaster recovery plan works. Unfortunately, most of backup restore and recovery procedures require scheduled downtime either at database level or at tablespace level. For system with high availability requirement, such scheduled downtime is hard to come by. In order to test the reliability of restore/recovery of such system, it is possible to carry out the procedure very close to full scale restore/recovery without take database system offline.

Option 1: RMAN validate:

RMAN can validate the database restore at database level, backupset level, datafile level, and data block level. The RMAN validate checks the RMAN backup file is accessible and no physical and/or logical corruption. The following RMAN commands are used to perform validation.
• VALIDATE …
• BACKUP ... VALIDATE
• RESTORE ... VALIDATE
For example: “VALIDATE DATABASE;” checks if current database files have intrablock corruption. “BACKUP DATABASE VALIDATE;” creates the backupset of the database and check for block level corruption for online database files. “RESTORE DATABASE VALIDATE” performs the database restore without actually writing the database file to disk. VALIDATE can be used for backup validation as well, such as “VALIDATE BACKUPSET ;” DBA can check the backupset using “RESTORE … VALIDATE” similarly by “SET UNTIL TIME …” right after the backupset is created. In short, both “VALIDATE …” and “RESTORE … VALIDATE” can be used for RMAN backup validation. RMAN VALIDAE could not detect interblock corruption of datafile and that is for dbverify.

Option2: Restore to new location

You can complete the whole nine yard of restore without database/tablespace downtime. Use “SET UNTIL TIME …” to select the time scale of the restore and “SET NEWNAME FOR …” to select the location of the restore. For example:
RUN
{
set until time "to_date('2011-07-15:19:00:00','YYYY-MM-DD:hh24:mi:ss')";
set newname for datafile '/nas01/app/oracle/oradata/orcl/users01.dbf' to '/tmp/users01.dbf';
RESTORE datafile 4;
}

Monday, March 14, 2011

Oracle statistics collection

We all know good old “analyze table” SQL statement to collect statistics of a table. With newer version of Oracle database release, dbms_stats package becomes preferred method and it is much more powerful than “Analyze table” SQL. Starting with Oracle 10g, oracle uses dedicated background process to collect database statistics and it is on by default. However, because oracle considering column cardinality is independent to each other, the optimizer can generate bad execution plan if the column’s data distributions are related, such as column state and column zip code. To make the oracle to do the right thing, we can use dynamic sampling hint to ask Oracle collect statistics on fly just for the query to be run to get correct cardinality. Dynamic sampling does add some load tax on database and Oracle would not store the good execution plan generated with dynamic sampling. Start with Oracle 10g, Oracle SQL profile would collect stats for a query and store the execution plan to run such query all the time later. It is particularly useful that you can modify the SQL profile manually to let a query run the execution plan you like and with “force match” option, the execution plan can be applied to similar query that does not use binding variables. Add hint to a SQL becomes so much a thing of yesterday with SQL profile.

Saturday, March 12, 2011

Many faces of an Oracle data block

There are all kinds of data block in Oracle database, such as table segment block, index data block, undo segment block, or data file head block, to name a few. Let’s just talk about the table data block for now.

For a single table data block, Oracle can have many copies of it and many versions of it. The data block could be a disk image or memory image in data buffer cache. Oracle can update the data block and such change is not necessary to be written to disk immediately. In other word, the versions of the data block in memory and on disk are different. Things get more complicated in RAC. Current version of data block can exist in multiple Oracle instances. To implement Oracle read consistence, the old image of data block called Past Image in memory is constructed and is sent to remote instance because remote instance has no access of local Undo segment to construct read consistent data.It is possible that many versions of Past Image data blocks in a single instance.

Thursday, January 20, 2011

V$ARCHIVED_LOG

v$archived_log is a dynamic view to display archive log information in control file. I recently found that many archived logs in the view are very old. Checking initial parameter control_file_record_keep_time and it is default at 7 days. So why these old archive log records are still there especially the status of tehm is “D”, which stands for deleted after RMAN backup. Did some research on metalink but no luck. Then found an article from asktom that explains it very well. According to Tom, control_file_record_keep_time is the minimum number of day archive logs records would keep in control file. If database generated a lot of archive fle during the control_file_record_keep_time, the control file would grow to store all the archive log records. However, the control file never shrinks and as the result, it would take long time to overwritten old archive log records for a low activity database.

Tuesday, October 19, 2010

Data Pump with higher version of dump file

Sometimes, we need import a data pump dump file that created with higher version of data pump export. For example, moving production data to a testing database with lower version of database, or getting the dump file from a Vendor or another group within company that run higher version of database. Before Oracle data pump, we can import an Oracle dump file generated by higher version export utility (exp) using the higher version of import utility (imp) that connects to a lower version database. This trick does not work for data pump anymore because the database actually does the heavy lifting for data pump and newer version of data pump binary cannot work with older version of database. The work around is to set VERSION parameter to lower version of database with you do the data pump export. For example: VERSION=10.2 and then you can import it into database with Oracle version 10.2 database using data pump version 10.2. But what about you get a dump file that the version parameter is not being set and it is impossible to ask the provider to regenerate the dump file or it takes too long to do so? You always have option to import the dump file to a higher version of database then export again with version parameter to be set. Then import the regenerated dump file to the target lower version database. Lots of steps and some more additional work, but it can be done. If you like to take a short cut (and most of us do like to take shortcuts), you can skip the data export step by using network_link parameter. Say the importing your data into database orcl11 and the target database is orcl10. Instead of exporting the data with version=10.2 from orcl11g then importing with newly generated dump file to orcl10, you can create a database link db_link_10_to_11 from orcl10g to orcl11g and run 10g data pump directly with network_link=db_link_10_to_11 parameter. You do not need to set dumpfile parameter because orcl11g database generates the dump on fly and export into 10g database. There are few traps and tricks to use this feature. You have to grant the db link user the data pump privilege to make it work and you need to set version=10.2 for the data pump import to work around Oracle bug.

Wednesday, July 7, 2010

Oracle password strength

As we known, only alphanumeric characters and special characters, #,$,_, can be used for oracle password and it must start with a letter. We also know password is not case sensitive. Therefore, for a password with 4-character length, there are 26*39*39*39 = 1,542,294 possible combination. With the PLSQL based brutal force cracker I write at speed of 20k password hash per second, it takes 77 seconds the worst case to find the password based on hash value. If the password length is 8, with is standard for many organizations, there are 3,568,006,173,654 possible combination. With my 20K per second password cracker, it takes 5.66 years for the worst case. But, do not feel too secure, with a faster oracle database cracker, such as Repscan and Worauthbf at rate of 1 million per seconds, it takes 41 days to find password for the worst case. If you still feel comfortable because your user changes password every month, please read on. Because oracle password is salted only by user name, a hacker can build password to hash lookup table for a user, such as sys or system users. There is only one problem, it would need 71 TB to store 3 trillion combination of 8-character password and 16 bytes hash pairs. However, this problem is solved in 1980’s with a clever solution called rainbow table. Rainbow table trades the space with computation and compress the disk requirements to the 50 – 100 GB range for oracle password 8-character or less. Once the rainbow table is built, it only takes seconds to look up password by its hash value. In short, your database is not security if your password hash is not secure.

There are many ways to obtain the database hash value. For example, if a user has read catalog privileges, he can query DBA_USERS view to get hash value. Or if the system data file can be read by a hacker, he can easily dump the data blocks and obtains password hash because the data file is not encrypted.

To enhance the oracle password strength, many DBAs use password enclosed in double quote. By doing that, any characters except double quote can be used. Because Oracle hash the password as two bytes Unicode, foreign language characters such as Japanese character is valid, so do special characters such as cartridge return. You can also use password contains double quote with SQLPLUS password command or compute the hash using password cracker and set it accordingly. Another option is to use centralized authentication such as Oracle Internet Directory or Oracle wallet based authentication. Oracle 11g adds random salt for password hashing and it is case sensitive. Therefore rainbow table attack becomes invalid. Unfortunately, SHA1 hashing used by Oracle 11g uses much less computation power than the old DES hashing and Alexander Kornbrust reported an oracle password cracker, ighashgpu, that achieved the rate of 790 million passwords per seconds for Oracle 11g SHA1 hash. In other word, it can crack 255 trillion 8-character passwords of Oracle 11g in 3.7 days for the worst case.