Tuesday, October 19, 2010

Data Pump with higher version of dump file

Sometimes we need to import a Data Pump dump file that was created by a higher version of Data Pump export, for example when moving production data to a test database running a lower database version, or when a dump file comes from a vendor or another group within the company that runs a higher database version. Before Data Pump, we could import a dump file generated by a higher version of the export utility (exp) by using the matching higher version of the import utility (imp) connected to the lower version database. This trick no longer works with Data Pump, because the database itself does the heavy lifting, and a newer Data Pump binary cannot work with an older database. The workaround is to set the VERSION parameter to the lower database version when you run the Data Pump export, for example VERSION=10.2; the resulting dump file can then be imported into a 10.2 database with the 10.2 Data Pump.

But what if you receive a dump file that was exported without the VERSION parameter, and it is impossible to ask the provider to regenerate it, or it would take too long to do so? You always have the option of importing the dump file into a higher version database, exporting again with the VERSION parameter set, and then importing the regenerated dump file into the lower version target database. That is a lot of steps and some extra work, but it can be done.

If you like to take a shortcut (and most of us do), you can skip the intermediate export step by using the NETWORK_LINK parameter. Say you would import the dump file into database orcl11 first and the final target is orcl10. Instead of exporting from orcl11 with VERSION=10.2 and then importing the newly generated dump file into orcl10, you can create a database link db_link_10_to_11 from orcl10 to orcl11 and run the 10g Data Pump import directly with NETWORK_LINK=db_link_10_to_11. You do not need the DUMPFILE parameter, because orcl11 unloads the data on the fly and it is loaded straight into the 10g database. There are a few traps with this feature: you have to grant the database link user the Data Pump privileges to make it work, and you need to set VERSION=10.2 on the import to work around an Oracle bug.
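
For illustration, here is a minimal sketch of the over-the-link import. The schema name and credentials are placeholders, not values from the original setup; the database and link names are the examples above:

    SQL> -- on orcl10, connected as a user with the IMP_FULL_DATABASE role
    SQL> CREATE DATABASE LINK db_link_10_to_11
           CONNECT TO system IDENTIFIED BY manager USING 'orcl11';

    $ # then run the 10.2 impdp against orcl10; note there is no DUMPFILE parameter
    $ impdp system/manager NETWORK_LINK=db_link_10_to_11 SCHEMAS=scott VERSION=10.2

The VERSION=10.2 on the import side is the workaround for the bug mentioned above.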

Wednesday, July 7, 2010

Oracle password strength

As we know, only alphanumeric characters and the special characters #, $ and _ can be used in an Oracle password, and the password must start with a letter. We also know the password is not case sensitive. Therefore, for a 4-character password there are 26*39*39*39 = 1,542,294 possible combinations. With the PL/SQL based brute force cracker I wrote, running at 20K password hashes per second, it takes 77 seconds in the worst case to find the password from its hash value. If the password length is 8, which is the standard for many organizations, there are 3,568,006,173,654 possible combinations; with my 20K per second cracker, the worst case takes 5.66 years. But do not feel too secure: with a faster Oracle password cracker such as Repscan or Worauthbf, at a rate of 1 million hashes per second, the worst case takes 41 days.

If you still feel comfortable because your users change passwords every month, please read on. Because the Oracle password hash is salted only by the user name, a hacker can build a password-to-hash lookup table for a given user, such as SYS or SYSTEM. There is only one problem: it would take about 71 TB to store the 3 trillion pairs of 8-character passwords and 16-byte hashes. However, this problem was solved long ago by a clever time-memory trade-off technique, best known today as the rainbow table. A rainbow table trades space for computation and compresses the disk requirement to the 50 to 100 GB range for Oracle passwords of 8 characters or less. Once the rainbow table is built, it takes only seconds to look up a password by its hash value. In short, your database is not secure if your password hashes are not secure.
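
To make the arithmetic concrete, the figures above can be reproduced with a quick query; the 20,000 and 1,000,000 rates are simply the cracker speeds quoted in the text:

    SELECT 26*POWER(39,3)                              AS combos_4_chars,
           26*POWER(39,7)                              AS combos_8_chars,
           ROUND(26*POWER(39,7)/20000/86400/365.25, 2) AS years_at_20k_per_sec,
           ROUND(26*POWER(39,7)/1000000/86400)         AS days_at_1m_per_sec
    FROM   dual;

This reproduces the 1,542,294 and 3,568,006,173,654 combination counts and the roughly 5.7-year and 41-day worst cases quoted above.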

There are many ways to obtain the database password hashes. For example, a user with catalog read privileges can query the DBA_USERS view to get the hash values. Or, if a hacker can read the SYSTEM data file, he can easily dump the data blocks and obtain the password hashes, because the data file is not encrypted.
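
A minimal sketch of the first route, on a 10g or earlier database where DBA_USERS still exposes the hash (in 11g the PASSWORD column there is empty and the hashes live in SYS.USER$):

    SELECT username, password
    FROM   dba_users
    WHERE  username IN ('SYS', 'SYSTEM');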

To strengthen Oracle passwords, many DBAs use passwords enclosed in double quotes. By doing that, any character except the double quote can be used. Because Oracle hashes the password as two-byte Unicode, foreign language characters such as Japanese characters are valid, and so are special characters such as a carriage return. You can also set a password that contains a double quote with the SQL*Plus password command, or compute the hash with a password cracker and set it directly. Another option is centralized authentication such as Oracle Internet Directory, or Oracle wallet based authentication. Oracle 11g adds a random salt to the password hash and makes passwords case sensitive, so the rainbow table attack no longer works. Unfortunately, the SHA-1 hashing used by Oracle 11g takes much less computation than the old DES-based hashing, and Alexander Kornbrust reported an Oracle password cracker, ighashgpu, that achieved a rate of 790 million passwords per second against the Oracle 11g SHA-1 hash. In other words, it can crack the 255 trillion 8-character passwords of Oracle 11g in 3.7 days in the worst case.
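
As a small sketch of the quoted password trick (the user and password are just examples, chosen to include characters that are rejected without quotes):

    ALTER USER scott IDENTIFIED BY "s3cret-Pwd+#1";
    CONNECT scott/"s3cret-Pwd+#1"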

Tuesday, June 15, 2010

Oracle and ACID

We all know the Oracle database is ACID compliant. ACID stands for Atomicity, Consistency, Isolation and Durability. Even so, DBAs and database users in general cannot assume that just because it is an Oracle database it meets all the ACID properties. An Oracle database has to be set up and used correctly to meet them, and in some cases a tradeoff is made on ACID for other gains, such as performance.

Let’s start with atomicity. Atomicity means a transaction is done completely or not at all. For example, two rows are inserted into table a_table by two INSERT statements, and when a rollback is issued, both rows are removed from the table. But what happens if a trigger with an autonomous transaction fires on insert on a_table and inserts some records into another table b_table? Even though the data in a_table is rolled back, the data in b_table is not, so atomicity is not met. If the trigger is there for auditing, such behavior is by design to meet a business requirement. Another example: two rows are updated in table a_table by two UPDATE statements, and when a rollback is issued, both rows return to their original values. Later, the user who issued the rollback wants to redo the rolled-back changes but cannot figure out the new values to apply, so he runs a flashback query to get them and reapplies them to a_table. Unfortunately, the flashback query time falls between the two update statements, and as a result only the second update of the two-update transaction is reapplied. In other words, the transaction is half completed and violates atomicity. Technically these are two separate transactions, but it is one transaction in the business sense.
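
For illustration, a minimal sketch of an audit trigger with an autonomous transaction; a_table and b_table are the example tables above, and the column names are made up:

    CREATE OR REPLACE TRIGGER trg_a_table_audit
      AFTER INSERT ON a_table
      FOR EACH ROW
    DECLARE
      PRAGMA AUTONOMOUS_TRANSACTION;
    BEGIN
      INSERT INTO b_table (audit_time, audit_note)
      VALUES (SYSDATE, 'row inserted into a_table');
      COMMIT;  -- committed independently; a later ROLLBACK of the insert does not undo this
    END;
    /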

The second rule of ACID is consistency. It requires the database to be in a consistent state before and after a transaction, no matter whether the transaction is committed or rolled back. Oracle maintains the consistent state with constraints, including referential constraints; business rules can be enforced by triggers or other means. But if constraints are disabled or not validated, or a trigger does not behave as designed, the database can end up in an inconsistent state after a transaction.

The third rule of ACID is isolation. A transaction should not see data changes made by other transactions after it has started. Out of the box, Oracle defaults to the READ COMMITTED isolation level, so non-repeatable reads and phantom reads are permitted. To achieve true isolation, the isolation level should be set to SERIALIZABLE, so the session does not see the changes of other sessions until its transaction is finished. Oracle implements this with multi-version read consistency and optimistic locking at the data block level.
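
A minimal sketch of requesting true isolation (issue it before the first statement of the transaction, and be prepared to retry on ORA-08177 "can't serialize access"):

    ALTER SESSION SET ISOLATION_LEVEL = SERIALIZABLE;
    -- or for a single transaction
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;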

The last rule of ACID is durability. Once a transaction is committed, the change cannot be lost. To ensure that, Oracle must be in archive log mode, with at least one backup, all archive logs since that backup, one copy of the redo logs and one copy of the control file available. Additionally, no objects should be in NOLOGGING mode, which skips generating the redo needed for recovery. And commits must be performed as immediate and wait, so the redo is written to disk on every commit.
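
A minimal sketch of the commit behavior part, using the 10gR2 COMMIT_WRITE parameter; immediate/wait is already the default, the point is simply not to relax it for durability-critical work:

    ALTER SYSTEM SET commit_write = 'IMMEDIATE,WAIT';
    -- the same choice can be made for a single commit
    COMMIT WRITE IMMEDIATE WAIT;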

It is the DBA’s responsibility to make the Oracle database ACID compliant, and in some cases to break the ACID properties for business or performance reasons. In the latter case, the DBA needs to understand the implications of such actions and make them clear to management and database users.

Friday, June 11, 2010

Grid Control Execute SQL

The "Execute SQL" feature on GRid Control is very handy to run SQL or PL/SQL command against multiple database targets. However, I had trouble to use like operator for my sql, as show in the screen shot.


The solution is to add "%" as an escape character, as shown below:
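
The original screenshots do not survive here; as a rough sketch of the idea, assuming the percent sign was being swallowed by Grid Control's own substitution, doubling it lets a literal % reach the database:

    -- as typed into the Execute SQL box (an assumption, not the original screenshot)
    SELECT owner, table_name
    FROM   dba_tables
    WHERE  table_name LIKE 'TMP%%';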

Thursday, March 11, 2010

Oracle SGA memory management on Linux

  1. SGA auto sizing is not that dynamic. At startup, the Oracle processes allocate memory up to sga_max_size, and Oracle keeps the unused SGA memory as “Free SGA Memory Available”. For example, if we set sga_max_size = 600M and sga_target = 500M, then query v$sgainfo, we find that “Free SGA Memory Available” is 100M. I could not find a reason not to set sga_target equal to sga_max_size (well, except for very special needs).
  2. It is hard to know exactly how much memory is allocated to an Oracle instance at the OS level. On Linux, you can use “ps aux” to find how much memory is allocated to an Oracle process from the VSZ column, in kilobytes. VSZ stands for “virtual memory size” and it is quite misleading. In Linux, virtual memory is a memory map whose entries can point to physical memory, to swap space on disk, or to memory not yet allocated, so it is more like the total memory reserved for the process. Secondly, VSZ includes both the memory private to the process and the memory shared with other processes. This is why the sum of VSZ over all processes of an Oracle instance is much larger than the SGA and PGA combined: the shared memory is counted multiple times. To find out what memory is shared and what is private, you can use “pmap -d <pid>”. A sample output is as follows:
    Address           Kbytes  Mode   Offset            Mapping
    ....
    0000003b27e13000       8  rw---  0000003b27e13000  [ anon ]
    0000003b2c500000      80  r-x--  0000000000000000  libnsl-2.3.4.so
    0000003b2c514000    1020  -----  0000000000014000  libnsl-2.3.4.so
    0000003b2c613000       8  rw---  0000000000013000  libnsl-2.3.4.so
    0000003b2c615000       8  rw---  0000003b2c615000  [ anon ]
    0000007fbfffa000      24  rwx--  0000007fbfffa000  [ stack ]
    ffffffffff600000    8192  -----  0000000000000000  [ anon ]

    mapped: 772456K    writeable/private: 6824K    shared: 620548K
    As it shows, pmap lists the memory address and size of every memory segment of the process, and the last line gives the private and shared memory totals. It is possible to write a program to find the total memory allocation of a set of processes based on the memory addresses and sizes in the pmap output of every process in the set; a rough sketch follows this list.
  3. Not all SGA memory is in physical memory. Back to the “ps aux” command: the RSS column is the “resident set size” in KB. In plain English, it is the physical memory the process uses, shared and non-shared alike. The following is the output for the pmon process of an Oracle instance:

    USER       PID     VSZ     RSS  COMMAND
    oracle   10108  764264   17696  ora_pmon_orcl

    The data indicates that only about 2% of the memory is actually in physical memory. Because the majority of the memory of an Oracle process is shared, the percentage of physical memory out of total memory is similar for the other Oracle processes. I still do not have a way to know which Oracle memory structures are in physical memory, or what percentage of them is. Normally it is not a problem, since Linux will move actively used memory into physical memory. But in some cases, for example when the server is swapping, when a virtual machine is used, or on a high performance system where even the first access to a stale memory page needs to be fast, the DBA may need to pin all Oracle memory into non-swappable physical memory. If certain Oracle memory structures, such as the library cache, are not in physical memory, the database is unusable because of performance.
  4. Pin the SGA into physical memory. The Oracle init parameter lock_sga can be set to true, and the ulimit parameter memlock must be set greater than sga_max_size; otherwise the instance fails to start with ORA-27102: out of memory. Let’s check the process again; interestingly, the physical memory allocated to the Oracle process is still very small.

    USER       PID     VSZ     RSS  COMMAND
    oracle   20420  764264   17696  ora_pmon_orcl

    The init parameter lock_sga is OS dependent, and it looks like Linux does not support lock_sga to pin the SGA into physical memory, even though “ipcs -m” shows the status of the shared memory as locked. Next, let us set pre_page_sga = true, run “ps aux” again and see how much physical memory is used:

    USER       PID     VSZ     RSS  COMMAND
    oracle   12278  764264  634348  ora_pmon_orcl

    The RSS is about 600M, so the SGA is in physical memory now. However, SGA pages in physical memory can still be paged out by the OS as needed. Again, memory locking does not work on Linux even though the flag is set.
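
As mentioned in item 2, summing the pmap output can be scripted. Below is a rough shell sketch that simply totals the writeable/private figure reported by “pmap -d” for every process of one instance; the ORACLE_SID (orcl) and the field position in the pmap summary line are assumptions about this pmap version, and deduplicating the shared segments by address, as suggested above, would need a fuller parser.

    #!/bin/bash
    # Sum the private (non-shared) memory of all processes of the instance
    # assumed to be ORACLE_SID=orcl.
    total_kb=0
    for pid in $(pgrep -f "ora_.*_orcl"); do
        # The last line of "pmap -d" looks like:
        # mapped: 772456K    writeable/private: 6824K    shared: 620548K
        priv_kb=$(pmap -d "$pid" | awk '/writeable\/private/ {print $4}' | tr -d 'K')
        total_kb=$((total_kb + ${priv_kb:-0}))
    done
    echo "private memory of all orcl processes: ${total_kb} KB"

The shared memory (the SGA itself) can then be counted once on top of this total, for example from “ipcs -m”.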

Tuesday, February 23, 2010

Solid State Disk as Oracle storage

I recently completed a cool project: implementing 11g Release 2 RAC on Solid State Drives (SSD). SSD is still too expensive except for the most advanced systems; with a price tag of about $10 per gigabyte, it is roughly 20 times more expensive than a Hard Disk Drive (HDD). But it is fast, quiet, more durable and greener, with 90% less power consumption. It is projected that SSD will cost only a few dollars per gigabyte within two years, and it will likely replace HDD the way USB flash memory replaced the floppy disk.

In this blog post, I present the IO test results of SSD against SAN using ORION (the Oracle IO Calibration tool). ORION results mimic the IO characteristics of an Oracle database more closely than generic disk benchmarks.
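
For reference, a minimal sketch of the kind of ORION run involved; the device names and disk count are placeholders, not the actual test configuration:

    $ cat ssd_test.lun
    /dev/sdc
    /dev/sdd

    $ ./orion -run simple -testname ssd_test -num_disks 2

The "simple" run exercises small random IO and large sequential IO separately, and writes the MBPS, IOPS and latency results to ssd_test_* output files.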

Large read throughput is 116 MBPS (megabytes per second) for SSD vs. 40 MBPS for SAN. The transfer rates are much lower than the theoretical transfer rates of SSD and SAN, but they are likely the throughput Oracle could achieve on such hardware. It is worth noting that the SSD is only mirrored, not striped, while the SAN is both mirrored and striped, so it is reasonable to assume SSD throughput would be a few times higher in a RAID 01 configuration. I used the same SAN and SSD configurations as the large read test for all other tests as well. The random small read throughput for the SAN is 1.8 MBPS, or 225 IOPS (IOs per second). The random small read throughput for the SSD is 72 MBPS, or 9000 IOPS, which is 40 times faster than the HDD-based SAN. Because SSD removes the latency of moving a disk head to a new disk section, random small reads run at about 60% of the large read speed on SSD. The data suggests SSD is great for OLTP systems and for the control files and redo log files of an Oracle database.

Large write throughput is 79 MBPS for the SAN and 160 MBPS for the SSD. Small random writes are 22 MBPS, or 2750 IOPS, for the SAN and 79 MBPS, or 9875 IOPS, for the SSD. It is interesting that write IO performance is better than read performance for both SAN and SSD. My guess is that the storage write cache plays a significant role here, and it is hard to measure the true disk write speed. However, it is clear that the SSD outperformed the SAN once again.

Friday, February 19, 2010

First Impression of Oracle 11gR2 RAC

11g Release 2 RAC has some major changes from 11g Release 1.

For starters, ASM and the Clusterware are combined into a single Oracle binary home called the Oracle grid infrastructure. By default, the voting disks and OCR are stored in an ASM diskgroup instead of on the raw devices recommended in previous releases. Such a change contradicts the earlier concept that ASM sits on top of the Clusterware, and it raises questions such as: if ASM crashes, does the Clusterware die as well because the voting disks and OCR are no longer accessible? Well, the Clusterware can access the voting disks directly without going through ASM, and it will bring ASM back up automatically after a crash. Restoring the OCR is more difficult and requires shutting down the Clusterware on all nodes, then starting the Clusterware on one node in exclusive mode with ASM online. Fortunately, the OCR is not easy to corrupt because it lives in a normal or high redundancy diskgroup.
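
A rough sketch of that OCR restore sequence on 11gR2; the grid home, cluster name and backup file are placeholders, and the exact steps should be checked against the documentation for your patch level:

    # on every node, as root
    crsctl stop crs

    # on one node only: start the stack in exclusive mode, then restore the OCR
    crsctl start crs -excl
    ocrconfig -restore /u01/app/11.2.0/grid/cdata/cluster1/backup00.ocr

    # bounce the stack on that node and restart it everywhere
    crsctl stop crs
    crsctl start crs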

Another new feature of 11g Release 2 is SCAN (Single Client Access Name), which binds a round-robin DNS name to the RAC nodes. It provides an easy way to load balance and fail over any client connection. However, setting up the DNS can be a challenge, and I wish the installer had an option not to use SCAN for RAC.

Overall, Release 2 RAC is easier to install and deinstall than earlier releases, and it is more stable because of the cluster time synchronization service. It also scales better than earlier releases because of the performance improvements in inter-node communication.