Oracle Internal Table X$BH Buffer Hash

Oracle Internal Table X$BH Buffer Hash

See Also: V$bh

Description

Buffer Hash(according to J. Morle's Scaling Oracle 8i)
Another explaining is Buffer Header.

Columns

ver 10.2.0.4
> x$bh
 Name         Type
 ------------ -----------
 ADDR         RAW(8)
 INDX         NUMBER
 INST_ID      NUMBER
 HLADDR       RAW(8)
 BLSIZ        NUMBER
 NXT_HASH     RAW(8)
 PRV_HASH     RAW(8)
 NXT_REPL     RAW(8)
 PRV_REPL     RAW(8)
 FLAG         NUMBER
 RFLAG        NUMBER
 SFLAG        NUMBER
 LRU_FLAG     NUMBER
 TS#          NUMBER
 FILE#        NUMBER
 DBARFIL      NUMBER
 DBABLK       NUMBER
 CLASS        NUMBER
 STATE        NUMBER
 MODE_HELD    NUMBER
 CHANGES      NUMBER
 CSTATE       NUMBER
 LE_ADDR      RAW(8)
 DIRTY_QUEUE  NUMBER
 SET_DS       RAW(8)
 OBJ          NUMBER
 BA           RAW(8)
 CR_SCN_BAS   NUMBER
 CR_SCN_WRP   NUMBER
 CR_XID_USN   NUMBER
 CR_XID_SLT   NUMBER
 CR_XID_SQN   NUMBER
 CR_UBA_FIL   NUMBER
 CR_UBA_BLK   NUMBER
 CR_UBA_SEQ   NUMBER
 CR_UBA_REC   NUMBER
 CR_SFL       NUMBER
 CR_CLS_BAS   NUMBER
 CR_CLS_WRP   NUMBER
 LRBA_SEQ     NUMBER
 LRBA_BNO     NUMBER
 HSCN_BAS     NUMBER
 HSCN_WRP     NUMBER
 HSUB_SCN     NUMBER
 US_NXT       RAW(8)
 US_PRV       RAW(8)
 WA_NXT       RAW(8)
 WA_PRV       RAW(8)
 OBJ_FLAG     NUMBER
 TCH          NUMBER
 TIM          NUMBER
ver 11.1.0.6.0
 Name                 Null?    Type
 -------------------- -------- ---------------
 FILE#                         NUMBER
 BLOCK#                        NUMBER
 CLASS#                        NUMBER
 STATUS                        VARCHAR2(10)
 XNC                           NUMBER
 FORCED_READS                  NUMBER
 FORCED_WRITES                 NUMBER
 LOCK_ELEMENT_ADDR             RAW(4)
 LOCK_ELEMENT_NAME             NUMBER
 LOCK_ELEMENT_CLASS            NUMBER
 DIRTY                         VARCHAR2(1)
 TEMP                          VARCHAR2(1)
 PING                          VARCHAR2(1)
 STALE                         VARCHAR2(1)
 DIRECT                        VARCHAR2(1)
 NEW                           CHAR(1)
 OBJD                          NUMBER      matches dba_objects.data_object_id, not object_id
 TS#                           NUMBER

Frequent Used SQL

  • Cache Buffer Chain Latch Contention

With x$bh table you can find the object and the file#.block# of its header when there's high cache buffers chains latch contention for the said latch (whose sleeps you think are too high).

select obj, dbarfil, dbablk 
from x$bh bh, v$latch_children lc 
where bh.hladdr = lc.addr

Example:
19:16:58 SQL> select obj, dbarfil, dbablk
19:17:02   2  from x$bh bh, v$latch_children lc
19:17:02   3  where bh.hladdr = lc.addr and rownum <= 10;
 
       OBJ    DBARFIL     DBABLK
---------- ---------- ----------
      5932          2      15488
      6091          2      41749
     12752          2      72348
      5938          2      37411
     12806          2      68010
4294967295          3      10218
     12688          2      63672
         2          1      47252
4294967295          3       1542
     12578          2      54996
 
10 rows selected.
  • Cache Buffer Clones

You can also use this table to see if a specific buffer has too many clones:

select dbarfil, dbablk, count(*) 
from x$bh 
group by dbarfil, dbablk having count(*) > 2

Example
19:17:13 SQL> select dbarfil, dbablk, count(*)
19:21:34   2  from x$bh
19:21:34   3  group by dbarfil, dbablk having count(*) > 2;
 
   DBARFIL     DBABLK   COUNT(*)
---------- ---------- ----------
         1      10087          3
         1      10032          3
         1       8626          5
         1      36756          3
         1      36405          4
         1      10356          3
         1      44133          5
         1      10054          3
         1      10155          5
         1      36738          3
         1     117707          4
         1      36819          4
         1      10152          3
         1     105370          5
...
 
95 rows selected.

Note that obj column matches dba_objects.data_object_id, not object_id.
For performance reason, don't merge dba_extents with the query of x$bh that has a group by, unless you use in-line view and no_merge hint (see J. Lewis' Practical Oracle8i, p.215) The tch column, touch count, records how many times a particular buffer has been accessed. Its flag column is explained by J. Lewis' Buffer Flag; explanation of state, mode and indx can be found in Anjo Kolk's paper session wait events.

RAC and v$bh

The internal machinations of Oracle buffer cache management are not published and many Oracle professionals infer their behavior from experimentation and tidbits from the Oracle documentation. For a complete description of the v$ event views, get the free 10g poster by UNISYS.

19:31:03 SQL> desc v$bh
 Name                                                                                Null?    Type
 ----------------------------------------------------------------------------------- -------- --------------------------------------------------------
 FILE#                                                                                        NUMBER
 BLOCK#                                                                                       NUMBER
 CLASS#                                                                                       NUMBER
 STATUS                                                                                       VARCHAR2(10)
 XNC                                                                                          NUMBER
 FORCED_READS                                                                                 NUMBER
 FORCED_WRITES                                                                                NUMBER
 LOCK_ELEMENT_ADDR                                                                            RAW(4)
 LOCK_ELEMENT_NAME                                                                            NUMBER
 LOCK_ELEMENT_CLASS                                                                           NUMBER
 DIRTY                                                                                        VARCHAR2(1)
 TEMP                                                                                         VARCHAR2(1)
 PING                                                                                         VARCHAR2(1)
 STALE                                                                                        VARCHAR2(1)
 DIRECT                                                                                       VARCHAR2(1)
 NEW                                                                                          CHAR(1)
 OBJD                                                                                         NUMBER
 TS#                                                                                          NUMBER

For RAC Instance

The v$bh view of x$bh has the all-important “status” column that indicates the lock mode for each data block in the data buffer. You can use v$bh to display the data buffer contents, and here is Burleson's great script for this. For non-OPS and non-RAC databases, you will see these common lock modes:

xcur - This is a RAM block that has been acquired in exclusive current mode. According the Oracle documentation, if a buffer state is exclusive current (XCUR), an instance owns the resource in exclusive mode.

cr - This mode indicates a "cloned" RAM block (a "stale" block), that was once in xcur mode. The instance has shared access to the block and can only perform reads. The cr state means the owning instance can perform a consistent read of the block, if the instance holds an older version of the data.

free - This is an “available” RAM block. It might contain data, but it is not currently in-use by Oracle.

read – The buffer is reserved for a block that is currently being read from disk.

During database recovery we might also see these values for v$bh status column:
mrec – Indicates a block in media recovery mode

irec – This is a block in instance (crash) recovery mode

For RAC database we also see this v$bh.status values:
scur - a current mode block, shared with other instances

A normal database will have filled all the buffer blocks in just a few minutes, and the database buffer will normally remain full until shutdown time. These "free" RAM blocks still contain data. They are just marked as free because they are eligible to be overlaid by a new incoming data block.

Multi-versioning of blocks in the buffer

The Oracle row-level locking mechanism is well understood by Oracle experts, but here is some question about how row-level locking translated into data buffer block status changes.

In a MetaLink forum discussion titled "Multiple buffer versions flooding the buffer cache", we see a DBA on Oracle 8.1.6.2 who claims to have 50 versions of a single data block in his data buffer. This query was used to identify multiple versions of a data block within the buffer cache:

SELECT 
   dbarfil, 
   dbablk, 
   class, 
   count(*) 
FROM 
   x$bh 
GROUP BY 
   dbarfil, 
   dbablk,
   class 
HAVING 
   count(*) > 40;

And also a query to identify the specific segments that are experiencing multiple data buffer versions:
SELECT 
   segment_name, 
   segment_type, 
   owner, 
   tablespace_name 
FROM 
   sys.dba_extents 
WHERE 
   file_id = <DBAFIL> 
AND 
   <DBABLK> BETWEEN block_id and block_id + blocks -1;

"If this is a table or index block it is advisable to check the application to see why the block is being frequently changed as this may represent a point of contention."

"In general, you would see a number of CR blocks in the buffer cache proportional to the number of updates on that block."

In this case we see an argument for re-setting the pctfree object parameter when we have a table (or index) with many "hot update" rows. Re-setting pctfree to a high-value will spread the data rows across more data blocks, thereby relieving the possibility of having dozens of versions of the data block in the data buffer.

For example, if you have avg_row_len = 80 and 2k data blocks, setting pctfree=90 would remove the data block from the freelist (marking it as logically full) after only three rows are inserted. This approach wastes disk space but reduced the multiple versions of "cr" RAM blocks. This high pctfree technique is also used to reduce row fragmentation in tables with VARCHAR data types where a row is initially stored small with huge row expansion from subsequent UPDATE statements.

Mark Bobak, a member of the respected Oaktable group, professes some knowledge of buffer management internals, and published this excellent speculation on the behavior of buffer block status:

I'm fairly certain, but not 100% sure, that Oracle creates a CR block from a XCUR (or SCUR? I'm not a RAC expert), by cloning the XCUR block, and then referring to rollback, rolls back the block till it's sufficiently old, to be read consistent to the point in time required.

So, if a query's snapshot SCN is N, but a particular DBA (data block address) is only available in current mode, and the SCN is N+100, then Oracle will clone it, and mark it as CR, and start rolling back, using the information it the block's ITL slot(s) to backtrack where the various before images are.

Eventually, one of two things will happen, either the block will be sufficiently old (with an SCN of N or less) or, Oracle will not be able to find the required information in the rollback segment, which would result in ORA-1555.

And yes, I believe the XCUR block is kept in the buffer cache after a transaction completes. (Again, there may be some gotchas in the RAC case, I'm not sure about that case.)

Consider also, that a block in XCUR mode can be flushed to disk, even if there is a transaction open on the block. Oracle doesn't care. If you think about how the locking model works, and how read consistency works, it really doesn't matter. If a block with one or more row level locks is flushed to disk, and then the transaction commits, what happens?

Well, Oracle certainly won't re-read the block just to update the row lock and ITL slot. So, what does it do? It marks the transaction as committed in the rollback segment slot header, and moves on. If another process comes along and reads or updates that block, delayed block cleanout will kick in to clean up the mess that was left behind.

There is also a question about buffer multi-versioning when using Oracle 10g Automatic Storage Memory Management (ASMM). Here is an actual example from an ASMM 10g database showing only one free block in the data buffer:

STATUS  NUMBER_BUFFERS 
------- -------------- 
cr                 616 
free                 1  
xcur             14790
 
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                        % Total
Event                                 Waits    Time (s)   DB Time     Wait Class
------------------------------ ------------ ----------- --------- --------------
log file parallel write               9,670         291     55.67     System I/O
log file sync                         9,293         278     53.12         Commit
CPU time                                            225     43.12
db file parallel write                4,922         201     38.53     System I/O
control file parallel write           1,282          65     12.42     System I/O

Here is a sample AWR report from the system as BC first found it.

Note that all xcur and cr blocks in the RAM data buffers can be de-allocated (with the alter system flush buffer_pool command) to make their status “free” and they can then be reassigned to other SGA regions. Here is a simple script to display counts of each v$bh status:

column c1 heading 'Status'                 format a10
column c2 heading 'Number|of Data|Buffers' format 999,999,999
 
select
   status    c1,
   count(1)  c2
from
   v$bh
group by
   status
order by
   count(1) desc;

Here is the output from a high-DML database.
                 Number
                of Data
Status          Buffers
---------- ------------
xcur            311,967
free            270,731
cr               17,302

The number of blocks in v$bh where status="free" is critical to high-update databases because there must be enough buffers to hold all data blocks that are being updated (in xcur status). The Oracle 10g Automatic Storage Memory Management (ASMM) feature (the default on Oracle10g) should anticipate high updates and allocate additional data buffers during high update periods. For example, here is an actual output from an Oracle10g database where it appears that ASMM is not allocating enough free blocks to accommodate concurrent updates:
STATUS  NUMBER_BUFFERS 
------- -------------- 
cr                 616 
free                 1  
xcur             14790

For more on ASMM, click here.

An Online Demonstration

This example below shows how Oracle marks un-locked block buffers as “free” and how non-free buffer blocks (where status <> 0) and we see the cur and xcur blocks being flushed from the buffer cache when we issue the alter system flush buffer_cache command. In this example there are 50,000 rows in the POLICYREC table.

SQL> update POLICYREC set sum_assured = sum_assured + 15;
50000 rows updated.
 
SQL> commit;
Commit complete.
 
SQL> select o.OBJECT_TYPE, substr(o.OBJECT_NAME,1,10)
  objname , b.objd , b.status, count(b.objd) from v$bh b,
  dba_objects o where b.objd = o.data_object_id and
  o.owner = 'NYUSER' group by o.object_type,
  o.object_name,b.objd, b.status ;
 
OBJECT_TYPE  OBJNAME         OBJD STATUS   COUNT(B.OBJD)
----------- -------------   ------ ------  -------
TABLE       TEST1             43058 free         6
TABLE       POLICYREC         43061 cr          47
TABLE       POLICYREC         43061 free       238
TABLE       POLICYREC         43061 xcur       376
 
SQL> alter system flush buffer_cache;
 
System altered.
 
SQL> select o.OBJECT_TYPE, substr(o.OBJECT_NAME,1,10)
 objname , b.objd , b.status, count(b.objd) from  v$bh b,
 dba_objects o where b.objd = o.data_object_id and
 o.owner = 'NYUSER' group by o.object_type,
 o.object_name,b.objd, b.status ;
 
OBJECT_TYPE OBJNAME       OBJD STATUS COUNT(B.OBJD)
---------- -------------- ----- ----- -------------
TABLE           TEST1      43058 free         6
TABLE       POLICYREC      43061 free       660

Reference

English
By Burleson Consulting's The mysteries of Oracle buffer block management
http://www.stormloader.com/yonghuang/computer/x$table.html
http://www.ixora.com.au/ (lots of valuable documents are deleted …)
http://www.ixora.com.au/scripts/x$tables.htm
http://www.fors.com/velpuri2/X$/List%20of%20X$%20Tables
http://ordba.net/Ora10g/x$tab.htm
http://www.orafaq.com/faqdbain.htm#X$TAB
http://www.oracle-training.cc/oracle_tips_x$.htm
http://www.adp-gmbh.ch/ora/misc/x.html
http://www.petefinnigan.com/Storing_Data_Directly_From_Oracle_SGA.pdf
http://oraperf.sourceforge.net/presentations/SGA_access.ppt
http://www.eygle.com/refer/Oracle_x$table_list.htm
http://imergegroup.blogspot.com/2007/03/what-is-inside-xtables-in-oracle.html
http://www.oracle-internals.com/?p=11
http://www.jlcomp.demon.co.uk/faq/find_dist.html
https://twiki.cern.ch/twiki/bin/view/PSSGroup/ASM_Internals
22241.1 List of X$ Tables and how the names are derived

Chinese
http://www.dbanotes.net/X$tables.htm
http://www.eygle.com/archives/2004/11/oracle_x_table.html
http://ysjxjf.itpub.net/post/7246/68230
http://chrysie.itpub.net/post/20977/276884
http://soft.zdnet.com.cn/software_zone/2007/0811/452643.shtml
http://fantastechnol.blogspot.com/search/label/Oracle