RAC: Frequently Asked Questions (RAC FAQ) (Doc ID 220970.1)

Contents

QUESTIONS AND ANSWERS

RAC - Real Application Clusters

RAC

RAC One Node

QoS - Quality of Service Management

Clusterware

Autonomous Computing

Rapid Home Provisioning

Answers

How do I measure the bandwidth utilization of my NIC or my interconnect?

Oracle RAC depends on both (a) latency and (b) bandwidth.
(a) Latency is best measured by running an AWR or Statspack report and reviewing the cluster section.
(b) Bandwidth can be measured with OS-provided utilities such as iptraf, Netperf, or topas (AIX). Note that some of these utilities may not be available on all platforms.

Keep in mind that if the network is utilized at 50% of its bandwidth, it is busy, and unavailable to other potential users, 50% of the time. In that case delays (due to network collisions) increase latency even though the bandwidth figure might look "reasonable". Always keep an eye on both latency and bandwidth.
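As a rough illustration of the bandwidth side, the sketch below turns two readings of an interface byte counter into an average utilization percentage. The function name, the sample counter values, and the 1 Gbit/s link speed are assumptions chosen for illustration; none of them come from an Oracle tool.

```python
def link_utilization_pct(bytes_start, bytes_end, interval_s, link_mbit_s):
    """Return average link utilization (percent) over one sampling interval."""
    if interval_s <= 0 or link_mbit_s <= 0:
        raise ValueError("interval and link speed must be positive")
    bits_moved = (bytes_end - bytes_start) * 8
    capacity_bits = link_mbit_s * 1_000_000 * interval_s
    return 100.0 * bits_moved / capacity_bits


# Hypothetical sample: 625,000,000 bytes moved in 10 seconds on a 1 Gbit/s link.
print(link_utilization_pct(0, 625_000_000, 10, 1000))  # 50.0
```

An average like this hides short bursts, so it complements the latency figures from AWR rather than replacing them.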


How can I validate the scalability of my shared storage? (Tightly related to RAC / Application scalability)

RAC scalability depends on the storage unit's ability to process I/Os per second (throughput) in a scalable fashion, specifically from multiple sources (nodes).

Oracle recommends using ORION (the Oracle I/O test tool), which simulates Oracle I/O. Note: starting with 11.2, the orion tool is included with the RDBMS/RAC software; see ORACLE_HOME/bin. On Linux and other Unix platforms you can also use IOzone. If a prebuilt binary is not available, build it from source. Make sure to use version 3.271 or later, and add the "-I" flag when testing raw/block devices.

In a basic read test you try to demonstrate that a certain I/O throughput can be maintained as nodes are added. Simulate your database I/O patterns as closely as possible, i.e. block size, number of simultaneous readers, rates, etc. For example, on a 4-node cluster, you measure 20MB/sec from node 1, then start a read stream on node 2 and see another 20MB/sec while the first node shows no decrease. You then run another stream on node 3 and get another 20MB/sec. In the end you run 4 streams on 4 nodes and get an aggregate of 80MB/sec or close to it. This proves that the shared storage is scalable. Obviously, if you see poor scalability in this phase, it will carry over and be observed or interpreted as poor RAC / application scalability.
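The read test above can be summarized as a scaling-efficiency ratio per step. This is a minimal sketch: the function name is hypothetical, and the throughput numbers are simply the ones from the 4-node example, where you would substitute your own ORION or IOzone measurements.

```python
def scaling_efficiency(aggregate_mb_s, single_node_mb_s):
    """Ratio of measured aggregate throughput to ideal linear scaling.

    aggregate_mb_s[i] is the total MB/sec observed with i + 1 nodes reading.
    A value near 1.0 means the storage kept up as that node was added.
    """
    return [
        total / (single_node_mb_s * (i + 1))
        for i, total in enumerate(aggregate_mb_s)
    ]


# Figures from the example: 20 MB/sec per stream, 80 MB/sec with 4 streams.
print(scaling_efficiency([20, 40, 60, 80], 20))  # [1.0, 1.0, 1.0, 1.0]
```

A ratio that drops well below 1.0 as streams are added points at the storage, not at RAC, as the bottleneck.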


Is Oracle RAC supported on logical partitions (i.e. LPARs) or other virtualization technologies?

Check http://www.oracle.com/technetwork/database/virtualizationmatrix-172995.html for more details on supported virtualization and partitioning technologies.


How should voting disks be implemented in an extended cluster environment?

Can I use standard NFS for the third site voting disk?

Standard NFS is supported only for the tie-breaking voting disk in an extended cluster environment, subject to platform and mount option restrictions. Otherwise, just as with database files, we only support voting files on certified NAS devices with the appropriate mount options. Please refer to My Oracle Support Document 359515.1 for a full description of the required mount options.


What are the cdmp directories in the background_dump_dest used for?

These directories are produced by the diagnosability daemon process (DIAG). DIAG is a database process that, as one of its tasks, performs cache dumping. DIAG dumps trace information to file when it discovers the death of an essential process (foreground or background) in the local instance. A dump directory with a name beginning cdmp_ is created in the bdump or background_dump_dest directory, and all the trace dump files DIAG creates are placed in this directory.


How do I gather all relevant Oracle and OS log/trace files in an Oracle RAC cluster to provide to Support?

We recommend installing TFA (Trace File Analyzer) in every cluster.

It is a great tool for collecting the different logs across the cluster for database and cluster diagnostics. It can be run manually, automatically, or at any given interval. It is included in 12.1.0.2.
If you are on 12.1.0.1 you need to download it from My Oracle Support Document 1513912.1.

11.2.0.3 Grid Infrastructure deployments do not include TFA, but 11.2.0.4 deployments do.

TFA narrows the data to what is relevant for the time range you are analyzing, creates a zip file, and uploads it to Support.

The TFA analyzer takes this zip file and provides an easy way to navigate the data and to show relations between logs across nodes, making the data easier to analyze.


How should one review the ability to scale out to more nodes in your cluster?

Once a customer is running RAC on a two-node cluster and wants to see how far it can actually scale, the following tips are handy:
1. Ensure the workload is realistic enough that it does not have false bottlenecks.
2. Tune the application so that it is reasonably scalable on the current RAC environment.
3. Make sure you are measuring a valid scalability metric. This should be either completing very large batch jobs more quickly (via parallelism) or supporting a greater number of short transactions in a shorter time.
4. Actual scalability will vary for each application and its bottlenecks, hence the request to do the above items. You would see similar scalability when scaling up on an SMP.
5. For failover, see what happens if you lose a node. If you have 2 nodes, you lose half your power, so you either really get into trouble or must carry lots of extra capacity.
6. Measure whether load balancing is working properly. Make sure you are using RCLB and a FAN-aware connection pool.
7. The customer should also test using DB Services.
8. Get familiar with EM GC to manage the cluster; it helps eliminate much of the complexity of managing many nodes.
9. Why stop at 6 nodes? A maximum of 3-way messaging ensures RAC can scale much, much further.
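The failover tip can be made concrete with a little arithmetic. The sketch below assumes identical nodes and an evenly spread workload, which real clusters only approximate; the function name is hypothetical.

```python
def capacity_after_node_loss(nodes, lost=1):
    """Fraction of total cluster capacity that remains after losing `lost` nodes."""
    if nodes < 1 or not 0 <= lost <= nodes:
        raise ValueError("need nodes >= 1 and 0 <= lost <= nodes")
    return (nodes - lost) / nodes


# Compare how much capacity survives a single node failure at different sizes.
for n in (2, 4, 6):
    print(n, capacity_after_node_loss(n))
```

With two nodes a single failure leaves half the capacity, while with more nodes each failure removes a smaller share, which is one reason to test beyond two nodes.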


Can I ignore 10.2 CLUVFY on Solaris warning about failed package existence checks?

The complete error is:

Package existence check failed for "SUNWscucm:3.1".

Package existence check failed for "SUNWudlmr:3.1".

Package existence check failed for "SUNWudlm:3.1".

Package existence check failed for "ORCLudlm:Dev_Release_06/11/04,_64bit_3.3.4.8_reentrant".

Package existence check failed for "SUNWscr:3.1".

Package existence check failed for "SUNWscu:3.1".

Cluvfy checks all possible prerequisites and reports whether your system passed the checks or not. You should then cross reference with the install guide to see if the checks that failed are required for your type of installation. In the above case, if you are not planning on using Sun Cluster, then you can continue the install.


What is a CVU component?

CVU (Cluster Verification Utility) supports the notion of component verification. The verifications in this category are not associated with any specific stage. The user can verify the correctness of a specific cluster component. A component can range from a basic one, such as free disk space, to a complex one, such as the CRS stack. The integrity check for the CRS stack transparently spans verification of the multiple sub-components associated with it. This encapsulation of a set of tasks within a single component verification makes the check much easier for the user.


Why does cluvfy report "unknown" for a particular node?

According to the Cluster Verification Utility Reference in the Clusterware Administration and Deployment Guide of the 12c documentation:

If a cluvfy command responds with UNKNOWN for a particular node, then this is because CVU cannot determine whether a check passed or failed. The cause could be a loss of reachability or the failure of user equivalence to that node. The cause could also be any system problem that was occurring on that node when CVU was performing a check.

The following is a list of possible causes for an UNKNOWN response:

  • The node is down
  • Executables that CVU requires are missing in Grid_home/bin or the Oracle home directory
  • The user account that ran CVU does not have privileges to run common operating system executables on the node
  • The node is missing an operating system patch or a required package
  • The node has exceeded the maximum number of processes or maximum number of open files, or there is a problem with IPC segments, such as shared memory or semaphores

What are the requirements for CVU?

According to the Cluster Verification Utility Reference in the Clusterware Administration and Deployment Guide of the 12c documentation, the CVU requirements are:

  • At least 30 MB free space for the CVU software on the node from which you run CVU

  • A work directory with at least 25 MB free space on each node. The default location of the work directory is /tmp on Linux and UNIX systems, and the value specified in the TEMP environment variable on Windows systems. You can specify a different location by setting the CV_DESTLOC environment variable.

    When using CVU, the utility attempts to copy any needed information to the CVU work directory. It checks for the existence of the work directory on each node. If it does not find one, then it attempts to create one. Make sure that the CVU work directory either exists on all nodes in your cluster or proper permissions are established on each node for the user running CVU to create that directory.

  • Java 1.4.1 on the local node


What about discovery? Does CVU discover installed components?

CVU performs system checks in preparation for installation, patch updates and/or other system changes. Checks performed by CVU can be:

  • Free disk space
  • Clusterware Integrity
  • Memory
  • Processes
  • Other important cluster components
  • All available network interfaces
  • Shared Storage
  • Clusterware home

For more information please check the Cluster Verification Utility Reference from the Clusterware Administration and Deployment Guide from the 12c Documentation.


How do I check Oracle RAC certification?

Please refer to My Oracle Support (https://support.oracle.com/) for information regarding certification of Oracle RAC and all other products in the stack.


Is Veritas Storage Foundation supported with Oracle RAC?

Veritas certifies Veritas Storage Foundation for Oracle RAC with each release. Check Veritas Support Matrix for the latest details. Also visit My Oracle Support for a list of certified 3rd party products with Oracle RAC.


Can I use ASM as the mechanism to mirror the data in an Extended RAC cluster?

Yes, please refer to the Extended Clusters Technical Brief for more information on Extended Clusters and the RAC Stack.


What are the changes in memory requirements from moving from single instance to RAC?

If you are keeping the workload requirements per instance the same, then about 10% more buffer cache and 15% more shared pool is needed. The additional memory requirement is due to data structures for coherency management. The values are heuristic and are mostly upper bounds. Actual resource usage can be monitored by querying current and maximum columns for the gcs resource/locks and ges resource/locks entries in V$RESOURCE_LIMIT.

But in general, please take into consideration that memory requirements per instance are reduced when the same user population is distributed over multiple nodes. In this case:

Assuming the same user population, with N the number of nodes and M the buffer cache of a single system, the buffer cache needed per instance is:

(M / N) + ((M / N) * 0.10) [ + extra memory to compensate for failed-over users ]

For example, with M = 2G, N = 2, and no extra memory for failed-over users:

= (2G / 2) + ((2G / 2) * 0.10)

= 1G + 100M
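The same calculation as a small helper, useful for trying other values of M and N. This is a sketch of the formula above only: the function name and the use of megabytes are assumptions, and the 10% figure is the heuristic upper bound quoted in this answer.

```python
def buffer_cache_per_instance_mb(single_system_mb, nodes, failover_extra_mb=0):
    """Apply (M / N) + ((M / N) * 0.10) plus optional failover headroom."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    share = single_system_mb / nodes
    return share + share * 0.10 + failover_extra_mb


# M = 2G (2048 MB), N = 2, no allowance for failed-over users:
# roughly 1G plus 100M per instance, as in the worked example.
print(round(buffer_cache_per_instance_mb(2048, 2)))  # 1126
```

Passing a non-zero failover_extra_mb models the bracketed term, i.e. the memory set aside for users who fail over from a lost node.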


What are the default values for the command line arguments?

Here are the default values and behavior for different stage and component commands:

For component nodecon:
If no -i or -a argument is provided, then cluvfy runs in discovery mode.

For component nodereach:
If no -srcnode argument is provided, then the local node (the node of invocation) is used as the source node.

For components cfs, ocr, crs, space, clumgr:
If no -n argument is provided, then the local node will be used.

For components sys and admprv:
If no -n argument is provided, then the local node will be used.
If no -osdba argument is provided, then 'dba' will be used.
If no -orainv argument is provided, then 'oinstall' will be used.

For component peer:
If no -osdba argument is provided, then 'dba' will be used.
If no -orainv argument is provided, then 'oinstall' will be used.

For stage -post hwos:
If no -s argument is provided, then cluvfy runs in discovery mode.

For stage -pre clusvc:
If no -c argument is provided, then cluvfy will skip OCR related checks.
If no -q argument is provided, then cluvfy will skip voting disk related checks.
If no -osdba argument is provided, then 'dba' will be used.
If no -orainv argument is provided, then 'oinstall' will be used.

For stage -pre dbinst:
If -cfs_oh flag is not specified, then cluvfy will assume Oracle home is not on a shared file system.
If no -osdba argument is provided, then 'dba' will be used.
If no -orainv argument is provided, then 'oinstall' will be used.


Do I have to be root to use CVU?

No. CVU is intended for database and system administrators. CVU assumes that the current user is the Grid/Database user.


What is nodelist?

Nodelist is a comma-separated list of hostnames without domain. Cluvfy ignores any domain while processing the nodelist. If removing the domain leaves duplicate names, cluvfy eliminates the duplicates while processing. Wherever supported, you can use '-n all' to check all the cluster nodes.
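The behavior described here can be mimicked in a few lines, which is handy when building a nodelist in a script. This is only an illustration of the described rules, not cluvfy's actual implementation; the function name and hostnames are hypothetical.

```python
def normalize_nodelist(nodelist):
    """Strip domains from a comma-separated node list and drop duplicates."""
    hosts = []
    for entry in nodelist.split(","):
        host = entry.strip().split(".")[0]  # keep only the part before the domain
        if host and host not in hosts:
            hosts.append(host)
    return hosts


print(normalize_nodelist("node1.example.com,node2,node1"))  # ['node1', 'node2']
```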


How do I check minimal system requirements on the nodes?

The component verification command sys is meant for that. To check the system requirement for RAC, use '-p database' argument. To check the system requirement for CRS, use '-p crs' argument.


How do I get detailed output of a check?

Cluvfy supports a verbose feature. By default, cluvfy runs in non-verbose mode and reports only the summary of a test. To get detailed output of a check, use the '-verbose' flag on the command line. This produces detailed output of individual checks and, where applicable, shows per-node results in tabular form.


Why does the peer comparison with -refnode say passed when the group or user does not exist?

Peer comparison with the -refnode feature acts like a baseline feature. It compares the system properties of other nodes against the reference node. If a value does not match (is not equal to the reference node value), it is flagged as a deviation from the reference node. If a group or user does not exist on the reference node or on the other node, the check reports 'matched', since there is no deviation from the reference node. For the same reason, it reports 'mismatched' for a node with more total memory than the reference node.
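The baseline logic can be sketched as a plain equality check per property. This is an illustration of the behavior described above, not CVU code; the function name, property names, and values are hypothetical, and None stands for "does not exist on this node".

```python
def compare_to_reference(reference, other):
    """Label each property 'matched' or 'mismatched' against the reference node.

    A property missing on both nodes compares equal (None == None) and is
    therefore reported as matched, because nothing deviates from the reference.
    """
    keys = sorted(set(reference) | set(other))
    return {
        key: "matched" if reference.get(key) == other.get(key) else "mismatched"
        for key in keys
    }


ref_node = {"group_dba": None, "total_memory_mb": 8192}
other_node = {"group_dba": None, "total_memory_mb": 16384}
print(compare_to_reference(ref_node, other_node))
# {'group_dba': 'matched', 'total_memory_mb': 'mismatched'}
```

The comparison says nothing about whether a value is good or bad, only whether it equals the reference, which is why more memory still counts as a mismatch.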


At what point is cluvfy usable? Can I use cluvfy before installing Oracle Clusterware?

You can run cluvfy at any time, even before CRS installation. In fact, cluvfy is designed to assist the user as soon as the hardware and OS are up. If you invoke a command that requires CRS or RAC on the local node, cluvfy reports an error if those required products are not yet installed.

Cluvfy can also be invoked after installation to check whether any hardware components added since the install (such as more shared disks) are accessible from all the nodes.


Is there a way to compare nodes?

You can use the peer comparison feature of cluvfy for this purpose. The command 'comp peer' will list the values of different nodes for several pre-selected properties. You can use the peer command with -refnode argument to compare those properties of other nodes against the reference node.


What is a stage?

CVU supports the notion of stage verification. It identifies all the important stages in RAC deployment and provides each stage with its own entry and exit criteria. The entry criteria for a stage define a specific set of verification tasks to be performed before initiating that stage. This pre-check keeps the user from entering a stage unless its prerequisite conditions are met. The exit criteria for a stage define another specific set of verification tasks to be performed after completion of the stage. The post-check ensures that the activities for that stage have been completed successfully. It identifies any stage-specific problem before it propagates to subsequent stages, where its root cause would be difficult to find. An example of a stage is "pre-check of database installation", which checks whether the system meets the criteria for a RAC install.


How do I know about cluvfy commands? The usage text of cluvfy does not show individual commands.

Cluvfy has context-sensitive help built in. It shows the most appropriate usage text based on the command line arguments. If you type 'cluvfy' at the command prompt, cluvfy displays the high-level generic usage text, which covers valid stage and component syntax. If you type 'cluvfy comp -list', cluvfy shows the valid components with a brief description of each. If you type 'cluvfy comp -help', cluvfy shows the detailed syntax for each valid component. Similarly, 'cluvfy stage -list' and 'cluvfy stage -help' list the valid stages and their syntax, respectively. If you type an invalid command, cluvfy shows the appropriate usage for that particular command. For example, if you type 'cluvfy stage -pre dbinst', cluvfy shows the syntax for the pre-check of the dbinst stage.


Can I check if the storage is shared among the nodes?

Yes, you can use 'comp ssa' command to check the sharedness of the storage. Please refer to the known issues section for the type of storage supported by cluvfy in the cluvfy help command output.


Does Database blocksize or tablespace blocksize affect how the data is passed across the interconnect?

Oracle ships database block buffers. Blocks in a tablespace configured for 16K result in a 16K data buffer being shipped, blocks residing in a tablespace with the base block size (8K) are shipped as base blocks, and so on. The data buffers are broken down into packets of MTU size.

Newer releases include optimizations, such as compression, that are beyond the scope of this FAQ.
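To get a feel for what "broken down into packets of MTU size" means, the sketch below estimates how many IP fragments one shipped block needs. It assumes the block travels as a single UDP datagram over IPv4 with standard 20-byte IP and 8-byte UDP headers, and it ignores any Oracle message header and the optimizations mentioned above, so treat the counts as rough lower bounds. The function name is hypothetical.

```python
import math


def ip_fragments(block_bytes, mtu_bytes, ip_header=20, udp_header=8):
    """Estimate the IPv4 fragments needed for one block sent as one UDP datagram."""
    # Every fragment except the last must carry a payload that is a multiple of 8.
    payload_per_fragment = (mtu_bytes - ip_header) // 8 * 8
    if payload_per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    return math.ceil((block_bytes + udp_header) / payload_per_fragment)


for block in (8192, 16384):
    for mtu in (1500, 9000):
        print(f"block={block} mtu={mtu} fragments={ip_fragments(block, mtu)}")
```

Under these assumptions an 8K block needs 6 fragments at an MTU of 1500 but fits in a single packet at 9000, which illustrates why the MTU setting matters for larger block sizes.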


What is Oracle's position with respect to supporting RAC on Polyserve CFS?

Please check the certification matrix available through My Oracle Support for your specific release.


How do I check network or node connectivity related issues?

Use component verification commands such as 'nodereach' or 'nodecon' for this purpose. For the detailed syntax of these commands, type cluvfy comp -help at the command prompt. If the 'cluvfy comp nodecon' command is invoked without -i, cluvfy attempts to discover all the available interfaces and the corresponding IP addresses and subnets. It then tries to verify node connectivity per subnet. You can run this command in verbose mode to find the mappings between the interfaces, IP addresses and subnets. You can check connectivity among the nodes over specific interfaces by naming them with the -i argument.


What is CVU? What are its objectives and features?

CVU brings ease to RAC users by verifying all the important components that need to be verified at different stages in a RAC environment. The wide domain of deployment of CVU ranges from initial hardware setup through fully operational cluster for RAC deployment and covers all the intermediate stages of installation and configuration of various components. The command line tool is cluvfy. Cluvfy is a non-intrusive utility and will not adversely affect the system or operations stack.


Is there a cluster file system (CFS) Available for Linux?

Yes, ACFS (ASM Cluster File System with Oracle Database 11g Release 2) and OCFS (Oracle Cluster Filesystem) are available for Linux. The following My Oracle Support document has information for obtaining the latest version of OCFS:

Document 238278.1 - How to find the current OCFS version for Linux


Is OCFS2 certified with Oracle RAC 10g?

Yes. See Certify to find out which platforms are currently certified.


How do I check the Oracle Clusterware stack and other sub-components of it?

Cluvfy provides commands to check a particular sub-component of the CRS stack as well as the whole CRS stack. You can use the 'comp ocr' command to check the integrity of OCR. Similarly, you can use the 'comp crs' and 'comp clumgr' commands to check the integrity of the crs and cluster manager sub-components. To check the entire CRS stack, run the stage command 'cluvfy stage -post crsinst'.


Where can I find the CVU trace files?

CVU log files can be found under the $CV_HOME/cv/log directory. The log files are automatically rotated, and the latest log file has the name cvutrace.log.0. It is a good idea to clean up unwanted log files or archive them to reclaim disk space.

In recent releases, CVU trace files are generated by default. Setting SRVM_TRACE=false before invoking cluvfy disables the trace generation for that invocation.


Can I use Oracle Clusterware for failover of the SAP Enqueue and VIP services when running SAP in a RAC environment?

Oracle has created sapctl to do this, and it is available for certain platforms. SAPCTL will be available for download on SAP Services Marketplace for AIX and Linux. Please check the Marketplace for other platforms.


How do I turn on tracing?

Set the environment variable SRVM_TRACE to true. For example, in tcsh, "setenv SRVM_TRACE true" turns on tracing. It may also help to run cluvfy with the -verbose flag and capture the session, for example in a Bourne-style shell:
$ script run.log
$ export SRVM_TRACE=TRUE
$ cluvfy <stage or component arguments> -verbose
$ exit


How do I check whether OCFS is properly configured?

You can use the cluvfy component command 'cfs' to check this. Provide the OCFS file system you want to check through the -f argument. Note that the sharedness check for the file system is supported for OCFS version 1.0.14 or higher.


My customer is about to install 10.2.0.2 Clusterware on new Linux machines. He gets a "No ORACM running" error when running rootpre.sh, and the script exits. Should he worry about this message?

It is an informational message. Generally, for such scripts, you can issue echo $? afterwards to check that the script returned zero. The message is basically saying that it did not find an oracm. If the customer were installing 10g on an existing 9i cluster (which would have oracm), this message would be serious. But since the customer is installing on a fresh new box, they can continue the install.


Can different releases of Oracle RAC be installed and run on the same physical Linux cluster?

Yes.

The detailed answer is broken into the following categories:

  • Oracle Version 10g and above only
    We only require that the Oracle Clusterware version be higher than or equal to the Database release. Customers can run multiple releases on the same cluster.
  • Oracle Version 10g and higher alongside Oracle Version less than 10g
    Oracle Clusterware (CRS) does not support an Oracle 9i RAC database, so you will have to leave the current configuration in place. You can install Oracle Clusterware and Oracle RAC 10g or 11g into the same cluster. On Windows and Linux, you must run the 9i Cluster Manager for the 9i Database and Oracle Clusterware for the 10g Database. When you install Oracle Clusterware, your 9i srvconfig file will be converted to the OCR. Oracle 9i RAC, Oracle RAC 10g, and Oracle RAC 11g will use the OCR. Do not restart the 9i gsd after you have installed Oracle Clusterware. Remember to check Certify for details of which vendor clusterware can be run with Oracle Clusterware. Oracle Clusterware must be at the highest level (down to the patchset). For example, Oracle Clusterware 11g Release 2 supports Oracle RAC 10g and Oracle RAC 11g databases. Oracle Clusterware 10g can only support Oracle RAC 10g databases.

Oracle Clusterware fails to start after a reboot due to permissions on raw devices reverting to default values. How do I fix this?

After a successful installation of Oracle Clusterware, a simple reboot can leave Oracle Clusterware unable to start. This is because the permissions on the raw devices for the OCR and voting disks, e.g. /dev/raw/raw{x}, revert to their default values (root:disk) and are inaccessible to Oracle. This change of behavior started with the 2.6 kernel, in RHEL4, OEL4, RHEL5, OEL5, SLES9 and SLES10. In RHEL3 the raw devices maintained their permissions across reboots, so this symptom was not seen.

To fix this on RHEL4, OEL4 and SLES9, create /etc/udev/permissions.d/40-udev.permissions (you must choose a number lower than 50). You can do this by copying /etc/udev/permissions.d/50-udev.permissions and removing the lines that are not needed. Do not edit 50-udev.permissions directly: it gets replaced by upgrades, and a typo in it can render the system unusable. Example permissions file:
# raw devices
raw/raw[1-2]:root:oinstall:0640
raw/raw[3-5]:oracle:oinstall:0660

Note that this applies to all raw device files; here only the voting and OCR devices are specified.

On RHEL5, OEL5 and SLES10 a different file is used, /etc/udev/rules.d/99-raw.rules; note that the number must now be higher than 50. The syntax of the rules also differs from that of the permissions file. An example and a detailed explanation are given in Document 414897.1.


Customer did not load the hangcheck-timer before installing RAC. Can the customer just load the hangcheck-timer?

YES. The hangcheck-timer is a kernel module that is shipped with the Linux kernel; all you have to do is load it.

For the exact steps and more details, see Document 726833.1.


After installing patchset 9013 and patch_2313680 on Linux, the startup was very slow

Please carefully read the following new information about configuring Oracle Cluster Management on Linux, provided as part of the patch README:

Three parameters affect the startup time:

soft_margin (defined at watchdog module load)

-m (watchdogd startup option)

WatchdogMarginWait (defined in nmcfg.ora).

WatchdogMarginWait is calculated using the formula:

WatchdogMarginWait = soft_margin(msec) + -m + 5000(msec).

[5000(msec) is hardcoded]

Note that soft_margin is measured in seconds, whereas -m and WatchdogMarginWait are measured in milliseconds.

Based on benchmarking, it is recommended to set soft_margin between 10 and 20 seconds. Use the same value for -m (converted to milliseconds) as used for soft_margin. Here is an example:

soft_margin=10 -m=10000 WatchdogMarginWait = 10000+10000+5000=25000
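The unit conversion is the easy part to get wrong, so here is the formula as a small helper. It is a sketch of the arithmetic only; the function and parameter names are hypothetical and nothing here reads or writes nmcfg.ora.

```python
def watchdog_margin_wait_ms(soft_margin_s, m_ms, fixed_ms=5000):
    """WatchdogMarginWait = soft_margin (converted to msec) + -m + 5000 msec."""
    return soft_margin_s * 1000 + m_ms + fixed_ms


# The example above: soft_margin=10 (seconds), -m=10000 (milliseconds).
print(watchdog_margin_wait_ms(10, 10000))  # 25000
```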

If CPU utilization in your system is high and you experience unexpected node reboots, check the wdd.log file. If there are any 'ping came too late' messages, increase the value of the above parameters.


Is there a way to verify that the Oracle Clusterware is working properly before proceeding with RAC install?

Yes. You can use the post-check command for cluster services setup (-post clusvc) to verify CRS status. A more appropriate test is the pre-check command for database installation (-pre dbinst), which checks whether the current state of the system is suitable for a RAC install.


How do I configure my RAC Cluster to use the RDS Infiniband?

Ensure that the IB (Infiniband) Card is certified for the OS, Driver, Oracle version etc.

You may need to relink Oracle using the command

$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ipc_rds ioracle

You can check your interconnect through the alert log at startup. Check for the string “cluster interconnect IPC version:Oracle RDS/IP (generic)” in the alert.log file.

See Document 751343.1 for more details.


Why is validateUserEquiv failing during install (or cluvfy run)?

SSH must be set up as per the pre-installation tasks. File permissions must also be set as described below for features such as public key authentication to work. If the permissions are not correct, public key authentication fails and falls back to password authentication, with no helpful message as to why. The following server configuration files and/or directories must be owned by the account owner or by root, and GROUP and WORLD WRITE permission must be disabled.

$HOME
$HOME/.rhosts
$HOME/.shosts
$HOME/.ssh
$HOME/.ssh/authorized_keys
$HOME/.ssh/authorized_keys2 #OpenSSH specific for ssh2 protocol.
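A quick way to audit these rules is to check owner and write bits for each path. The following is a minimal sketch for Linux and UNIX systems; the function name is hypothetical, the file names assume OpenSSH's usual authorized_keys naming, and it checks only the two conditions stated above.

```python
import os
import stat


def permission_problems(paths, allowed_uids):
    """Return (path, reason) pairs for entries that break the ownership/mode rules."""
    problems = []
    for path in paths:
        try:
            info = os.stat(path)
        except FileNotFoundError:
            continue  # optional files such as .rhosts may simply not exist
        if info.st_uid not in allowed_uids:
            problems.append((path, "not owned by the account owner or root"))
        if info.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            problems.append((path, "group or world writable"))
    return problems


home = os.path.expanduser("~")
names = (".rhosts", ".shosts", ".ssh", ".ssh/authorized_keys", ".ssh/authorized_keys2")
candidates = [home] + [os.path.join(home, name) for name in names]
# Run as the software owner on each node; uid 0 is root.
for path, reason in permission_problems(candidates, {os.getuid(), 0}):
    print(f"{path}: {reason}")
```

No output means none of the listed paths violates the two rules for the current user.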

SSH (from OUI) will also fail if you have not connected to each machine in your cluster as per the note in the installation guide:

The first time you use SSH to connect to a node from a particular system, you may see a message similar to the following:

The authenticity of host 'node1 (140.87.152.153)' can't be established. RSA key fingerprint is 7z:ez:e7:f6:f4:f2:4f:8f:9z:79:85:62:20:90:92:z9.
Are you sure you want to continue connecting (yes/no)?

Enter yes at the prompt to continue. You should not see this message again when you connect from this system to that node. Answering yes causes an entry to be added to the known_hosts file in the .ssh directory, which is why subsequent connection requests do not ask again.
This is known to work on Solaris and Linux but may work on other platforms as well.


What is Runtime Connection Load Balancing?

Runtime Connection Load Balancing enables the connection pool to route incoming work requests to the available database connection that will provide the best service. This gives the best service times globally, and routing responds quickly to changing conditions in the system. Oracle has implemented Runtime Connection Load Balancing with ODP.NET and JDBC connection pools. It is tightly integrated with the automatic workload balancing features introduced with Oracle Database 10g, i.e. Services, Automatic Workload Repository, and the new Load Balancing Advisory.


How do I enable the load balancing advisory?

Load balancing advisory requires the use of services and Oracle Net connection load balancing.
To enable it on the server, set a goal (service_time or throughput) and set CLB_GOAL=SHORT for the service.
On the client, you must be using a connection pool.
For JDBC, enable the datasource parameter FastConnectionFailoverEnabled.
For ODP.NET enable the datasource parameter Load Balancing=true.


What are the network requirements for an extended RAC cluster?

Necessary Connections

Interconnect, SAN, and IP Networking need to be kept on separate channels, each with required redundancy. Redundant connections must not share the same Dark Fiber (if used), switch, path, or even building entrances. Keep in mind that cables can be cut.

The SAN and Interconnect connections need to be on dedicated point-to-point connections. No WAN or shared connection is allowed. Traditional cables are limited to about 10 km if you are to avoid using repeaters. Dark Fiber networks allow the communication to occur without repeaters. Since latency is limited, Dark Fiber networks allow for a greater distance between the nodes. The disadvantage of Dark Fiber networks is that they can cost hundreds of thousands of dollars, so generally they are only an option if they already exist between the two sites.

If direct connections are used (for short distances), this is generally done by just stringing long cables from a switch. If DWDM or CWDM is used, then these are directly connected via a dedicated switch on either side.
Note of caution: Do not run the RAC Interconnect over a WAN. This is the same as running it over the public network, which is not supported, and other uses of the network (e.g. large FTPs) can cause performance degradation or even node evictions.

For SAN networks make sure you are using SAN buffer credits if the distance is over 10km.
If Oracle Clusterware is being used, we also require that a single subnet be setup for the public connections so we can fail over VIPs from one side to another.


What is the maximum distance between nodes in an extended RAC environment?

The high impact of latency creates practical limitations as to where this architecture can be deployed. While there is no fixed distance limitation, the additional latency on each I/O round trip and on one-way cache fusion transfers will have an effect on performance as distance increases. For example, tests at 100 km showed a 3-4 ms impact on I/O and a 1 ms impact on cache fusion; the greater the distance, the greater the impact on performance. This architecture fits best where the 2 datacenters are relatively close (<~25 km) and the impact is negligible. Most customers implement under this distance, with only a handful above it, and the farthest known example is at 100 km. Customers planning larger distances than those commonly implemented may want to estimate or measure the performance hit on their application before implementing. Do ensure a proper setup of SAN buffer credits to limit the impact of distance at the I/O layer.
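As a rough sanity check on the distance figures above, the propagation delay added by the fiber itself can be estimated. The sketch below assumes light covers one kilometer of fiber in about 5 microseconds (a common rule of thumb, not a figure from this note) and ignores switches, DWDM equipment and protocol overhead, so measured values will be higher. The function names are illustrative only.

```python
# Rough estimate of fiber propagation delay for an extended cluster.
# ASSUMPTION: about 5 microseconds per kilometer of fiber, one way.
US_PER_KM_ONE_WAY = 5.0


def one_way_delay_ms(distance_km: float) -> float:
    """One-way propagation delay in milliseconds."""
    return distance_km * US_PER_KM_ONE_WAY / 1000.0


def round_trip_delay_ms(distance_km: float) -> float:
    """Round-trip propagation delay in milliseconds."""
    return 2 * one_way_delay_ms(distance_km)


for km in (10, 25, 100):
    print(f"{km:>4} km: one way {one_way_delay_ms(km):.3f} ms, "
          f"round trip {round_trip_delay_ms(km):.3f} ms")
```

Under this assumption a single round trip at 100 km costs about 1 ms; an operation that needs more than one round trip pays that cost more than once.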


Are crossover cables supported as an interconnect with Oracle RAC on any platform?

NO. CROSS OVER CABLES ARE GENERALLY NOT SUPPORTED. The requirement is to use a switch.

The only exception is the Oracle Database Appliance (ODA), for which crossover cables are used.

Detailed Reasons:

\1) cross-cabling limits the expansion of Oracle RAC to two nodes.

\2) cross-cabling is unstable. Experience has also shown that a lot of adapters misbehave when used in a crossover configuration, leading to unexpected behavior in the cluster.


What do I do if I see GC CR BLOCK LOST in my top 5 Timed Events in my AWR Report?

You should never see this event or BLOCK RETRY events. A number of issues can lead to waits on this event; they are covered in Document 563566.1. Work with your system administrator and/or network administrator to diagnose the issue.

A common symptom of packet loss is reflected in netstat -s, as shown below:

Ip:
84884742 total packets received
1201 fragments dropped after timeout
3384 packet reassembles failed

Fragments dropped or packet reassemblies failed should either be 0 or not increasing over time.
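Checking that these counters are "not increasing over time" means comparing two samples. The following is a minimal sketch that parses two captured `netstat -s` outputs and reports which of the two counters grew. It assumes the counter lines are worded exactly as in the sample above (wording varies between platforms and versions), and the function names are hypothetical.

```python
import re

# Counter lines as they appear in the sample netstat -s output above.
# ASSUMPTION: other platforms may word these lines differently.
PATTERNS = {
    "fragments_dropped": re.compile(r"(\d+)\s+fragments dropped after timeout"),
    "reassembly_failed": re.compile(r"(\d+)\s+packet reassembles failed"),
}


def parse_ip_counters(netstat_text: str) -> dict:
    """Extract the two counters; a missing line is treated as 0."""
    counters = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(netstat_text)
        counters[name] = int(match.group(1)) if match else 0
    return counters


def increasing_counters(before: str, after: str) -> dict:
    """Return {counter: delta} for every counter that grew between samples."""
    old, new = parse_ip_counters(before), parse_ip_counters(after)
    return {name: new[name] - old[name] for name in new if new[name] > old[name]}


sample_1 = """Ip:
84884742 total packets received
1201 fragments dropped after timeout
3384 packet reassembles failed
"""
# Simulated second sample taken later, with one counter increased.
sample_2 = sample_1.replace("1201", "1250")

print(increasing_counters(sample_1, sample_2))
```

An empty result means neither counter moved between the two samples; any non-empty result is worth following up with the system or network administrator.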

ifconfig -a:

eth0 Link encap:Ethernet HWaddr
inet addr: Bcast: Mask:
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:21721236 errors:135 dropped:0 overruns:0 frame:95
TX packets:273120 errors:0 dropped:0 overruns:0 carrier:0

A high number of errors that increases over time is an indicator of a problem that should be diagnosed and fixed. Note 563566.1 provides a useful guide to common issues and solutions that can be used to fix the problem.


Will adding a new instance to my Oracle RAC database (new node to the cluster) allow me to scale the workload?

YES! Oracle RAC allows you to dynamically scale out your workload by adding another node to the cluster. You must remember that adding more work to the database means that in addition to the CPU and Memory that the new node brings, you will have to ensure that your I/O subsystem can support the additional I/O requirements. In an Oracle RAC environment, you need to look at the total I/O across all instances in the cluster.

Are Red Hat GFS and GULM certified for DLM?

Both are part of Red Hat RHEL 5. Oracle Database 10g Release 2 on Linux x86 and Linux x86-64 is certified on OEL5 and RHEL5, as per Certify. GFS is not certified yet; certification by Red Hat is in progress. OCFS2 is certified and is the preferred choice for Oracle. ASM is the recommended storage for the database. Since GFS is part of the RHEL5 distribution and Oracle fully supports RHEL under the Unbreakable Linux Program, Oracle will support GFS as part of RHEL5 for customers buying Unbreakable Linux Support. This only applies to RHEL5 and not to RHEL4, where GFS is distributed for an additional fee.


How do I configure raw devices in order to install Oracle Clusterware 10g on RHEL5 or OEL5?

The raw device OS support scripts, such as /etc/sysconfig/rawdevices, are not shipped on RHEL5 or OEL5 because raw devices are being deprecated on Linux. This means that in order to install Oracle Clusterware 10g you have to manually bind the raw devices to the block devices for the OCR and voting disks so that the 10g installer will proceed without error.

Refer to Document 465001.1 for exact details on how to do the above.

Oracle Clusterware 11g and higher releases do not require this configuration, since the installer can detect and use block devices directly.


Is Server Side Load Balancing supported/recommended/proven technology in Oracle EBusiness Suite?

Yes, customers are using it successfully today. It is recommended to set up both client-side and server-side load balancing. Connections coming from the 8.0.6 home (Forms and CCM) are directed to a RAC instance based on the sequence in which it is listed in the TNS entry description list and may not get load balanced optimally. For Oracle RAC 10.2 or higher, do NOT set PREFER_LEAST_LOADED_NODE = OFF in your listener.ora.
Please set the CLB_GOAL on the service.


How do I change my Veritas SF RAC installation to use UDP instead of LLT?

Using UDP with Veritas Clusterware and Oracle RAC 10g seems to require an exception from Veritas, so this may be something you should check with them.

To make it easier for customers to convert their LLT environments to UDP, Oracle has created Patch 6846006 on 10.2.0.3, which contains the libraries that were overwritten by the Veritas installation (i.e., those mentioned above). Converting from specialized protocols to UDP requires a relink after the Oracle libraries have been restored. This needs a complete cluster shutdown and cannot be accomplished in a rolling fashion.

NOTE: Oracle RAC 11g will not support LLT for interconnect.


How to reorder or rename logical network interface (NIC) names in Linux

The Linux operating system assigns logical network interface (NIC) names based on device discovery during the boot process. This is fairly consistent between reboots. However, sometimes a kernel upgrade or driver update can affect discovery and cause the logical name to be different. For example, what used to be eth0 would now be eth1.

Udev rules can be used to persist logical NIC names. For more details on logical NIC names, please refer here.


How does UDP over Infiniband compare to UDP over Gigabit Ethernet when used for the RAC interconnect?

Infiniband in general provides lower latency and higher bandwidth than Ethernet and hence is commonly assumed to provide better performance. On the other hand, Infiniband infrastructures are generally more expensive than traditional Ethernet infrastructures, which may need to be considered depending on the use case.

Using Infiniband, customers can choose between two different protocols for the inter-instance communication of RAC-enabled database instances, while Oracle Clusterware will use UDP (starting with 11g Rel. 2) over IPoIB (IP over Infiniband). By default, inter-instance communication of RAC-enabled database instances will use the same approach. Alternatively, RDS (Reliable Datagram Sockets) can be used with Infiniband, which then must be enabled explicitly by linking RDS protocol support into the database home.

Oracle Exadata Database Machines and other Engineered Systems use RDS for the inter-instance communication by default. For generic systems, RDS support is provided on certain hardware and software configurations. More details can be found here: Oracle technology site for Linux, Oracle technology site for Unix, Oracle technology site for Windows


Is the hangcheck timer still needed with Oracle RAC 10g and 11gR1?

YES! hangcheck-timer is required for 10g and 11gR1 (11.1.*). It is no longer needed in Oracle Clusterware 11gR2 and higher releases.

The hangcheck-timer module monitors the Linux kernel for extended operating system hangs that could affect the reliability of the RAC node ( I/O fencing) and cause database corruption. To verify the hangcheck-timer module is running on every node:

To ensure the module is loaded every time the system reboots, verify that the local system startup file (/etc/rc.d/rc.local) contains the command above.

For additional information please review the Oracle RAC Install and Configuration Guide (5-41) and Document 726833.1.


Can I have different servers in my Oracle RAC? Can they be from different vendors? Can they be of different sizes?

Oracle Real Application Clusters (RAC) requires all the nodes in a cluster to run the same operating system binary (i.e., all nodes must be Windows 2008 or all nodes must be Oracle Linux 6). All nodes must be the same architecture (i.e., all nodes must be 32-bit, or all nodes must be 64-bit, or all nodes must be HP-UX PA-RISC, since you cannot mix PA-RISC with Itanium).

Oracle RAC does support a cluster with nodes that have different hardware configurations. An example is a cluster with 3 nodes with 4 CPUs and another node with 6 CPUs. This can easily occur when adding a new node after the cluster has been in production for a while. For this type of configuration, customers must consider some additional features to get the optimal cluster performance. The servers used in the cluster can be from different vendors; this is fully supported as long as they run the same binaries. Since many customers implement Oracle RAC for high availability, you must make sure that your hardware vendor will support the configuration. If you have a failure, will you get support for the hardware configuration?

The installation of Oracle Clusterware expects the network interface to have the same name on all nodes in the cluster. If you are using different hardware, you may need to work with your operating system vendor to make sure the network interface names are the same on all nodes (e.g., eth0). Customers implementing uneven cluster configurations need to consider how they will balance the workload across the cluster. Some customers have chosen to manually assign different workloads to different nodes. This can be done using database services; however, it is often difficult to predict workloads, and the system cannot dynamically react to changes in workload. Changes to workload require the DBA to modify the service. You will also need to consider how you will survive failures in the cluster. Will the service levels be maintained if the larger node in the cluster fails? Especially in a small cluster, losing a node could affect the ability to continue processing the application workload.

The impact of the different sized nodes depends on how much difference there is in the size. If there is a large difference between the nodes in terms of memory and CPU size, then the "bigger" nodes will attract more load and, in the case of failure, the "smaller" node(s) will become overloaded. In such a case, static routing of workload via services may be advisable, e.g. for batch and certain other services, which can be suspended or stopped if the large node fails and the cluster has significantly reduced capacity. The general recommendation is that the nodes should be sized in such a way that the aggregated peak load of the large node(s) can be absorbed by the smaller node(s), i.e. the smaller nodes should have sufficient capacity to run the essential services alone. Another option is to add another small node to the cluster on demand in case the large one fails.
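The sizing recommendation above can be expressed as a simple capacity check. The sketch below is an illustration only: the node names, the unit (CPU cores) and the peak-load figures are made-up assumptions, and it conservatively requires the survivors to carry the entire peak load rather than only the essential services.

```python
def survivors_can_absorb(nodes: dict, failed: str) -> bool:
    """Check whether the surviving nodes can carry the whole cluster's peak load.

    nodes maps a node name to (capacity, peak_load), both in the same unit
    (for example CPU cores). The failed node's capacity is removed, while its
    peak load still has to be served by the remaining nodes.
    """
    total_peak_load = sum(load for _, load in nodes.values())
    surviving_capacity = sum(
        capacity for name, (capacity, _) in nodes.items() if name != failed
    )
    return total_peak_load <= surviving_capacity


# Hypothetical cluster: three 4-CPU nodes and one 6-CPU node.
cluster = {
    "node1": (4, 2.0),
    "node2": (4, 2.0),
    "node3": (4, 2.0),
    "node4": (6, 4.0),
}
print(survivors_can_absorb(cluster, failed="node4"))
```

If the check fails for the large node, that is the case where suspending non-essential services or adding a node on demand would need to be planned.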

It should also be noted that, especially if there is a large difference between the sizes of the nodes, the small nodes can slow down the larger node. This could be critical if the smaller node is very busy and must serve data to the large node.

To help balance workload across a cluster, Oracle RAC 10g Release 2 and above provides the Load Balancing Advisory (LBA). The load balancing advisory runs in an Oracle RAC database and monitors the work executed by the service on all instances where the service is active in the cluster. The LBA provides recommendations to the subscribed clients about the state of the service and where the client should direct connection requests. Setting the GOAL on the service activates the load balancing advisory. Clients that can utilize the load balancing advisory are Oracle JDBC Implicit Connection Cache, Oracle Universal Connection Pool for Java, Oracle Call Interface Session Pool, ODP.NET Connection Pool, and Oracle Net Services Connection Manager. The Oracle Listener also uses the Load Balancing Advisory if CLB_GOAL parameter is set to SHORT (recommended Best Practice if using an integrated Oracle Client mentioned here). If CLB_GOAL is set to LONG (default), the Listener will load balance the number of sessions for the service across the instances where the service is available. See the Oracle Real Application Clusters Administration and Deployment Guide for details on implementing services and the various parameter settings.


How many nodes can one have in an HP-UX/Solaris/AIX/Windows/Linux cluster?

Technically and since Oracle RAC 10g Release 2, 100 nodes are supported in one cluster. This includes running 100 database instances belonging to the same (production) database on this cluster, using the Oracle Database Enterprise Edition (EE) with the Oracle RAC option and Oracle Clusterware only (no third party / vendor cluster solution underneath).

Note that using the Oracle Database Standard Edition (SE), which includes the Oracle RAC functionality, further restrictions regarding the number of nodes per cluster apply. Also note that one cannot use a third party or vendor cluster for an Oracle Database Standard Edition based Oracle RAC cluster. For more information see the licensing information.

When using a third party / vendor cluster software the following limits apply (subject to change without notice):
Solaris Cluster: 8
HP-UX Service Guard: 16
HP Tru64: 8
IBM AIX:
* 8 nodes for Physical Shared (CLVM) SSA disk
* 16 nodes for Physical Shared (CLVM) non-SSA disk
* 128 nodes for Virtual Shared Disk (VSD)
* 128 nodes for GPFS
* Subject to storage subsystem limitations
Veritas: 8-16 nodes (check w/ Veritas)

Node limitations should always be checked with the cluster software vendor.


Are 3rd party cluster solutions supported on Linux?

For certified third party cluster solutions, please refer to the certification section of Oracle Support (My Oracle Support).

If a third party cluster solution is certified with Oracle Real Application Clusters (RAC) or Oracle Clusterware, it will be listed as certified. Note that Oracle RAC One Node certification follows Oracle RAC certification in principle.

Note also that no third party cluster solution is certified under Oracle RAC used with the Oracle Standard Edition, regardless of the operating system used. This is a licensing restriction and can be found in the Oracle Licensing guide.


How many nodes are supported or can be used in an Oracle RAC Database?

Technically and *since* Oracle RAC 10g Release 2, 100 nodes are supported in one cluster. This includes running 100 database instances belonging to the same (production) database on this cluster, using the Oracle Database Enterprise Edition (EE) with the Oracle RAC option and Oracle Clusterware only (no third party / vendor cluster solution underneath).

In previous releases, the DBCA (as a result of further MAXINSTANCES-parameter related restrictions) would only allow creating 63 instances per database. These restrictions have been lifted with Oracle 11g Release 1 and later versions, in favor of supporting 100 nodes as described. For completeness: With Oracle RAC 10g Release 1 the maximum was 63. In Oracle RAC 9i the maximum is platform specific due to the different cluster software support by different vendors.

Note that using the Oracle Database Standard Edition (SE), which includes the Oracle RAC functionality, further restrictions regarding the number of nodes per cluster apply. For more information, see: Special-Use Licensing


What are my options for setting the Load Balancing Advisory GOAL on a Service?

The load balancing advisory is enabled by setting the GOAL on your service, either through the PL/SQL DBMS_SERVICE package or the EM DBControl Clustered Database Services page. There are 3 options for GOAL:
NONE - Default setting; turns off the advisory.
THROUGHPUT - Work requests are directed based on throughput. This should be used when the work in a service completes at homogeneous rates. An example is a trading system where work requests are of similar lengths.
SERVICE_TIME - Work requests are directed based on response time. This should be used when the work in a service completes at various rates. An example is an internet shopping system where work requests are of various lengths.
Note: If using GOAL, you should set CLB_GOAL=SHORT.


Is Oracle Database on VMware supported? Is Oracle RAC on VMware supported?

Oracle Database support on VMware is outlined in My Oracle Support Document 249212.1. Effectively, for most customers, this means they are not willing to run production Oracle databases on VMware. Regarding Oracle RAC: the explicit statement not to run RAC on VMware was removed in 11.2.0.2 (November 2010).


What is 'cvuqdisk' rpm? Why should I install this rpm?

CVU requires root privilege to gather information about the SCSI disks during discovery. A small binary uses the setuid mechanism to query disk information as root. Note that this process is purely read-only, with no adverse impact on the system. To keep this secure, the binary is packaged in the cvuqdisk rpm, which needs root privilege to install on a machine. If this package is installed on all the nodes, CVU will be able to perform discovery and shared storage accessibility checks for SCSI disks. Otherwise, it complains about the missing package 'cvuqdisk'. Note that this package should be installed only on the RedHat Linux 3.0 distribution. Discovery of SCSI disks for RedHat Linux 2.1 is not supported.


What is the Load Balancing Advisory?

To assist in the balancing of application workload across designated resources, Oracle Database 10g Release 2 provides the Load Balancing Advisory. This Advisory monitors the current workload activity across the cluster and, for each instance where a service is active, provides a percentage value of how much of the total workload should be sent to this instance, as well as a service quality flag. The feedback is provided as an entry in the Automatic Workload Repository, and a FAN event is published. The easiest way for an application to take advantage of the load balancing advisory is to enable Runtime Connection Load Balancing with an integrated client.
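To make the percentage values concrete, the sketch below shows one way a client could use them: picking an instance for each work request with probability proportional to its advised percentage. This is an illustration of the idea only, not how the integrated Oracle clients implement Runtime Connection Load Balancing; the instance names and percentages are invented.

```python
import random


def pick_instance(advice: dict, rng: random.Random) -> str:
    """Pick an instance with probability proportional to its advised share.

    advice maps an instance name to the percentage of work the advisory
    suggests sending there (hypothetical values, not a real FAN payload).
    """
    instances = list(advice)
    weights = [advice[name] for name in instances]
    return rng.choices(instances, weights=weights, k=1)[0]


rng = random.Random(42)  # seeded so repeated runs behave the same
advice = {"orcl1": 60, "orcl2": 30, "orcl3": 10}

counts = {name: 0 for name in advice}
for _ in range(10_000):
    counts[pick_instance(advice, rng)] += 1
print(counts)
```

Over many requests the counts approach the advised 60/30/10 split, while any single request can still land on any instance with a non-zero share.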


A customer installed 10g Release 2 on Linux RH4 Update 2, 2.6.9-22.ELsmp #1 SMP x86_64 GNU/Linux, and got the error Error in invoking target 'all_no_orcl'. Customer ignored the error and the install succeeded without any other errors and oracle apparently worked fine. What should they do?

Because of compatibility with their storage array (EMC DMX with Powerpath 4.5) they must use update 2. Oracle install guide states that RH4 64 bits update 1 "or higher" should be used for 10g R2.
The binutils patch binutils-.15.92.0.2-13.0.0.0.2.x86_64.rpm is needed to relink without error. Red Hat is aware of the bug. Customers should use the latest update (or at least update 3 to fix).


Are Oracle Applications certified with RAC?

Oracle 9i RAC (9.2) and Oracle RAC 10g (10.1) are certified with Oracle Applications E-Business Suite. For Siebel and PeopleSoft, see http://realworld.us.oracle.com/isv/siebel.htm


Do I have to type the nodelist every time for the CVU commands? Is there any shortcut?

You do not have to type the nodelist every time for the CVU commands. Typing the nodelist for a large cluster is painful and error prone. Here are a few shortcuts.

To provide all the nodes of the cluster, type '-n all'. Cluvfy will attempt to get the nodelist in the following order:

\1. If vendor clusterware is available, it will pick all the configured nodes from the vendor clusterware using the lsnodes utility.

\2. If CRS is installed, it will pick all the configured nodes from Oracle Clusterware using the olsnodes utility.

\3. If none of the above applies, it will look for the CV_NODE_ALL environment variable. If this variable is not defined, it will complain.

To provide a partial list of nodes (some of the nodes of the cluster), you can set an environment variable and use it in the CVU command. For example:

setenv MYNODES node1,node3,node5
cluvfy comp nodecon -n $MYNODES
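The lookup order used for '-n all' can be sketched as a simple fallback chain. The code below is only an illustration of that order and does not call lsnodes, olsnodes or cluvfy; the function name is hypothetical, and it assumes CV_NODE_ALL holds a comma-separated list like the MYNODES example.

```python
import os


def resolve_all_nodes(vendor_nodes=None, crs_nodes=None, environ=None):
    """Mimic the documented order: vendor clusterware, then CRS, then CV_NODE_ALL.

    vendor_nodes / crs_nodes stand in for what lsnodes / olsnodes would report.
    """
    environ = os.environ if environ is None else environ
    if vendor_nodes:
        return list(vendor_nodes)
    if crs_nodes:
        return list(crs_nodes)
    # ASSUMPTION: CV_NODE_ALL is a comma-separated node list.
    value = environ.get("CV_NODE_ALL", "")
    nodes = [name.strip() for name in value.split(",") if name.strip()]
    if not nodes:
        raise RuntimeError("cannot determine node list: CV_NODE_ALL is not set")
    return nodes


print(resolve_all_nodes(crs_nodes=["node1", "node2"]))
print(resolve_all_nodes(environ={"CV_NODE_ALL": "node1,node3,node5"}))
```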


How do I check user accounts and administrative permissions related issues?

Use the admprv component verification command. Refer to the usage text for detailed instructions and the types of supported operations. To check whether the privilege is sufficient for user equivalence, use the '-o user_equiv' argument. Similarly, '-o crs_inst' will verify whether the user has the correct permissions for installing CRS. The '-o db_inst' argument will check for permissions required for installing RAC, and '-o db_config' will check for permissions required for creating a RAC database or modifying a RAC database configuration.


How to configure bonding on Suse SLES9.


How to configure bonding on Suse SLES8.

Please see Document 291958.1


How to configure concurrent manager in a RAC environment?

Large clients now commonly put the concurrent manager on a separate server (in the middle tier) to reduce the load on the database server. The concurrent manager programs can be tied to a specific middle tier (e.g., you can have CMs running on more than one middle tier box). It is advisable to use specialized CMs. CM middle tiers are set up to point to the appropriate database instance based on the product module being used.


What is the optimal migration path to be used while migrating the E-Business suite to Oracle RAC?

The following is the recommended path to migrate your E-Business Suite to an Oracle RAC environment:

\1. Migrate the existing application to new hardware. (If applicable).

\2. Use a clustered file system (ASM recommended) for all database files, or migrate all database files to raw devices. (Use dd for Unix or ocopy for NT.)

\3. Install/upgrade to the latest available e-Business suite.

\4. Ensure the database version is supported with Oracle RAC

\5. In step 4, install Oracle RAC option and use Installer to perform install for all the nodes.

\6. Clone Oracle Application code tree.

Reference Documents:
Oracle E-Business Suite Release 11i with 9i RAC: Installation and Configuration
E-Business Suite 11i on RAC: Configuring Database Load balancing & Failover
Oracle E-Business Suite 11i and Database - FAQ : Document 285267.1


Can I use TAF with e-Business in a RAC environment?

TAF itself does not work with e-Business suite due to Forms/TAF limitations, but you can configure the tns failover clause. On instance failure, when the user logs back into the system, their session will be directed to a surviving instance, and the user will be taken to the navigator tab. Their committed work will be available; any uncommitted work must be re-started.

We also recommend you configure the forms error URL to identify a fallback middle tier server for Forms processes, if no router is available to accomplish switching across servers.


Can I use Automatic Undo Management with Oracle Applications?

Yes. In a RAC environment we highly recommend it.


Which e-Business version is preferable?

Versions 11.5.5 onwards are certified with Oracle9i and hence with Oracle9i RAC. However we recommend the latest available version.


Should functional partitioning be used with Oracle Applications?

We do not recommend functional partitioning unless throughput on your server architecture demands it. Cache fusion has been optimized to scale well with non-partitioned workload.

If your processing requirements are extreme and your testing proves you must partition your workload in order to reduce internode communications, you can use Profile Options to designate that sessions for certain applications Responsibilities are created on a specific middle tier server. That middle tier server would then be configured to connect to a specific database instance.

To determine the correct partitioning for your installation you would need to consider several factors like number of concurrent users, batch users, modules used, workload characteristics etc.


Is the Oracle E-Business Suite (Oracle Applications) certified against RAC?

Yes. (There is no separate certification required for RAC.)


I am seeing the wait events 'ges remote message', 'gcs remote message', and/or 'gcs for action'. What should I do about these?

These are idle wait events and can be safely ignored. The 'ges remote message' event might show up in a 9.0.1 statspack report as one of the top wait events. To keep this wait event from showing up, you can add it to the PERFSTAT.STATS$IDLE_EVENT table so that it is not listed in Statspack reports.


Do I need to relink the Oracle Clusterware / Grid Infrastructure home after an OS upgrade?

Using Oracle Clusterware 10g and 11.1, Oracle Clusterware binaries cannot be relinked. However, the client shared libraries, which are part of the home, can be relinked; in most cases there should not be a need to relink them. See Document 743649.1 for more information.

Using Oracle Grid Infrastructure 11.2 and higher, there are some executables in the Grid home that can and should be relinked after an OS upgrade. The following steps describe how to relink an Oracle Grid Infrastructure for Clusters home:

As root:

# cd Grid_home/crs/install
# perl rootcrs.pl -unlock

As the grid infrastructure for a cluster owner:

$ export ORACLE_HOME=Grid_home
$ Grid_home/bin/relink

As root again:

# cd Grid_home/crs/install
# perl rootcrs.pl -patch

Note: If using Oracle Grid Infrastructure for Standalone Environments (Oracle Restart), see the Oracle Documentation for more information: https://docs.oracle.com/database/121/LADBI/oraclerestart.htm#LADBI999

Note: It is recommended to use the Perl version that comes along with your Grid Infrastructure Install i.e Grid_home/perl/bin/perl rootcrs.pl -patch.


How can I configure database instances to run on 12.1.0.1 Oracle Flex Cluster Leaf nodes?

Oracle 12c introduces a new cluster topology called Flex Cluster where servers in the cluster can assume the specific roles - HUB and LEAF. In the 12.1.0.1 release, the LEAF nodes can only be configured for non-database applications. Database instances are not supported to run on the 12.1.0.1 Flex Cluster LEAF nodes. Please see the 12c Flex Cluster FAQ statement under Oracle Clusterware.


When configuring the NIC cards and switch for a GigE Interconnect should it be set to FULL or Half duplex in Oracle RAC?

You must use Full Duplex for all network communication. Half Duplex means you can only either send OR receive at a time.

Note that modern operating systems default to Full Duplex unless there is a cable problem or a misconfiguration in the switch.
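On Linux, one way to confirm the negotiated setting is to look at the "Speed" and "Duplex" lines that `ethtool <interface>` prints. The sketch below parses captured output of that form; using ethtool is an assumption (this note does not prescribe a tool), the sample text is illustrative, and the function names are hypothetical.

```python
import re


def link_settings(ethtool_output: str) -> dict:
    """Extract the Speed and Duplex values from captured ethtool output."""
    settings = {}
    for key in ("Speed", "Duplex"):
        match = re.search(rf"^\s*{key}:\s*(.+?)\s*$", ethtool_output, re.MULTILINE)
        if match:
            settings[key.lower()] = match.group(1)
    return settings


def is_full_duplex(ethtool_output: str) -> bool:
    """True only if a Duplex line is present and reports Full."""
    return link_settings(ethtool_output).get("duplex", "").lower() == "full"


# Illustrative sample; real output contains more lines.
sample = """Settings for eth1:
\tSpeed: 1000Mb/s
\tDuplex: Full
\tLink detected: yes
"""
print(link_settings(sample), is_full_duplex(sample))
```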


How can a NAS storage vendor certify their storage solution for Oracle RAC ?

Please refer to this link on OTN for details on Oracle RAC Technologies Matrix (storage being part of it).


Is Infiniband supported for the Oracle RAC interconnect?

Yes, it is supported.


What kind of HW components do you recommend for the interconnect?

The general recommendation for the interconnect is to provide the highest bandwidth interconnect, together with the lowest latency protocol that is available for a given platform.

You should use redundant 1 Gigabit Ethernet links and load balance across them; we recommend HAIP (the Redundant Interconnect Usage feature) for this. Remember that if you use this feature, you must use different subnets for the interconnects.


Where can I find a list of supported solutions to ensure NIC availability / redundancy (for the interconnect) per platform?

IBM AIX - available solutions:

• Etherchannel (OS based)

• HACMP based network failover solution

HP-UX - available solutions:

• APA - Auto Port Aggregation (OS based)

• MC/Serviceguard based network failover solution

• Combination of both solutions

More information: Document 296874.1 and Auto Port Aggregation (APA) Support Guide

Sun Solaris - available solutions:

• Sun Trunking (OS based)

• Sun IPMP (OS based)

• Sun Cluster based network failover solution (clprivnet)

More information for Oracle RAC 10g and Oracle RAC 11g Release 1:

• Configure IPMP for the Oracle VIP and IPMP introduction

• How to Setup IPMP as Cluster Interconnect

More information for Oracle RAC 11g Release 2:

• My Oracle Support Document 1069584.1 - Solaris IPMP and Trunking for the cluster interconnect in Oracle Grid Infrastructure

Linux - available solutions:

• Bonding

More information: Document 298891.1

Windows - available solutions:

• Teaming

On Windows, teaming solutions to ensure NIC availability are usually part of the network card driver. Thus, they depend on the network card used. Please contact the respective hardware vendor for more information.

OS independent solution:

• Redundant Interconnect Usage enables load-balancing and high availability across multiple (up to four) private networks (also known as interconnects).

• Oracle RAC 11g Release 2, Patch Set One (11.2.0.2) enables Redundant Interconnect Usage as a feature for all platforms, except Windows.

• On systems that use Solaris Cluster, Redundant Interconnect Usage will use clprivnet.


What is Cache Fusion and how does this affect applications?

Cache Fusion is a parallel database architecture for exploiting clustered computers to achieve scalability of all types of applications. Cache Fusion is a shared cache architecture that uses the high speed, low latency interconnects available today on clustered systems to maintain database cache coherency. Database blocks are shipped across the interconnect to the node where access to the data is needed. This is accomplished transparently to the application and users of the system. As Cache Fusion uses at most a 3 point protocol, it easily scales to clusters with large numbers of nodes.

For further information please refer to

Cache Fusion and the Global Cache Service
Part Number A96597-01
http://docs.oracle.com/cd/B10501_01/rac.920/a96597/pslkgdtl.htm

Additional Information can be found at:

Document 139436.1 Understanding 9i Real Application Clusters Cache Fusion


Can I run more than one clustered database on a single Oracle RAC cluster?

You can run multiple databases in an Oracle RAC cluster, either one instance per node (with different databases having different subsets of nodes in a cluster), or multiple instances per node (all databases running across all nodes), or some combination in between. Running multiple instances per node does cause memory and resource fragmentation, but this is no different from running multiple instances on a single node in a single instance environment, which is quite common. It does provide the flexibility of being able to share CPU on the node, but the Oracle Resource Manager will not currently limit resources between multiple instances on one node. You will need to use an OS level resource manager to do this.


What are the restrictions on the SID with an Oracle RAC database? Is it limited to 5 characters?

The SID prefix in 10g Release 1 and prior versions was restricted to five characters by the install/config tools so that an ORACLE_SID of up to a maximum of 5+3=8 characters can be supported in an Oracle RAC environment. The SID prefix limit is relaxed to 8 characters in 10g Release 2; see Bug 4024251 for more information.
With Oracle RAC 11g Release 2, SIDs in Oracle RAC with Policy Managed database are dynamically allocated by the system when the instance starts. This supports a dynamic grid infrastructure which allows the instance to start on any server in the cluster.
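The prefix arithmetic described above (a prefix of at most 5 characters plus an instance number of up to 3 digits, giving at most 8 characters) can be sketched as a small check. The helper below is hypothetical and only illustrates the limits stated in this answer; it is not what the install/config tools actually run.

```python
MAX_PREFIX_10G_R1 = 5  # prefix limit in 10g Release 1 and earlier
MAX_PREFIX_10G_R2 = 8  # relaxed prefix limit in 10g Release 2


def instance_sid(prefix: str, instance_number: int,
                 max_prefix: int = MAX_PREFIX_10G_R1) -> str:
    """Build an ORACLE_SID as prefix + instance number, enforcing the prefix limit."""
    if not 1 <= len(prefix) <= max_prefix:
        raise ValueError(
            f"SID prefix {prefix!r} must be 1 to {max_prefix} characters long"
        )
    return f"{prefix}{instance_number}"


print(instance_sid("orcl", 1))
print(instance_sid("sales", 100))  # 5-character prefix + 3 digits = 8 characters
print(instance_sid("salesdb", 2, max_prefix=MAX_PREFIX_10G_R2))
```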


Is it supported to install Oracle Clusterware and Oracle RAC as different users?

Yes, Oracle Clusterware and Oracle RAC can be installed as different users. The Oracle Clusterware user and the Oracle RAC user must both have OINSTALL as their primary group. Every Database home can have a different OSDBA group with a different username.


Is it difficult to transition (migrate) from Single Instance to Oracle RAC?

If the cluster and the cluster software are not present, these components must be installed and configured. The Oracle RAC option must be added using the Oracle Universal Installer, which requires that the existing database instance be shut down. No changes are necessary to the user data within the database. However, a shortage of freelists and freelist groups can cause contention on the header blocks of tables and indexes as multiple instances vie for the same block. This may cause a performance problem and require data partitioning, although the need for such changes should be rare.

Recommendation: use Automatic Segment Space Management (ASSM) so that these changes are handled automatically. ASSM replaces freelists and freelist groups and is the better option. The database requires one redo thread and one undo tablespace for each instance, which are easily added with SQL commands or with Enterprise Manager tools. NOTE: With Oracle RAC 11g Release 2, you do not need to pre-create redo threads or undo tablespaces if you are using Oracle Managed Files (e.g. with ASM).

Datafiles will need to be moved to shared storage, such as a clustered file system (CFS), so that all nodes can access them. Oracle recommends the use of Automatic Storage Management (ASM). Also, the MAXINSTANCES parameter in the control file must be greater than or equal to the number of instances you will start in the cluster.
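
The per-instance requirements above (one redo thread and one undo tablespace per instance, and MAXINSTANCES at least equal to the instance count) can be expressed as a small planning check. This is only a sketch for the case where these objects are created manually: the ClusterPlan class and migration_gaps function are hypothetical names, and the counts are supplied by hand rather than read from a database.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterPlan:
    instances: int          # instances you intend to start in the cluster
    redo_threads: int       # redo threads defined in the database
    undo_tablespaces: int   # undo tablespaces available
    maxinstances: int       # MAXINSTANCES recorded in the control file


def migration_gaps(plan: ClusterPlan) -> List[str]:
    """Return human-readable gaps; an empty list means the counts line up."""
    gaps = []
    if plan.redo_threads < plan.instances:
        gaps.append(f"need {plan.instances - plan.redo_threads} more redo thread(s)")
    if plan.undo_tablespaces < plan.instances:
        gaps.append(
            f"need {plan.instances - plan.undo_tablespaces} more undo tablespace(s)"
        )
    if plan.maxinstances < plan.instances:
        gaps.append("MAXINSTANCES is below the planned instance count")
    return gaps


if __name__ == "__main__":
    # A single-instance database being planned for a 4-node cluster.
    plan = ClusterPlan(instances=4, redo_threads=1, undo_tablespaces=1, maxinstances=2)
    for gap in migration_gaps(plan):
        print(gap)
```

A plan with matching counts returns an empty list, so the function can serve as a quick sanity check before starting additional instances.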

For more detailed information, please see Migrating from single-instance to RAC in the Oracle Documentation.

With Oracle Database 10g Release 2, the $ORACLE_HOME/bin/rconfig tool can be used to convert a single-instance database to RAC. This tool takes an XML input file and converts the single-instance database whose information is provided in the XML. You can run this tool in "verify only" mode prior to performing the actual conversion. This is documented in the Oracle RAC Admin book, and a sample XML can be found at $ORACLE_HOME/assistants/rconfig/sampleXMLs/ConvertToRAC.xml. This tool only supports databases using a clustered file system or ASM. You cannot use it with raw devices. Grid Control 10g Release 2 provides an easy-to-use wizard to perform this function.

Oracle Enterprise Manager includes workflows to assist with migrations (e.g. migrating to ASM, creating a standby, converting a standby to RAC). The migration is automated in Enterprise Manager Grid Control 10.2.0.5.


Are rcp and/or rsh required for normal Oracle RAC operation?

"rcp" and "rsh" are not required for normal Oracle RAC operation. However, in older versions "rsh" and "rcp" were used by the installer and therefore should be enabled for Oracle RAC and patchset installation. In later releases, ssh is used by default for these operations.


Does Oracle Clusterware or Oracle Real Application Clusters support heterogeneous platforms?

Oracle Clusterware and Oracle Real Application Clusters do not support heterogeneous platforms in the same cluster. We do support machines of different speeds and sizes in the same cluster. All nodes must run the same operating system (i.e. they must be binary compatible). In an active data-sharing environment, like Oracle RAC, we do not support machines having different chip architectures.
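
As an illustration of the rule above, the following sketch flags nodes whose operating system or chip architecture differs from the rest of the cluster, while ignoring speed and size. The function name mismatched_nodes and the hand-written node inventory are hypothetical. In practice the values would come from your own inventory, and real binary compatibility involves more than these two fields.

```python
from collections import Counter
from typing import Dict, List, Tuple


def mismatched_nodes(nodes: Dict[str, Tuple[str, str]]) -> List[str]:
    """Return sorted names of nodes that differ from the most common platform.

    nodes maps a node name to (operating system, chip architecture).
    CPU speed and memory size are deliberately not compared.
    """
    if not nodes:
        return []
    # The most common (os, arch) pair is treated as the cluster platform.
    majority, _ = Counter(nodes.values()).most_common(1)[0]
    return sorted(name for name, platform in nodes.items() if platform != majority)


if __name__ == "__main__":
    inventory = {
        "node1": ("Linux", "x86_64"),
        "node2": ("Linux", "x86_64"),
        "node3": ("Linux", "aarch64"),  # different chip architecture
    }
    print(mismatched_nodes(inventory))
```

An empty result means every node reports the same operating system and architecture.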


What are the dependencies between OCFS and ASM in Oracle Database 10g ?

In an Oracle RAC 10g environment, there is no dependency between Automatic Storage Management (ASM) and Oracle Cluster File System (OCFS).

OCFS is not required if you are using Automatic Storage Management (ASM) for database files. You can use OCFS on Windows (Version 2 on Linux) for files that ASM does not handle, such as binaries (shared Oracle home), trace files, etc. Alternatively, you could place these files on local file systems, even though this is not as convenient given the multiple locations.

Oracle recommends using ASM/ACFS for your database files.
