Chukwuemeka Eluka

October 3, 2017 | Penulis: Tanveer Shaikh | Kategori: Oracle Database, Databases, Data Warehouse, Microsoft Sql Server, Sql
Share Embed


Deskripsi Singkat

Description: The Teradata system uses the values in a row defined by the Primary Index to generate a hash value, which i...

Deskripsi

The Teradata system uses the values in a row defined by the Primary Index to generate a hash value, which in turn defines the node, vproc, cylinder and block that the row will be placed into. PI columns with many duplicate values will cause “hash collisions”, in which a sequential list is maintained of the rows that have the same hash value. Access to hash values with many rows is expensive. The average number of rows per hash value should never exceed 100. In very rare cases a limited number of rows per PI hash can exceed 5000, but only with justification in which it is known that the performance benefits outweigh the costs to maintain the PI. These are hard limits. In 2% of the rows, ITEM_ID is NULL because it was not always a requirement that Feedback be tied to a Listing. ITEM_ID is not a candidate because one value in the table has more than 5000 rows. SLR_ID has several hundred values, which have more than 5000 rows, it is not a candidate. BID_ID has an average of 2 rows per Bidder and a maximum of 1500 (actual value is unknown), hence it can be considered as a candidate. Alternate to Case 2: When a candidate column is very desirable (such as ITEM_ID), and is excluded by a small percentage of NULLs, the NULL values could be modified to distribute the rows so that the max/rows/value constraint is not compromised. A typical implementations would change the domain of NULLs to a negative value (say, based on a timestamp). Such a domain change should only be considered with extensive impact analysis.
Lihat lebih banyak...

Komentar

Hak Cipta © 2017 PDFDOKUMEN Inc.