Re: chaid


Posted by rcolberg on April 13, 2000 at 02:17:59:

In Reply to: chaid posted by suhendin on April 13, 2000 at 02:17:45:

The following is from www.statsoftinc.com/textbook/stathome.html. I recommend looking at this resource first for statistical information; they do an excellent job.

"Some classification trees programs, such as FACT (Loh & Vanichestakul, 1988) and THAID (Morgan & Messenger, 1973, as well as the related programs AID, for Automatic Interaction Detection, Morgan & Sonquist, 1963, and CHAID, for Chi-Square Automatic Interaction Detection, Kass, 1980) perform multi-level splits rather than binary splits when computing classification trees. A multi-level split performs k - 1 splits (where k is the number of levels of the splittting variable), as compared to a binary split which performs one split (regardless of the number of levels of the splittting variable). However, it should be noted that there is no inherent advantage of multi-level splits, because any multi-level split can be represented as a series of binary splits, and there may be disadvantages of using multi-level splits. With multi-level splits, predictor variables can be used for splitting only once, so the resulting classification trees may be unrealistically short and uninteresting (Loh & Shih, 1997). A more serious problem is bias in variable selection for splits. This bias is possible in any program such as THAID (Morgan & Sonquist, 1973) that employs an exhaustive search for finding splits (for a discussion, see Loh & Shih, 1997). Bias in variable selection is the bias toward selecting variables with more levels for splits, a bias which can skew the interpretation of the relative importance of the predictors in explaining responses on the dependent variable (Breiman et. al., 1984)."

