What's new in AnswerTree®?

AnswerTree offers many new features and productivity enhancements including:

  • Enhanced scalability and performance: solve enterprise-sized problems more effectively. Cache compression and improved temporary file management reduce the disk space required to process large data files.

  • Enhanced tree duplication: work more efficiently by creating new trees based on existing trees in the project.

  • Show split statistics: improve your understanding of the model by showing or hiding split statistics within the tree. For splits using ordinal predictor variables, you can display categories as ranges or discrete values.

  • Simplified repeatable tasks: automate recurring work such as weekly reports. You can now generate scripts automatically from the user interface and run them on a Solaris or Windows NT platform.

  • Enhanced connectivity with other SPSS Inc. products: extend your export options, because production mode now exports PMML-compliant files. For instance, use SmartScore to interpret AnswerTree's PMML models and use them to score new cases.

  • General Usability Enhancements

    • Get added control and efficiency. The Data Properties dialog box now allows you to view data properties before creating the root node.

    • Define intervals for continuous predictor variables in CHAID analyses more efficiently by using the Intervals tab in the Advanced Options dialog box. Access this dialog box from the New Tree Wizard or from the Analysis menu after the tree is grown.

    • More efficiently change the measurement level of predictor variables using the Measurement Level dialog box. You can also change measurement levels before you grow the tree in the Model Definition step of the New Tree Wizard.

    • The Define Variables dialog box has been eliminated. Perform the same tasks more efficiently using the new Data Properties and Measurement Level dialog boxes and the Advanced Options, Intervals tab.

  • Enhanced robustness: many underlying enhancements enable you to explore your data with unprecedented levels of convenience, control, and confidence.

General features

  • Four powerful decision-tree algorithms

  • Supports nominal categorical, ordinal categorical and continuous variables

  • Tree generating methods: automatic, interactive, production

  • Production mode: includes a scripting language that enables you to run the application in production mode

  • Assign weights to each variable to reflect sampling specifications

  • Assign costs or profits to target variable

  • Online help, online tutorial and comprehensive manual

CHAID, Exhaustive CHAID

  • CHAID by Kass (1980), a fast statistical multi-way tree algorithm to explore data efficiently

  • Exhaustive CHAID by Biggs, de Ville and Suen (1991), a thorough statistical multi-way tree algorithm to explore data exhaustively

  • Automatically partition continuous variables according to the number of categories specified by the user

  • Growing criteria

    • Alpha: control the alpha levels for splitting nodes and merging categories

    • Chi-square: select from Pearson or Likelihood Ratio

    • Convergence: specify Epsilon or Maximum iterations

    • Allow splitting of merged categories

    • Use Bonferroni adjustment: correct alpha levels for multiple comparisons

  • Scores: define the order of and distance between categories of an ordinal target variable

  • Missing data: assign to a category
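
As a rough illustration of the chi-square and Bonferroni options listed above, the following Python sketch computes the Pearson and likelihood-ratio statistics for a table of observed counts and applies a simple Bonferroni correction. It is not AnswerTree code. The function names are hypothetical, and dividing alpha by the number of comparisons is the textbook form of the correction; the multiplier the product actually uses for CHAID may differ.

```python
import math


def chi_square_statistics(table):
    """Return (pearson, likelihood_ratio) for a table of observed counts.

    `table` is a list of rows: one row per predictor category and one
    column per target category. Every row and column total is assumed
    to be greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    pearson = 0.0
    likelihood_ratio = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and target.
            expected = row_totals[i] * col_totals[j] / n
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # a zero cell contributes nothing to the LR sum
                likelihood_ratio += 2.0 * observed * math.log(observed / expected)
    return pearson, likelihood_ratio


def bonferroni_alpha(alpha, comparisons):
    """Simple Bonferroni correction: split alpha across the comparisons."""
    if comparisons < 1:
        raise ValueError("comparisons must be at least 1")
    return alpha / comparisons


# Two predictor categories by two target categories (made-up counts).
pearson, likelihood_ratio = chi_square_statistics([[10, 20], [20, 10]])
adjusted_alpha = bonferroni_alpha(0.05, 5)
```

Both statistics are compared against the same chi-square distribution; they usually agree closely on large samples and can differ when some cells are sparse.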

Classification and Regression Trees (C&RT)

  • Classification and Regression Tree by Breiman, Friedman, Olshen and Stone (1984), an exhaustive binary tree algorithm to partition data and produce accurate homogeneous subsets

  • Growing criteria: select impurity measure for categorical targets from:

    • Gini

    • Twoing

    • Ordered Twoing

  • Cost complexity pruning

  • Costs and growing criteria:

    • Gini: explicitly includes cost information in growing the tree

    • Twoing and Ordered Twoing: use costs in computing risks and in node assignment; or, incorporate cost information into priors to apply to the model

  • Missing data: impute by surrogate
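
To make the Gini growing criterion more concrete, here is a minimal Python sketch of the Gini impurity of a node and the decrease in impurity produced by a binary split. It assumes unit misclassification costs and equal case weights. The function names and the `min_change` threshold are hypothetical and are not taken from AnswerTree.

```python
def gini(counts):
    """Gini impurity of a node, given its case counts per target category."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)


def impurity_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = sum(parent)
    weighted_children = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
    return gini(parent) - weighted_children


def worth_splitting(parent, left, right, min_change=0.0001):
    """Accept a split only if it improves impurity by at least `min_change`.

    `min_change` is an illustrative value, not a product default.
    """
    return impurity_decrease(parent, left, right) >= min_change


# A node with 50 cases in each of two categories, split into two children.
decrease = impurity_decrease([50, 50], [40, 10], [10, 40])
```

A threshold of this kind corresponds to a stopping rule on the minimum change in impurity: candidate splits that improve the tree by less than the threshold are not made.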

QUEST

  • QUEST by Loh and Shih (1997), a statistical algorithm to select variables without bias and build an accurate binary tree model quickly and efficiently

  • Cost complexity pruning

  • Growing criteria: control the alpha levels for variable selection

  • Missing data: impute by surrogate

Growing criteria - general

  • Assign prior probabilities (weights) to each variable to reflect sampling specifications. Choose from:

    • Based on training data (empirical)

    • Equal for all classes

    • Custom (user-specified)

  • Assign misclassification costs or profit expectations to target variable

  • Pruning

  • Select subtree based on:

    • Standard error rule

    • Minimum risk
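
The two subtree selection options can be sketched as follows, assuming each candidate subtree from the pruning sequence comes with a risk estimate and its standard error. The `Subtree` type, the function names and the `multiplier` parameter are hypothetical, and the figures are made up for illustration.

```python
from typing import NamedTuple


class Subtree(NamedTuple):
    leaves: int       # number of terminal nodes
    risk: float       # estimated risk of the subtree
    std_error: float  # standard error of that estimate


def select_minimum_risk(subtrees):
    """Pick the subtree with the lowest risk (ties go to the smaller tree)."""
    return min(subtrees, key=lambda t: (t.risk, t.leaves))


def select_standard_error_rule(subtrees, multiplier=1.0):
    """Pick the smallest subtree whose risk is within `multiplier` standard
    errors of the minimum risk."""
    best = select_minimum_risk(subtrees)
    threshold = best.risk + multiplier * best.std_error
    eligible = [t for t in subtrees if t.risk <= threshold]
    return min(eligible, key=lambda t: (t.leaves, t.risk))


candidates = [
    Subtree(leaves=1, risk=0.40, std_error=0.03),
    Subtree(leaves=3, risk=0.26, std_error=0.02),
    Subtree(leaves=5, risk=0.25, std_error=0.02),
    Subtree(leaves=9, risk=0.255, std_error=0.02),
]
by_risk = select_minimum_risk(candidates)      # the 5-leaf subtree
by_se = select_standard_error_rule(candidates)  # the 3-leaf subtree
```

The standard error rule trades a small, statistically insignificant increase in risk for a simpler tree, which is why it selects the 3-leaf candidate here.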

Stopping rules

Control the following settings:

  • Maximum tree depth; control the depth by specifying the maximum number of levels or minimum number of cases

  • C&RT - specify the minimum change in impurity

Results and reports

  • Tree map (global view)

  • Tree diagram

  • Crosstab tables displayed for each node: mean, percent, count, sum, standard deviation

  • Bar charts displayed in each node

  • Data displayed in spreadsheet format and dynamically linked to tree diagram

  • Customisable tree diagram: turn node graphs on and off and control the tree's orientation

  • Copy tree diagrams and paste them into other programs as bitmap (.bmp) files

  • Print diagram top down or sideways

Reports

  • Gains summary: describes each node and identifies which segments have highest (and lowest) contribution, profit and incremental benefit

  • Risk summary table: describes model performance and accuracy versus actual, and estimates risk of misclassification based on the tree segments and costs assigned to each value of the target variable

  • Summary report: documents the results of the analysis along with the criteria used to build the tree
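
As an illustration of what a gains summary contains, the Python sketch below builds a node-by-node gains table from terminal-node counts. It assumes a categorical target with one category of interest. The column names are illustrative and may not match the labels used in AnswerTree's own report.

```python
def gains_table(nodes):
    """Build a gains table, best-responding node first.

    `nodes` is a list of (node_id, cases, hits) tuples, where `hits` is the
    number of cases in the node that fall in the target category of interest.
    """
    total_cases = sum(cases for _, cases, _ in nodes)
    total_hits = sum(hits for _, _, hits in nodes)
    overall_rate = total_hits / total_cases

    rows = []
    for node_id, cases, hits in sorted(nodes, key=lambda r: r[2] / r[1], reverse=True):
        rate = hits / cases
        rows.append({
            "node": node_id,
            "node_pct": 100.0 * cases / total_cases,   # share of all cases
            "gain_pct": 100.0 * hits / total_hits,     # share of all target hits
            "response_pct": 100.0 * rate,              # hit rate within the node
            "index_pct": 100.0 * rate / overall_rate,  # hit rate relative to the sample
        })
    return rows


# Three terminal nodes with made-up counts.
table = gains_table([(3, 100, 10), (4, 50, 30), (5, 50, 10)])
```

An index above 100 marks a segment that responds better than the sample as a whole, so the rows at the top of the table are the segments with the highest contribution.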

Interactive diagrams, charts and tables

  • All charts and graphs are dynamically updated whenever the diagram changes

  • Grow branches selectively

  • Prune branches and monitor risk change on dynamically updating table

  • Combine segments

  • Zoom in or out of particular nodes of the tree using the tree map, and display charts and/or graphs for each node; zoom to change font for printing

  • Inspect the data points behind a specific node and save data

Advanced capabilities

  • Advanced rule generation: Save the decision rules which define each segment in SQL code or SPSS syntax so they can be applied to another data source or can be used to extract records

  • Partitioning and validation: Train your tree model on a subset of your data and then apply it to the rest of the data for validation

    • Random sampling of source data

  • Cross validation: Split your sample into smaller subsets and validate model across the sub-samples
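
To show how decision rules can be expressed as SQL for extracting records, here is a small Python sketch that turns the path to one segment into a SELECT statement. The rule representation, function names, table name and column names are all hypothetical; the SQL that AnswerTree itself generates may look different. Table and column identifiers are assumed to be trusted and are not escaped.

```python
def sql_literal(value):
    """Render a Python value as an SQL literal (strings are single-quoted)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def condition_to_sql(column, operator, value):
    """Render one split condition, for example ("age", "<=", 35)."""
    if operator == "in":
        items = ", ".join(sql_literal(v) for v in value)
        return f"{column} IN ({items})"
    return f"{column} {operator} {sql_literal(value)}"


def rule_to_sql(table, conditions):
    """Combine the conditions on the path to a segment into one query."""
    where = " AND ".join(condition_to_sql(*c) for c in conditions)
    return f"SELECT * FROM {table} WHERE {where}"


# The path to a hypothetical segment: young customers in two regions.
query = rule_to_sql(
    "customers",
    [("age", "<=", 35), ("region", "in", ["North", "West"])],
)
```

Because each segment is defined by the conjunction of the splits on its path, the same structure can be applied to another data source that has matching column names.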

Data and file management

  • Read SPSS files

  • Import files from ODBC-compliant applications using the SPSS ODBC Driver

  • Export data as SPSS, SYSTAT and ASCII files

  • Export the Tree View as a Windows bitmap (.bmp) file; the Gains chart and Risk Summary table as tab-delimited text files; and the Rules and Summary report as text files

System requirements

AnswerTree Client

  • Operating system: Windows 98, 2000, XP or Windows NT 4.0 with Service Pack 5 or higher

  • Hardware: Pentium-class processor, SVGA monitor and CD-ROM drive for installation

  • Minimum free drive space: 70 MB for software

  • Minimum RAM: 64 MB

  • Microsoft Internet Explorer 5.0 for reading help documents

AnswerTree Server

  • Windows NT Server, Windows 2000 Server or Windows 2000 Advanced Server:

    • Hardware: Pentium-class processor, SVGA monitor and CD-ROM drive for installation

    • Minimum free drive space: 70 MB

    • Minimum RAM: 64 MB

  • Solaris 2.6, 7 and 8:

    • Hardware: Ultra Sparc 2 (or better) and CD-ROM drive for installation

    • Minimum free drive space: 70 MB

    • Minimum RAM: 256 MB


© Tech4T (Technologies4Targeting Ltd.) 2002/2004 All Rights Reserved.  www.tech4t.co.uk