SSAS Mining Structure Data Types

Microsoft Reference

When we create a mining model or a mining structure in Microsoft SQL Server Analysis Services, we must define the data types for each of the columns in the mining structure. The data type tells the analysis engine whether the data in the data source is numerical or text, and how the data should be processed. For example, if our source data contains numerical data, we can specify whether the numbers be treated as integers or by using decimal places.

Analysis Services supports the following data types for mining structure columns:

Data Type Supported Content Types
Text Cyclical, Discrete, Discretized, Key Sequence, Ordered, Sequence
Long Continuous, Cyclical, Discrete, Discretized, Key, Key Sequence, Key Time, Ordered, Sequence, Time


Boolean Cyclical, Discrete, Ordered
Double Continuous, Cyclical, Discrete, Discretized, Key, Key Sequence, Key Time, Ordered, Sequence, Time


Date Continuous, Cyclical, Discrete, Discretized, Key, Key Sequence, Key Time, Ordered

You can find details for discrete vs discretized here.


Microsoft SSAS logistic regression mining model node types

For the DMX queries here is the reference of the node types for Microsoft SSAS logistic regression mining model:

A logistic regression model outputs the following node types:

Node Type ID Description
1 Model.
17 Organizer node for the subnetwork.
18 Organizer node for the input layer.
19 Organizer node for the hidden layer. The hidden layer is empty.
20 Organizer node for the output layer.
21 Input attribute node.
23 Output attribute node.
24 Marginal statistics node.

So the DMX query:

SELECT * FROM [Model_Name].Content

Mining Model Content for Logistic Regression Models

microsoft decision trees algorithm parameters COMPLEXITY_PENALTY SCORE_METHOD SPLIT_METHOD

The Microsoft Decision Trees algorithm supports parameters that affect the performance and accuracy of the resulting mining model. You can also set modeling flags on the mining model columns or mining structure columns to control the way that data is processed.

Setting Algorithm Parameters

The following table describes the parameters that you can use with the Microsoft Decision Trees algorithm.

Controls the growth of the decision tree. A low value increases the number of splits, and a high value decreases the number of splits. The default value is based on the number of attributes for a particular model, as described in the following list:

  • For 1 through 9 attributes, the default is 0.5.
  • For 10 through 99 attributes, the default is 0.9.
  • For 100 or more attributes, the default is 0.99.
Forces the algorithm to use the specified columns as regressors, regardless of the importance of the columns as calculated by the algorithm. This parameter is only used for decision trees that are predicting a continuous attribute.


Determines the method that is used to calculate the split score. The following options are available:

ID Name
1 Entropy
3 Bayesian with K2 Prior
4 Bayesian Dirichlet Equivalent (BDE) with uniform prior


The default is 4, or BDE.


Determines the method that is used to split the node. The following options are available:

ID Name
1 Binary: Indicates that regardless of the actual number of values for the attribute, the tree should be split into two branches.
2 Complete: Indicates that the tree can create as many splits as there are attribute values.
3 Both: Specifies that Analysis Services can determine whether a binary or complete split should be used to produce the best results.

The default is 3.


Defines the number of input attributes that the algorithm can handle before it invokes feature selection.

The default is 255.

Set this value to 0 to turn off feature selection.

[Available only in some editions of SQL Server]

MAXIMUM_OUTPUT_ATTRIBUTESDefines the number of output attributes that the algorithm can handle before it invokes feature selection.

The default is 255.

Set this value to 0 to turn off feature selection.

[Available only in some editions of SQL Server]

MINIMUM_SUPPORTDetermines the minimum number of leaf cases that is required to generate a split in the decision tree.

The default is 10.

You may need to increase this value if the dataset is very large, to avoid overtraining.


Microsoft Decision Trees Algorithm Technical Reference

SSAS DMX Query for Dependency Network pane of the Microsoft Naive Bayes Viewer


The node caption is used to identify the node, rather than the node name, because the caption shows both the attribute name and attribute value.

The MSOLAP_NODE_SCORE is a special value provided only for the input attribute nodes, and indicates the relative importance of this attribute in the model. You can see much the same information in the Dependency Network pane of the viewer; however, the viewer does not provide scores.

The NODE_TYPE = 10 represents the input attribute, and NODE_TYPE = 11 for each value of the attribute.

Reference: Naive Bayes Model Query Examples

Choosing Microsoft Prediction Model or Algorithm

From Micosoft.. “An algorithm in data mining (or machine learning) is a set of heuristics and calculations that creates a model from data. To create a model, the algorithm first analyzes the data you provide, looking for specific types of patterns or trends. The algorithm uses the results of this analysis over many iterations to find the optimal parameters for creating the mining model. These parameters are then applied across the entire data set to extract actionable patterns and detailed statistics.”

Reference : SSAS has 9 data mining models or algorithms

Choosing an Algorithm by Type

SQL Server Data Mining includes the following algorithm types:

  • Classification algorithms predict one or more discrete variables, based on the other attributes in the dataset.
  • Regression algorithms predict one or more continuous numeric variables, such as profit or loss, based on other attributes in the dataset.
  • Segmentation algorithms divide data into groups, or clusters, of items that have similar properties.
  • Association algorithms find correlations between different attributes in a dataset. The most common application of this kind of algorithm is for creating association rules, which can be used in a market basket analysis.
  • Sequence analysis algorithms summarize frequent sequences or episodes in data, such as a series of clicks in a web site, or a series of log events preceding machine maintenance.

However, there is no reason that you should be limited to one algorithm in your solutions. Experienced analysts will sometimes use one algorithm to determine the most effective inputs (that is, variables), and then apply a different algorithm to predict a specific outcome based on that data. SQL Server Data Mining lets you build multiple models on a single mining structure, so within a single data mining solution you could use a clustering algorithm, a decision trees model, and a Naïve Bayes model to get different views on your data. You might also use multiple algorithms within a single solution to perform separate tasks: for example, you could use regression to obtain financial forecasts, and use a neural network algorithm to perform an analysis of factors that influence forecasts. You can use the Microsoft Decision Trees algorithm not only for prediction, but also as a way to reduce the number of columns in a dataset, because the decision tree can identify columns that do not affect the final mining model.

Choosing an Algorithm by Task

To help you select an algorithm for use with a specific task, the following table provides suggestions for the types of tasks for which each algorithm is traditionally used.

Examples of tasks Microsoft algorithms to use
Predicting a discrete attribute:

Flag the customers in a prospective buyers list as good or poor prospects.

Calculate the probability that a server will fail within the next 6 months.

Categorize patient outcomes and explore related factors.

Microsoft Decision Trees Algorithm

Microsoft Naive Bayes Algorithm

Microsoft Clustering Algorithm

Microsoft Neural Network Algorithm

Predicting a continuous attribute:

Forecast next year’s sales.

Predict site visitors given past historical and seasonal trends.

Generate a risk score given demographics.

Microsoft Decision Trees Algorithm

Microsoft Time Series Algorithm

Microsoft Linear Regression Algorithm

Predicting a sequence:

Perform clickstream analysis of a company’s Web site.

Analyze the factors leading to server failure.

Capture and analyze sequences of activities during outpatient visits, to formulate best practices around common activities.

Microsoft Sequence Clustering Algorithm
Finding groups of common items in transactions:

Use market basket analysis to determine product placement.

Suggest additional products to a customer for purchase.

Analyze survey data from visitors to an event, to find which activities or booths were correlated, to plan future activities.

Microsoft Association Algorithm

Microsoft Decision Trees Algorithm

Finding groups of similar items:

Create patient risk profiles groups based on attributes such as demographics and behaviors.

Analyze users by browsing and buying patterns.

Identify servers that have similar usage characteristics.

Microsoft Clustering Algorithm

Microsoft Sequence Clustering Algorithm

Reference Microsoft

SSAS configuration file msmdsrv.ini

This file resides at: ?:\Program Files\Microsoft SQL Server\MSAS??.Instance_Name\OLAP\Config

Other way to find this path is:

a)      Go to Services Console (Services.msc)

b)      Find SSAS Service

c)       Right click SSAS Service and go to its properties

d)       In General  section look for Path to executable

e)      Below Path to executable you will find complete path which looks something like this for default configuration –

“C:\Program Files\Microsoft SQL Server\MSAS11.TABULAR\OLAP\bin\msmdsrv.exe” -s “C:\Program Files\Microsoft SQL Server\MSAS11.TABULAR\OLAP\Config”

f)        Second part of the Path got location of Config directory where you will find msmdsrv.ini file

In case you want to rebuild this msmdsrv.ini:

a) STOP SQL Analysis Services

b) Go to SSAS Config folder (Default Path – ?:\Program Files\Microsoft SQL Server\MSAS??.Instance_Name\OLAP\Config)

c) In config folder you will find two files msmdsrv.ini and msmdsrv.bak, move these two files in any temporary folder

d) Start SSAS

e) After starting SSAS service you will find copy of newly created msmdsrv.ini and msmdsrv.bak files in config folder

f) File is created with default settings for all properties

e) To start with you can provide right path of Data, Temp, Log and Backup Directory, otherwise root directly will be used for these folders. In a newly created msmdsrv.ini file, path looks something like this – <DataDir>\data</DataDir> which means use root directory for creating Data Folder

Thanks to this Reference.

So long AdventureWorks, and hello Wide World Importers!

Any future development on Microsoft sample databases should use Wide World Importers instead of AdventureWorks since it will not be updated or maintained after 2016.

The Wide World Importers can be downloaded from here.

Can I still use AdventureWorks with SQL Server 2016?

Yes, the final AdventureWorks build will be updated on the official download site here and you can still use AdventureWorks with your SQL Server 2016 instance.

Microsoft Reference