Wine recognition data set

 

Task: classification

Number of instances: 178

Number of attributes: 13 (numerical)

Type of attribute to be predicted: discrete with 3 classes

Download the data: DataWine

 

These data come from the chemical analysis of 178 wines from 3 different producers in the same area of Italy. The objective is to extract models that identify the producer from the content of the following components: Alcohol, Malic acid, Ash, Alcalinity of ash, Magnesium, Total phenols, Flavanoids, Nonflavanoid phenols, Proanthocyanins, Color intensity, Hue, OD280/OD315 of diluted wines, Proline.

Sources: Forina, M. et al., PARVUS - An Extendible Package for Data Exploration, Classification and Correlation. Institute of Pharmaceutical and Food Analysis and Technologies, Via Brigata Salerno, 16147 Genoa, Italy. Data obtained from the UCI Machine Learning Repository.

 

Model with 1 variable

The most accurate model that uses a single explanatory variable is based on the Flavanoids content:

* If (Flavanoids is lower than 1) then (Class is rather 3)

* If (Flavanoids is higher than 2.5) then (Class is rather 1)

* Otherwise (Class is rather 2)
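
As a rough illustration, the three rules can be read as crisp thresholds. This is only a sketch under that assumption: the rules above are fuzzy ("rather"), so a hard-threshold version will not necessarily reproduce the reported classification results. The function name and the sample values are hypothetical and are not taken from the data set.

```python
def classify_by_flavanoids(flavanoids: float) -> int:
    """Crisp reading of the one-variable rules; returns the class (1, 2 or 3)."""
    if flavanoids < 1.0:      # "Flavanoids is lower than 1"
        return 3
    if flavanoids > 2.5:      # "Flavanoids is higher than 2.5"
        return 1
    return 2                  # "Otherwise"


# Illustrative values only, not rows from the data set.
for value in (0.6, 1.8, 3.1):
    print(value, "->", classify_by_flavanoids(value))
```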

 

This model correctly classifies 148 of the 178 samples (83%). It can be plotted (red curve) against the experimental data (green points):

 

 

Model with 2 variables

This model involves a second variable: the Proline content. It is similar to the first model, but its second rule tests Proline instead of Flavanoids:

* If (Flavanoids is lower than 1) then (Class is rather 3)

* If (Proline is higher than 800) then (Class is rather 1)

* Otherwise (Class is rather 2)
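
The sketch below shows one way to count correctly classified samples with a crisp version of these rules. It assumes the rules are applied in the order listed (the first matching rule wins), which is an interpretation rather than something the rules state. The helper names and the toy samples are hypothetical; the samples are not rows from the data set.

```python
from typing import Iterable, Tuple


def classify_two_variables(flavanoids: float, proline: float) -> int:
    """Crisp reading of the two-variable rules, checked in the listed order."""
    if flavanoids < 1.0:
        return 3
    if proline > 800.0:
        return 1
    return 2


def count_correct(samples: Iterable[Tuple[float, float, int]]) -> int:
    """Count samples (flavanoids, proline, true_class) that the rules get right."""
    return sum(
        1
        for flavanoids, proline, true_class in samples
        if classify_two_variables(flavanoids, proline) == true_class
    )


# Toy samples for illustration only.
toy_samples = [
    (0.7, 600.0, 3),
    (2.9, 1050.0, 1),
    (2.0, 500.0, 2),
    (2.4, 850.0, 2),  # the crisp rules assign class 1 here, so this one is missed
]
correct = count_correct(toy_samples)
print(f"{correct} of {len(toy_samples)} correct")
```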

 

This model correctly classifies 163 of the 178 samples (about 92%). The following graph illustrates it (the experimental data are the white triangles):

 

 

Model with 3 variables

* If (Flavanoids is lower than 1) then (Class is rather 3)

* If (Proline is higher than 800) then (Class is rather 1)

* If (Color intensity is lower than 2) then (Class is rather 2)
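
Because each rule here points to a different class and there is no "otherwise" rule, a fuzzy reading is more natural than hard thresholds. The sketch below gives each rule a degree of truth between 0 and 1 and picks the class whose rule is most strongly activated. The linear ramp shape and its widths are assumptions chosen only so that each degree equals 0.5 at the threshold named in the rule; they are not the membership functions used by the software.

```python
def ramp_up(x: float, low: float, high: float) -> float:
    """Degree rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def ramp_down(x: float, low: float, high: float) -> float:
    """Degree falling linearly from 1 at `low` to 0 at `high`."""
    return 1.0 - ramp_up(x, low, high)


def rule_degrees(flavanoids: float, proline: float, color_intensity: float) -> dict:
    """Activation of each rule, keyed by the class it supports (assumed ramps)."""
    return {
        3: ramp_down(flavanoids, 0.5, 1.5),       # "Flavanoids is lower than 1"
        1: ramp_up(proline, 600.0, 1000.0),       # "Proline is higher than 800"
        2: ramp_down(color_intensity, 1.0, 3.0),  # "Color intensity is lower than 2"
    }


def classify_fuzzy(flavanoids: float, proline: float, color_intensity: float) -> int:
    """Return the class whose rule has the highest activation."""
    degrees = rule_degrees(flavanoids, proline, color_intensity)
    return max(degrees, key=degrees.get)


# Illustrative values only, not rows from the data set.
print(classify_fuzzy(0.6, 500.0, 5.0))
print(classify_fuzzy(3.0, 1100.0, 5.5))
print(classify_fuzzy(2.0, 450.0, 1.5))
```

With this reading, a sample near a threshold activates its rule only partially, which is one way to interpret the word "rather" in the rules.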

 

This model correctly classifies 175 of the 178 samples (98%). The following graph is a "4D" representation of this model:

 

 

Model with 4 variables (full classification)

The following model correctly classifies all 178 instances of the data set:

 

* If (Flavanoids is lower than 0.5) and (Color intensity is higher than 4) then (Class is rather 3)

* If (Alcohol is higher than 12.5) and (Color intensity is higher than 4) and (Proline is higher than 600) then (Class is rather 1)

* If (Alcohol decreases) then (Class is rather 2)
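
Rules that combine several conditions with "and" can be sketched with the minimum operator, a common choice for fuzzy conjunction. Everything numeric below is an assumption made for illustration: the ramp widths are centered on the thresholds named in the rules, and "Alcohol decreases" is read as a degree that falls as Alcohol rises over an assumed range of 11 to 14. This is not the model computed by the software and is not expected to reproduce its full classification.

```python
def ramp_up(x: float, low: float, high: float) -> float:
    """Degree rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def rule_degrees(alcohol: float, flavanoids: float,
                 color_intensity: float, proline: float) -> dict:
    """Activation of each rule, keyed by class; 'and' is taken as min()."""
    high_color = ramp_up(color_intensity, 3.0, 5.0)   # "higher than 4"
    low_flavanoids = 1.0 - ramp_up(flavanoids, 0.0, 1.0)  # "lower than 0.5"
    high_alcohol = ramp_up(alcohol, 12.0, 13.0)       # "higher than 12.5"
    high_proline = ramp_up(proline, 400.0, 800.0)     # "higher than 600"
    return {
        3: min(low_flavanoids, high_color),
        1: min(high_alcohol, high_color, high_proline),
        2: 1.0 - ramp_up(alcohol, 11.0, 14.0),        # "Alcohol decreases" (assumed range)
    }


def classify_fuzzy(alcohol: float, flavanoids: float,
                   color_intensity: float, proline: float) -> int:
    """Return the class whose rule has the highest activation."""
    degrees = rule_degrees(alcohol, flavanoids, color_intensity, proline)
    return max(degrees, key=degrees.get)


# Illustrative values only, not rows from the data set.
print(classify_fuzzy(13.5, 3.0, 5.5, 1100.0))
print(classify_fuzzy(12.0, 2.0, 2.5, 450.0))
print(classify_fuzzy(13.0, 0.3, 7.0, 600.0))
```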

 

 

 
 

BLIASoft Knowledge Discovery - Data mining & predictive analytics software - Fuzzy logic & artificial intelligence

2007-2017 BLIASOLUTIONS - All rights reserved