Wednesday, July 21, 2010

A Survival Guide to Data Mining


A quick browse through the abundant resources available at Togaware turns up the Data Mining Desktop Survival Guide, a useful guide to the practical deployment, updating and refinement of data mining work, and to the algorithms and analytical tools available in the field. The guide is built on real examples of Rattle in use and, through it, on the typical functions of the well-known open-source software R.
The author, Dr. Graham Williams, has been a researcher and lecturer in data mining for over 15 years, with a long series of successful applications in research and in public institutions. He has taught data mining for over 10 years and has published numerous papers on the subject. He currently holds the post of Principal Data Miner at the Australian Taxation Office and is responsible for technical support of the biggest data mining group in Australia.
The guide is available online at: http://datamining.togaware.com/

Friday, July 2, 2010

Rattle: User interface for data mining with R


Rattle (R Analytical Tool To Learn Easily) is a simple and logical user interface to R for data mining. It is a data mining application with a Gnome graphical environment, built on the popular open-source language R. The software runs on GNU/Linux, Macintosh OS/X and MS/Windows. The interface provides practical and intuitive tools that let users easily follow every step of the fundamental data mining process, while also displaying the R code used at each step. The graphical tools integrated into it should be sufficient for every purpose and need.
Latest release available for download at: rattle.togaware.com
In this version, the user has the following sections available:
- Data: CSV import; R dataset support; ODBC
- Exploration: Summary; Correlations between variables; Hierarchical dendrogram of variable groups
- Graphics: Box plots, histograms, cumulative distribution (CDF), Benford's Law, bar charts, dot plots
- Cluster analysis: KMeans; Hierarchical clustering with dendrogram
- Modeling: Decision trees (rpart), Generalized Linear Models, Boosting, Random Forests, Support Vector Machines
- Evaluation: Confusion matrix, risk chart, lift chart, ROC curves and AUC, accuracy, sensitivity
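
For readers unfamiliar with the measures in the last section of the list, here is a minimal sketch of how a confusion matrix, accuracy, sensitivity and AUC can be computed for a binary classifier. It is not Rattle's code (Rattle generates R), and the function names and the sample data are hypothetical, chosen only for illustration; Python with the standard library is used to keep the example self-contained.

```python
# Illustrative sketch only: hypothetical helper names, not part of Rattle or R.

def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def accuracy(c):
    """Share of all cases that were classified correctly."""
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    return (c["tp"] + c["tn"]) / total if total else 0.0


def sensitivity(c):
    """Share of the actual positives that the model found (true positive rate)."""
    positives = c["tp"] + c["fn"]
    return c["tp"] / positives if positives else 0.0


def auc(actual, scores, positive=1):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for a, s in zip(actual, scores) if a == positive]
    neg = [s for a, s in zip(actual, scores) if a != positive]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels and model scores, for illustration only.
    actual = [1, 0, 1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.7, 0.8, 0.1]
    predicted = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

    counts = confusion_counts(actual, predicted)
    print(counts)
    print(accuracy(counts), sensitivity(counts), auc(actual, scores))
```

The pairwise definition of AUC used here is quadratic in the number of cases, which is fine for a small illustration; on large datasets a rank-based computation would be the usual choice.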

Ecological Ordination Methods


No discussion of ecological data analysis can ignore a very useful source of information provided by Oklahoma State University: the section of the university website better known to analysts as The Ordination Web Page, created and maintained entirely by Professor Michael W. Palmer.
Ordination is one of the most widely used methods for identifying relationships between ecological communities, writes Michael W. Palmer in his The Ordination Web Page. The page was designed to answer some of the most frequently asked questions, and is aimed particularly at students and novices. Here you can find a vast number of papers in which the different ordination techniques are described, compared, and discussed. The website covers everything from the simplest concepts (which, paradoxically, are also the hardest to find) to the more complex topics: a section dedicated to general descriptions and the most interesting references, a section dedicated exclusively to statistical methods, a section dedicated to software, and a section devoted to the processing and standardization of environmental/ecological data.