
Interactive prior elicitation of feature similarities for small sample size prediction

Regression under the "small n, large p" condition, in which the learning data set has a small sample size n and a large number of features p, is a recurring setting in which learning from data is difficult. With prior knowledge about relationships among the features, p can effectively be reduced, but making such prior knowledge explicit is difficult for experts. In this paper we introduce a new method for eliciting expert prior knowledge about the similarity of the roles of features in the prediction task. The key idea is to use an interactive multidimensional-scaling (MDS) type scatterplot display of the features to elicit the similarity relationships, and then to use the elicited relationships in the prior distribution of the prediction parameters.
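As a rough illustration of how elicited similarities could enter a prior, the sketch below turns 2D feature positions from a scatterplot into a covariance matrix for the regression weights. The squared-exponential kernel, the function name `prior_covariance`, the `length_scale` parameter, and the example positions are all assumptions made for illustration; the abstract does not specify the form of the prior.

```python
import math


def prior_covariance(positions, length_scale=1.0):
    """Squared-exponential covariance between features, computed from 2D positions.

    Assumption: features placed close together on the display should have
    strongly correlated regression weights a priori.
    """
    n = len(positions)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            cov[i][j] = math.exp(-sq_dist / (2.0 * length_scale ** 2))
    return cov


# Hypothetical expert layout: features 0 and 1 sit close together,
# feature 2 is placed far away from both.
positions = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0)]
cov = prior_covariance(positions)
```

Under this assumed kernel, the weights of features 0 and 1 are almost perfectly correlated in the prior, while feature 2 is nearly independent of both, so the effective number of free parameters is smaller than p.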

Simulated annealing least squares twin support vector machine (SA-LSTSVM) for pattern classification

Least squares twin support vector machine (LSTSVM) is a relatively new variant of the support vector machine (SVM) based on non-parallel twin hyperplanes. Although LSTSVM is an extremely efficient and fast algorithm for binary classification, its parameters depend on the nature of the problem. These problem-dependent parameters make it very difficult to tune the algorithm to its best parameter values, which affects its accuracy. The goal of this paper is to improve the accuracy of the LSTSVM algorithm by hybridizing it with simulated annealing.
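A minimal sketch of simulated annealing used as a parameter tuner is given below. It is not the paper's SA-LSTSVM procedure: the objective `toy_validation_error` is a hypothetical stand-in for the validation error of a trained LSTSVM, and the step size, cooling schedule, and log-scale parameterization of two penalty parameters are assumptions.

```python
import math
import random


def simulated_annealing(objective, start, step=0.5, t0=1.0, cooling=0.95,
                        iters=500, seed=0):
    """Minimize objective(params) by simulated annealing with geometric cooling."""
    rng = random.Random(seed)
    current = list(start)
    current_cost = objective(current)
    best, best_cost = list(current), current_cost
    temperature = t0
    for _ in range(iters):
        # Propose a neighbour by perturbing every parameter with Gaussian noise.
        candidate = [v + rng.gauss(0.0, step) for v in current]
        cost = objective(candidate)
        delta = cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
        temperature *= cooling
    return best, best_cost


def toy_validation_error(log_params):
    """Hypothetical stand-in for LSTSVM validation error over (log10 c1, log10 c2)."""
    log_c1, log_c2 = log_params
    return (log_c1 - 1.0) ** 2 + (log_c2 + 0.5) ** 2


start = [0.0, 0.0]
best, best_cost = simulated_annealing(toy_validation_error, start)
```

In a real setting the objective would train the classifier with the candidate parameters and return a cross-validated error, which makes each evaluation far more expensive than in this toy example.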

Visualizations relevant to the user by multi-view latent variable factorization

A main goal of data visualization is to find, from among all the available alternatives, mappings to the 2D/3D display which are relevant to the user. Assuming user interaction data, or other auxiliary data about the items or their relationships, the goal is to identify which aspects in the primary data support the user's input and, equally importantly, which aspects of the user's potentially noisy input have support in the primary data. To solve this problem, we introduce a multi-view embedding in which a latent factorization identifies which aspects in the two data views (primary data and user data) are related and which are specific to only one of them.

A novel bat algorithm based on chaos for optimization tasks

The Bat Algorithm (BA) is a recent meta-heuristic optimization algorithm that has developed rapidly and has been applied to a variety of optimization tasks in recent years. In this paper, an improved version of the Bat Algorithm with chaos is presented. The approach substitutes chaotic sequences for the random number generator (RNG) in parameter initialization.
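The substitution of chaotic sequences for an RNG can be sketched as follows. The logistic map is used here only as one common choice of chaotic sequence; the abstract does not say which maps the paper uses, and the function names, the starting value `x0`, and the search bounds are assumptions for illustration.

```python
def logistic_map_sequence(n, x0=0.7, r=4.0):
    """Return n values from the logistic map x <- r * x * (1 - x)."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_init(n_bats, dim, lower, upper, x0=0.7):
    """Place n_bats initial positions in [lower, upper]^dim using chaotic values."""
    seq = logistic_map_sequence(n_bats * dim, x0=x0)
    return [
        [lower + (upper - lower) * seq[i * dim + j] for j in range(dim)]
        for i in range(n_bats)
    ]


# Hypothetical setup: 5 bats in a 2D search space bounded by [-10, 10].
positions = chaotic_init(n_bats=5, dim=2, lower=-10.0, upper=10.0)
```

The starting value should avoid points such as 0.25, 0.5, and 0.75, from which the logistic map with r = 4 collapses to a fixed point instead of producing a chaotic sequence.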