Advances in Knowledge Discovery and Data Mining: 8th by Honghua Dai, Ramakrishnan Srikant, Chengqi Zhang

This book constitutes the refereed proceedings of the 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2004, held in Sydney, Australia in May 2004.

The 50 revised full papers and 31 revised short papers presented were carefully reviewed and selected from a total of 238 submissions. The papers are organized in topical sections on classification; clustering; association rules; novel algorithms; event mining, anomaly detection, and intrusion detection; ensemble learning; Bayesian network and graph mining; text mining; multimedia mining; text mining and Web mining; statistical methods, sequential data mining, and time series mining; and biomedical data mining.

Multi-labeled classification has also been attempted using generative models, although discriminative methods are known to be more accurate. McCallum [9] gives a generative model in which each document is probabilistically generated by all of its topics, represented as a mixture model trained using EM. The number of class sets that can generate each document is exponential, so a few heuristics are required to search only a subset of the class space efficiently. The Aspect model [10] is another generative model that could naturally be employed for multi-labeled classification, though no current work exists.
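To make the size of the class space concrete, the following minimal sketch counts the non-empty label sets over a handful of classes and shows how capping the set size shrinks the search. The class names and the size cap are illustrative assumptions; the cap is not one of the heuristics used in [9], which the text does not describe.

```python
from itertools import combinations


def candidate_label_sets(classes, max_size):
    """Enumerate non-empty label sets containing at most max_size classes."""
    sets = []
    for size in range(1, max_size + 1):
        sets.extend(combinations(classes, size))
    return sets


# Hypothetical class names, chosen only for illustration.
classes = ["sports", "politics", "finance", "health"]

# Every non-empty subset: 2**n - 1 candidates for n classes.
all_sets = candidate_label_sets(classes, len(classes))

# Assumed pruning rule: consider only label sets of size one or two.
small_sets = candidate_label_sets(classes, 2)

print(len(all_sets))    # 15
print(len(small_sets))  # 10
```

With 4 classes the gap is small, but the full count doubles with every added class, while the capped count grows only polynomially.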

Discriminative Methods for Multi-labeled Classification. 22–30, 2004. © Springer-Verlag Berlin Heidelberg 2004

In this paper, we present algorithms that use existing discriminative classification techniques as building blocks to perform better multi-labeled classification. We propose two enhancements to existing discriminative methods. First, we present a new algorithm that exploits correlation between related classes in the label sets of documents. It combines text features with information about relationships between classes by constructing a new kernel for SVMs with heterogeneous features.
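One way to picture a kernel over heterogeneous features is a weighted sum of two kernels, one on the text features and one on class-related features. The sketch below is an illustration under that assumption and is not the kernel defined in the paper, whose exact form this excerpt does not give. The function names, the `weight` parameter, and the toy vectors are all hypothetical. A weighted sum with non-negative weights of two valid kernels is itself a valid kernel, so the result could be passed to any SVM implementation that accepts a precomputed Gram matrix.

```python
def linear_kernel(u, v):
    """Plain dot product between two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def combined_kernel(x, y, weight=0.5):
    """Weighted sum of a text kernel and a class-feature kernel.

    Each document is a (text_features, class_features) pair. The weighting
    scheme is an assumption made for this sketch.
    """
    text_x, class_x = x
    text_y, class_y = y
    return (weight * linear_kernel(text_x, text_y)
            + (1 - weight) * linear_kernel(class_x, class_y))


def gram_matrix(docs, weight=0.5):
    """Pairwise kernel values for a list of documents."""
    return [[combined_kernel(a, b, weight) for b in docs] for a in docs]


# Toy documents: term weights plus hypothetical per-class indicator features.
docs = [
    ([1.0, 0.0, 2.0], [1.0, 0.0]),
    ([0.0, 1.0, 1.0], [1.0, 1.0]),
]

K = gram_matrix(docs, weight=0.5)
print(K)  # [[3.0, 1.5], [1.5, 2.0]]
```

Setting `weight` to 1.0 recovers a text-only kernel, which makes it easy to compare against a baseline that ignores class relationships.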

