Download Advances in Data Mining. Applications and Theoretical Aspects by Petra Perner PDF

By Petra Perner

ISBN-10: 3319089757

ISBN-13: 9783319089751

ISBN-10: 3319089765

ISBN-13: 9783319089768

This book constitutes the refereed proceedings of the 14th Industrial Conference on Advances in Data Mining, ICDM 2014, held in St. Petersburg, Russia, in July 2014. The 16 revised full papers presented were carefully reviewed and selected from numerous submissions. The topics range from theoretical aspects of data mining to applications of data mining, such as in multimedia data, in marketing, in medicine and agriculture, in process control, and in society.

Read or Download Advances in Data Mining. Applications and Theoretical Aspects: 14th Industrial Conference, ICDM 2014, St. Petersburg, Russia, July 16-20, 2014. Proceedings PDF

Similar data mining books

Advanced Methods for Knowledge Discovery from Complex Data

This book brings together research articles by active practitioners and leading researchers reporting recent advances in the field of knowledge discovery. An overview of the field and of the issues and challenges involved is followed by coverage of recent trends in data mining. This provides the context for the subsequent chapters on methods and applications.

Crowdsourcing Geographic Knowledge: Volunteered Geographic Information (VGI) in Theory and Practice

The phenomenon of volunteered geographic information is part of a profound transformation in how geographic data, information, and knowledge are produced and circulated. By situating volunteered geographic information (VGI) in the context of the big-data deluge and data-intensive inquiry, the 20 chapters in this book explore both the theories and applications of crowdsourcing for geographic knowledge production, with three sections focusing on 1).

Big Data: Related Technologies, Challenges and Future Prospects

This SpringerBrief provides a comprehensive overview of the background and recent developments of big data. The value chain of big data is divided into four phases: data generation, data acquisition, data storage and data analysis. For each phase, the book introduces the general background, discusses technical challenges and reviews the latest advances.

Additional info for Advances in Data Mining. Applications and Theoretical Aspects: 14th Industrial Conference, ICDM 2014, St. Petersburg, Russia, July 16-20, 2014. Proceedings

Example text

[...] proposed in [2], since this method has been proven to be highly efficient. The details of our method are as follows. First, we sort the shingle list obtained in phase 1 by shingle-hash value. The result is the list L of <shingle hash, contentID> pairs (Lines 3-5). After that, we generate a list of all the pairs of contents that share any shingles, along with the number of shingles they have in common. To do this, we expand L into a list of triplets by taking each shingle that appears in multiple contents and generating the complete set of triplets for that shingle.
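The sort-and-expand step described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `count_shared_shingles` is hypothetical, and the triplet layout `(content_a, content_b, 1)` is assumed because the excerpt does not spell out what a triplet contains.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_shared_shingles(pairs):
    """Count shared shingles for every pair of contents.

    `pairs` is an iterable of (shingle_hash, content_id) tuples.
    Returns a dict mapping (content_a, content_b) to the number of
    shingles the two contents have in common.
    """
    # Sort by shingle hash so identical shingles become adjacent (the list L).
    # Duplicates are dropped so a shingle counts once per content.
    sorted_pairs = sorted(set(pairs))

    # Collect the contents in which each shingle appears.
    contents_by_shingle = defaultdict(list)
    for shingle_hash, content_id in sorted_pairs:
        contents_by_shingle[shingle_hash].append(content_id)

    # Expand every shingle that occurs in several contents into triplets.
    # Assumed triplet layout: (content_a, content_b, 1).
    triplets = []
    for content_ids in contents_by_shingle.values():
        for a, b in combinations(content_ids, 2):
            triplets.append((a, b, 1))

    # Sum the triplets to get the shared-shingle count per content pair.
    shared = Counter()
    for a, b, count in triplets:
        shared[(a, b)] += count
    return dict(shared)
```

Because the pairs are sorted before grouping, each content pair is always emitted in the same order, so `("A", "B")` and `("B", "A")` never appear as separate keys.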

F score is the weighted harmonic mean of P and R, which reflects the average effect of both precision and recall.

B. Gao and Q. Fan

[Table 2]

Experimental Results

We now present the experimental results of Web page clustering and classification. In this paper, we construct the SSOM of each Web site from all the crawled pages for that site. After labelling the sentences of 100 sampled Web pages, the entropy-threshold of each SSOM node is determined automatically.

Clustering. We stress that the experimental procedure we use is the same as in [9], but the sets of pages used are not the same.
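The F score mentioned earlier in this excerpt can be made concrete with a small Python sketch. It uses the standard weighted form F = (1 + β²)·P·R / (β²·P + R). The excerpt does not state which weight the authors use, so the default `beta=1.0` (the balanced F1 score) and the function name `f_score` are assumptions.

```python
def f_score(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily, beta < 1 weights precision
    more heavily, and beta = 1 gives the balanced F1 score.
    """
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    # Avoid dividing by zero when the denominator vanishes.
    if denominator == 0:
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator
```

For example, `f_score(0.5, 0.5)` is 0.5, and whenever precision and recall differ the balanced score lies closer to the smaller of the two, which is why it is preferred over the arithmetic mean.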

The ConstructTemplate algorithm is described in Algorithm 4, which is divided into four phases as follows: First, we produce a list of <shingle hash, contentID> pairs by the shingling algorithm, where a shingle hash is the hash value of a shingle and a contentID is the unique identification of the content the shingle appears in. Second, for each node, we cluster its segments by computing the Jaccard similarity coefficient of each pair of segments. Third, we identify template clusters by checking if the possibility is larger than the threshold F2.
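The first two phases above can be illustrated with a minimal Python sketch. It is not the ConstructTemplate algorithm itself. Word-level shingles of length `k=3`, the SHA-1 based `shingle_hash`, and all function names are assumptions chosen for illustration, since the excerpt does not specify them.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of `text` (word-level is assumed)."""
    words = text.split()
    if len(words) < k:
        # Texts shorter than k words yield a single shingle, or none if empty.
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def shingle_hash(shingle):
    """Map a shingle to a 64-bit integer (SHA-1 is an assumed hash choice)."""
    digest = hashlib.sha1(shingle.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shingle_pairs(contents, k=3):
    """Build the list of (shingle hash, contentID) pairs.

    `contents` maps a content ID to its text.
    """
    return [
        (shingle_hash(s), content_id)
        for content_id, text in contents.items()
        for s in shingles(text, k)
    ]


def jaccard(a, b):
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        return 1.0  # convention used here: two empty sets are identical
    return len(a & b) / len(a | b)
```

Two segments would then be placed in the same cluster when `jaccard(shingles(seg_a), shingles(seg_b))` exceeds a chosen similarity threshold. That threshold is a free parameter of this sketch and is not taken from the source.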
