000 05276cam a2200649Ma 4500
001 on1317436663
003 OCoLC
005 20240523125544.0
006 m o d
007 cr |||||||||||
008 060731s2007 njua ob 001 0 eng d
010 _z 2006025099
040 _aSFB
_beng
_cSFB
_dOCLCF
_dOCLCQ
_dOCLCO
_dOCLCL
020 _a1280901039
020 _a9781280901034
020 _a9786610901036
020 _a6610901031
020 _a0470108096
020 _a9780470108093
020 _a0470108088
020 _a9780470108086
035 _a(OCoLC)1317436663
050 4 _aQA76.9.D343
_bM38 2007
082 0 4 _a005.74
049 _aMAIN
100 1 _aMarkov, Zdravko,
_d1956-
245 1 0 _aData mining the Web
_h[electronic resource] :
_buncovering patterns in Web content, structure, and usage /
_cZdravko Markov and Daniel T. Larose.
260 _aHoboken, N.J. :
_bWiley-Interscience,
_cc2007.
300 _a1 online resource (236 p.).
336 _atext
_btxt
337 _acomputer
_bc
338 _aonline resource
_bcr
490 1 _aWiley series on methods and applications in data mining
500 _aDescription based upon print version of record.
505 0 _aDATA MINING THE WEB; CONTENTS; PREFACE; ACKNOWLEDGMENTS; PART I WEB STRUCTURE MINING; 1 INFORMATION RETRIEVAL AND WEB SEARCH; Web Challenges; Web Search Engines; Topic Directories; Semantic Web; Crawling the Web; Web Basics; Web Crawlers; Indexing and Keyword Search; Document Representation; Implementation Considerations; Relevance Ranking; Advanced Text Search; Using the HTML Structure in Keyword Search; Evaluating Search Quality; Similarity Search; Cosine Similarity; Jaccard Similarity; Document Resemblance; References; Exercises; 2 HYPERLINK-BASED RANKING; Introduction
505 8 _aSocial Networks Analysis; PageRank; Authorities and Hubs; Link-Based Similarity Search; Enhanced Techniques for Page Ranking; References; Exercises; PART II WEB CONTENT MINING; 3 CLUSTERING; Introduction; Hierarchical Agglomerative Clustering; k-Means Clustering; Probability-Based Clustering; Finite Mixture Problem; Classification Problem; Clustering Problem; Collaborative Filtering (Recommender Systems); References; Exercises; 4 EVALUATING CLUSTERING; Approaches to Evaluating Clustering; Similarity-Based Criterion Functions; Probabilistic Criterion Functions
505 8 _aMDL-Based Model and Feature Evaluation; Minimum Description Length Principle; MDL-Based Model Evaluation; Feature Selection; Classes-to-Clusters Evaluation; Precision, Recall, and F-Measure; Entropy; References; Exercises; 5 CLASSIFICATION; General Setting and Evaluation Techniques; Nearest-Neighbor Algorithm; Feature Selection; Naive Bayes Algorithm; Numerical Approaches; Relational Learning; References; Exercises; PART III WEB USAGE MINING; 6 INTRODUCTION TO WEB USAGE MINING; Definition of Web Usage Mining; Cross-Industry Standard Process for Data Mining; Clickstream Analysis
505 8 _aWeb Server Log Files; Remote Host Field; Date/Time Field; HTTP Request Field; Status Code Field; Transfer Volume (Bytes) Field; Common Log Format; Identification Field; Authuser Field; Extended Common Log Format; Referrer Field; User Agent Field; Example of a Web Log Record; Microsoft IIS Log Format; Auxiliary Information; References; Exercises; 7 PREPROCESSING FOR WEB USAGE MINING; Need for Preprocessing the Data; Data Cleaning and Filtering; Page Extension Exploration and Filtering; De-Spidering the Web Log File; User Identification; Session Identification; Path Completion
505 8 _aDirectories and the Basket Transformation; Further Data Preprocessing Steps; References; Exercises; 8 EXPLORATORY DATA ANALYSIS FOR WEB USAGE MINING; Introduction; Number of Visit Actions; Session Duration; Relationship between Visit Actions and Session Duration; Average Time per Page; Duration for Individual Pages; References; Exercises; 9 MODELING FOR WEB USAGE MINING: CLUSTERING, ASSOCIATION, AND CLASSIFICATION; Introduction; Modeling Methodology; Definition of Clustering; The BIRCH Clustering Algorithm; Affinity Analysis and the A Priori Algorithm
500 _aDiscretizing the Numerical Variables: Binning.
520 _aThis book introduces the reader to methods of data mining on the web, including uncovering patterns in web content (classification, clustering, language processing), structure (graphs, hubs, metrics), and usage (modeling, sequence analysis, performance).
546 _aEnglish.
504 _aIncludes bibliographical references and index.
590 _aJohn Wiley and Sons
_bWiley Online Library: Complete oBooks
650 0 _aData mining.
650 0 _aWeb databases.
650 6 _aExploration de données (Informatique)
650 6 _aBases de données sur le Web.
650 7 _aData mining
_2fast
650 7 _aWeb databases
_2fast
700 1 _aLarose, Daniel T.
758 _ihas work:
_aData mining the Web (Text)
_1https://id.oclc.org/worldcat/entity/E39PCGmjGK3FvHrf8jGBghfrdP
_4https://id.oclc.org/worldcat/ontology/hasWork
776 _z0-471-66655-6
830 0 _aWiley series on methods and applications in data mining.
856 4 0 _uhttps://onlinelibrary.wiley.com/doi/book/10.1002/0470108096
994 _a92
_bINLUM
999 _c12890
_d12890