Hadoop MapReduce v2 Reference Guide, 2nd Edition (English Reprint)

Promotional price: ¥20.6 (32% of list price)
1-star member price: ¥29.4
2-star member price: ¥29.4
List price: ¥64.0

Note: Books priced below 50% of list are mostly publisher overstock. Most copies are brand new (with or without shrink wrap); a small number are in 80-90% condition, have a marker line on the trimmed edge, or are missing accessories such as discs.

Product Details
  • ISBN: 9787564160890
  • Binding/paper: standard offset paper
  • Volumes: N/A
  • Weight: N/A
  • Format: 16mo
  • Pages: 304
  • Publication date: 2016-01-01
  • Barcode: 9787564160890; 978-7-5641-6089-0

Highlights

Hadoop MapReduce v2 Reference Guide, 2nd Edition (English Reprint) opens with the installation of Hadoop YARN, MapReduce, HDFS, and other Hadoop ecosystem components. Guided by this book, you will quickly move on to many exciting topics, such as MapReduce patterns and using Hadoop for analytics, classification, online sales, recommendations, and data indexing and searching. You will also learn how to use Hadoop ecosystem projects including Hive, HBase, Pig, Mahout, Nutch, and Giraph, and how to deploy them in cloud environments.

Synopsis

The book opens with the installation of Hadoop YARN, MapReduce, HDFS, and other Hadoop ecosystem components. Guided by this book, you will quickly move on to many exciting topics, such as MapReduce patterns and using Hadoop for analytics, classification, online sales, recommendations, and data indexing and searching.

Table of Contents

Preface

Chapter 1: Getting Started with Hadoop v2
  • Introduction
  • Setting up Hadoop v2 on your local machine
  • Writing a WordCount MapReduce application, bundling it, and running it using the Hadoop local mode
  • Adding a combiner step to the WordCount MapReduce program
  • Setting up HDFS
  • Setting up Hadoop YARN in a distributed cluster environment using Hadoop v2
  • Setting up Hadoop ecosystem in a distributed cluster environment using a Hadoop distribution
  • HDFS command-line file operations
  • Running the WordCount program in a distributed cluster environment
  • Benchmarking HDFS using DFSIO
  • Benchmarking Hadoop MapReduce using TeraSort

Chapter 2: Cloud Deployments - Using Hadoop YARN on Cloud Environments
  • Introduction
  • Running Hadoop MapReduce v2 computations using Amazon Elastic MapReduce
  • Saving money using Amazon EC2 Spot Instances to execute EMR job flows
  • Executing a Pig script using EMR
  • Executing a Hive script using EMR
  • Creating an Amazon EMR job flow using the AWS Command Line Interface
  • Deploying an Apache HBase cluster on Amazon EC2 using EMR
  • Using EMR bootstrap actions to configure VMs for the Amazon EMR jobs
  • Using Apache Whirr to deploy an Apache Hadoop cluster in a cloud environment

Chapter 3: Hadoop Essentials - Configurations, Unit Tests, and Other APIs
  • Introduction
  • Optimizing Hadoop YARN and MapReduce configurations for cluster deployments
  • Shared user Hadoop clusters - using Fair and Capacity schedulers
  • Setting classpath precedence to user-provided JARs
  • Speculative execution of straggling tasks
  • Unit testing Hadoop MapReduce applications using MRUnit
  • Integration testing Hadoop MapReduce applications using MiniYarnCluster
  • Adding a new DataNode
  • Decommissioning DataNodes
  • Using multiple disks/volumes and limiting HDFS disk usage
  • Setting the HDFS block size
  • Setting the file replication factor
  • Using the HDFS Java API

Chapter 4: Developing Complex Hadoop MapReduce Applications
  • Introduction
  • Choosing appropriate Hadoop data types
  • Implementing a custom Hadoop Writable data type
  • Implementing a custom Hadoop key type
  • Emitting data of different value types from a Mapper
  • Choosing a suitable Hadoop InputFormat for your input data format
  • Adding support for new input data formats - implementing a custom InputFormat
  • Formatting the results of MapReduce computations - using Hadoop OutputFormats
  • Writing multiple outputs from a MapReduce computation
  • Hadoop intermediate data partitioning
  • Secondary sorting - sorting Reduce input values
  • Broadcasting and distributing shared resources to tasks in a MapReduce job - Hadoop DistributedCache
  • Using Hadoop with legacy applications - Hadoop Streaming
  • Adding dependencies between MapReduce jobs
  • Hadoop counters to report custom metrics

Chapter 5: Analytics
  • Introduction
  • Simple analytics using MapReduce
  • Performing GROUP BY using MapReduce
  • Calculating frequency distributions and sorting using MapReduce
  • Plotting the Hadoop MapReduce results using gnuplot
  • Calculating histograms using MapReduce
  • Calculating scatter plots using MapReduce
  • Parsing a complex dataset with Hadoop
  • Joining two datasets using MapReduce

Chapter 6: Hadoop Ecosystem - Apache Hive
  • Introduction
  • Getting started with Apache Hive
  • Creating databases and tables using Hive CLI
  • Simple SQL-style data querying using Apache Hive
  • Creating and populating Hive tables and views using Hive query results
  • Utilizing different storage formats in Hive - storing table data using ORC files
  • Using Hive built-in functions
  • Hive batch mode - using a query file
  • Performing a join with Hive
  • Creating partitioned Hive tables
  • Writing Hive user-defined functions (UDF)
  • HCatalog - performing Java MapReduce computations on data mapped to Hive tables
  • HCatalog - writing data to Hive tables from Java MapReduce computations

Chapter 7: Hadoop Ecosystem II - Pig, HBase, Mahout, and Sqoop
  • Introduction
  • Getting started with Apache Pig
  • Joining two datasets using Pig
  • Accessing a Hive table data in Pig using HCatalog
  • Getting started with Apache HBase
  • Data random access using Java client APIs
  • Running MapReduce jobs on HBase
  • Using Hive to insert data into HBase tables
  • Getting started with Apache Mahout
  • Running K-means with Mahout
  • Importing data to HDFS from a relational database using Apache Sqoop
  • Exporting data from HDFS to a relational database using Apache Sqoop

Chapter 8: Searching and Indexing
  • Introduction
  • Generating an inverted index using Hadoop MapReduce
  • Intradomain web crawling using Apache Nutch
  • Indexing and searching web documents using Apache Solr
  • Configuring Apache HBase as the backend data store for Apache Nutch
  • Whole web crawling with Apache Nutch using a Hadoop/HBase cluster
  • Elasticsearch for indexing and searching
  • Generating the in-links graph for crawled web pages

Chapter 9: Classifications, Recommendations, and Finding Relationships
  • Introduction
  • Performing content-based recommendations
  • Classification using the naive Bayes classifier
  • Assigning advertisements to keywords using the Adwords balance algorithm

Chapter 10: Mass Text Data Processing
  • Introduction
  • Data preprocessing using Hadoop Streaming and Python
  • De-duplicating data using Hadoop Streaming
  • Loading large datasets to an Apache HBase data store - importtsv and bulkload
  • Creating TF and TF-IDF vectors for the text data
  • Clustering text data using Apache Mahout
  • Topic discovery using Latent Dirichlet Allocation (LDA)
  • Document classification using Mahout naive Bayes classifier

Index
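For readers browsing the contents, Chapter 1's opening recipes cover writing a WordCount MapReduce application, bundling it, and adding a combiner step. Below is a minimal sketch of what such a job looks like against the standard Hadoop 2.x Java API; it follows the widely known Apache WordCount example rather than the book's own listing, and the class names (WordCount, TokenizerMapper, IntSumReducer) are illustrative.

// Minimal WordCount sketch, assuming Hadoop 2.x client libraries are on the classpath.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in each input line.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts for each word; also reusable as the combiner step.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // the combiner step covered in Chapter 1
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory on HDFS or local FS
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (must not exist)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Bundled into a JAR, a sketch like this can be run with "hadoop jar wordcount.jar WordCount <input> <output>" in either the local mode or a distributed cluster, which is the progression the early recipes follow.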