Introduction to HDFS, Detailed Edition (PPT slides)
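Introduction to HDFS
杨林三, 2012-08-28

Outline
- File systems
- HDFS
- Experiments

File systems
- File system overview
- File system data structures
- File system workflow
- Linux Ext2
- Linux Ext3
- NFS
- Summary

Hadoop

HDFS goals: targets and characteristics
- Very large files: GB, TB, PB
- Millions of files
- 10K nodes
- Streaming data access
  - Write once, read many times
  - Good data coherency
  - Time to read the whole dataset matters more than time to seek
- Commodity hardware
  - Cheap
  - Node failure: the chance of a failure somewhere in a large cluster is high
  - Redundancy through replicas
  - Failure handling and recovery

HDFS goals: targets and characteristics (continued)
- Optimized for batch processing
  - Move computation rather than data
  - Locations of data are exposed to the computation
- Not good for:
  - Low-latency data access (milliseconds): HDFS favors high throughput over low latency
  - Lots of small files: metadata may run out of memory, and seek time dominates read time
  - Multiple writers or arbitrary file modification: a file has a single writer, writes go at the end, and data is written once and read many times

HDFS architecture: master and workers
- NameNode (master, a single one)
  - File namespace: directories and files
  - Metadata held in memory
  - Mappings: filename to blocks, blocks to datanodes
  - Cluster configuration
  - Replication management
- DataNodes (workers, many)
  - A block server
  - Store blocks in the local file system
  - Store metadata of blocks
  - Serve data to clients
  - Facilitate the pipeline to other datanodes

HDFS architecture: data storage
- Blocks (chunks)
  - Block size: 64, 128 or 256 MB, configurable
  - Why so big? To make transfer time dominate seek time. With a seek time of about 10 ms and a transfer rate of about 100 MB/s, keeping seek time at 1% of transfer time gives a block size of about 100 MB
  - The blocks of a file are distributed over as many datanodes as possible
  - A single file consisting of N blocks is stored on M nodes

NameNode
- Keeps a log of creations, deletions and renames in the namespace
- The namespace tree is cached in RAM and stored permanently in the FSImage
- The edit log records open, close and rename operations on files and directories, etc.

Secondary NameNode: why?
- Periodically merges the edits and the fsimage
- Prevents the edit log from growing too large
- Provides a checkpoint of the NameNode's fsimage as a backup in case the NameNode crashes

NameNode high availability
- FsImage and EditLog are central data structures of HDFS. A corruption of these files can cause an HDFS instance to become non-functional.
- For this reason, a NameNode can be configured to maintain multiple copies of the FsImage and EditLog.
- Multiple copies of the FsImage and EditLog files are updated synchronously.
- Metadata is not data-intensive.

NameNode safe mode
- Merge FSImage and edits
- Enter safe mode
  - Offers a read-only view of the file system to clients
- Collect block-location information
  - Block-location information is not stored in the fsimage
  - Datanodes check in to report their blocks
- Check the minimal replication condition
  - 99.9% of blocks meet the minimum replication level (configurable, default 1)
- Exit safe mode once the minimal replication condition is reached
- Replicate blocks if necessary
  - Blocks whose replica count is less than 3

HDFS operations
- Metadata operations
  - Communicate with the namenode only
  - ls, lsr, df, du, chmod, chown
- Read/write (block) operations
  - Communicate with the namenode and the datanodes
  - put, copyFromLocal, copyToLocal, tail

Data writing

Data reading

Replica placement
- Critical to reliability and performance
- How replicas are placed
  - Replication factor is 3:
    - 1 on a node in the local rack
    - 1 on a different node in the local rack
    - 1 on a node in a different rack
  - Replication factor greater than 3:
    - 1/3 of the replicas on a node
    - 1/3 on a different node in the same rack
    - 1/3 distributed evenly across the remaining racks
- Replica selection
  - The nearest replica is used for reads

Network topology
- Characteristics
  - 30 to 40 nodes per rack
  - A 1 GB switch for the rack
  - Uplink to a core switch or router (1 GB)
  - Aggregate bandwidth between nodes on the same rack is much higher than between nodes on different racks
- Bandwidth and distance
  - Processes on the same node
  - Different nodes on the same rack
  - Nodes on different racks in the same data center
  - Nodes in different data centers

DataNode failure
- Disk failure
  - The DataNode keeps serving normally
  - The NameNode is notified as soon as possible about the data on the failed disk
- Machine down
  - Q: How does the NameNode know that a DataNode has gone down?
  - A: Each datanode sends a heartbeat to the namenode every 3 seconds. If a datanode has sent no heartbeat for 10 minutes, the namenode considers it dead, looks up the blocks that were stored on it, and re-replicates them.

HDFS compression

Experiment: copying data
  hadoop fs -copyFromLocal home hduser0 hadoop practice data ntes logs news 20120813 23 log gz data input ntes logs
  (the path separators are missing in the source text)

Experiment: data blocks

Experiment: LineCount

Follow-up
- LZO compression format
- Experiment: log merging
  - Original Perl workflow
    - Multiple js servers, rsync, merge machine
    - Merging: read 2 minutes of logs, sort by time
    - URL regex matching: multiple regular expressions
  - Possible Hadoop approach
    - The js servers can act as DataNodes, putting the raw logs into HDFS in LZO format
    - No need to read only 2 minutes of logs at a time
    - No need to sort: use the time as the Map output key
    - Improve regex matching performance
      - Java vs Perl
      - Multi-pattern matching (state machine)

References
- 邓侃, 北航云计算公开课讲义 (Beihang University open course on cloud computing, lecture notes)
- Google File System
- Tom White, Hadoop: The Definitive Guide
- http://hadoop.apache.org

The end
Thank you. Questions and discussion are welcome.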
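A supplementary note on the block-size slide: the reasoning there (seek time of about 10 ms, transfer rate of about 100 MB/s, seek time kept at about 1% of transfer time) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original slides. The function names `block_size_for_seek_ratio` and `seek_overhead` are hypothetical helpers introduced only for illustration, and the numbers are the rough figures quoted on the slide rather than measurements.

```python
def block_size_for_seek_ratio(seek_time_s, transfer_rate_mb_s, seek_ratio):
    """Block size (MB) at which seek_time / transfer_time equals seek_ratio.

    Hypothetical helper: it only restates the slide's arithmetic.
    """
    # Transfer time needed so that the seek is `seek_ratio` of it.
    transfer_time_s = seek_time_s / seek_ratio
    # Data moved in that time at the given sustained rate.
    return transfer_rate_mb_s * transfer_time_s


def seek_overhead(block_size_mb, seek_time_s, transfer_rate_mb_s):
    """Seek time as a fraction of the time spent transferring one block."""
    transfer_time_s = block_size_mb / transfer_rate_mb_s
    return seek_time_s / transfer_time_s


if __name__ == "__main__":
    SEEK_TIME_S = 0.010         # about 10 ms, figure from the slide
    TRANSFER_RATE_MB_S = 100.0  # about 100 MB/s, figure from the slide

    # Block size that keeps the seek at 1% of the transfer time.
    print(block_size_for_seek_ratio(SEEK_TIME_S, TRANSFER_RATE_MB_S, 0.01))

    # Relative seek overhead for the block sizes listed on the slide.
    for size_mb in (64, 128, 256):
        print(size_mb, seek_overhead(size_mb, SEEK_TIME_S, TRANSFER_RATE_MB_S))
```

Under these assumptions the larger block sizes listed on the slide spend proportionally less time seeking, which is the stated motivation for large blocks. A much smaller block, such as the few kilobytes typical of a local file system, would make the seek dominate.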