Big Data Fundamentals 046: fair

Published December 22, 2023 (author: uniforms是什么意思)

Hadoop Fair Scheduler Design Document

October 18, [...]

1 Introduction

The Hadoop Fair Scheduler [...] it has grown in functionality to support hierarchical scheduling, preemption, [...] This document [...]

The Fair Scheduler was designed with four main goals:

1. [...] Hadoop’s built-in FIFO scheduler, fair scheduling lets small jobs make progress even if a large job is running, [...]
2. [...] guaranteed service levels to “production” jobs, [...]
3. [...] to administer and configure. The scheduler should do something reasonable “out of the box,” and users should only need to configure [...]
4. [...] reconfiguration at runtime, [...]

3 Scheduler Features

[...] A detailed usage guide is available in the Hadoop documentation in build/docs/.

3.1 Pools

The Fair Scheduler groups jobs into “pools” [...] The pool that a job is placed in is determined by a JobConf property, the “pool name property”. By default, [...] However, different properties can be used, [...] so that there is one pool per user but it is also possible to place jobs into “special” pools [...] The [...] snippet below shows how to do this:

[...]meproperty [...] ${[...]}

3.2 Minimum Shares

Normally, active pools (those that contain jobs) [...] However, it is also possible to set a minimum share of map and reduce slots on a given pool, which is a number of slots that it will always get when it is active, [...] useful for guaranteeing that production jobs get a certain [...] Minimum shares have three effects:

1. [...] The only exception is if the minimum shares of the active pools add up to more than the total number of slots in the cluster; in this case, each pool’s [...]
2. Pools whose running task count is below their minimum share get assigned slots first [...]
3. It is possible to set a preemption timeout on the pool after which, if it has not received enough task slots to meet its minimum share, [...]

Note that when a pool is inactive (contains no jobs), its minimum share is not “reserved” for it – the slots are split up among the other pools.

3.3 Preemption

As explained above, the scheduler may kill [...] this preemption, although this usage of the word is somewhat strange given the normal definition of preemption as pausing; really it is the job that gets preempted, [...] In addition, the scheduler supports fair share preemption, to kill tasks when a pool’s [...] Fair share preemption is much more conservative than min share preemption, because pools without min shares are expected [...]

In particular, fair share preemption activates if a pool has been below half of its fair share for a configurable fair share preemption timeout ([...] 10 minutes). In both types of preemption, the scheduler kills the most recently launched tasks from over-scheduled pools, to minimize the amount of computation wasted by preemption.

3.4 Running Job Limits

The fair scheduler can limit [...] submitted beyond the limit wait for one of the running jobs to finish.

3.5 Job Priorities

Within a pool, job priorities can be used to control the scheduling of jobs, whether the pool’s internal scheduling mode is FIFO or fair sharing:

• In FIFO pools, jobs are ordered first by priority and then by submit time, as in Hadoop’s default scheduler.
• In fair sharing pools, [...] normal priority corresponds to a weight of 1.0, [...] For example, a high-priority job gets a weight of 2.0, and will therefore get 2x the share of a normal-priority job.

3.6 Pool Weights

[...] For example, a pool with weight 2.0 gets 2x the share of a pool with weight 1.0.

3.7 Delay Scheduling

The Fair Scheduler [...] that cannot launch a data-local map task wait for some period of time before they are allowed to launch non-data-local tasks, [...] Delay scheduling is described in detail in Section 4.8.

3.8 Administration

The Fair Scheduler includes a web UI displaying the active pools and jobs and their fair shares, moving jobs between pools, [...] In addition, the Fair Scheduler’s allocation file (specifying min shares and preemption timeouts for the pools) is automatically reloaded if it is modified on disk, to allow runtime reconfiguration.
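The automatic reload described in Section 3.8 can be pictured as a modification-time check on the allocation file. The sketch below is illustrative only: `AllocationReloader`, the `parse` callback, and the toy `pool=minShare` text format are all assumptions made for this example. They are not the Fair Scheduler's actual classes or file format.

```python
import os
from typing import Callable, Optional


class AllocationReloader:
    """Hypothetical helper: re-parse a file whenever its mtime advances."""

    def __init__(self, path: str, parse: Callable[[str], dict]):
        self.path = path
        self.parse = parse  # caller-supplied parser (an assumption)
        self.last_mtime: Optional[float] = None
        self.allocations: dict = {}

    def reload_if_modified(self) -> bool:
        """Return True if the file changed on disk and was re-parsed."""
        mtime = os.stat(self.path).st_mtime
        if self.last_mtime is not None and mtime <= self.last_mtime:
            return False  # unchanged since the last successful load
        with open(self.path, encoding="utf-8") as f:
            self.allocations = self.parse(f.read())
        self.last_mtime = mtime
        return True


def parse_min_shares(text: str) -> dict:
    """Toy format for illustration only: one 'pool=minShare' pair per line."""
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {name.strip(): int(value) for name, value in pairs}
```

A scheduler thread could call `reload_if_modified()` on a timer. Parsing before updating `last_mtime` means a file that fails to parse is retried on the next check rather than silently skipped.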

4 Implementation

4.1 Hadoop Sc[...]

[...] node is configured with a number of map slots and reduce slots based on its computational resources (typically one slot per core). [...] schedulers in Hadoop, including the Fair Scheduler, [...] class provides access to a TaskTrackerManager – an interface to the JobTracker – [...] asks the scheduler to implement three abstract methods: the lifecycle methods start and terminate, [...]

TaskTrackers periodically send heartbeats to the JobTracker with their TaskTrackerStatus, which contains a list of running tasks, the number of slots on the node, [...]

Apart from reacting to heartbeats through assignTasks, schedulers can also be notified when jobs have been submitted to the cluster, killed, [...] An important role of the listeners is to initialize jobs that are submitted – until a job is initialized, [...] The Fair Scheduler currently initializes all jobs right away, but it may also be desirable to hold off initialization [...]

Selection of tasks within a job is mostly done by the JobInProgress class, [...] JobInProgress exposes two methods, obtainNewMapTask and obtainNewReduceTask, [...] methods [...] after all tasks in the job have been started, [...] In addition, if the node containing a map task failed, [...] Schedulers may therefore need to poll multiple jobs until they find [...]

[...] for map tasks, an important scheduling criterion is data locality: [...] obtainNewMapTask returns the “closest” [...] However, to give schedulers slightly more control over data locality, there is also a version of obtainNewMapTask that allows the scheduler to cap the level of non-locality allowed for the task ([...] a task only on the same node, or null if none is available). The Fair Scheduler uses this method with an algorithm called delay scheduling (Section 4.8) to optimize data locality.
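The interaction described in Section 4.1 (a scheduler with lifecycle methods and an assignTasks hook, polling jobs that may return null, with an optional locality cap) can be sketched in a few lines. This is a simplified Python model, not Hadoop's Java API: the `Job` class, the two locality levels, and every method signature here are assumptions chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import List, Optional

# Locality caps, strictest first (illustrative names, not Hadoop constants).
NODE_LOCAL, ANY = 0, 1


class Job:
    """Toy stand-in for JobInProgress: one input host per unlaunched map task."""

    def __init__(self, name: str, pending: List[str]):
        self.name = name
        self.pending = pending

    def obtain_new_map_task(self, host: str, max_level: int) -> Optional[str]:
        """Return a task for `host`, or None if the locality cap rules it out."""
        if host in self.pending:
            self.pending.remove(host)
            return f"{self.name}:map@{host}"
        if max_level == ANY and self.pending:
            return f"{self.name}:map@{self.pending.pop(0)}"
        return None


class TaskScheduler(ABC):
    """Three abstract methods, mirroring the description in the text."""

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def terminate(self) -> None: ...

    @abstractmethod
    def assign_tasks(self, host: str, free_slots: int) -> List[str]: ...


class FifoScheduler(TaskScheduler):
    def __init__(self, jobs: List[Job], max_level: int = ANY):
        self.jobs = jobs
        self.max_level = max_level

    def start(self) -> None:
        pass

    def terminate(self) -> None:
        pass

    def assign_tasks(self, host: str, free_slots: int) -> List[str]:
        assigned = []
        for _ in range(free_slots):
            task = None
            for job in self.jobs:  # poll jobs in order until one yields a task
                task = job.obtain_new_map_task(host, self.max_level)
                if task is not None:
                    break
            if task is None:
                break  # no job can use this slot under the current cap
            assigned.append(task)
        return assigned
```

With the cap set to `NODE_LOCAL`, a job whose input lives elsewhere returns `None` and the scheduler moves on to the next job, which is the polling behaviour the section describes.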

4.2 Fair Scheduler Basics

At a high level, [...] selects a [...] chooses among its jobs according to its internal scheduling order (FIFO or fair sharing). In fact, because jobs might not have tasks to launch (obtainNew(Map|Reduce)Task can return null), the scheduler [...] Within a pool, jobs are sorted either by priority and start time (for FIFO) [...] first job in the ordering does not have a task to launch, the pool will ask the second, third, [...] Pools themselves are sorted by distance below min share and fair share, so if the first pool does not have any jobs that can launch tasks, the second pool is asked, [...] makes it straightforward to implement features like delay scheduling (Section 4.8) that may cause jobs to “pass” [...]

Apart from the assign tasks code path, the Fair Scheduler [...] thread is responsible for recomputing fair shares to display them on the UI (Section 4.6), checking whether jobs need to be preempted (Section 4.5), and checking whether the allocations file has changed to reload pool allocations (through PoolManager).

4.3 The Schedulable Class

To allow the same fair sharing algorithm to be used both between pools and within a pool, the Fair Scheduler [...] Schedulable is responsible for three roles:

1. [...] be queried for information about the pool/job to use in scheduling, such as:
   • Number of running tasks.
   • Demand (number of tasks the Schedulable wants to run; this is equal to number of running tasks + number of unlaunched tasks).
   • Min share assigned through config file.
   • Weight (for fair sharing).
   • Priority and start time (for FIFO scheduling).

[...]

There are separate Schedulables for map and reduce tasks, to make it possible to use the same algorithm on both types of tasks.
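The two-level structure of Sections 4.2 and 4.3 can be modelled as pools and jobs that answer the same queries and are each asked for a task in sorted order. The following is a minimal sketch, not the Fair Scheduler's classes: the field names, the FIFO ordering by `start_time` alone, and the exact sort key in `fair_key` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobSchedulable:
    name: str
    unlaunched: int
    running: int = 0
    weight: float = 1.0
    min_share: int = 0
    start_time: int = 0

    def running_tasks(self) -> int:
        return self.running

    def demand(self) -> int:
        # Demand = running tasks + unlaunched tasks, as defined in the text.
        return self.running + self.unlaunched

    def assign_task(self) -> Optional[str]:
        if self.unlaunched == 0:
            return None  # nothing to launch: the caller tries the next job
        self.unlaunched -= 1
        self.running += 1
        return f"{self.name}#{self.running}"


@dataclass
class PoolSchedulable:
    name: str
    jobs: List[JobSchedulable]
    weight: float = 1.0
    min_share: int = 0

    def running_tasks(self) -> int:
        return sum(j.running_tasks() for j in self.jobs)

    def demand(self) -> int:
        return sum(j.demand() for j in self.jobs)

    def assign_task(self) -> Optional[str]:
        # Internal FIFO order (simplified here to start time only).
        for job in sorted(self.jobs, key=lambda j: j.start_time):
            task = job.assign_task()
            if task is not None:
                return task
        return None


def fair_key(s):
    """Assumed ordering: below-min-share first, then running tasks per weight."""
    if s.running_tasks() < s.min_share:
        return (0, s.running_tasks() / max(s.min_share, 1))
    return (1, s.running_tasks() / s.weight)


def assign_slot(pools: List[PoolSchedulable]) -> Optional[str]:
    """Ask pools in fair-sharing order until one of them launches a task."""
    for pool in sorted(pools, key=fair_key):
        task = pool.assign_task()
        if task is not None:
            return task
    return None
```

Because `PoolSchedulable` and `JobSchedulable` expose the same methods, `fair_key` could equally be used to order jobs inside a fair-sharing pool, which is the reuse the section is aiming for.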

4.4 Fair Sharing Algorithm

A simple way to achieve fair sharing is the following: whenever a slot is available, [...] will ensure that all pools get an equal number of slots, unless a pool’s demand is less than its fair share, [...] features of the Fair Scheduler complicate this algorithm a little:

• [...] accomplished by changing the scheduling rule to “assign the slot to the pool whose value of runningTasks/weight is smallest.”
• Minimum shares mean that pools below their min share should get slots first. [...] sort pools to choose which ones to schedule next, [...] the pools [...]

[...] fair sharing algorithm [...] comparator orders jobs by distance below min share and then by runningTasks/weight.

4.5 Preemption

To determine when to preempt tasks, the Fair Scheduler maintains two values for each PoolSchedulable: the last time when the pool was at its min share, [...] PreemptionVariables, [...] methods also take into account the demand of the pool, so that a pool is not counted as starving if its demand is below its min/[...]

[...] preempting tasks, the scheduler [...] minimizes the amount of computation wasted by preemption and ensures that all jobs can eventually finish (it is as if the preempted jobs just never got their last few slots).

Note that for min share preemption, it is clear when a pool is below its min share because the min share is given as a number of slots, but for fair share preemption, we must be able to compute a pool’s [...] computation is trickier than dividing the number of slots by the number of pools due to weights, [...] Section 4.6 explains how fair shares are computed.

4.6 Fair Share Computation

The scheduling algorithm in Section 4.4 achieves fair shares without actually needing to compute pools’ [...] However, for preemption and for displaying shares in the Web UI, we want to know what a pool’s [...] we want to know how many slots the pool would get if we started with all slots being empty and ran the algorithm in Section 4.4 until we filled [...] to compute these shares would be to simulate starting out with empty slots and calling assignTasks repeatedly until they filled, but this is expensive, because each scheduling decision takes O(numJobs) time and we need to make O(numSlots) [...]

To compute fair shares efficiently, the Fair Scheduler [...] slots had been assigned according to weighted fair sharing respecting pools’ demands and min shares, then there would exist a ratio r such that:

1. Pools whose demand d_i is less than r·w_i (where w_i is the weight of the pool) [...]
2. Pools whose min share m_i is more than r·w_i are assigned min(m_i, d_i) [...]
[...] pools’ [...]

[...] define a function f(r) as the number of slots that would be used for a given r if conditions 1-3 above were met, and then find a value of r that makes f(r) = [...] More precisely, f(r) is defined as:

f(r) = [...]
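The definition of f(r) is cut off in this copy. One reading consistent with the surviving conditions is f(r) = sum over pools i of min(d_i, max(r·w_i, m_i)): a pool gets r·w_i slots, raised to its min share m_i if that is larger, and never more than its demand d_i. The sketch below takes that formula as an assumption and finds r by binary search, which works because this f is non-decreasing in r. The target value (total slots, capped at total demand), the function names, and the iteration count are also assumptions; the case where min shares alone exceed the cluster size is not handled.

```python
from typing import List, Sequence


def slots_used(r: float, demands: Sequence[float],
               min_shares: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed f(r): slots consumed if every pool is given ratio r."""
    return sum(min(d, max(r * w, m))
               for d, m, w in zip(demands, min_shares, weights))


def compute_fair_shares(total_slots: float, demands: Sequence[float],
                        min_shares: Sequence[float], weights: Sequence[float],
                        iterations: int = 50) -> List[float]:
    """Binary-search the ratio r so that f(r) matches the available slots."""
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    # If total demand is below the cluster size, pools simply get their demand.
    target = min(total_slots, sum(demands))

    # Grow an upper bound; f(r) reaches sum(demands) >= target for large r.
    hi = 1.0
    while slots_used(hi, demands, min_shares, weights) < target:
        hi *= 2.0

    lo = 0.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if slots_used(mid, demands, min_shares, weights) < target:
            lo = mid
        else:
            hi = mid

    return [min(d, max(hi * w, m))
            for d, m, w in zip(demands, min_shares, weights)]
```

For example, with 10 slots, equal weights, no min shares, and demands of 2, 10, and 10, the first pool is capped at its demand and the remaining 8 slots are split evenly between the other two. Each evaluation of `slots_used` is linear in the number of pools, so the search avoids the slot-by-slot simulation the text describes as expensive.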

Published: 2024-09-20 15:23:52

Article link: https://www.17tex.com/fanyi/22538.html
