
High-level reactive support for scalable distributed systems


James E. Lumpp, Jr.‡, James Griffioen†, Rajendra Yavatkar†,

Thomas Kay†, Christopher Diaz†

Department of Electrical Engineering‡, Department of Computer Science†

University of Kentucky, Lexington, KY 40506, USA
[jel,griff,raj,kay,diaz]@dcs.uky.edu

Abstract

To make large-scale distributed systems practical for a broader user base, we are investigating a high-level framework to support and aid the development of reactive distributed applications. The reactive object framework offers large-scale distributed applications the ability to automatically adapt to the state of the distributed system as well as the behavior of the application. The system automatically gathers, condenses, and provides access to performance statistics needed by the distributed application for adaptation. To hide adaptation from the programmer, the system provides default adaptation policies and mechanisms that can be combined to create reactive objects that automatically respond to changes in the system to improve performance. By separating policies from algorithms and data structures, the system encourages reuse of parallelized components. Ultimately, our goal is to create a library of common scientific reactive objects that can be directly incorporated into parallel applications.

1 Introduction

Grand Challenge Problems require massive computational resources with sustained rates in the teraflops [NCO96]. Scientific applications such as global climate modeling, quantum chromodynamics, and computational fluid dynamics are highly parallel and require massive computational power and data storage. Massively parallel numerical algorithms that use domain decomposition methods to solve the partial differential equations found in such applications have been quite successful [Cai93]. Such high levels of performance can only be achieved by a system consisting of hundreds or thousands of processors.

However, cost/performance ratios make it unlikely that such power will be achieved by a single, massively parallel architecture. Such systems are typically built from special purpose hardware and suffer from scalability limitations. In contrast, large-scale, loosely-coupled distributed systems consisting of commodity high-performance workstations interconnected by high-speed networks appear quite promising. Scalability is achieved by introducing new workstations or networks of workstations to the system. These systems use off-the-shelf components (workstations) that can be easily upgraded to take advantage of rapid changes in technology. Moreover, such systems already exist and are widely available in most business, governmental, and academic settings. For these reasons, we expect that large scale networks of workstations will emerge as the architecture of choice for many grand challenge caliber applications. However, we do not expect thousands of machines to be connected to a single high-speed local area network. Therefore, to obtain the requisite computational power users will need to access machines on several different local area networks connected by wide area networks spanning substantial geographical distances. Finally, we expect that these systems will have dual use, serving as general purpose computing environments and also as powerful massively parallel compute engines.

Unfortunately, the geographic distribution of processing power, memory, and secondary storage complicates, rather than simplifies, the design of applications. Application design is further complicated by the static and dynamic variability of the system. The large-scale systems we envision will inevitably consist of dissimilar machines with widely varying processor speeds, memory sizes, and disk speeds. Similarly, the networks that interconnect the machines will have widely varying bandwidths and latencies. In addition to these static dissimilarities, dynamic runtime differences will also occur. Transient users performing general purpose tasks will cause the load on certain processors and networks to vary dynamically. Our goal is to develop a programming environment that masks the programming difficulties associated with a dynamic distributed environment but yet makes optimal use of the available and constantly changing resources.

To make large-scale distributed systems usable by the scientific community, we are investigating a new framework to support and aid the development of “reactive objects”. Reactive objects offer large-scale applications the ability to automatically adapt to the state of the system as well as the behavior of the application. The system automatically gathers, condenses, and provides access to performance statistics needed by the distributed application for adaptation. Reactive objects automatically adjust to changes in the underlying system, dynamically optimizing the application for the current environment. To hide adaptation from the programmer, the system provides default adaptation policies and mechanisms that can be included in reactive objects. The system encourages reuse of parallelized components by separating policies from algorithms and data structures. Ultimately, our goal is to create a library of common scientific reactive objects that can be directly incorporated into parallel applications. Novice users will use reactive objects as high-performance “building blocks” for developing applications. Experienced users will use services of the monitoring system and predefined reactive object components to develop objects that “plug in” to the reactive object support system.

The remainder of the paper is organized as follows. Section 2 begins by reviewing related work. Section 3 then briefly overviews the Unify distributed shared memory system that serves as the scalable distributed operating system platform for our reactive object research. Having laid a background and description of the environment, Section 4 identifies the problems that must be addressed by a large-scale reactive distributed system. Section 5 then presents the architecture and design of our reactive object system and Section 6 concludes by describing the status of the reactive object system.

2 Related Work

To detect and react to changes in the state of the system, an application must monitor the system and the application’s run-time performance. Researchers have investigated both hardware and software instrumentation approaches to observe and record state changes in the system [SKB]. Hardware instrumentation has the advantage of introducing minimal delay or intrusion to the target system [MLCS90] and can be invaluable for measurements not attainable via software (e.g., cache hit ratios). However, hardware monitors are inflexible, low-level (i.e., can’t monitor process-level events), and often prohibitively expensive. Software instrumentation has the advantage of flexibility, high-level event monitoring, and no expensive hardware. As a result, virtually all parallel/distributed debuggers [MH] and performance evaluation tools [HML95] rely on software instrumentation. Unfortunately, software monitoring can adversely affect performance and correctness. Event-based models of monitoring [Bat, MLCS90] have been proposed to gather information “on-the-fly” [Sch, HKMC90, MC91, JK93]. Event-based monitoring systems use predicates defined on subsets of system state to recognize state changes [BW83, LCSM90]. Our work builds on past work in event-based, on-the-fly, predicate recognition systems. In addition, our research attempts to identify and only monitor the events of interest to large-scale distributed applications.

Techniques to alter the behavior of running programs in response to changes in the underlying computing system or unforeseen algorithmic problems have been studied in several contexts. The former was studied primarily in the area of migration and the latter in the area of program steering.

Process migration and load balancing algorithms are well-known examples of programs reacting to run-time changes in system load [Smi88]. More general approaches, such as MESSIAHS [CS94], allow users to develop application-specific distributed scheduling algorithms. Process migration involves the transfer of work, data, and process state which is costly and time consuming. Consequently, process migration is typically used in situations where decomposition is coarse-grained and migrations occur infrequently so that costs are amortized over time. The coarse-grain nature often causes the resulting “balanced load” to be far from truly balanced. Migration of light-weight processes, such as Filaments [FLA94], allows for a more fine-grained work decomposition. Assuming the application can be decomposed into many fine-grained pieces of work, the system can react to system load changes quickly and effectively. Others have proposed modifications to parallel programming languages or compilers to allow dynamic scheduling of application modules to processors [Luc92]. Both approaches require that the application be decomposed into many fine-grained parallelizable components.

To react to algorithmic errors or mismatches between the algorithm and its current input data, several researchers have proposed the use of interactive steering systems [EGSM94, GEK+95, VS95, Sos92]. Interactive steering systems put a knowledgeable user in the loop, allowing the user to dynamically alter the algorithm to redistribute the work or “steer” the algorithm to the data. The following briefly describes a few steering systems most closely related to and influencing our work.

The Falcon and Progress systems [EGSM94, GEK+95, VS95] allow a user to monitor and then manipulate arbitrary program variables via sensors and actuators that control or steer the application at runtime. The Meta toolset developed for ISIS [MW91] takes a similar approach. The state of the system is monitored by inserting probe calls in the application to sense the state of the application where the programmer determines it will be most productive. A separate monitoring thread records and analyzes the state information at the probe events. The application writer can also place actuators at strategic points in the computation where it is safe to steer the application. The monitor process communicates with a GUI that displays the progress of the application and allows the user to dynamically alter the program’s state or direction. This technique provides a flexible and powerful framework for developing steerable applications, but requires the application developer to provide all the sensing and actuator code needed to steer the object. The adaptive system that we present in this paper provides a higher-level interface and default monitoring capabilities that could be implemented on the low-level support provided by a system such as Progress or Meta.

The Dynascope system [Sos92] provides a framework for constructing directors that steer an application. Portions of the application are interpreted while other portions run on the native machine. Only interpreted code can be directed. The interpreted code sends all events to the director, which analyzes the current program execution and modifies its behavior. Director access to and modification of application state is only supported at the machine level. No direct parallel or distributed support is present.

The majority of current steering systems target tightly-coupled systems consisting of uniform dedicated nodes and high-speed interconnection networks. In such an environment, there is little variance or change in the system state, implying that poor performance stems primarily from algorithmic problems and requires steering. Consequently, most of the research efforts focus on monitoring and controlling the program’s state rather than monitoring the system’s state and conforming to it. In a large-scale distributed system, system state changes are often the primary factor influencing performance. Consequently, distributed systems must provide convenient abstractions to help applications adjust to system state changes. The work presented here provides the necessary framework to obtain information about, and react to, changes in the system state.

Although the Mentat system [GSN93, Gri93] does not support interactive steering, it does provide a high-level interface to parallel programming constructs by encapsulating the complexities of parallel programming in parallel objects. Applications invoke the services of high-level Mentat objects which hide the parallelism. Mentat statically processes Mentat objects to determine the execution order of the objects based on data dependencies. The resulting blocks are then executed asynchronously with data forwarded between objects as required. Our approach is similar to Mentat in that we also encapsulate the complexity of the underlying system in higher level objects. However, we allow the objects to dynamically and continuously adjust to changes in system load or configuration.

Finally, there are several examples of application-specific implementations of adaptation including distributed branch-and-bound algorithms [LM92] and N-body simulations [GKS94]. Others have developed techniques to dynamically change the algorithm or communication model to meet real-time deadlines of a particular real-world environment [LMnS90]. Such techniques can be encapsulated within the reactive objects we describe here.

For high-performance distributed environments, the dynamic nature of the network and hosts can make the overall behavior of the system difficult to predict. A low-level steering system would require the programmer to understand the details of their application, the characteristics of a highly dynamic and complicated distributed system, and the typical workload that is expected from external sources. If the programmer can be provided with some higher level abstractions that will automatically adapt to changes in the distributed system or application, the programmer can focus on solving the problem rather than targeting or reacting to a runtime environment. For example, a programmer should not have to include probes and actuators throughout their program to watch for nodes that become heavily loaded by other users. It is preferable to provide objects that automatically respond to such conditions by redistributing some work to another node or avoiding links that are performing poorly.

3 Unify Overview

We are currently investigating reactive support for distributed applications in the context of the Unify distributed operating system [GYF95]. The objective of the Unify project is to develop a scalable multicomputer linking hundreds or thousands of high-performance machines in geographically distant locations. Such large-scale parallelism is necessary to achieve the massive computational power required by today’s grand challenge problems [NCO96]. To provide a convenient programming model, Unify’s goal is to support a highly scalable distributed shared memory programming paradigm. Conventional DSM approaches are inappropriate in such large-scale geographically distributed environments. To achieve scalability in such an environment, Unify supports new shared memory abstractions and mechanisms that (1) mask the distribution of resources, (2) limit/reduce the frequency of communication and the amount of data transferred, (3) hide the propagation latencies typical of large-scale networks (e.g., by overlapping communication with computation), and (4) support large-scale concurrency via synchronization and consistency primitives free of serial bottlenecks. Ideally, the system must provide convenient data sharing abstractions that exhibit performance and scalability similar to that of existing large-scale message passing systems such as PVM [Sun90] and MPI [For94]. The following briefly highlights the salient features of Unify:

Single Address Space: A single virtual address space, shared by all applications, allows applications to conveniently and efficiently share structured address-dependent data (such as trees and linked lists) as well as other address-independent data.

Multiple Memory Types: Our design supports three basic memory abstractions for shared data. Random access memory is directly addressable. Sequential access memory is accessed in a read/front, write/append fashion. Associative memory is accessed via key/value pairs. Sequential access and associative memory can often be supported with weaker spatial consistency guarantees (described below) that can be implemented more efficiently than random access memory segments.

Multiple Grades of Consistency: The Unify design supports a set of consistency management primitives that allow an application to select the appropriate consistency semantics from a spectrum of consistency protocols, including “automatic” methods where the operating system enforces consistency and “application-aided” methods where the user defines consistency checkpoints. Several existing DSM systems have demonstrated the benefits of weak application-aided temporal consistency models. In addition, Unify supports weak automatic methods where the memory becomes consistent after some time lag T. This type of automatic consistency is particularly useful for applications that can detect stale data such as Grapevine [?].

Our design also introduces a new consistency dimension called “spatial” consistency. Spatial consistency determines the relative order of the contents of the replicas of a segment. For many distributed applications that use keyed lookups or sequential access, the order of the data items within a segment is unimportant; only the values of individual data items are important. Spatial consistency allows efficient implementation of such applications.

Scalable Synchronization Primitives: Most DSM designs support locks, semaphores, and/or barriers as the basic synchronization methods. However, these synchronization primitives pose a serious bottleneck, particularly for large systems with high latencies. To provide efficient synchronization in the presence of long and varying latencies, Unify proposes the use of a modified form of event counts and sequencers. For a large class of applications, event counts can result in reduced communication and greater concurrency. In particular, conventional synchronization methods require that all participants observe the synchronization event simultaneously. Event counts allow participants to observe the event at different times, effectively relaxing the communication constraints and allowing greater concurrency. Moreover, conventional synchronization primitives can be implemented via event counts.

Hierarchy of Sharing Domains: We believe that sharing in a large scale distributed multicomputer will follow the principle of locality. To exploit localized sharing and communication, we partition the set of hosts into “sharing domains”. Each sharing domain uses a separate multicast group to reduce the cost of intra-domain information sharing. Sharing domains distribute the burden of information retrieval and distribution by allowing any member of the domain to issue or answer inter-domain requests (addressed to multicast groups). Every host consults the hosts in its local sharing domain before going outside the domain for information. Therefore, as soon as one host obtains cross-domain information from a site in a remote domain, the information effectively becomes available to all other hosts in the sharing domain.

Reliable Multicast Support: Although protocols such as ATM and RSVP provide quality of service guarantees, they do not guarantee end-to-end reliability. In fact, classical IP performance over ATM is currently a topic of much research because of TCP/IP’s poor performance and high data loss rates. Distributed applications require a reliable multicast mechanism that not only delivers data reliably, but also attempts to synchronously deliver data to all recipients. Distributed applications typically block until the multicast information has been reliably delivered to all participants. Such delays can severely affect performance. To achieve reliable, scalable, efficient dissemination of shared data or synchronization information, Unify supports reliable multicast via a tree-based multicast transport protocol (TMTP) [YGS95]. TMTP builds on the efficient delivery of IP multicast (possibly via the MBONE) which is widely available. To provide reliability, TMTP uses a combination of sender and receiver initiated approaches that constructs a separate control tree to handle error and flow control. As a result, retransmissions are handled locally in a timely fashion, avoiding retransmissions to the entire Internet. The use of localized NACKs with NACK suppression ensures quick response to lost messages with minimal overhead. Finally, batched positive acknowledgements reduce Internet traffic and eliminate the packet implosion problem.

The utility and scalability of large scale multicomputers can only be demonstrated by developing and evaluating real-life, large-scale parallel and distributed applications. We have implemented the Unify system as a runtime library on Unix workstations and run tests involving a significant number of hosts. DSM applications link with the library to create shared segments with appropriate memory types and consistency semantics.
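The event counts and sequencers that Unify proposes for scalable synchronization can be illustrated with a small single-process sketch. This is a hypothetical model of the primitives, not Unify's actual interface: an event count is a monotonic counter that waiters observe independently, and a sequencer hands out ordered tickets.

```python
import threading

class EventCount:
    """Monotonic counter; waiters block until the count reaches a target.

    Unlike a barrier, participants may observe a given count value at
    different times, so no participant is forced into a simultaneous
    rendezvous -- the property the paper credits for greater concurrency.
    """
    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def advance(self):
        with self._cond:
            self._count += 1
            self._cond.notify_all()

    def read(self):
        with self._cond:
            return self._count

    def await_(self, target):
        with self._cond:
            while self._count < target:
                self._cond.wait()

class Sequencer:
    """Hands out unique, ordered tickets (e.g., to build mutual exclusion)."""
    def __init__(self):
        self._next = 0
        self._lock = threading.Lock()

    def ticket(self):
        with self._lock:
            t = self._next
            self._next += 1
            return t
```

Conventional primitives can be built on top: a lock becomes `t = seq.ticket(); ec.await_(t); ...critical section...; ec.advance()`, which is one way the claim that event counts subsume conventional synchronization can be realized.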

We have implemented several common DSM applications including matrix multiplication, SOR, MP3D, and Water and observed impressive speedups in our local environment. The library is currently available to, and in use by, our scientific colleagues with computationally complex problems. We are currently working to provide them with a high-level reactive object support system that will provide maximal performance despite highly dynamic changes in the runtime environment.

4 Conforming Applications to Networks of Workstations

Programming distributed shared memory applications for large-scale networks of workstations is complicated by several factors that are not present or significant in multiprocessor environments. Multiprocessor systems do not exhibit the variability in processor speed, memory capacity, network loads or network routes that a loosely-coupled system experiences. Each time an application executes, it may find itself in a new environment. The environment could even change while the application is running. Processor loads can change as other applications come and go, and network bandwidth, latency, packet loss, and routes can vary dynamically in response to other traffic in the system. To make effective use of the computational power in the Unify environment, the system must help the application adapt and conform to changes in the environment.

Our experience with the Unify system has shown us that there are many reasons why an application may not achieve the desired speedup. In almost all cases, the poor performance stems from a mismatch between the algorithm and the distributed environment where the algorithm is executed. For example, assigning spatially related partitions of a matrix to non-neighboring machines can result in an excessive number of high-latency messages. The second and less common source of poor performance arises from a mismatch between the algorithm and the data. A common example of this type of mismatch is the use of fixed partitions on a sparse matrix. In both cases described above, slight runtime modifications to the algorithm will often rectify the problem and boost performance.

Our goal is to develop the infrastructure needed to recognize the characteristics of the runtime environment that influence performance and monitor those characteristics during program execution. The system must then provide the necessary hooks for distributed applications to procure information about the environment and change the algorithm or the data structure’s implementation to conform or react to changes in the environment.

To determine which environmental characteristics we need to monitor, we must identify the environmental changes that can negatively affect performance. Roughly speaking, there are two environmental factors that limit the speedup a distributed application can achieve: computational overload and non-optimal communication.

Computational overload occurs when the work assigned to a machine exceeds the machine’s capabilities. Computational overload is common because most distributed computing systems consist of heterogeneous workstations with widely varying computational power. This disparity arises from the incremental growth of a system over several years and the need for heterogeneity to run certain software packages. The uneven distribution of computational power is complicated by the fact that transient users can dynamically increase or decrease the load on certain machines. The base-line differences and dynamic variability in computational power mean that an algorithm will rarely partition the work correctly and thus must continually re-partition the work over the lifetime of the application. Machines will also have different amounts of physical memory, and the amount available to the application will change dynamically. Even if all machines have identical processors, insufficient memory at some nodes will lead to paging that degrades the node’s performance.
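The continual re-partitioning described above can be sketched as a simple proportional split: each machine receives work in proportion to its currently measured effective capacity, and the function is re-run whenever the monitored capacities change. This is an illustrative helper, not an algorithm from the paper.

```python
def partition_rows(total_rows, capacities):
    """Split a fixed number of matrix rows proportionally to capacity.

    capacities: measured effective speed per machine (e.g., instructions
    per second actually available to the application after transient
    users take their share). Returns per-machine row counts summing to
    total_rows.
    """
    total = sum(capacities)
    shares = [int(total_rows * c / total) for c in capacities]
    # Hand the rounding remainder to the fastest machines first.
    remainder = total_rows - sum(shares)
    for i in sorted(range(len(capacities)), key=lambda i: -capacities[i]):
        if remainder == 0:
            break
        shares[i] += 1
        remainder -= 1
    return shares
```

Re-invoking `partition_rows` with fresh capacity measurements between iterations approximates the continual re-partitioning the paragraph calls for.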

Non-optimal communication occurs when closely related logical nodes are assigned to distant physical machines or if inappropriate DSM primitives are used. Even if domain decomposition is calculated such that the correct amount of work is allocated to each machine, the interactions between domains may not match the underlying network topology or network routes. Unlike multiprocessor machines, the network topology and current network routes are rarely known and can change dynamically. This makes it difficult to ensure that spatially related domains are located on neighboring machines. If frequently interacting domains are placed on distant machines, excessive latency, packet loss, and low bandwidth can occur. Non-optimal placement of processes on machines may also mean that efficient mechanisms such as link-layer multicasting cannot be used. To combat this, the system must provide information about the current topology and network performance to aid in the assignment of decomposition domains to machines. Finally, the high latencies of wide area systems can have an enormous impact on DSM synchronization costs, which tend to be latency dominated. In many cases, synchronization events can be traded for larger granularity. In a wide area environment, performance may actually be optimized by reducing the number of synchronization events and increasing the size of shared regions. Modifying the way in which the application uses the DSM system can produce significant speedups.

The following sections describe our reactive system architecture. The system monitors the environmental changes described above and provides default and user-specific mechanisms to react to the changes.

5 A Reactive Object System

To provide distributed applications with the information necessary to effectively map the application onto the available distributed resources, we propose the reactive object system architecture pictured in Figure 1.

The objective is to provide a high-level interface for writing distributed applications that can dynamically react to the environment. The support system is organized into three layers: a host monitoring layer, a distributed state monitoring layer, and a high-level reactive object layer. In general, users’ applications will interact directly with objects in the reactive object layer. The reactive object layer supports parallelized objects that hide all adaptation from the user. For certain applications, the application developer may need to create new reactive objects. Only in rare cases do we envision users dealing with or modifying components in the monitoring layers.

The goal of the monitoring layers is to dynamically gather performance information and feed it to the reactive object layer where decisions about adjustments will be made. Monitoring of system state is divided into two layers, one responsible for observing the state of the local machine and one responsible for the state across machines. These layers provide state information to the reactive object layer, enabling reactive objects to respond to state changes.

5.1 Local State Layer

The local state layer monitors the state of the local machine and presents the collected information to the upper layers. The performance information required by the distributed state and reactive object layers is specified as predicates defined in terms of local state introduced in our previous work [LCSM90]. For example, “load average > 2” or “change in load average” would be specified as partial predicates used to determine when load balancing modifications are required. When the specified events occur on the local host, the predicates will evaluate true and trigger actions [MLCS90] that notify the upper layers or pass state information to the upper layers. Local layer objects on each machine measure processor statistics, memory system behavior, and network utilization. Processor statistics include the amount of processor time, typically measured in instructions per second, that the application has been receiving. Processor statistics also monitor application runtime, sub-divided into application compute time, communication (system) overhead time, and missed time (time used by other applications). Memory measures include delays arising from paging within the application. Memory performance can be useful in adjusting both data structures and algorithms. The network interface object records information about the bandwidth and latency of current connections as well as retransmission and dropped packet rates which can be used to infer information about the network topology and link performance.
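The predicate-and-action mechanism above can be sketched as follows. This is a hypothetical stand-in for the local state layer: real statistics would come from the host operating system, but here they are pushed in via `update()`, and the two example predicates mirror “load average > 2” and “change in load average”.

```python
class LocalStateLayer:
    """Evaluates registered predicates over local host statistics and
    fires the associated actions when a predicate becomes true.
    Illustrative sketch; not Unify's actual interface.
    """
    def __init__(self):
        self._stats = {}
        self._watches = []  # (predicate, action) pairs

    def register(self, predicate, action):
        self._watches.append((predicate, action))

    def update(self, name, value):
        """Record a new measurement and re-evaluate all predicates."""
        old = self._stats.get(name)
        self._stats[name] = value
        for predicate, action in self._watches:
            if predicate(self._stats, name, old):
                action(self._stats)

# "load average > 2": true when the load-average statistic exceeds 2.
def overload(stats, changed, old):
    return changed == "load_average" and stats[changed] > 2.0

# "change in load average": true on any change to the statistic.
def load_changed(stats, changed, old):
    return (changed == "load_average" and old is not None
            and stats[changed] != old)
```

An upper layer would register these partial predicates with actions that forward state upward, exactly the notify-or-pass-state behavior the paragraph describes.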

[Figure 1 (diagram): distributed applications in a user-level application layer invoke reactive objects in the reactive object layer; beneath it, a system-independent distributed state layer (remote processor statistics, remote memory statistics, network statistics) rests on a host/network-specific local state layer (processor monitor, memory monitor, network interface).]

Figure 1: Organization of the Reactive Object System.

5.2 Distributed State Layer

The distributed state layer supports predicates that can be local, non-local, or global. Reactive objects present the distributed state layer with local, non-local, or global predicates. Local predicates are passed through to the local state layer. Non-local predicates specify states spanning more than one machine (e.g., “the load of the local processor is 50% greater than the load on a neighbor processor”) while global predicates specify overall system state (e.g., “all processors idle”). The objects that comprise the distributed state layer communicate between machines to evaluate predicates and notify the reactive objects when events occur.
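The two example predicates can be written down directly once per-host load snapshots are available. This sketch deliberately ignores the cross-machine protocol and assumes the layer has already gathered a `loads` map; the 1.5 factor encodes “50% greater”.

```python
def neighbor_imbalance(loads, host, neighbors, factor=1.5):
    """Non-local predicate: the local host's load exceeds some
    neighbor's load by the given factor (1.5 = 50% greater).

    loads: host -> current load, as gathered by the distributed
    state layer (assumed already collected here).
    """
    return any(loads[host] > factor * loads[n] for n in neighbors)

def all_idle(loads, threshold=0.05):
    """Global predicate: every processor's load is below threshold."""
    return all(v < threshold for v in loads.values())
```

A distributed state layer object would evaluate these against snapshots exchanged between machines and notify registered reactive objects when they become true.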

In addition to event recognition, distributed state layer objects also tabulate and maintain state information that reactive objects can query. For example, a reactive object may request network topology or interconnection information such as the latency recently experienced between two nodes of the system. The network statistics object probes other hosts in the system to determine the expected latency, loss rate, or possible bandwidth to these machines. Given expected network performance, the system constructs a pseudo-topology indicating which machines are distant and which machines are neighbors and comprise a Unify sharing domain. A reactive object might use this information to request the four processors with the least total latency among the fully connected set. The reactive object can then use the four selected machines to run a communication intensive sub-task.
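The “four processors with the least total latency” query can be answered by minimizing the sum of pairwise latencies over candidate groups. A brute-force search is fine at this scale; the `latency` map stands in for the pseudo-topology the network statistics object builds (hypothetical representation, not the paper's data structure).

```python
from itertools import combinations

def best_group(latency, k):
    """Choose the k hosts minimizing total pairwise latency.

    latency: dict mapping frozenset({a, b}) -> measured latency between
    hosts a and b, i.e., a fully connected pseudo-topology. Brute force
    over host subsets is acceptable for the small k used here.
    """
    hosts = {h for pair in latency for h in pair}

    def cost(group):
        return sum(latency[frozenset(p)] for p in combinations(group, 2))

    return min(combinations(sorted(hosts), k), key=cost)
```

A reactive object would call `best_group(topology, 4)` and place its communication-intensive sub-task on the returned machines.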

Although the distributed state layer knows nothing about the application, it does understand a limited number of common high-level DSM concepts such as shared memory regions, synchronization events, data transfer events, sharing domains, page fault rates, processor speeds, etc. Consequently, the distributed state layer can separate things such as communication costs into synchronization communication costs and data transfer communication costs.

Given the ability to obtain performance information about the environment in which an application finds itself, we must develop an application interface to simplify interaction with the monitoring system. The reactive object layer provides the application with self-adapting distributed objects.

5.3 Reactive Object Layer

The monitoring layers provide all the information and notification mechanisms necessary for an application to conform to the current environment or react to changes in the environment. However, these capabilities are at an inappropriate level of abstraction to support a computational scientist developing an application. Such users require higher-level support to aid them in the development of a reactive program. To that end, the system includes a high-level reactive object layer that hides performance monitoring and adaptation from the application developer.

A distributed application will typically interface with the reactive system by invoking the services of predefined reactive objects. Each reactive object implements a reactive parallelized version of some commonly used abstraction. Examples of common abstractions might include a matrix abstraction with methods to perform LU decomposition, successive over-relaxation, Cholesky factorization, averaging, and min/max element identification; an image processing matrix abstraction with methods to perform contrast enhancement, thresholding, segmentation, correlation, and edge detection; or a queryable storage abstraction with insertion, deletion, and query methods that might be implemented with a hash table, B-tree, linked list, or array depending on which is the most efficient given the current state of the system. Given a library of predefined reactive objects, the application developer can concentrate on the problem to be solved rather than reacting to changes in the system. In the worst case, a developer may need to implement new application-specific reactive objects. Even in this case, the reactive object layer provides reusable policies and data structures to simplify the coding of a user defined reactive object.
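The queryable storage abstraction above can be sketched as an object whose backing structure is swappable at runtime. This is an illustrative single-machine sketch: a dict stands in for the hash table and a bisect-maintained sorted array stands in for the B-tree; the `switch()` method is the “reaction” a policy would invoke.

```python
import bisect

class QueryableStore:
    """Insert/delete/query abstraction whose backing structure can change.

    A policy might pick "hash" while exact-match queries dominate and
    "sorted" (standing in for a B-tree) when ordered access matters.
    """
    def __init__(self, impl="hash"):
        self._impl = impl
        self._hash = {}
        self._keys, self._vals = [], []

    def insert(self, key, value):
        if self._impl == "hash":
            self._hash[key] = value
        else:
            i = bisect.bisect_left(self._keys, key)
            self._keys.insert(i, key)
            self._vals.insert(i, value)

    def query(self, key):
        if self._impl == "hash":
            return self._hash.get(key)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._vals[i]
        return None

    def delete(self, key):
        if self._impl == "hash":
            self._hash.pop(key, None)
        else:
            i = bisect.bisect_left(self._keys, key)
            if i < len(self._keys) and self._keys[i] == key:
                del self._keys[i]
                del self._vals[i]

    def switch(self, impl):
        """Re-house the data in the other structure -- the 'reaction'."""
        items = (list(self._hash.items()) if self._impl == "hash"
                 else list(zip(self._keys, self._vals)))
        self.__init__(impl)
        for k, v in items:
            self.insert(k, v)
```

The application sees only insert/query/delete; which structure currently backs the object is the policy's business, which is the separation the paragraph argues for.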

The reactive object layer responds to method invocations from user-level applications and also communicates with the underlying global state monitoring system. Communication with the monitoring system occurs in one of two ways. A reactive object can register with the distributed state layer to be notified when an event occurs. This is analogous to a “performance interrupt” where the reactive object is informed that the monitored condition has occurred. For example, an object may register via a predicate specification to be informed of an excessively slow host. Once notified of this condition, the reactive object can redistribute work to avoid that host. The second method of communication is that of polling the distributed state layer for the information it has gathered and tabulated. For example, a reactive object may request processor performance summary information between iterations of a concurrent loop to determine if changes in the distribution of work between processors would be beneficial.
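Both communication modes can be captured in one small interface sketch: `register()` models the “performance interrupt” path and `poll()` the tabulated-summary path. The class and its summary format are hypothetical stand-ins for the distributed state layer, not Unify's API.

```python
class DistributedStateView:
    """Stand-in for the distributed state layer's notify/poll interface."""
    def __init__(self):
        self._summaries = {}
        self._subscribers = []

    def register(self, predicate, callback):
        """Notification mode: fire callback ('performance interrupt')
        whenever the predicate holds over the tabulated summaries."""
        self._subscribers.append((predicate, callback))

    def publish(self, host, summary):
        """Record a per-host summary and re-check subscriptions."""
        self._summaries[host] = summary
        for predicate, callback in self._subscribers:
            if predicate(self._summaries):
                callback(host, summary)

    def poll(self):
        """Polling mode: snapshot of the tabulated per-host summaries,
        e.g., consulted between iterations of a concurrent loop."""
        return dict(self._summaries)
```

A reactive object would register a slow-host predicate for interrupts, and between loop iterations call `poll()` to decide whether redistributing work is worthwhile.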

[Figure 2 (diagram): a reactive object invoked from the distributed application through an object interface; internally it contains adaptive algorithms (Algorithm 1, 2, 3), adjustable data built from predefined and user-defined data structures, and reactive policies built from predefined and user-defined policies, all driven by state information.]

Figure 2: Reactive Object Template.

Figure 2 illustrates the internal structure of a reactive object. The object consists of three major components: adjustable data structures, adaptive algorithms, and reactive policies. The reactive object organization separates policy from mechanism, thereby allowing existing adaptive algorithms and adjustable data structures to be reused and combined with appropriate (possibly user-defined) policies. Moreover, the separation of algorithm from data structure allows adjustable data structures to be used by many different algorithms that can dynamically manipulate the structure to meet the algorithm’s current needs.

An adjustable distributed data object provides a configurable storage abstraction that algorithms can use. For example, a reactive matrix object with successive over-relaxation or averaging methods might make use of an adjustable neighbor-exchange-matrix structure, where the neighbor-exchange-matrix structure is a matrix storage abstraction developed specifically to support parallelized access and exchange of information between neighboring partitions. Although the abstraction understands decomposition (i.e., how the matrix is partitioned and the host currently working on each piece), it does not decide the decomposition. Instead, it provides methods that the algorithm or policy component can invoke to tell the object how to change the decomposition. Given a new decomposition description, the object can adjust itself (shift the boundaries and redefine the shared segments used) to accommodate the new description. Other methods might allow the algorithm to switch the object’s implementation from Unify’s random segments to sequential segments. In many cases, the object must provide an operation by which the policy or algorithm can inform the object that it is safe to adjust [GEK+95]. Our experience indicates that these types of parallelized storage structures are used repeatedly in a wide range of applications and will see heavy reuse.
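The decomposition-aware but decomposition-agnostic behavior described above can be sketched for a row-partitioned matrix. This hypothetical class (names are illustrative, not Unify's) knows which host owns each block and exposes `repartition()` for a policy to call, but never chooses a decomposition itself.

```python
class NeighborExchangeMatrix:
    """Row-partitioned matrix storage abstraction (sketch).

    The object tracks the current decomposition -- which host works on
    which rows -- but only a policy or algorithm decides it, by calling
    repartition() with a new per-host row-count description. A real
    implementation would also redefine the shared segments on adjust.
    """
    def __init__(self, nrows, hosts):
        self._nrows = nrows
        self._hosts = list(hosts)
        equal = [nrows // len(hosts)] * len(hosts)
        equal[-1] += nrows - sum(equal)
        self.repartition(equal)

    def repartition(self, row_counts):
        """Shift partition boundaries to match a new decomposition."""
        assert sum(row_counts) == self._nrows
        self._bounds = []
        start = 0
        for host, n in zip(self._hosts, row_counts):
            self._bounds.append((host, start, start + n))
            start += n

    def owner(self, row):
        for host, lo, hi in self._bounds:
            if lo <= row < hi:
                return host

    def partition_of(self, host):
        for h, lo, hi in self._bounds:
            if h == host:
                return (lo, hi)
```

An SOR-style algorithm would read `partition_of()` each iteration, so a policy-driven `repartition()` is picked up automatically at the next step.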

The policy component is the heart of the reactive object, monitoring state changes and modifying the adjustable data objects and adaptive algorithms as needed. The policy component registers predicates with, and responds to notifications from, the distributed state layer.

In many cases, a predefined policy can be used in conjunction with a predefined data structure to control adaptation. For example, a predefined load balancing policy may monitor global load information by registering predicates with the distributed state layer. When the load becomes imbalanced, the policy code is invoked, which in turn invokes the adjustment methods of a neighbor-exchange-matrix structure to change the partitions of the matrix. Alternatively, a policy to conform to spatial locality might request the topology from the distributed state layer and invoke the adjustment methods of the matrix object to reassign spatially related partitions to neighboring hosts. Finally, another policy might monitor high-level state information, such as the ratio of synchronization to data transmission, and then modify the algorithm to use larger or smaller granularity sharing with more or less synchronization.
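The core step of such a load-balancing policy can be sketched as a function that turns the per-host speeds reported by the state layer into new row boundaries, which the policy would then hand to the matrix object's adjustment method. This is a hypothetical illustration of one plausible proportional policy (`rebalance_rows` is not a Unify name), not the system's actual load-balancing code.

```c
#include <assert.h>

/* Given measured per-host speeds, compute row boundaries proportional
   to speed: faster hosts receive proportionally more rows.  Writes
   nhosts+1 boundaries into start_out, with start_out[0] == 0 and
   start_out[nhosts] == nrows. */
static void rebalance_rows(int nrows, const double *speed, int nhosts,
                           int *start_out) {
    double total = 0.0;
    for (int i = 0; i < nhosts; i++)
        total += speed[i];

    start_out[0] = 0;
    double acc = 0.0;
    for (int i = 0; i < nhosts; i++) {
        acc += speed[i];
        /* round the cumulative share to the nearest row boundary */
        start_out[i + 1] = (int)(nrows * acc / total + 0.5);
    }
    start_out[nhosts] = nrows;   /* guard against rounding drift */
}
```

A policy built this way stays independent of the data structure: the same proportional rule could drive any adjustable object that accepts a boundary list, which is exactly the reuse the policy/mechanism separation is meant to enable.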

An adaptive algorithm, or set of algorithms, can change the structure of the computations as well as change or adjust the implementations of the data structures it uses. Changes to the algorithm will typically be initiated by the policy component, which will either directly modify the algorithm or adjust the data component, which in turn will be detected by the algorithm. When the algorithm is directly altered, it may be modified in place or an entirely new (more appropriate) algorithm may be substituted. If, instead, the data component is modified, the algorithm will detect the changes in the data structure and adjust accordingly (e.g., between two iterations of a loop the algorithm may find it has a larger or smaller partition of the problem space).
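The second, indirect path can be sketched as an iteration loop that re-reads its partition bounds from the data object on every sweep instead of caching them, so a policy-driven redecomposition takes effect at the next iteration. The names (`partition_t`, `get_partition`, `sweep_rows_processed`) are hypothetical stand-ins for a query on the adjustable data object, used here only to illustrate the detection pattern.

```c
#include <assert.h>

/* The partition of the problem space currently assigned to this host. */
typedef struct {
    int first_row;
    int last_row;   /* exclusive */
} partition_t;

/* Stand-in for the adjustable data object's state; in the framework
   described above, the policy component updates this between sweeps. */
static partition_t current;

static partition_t get_partition(void) {
    return current;
}

/* An adaptive sweep loop: each iteration re-queries its bounds, so a
   redecomposition performed between iterations is picked up without
   any change to the algorithm itself.  Returns total rows processed,
   standing in for the relaxation work a real SOR sweep would do. */
static int sweep_rows_processed(int iterations) {
    int processed = 0;
    for (int it = 0; it < iterations; it++) {
        partition_t p = get_partition();   /* bounds may have moved */
        processed += p.last_row - p.first_row;
        /* ... relax rows p.first_row .. p.last_row - 1 here ... */
    }
    return processed;
}
```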

6 Status

We are currently completing the addition of the probes and system monitors needed by the local and distributed state layers in Unify. The local state layer tracks network statistics including packet loss, network congestion, and achieved network bandwidth. We have implemented a few basic matrix reactive objects and also an insert-query-delete reactive object that can change its implementation. To demonstrate the utility of reactive objects, we performed experiments implementing a successive over-relaxation (SOR) algorithm and the SPLASH Water benchmark [SWG92], first with machine assignments that ignored network topology and then with assignments that tried to group communicating processes together.

[Plots: execution time in seconds versus number of workers (2 to 8), one curve each for Distribution A and Distribution B.]

Figure 3: Example variation in run times resulting from two different mappings of decomposed domains to machines. The graphs show Unify execution times for (a) SOR and (b) the SPLASH Water benchmark using 2, 4, 6, and 8 machines.

Figure 3a shows the execution time of a distributed SOR application working on a 512x512 matrix, while Figure 3b shows the execution times of the Water application on a 4913-molecule problem size. The graphs show the execution time for 2, 4, 6, and 8 workstations. Each workstation resides on one of two subnets interconnected by the campus backbone. With the exception of machine location, the system is completely homogeneous. All workstations are Sun SPARC 20 HyperSPARCs with the same amount of memory, identical disks, and 100 Mbps Ethernet connections. Distribution A represents a distribution of work across machines where machines are selected without regard to network considerations. For Distribution B, the logical problem partitions that required the most communication bandwidth were placed on the same subnet, thereby minimizing communication between subnets. The figures for both applications show a 10-20% increase in performance on identical machines when the distribution of work can adapt to the network topology.

[Plot: total run time in seconds (200 to 600) versus number of machines (4 to 12), with curves for Unify Random, Unify Sequential, and Unify LL.]

Figure 4: Performance comparison between multiple implementations of the SPLASH Water application.

Figure 4 illustrates the performance improvements that result from selecting the appropriate algorithm, memory types, and synchronization schemes for the current environment. The graph shows three different implementations of the Water problem when executing on a single local area network of homogeneous machines. The "Random" curve represents a conventional implementation using Unify's random segment type. The "Sequential" curve uses Unify's sequential-access segments to implement the program's data structures. Finally, "Unify LL" shows the execution time when smaller segments and a local locking mechanism are used together to reduce synchronization overhead. As the number of processors increases, the version with local locking and sequential segments shows a dramatic performance improvement.

7 Summary

For high-performance distributed environments, the dynamic nature of the network and the nodes can make the overall behavior of the system difficult to manage. Reactive objects provide users with high-level abstractions that automatically adapt to the state of the distributed system or application. The system automatically gathers, condenses, and provides access to the performance statistics needed by the distributed application for adaptation. For novice users, reactive objects can provide high-performance "building blocks" for developing applications. Experienced users are provided with default policies, a library of predefined adjustable data structures, and interfaces to the monitoring system that allow them to develop new reactive objects that can be "plugged in" to the reactive object support system. Initial analysis of the potential performance gains attainable with the reactive architecture is promising.

References

[Bat] Peter Bates. Debugging Heterogeneous Distributed Systems using Event-Based Models of Behavior. Proceedings of the ACM SIGPLAN/SIGOPS Workshop on Parallel and Distributed Debugging, published in ACM SIGPLAN Notices, 24(1):11-22, January 1989.

[BW83] P.C. Bates and J.C. Wileden. High-level debugging of distributed systems: The behavioral abstraction approach. Journal of Systems and Software, 3(4), April 1983.

[Cai93] X.C. Cai. An optimal two-level overlapping domain decomposition method for elliptic problems in two and three dimensions. SIAM Journal on Scientific Computing, 14, 1993.

[CS94] Steve J. Chapin and Eugene H. Spafford. Support for Implementing Scheduling Algorithms Using MESSIAHS. Scientific Programming, pages 325-340, 1994.

[EGSM94] Greg Eisenhauer, Weiming Gu, Karsten Schwan, and Niru Mallavarupu. Falcon - Toward Interactive Parallel Programs: The On-line Steering of a Molecular Dynamics Application. In Proceedings of the Third International Symposium on High-Performance Distributed Computing, pages 26-34, August 1994.

[FLA94] V. Freeh, D. Lowenthal, and G. Andrews. Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations. In First Symposium on Operating Systems Design and Implementation, 1994.

[For94] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard. Technical Report CS-94-230, Computer Science Department, University of Tennessee, Knoxville, TN, 1994.

[GEK+95] Weiming Gu, Greg Eisenhauer, Eileen Kraemer, Karsten Schwan, John Stasko, and Jeffery Vetter. Falcon: On-line Monitoring and Steering of Large-Scale Parallel Programs. In Proceedings of the 5th Symposium on the Frontiers of Massively Parallel Computation, February 1995.

[GKS94] A. Grama, V. Kumar, and A. Sameh. Scalable Parallel Formulations of the Barnes-Hut Method for n-Body Simulations. In Supercomputing, 1994.

[Gri93] A.S. Grimshaw. Easy to Use Object-Oriented Parallel Programming with Mentat. IEEE Computer, pages 39-51, May 1993.

[GSN93] A.S. Grimshaw, W.T. Strayer, and P. Narayan. Dynamic Object-Oriented Parallel Processing. IEEE Parallel and Distributed Technology: Systems and Applications, pages 33-47, May 1993.

[GYF95] J. Griffioen, R. Yavatkar, and R. Finkel. Unify: A Scalable Approach to Multicomputer Design. IEEE Computer Society Bulletin of the Technical Committee on Operating Systems and Application Environments, 7(2):24, July 1995.

[HKMC90] Robert Hood, Ken Kennedy, and John Mellor-Crummey. Parallel program debugging with on-the-fly anomaly detection. In Supercomputing '90, pages 74-81, November 1990.

[HML95] J.K. Hollingsworth, B.P. Miller, and J.E. Lumpp, Jr. Techniques for Performance Measurement of Parallel Programs. In T.L. Casavant, P. Tvrdik, and F. Plasil, editors, Parallel Computers: Theory and Practice. IEEE Computer Society Press, Los Alamitos, CA, 1995.

[JK93] Y.K. Jun and K. Koh. On-the-fly Detection of Access Anomalies in Nested Parallel Loops. In Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging, pages 107-117, May 1993.

[LCSM90] J.E. Lumpp, Jr., T.L. Casavant, H.J. Siegel, and D.C. Marinescu. Specification and identification of events for debugging and performance monitoring of distributed multiprocessor systems. In Proceedings of the 10th International Conference on Distributed Computing Systems, pages 476-483, June 1990.

[LM92] R. Luling and B. Monien. Load Balancing for Distributed Branch-And-Bound Algorithms. In Sixth International Parallel Processing Symposium, 1992.

[LMnS90] T.J. LeBlanc and E.P. Markatos. Operating system support for adaptable real-time systems. In Proceedings of the Seventh IEEE Workshop on Real-Time Operating Systems, pages 1-10, May 1990.

[Luc92] S. Lucco. A Dynamic Scheduling Method for Irregular Parallel Programs. In Proceedings of the SIGPLAN '92 Conference, June 1992.

[MC91] J.M. Mellor-Crummey. On-the-fly detection of data races for programs with nested fork-join parallelism. In Proceedings of Supercomputing '91, pages 24-33, November 1991.

[MH] C.E. McDowell and D.P. Helmbold. Debugging concurrent programs. ACM Computing Surveys, 21(4):593-622, December 1989.

[MLCS90] D.C. Marinescu, J.E. Lumpp, Jr., T.L. Casavant, and H.J. Siegel. Models for monitoring and debugging tools for parallel and distributed software. Journal of Parallel and Distributed Computing, 9(2):171-184, June 1990.

[MW91] Keith Marzullo and Mark Wood. Tools for constructing distributed reactive systems. Technical Report TR91-1193, Cornell University, 1991.

[NCO96] NCO. High Performance Computing and Communications: Foundation for America's Information Future. Technical Report http://www.hpcc.gov/blue96/index.html, National Coordination Office for High Performance Computing and Communications (NCO), 1996.

[Sch] E. Schonberg. On-the-Fly Detection of Access Anomalies. Proceedings of the SIGPLAN '89 Conference on Programming Language Design and Implementation, published in ACM SIGPLAN Notices, 24(7):285-297, June 1989.

[SKB] M. Simmons, R. Koskela, and I. Bucher. Instrumentation For Future Parallel Computing Systems. ACM Press, 1989.

[Smi88] J.M. Smith. A Survey of Process Migration Mechanisms. Operating Systems Review, pages 28-40, July 1988.

[Sos92] Rok Sosic. Dynascope: A Tool for Program Directing. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages 12-21, July 1992.

[Sun90] V.S. Sunderam. PVM: A Framework for Parallel Distributed Computing. Concurrency: Practice and Experience, 2(4), December 1990.

[SWG92] Jaswinder Pal Singh, Wolf-Dietrich Weber, and Anoop Gupta. SPLASH: Stanford Parallel Applications for Shared-Memory. Computer Architecture News, 20(1):5-44, March 1992.

[VS95] Jeffery Vetter and Karsten Schwan. Progress: a Toolkit for Interactive Program Steering. Technical Report GIT-CC-95-16, Georgia Institute of Technology, 1995.

[YGS95] R. Yavatkar, J. Griffioen, and M. Sudan. A Reliable Dissemination Protocol for Interactive Collaborative Applications. In Proceedings of the ACM Multimedia '95 Conference, pages 333-344, November 1995.
