没有合适的资源?快使用搜索试试~ 我知道了~
A Guide To The Kafka Protocol
需积分: 9 9 下载量 163 浏览量
2015-09-01
16:25:02
上传
评论
收藏 429KB PDF 举报
温馨提示
试读
15页
A Guide To The Kafka Protocol - Apache Kafka - Apache Software Foundation
资源推荐
资源详情
资源评论
9/1/2015 AGuideToTheKafkaProtocolApacheKafkaApacheSoftwareFoundation
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol 1/15
Pages / Index
CreatedbyJayKreps,lastmodifiedbyEvanHuusonMar27,2015
AGuideToTheKafkaProtocol
Introduction
Overview
Preliminaries
Network
Partitioningandbootstrapping
PartitioningStrategies
Batching
VersioningandCompatibility
TheProtocol
ProtocolPrimitiveTypes
Notesonreadingtherequestformatgrammars
CommonRequestandResponseStructure
Requests
Responses
Messagesets
Compression
TheAPIs
MetadataAPI
TopicMetadataRequest
MetadataResponse
ProduceAPI
ProduceRequest
ProduceResponse
FetchAPI
FetchRequest
FetchResponse
OffsetAPI
OffsetRequest
OffsetResponse
OffsetCommit/FetchAPI
ConsumerMetadataRequest
ConsumerMetadataResponse
OffsetCommitRequest
OffsetCommitResponse
OffsetFetchRequest
OffsetFetchResponse
Constants
ApiKeys
ErrorCodes
SomeCommonPhilosophicalQuestions
Introduction
ThisdocumentcoverstheprotocolimplementedinKafka0.8andbeyond.Itismeanttogiveareadableguidetothe
protocolthatcoverstheavailablerequests,theirbinaryformat,andtheproperwaytomakeuseofthemto
implementaclient.Thisdocumentassumesyouunderstandthebasicdesignandterminologydescribedhere.
Theprotocolusedin0.7andearlierissimilartothis,butwechosetomakeaonetime(wehope)break
incompatibilitytobeabletocleanupcruftandgeneralizethings.
Overview
TheKafkaprotocolisfairlysimple,thereareonlysixclientrequestsAPIs.
9/1/2015 AGuideToTheKafkaProtocolApacheKafkaApacheSoftwareFoundation
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol 2/15
1. MetadataDescribesthecurrentlyavailablebrokers,theirhostandportinformation,andgivesinformation
aboutwhichbrokerhostswhichpartitions.
2. SendSendmessagestoabroker
3. FetchFetchmessagesfromabroker,onewhichfetchesdata,onewhichgetsclustermetadata,andone
whichgetsoffsetinformationaboutatopic.
4. OffsetsGetinformationabouttheavailableoffsetsforagiventopicpartition.
5. OffsetCommitCommitasetofoffsetsforaconsumergroup
6. OffsetFetchFetchasetofoffsetsforaconsumergroup
Eachofthesewillbedescribedindetailbelow.
Preliminaries
Network
KafkausesabinaryprotocoloverTCP.Theprotocoldefinesallapisasrequestresponsemessagepairs.All
messagesaresizedelimitedandaremadeupofthefollowingprimitivetypes.
Theclientinitiatesasocketconnectionandthenwritesasequenceofrequestmessagesandreadsbackthe
correspondingresponsemessage.Nohandshakeisrequiredonconnectionordisconnection.TCPishappierifyou
maintainpersistentconnectionsusedformanyrequeststoamortizethecostoftheTCPhandshake,butbeyond
thispenaltyconnectingisprettycheap.
Theclientwilllikelyneedtomaintainaconnectiontomultiplebrokers,asdataispartitionedandtheclientswill
needtotalktotheserverthathastheirdata.Howeveritshouldnotgenerallybenecessarytomaintainmultiple
connectionstoasinglebrokerfromasingleclientinstance(i.e.connectionpooling).
TheserverguaranteesthatonasingleTCPconnection,requestswillbeprocessedintheordertheyaresentand
responseswillreturninthatorderaswell.Thebroker'srequestprocessingallowsonlyasingleinflightrequestper
connectioninordertoguaranteethisordering.Notethatclientscan(andideallyshould)usenonblockingIOto
implementrequestpipeliningandachievehigherthroughput.i.e.,clientscansendrequestsevenwhileawaiting
responsesforprecedingrequestssincetheoutstandingrequestswillbebufferedintheunderlyingOSsocketbuffer.
Allrequestsareinitiatedbytheclient,andresultinacorrespondingresponsemessagefromtheserverexcept
wherenoted.
Theserverhasaconfigurablemaximumlimitonrequestsizeandanyrequestthatexceedsthislimitwillresultin
thesocketbeingdisconnected.
Partitioningandbootstrapping
Kafkaisapartitionedsystemsonotallservershavethecompletedataset.Insteadrecallthattopicsaresplitintoa
predefinednumberofpartitions,P,andeachpartitionisreplicatedwithsomereplicationfactor,N.Topicpartitions
themselvesarejustordered"commitlogs"numbered0,1,...,P.
Allsystemsofthisnaturehavethequestionofhowaparticularpieceofdataisassignedtoaparticularpartition.
Kafkaclientsdirectlycontrolthisassignment,thebrokersthemselvesenforcenoparticularsemanticsofwhich
messagesshouldbepublishedtoaparticularpartition.Rather,topublishmessagestheclientdirectlyaddresses
messagestoaparticularpartition,andwhenfetchingmessages,fetchesfromaparticularpartition.Iftwoclients
wanttousethesamepartitioningschemetheymustusethesamemethodtocomputethemappingofkeyto
partition.
Theserequeststopublishorfetchdatamustbesenttothebrokerthatiscurrentlyactingastheleaderforagiven
partition.Thisconditionisenforcedbythebroker,soarequestforaparticularpartitiontothewrongbrokerwill
resultinantheNotLeaderForPartitionerrorcode(describedbelow).
Howcantheclientfindoutwhichtopicsexist,whatpartitionstheyhave,andwhichbrokerscurrentlyhostthose
partitionssothatitcandirectitsrequeststotherighthosts?Thisinformationisdynamic,soyoucan'tjust
configureeachclientwithsomestaticmappingfile.InsteadallKafkabrokerscananswerametadatarequestthat
describesthecurrentstateofthecluster:whattopicsthereare,whichpartitionsthosetopicshave,whichbrokeris
theleaderforthosepartitions,andthehostandportinformationforthesebrokers.
Inotherwords,theclientneedstosomehowfindonebrokerandthatbrokerwilltelltheclientaboutalltheother
brokersthatexistandwhatpartitionstheyhost.Thisfirstbrokermayitselfgodownsothebestpracticeforaclient
implementationistotakealistoftwoorthreeurlstobootstrapfrom.Theusercanthenchoosetouseaload
balancerorjuststaticallyconfiguretwoorthreeoftheirkafkahostsintheclients.
Theclientdoesnotneedtokeeppollingtoseeiftheclusterhaschanged;itcanfetchmetadataoncewhenitis
9/1/2015 AGuideToTheKafkaProtocolApacheKafkaApacheSoftwareFoundation
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol 3/15
instantiatedcachethatmetadatauntilitreceivesanerrorindicatingthatthemetadataisoutofdate.Thiserrorcan
comeintwoforms:(1)asocketerrorindicatingtheclientcannotcommunicatewithaparticularbroker,(2)anerror
codeintheresponsetoarequestindicatingthatthisbrokernolongerhoststhepartitionforwhichdatawas
requested.
1. Cyclethroughalistof"bootstrap"kafkaurlsuntilwefindonewecanconnectto.Fetchclustermetadata.
2. Processfetchorproducerequests,directingthemtotheappropriatebrokerbasedonthetopic/partitionsthey
sendtoorfetchfrom.
3. Ifwegetanappropriateerror,refreshthemetadataandtryagain.
PartitioningStrategies
Asmentionedabovetheassignmentofmessagestopartitionsissomethingtheproducingclientcontrols.That
said,howshouldthisfunctionalitybeexposedtotheenduser?
PartitioningreallyservestwopurposesinKafka:
1. Itbalancesdataandrequestloadoverbrokers
2. Itservesasawaytodivvyupprocessingamongconsumerprocesseswhileallowinglocalstateand
preservingorderwithinthepartition.Wecallthissemanticpartitioning.
Foragivenusecaseyoumaycareaboutonlyoneoftheseorboth.
Toaccomplishsimpleloadbalancingasimpleapproachwouldbefortheclienttojustroundrobinrequestsoverall
brokers.Anotheralternative,inanenvironmentwheretherearemanymoreproducersthanbrokers,wouldbeto
haveeachclientchoseasinglepartitionatrandomandpublishtothat.Thislaterstrategywillresultinfarfewer
TCPconnections.
Semanticpartitioningmeansusingsomekeyinthemessagetoassignmessagestopartitions.Forexampleifyou
wereprocessingaclickmessagestreamyoumightwanttopartitionthestreambytheuseridsothatalldatafora
particularuserwouldgotoasingleconsumer.Toaccomplishthistheclientcantakeakeyassociatedwiththe
messageandusesomehashofthiskeytochoosethepartitiontowhichtodeliverthemessage.
Batching
Ourapisencouragebatchingsmallthingstogetherforefficiency.Wehavefoundthisisaverysignificant
performancewin.BothourAPItosendmessagesandourAPItofetchmessagesalwaysworkwithasequenceof
messagesnotasinglemessagetoencouragethis.Acleverclientcanmakeuseofthisandsupportan
"asynchronous"modeinwhichitbatchestogethermessagessentindividuallyandsendstheminlargerclumps.We
goevenfurtherwiththisandallowthebatchingacrossmultipletopicsandpartitions,soaproducerequestmay
containdatatoappendtomanypartitionsandafetchrequestmaypulldatafrommanypartitionsallatonce.
Theclientimplementercanchoosetoignorethisandsendeverythingoneatatimeiftheylike.
VersioningandCompatibility
Theprotocolisdesignedtoenableincrementalevolutioninabackwardcompatiblefashion.Ourversioningisona
perapibasis,eachversionconsistingofarequestandresponsepair.EachrequestcontainsanAPIkeythat
identifiestheAPIbeinginvokedandaversionnumberthatindicatestheformatoftherequestandtheexpected
formatoftheresponse.
Theintentionisthatclientswouldimplementaparticularversionoftheprotocol,andindicatethisversionintheir
requests.OurgoalisprimarilytoallowAPIevolutioninanenvironmentwheredowntimeisnotallowedandclients
andserverscannotallbechangedatonce.
Theserverwillrejectrequestswithaversionitdoesnotsupport,andwillalwaysrespondtotheclientwithexactly
theprotocolformatitexpectsbasedontheversionitincludedinitsrequest.Theintendedupgradepathisthatnew
featureswouldfirstberolledoutontheserver(withtheolderclientsnotmakinguseofthem)andthenasnewer
clientsaredeployedthesenewfeatureswouldgraduallybetakenadvantageof.
Currentlyallversionsarebaselinedat0,asweevolvetheseAPIswewillindicatetheformatforeachversion
individually.
TheProtocol
ProtocolPrimitiveTypes
剩余14页未读,继续阅读
资源评论
renzhewh
- 粉丝: 39
- 资源: 101
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功