Table 1: The observed N-terminus of 223 E. coli genes
Locus | Predicted N-terminus (a) | Observed N-terminus (b) |
accA | m*slnfldfeqpiael | slnfldfeqpia |
accC | mldkivianrgeia | mldkivianrge |
aceE | m*serfpndvdpietr | serfpndvdpie |
aceF | m*aieikvpdigadev | aieikvpdigad |
adk | mriillgapgagkg | mriillgapgag |
agp | mnktliaaavagivllasnaqa*qtvpegyqlqqvlm | qtvpegyqlqqv |
ahpC | m*slintkikpfknqa | slintkikpfkn |
ahpC | (67)y*avstdthfthka(104) | avstdthfthka |
aldA | M*SVPVQHPMYIDGQF | SVPVQ(KG)PMYXD |
araF | mhkftkalaaiglaavmsqsama*enlklgflvkqpee | enlklgflvkqp |
arcA | mqtphilivedelvt | mqtphilivede |
argD | M*AIEQTAITRATFDE | AIEQTAI(FS)RATF |
argG | m*ttilkhlpvgqrig | ttilkhlpvgqr |
argI | m*sgfyhkhflklld | sgfyXkXflkGL |
argT | mkksilalsllvglstaassya*alpetvrigtdttya | plpetvrigtdt |
aroG | mnyqnddlrikeik | mnyqnddlrike |
aroK | mrfqfmscrrslseaglsltnslsstekm*aekrniflvgpmg | aekrniflvgpm |
artI | MKKVLIAALIAGFSLSATA*AETIRFATEASYPP | AETIRFATEASY |
artJ | mkklvlaallasftfgasa*aekinfgvsatypp | aekinfgvsaty |
asd | mknvgfigwrgmvg | mknvgfigwrgm |
asnS | m*svvpvadvlqgr | AVvpvadvlq |
aspC | MFENITAAPADPIL | MFENITAAPADP |
atpA | mqlnsteiselikq | mqlnsteiseli |
atpD | m*atgkivqvigavvd | atgkivqvigav |
atpF | MNLNATILGQAIAF | |
bcp | mnplkagdiapkfs | mnplkagdiapk |
btuB | mikkaslltacsvtafsawa*qdtspdtlvvtanr | qdtspdtlvvta |
carA | liksallvledgtq | miksallvledg |
cpdB | mikfsatllatliaasvna*atvdlrimettdlh | atvdlrimettd |
crr | m*glfdklkslvsddk | glfdklkslvsd |
cspC | m*akikgqvkwfnesk | akikgqvkXfne |
cysI | msm*sekhpgplvvegkl | sekhpgplvveg |
cysK | m*skifednsltight | skifednsltig |
cysP | mavnllkknslalvaslllaghvqa*tellnssydvsrel | tellnsxydvsr |
dapA | mftgsivaivtpmd | mftgsivaivtp |
dapD | mqqlqniietafer | mqqlqniietaf |
dkaA | MQEGQNRKTSSL | MQENQNRK(PS)FXL |
dnaK | m*gkiigidlgttnsc | gkiigidlgttn |
dppA | mrislkksgmlklglslvamtvaasvqa*ktlvycsegspegf | ktlvyXsegspe |
dps | m*staklvkskatnll | staklvkskatn |
dsbA | MKKIWLALAGLVLAFSASA*AQYEDGKQYTTLEK | AQYEDGKQYTTL |
dsbC | mkkgfmlftllaafsgfaqa*ddaaiqqtlakmgikssdiq | ddaaiqqtlakm |
eco | mktilpavlfaafattsawa*aesvqplekiapyp | aesvqplekiap |
efp | m*atyysndfraglki | atyysndfragl |
eno | m*skivkiigreiids | skivkiigreii |
fabD | m*tqfafvfpgqgs | tqfafvfpgq |
fabI | mgflsgkrilvtgv | mgflsgkrilvt |
fabI | m*gflsgkrilvtgva | gflsgkrilvtg |
fba | m*skifdfvkpgvitg | skifdfvkpgvi |
fklB | m*ttptfdtieaqasy | ttptfdtieaqa |
fkpA | MKSLFKVTLLATTMAVALHAPITFA*AEAAKPATAADSKA | AEAAKPATAADS |
fliC | m*aqvintnslslitq | aqvintnslsli |
fliY | MKLAHLGRQALMGVMAVALVAGMSVKSFA*DEGLLNKVKERGTL | DEGLLNKVKERG |
folE | M*PSLSKEAALVHEAL | PSLSKEAALVTE |
frr | misdirkdaevrmd | misdirkdaevr |
ftsZ | mfepmeltndavik | mfepmeltndav |
fumA | m*snkpfhyqapfpl | snkpfhyqapf |
fusA | m*arttpiaryrnigi | arttpiaryrni |
gadA | (378)f*klkdgedpgytl(72) | Llkdgedpgytl |
galU | m*aaintkvkkavipv | aaintkvkkavi |
galU | maaint*kvkkavipvaglgt | ANLkavipvagl |
galU | maain*tkvkkavipvaglgt | TINLkavipvag |
gapA | m*tikvgingfgrigr | tikvgingfgri |
gcvT | M*AQQTPLYEQHTLCG | AQQTPLYEQHTL |
gdhA | mdqtyslesflnhv | mdqtysleXfln |
glnA | m*saehvltmlneh | saehvltmln |
glnH | mksvlkvslaaltlafavssha*adkklvvatdtafv | adkklvvatdta |
glnS | m*seaearptnfirqi | seaearptnfir |
gltD | M*SQNVYQFIDLQRVD | SQNVYQFIDLQR |
glyA | mlkremniadydae | mlkremniadyd |
glyS | m*sektflveigteel | sektflveigte |
gpmA | m*avtklvlvrhgesq | avtklvlvrhge |
guaB | mlriakealtfddv | mlriakealtfd |
guaC | mrieedlklgfkdv | mrieedlklgfL |
hdeA | mkkvlgvilggllllpvvsna*adaqkaadnkkpvn | adaqkaadnkkp |
hdeB | MGYKMNISSLRKAFIFMGAVAALSLVNAQSALA*ANESAKDMTCQEFI | ANESAKDMTHQE |
hemX | MTEQEKTSAVVEET | MTEQEKTSAAXE |
hisD | M*SFNTIIDWNSCT | SFNTIIDPNX(PYEK)T |
hisJ | MKKLVLSLSLVLAFSSATAAFA*AIPQNIRIGTDPTY | AIPQNIRIGTDP |
hlpA | vkkwllaaglglalatsaqa*adkiaivnmgslfq | adkiaivnmXsl |
hmpA | mldaqtiatvkati | mldaqtiatvka |
hns | m*sealkilnnirtlr | sealkilnnirt |
htpG | mkgqetrgfqse | mkgqetXgfq |
hupA | mnktqlidviaeka | mnktqlidviae |
hupB | mnksqlidkiaaga | mnksqlidkiaa |
icdA | meskvvvpaqgkki | meskvvvpaqgk |
ilvC | M*ANYFNTLNLRQQLA | ANYFNTLNLRQQ |
ilvI | memlsgaemvvrsl | memlsgaemvvr |
imp | MKKRIPTLLATMIATALYSQQGLA*ADLASQC | ADLAS |
kdsA | mkqkvvsigdinva | m(GM)qkvvsigdin |
leuA | m*sqqviifdttlrdg | sqqviifdttlr |
leuB | M*SKNYHIAVLPGDGI | SKNYHIAVLPGD |
leuC | m*aktlyeklfdahvv | aktlyeklfdah |
livJ | mnikgkallagcialafsnmala*edikvavvgamsgp | edikvavvgams |
livK | mkrnaktiiagmialaishtama*ddikvavvgamsgp | ddikvavvgams |
lolA | MMKKIAITCALLSSLVASSVWA*DAASDLKSRLDKVS | DAASDLKSRLDK |
lpdA | mm*steiktqvvvlgag | steiktqvvvlg |
malE | mkiktgarilalsalttmmfsasala*kieegklviwingd | kieegklviXin |
manX | v*tiaivigthgwaaet | tiaivigthgwa |
mdh | mkvavlgaaggigq | mkvavlgaaggi |
mdoG | MMKMRWLSAAVMLTLYTSSSWA*FSIDDVAKQAQSLA | FXIDDVAKQAXS |
metE | m*tilnhtlgfprvgl | tilnhtlgfprv |
mglB | mnkkvltlsavmasmlfgaaaha*adtrigvtiykydd | adtrigvtiyky |
minD | m*ariivvtsgkggvg | ariivvtsgkgg |
mopA | m*aakdvkfgndarvk | aakdvkfgndar |
mopB | mnirplhdrvivkr | mnirplhdrviv |
mreB | mlkkfrgmfsndls | mlkkfrgmfsnd |
nadE | mtlqqqiikalgen | mtlqqqiikalg |
nfnB | mdiisvalkrhstk | mdiisvalkrhs |
nuoB | mdytltridpngen | mdytltridpng |
nuoG | m*atihvdgkeyevng | atihvdgkeyev |
nuoI | mtlkellvgfgtqv | mtlXellvgfgt |
nusA | mnkeilavveavsn | mnXeilavveXv |
ompA | MKKTAIAIAVALAGFATVAQA*APKDNTWYTGAKLG | APKDNTWYTGAK |
ompC | mkvkvlsllvpallvagaana*aevynkdgnkl | aevynkdgn |
ompF | mmkrnilavivpallvagtana*aeiynkdgnkvdly | aeiynkdgnkvd |
ompF | (36)k*avglhyfsk(313) | avglhyfsk |
oppA | mtnitkrslvaagvlaalmagnvala*advpagvtlaekqt | advpagvtlaekq |
osmC | M*TIHKKGQAHWEGDI | TIHKKGQAHIEG |
osmY | mtmtrlkisktllavmltsavatgsaya*ennaqttnesagqk | ennaqttnesag |
pal | MQLNKVLKGLMIALPVMAIAA**CSSNKNASNDGS | |
panB | mkpttisllqkykq | mkpttiSLLQXY |
pckA | mrvnngltpqelea | mrvnngltpqel |
pgk | m*svikmtdldlagkr | svikmtdldlag |
pnp | llnpivrkfqygqh | mlnpivrkfqyg |
potD | mkkwsrhllaagalalgmsaaha*ddnntlyfynwtey | ddnntlyfynXt |
potF | MTALNKKWLSGLVAGALMAVSVGTLA*AEQKTLHIYNW | AENKTLXIYNV |
ppa | m*sllnvpagkdlped | sllnvpagkdlp |
ppa | (92)l*kmtdeagedakl(68) | kmtdeagedakl |
ppiB | mvtfhtnhgdivik | mvtfhtnhgdiv |
proS | mrtsqyllstlket | mrtsqyllstlk |
prsA | vpdmklfagnatpe | (PA)pdmklfagnat |
pstS | mkvmrttvatvvaatlsmsafsvfa*easltgagatfpap | easltgagatfp |
ptsH | mfqqevtitapngl | mfqqevtitapn |
ptsI | misgilaspgiaf | misgilaXpgi |
purA | M*GNNVVVLGTQWGDE | GNNVXXLGTQXA(VL) |
purC | mqkqaelyrgkakt | mqkqaelyrgka |
purH | mqqrrpvrrallsv | mqqrrpvrrall |
purM | M*TDKTSLSYKDAGV | TDKTSLSXXDD |
pykF | mkktkivctigpkt | mkktkivAtigp |
pyrB | m*anplyqkhiisin | anplyqkhiis |
pyrC | m*tapsqvlkirrpdd | tapsqvlkirrp |
pyrG | m*ttnyifvtggvvss | ttnyifvtggvv |
pyrI | mthdnklqveaikr | mthdnklqveai |
pyrI | m*thdnklqveaikrg | thdnklqveaik |
rbsB | mnmkklatlvsavalsatvsanama*kdtialvvstlnnp | kdtialvvstln |
rfaD | miivtggagfigsn | miivtggagfig |
rho | mnltelkntpvsel | mnltelkntpvs |
rpiA | mtqdelkkavgwa | mtqdelkkavg |
rplA | m*akltkrmrvirekv | akltkrmrviIe |
rplC | MIGLVGKKVGMT | MIGLVGKKVG |
rplD | MELVLKDAQSALTV | MELVLKDAQSAL |
rplF | m*srvakapvvvpagv | srvakapvvvpa |
rplI | mqvilldkvanlgs | mqvilldkvanl |
rplL | M*SITKDQIIEAVAAM | SITKDXIIEXV |
rplM | (34)R*RLRGKHKAEYTP(96) | RLRGKHKAEYTP |
rplY | MFTINAEVRKEQGK | MFTINAEVRREQ |
rpoA | mqgsvteflkprlvd | mqgsvteflkprl |
rpsA | MTESFAQLFEESLK | MTESFAQLFEES |
rpsA | m*tesfaqlfees | tesfaqlfe |
rpsB | m*atvsmrdmlkagvh | atvsmrdmlkag |
rpsF | mrhyeivfmvhpdq | mrhyeivfmvXp |
rpsJ | mqnqririrlkafd | mqnqriXirlLa |
rpsP | MVTIRLARHGAKKR | MVTIRLAR(EA)GA(VP) |
sbp | mnkwgvgltfllaatsvma*kdiqllnvsydptr | kdiqllnvsydp |
sdhA | mklpvrefdavvig | mklpvrefdavv |
sdhB | mrlefsiyrynpd | mrlefsiyryn |
serA | m*akvslekdkikfll | akvslekdkikf |
serC | m*aqifnfssgpamlp | aqifnfssgpam |
slp | MNMTKGALILSLSFLLAA**CSSIPQNIKGNN | |
sodA | m*sytlpslpyaydal | sytlpslpyayd |
sodB | m*sfelpalpyakdal | sfelpalpyakd |
sseA | M*STTWFVGADWLAEH | STTXFVGADDXA |
sspA | m*avaankrsvmtlfs | avaankrsvmtl |
sucB | M*SSVDILVPDLPESV | SSVDILVPDLPE |
sucC | mnlheyqakqlfar | mnlheyqakqlf |
sucD | m*silidkntkvicqg | silidkntkvic |
sufI | mslsrrqfiqasgialcagavplkasa*agqqqplpvpplle | agqqqplpvppl |
surA | mknwktlllgiamiantsfa*apqvvdkvaavvnn | apqvvdkvaavv |
talB | m*tdkltslrqyttvv | tdkltslrqytt |
thrC | mklynlkdhneqvs | mklynlkdhneq |
tig | mqvsvettqglgrr | mqvsvettqglg |
tig | (35)k*kvridgfrkgkv(381) | kvridgfrkgkv |
tig | (42)r*kgkvpmnivaq(374) | kgkvpmnivaq |
tig | (43)r*kgkvpmnivaqr(373) | kgkvpmnivaqr |
tig | (44)k*gkvpmnivaqry(372) | gkvpmnivaqry |
tnaA | menfkhlpepfrir | menfkhlpepfr |
tolC | mkkllpiliglslsgfsslsqa*enlmqvyqqarlsn | enlmqvyqqarl |
tpiA | mrhplvmgnwklng | mrXplvmgnXkl |
tpx | M*SQTVHFQGNPVTVA | SQTVHFQGNPVT |
trpA | meryeslfaqlker | meryeslfaqlk |
trpB | M*TTLLNPYFGEFGGM | TTLLNPYFGEFG |
tsf | m*aeitaslvkelrer | aeitaGlvkelr |
tufA/B | M**SKEKFERTKPHVNV | |
tufA/B | (308)y*ilskdeggrhtp(70) | ilskdeggrhtp |
upp | mkivevkhplvkhk | mkivevkhplvk |
ushA | mkllqrgvalallttftlasetala*yeqdktykitvlht | yeqdktykitvl |
uspA | m*aykhiliavdlspe | aykhiliavdls |
valS | mektynpqdieqpl | meFtynpqdieq |
xylF | mkiknilltlctsllltnvaaha*kevkigmaiddlrl | kevkigmaiddl |
yacI | VLEEYRKHVAERAA | MLEEYRKHVAER |
yacK | mqrrdflkysvalgvasalplwsravfa*aerptlpipdlltt | aerptlpipTll |
yaeT | mamkklliasllfssatvyg*aegfvvkdihfegl | aegfvvkdihfe |
yaeT | (350)R*KIRFEGNDTSKD(448) | KIRFEGNDTSXD |
yajG | MFKKILFPLVALFMLAG**CAKPPTTIEVSP | |
ybdQ | MYKTIIMPVDVFEM | MYKTIIMPVDVF |
ybiS | MNMKLKTLFAAAFAVVGFCSTASA*VTYPLPTDGSRLVG | VTYPLPTDGSRL |
yceI | mkksllgltfaslmfsagsava*adykidkegqhafv | adykidkegqha |
ychF | mgfkcgivglpnvg | XXfkXgivglpn |
ychF | m*gfkcgivglpnvgk | (AS)fkXgivglpnv |
ydcG | MDRRRFIKGSMAMAAVCGTSGIASLFSQAAFA*ADSDIADGQTQRFD | ADSDIADGQTQR |
ydfG | MIVLVTGATAGFGE | MIVLVTGATAGF |
yeaD | MKLKDCV*MIKKIFALPVIEQI | MIKKIFALPVIE |
yebL | MKCYNITLLIFITIIGRIMLHKKTLLFAALSAALWGGATQAADA*AVVASLKPVGFIAS | AVVASLKPVGFI |
yeeQ | (104)A*ADIVVHPGETV(976) | ADIVVHPGTTT |
yfiA | m*tmnitskqmeitpa | tmnitskqmeiF |
ygaG | MPLLDSFTVDH | (A)PLLDSFTV |
ygaG | M*PLLDSFTVDHT | PLLDSFTVD |
ygaU | M*GLFNFVKDAGEKLW | GLFNFVKDAGEK |
ygfZ | M*AFTPFPPRQPTASA | AFTPFPPRQPTA |
yggX | M*SRTIFCTFLQREAA | SRTIFXTFLQIE |
ygiN | MLTVIAEIRTRPGQ | MLTVIAEIRTRP |
yhbG | m*atltaknlakaykg | atltaknlaXay |
yhbN | MKFKTNKLSLNLVLASSLLAASIPAFA*VTGDTDQPIHIESD | VTGDTDQPIHIE |
yhfO | MMYGVYRA*MKLPIYLDYSATTP | MKLPIYLDYSAT |
yhjJ | mqgtkirllaggllmmatagyvqa*dalqpdpawqqgtl | dalqpdpaXqqg |
yhjW | (228)a*rvdessdnnsll(245) | rvdessdnnsll |
yiaE | MERS*MKPSVILYKALP | M(NI)PSVI(NVD)YTAIP |
yifE | M*AESFTTTNRYFDNK | AESFTTTNRYFD |
yigW | MKKFAAVIAVMALCSAPV*MAAEQGGFSGPSATQS | AAAEQGGFSGPSA |
yihK | vieklrniaiiahv | Mieklrniaiia |
yjbJ | MNKDEAGGNWKQFK | MNKDEAGGNXKQ |
yjbP | mrkitqaisavcllfalnssavala*sspsplnpgtnvar | sspsplnpgtnv |
yjbP | MRKITQAISAVCLLFALNSSAVA*LASSPSPLNPGTNV | LASSPSPLNPGT |
yjgF | M*SKTIATENAPAAIG | SKTIATENAPAA |
yjjK | M*AQFVYTMHRVGK | A(DE)FVYTMXRV(LI)(GA) |
ynaF | MNSVITQKVSSGVTLYADTKTGGF*MNRTILVPIDISDS | MNRTILVPIDIS |
yphF | MPTKMRTTRNLLLMATLLGSALFARA*AEKEMTIGAIYLDT | AEKEMTIGAIYL |
ytfJ | mtlrkilaltclllpmmasa*hqfetgqrvppigi | hqfetgqrvppi |
ytfQ | MWKRLLIVSAVSAAMSSMALA*APLTVGFSQVGSES | APLTVGFSQVGS |
A "*" in the protein sequence shows the observed start site based on the N-terminal sequence tag.
A "**" in the protein sequence shows the predicted N-terminus of the mature protein based on published literature. For observed N-termini matching an internal region of an E. coli gene, the number of amino acids between the predicted N- and C-termini of the conceptual protein and the observed N-terminus is shown in parentheses.
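The notation described in these footnotes can be read mechanically. The following is a minimal sketch, not part of the original analysis, that splits a "Predicted N-terminus" cell into its annotated parts. The function name `parse_predicted`, the dictionary keys, and the assumed cell layout (optional count in parentheses, residues, optional "*" or "**", residues, optional count in parentheses) are illustrative assumptions inferred from the table rows and footnotes.

```python
import re

# Assumed cell layout, inferred from the table and its footnotes:
#   [(count)] residues [* or **] residues [(count)]
CELL = re.compile(
    r"(?:\((\d+)\))?"   # optional leading count in parentheses
    r"([A-Za-z]*)"      # residues before the marker (or the whole sequence)
    r"(\*{1,2})?"       # "*" = observed start, "**" = literature-based start
    r"([A-Za-z]*)"      # residues after the marker
    r"(?:\((\d+)\))?"   # optional trailing count in parentheses
)

def parse_predicted(cell):
    """Split one 'Predicted N-terminus' cell into its annotated parts."""
    m = CELL.fullmatch(cell.strip())
    if m is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    lead, before, marker, after, trail = m.groups()
    return {
        # Counts are present only for matches to an internal region.
        "n_side_count": int(lead) if lead else None,
        "c_side_count": int(trail) if trail else None,
        "marker": marker or "",
        # With a marker, residues before it precede the marked start site;
        # without one, the predicted sequence itself begins at the start.
        "upstream": before if marker else "",
        "start": after if marker else before,
    }

if __name__ == "__main__":
    for cell in ("m*slnfldfeqpiael", "(67)y*avstdthfthka(104)", "mldkivianrgeia"):
        print(cell, "->", parse_predicted(cell))
```

For example, `parse_predicted("m*slnfldfeqpiael")` reports `upstream` as `"m"` and `start` as `"slnfldfeqpiael"`, which can then be compared against the observed tag in the third column. The sketch does not interpret the `X` and parenthesized residues that appear in the observed column.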