These are unedited transcripts and may contain errors.
EIX Working Group session at 11 a.m. on 3rd November, 2011:
CHAIR: Good morning everybody. I would like to welcome you all to EIX at RIPE 63. We have got a fairly busy schedule today, quite a lot of talks, and we are also reorganising things in the middle as people have had meetings pulled on them. So, welcome to you all. I would like to thank Amanda for scribing for us, and we will have Susan later as Jabber scribe. Microphone etiquette: if you are asking questions, give your name and affiliation so the people watching the webcast know who you are; we are being webcast.
Does anyone have anything else they would like to add to the agenda? OK.
I have been asked to announce the BoF this evening at 17:45 in the small room next door, which is on IPv6 privacy issues, and that will run probably for an hour, an hour-and-a-half, before the dinner.
OK. First up is Christian Panigl who is our host and he is going to talk about local peering in the Austrian area.
CHRISTIAN PANIGL: Good morning everybody. Those of you who have been at the party, I hope you enjoyed it and are awake already. I am trying my best. We were cleaning up until 3:00 in the morning, so it was a nice but challenging night.
Thanks to all of you who have, who came and celebrated with us.
So this talk was called all about peering in Austria, and I have to admit I know very little about peering in Austria, because I am just a little exchange point operator and a little operator of an academic network, so I don't actually know everything that commercial ISPs, carriers, wholesale, and so on are doing in Austria. Therefore, I can only focus on the little I know about peering in Austria. Nevertheless, it's a little bit. So actually, IP peering in Austria, and I think that is what we are talking about today, started with the birth of the Vienna Internet Exchange back in 1996 at the University of Vienna, the famous address where more or less everything that started in connection with the Internet in Austria was created. It had the first Internet connection and an EBONE POP, and that was also the reason that, close to the first EBONE POP at the university, we set up an Internet Exchange point, a small facility to give those who were using EBONE as upstream provider the chance to exchange national traffic without burdening the international transit bills.
So that was back in 1996, and I looked up our first logo; it was a picture of the Vienna Ferris wheel. And there is a small anecdote: after we put this picture up on our website, I got a whole lot of requests about this Ferris wheel in Vienna, asking for information and so on, so I also put some information online behind this picture. You could click on the picture and you got the history of the Ferris wheel, and the funniest request I got was, how much is it? OK.
The early days of the Vienna Internet Exchange: we had five founding members, originally connected on RG-58 cable, thin-wire Ethernet cable. Those were the Austrian network; VIAnet, no longer existing; IBM Global Network, which changed into AT&T Global Network; then EUnet Austria, no longer existent; and the Austria Press Agency. So you see, from the five founding members only two are in the original state and three are still alive in a way.
But from then on... In the early days, nobody believed that this exchange point was really needed in Austria, or they thought it would stay very small, but after only 12 months we had 20 participants and really the alternative ISP scene was growing like crazy. Another 48 months later we had 70 participants, already a lot of them coming from the international area, and this made us look for additional space. The space at the university was very limited; therefore, we were looking for a second location for an extension of the Vienna Internet Exchange site, and found Interxion Austria, which was just starting to establish its data centre, and it was at that time, and more or less still is, the only carrier-neutral data centre in Vienna.
So we created a dual-site set-up for the Vienna Internet Exchange, which is still in place and looks like this. So we have a resilient infrastructure: VIX1 at the university, the original one, and VIX2 at Interxion, interconnected by two resilient fibre connections.
So the days after 2001 have not been so nice. There were massive consolidations going on in the ISP market in Austria, like in most European countries, and not only European countries, everywhere I guess. Telekom Austria, the Austrian incumbent, was integrating a lot or most of the alternative ISPs, so telecom regulation was not actually working out as it was planned, so that is a little bit of a sad story, I must say. Other small and very innovative and active ISPs have been taken over by Tele2 and UPC. So today, there are only three big eyeball providers left in Austria: A1 Telekom Austria, Tele2 and UPC.
So, a brief overview of our current status at the Vienna Internet Exchange: it's still a university-operated Internet Exchange, one of the very few remaining in Europe; most have, meanwhile, been organised as a separate entity. We are still doing it in the context of the University of Vienna and we feel it's still well hosted there. We are operating it with the same team which is responsible for the national research network backbone, ACOnet, and we are, I think, really living a neutral, robust, resilient and nonprofit environment both for the academic and for the commercial Internet sector in the Central and Eastern European region. Today, we have 107 active participants, so 107 AS numbers, so to say, on 138 ports, and more than half of them are either dual-stacked IPv6 or dedicated IPv6, but I think we only have two dedicated IPv6 ports left; all others are, meanwhile, a mixed dual-stack set-up.
And 18 participants, with a growing tendency, are on 10 gig, so it's not comparable to the big ones like AMS-IX, DE-CIX and so on, but in the mid-range I think we have a pretty nice set-up here.
The traffic summary peaks are climbing to about 60 gigabits today, and we have had a well-accepted route server in place for two or three years, with a very convenient administration GUI, so it's very convenient to set up and to overwrite defaults individually. And we have a really great availability history, with an overall availability of better than 99 percent; in every single year we were never dropping below 99.97, and since we are dual site, never below 99.98, so I think for a university-operated exchange point it's quite nice.
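To put the availability figures into perspective, here is a minimal sketch of the arithmetic, assuming availability is measured over a non-leap year of 8,760 hours. The function name is illustrative and not taken from the talk.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (an assumption)


def max_downtime_hours(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability figure."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


# The two yearly lower bounds quoted in the talk.
for pct in (99.97, 99.98):
    print(f"{pct}% availability allows about {max_downtime_hours(pct):.2f} hours of downtime per year")
```

Under this assumption, the step from 99.97 to 99.98 percent is a difference of a little under one hour of downtime per year.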
So, I have looked up, actually, what else is available in Austria, and there have been some other initiatives over the years. Those which I could really identify with a public web page, announcing properly the members and so on, so what we in the Euro-IX community call an official IP peering point in Austria, are two. One is the Alps Adria Internet Exchange, you see the web link here. It's currently announcing eight participants and another four in connecting set-up state, so 12 at least in due time. The other is the Graz Internet Exchange, GRAX, which currently has nine participants, at least this is announced on the website. Both are free exchange points, so they are offering one gigabit ports to all participants for free; these are sponsored by data centres or ISPs and try to facilitate regional peering, especially in the city of Graz and in the other region, and further south, Slovenia and so on.
Is there somebody here from the Graz Internet Exchange? Or anybody who is connected to it? But Graz is not run by you ...
AUDIENCE SPEAKER: The same idea ...
CHRISTIAN PANIGL: Thank you. One in cooperation with a data centre, yes. He is here if you want to talk to him. And there are some other private peering facilities at universities, at ISPs and data centres, but I did not find any official information about that, any public information about that.
So the peering situation, coming back to the Vienna Internet Exchange, because I can't talk for the others, is that we have 40% of the participants already coming from outside Austria and this tendency is growing. As I said earlier, the local, national peering scene was consolidated in a way, so most of the ISPs have been integrated and, therefore, there is not much national peering traffic left, but because so many new participants are coming from abroad or from neighbouring countries, they are filling the gap, so to say. It's carriers, it's ISPs, it's content delivery networks, it's content providers and hosters, and we are very happy to announce that Limelight will turn on their connection very soon, and Paul and myself are very optimistic that Google will show up very soon, also, as a participant of the Vienna Internet Exchange. For quite some time we have had Akamai, Microsoft and other international players, so it's good to see that it's going on, it's growing and still alive, and more than just alive.
Also, a good story is that 70 percent of the participants at the Vienna Internet Exchange have an open peering policy, so it's good for new members, they get an easy start, connect with the route server, configure an open peering policy, more or less covered within 70 percent of the other members, that is really good.
We are still focused on central European and regional peering. We are not trying to get into competition with the big ones. At one of the last Euro-IX fora, I kind of introduced the terminology of tier 1, 2, 3 IXPs, and I regarded ourselves as the classical tier 2 IXP, while our friends from AMS-IX, LINX, DE-CIX and so on, the big ones, are more the Tier 1s: they are shipping a lot of international traffic, and transit traffic is actually done there, international transit peerings. We belong to the classical regional IXPs, trying to optimise traffic flows in the region, and this is our role, a role we can also live in the university context. It's not our task to go commercial or to try to get everybody from all around the world to connect to Vienna. That is definitely not our role. But we have identified a very high potential of new participants from neighbouring countries, ISPs and content networks, and we should address them, so we definitely need a little bit more PR, and that was one of the motivations to volunteer as local hosts for the RIPE meeting: to bring you all to Vienna and have a chance to talk to you. I am looking forward to that during this day, tomorrow, and during the dinner, probably. If you are interested, please come forward and talk to me or any of my colleagues.
Also, we think that partnership programmes might help to speed up the growth a little bit and to use the high potential we have already identified. Any comments, suggestions, questions? Is there anybody who was left out, who is operating an exchange point in Austria which I do not know of? Or who knows of anybody who is operating an exchange point? There are microphones over there.
AUDIENCE SPEAKER: I think there is the ... I don't know if it's still alive, so a private peering initiative, but I also don't know a website ...
CHRISTIAN PANIGL: OK. Thank you. I have also found something in Salzburg but with a completely broken web link, therefore I didn't put it up on the slides. Any questions, any comments?
CHAIR: OK. Thank you, Christian.
Next up is Andy, who is going to talk a bit about things that are happening in the Working Group, primarily some Address Policy, and the famous switch wish list has resurrected itself.
ANDY DAVIDSON: Thanks. Hi everyone. I represent lots of organisations, including the ISP Hurricane Electric and a couple of Internet Exchange points in Britain, LONAP and IXLeeds, and I am co-chair of the EIX Working Group, so I am very happy to be able to talk about some of the things we are trying to do to make peering better for ISPs and for Internet Exchange points.
A very, very quick update on a policy that we talked about in the last EIX Working Group, 2011-05. Well, it didn't have a name or a number back at the last Working Group, it was just an idea, but in case any of you don't follow the Address Policy Working Group mailing list or the sessions, the policy that we described in the last meeting is now being discussed in the Address Policy Working Group. This policy proposal suggests that the RIPE NCC should retain a /16 from the final /8 which it makes assignments and allocations from, which would be available only to Internet Exchange points for use as a peering LAN. The policy has great support in the Address Policy Working Group, but more support is always welcome, so if you think this is a good idea because you want to see Internet Exchange points get created in the future, with IPv4 as well as, of course, IPv6 address space, then please come along to the Address Policy Working Group mailing list and plus-one that idea. Thank you for your support so far, everyone who has helped.
The switch wish list. This isn't something that is brand new. This was a document that was written in 2005 and, I am convinced, was the second most beautiful thing to happen in 2005. The most beautiful thing, of course, was that Miss Iceland was crowned as Miss World. Here is a photograph of the event, but I genuinely believe that the switch wish list was almost as beautiful, because it's got so many uses. It provides guidance to new Internet Exchange points on features that they really need to look for in switches that they buy. It provides guidance to switch vendors, so that they know what it is that we as Internet Exchange points need in order to grow the service and to be a reliable Internet Exchange point. And it provides hints to operators, because every time an Internet Exchange point gets built, it's actually a group of operators who self-organise and say, we want to solve our problems with interconnection in the region, we just need some guidance, and this is a great document that provides some hints to operators about the things they need to think about. There are four key existing sections in this switch wish list document. They are grouped into security measurement features, scalability and resilience, environmental monitoring, and a description of the physical features which are needed by Internet Exchange points. By and large these are the correct splits, but in all of these sections there are some areas or topics or themes that are missing, and probably some problems that we described that have been solved.
So really, the aim of the update project which we want to kick off within the Working Group is to update the switch wish list document: remove any problems that have been solved already, increase the level of detail of the information we provide in the document for any problems which still exist, add and document the things we use today which we may not have used in 2005, and also ask the switch vendors to implement things in their road maps in ways that are useful for Internet Exchange points. So really, what I am doing is looking for volunteers, and I am going to run through a handful of themes that I think will be useful to put into the new switch wish list, and at the end it will be really good if people could come and say, hey, I really fancy helping with this bit.
The things that are missing from the document, or things that we could expand upon, that are used by Internet Exchange points today on the switches: maybe some information on WDM optics, which exchange points are using to solve particular long-range issues, or coloured-optic features that are requested by members. Really, here I want to document what you guys are doing on Internet Exchange points in the wild today, so that those hints are left for future operators and the switch vendors know exactly what we are doing with them.
Also, port density is mentioned in the document, and it will be good to describe some of the port density issues, and if there are any physical cabling density issues which we have at Internet Exchange points that we would quite like to mention in the document, that would be really good. It would be really good to talk about Ethernet OAM, or if anybody isn't using it but wants to, it is a wish list after all, we would like to describe the features that we can't use today and want to; anybody who is doing research into that, please do come and talk. It would be good to describe things we are doing with port security that aren't mentioned in the document, like RA guard.
Link aggregation, I think here it would be useful to document the algorithms that are used today, whether they are working or not working for us and document any desired change to the behaviour that we as Internet Exchange points would like to see in switches.
sFlow: currently the document just says sFlow export support, and my feeling is that might not be enough detail; we might like to describe what we do with sFlow, or different modes, so anybody with a particular opinion on that who might like to shepherd a paragraph on it is welcome to come to the mic. VPLS: more and more people use it to provide a loop-free topology for their exchange point, but we don't mention it at all in the current document. More and more people are going to use MPLS and VPLS to build Internet Exchange point peering LANs in the future, so how do the people doing that do it, and do we bring in topics from other parts of the document into the VPLS section? For example, MAC port security: we can describe how we use it in the document.
It will be good as well to talk about some of the road maps that switch vendors are talking about, such as path bridging; it would be good to describe how we as Internet Exchange points would like to see those implemented. Again, this is a wish list, we are allowed to be somewhat fanciful and describe what we would like vendors to do and how we would like these to be implemented; that would be a really good use of the document. It would be good to review the multicast section of the document and make sure it's accurate.
Some other themes that I thought of that would be really useful in the document to describe are programmatic management of the switches: if you provide an API in this way, this would be really useful. That would be a really useful paragraph in the document.
Things that I have missed include absolutely everything else in the world ever. You may have some ideas that I haven't described, it's a non-exhaustive list, so come to the mic if you think that I have missed some topics out. And also a quick question: do we need a similar document about any optical path technology that our exchange points might use today, such as these WDM systems from people like Transmode, and do we need to describe how exchange points need these? Maybe, because exchange points are buying a lot of these today, but maybe not, because our requirements are no different to traditional ISPs in that regard.
To finish off, your document needs you. If people come to the mic and say I fancy this topic and helping with sFlow or VPLS or multicast, that would be really, really kind.
Remco: Yes, I really fancy this topic and what is the other thing you were saying, in short yes, I would like to help you out.
ANDY DAVIDSON: OK.
CHAIR: Any particular area?
Remco: Well, back in ... was it 2005? Christ, I am getting old ... I pretty much worked on a whole bunch of items in the document, so I have no particular preference right now.
AUDIENCE SPEAKER: Martin Pels, AMS-IX, and I am happy to help out on VPLS and other stuff.
ANDY DAVIDSON: That is kind. That is an area I have used in an Internet Exchange environment. That is really kind. Over to you.
CHAIR: We like volunteers. Anybody else? OK. Thanks a lot Andy.
ANDY DAVIDSON: Please grab me privately if you don't want to run to the microphone and offer; if you give a sentence or two, that would be really helpful.
CHAIR: And a big thanks to Mike for doing, of course, the original version all those years ago with help from the community. Next up is Harald. And he is going to talk about edges and ...
HARALD MICHL: Thank you. Good morning, my name is Harald Michl, I work for the Vienna Internet Exchange, and I have to tell you something about how we improved our monitoring in the last year. As Christian already mentioned in the introduction, we consider ourselves a rather small exchange point, but if you also consider that we are in charge of the computer network and all the components that are involved there, the network itself and everything we monitor is not so small, so this, all in all, is, I think, an implementation of a monitoring system that can really be run by other companies, exchange points, provider networks. Therefore I am, in fact, wearing two hats for this kind of presentation: the Vienna Internet Exchange hat and the ACOnet hat.
What do we do? We had an installation of Nagios in the past, an open source tool that we brought into service around 2004. Since then we added service checks and service checks and the system got slower so we decided we have to do something, and then there came some circumstances which were very lucky for us. First of all, we had the chance to hire a core developer of the new fork of Nagios, who now works in our team for more than a year, and we thought maybe we should give this new system a try and see how we can improve monitoring performance. Several steps involved in this which we will see later in detail.
The first step was that we had to migrate from the old monitoring system to the new one, and this was a very lucky situation for us, because since Icinga is just a fork of Nagios, the configuration of Nagios also worked, at least at the time we made the transition; the differences between the monitoring systems were not a problem, so we decided to give it a try. Then we had to make some improvements, of course, because just taking the new version wouldn't solve that many problems. I will come to the optimisations we made later on. In the end we now have a performance increase of about 3 to 4 orders of magnitude, depending on whether all the systems work or not, and we will also see later why it is now so much faster.
One of the big advantages of Icinga compared to Nagios: in Nagios, if you had 20 alarms, you had to acknowledge each of the alarms in a very slow web interface, and in Icinga you have the chance to just mark them and make one comment for all of the marked alarms or messages, which is also very nice for the user, as operator.
But what do we check? We currently check about 100 hosts and 6,000 services on these hosts.
The main distinction between the checks that work on these monitoring systems is between active checks and passive checks. On the active check side, the core of the monitoring system starts a plug-in per check, waits for the result and then interprets the result, which leads to the situation that the overall (it's a complicated term, so I have it here on my hand) service check execution time depends on how fast this plug-in completes. So even in case everything works and a check takes only a few milliseconds, we ran into the situation that all of the checks needed about 1,000 seconds to complete, which means that if a problem occurs you might have to wait about 1,000 seconds to see it. But even better, if there are problems, the service check execution time increases, because normally the checks have to wait for time-outs, and the service check execution time increases by massive factors: consider one millisecond compared to five seconds, for example, it's a factor of 5,000. And then we had the situation that after some maintenance work we had a big list of alarms that needed more than an hour to disappear, because it was recovering so slowly.
On the other side, there are passive checks, which run autonomously and put the result back to the core, which is a nice thing because the core itself doesn't have to wait for the result: it takes the result whenever it comes, and if there is no result, there is nothing to interpret and to do. Therefore, we split all of the monitoring checks we had into active and passive ones and tried to make the number of passive results as high as possible, and we used a tool which I will describe later on.
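The factor of 5,000 can be made concrete with a small sketch. It assumes a deliberately simplified model in which every active check runs strictly one after another; a real monitoring core runs some checks in parallel and real check durations vary, so the absolute numbers will not match the 1,000 seconds quoted in the talk. The function and constant names are illustrative.

```python
def full_cycle_seconds(num_checks: int, seconds_per_check: float) -> float:
    """Time for one pass if every active check runs strictly one after another."""
    return num_checks * seconds_per_check


SERVICES = 6000       # number of service checks mentioned in the talk
FAST_CHECK = 0.001    # a check that answers in about one millisecond
TIMEOUT_CHECK = 5.0   # a check that has to wait for a five-second time-out

healthy = full_cycle_seconds(SERVICES, FAST_CHECK)
all_timing_out = full_cycle_seconds(SERVICES, TIMEOUT_CHECK)

print(f"all checks healthy:    {healthy:.0f} s per pass")
print(f"all checks timing out: {all_timing_out:.0f} s per pass")
print(f"slowdown factor:       {TIMEOUT_CHECK / FAST_CHECK:.0f}")
```

The point of the model is only the ratio: when the core waits for each result, a failing device slows down the detection of every other problem as well.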
So, the solution, we chose was a plug in called Check MK whereas MK are the initials of the programmer that made it, so the situation is the following:
We have the single core running and have one active check per host that checks whether the Check MK process is running for this host, and this check then knows which services have to be checked on this one specific host.
It then retrieves data, mainly by SNMP, from the device we want to monitor, and it does this not with one SNMP query per check but by transferring all the sections that are interesting for the check results in one go. The results of this SNMP transfer consist not only of the status of the checks but also of so-called performance data. The performance data can be post-processed on the core here, and we use it for generating statistical information, so not only the status information about the network device but also the statistics data is retrieved within one SNMP block. The check results of this one host monitored by Check MK are fed back into the single core via passive checks. So in fact, if you consider a switch having 100 interfaces, we normally have only one active check, which is the check of Check MK, and then these passive checks feeding back normally more than 100 results, because we monitor not only whether the port is up or down but also whether there are errors, the amount of multicast packets and so on.
The performance data itself is processed by another plug-in, which is called PNP, and you see these tools are rather similar: you can use this tool as a plug-in for Nagios, as it was developed for that, but also for Icinga. It takes the performance data that was retrieved and puts it into an RRD database, from which the graphics are generated.
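A minimal sketch of the "one fetch, many passive results" idea, not the Check MK implementation itself. It formats results in the Nagios external command syntax `PROCESS_SERVICE_CHECK_RESULT`, which the core reads from its command file; the host name, interface names and the shape of the `interfaces` dictionary are hypothetical stand-ins for data taken from one bulk SNMP fetch.

```python
import time

# Standard plug-in return codes understood by Nagios and Icinga.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_result_line(host, service, state, output, perfdata="", now=None):
    """Format one passive service check result as an external command line."""
    timestamp = int(time.time() if now is None else now)
    text = f"{output}|{perfdata}" if perfdata else output
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{STATES[state]};{text}")


def results_for_host(host, interfaces, now=None):
    """Turn the data of one bulk fetch into one passive result per interface.

    `interfaces` maps an interface name to a dict with the hypothetical keys
    'up' (bool) and 'in_errors' (int).
    """
    lines = []
    for name, data in sorted(interfaces.items()):
        state = "OK" if data["up"] else "CRITICAL"
        output = "up" if data["up"] else "down"
        perfdata = f"in_errors={data['in_errors']}"
        lines.append(passive_result_line(host, f"Interface {name}", state,
                                         output, perfdata, now))
    return lines


# Hypothetical data for a two-port switch.
snapshot = {"port1": {"up": True, "in_errors": 0},
            "port2": {"up": False, "in_errors": 7}}
for line in results_for_host("switch1", snapshot):
    print(line)
```

In a real set-up these lines would be written to the command file configured for the monitoring core; the sketch only prints them, so the core never has to wait for any single interface.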
So, this slide is mainly what I have talked, for those who want to read it afterwards. So we can skip that.
The main performance problem of this monitoring device is the disk speed, so many disk read and write actions cause it to be slow. Therefore we also installed a daemon called rrdcached, which speeds up the access to write and read files: some of the most important files are cached, and therefore we have fewer read/write accesses to the physical hard disk.
We also integrated these graphics in our standard exchange point web portal. In former days, when we were already generating graphs, participants were able to see these graphics and, of course, we didn't want to lose this functionality, so the graphs we are producing are still integrated in our web portal.
What do we check? I think mainly standard checks that everyone does, so port up/down, errors, dropped packets. We are able to monitor the status of the BGP sessions on our BIRD routing daemon, which we use as route server for the exchange point, and we monitor some health data from our servers and network components.
The graphics we generate: we try to graph as much as possible. Of course, these are the normal things you see, and new for us was this graph of the BGP data, so we are now able to generate graphics about the accepted prefixes of a BGP session, which is very helpful if you consider a BGP peering configured with a prefix limit. For example, if you have a warning level of 80% and the warning level triggers, you can have a look at the accepted prefix statistics and see whether the prefix count is slowly increasing, and you think, OK, that is the normal way of life and I have to increase the max prefix limit. Or, if there is a sudden change, you can decide whether something went wrong and you should ask the operator, or whether it is just because they added some new customers, which is also OK, and you should increase the max prefix limit.
We further monitor the CPU components and, to detect memory leaks in our IOS software, we monitor the largest contiguous chunk of memory that is free and usable.
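The warning level described here can be sketched as a simple threshold check. The 80% ratio comes from the talk; the function name, its parameters and the choice to report CRITICAL once the limit itself is reached are assumptions made for illustration.

```python
def prefix_limit_state(accepted: int, max_prefix: int, warn_ratio: float = 0.8) -> str:
    """Classify the accepted prefix count of one BGP session against its limit."""
    if accepted >= max_prefix:
        return "CRITICAL"   # the session has reached its prefix limit
    if accepted >= warn_ratio * max_prefix:
        return "WARNING"    # time to look at the graph and talk to the peer
    return "OK"


# Hypothetical session with a max prefix limit of 100.
for count in (50, 80, 100):
    print(count, prefix_limit_state(count, 100))
```

The check only says that the threshold was crossed; whether the cause is slow growth or a sudden change is what the accepted-prefix graph is for.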
I want to go into a little bit more detail about how we monitor the BIRD routing daemon. We implemented a system where the BGP sessions running on the route server are monitorable via SNMP, so this routing daemon then behaves just like a standard network component, like a Cisco router, and you can do an SNMP walk like on a standard switch. We have a standard SNMP daemon running on the machine and defined in it a script that has to be executed if someone asks about the OIDs here. This is just one line of the script, but in fact it gets the information about all routing processes running on that route server, which includes the information the script needs to create a result.
And then, from the Check MK point of view, it's just a standard SNMP query with this defined OID, and the result that is coming back (this is only one line, it's one line per routing process) is, for example, that this routing process with this AS number and this protocol has this status, where 6 is established, and this is the amount of prefixes that are announced, and this is the performance data that the script then puts back into the Check MK process. So we see here the Check MK plug-in that processes this data; the data at the front is required and the performance data is optional. And we see the output of the plug-in, which says this peer here, this protocol, has this status, so the 6 here is translated into OK, and we see here the performance data from which we can generate graphs.
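A minimal sketch of the translation step described above, assuming a hypothetical whitespace-separated line format of protocol name, AS number, state code and an optional prefix count; the real script output shown on the slide is not reproduced in the transcript. The state codes follow the BGP peer state numbering of the standard BGP MIB, where 6 means established.

```python
# BGP peer state codes as used in the standard BGP MIB (bgpPeerState).
BGP_STATES = {1: "idle", 2: "connect", 3: "active",
              4: "opensent", 5: "openconfirm", 6: "established"}


def check_bgp_line(line: str):
    """Translate one per-protocol line into a (status, plug-in output) pair.

    Assumed format: "<protocol> <asn> <state code> [<accepted prefixes>]".
    The prefix count is optional and becomes performance data after the '|'.
    """
    fields = line.split()
    protocol, asn, state_code = fields[0], int(fields[1]), int(fields[2])
    state_name = BGP_STATES.get(state_code, "unknown")
    status = "OK" if state_code == 6 else "CRITICAL"
    text = f"{protocol} AS{asn} is {state_name}"
    if len(fields) > 3:
        text += f"|accepted_prefixes={int(fields[3])}"
    return status, text


# Hypothetical lines for two peers, using AS numbers reserved for documentation.
print(check_bgp_line("peer_a 64500 6 1234"))
print(check_bgp_line("peer_b 64501 3"))
```

Keeping the prefix count as optional performance data mirrors the split in the talk between the required status fields and the optional data used for graphing.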
How do we generate the config? Imagine: if you have 100 hosts and 6,000 services, it's not possible to do this by hand. In the old set-up, we had a mixed thing between database and switch config; for example, we decided what to monitor by defining descriptions on a port basis. In doing the upgrade to the new monitoring system, we said we only have one authoritative source for all of our monitoring and graphing tools, and this should be a database, so the database now knows everything about our network, how it is configured, what we want to monitor there, and at which levels we want to have triggers to set these alarms.
Then, there is an interesting thing, because even if you think your monitoring system is good, you don't know if you monitor everything you want to monitor. So there is one check running that compares the checks defined in the database with a daily crawl through the network components, where it guesses what should be checked, and if there is a mismatch, then an alarm is created and you can have a look. Normally, either there is a problem in the documentation in the database, so you forgot to document something, or you made a short test configuration of something and you have to remove it, or you just forgot to document it afterwards.
So this is very useful to keep this system consistent.
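The comparison between the database and the daily crawl amounts to a set difference in both directions. A minimal sketch, assuming each check is identified by a (host, service) pair; the names and sample data are hypothetical.

```python
def compare_checks(documented, discovered):
    """Compare checks defined in the database with checks found by a crawl.

    Both arguments are iterables of (host, service) tuples.
    """
    documented, discovered = set(documented), set(discovered)
    return {
        # Found on the device but not in the database: forgotten documentation
        # or a leftover test configuration.
        "undocumented": sorted(discovered - documented),
        # In the database but not found on the device.
        "missing_on_device": sorted(documented - discovered),
    }


database = [("switch1", "port1"), ("switch1", "port2")]
crawl = [("switch1", "port1"), ("switch1", "port3")]
mismatch = compare_checks(database, crawl)
if mismatch["undocumented"] or mismatch["missing_on_device"]:
    print("ALARM:", mismatch)
```

Either list being non-empty would raise the alarm described in the talk; an operator then decides whether to fix the database or the device.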
And the hardware used is nothing special. We do have two machines, just for resilience purposes, because we have two sites in Vienna, so there is one machine at each site. And here is the data of the machine. The only thing that is important is that you have fast hard disks because, as I said before, the throughput to the hard disk is the limiting factor. Also, for reducing the access to the hard disks, we have configured a 100 megabyte RAM disk for the status of the monitoring tool, so this part of the information, which is written and read often, is in memory and not on the disk. The old monitoring platform was not a dedicated one. Now we began to also monitor the virtualisation itself, and therefore it just didn't make any sense any more to run it on a virtual machine; and also because the disk load is high enough, it was OK for us to take two dedicated machines now for monitoring.
Here are some statistics of the machine. You can see the CPU usage is rather low, and this is the load and the utilisation. I am not a server guy, but I have been told there is a big difference between load and utilisation; you can see there is a load but a very low utilisation.
On the other side of the web interface there is here on the right, you see this feature of the commands so you can select more than one and execute the same command for awful them that are selected and for this service description here there is mouse?over feature and at the time I made the screen shot of us over this small button and you get an immediate over feel for what is happening here in the service as this preview of the statistics is shown so you see here it's per second and packets per second and errors, fortunately they are zero and if you click on that button you see it more in detail, so you can get from the service check directly to the graphic information, which is very useful especially if there is a problem, you can just click on the graphic and see whether there is a real outage or if the customer just implemented a new access list, for example.
Here is an example of the BGP statistics we have, so you can see here the number of prefixes that were accepted and you see also here that there was a change in between, and for a long time monitoring is also nice because if you have a look on our upstream connection we see that the number of IPv4 prefixes here is continuously increasing and we see, for example, a chunk in the IPv6 global routing table on about week 35 this year.
OK. Any questions? Thank you.
CHAIR: Thank you, Harald.
AUDIENCE SPEAKER: Can you talk about the RRD cache and the disk side of this? Is that a significant part of the performance? You talked about ten to three, ten to four.
HARALD MICHL: This is related to the active and passive checks. If you have active checks, the core has to wait for the result; it doesn't do anything until it gets the result. If you have a check that works, for example, you wait one millisecond for the result, compared to a check that doesn't work, where you wait five seconds for the result. Multiplied by 6,000, it's catastrophic, so that was the main improvement: doing as much as possible with passive checks.
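The arithmetic behind "multiplied by 6,000, it's catastrophic" can be worked through as follows. This is a rough sketch that assumes the worst case, in which the core runs every active check strictly one after another and every check either answers in one millisecond or runs into the five second wait; real schedulers run some checks in parallel.

```python
# Worst-case sketch: total time the core spends blocked on active checks
# when they are executed strictly one after another.
def serial_check_time(n_checks, seconds_per_check):
    return n_checks * seconds_per_check


N_CHECKS = 6000  # number of services mentioned in the talk

healthy = serial_check_time(N_CHECKS, 0.001)  # every check answers in 1 ms
failing = serial_check_time(N_CHECKS, 5.0)    # every check waits 5 s

print(f"all checks healthy: {healthy:.0f} s per cycle")
print(f"all checks failing: {failing / 3600:.1f} h per cycle")
```

Under these assumptions a full cycle grows from a few seconds to several hours, which is why moving work to passive checks, where the core does not wait, matters more than disk tuning.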
AUDIENCE SPEAKER: So the RRD cache part is just icing on the cake?
HARALD MICHL: I don't know about the percentage, what it really is. It's ten percent or twenty or around that. It's there, it works, so we took it.
AUDIENCE SPEAKER: My last question is where does the name come from? I suppose I could look it up online.
HARALD MICHL: You mean this one?
AUDIENCE SPEAKER: Yes.
HARALD MICHL: I don't know, I don't even know how to pronounce it, to be honest.
AUDIENCE SPEAKER: Perfect. Thanks.
CHAIR: Any more questions for Harald? Thank you very much.
Next up is Maksym from AMS-IX to talk about Jumbo Frames.
MAKSYM TULYUK: Thank you to the people from Netnod who agreed to swap. The main aim of my talk about Jumbo Frames is not to say Jumbo Frames are good or bad; it's more to get feedback from the community, and especially from our customers: do they really need Jumbo Frames?
There is a small survey; please open my presentation and click on it. There are only five questions. My marketing team says it is really easy to fill in. Everybody can do this.
My presentation contains five topics. First, I will speak about how all this discussion started. Then I will speak about the advantages and disadvantages of Jumbo Frames, then a short summary, and the conclusion, what we have for now.
So, the story started when some of our customers came to us and said, we want Jumbo Frames. OK, we put questions about Jumbo Frames in a survey. The survey was half a year ago, and you see the results.
So, then we discussed it internally and saw two ways to implement it: the first is to change the MTU on the existing VLAN, and the second way is to create a new VLAN for Jumbo Frames.
These are the points about the support, about implementing Jumbo Frames in existing VLANs. As you see, technical possibility: yes. Our equipment supports it, so that is no problem at all. However, there are a few negatives, drawbacks. First of all, customers dislike changes. We are quite proud of our stability, and we started providing an SLA recently, so we dislike changes on the current system.
Second, there is no official standard for the Jumbo Frames MTU, and if you connect two routers with Jumbo Frames, they may have different jumbo frame sizes, MTU sizes. There is no protocol to negotiate between them which MTU they use. And also, if you enable jumbo frames at one point and some people start sending these packets, you will probably have some problems on the whole path, because you need to discover the MTU for the whole path from point A to point B, and from my point of view, and from what I read and hear, path MTU discovery just doesn't work.
So if you look once more at the survey, you see that 37% of people say no, we shouldn't support it, so we decided not to implement it. We had to keep our platform stable and not make any tests on a broken platform.
So, we came to the second solution. Again, the technical possibility exists. However, we came to a few other questions. First of all, and in my opinion the most important: will anybody pay if we start providing this as a second connection, a second connection for a Jumbo Frames VLAN? Second, again: there is no official standard for what Jumbo Frames are. And last, path MTU discovery still doesn't work, which is what I talked about just now.
So, it was decided to make a short review of what Jumbo Frames are and who will use them. I mean, to find out which of our customers are really interested in Jumbo Frames support. Now, about the advantages of Jumbo Frames:
So, the advantages are quite well-known. It's less CPU load; there is research that was made eight years ago, and it is still the same: the research was recently repeated, and you see on the graphs the CPU load if you use big Jumbo Frames. Second is network overhead: if you use Jumbo Frames you send fewer packets, so you reduce overhead and you get, as you see, nearly five percent more traffic. You can put five percent more traffic through if you use TCP, and 48 .5 if you use UDP. And the last one: this research said that if you use Jumbo Frames, your TCP performance will double. So if you increase the MTU from the current 1,500 to 3,000, it doubles. In this case, if you use 9,000, it will be six times better performance.
So, these are the advantages, once more: less CPU load, less network overhead and better TCP performance.
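The "nearly five percent" figure for TCP can be reproduced with a short calculation. This is a sketch under stated assumptions, not the method of the research shown on the slides: standard Ethernet framing overhead of 38 bytes per frame (preamble, header, FCS, inter-frame gap), IPv4 and TCP headers without options, and every frame filled to the MTU.

```python
# Per-frame Ethernet overhead on the wire, in bytes:
# preamble + SFD (8) + MAC header (14) + FCS (4) + inter-frame gap (12)
ETH_OVERHEAD = 8 + 14 + 4 + 12

IPV4_TCP_HEADERS = 20 + 20  # assumes no IP or TCP options


def wire_efficiency(mtu, l3_l4_headers=IPV4_TCP_HEADERS):
    """Fraction of the bytes on the wire that carry application payload."""
    payload = mtu - l3_l4_headers
    on_wire = mtu + ETH_OVERHEAD
    return payload / on_wire


standard = wire_efficiency(1500)
jumbo = wire_efficiency(9000)
gain = jumbo / standard - 1

print(f"1500 byte MTU: {standard:.1%} payload")
print(f"9000 byte MTU: {jumbo:.1%} payload")
print(f"relative gain: {gain:.1%}")
```

The gain only applies to frames that are actually full size, which is why the packet size distribution on the exchange matters for the argument.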
Now, I will speak about disadvantages:
First of all, there is no standard. If you look even at the IEEE standards, all of them give a different opinion on what the Ethernet size should be. And also, all these standards speak about the header; it's not what we expect to see inside our network, and inside a possible Jumbo Frames VLAN. So, about payload: we expect that customers will use a higher Ethernet payload. Again, no standards. I found three specifications: one is the standard 1518; the second is made for Fibre Channel over Ethernet, which is definitely not what people at an IX expect to see; and again 9K. Not what we see.
Also, there are a lot of things about terminology. My favourite is giant jumbo. So people cannot even agree on how to use this terminology.
Also, there are more drawbacks. Another is, if you look at this report, you see that if you increase the size, it takes more time to transmit the packets, so it possibly increases delay. Again, if you increase the MTU you need more buffers in the switches, so it means possibly more expensive equipment that you need to use. Path MTU discovery: it doesn't work. Even people who strongly support Jumbo Frames said yes, it's strong. I spoke with a few guys from broadband, from content, and they said, yeah, the best way to exclude any problem with path MTU discovery is just to put it low, even lower than the current 1,500, just to put less load on the support.
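The point about larger frames taking longer to transmit is a matter of serialization delay: the time a port needs to put one frame on the wire. The sketch below only illustrates that relationship; the link speeds are chosen as examples and are not taken from the slides.

```python
def serialization_delay_us(frame_bytes, link_bps):
    """Time in microseconds to clock one frame onto a link."""
    return frame_bytes * 8 / link_bps * 1e6


GIGABIT = 1_000_000_000
TEN_GIGABIT = 10_000_000_000

for size in (1500, 9000):
    for name, speed in (("1G", GIGABIT), ("10G", TEN_GIGABIT)):
        delay = serialization_delay_us(size, speed)
        print(f"{size} bytes on {name}: {delay:.1f} us")
```

A small packet queued behind a 9,000 byte frame waits six times longer than behind a 1,500 byte frame on the same port, which is the source of the extra delay and jitter for small-packet traffic.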
And these are statistics from us. As you see, when we made these statistics, everybody inside Amsterdam IX was really surprised: forty percent of all packets are of size 64 up to 128. You can open this link and watch it. It doesn't change too much, so the statistics are always similar. It shows the frame sizes on our backbones. We cannot look inside packets because it is forbidden for us; we provide ... to sellers, we can only evaluate the size of Ethernet packets. If you look at the maximum size, what people use as a maximum size, you see only 20, 25% of packets are maximum size, so these are possibly the packets that can have an advantage from Jumbo Frames. However, as you see, half of all traffic through our exchange is packets of less than one kilobyte.
One more time, all the disadvantages: no standards, increase in transmission time, packet delay, jitter, bigger buffers on equipment, path MTU discovery doesn't work, and low traffic, low possible traffic with Jumbo Frames.
Now, the most important part is what we thought and what we did. These are all the advantages and disadvantages of Jumbo Frames at once, and I tried to look at which applications should receive advantages from Jumbo Frames and which have disadvantages. So as you see, if you use big data transfer like NFS, yes, you have advantages. SAN, also an advantage. But this is not for exchanges, so I don't expect to see any such traffic on an exchange. So there are only two ways I see for now how people can use Jumbo Frames via an exchange. One is NNTP. I don't know, does anybody remember what this is? Yes, people still use it. And VPNs: in the current situation, when you want to provide VPNs, you always set a lower MTU. I mean, people who work in broadband know, you cannot set up 1,500 bytes because there is encapsulation, so if you increase the size, yes, you can provide a clear service. However, the disadvantages are much bigger. There are a lot of protocols that send small packets, like DNS and VoIP, for example, and they definitely receive the biggest disadvantage if you start to mix up all the traffic. Also, inter-process communication: even in the case of data transfer you need to set up connections and send some messages about how it is going, and again, if you start to mix it up, you probably have some disadvantages. And compatibility between different vendors: if, for example, we have one vendor now and decide to set up another vendor, both of them have a different opinion on what the true size of Jumbo Frames is. So, we came to a conclusion:
First of all, it's my conclusion, personal. We see 50% of all traffic is still small packets, so the Internet is using small packets. And another thing: we see two graphs. On the left side is the jumbo frame distribution, sorry, and on the right side is the Ethernet type distribution. These big yellow boxes are IPv4 traffic, and this, does everybody see this? This small orange one is IPv6. The IPv6 and Jumbo Frames stories started at nearly the same time and there was a lot of talk, and if you look at this graph and this graph, we see that the community would have to put more effort into introducing Jumbo Frames than IPv6, but IPv6 is over and current normal Ethernet frames are still working. However, it's my personal opinion. Our official opinion is that, for now, we are postponing, and we expect and want to receive feedback from the community. Also, I hope that you will take part in the survey.
That is all.
So, any feedback?
KURTIS LINDQVIST: We have supported Jumbo Frames on a separate VLAN since 2002. We let the customers decide what they want; we won't tell them what is right or wrong. I don't understand why they need a second port: they can tag ports, and they are basically connected to one or the other VLAN, or connect to both of them. We don't tell people what is right or wrong. That is not our job. If they want to send large MTUs, they can, and people do. We see a lot more traffic ...
MAKSYM TULYUK: I understand the point. My presentation is not about what is good and what is bad; it's more about analysing who can use jumbo frames.
KURTIS LINDQVIST: On that point you left out the most common protocol they can use which is BGP.
MAKSYM TULYUK: For now we have a big network, it's a stable platform, and if we need to increase the size we need to put in three months of planned works just to increase the MTU size, and if nobody uses it, why should we do this? I mean, all our customers will receive notifications for three months: this started, this already started.
KURTIS LINDQVIST: I am just sharing our experience, that is: it works, it works fine, people use it, and the most common protocol that can use it on an exchange is BGP, and you actually have benefits.
AUDIENCE SPEAKER: Patrick Gilmore. Kurtis covered most of what I was going to say, I went through a survey and did a lot of research, there is another exchange that is very willing to share information that has operational real customer experience with us. Did you even ask them?
MAKSYM TULYUK: That is why the survey is ...
AUDIENCE SPEAKER: So that is a no. The other thing is you say look at the packet distribution. Is anybody in this room surprised that half of the packets are small? I am just wondering, because most of the traffic is TCP and in general it goes in both ways and one of those two packets is small, well, one out of every three. Plus you run an Internet Exchange, there is a lot of BGP and other things. So saying that 45 percent of your packets, you have all kinds of things attached there, so going "45 percent are small packets so clearly nobody wants large packet sizes" seems a little silly to me. So the data that you have collected, while factually correct, I would interpret very differently, and there was other data that you could have collected with no effort, that you didn't ask for. So before asking us to fill out another survey, especially since there is a lot of overlap in peers between Netnod and AMS-IX, you should probably ask them about it.
MAKSYM TULYUK: Are you interested in Jumbo Frames on AMS-IX?
AUDIENCE SPEAKER: Akamai does not send any because the majority of our end users are at 1,500 byte MTUs or lower, but Akamai is not the Internet; there are lots of other people.
AUDIENCE SPEAKER: We are about half or something. I don't know.
Martin: Before we hit the technical items in this talk I want to talk about other things. You talked about being an exchange that does not look at packets because you can't; you looked at your layer 2 offering. So my first point to you is: your opinion on what we do at layer 3 is totally uninteresting to me, totally, 100 percent. I am a customer and I want a layer 2 service. So that is the first issue.
MAKSYM TULYUK: OK.
Martin: Now we will go through the technical stuff. There are the studies that you talked about, with the Alteon stuff, where I was on the advisory board back when, which are amazingly old. Go look at the latest Internet speed test results, which have much better TCP in them and jumbo frame usage; that is the first issue, update the stats. The number issue is well-known: there is a great NASA memo about how to pick numbers for jumbo frames, and the conclusion they came to, after going through every number they could find in the industry, was that 9,000 is an amazingly nice number because you can remember it. It has no technical merit whatsoever. It's just a number that everybody can remember; go find the memo. Third issue: as a customer, yes, I came to you ages ago. We have customers that want jumbo frames. I am sure they don't want to get to Akamai information; that is not their issue. I mean, that is not the part that we came to you and asked you about.
MAKSYM TULYUK: I know you came to us, you asked for Jumbo Frames. Do you know anyone else from AMS-IX who is ready to peer with you with Jumbo Frames?
Martin: Yes, I do actually.
MAKSYM TULYUK: OK, we have a meeting in two weeks so we can speak about it. OK.
Martin: A couple of other things here. The Netnod experience is ... I am repeating what got said for a reason: there is information out there on how to do this. The comments that you have made are very valid, but as a customer asking for this solution, I'd like a little bit more, you know, how do I put this? Positive response, because the part about what you are seeing today on the exchange is not a valid measurement, and I think that you need to go outside and look at other people. Netnod is the easiest one to describe. There is actually jumbo frame support available at FIX West, which is a very small exchange by standards in Europe, but has been doing this for quite some time as well. So if you want to go do a survey again, that is fine. But I want a different mindset towards this. I am very unhappy with the mindset.
MAKSYM TULYUK: As I said, we prefer stability of the platform. For now, as you see, 37% said no, we shouldn't support it, and ...
Martin: You have a great future in politics: the ability to look at something that says yes and no and, the correct way from a politician's point of view, to point out the 30 plus percent. You are very good. Very good.
MAKSYM TULYUK: Thank you.
GERT DORING: I am not sure exactly which hat I am wearing; network operator. There is one thing that I think you overlooked regarding what could benefit from large packets, and that is all these pesky overlay networks, like DSL customers being terminated on network A and handed over via L2TP at network B. Having to fragment L2TP packets actually costs money, because it costs CPU and you have to buy bigger boxes to be able to do the fragmentation and all the MTU nastiness. So having bigger MTUs in the core network, and I consider exchanges to be the core of the network, is beneficial to be able to present an unfragmented 1,500 byte MTU at the edge, with all the overlay stacking of technologies going on in the core. So coming from that side, and we are not an AMS-IX customer so this is more general, I'd like to see bigger MTUs in the core. Coming from another point of view, we are connected to DE-CIX in Frankfurt and Munich and there are some 400 other providers there. Some of them have issues getting the netmask right for v4, and I see no way that the peering mesh is going to work correctly with all of them trying to figure out how to get big MTUs working on their gear. And the solution with two different VLANs also means I have to effectively configure two different exchange points at the same time on the same port, on different sub-interfaces, so I have more peering sessions to set up and more stuff to monitor. So from the perspective of an operator that is faced with operational instability or lots of extra work, I am not exactly sure whether I want to go there, so I am sort of a little bit torn on this.
MAKSYM TULYUK: Thank you for the feedback. One comment about Netnod: as far as I know, Netnod is the only exchange in Europe that provides Jumbo Frames. So it's another point, in my opinion, because if people really want Jumbo Frames, other exchanges will follow the Netnod way and also start providing jumbo frame support. If I am wrong, correct me.
AUDIENCE SPEAKER: Jenna, Google. For most of the customers in IPv4 it wouldn't be so beneficial because they have all MTU and we need to fragment. And just from my experience, because I have seen peers use large MTUs on our peering links, I would say never do it on the production VLAN, because I have once seen a peer session with MTU 1,600 where the equipment didn't properly calculate the MSS if options are used. So BGP sessions with MD5 immediately went down. So you probably need to mention one more disadvantage in your slides, which is vendor bugs in the equipment.
MAKSYM TULYUK: Thank you for your feedback.
AUDIENCE SPEAKER: I spent two days with peers, sending RFCs and explaining how the MSS should be calculated.
MAKSYM TULYUK: Thank you for feedback from operational experience.
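The MSS arithmetic referred to in this exchange can be sketched as follows. It assumes IPv4 without IP options and uses the TCP MD5 signature option of RFC 2385, which is 18 bytes long and is padded here to a 32-bit boundary; the function names are illustrative, and the sketch shows the expected calculation, not the vendor bug itself.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, fixed part of the TCP header
TCP_MD5_OPTION = 18  # bytes, TCP MD5 signature option (RFC 2385)


def padded(option_bytes):
    """TCP options are padded to a multiple of 4 bytes."""
    return (option_bytes + 3) // 4 * 4


def mss(mtu):
    """MSS derived from the MTU, before any TCP options are counted."""
    return mtu - IPV4_HEADER - TCP_HEADER


def data_per_segment(mtu, option_bytes=0):
    """Payload that fits in one segment once TCP options are included."""
    return mss(mtu) - padded(option_bytes)


for mtu in (1500, 1600):
    print(mtu, mss(mtu), data_per_segment(mtu, TCP_MD5_OPTION))
```

A sender that fills segments to the full MSS while also adding the MD5 option produces packets larger than the MTU, which is one way a BGP session with MD5 can fail once full-size segments are sent.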
CHRIS BUCKRIDGE: We have had a little bit of discussion in the Jabber room, and this is just from David Freedman, who says there are two issues: the first being, if you are packing more information into these large frames, more is lost when they are dropped; and the second is, if your MTUs are mismatched, that is, your peer does 9K and you don't, you have to issue a "too big". And he also says, which is what Jenna is saying, if you can't do PMTUD, you are screwed.
ANDY DAVIDSON: If the peering LAN isn't announced by BGP the "too big" will never get out, and if we can't have any more v4 space for new peering LANs in this service region, then that will never work either, so that is another good reason to support 2011-05, closing tag adverts.
NICK HILLIARD: INEX. A couple of years ago we experimented with introducing a jumbo VLAN at INEX and it wasn't terribly successful, because we only had two participants who were truly interested in it; one of them had an internal core MTU of 4470 and the other one was 9,000, so you can probably understand how that went. Which is to say, not very well. We advertise an MTU of 1,500 on our primary peering LAN, or LANs, should I say; however, we actually commit to supporting up to 92100 bytes, and we anticipate that in the future, now that we have introduced a private VLAN service, if providers want to interconnect with each other at a higher MTU they can do so, but they will have to do it over their own private VLAN, so that gets around any sort of MTU compatibility issues that there might be on a shared VLAN infrastructure.
MAKSYM TULYUK: OK. So I understand, you will provide Jumbo Frames but as a private VLAN interconnection?
NICK HILLIARD: Yes, on tagged ports. We can support them on the primary VLAN as well, but we strongly recommend people don't do this, for obvious reasons.
MAKSYM TULYUK: Yes. Any comments, feedback? Yes. Thanks a lot, once more.
CHAIR: Thank you, Maksym.
Just before we break for lunch, Gaurab is going to give us a quick update on APIX.
SPEAKER (GAURAB RAJ UPADHAYA): Thank you. I am Gaurab. But the presentation was actually made by /TA*UR because he is the interim chair of the APIX.
Have we ever presented at the EIX Working Group before? No, this is the first time we are doing it.
What is APIX? It is the EuroIX equivalent; it's not a new Internet Exchange. It's operating in the Asia and Pacific region. These are the current participating IXPs: all the major Japanese Internet Exchanges, in Seoul, in India, the NPIX in Nepal, SOX, Vietnam IX and all the Equinix IXes in the Asia Pacific.
So, we have had a few meetings. The first preliminary meeting was in 2009 and we have continued to invite more people and more IXPs. We generally tend to meet primarily at APRICOT, and at the recent one in Hong Kong we had participants from AMS-IX and Netnod and even LINX and a few other members of EuroIX. We also meet during the second APNIC meeting, in our northern summer. Yes, sometimes we go to Australia, and summer and winter all get mixed up at the meeting, so we tend to say northern summer; it is the September APNIC meeting where we tend to meet as well.
It is limited only to IXP people, so this is not yet an open forum; it is invitation only, and people who are involved in IXPs are invited. We are gradually working on it, and more supporters and other folks who are interested in IXPs and work with IXPs will be invited in some form.
What do we talk about at APIX? We talk about IXP updates. As background, we used to have the IX SIG at APNIC or the APRICOT peering forum, and somehow it has evolved into being APIX, and that is why we tend to meet at APNIC meetings. Part of the meeting tends to be IXP updates from IXPs in the region, and then we have been talking about taking on operational discussions within IXPs; we have recently been talking about a trial in one of the Japanese exchanges and also VPLS type exchanges.
There is actually a big push on route servers in Asia right now. It's not as prevalent as in Europe for IXPs to have a large route server operating, and everybody has their own different ways, so in fact, at the next APIX meeting route servers are going to be a big topic on the agenda, and Andy volunteered to come and do some work on that.
Internet switch and trying to get all our members to enroll and start using PeeringDB properly. Get on with research stuff about what the charter would be, getting by-laws set up, and setting up mailing lists and a portal and so on.
These are the current chairs and co-chairs. From JPNAP is our current chair, and I, Gaurab, along with car require, are the co-chairs, the coordinating group for APIX.
The next meeting will be at APRICOT in New Delhi, India, so you are all invited to come to the meeting. I also want to point out here that we have been really grateful to Serge, if he is around, because one of the things we are talking about is how different IXPs in different regions can collaborate, and Serge has been very, very helpful in, you know, helping us to avoid things that EuroIX did wrong, so hopefully we will get into some of that here to deal with and hopefully we will be working with /PWEUPBLGal and EuroIX as well, continuing in the future.
I think the last time this presentation was given was at the EuroIX meeting, but if you need to reach us, apix-admin at apix dot asia is the address. Thank you.
CHAIR: Thank you. Any questions? OK, thanks, Gaurab.
So, we will break for lunch. Next up will be Kurtis straight after lunch and if we could try and be back reasonably sharpish, that would be cool. See you then.