Digging Through the Problem of IPv6 and Email – Part 2

IPv6 multiplies this problem. We have seen that spammers can already hop between IP addresses quickly. They do this because once an IP gets blocked, it is no longer useful to them. In IPv4 there are only so many places they can hide, though: roughly 4.3 billion of them. In IPv6, if they can follow the same pattern of sending mail and hopping between addresses, the space they can hide in is virtually unlimited. To put it one way, I’ve seen estimates that 250 billion spam messages are sent per day. Under IPv6, spammers could send one piece of spam per IPv6 address, discard the address, move on to the next one, and keep doing so for the next 1000 years without ever coming close to needing to reuse an address. A mail server could never load a file big enough for even one day’s IPv6 blocklist if spammers sent every single spam from a unique IPv6 address. Because spammers could hop around so much, IP blocklists could conceivably run into the following problems:

  1. They would become too large for anyone to download, process and upload.
  2. They would lag behind: by the time an IP was listed, spammers would have discarded it and moved on to the next IP address.
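
To make the scale concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the 250 billion messages per day estimate quoted above, one fresh address per message, and 365-day years; the variable names are mine and purely illustrative.

```python
# Back-of-the-envelope check of the address-hopping estimate.
# Assumptions: 250 billion spam messages per day (the estimate quoted above),
# one brand-new address per message, 365-day years.
SPAM_PER_DAY = 250 * 10**9
YEARS = 1000
IPV4_TOTAL = 2**32     # every possible IPv4 address
IPV6_TOTAL = 2**128    # every possible IPv6 address

addresses_burned = SPAM_PER_DAY * 365 * YEARS
fraction_of_ipv6 = addresses_burned / IPV6_TOTAL
days_to_exhaust_ipv4 = IPV4_TOTAL / SPAM_PER_DAY

print(f"addresses used in {YEARS} years: {addresses_burned:.3e}")
print(f"fraction of IPv6 space used:   {fraction_of_ipv6:.3e}")
print(f"days to burn through all IPv4: {days_to_exhaust_ipv4:.4f}")
```

Under these assumptions, a thousand years of single-use addresses consumes a vanishingly small fraction of the IPv6 space, while the same sending rate would run through the entire IPv4 space in well under a day.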

This is why no mail receiver is thrilled about the idea of accepting mail over IPv6. Receivers have to plan for the worst case scenario, which is that spammers overwhelm their mail servers and drain processing power by forcing them to deal with a 10x increase in traffic.

So how do we deal with it?

One idea, as referenced by the writers above, is to use whitelists instead of blocklists. Block all mail from everyone and then maintain a central whitelist of good mail servers that send legitimate mail. The weakness here is that it defeats the whole purpose of email, which is that you can hear from new people you haven’t heard from before. New mail servers are brought up all of the time. There’s no way for you to know about them, and the process of having to opt people in is a pain and a hassle. This idea could be centralized, but the mail servers that are legitimate for one set of folks are not going to be legitimate for another set of folks.

Another idea is to take an unmanageable problem and break it down into a manageable one. I haven’t really fleshed this out through any working groups, but let’s go back and take a look at how CIDR notation works and how blocklists take advantage of them. Consider the IP 216.32.180.16. This can be broken down into four 8-bit octets, and then combined to make one 32-bit number:
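
Since the breakdown is easy to lose in prose, here is a minimal Python sketch of the conversion. The helper name `octets_to_int` is my own; the standard-library `ipaddress` module is used only to cross-check the result.

```python
import ipaddress

def octets_to_int(dotted: str) -> int:
    """Combine the four 8-bit octets of a dotted-quad IP into one 32-bit number."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

ip = "216.32.180.16"
value = octets_to_int(ip)
bits = f"{value:032b}"  # zero-padded 32-bit binary string

# Print the bits grouped by octet, then the combined number.
print(" ".join(bits[i:i + 8] for i in range(0, 32, 8)))
# 11011000 00100000 10110100 00010000
print(value)
# 3626021904

# Cross-check against the standard library's own conversion.
assert value == int(ipaddress.IPv4Address(ip))
```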

A CIDR range is defined bit-wise. The number after the slash is the number of leading bits that every IP in the range has in common, and the range contains every IP that starts with those bits (wow, that didn’t sound very clear). Let me use an example. Let’s take the range 216.32.180.0/24. If we convert this down to the bits that it represents, then this range is any IP whose first 24 bits match those of 216.32.180.0, since the /24 says to take the first 24 bits:

216.32.180.16 is said to fall within the range 216.32.180.0/24 because the first 24 bits of the 32-bit representation of 216.32.180.16 are the same as the first 24 bits of 216.32.180.0/24:

The first 24 bits match, the last 8 do not (illustrated by the 1 in green) but it doesn’t matter because we only need to match the first 24 bits. The red and blue parts match up and therefore 216.32.180.16 falls within the range of 216.32.180.0/24. However, if we take a slightly different IP address, 216.32.181.16, that will have a different 32-bit mapping. It will not fall into the /24 range because the 24th bit (the last bit of the third octet) does not match:
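
The comparison described above can be sketched in a few lines of Python. This is only an illustration of prefix matching with a bit mask, not how any particular blocklist product implements it; the function name `in_cidr` is hypothetical, and `ipaddress` is from the standard library.

```python
import ipaddress

def in_cidr(ip: str, cidr: str) -> bool:
    """Return True if the first n bits of ip match the first n bits of the range."""
    base, prefix_len = cidr.split("/")
    n = int(prefix_len)
    # Mask with the top n bits set, e.g. /24 -> 0xFFFFFF00.
    mask = (0xFFFFFFFF << (32 - n)) & 0xFFFFFFFF
    ip_int = int(ipaddress.IPv4Address(ip))
    base_int = int(ipaddress.IPv4Address(base))
    return (ip_int & mask) == (base_int & mask)

print(in_cidr("216.32.180.16", "216.32.180.0/24"))  # True: first 24 bits match
print(in_cidr("216.32.181.16", "216.32.180.0/24"))  # False: the 24th bit differs
```

In practice the standard library already does this: `ipaddress.ip_address("216.32.180.16") in ipaddress.ip_network("216.32.180.0/24")` performs the same membership test.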

You can see that specifying things in CIDR notation is a very quick and easy way to list IPs on a blocklist. It makes sense to us humans reading it because we can interpret the numbers “naturally”, and it works from a technical perspective because it translates into bit-mapping. This is how PBL and some other lists are able to manage so many IPs. The IP range 65.55.0.0/16 lists any IP that matches 65.55.xx.xx; this is 65,536 IP addresses. They all fall into a logical range.

The number of IPs that fall within a CIDR range is evaluated as 2^(32-n) where n is the CIDR range (the number after the slash). So, a /24 (pronounced slash 24) is 2^(32-24) = 2^8 = 256 IPs, a /12 is 2^(32-12) = 2^20 = 1,048,576 IPs, and so forth. The larger the CIDR range number n, the smaller the range of IPs it covers. To newbies, this is counterintuitive and takes a bit of time to wrap your head around, but after a while you pick up the lingo. The smallest IP range is a /32 (1 IP) whereas the largest is a /0 (every single IP).
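
As a quick sanity check of the 2^(32-n) rule, here is a small Python sketch. The function name `cidr_size` is mine, chosen for illustration.

```python
def cidr_size(prefix_len: int) -> int:
    """Number of IPv4 addresses covered by a /prefix_len range: 2^(32 - n)."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    return 2 ** (32 - prefix_len)

# Larger prefix length -> fewer addresses.
for n in (32, 24, 20, 16, 12, 1, 0):
    print(f"/{n}: {cidr_size(n):,} addresses")
```

Running it shows a /24 covering 256 addresses, a /16 covering 65,536, and a /0 covering all 4,294,967,296.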

Click to read Part 1 and Part 3.

Written by Terry Zink, Program Manager

Digging Through the Problem of IPv6 and Email – Part 1

Recently, a couple of anti-spam (or at least email security related) bloggers have written some articles about IPv6 and the challenges that the email industry faces regarding it. John Levine, who has written numerous RFCs and a couple of books about spam fighting, writes the following in his article A Politically Incorrect Guide to IPv6, part III:

We will eventually figure out both how people use IPv6 addresses for mail, and how to manage and publish v6 reputation data (I’ve been doing some experiments, which I’ll blog about when I have enough results), but until then, running a mail server on v6 will be a lot harder than running one on v4. And since you’ll be able to handle all the real mail on v4, why bother?

Barry Leiba, another email security writer, writes the following in a CircleID article entitled IP Blocklists, Email, and IPv6:

John Levine has one approach: leave the email system on IPv4 for the foreseeable future. Even, John points out, when many other services, customer endpoints, mobile and household devices, and the like have been — have to have been — switched to IPv6, we can still run the Internet email infrastructure on IPv4 for a long time, leaving the IP blocklists with v4 addresses, and a system that we’re already managing fine with.

Of course, some day, we’ll want to completely get rid of IPv4 on the Internet, and by then we’ll need to have figured out a replacement for the IP blocklist mechanism. But John’s right that that won’t be happening for many years yet, and he makes a good case for saying that we don’t have to worry about it.

Both writers are saying the same thing, and I have been on discussion threads where the consensus was similar: there is no agreement on how to handle email over IPv6, at least in the short term, but eventually it will probably have to be figured out (some believe mail will never move to IPv6, while others think that it will have to go there one of these days). In the meantime, just use IPv4 to send mail.

To expand a bit on what both writers are saying, the biggest reason why no mail providers are particularly thrilled about using IPv6 to handle email is that there is no way at the moment to deal with the problem of abuse. Today, spammers make extensive use of botnets. Each day, they compromise new machines and start using them to spew out spam. Each of these bots uses a different IP address, and the IP addresses change all of the time. I haven’t done an analysis in a while, but if you had 10,000 IP addresses sending out spam today, then tomorrow there would be 10,000 again, but at least 9700 of them would be different from the previous day’s.

The reason that there is so much rotation in IP addresses is that spam filters today make use of IP blocklists. When a blocklist service detects that an IP is sending spam, it adds it to the blocklist, and mail servers using the list reject all mail from it. There are exceptions to this rule such as a legitimate IP that sends a majority of good mail (such as a Hotmail or Gmail IP address), but in general, mail servers reject all mail from blocklisted IPs. The reason they do this is the following:

  1. 90% of all email flowing across the Internet (not including internal mail within an organization) is spam. If a sending IP is on a blocklist, a mail server can reject it in the SMTP transaction and save on all of the processing costs associated with accepting the message and filtering it in the content filter. Many mail servers these days would topple over and crash because they could not keep up with the load if they had to handle all of the mail coming from blocklisted IPs since it would increase the number of total messages to deal with by a factor of 10.
  2. Spam filters get slightly better antispam metrics by using IP blocklists. Content filters are pretty good today, but rejecting 100% of mail from a spamming IP address means that there is no possibility of a false negative from that IP address. By contrast, if a content filter does not use an IP blocklist, the content filter has to learn to recognize the spam coming from that IP address, update the filter and then replicate out the changes. This is almost always slower than pulling down a blocklist and then using it as the first line of defense. Without an IP blocklist, a spam filter might be expected to filter between 80% and 99% of the mail coming from a blocklisted IP. While many spam filters get pretty close to that 99% range, it’s still not 100%.

Those are the two primary reasons to use IP blocklists. They are essential in blocking spam. Next up, the question is how blocklists are populated, and I’m going to leave that aside because there are resources elsewhere on how to deal with that. Blocklist operators publish their lists in two ways:

  1. They list individual IP addresses of all the servers that are sending mail, one by one.
  2. They make use of CIDR notation. CIDR (Classless Inter-Domain Routing) notation is basically a way to group large blocks of IP addresses. In IP blocklists, a provider would list a larger group of IP addresses in CIDR notation in order to save on space in the file (they don’t have to list them one by one). For example, the XBL is about 7 million entries (lines of text) and is around 100 megs in size. By contrast, the PBL contains 200,000 lines of text (without exceptions in ! notation) and is 6 megs. However, the PBL is represented mostly in CIDR notation. If all of these ranges are expanded, it is over 650 million individual IP addresses. That’s a whole heck of a lot more IPs in the PBL for a whole lot less file space.
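
To illustrate the space saving from the second style, here is a minimal Python sketch using the standard-library `ipaddress.collapse_addresses` function. The 1,024 consecutive addresses are made-up sample data, not entries from any real blocklist.

```python
import ipaddress

# Made-up sample data: 1,024 consecutive addresses, each listed singly as a /32.
start = ipaddress.IPv4Address("65.55.0.0")
individual = [ipaddress.ip_network(f"{start + i}/32") for i in range(1024)]

# Merge adjacent entries into the smallest set of CIDR ranges that covers them.
collapsed = list(ipaddress.collapse_addresses(individual))

print(len(individual), "single-IP entries")
print(len(collapsed), "CIDR entry:", collapsed[0])
# 1 CIDR entry: 65.55.0.0/22
```

One line in CIDR notation stands in for 1,024 lines of individual IPs, which is the same effect that lets the PBL cover hundreds of millions of addresses in a few megs.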

In terms of effectiveness, we run XBL in front of PBL and XBL blocks about 4 times as much mail as PBL (I don’t know how many would be blocked if we ran them in reverse). The XBL is better at catching individual bots that are sending out spam but are not listed anywhere (they are new IPs) whereas the PBL is better at pre-emptively catching mail servers that should never send out spam (probable bots but it doesn’t matter because they shouldn’t be sending mail anyhow). They are designed to be used in tandem. However, if we had to list every single PBL IP singly instead of compressing it into CIDR ranges, and if we use about the same ratio of 7 million IPs ~ 100 megs, then the PBL would be 9.4 gigs in total size. 9.4 gigs is a large file size. It isn’t completely unmanageable but it goes from being a minor inconvenience to being a major one. It takes a long time to download/upload/process a 9.4 gig file. It’s also far easier to store the file entries in a database if it is only 500,000 entries (or even 7 million) vs 650 million of them. Databases that large start to run into the problem of scale.

The PBL and XBL are prime examples of why different styles of IP blocklists are required. The PBL lists 650 million IPs and we still have over 7 million IPs on the XBL that aren’t on the PBL. Clearly, spamming bots can and do move around such that they are not listed on the lists that have large swaths listed. Bots are very good at hiding in places that are not called out and blocked yet. If they could not do this they would not be in business, and spammers are still in business. The fact is that given enough space to hide, spammers will hide in that space. The problem that we in the industry face is that as soon as we find a hiding space, we can block it for a bit but the spammer will vacate it, relocate elsewhere and continue to spam.

And therein is the problem of IPv6. An IPv4 IP address consists of 4 octets, and each octet is a number running from 0-255. This means that there are 256 x 256 x 256 x 256 possible IP addresses, which is roughly 4.3 billion. In reality, there are far fewer than this because lots of ranges of IPs are reserved and not for public consumption. Still, using our ratio from above, if you had to list every single IP address singly in a file, then the size of the file would be 61 gigs. 61 gigs is a very large file size and there are very few pieces of hardware that can handle that size of file in memory (whether you are doing IP blocklist lookups in rbldnsd or some other in-memory solution on-the-box). Processing the file and cleaning it up would take a very long time; you simply couldn’t do it in real time, and IP blocklists need to be updated frequently (once per hour at a bare minimum).
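
The two file-size estimates above come from the same ratio, which can be reproduced with a short Python sketch. It assumes that "megs" and "gigs" mean 10^6 and 10^9 bytes and that the XBL ratio (7 million entries in about 100 megs) holds for any list of single IPs; the names are illustrative only.

```python
# Ratio taken from the XBL figures above: ~7 million entries in ~100 megs.
# Assumption: 1 meg = 10**6 bytes and 1 gig = 10**9 bytes.
BYTES_PER_ENTRY = (100 * 10**6) / 7_000_000   # roughly 14.3 bytes per listed IP

def estimated_size_gb(num_ips: int) -> float:
    """Estimated size, in gigs, of a file listing num_ips addresses one per line."""
    return num_ips * BYTES_PER_ENTRY / 10**9

pbl_expanded_gb = estimated_size_gb(650_000_000)  # PBL with every range expanded
all_ipv4_gb = estimated_size_gb(256**4)           # every possible IPv4 address

print(f"PBL expanded: {pbl_expanded_gb:.1f} gigs")
print(f"All of IPv4:  {all_ipv4_gb:.1f} gigs")
```

With exactly 650 million addresses the sketch gives about 9.3 gigs, in line with the 9.4 figure for "over 650 million", and the full IPv4 space comes out at about 61 gigs.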

Click to read Part 2 and Part 3.

Written by Terry Zink, Program Manager

Brocade offers NAT64 in software update to ServerIron ADX

Brocade has brought much-needed enhancements to the IPv6 capabilities of the ServerIron ADX series with software release 12.3. The details are a bit unclear as this blog entry is being written, since the Brocade website still seems to carry collateral for release 12.2. The YouTube announcement does reference a standards-based NAT64, which should be an implementation of the soon-to-be-published RFC currently called draft-ietf-behave-v6v4-xlate-stateful, but there was no specific statement to that effect.

Deploying IPv6: The Surfnet Case Study

Dutch organisation SURFnet has created a document that looks in depth at how they deployed IPv6 across their network. Intended for network architects and network managers implementing IPv6 in their organisations, the document has been translated by the RIPE NCC and is available online.

Read the PDF

NYI Steps Up Support of World IPv6 Day

Mission-Critical Data Services Provider Achieves v6 Corporate Compliance as Part of Aggressive Campaign for World IPv6 Day, June 8, 2011

NEW YORK, NY March 16, 2011—NYI (www.nyi.net), a New York City-based, mission-critical data services provider, today announced IPv6 corporate compliance, the company’s latest milestone in its aggressive campaign to support World IPv6 Day, June 8, 2011. The news comes on the heels of NYI’s February announcement of full client support of the new network protocol that replaces the depleted supply of Internet Protocol version 4 (IPv4) addresses with a brand-new pool that offers such functionality as improved performance rate and security as well as reduced latency.

By achieving corporate compliance, NYI’s principal web presence is now v6 reachable, placing it at the forefront of supporters who understand the importance of raising awareness.

“NYI is among the growing list of companies making IPv6 available for commercial use,” said Leslie Daigle, Chief Internet Technology Officer of the Internet Society. “Companies who recognize the business imperative of forward progress for Internet technologies such as the deployment of IPv6 provide an example to the industry for others to follow.”

“The Internet Society has done a fantastic job of bringing IPv6 to the forefront,” added Phillip Koblence, VP Operations, NYI. “With World IPv6 Day, they have created a groundswell of attention that no one in technology can ignore. We all owe them a depth of gratitude for what they’ve done and look forward to doing whatever else we can to supporting their mandate.”

To find out more about World IPv6 Day and lend your support, visit: http://isoc.org/wp/worldipv6day/. For any business-critical questions about deploying IPv6, NYI has set up a special support team whom you can contact at [email protected] or 1-800-288-7387.

IPv6 Address Allocations

Last year, we presented statistics on the number of RIPE NCC members and the resources distributed to them. Now, one year later, we revisit the topic and look at how things evolved in 2010. We were particularly interested to see how the number of IPv6 allocations increased over time.

With unallocated IPv4 addresses nearing exhaustion, it is good to see the number of IPv6 allocations continued to grow exponentially in 2010 and early 2011. A total of 834 new allocations were made in 2010, compared to 554 in 2009. The graph below shows demand for IPv6 addresses was especially high during the last two months: 316 IPv6 allocations were made in January and February 2011 alone.

Another good sign is the growth in the number of LIRs holding an IPv6 allocation. The percentage of the RIPE NCC membership with one or more IPv6 allocations increased from 25% in 2009 to 35% in 2010.

For more information, please refer to the background article on RIPE Labs: Members and Number Resources, One Year Later

Written by Mirjam Kuehne