Author's note: Lindy White passed away shortly before this article was posted. I'm grateful to Lindy and his supervisor, Kevin LaBranche, for bringing Lindy's solution to my attention and arranging the interview. --ag
Lindy White, Systems Specialist, Coconino County, Arizona
Businesses with public websites face the trade-off of providing unfettered
access to legitimate site users versus blocking security threats to the
site, such as hackers and bots. Local “.org” websites, such as governments and
school districts, often publish employees’ contact information—but
posting that information also makes the site a prime target for spambots
that comb the Internet for email addresses to collect, or reap.
Coconino County (Arizona) employees, whose contact information
is published on department pages on the county website (www.coconino.az.gov), noticed a steep increase in spam early this year,
despite the use of a spam-filtering product. County systems specialist
Lindy White solved the problem by writing an ASP.NET 2.0 HTTP
module that intercepts county email addresses being accessed from
outside the county’s Microsoft IIS web server, then redirects legitimate
users to a contact form. I spoke with Lindy about how he developed
his innovative solution and how it has drastically reduced the spam
in Coconino County employees’ mailboxes.
Q: Let’s start by talking about the county site and what made it a
target for spambots.
A: On our public site, all our departments have a home page,
and some have several additional pages. Department employees
administer the content on those pages using a content management
system (CMS). They’re very reliable and responsible about the kind of
information that they’re publishing. Because we want our services to
be reachable, [the employees] all make sure there are plenty of email
addresses on these department pages.
Starting this year, we were filtering out roughly 400,000 emails a
month, which isn’t atypical for an organization. But then we started
seeing a straight-line increase in spam, going up by maybe 50,000
messages a month. I wondered whether our county website was
contributing to this increasing load on our spam filter, with the number
of email addresses we were exposing to web crawlers, web-bots,
and spambots. You want Google and Yahoo! to crawl your site, but
you don’t want the crawlers that are specifically there to reap email
addresses.
Then I took off my white hat and put on my black hat. I wrote my
own spambot, turned it loose against the county site, and came up
with almost 600 unique county email addresses. That told me everything
I needed to know. We needed to stop handing those [email
addresses] out to spambots while still making those addresses available
to people.
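
To make the audit step concrete, here is a minimal Python sketch of the kind of check Lindy describes: scanning page text for addresses and counting the unique ones. It is an illustration, not Lindy's spambot. The function name, the sample pages, and the deliberately simple address pattern are all assumptions, and the sketch works on HTML you have already retrieved from a site you administer.

```python
import re

# Deliberately simple pattern (an assumption for this sketch); real address
# syntax is broader, but harvesters tend to use patterns about this crude.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_addresses(pages):
    """Return the sorted unique addresses found in an iterable of HTML strings."""
    found = set()
    for html in pages:
        for match in EMAIL_RE.findall(html):
            found.add(match.lower())  # treat Clerk@ and clerk@ as one address
    return sorted(found)


# Hypothetical sample pages standing in for department pages.
sample_pages = [
    '<a href="mailto:clerk@example.gov">clerk@example.gov</a>',
    "<p>Contact Parks: parks@example.gov or Clerk@example.gov</p>",
]

print(harvest_addresses(sample_pages))
# ['clerk@example.gov', 'parks@example.gov']
```

Anything this function can find in a page's markup, a real harvester can find too, which is why the count is a useful measure of exposure.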
Q: How did you solve the spambot problem?
A: I proposed several solutions and pitched the best one to Kevin
LaBranche, my division manager. Microsoft .NET Framework lets you
write some very low-level hook-ins to the IIS web server. So I decided
to write an HTTP module that sits in the web server’s memory and
basically looks for email addresses that are leaving the web server to
go to somebody’s computer. At that point, I chose to substitute a form
with CAPTCHA, to enable my program to distinguish whether a person
or a computer was accessing an email address. The email form hides
the email address, but automated spammers can still fill out the form
and submit it. The CAPTCHA test is a second level of security directed
at preventing that. The module is all callbacks; it’s not linear programming
at all. It’s all event driven.
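
The substitution idea can be sketched in a few lines. The county's module was written for ASP.NET 2.0 and its source isn't reproduced here, so the Python below only illustrates the general technique of rewriting outgoing HTML through a callback. The `/contact?id=` URL, the replacement text, and all function names are assumptions made for this example, and the CAPTCHA step on the form itself is not shown.

```python
import re

ADDR = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"
MAILTO_RE = re.compile(r"mailto:(" + ADDR + r")", re.IGNORECASE)
BARE_RE = re.compile(ADDR)


def mask_addresses(html, contact_id_for):
    """Rewrite outgoing HTML so that no email address reaches the client.

    contact_id_for is a callback that maps an address to its numeric ID.
    """
    def as_link(match):
        # mailto: targets become links to a (hypothetical) contact form.
        return "/contact?id=%d" % contact_id_for(match.group(1).lower())

    def as_text(match):
        # Addresses in visible text become a neutral label.
        return "contact form #%d" % contact_id_for(match.group(0).lower())

    html = MAILTO_RE.sub(as_link, html)
    return BARE_RE.sub(as_text, html)


# Hypothetical lookup table standing in for the real ID source.
ids = {"clerk@example.gov": 17}
page = '<a href="mailto:clerk@example.gov">clerk@example.gov</a>'

print(mask_addresses(page, lambda addr: ids[addr]))
# <a href="/contact?id=17">contact form #17</a>
```

The link targets are rewritten first so that the second pass only sees addresses in visible text.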
When the HTTP module snags an email address, the module connects
to a database and checks a list of email addresses maintained
there. If that email address isn’t on the list, [the module] adds it and
assigns it a unique number. If the email address is on the list, [the module]
just reads that number and substitutes it for the email address, so
that your random web-bot will never see it.
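
The look-up-or-add step described above might look like the following sketch. The interview doesn't name the database product or its schema, so this example assumes an in-memory SQLite database with a hypothetical `contacts` table. The function names are also invented for illustration.

```python
import sqlite3


def open_contact_db(path=":memory:"):
    """Open the database and create the (hypothetical) contacts table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " address TEXT NOT NULL UNIQUE)"
    )
    return conn


def contact_id_for(conn, address):
    """Return the ID for an address, adding the address if it is new."""
    address = address.lower()
    row = conn.execute(
        "SELECT id FROM contacts WHERE address = ?", (address,)
    ).fetchone()
    if row is not None:
        return row[0]  # already on the list: just read its number
    cur = conn.execute("INSERT INTO contacts (address) VALUES (?)", (address,))
    conn.commit()
    return cur.lastrowid  # not on the list: add it and return the new number


conn = open_contact_db()
first = contact_id_for(conn, "clerk@example.gov")
second = contact_id_for(conn, "parks@example.gov")
again = contact_id_for(conn, "Clerk@example.gov")
print(first, second, again)  # 1 2 1
```

Because the same address always maps to the same number, links to the contact form stay stable across page views.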
Q: How complex was the solution to develop?
A: Where the complexity came in was that the CMS editors needed to
see the actual email addresses, not the contact ID of the form. I think I did
what was probably pioneering work in how to selectively make exceptions
for certain pages that you might classify as administrative pages
and display email addresses to the employees who needed them.
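
The interview doesn't say how the module decided which pages counted as administrative, so the following Python sketch shows only one plausible rule: skip the masking when a signed-in editor requests a page under a known administrative path. The path prefixes and the function name are assumptions.

```python
# Hypothetical path prefixes for CMS editing pages.
ADMIN_PREFIXES = ("/cms/", "/admin/")


def should_mask(path, is_authenticated_editor):
    """Mask addresses unless a signed-in editor is on an administrative page."""
    on_admin_page = path.lower().startswith(ADMIN_PREFIXES)
    return not (on_admin_page and is_authenticated_editor)


print(should_mask("/departments/parks.aspx", False))  # True
print(should_mask("/cms/edit/parks.aspx", True))      # False
print(should_mask("/cms/edit/parks.aspx", False))     # True
```

Requiring both conditions means an anonymous visitor, or a bot, that requests an administrative URL still gets the masked page.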
Q: When you started using the HTTP module, what happened to
the amount of spam employees were receiving?
A: I brought the solution online and put it in production in late February.
In March, the amount of spam caught in the filter was still going
up in that same straight line, 50,000 a month. But in the March–April
timeframe, we saw the first drop that we had ever seen. That curve
dropped off by maybe 44,000 spam [messages].
Q: You’re primarily a system- and server-level scripter and programmer
and don’t work with end users much. Nevertheless, you
solved a big end-user problem. Did you get any recognition within
your organization for your solution?
A: Yes, I was absolutely astonished to learn that I’d been nominated
for a county award because of the solution. Nobody cares about the
behind-the-scenes programming that I usually do. But whole departments
were coming up to me and saying how they were so tired of
all the spam they were getting on their public email addresses and
thanking me for my hard work.