Cardholder Services is a firm specializing in scamming.

Here's one air-squirter now.
Cardholder Services is a machine that calls you, waits a few seconds until after you answer the phone, and then plays a recording to you about credit card debt, "lowering your interest rates," and "something something set to expire." If you signal your interest to the machine by pressing 9 on your touch-tone phone, you get transferred to another machine that communicates to you by reading a script and squirting air through its meat into a telephone receiver. (You have to hold a few moments for this, since you're inconveniencing them by playing along.)

Cardholder Services doesn't have a specific or legal entity name. They are simply "Cardholder Services."

Cardholder Services explicitly claims to represent Visa and Mastercard.

Cardholder Services isn't headquartered in any particular city. C.S.: "We're out of Washington." J.P.: "Washington, as in D.C.?" C.S.: "No, Washington State" (which is where I live).

Cardholder Services denies having the obligation to add you to their "do not call" list upon request. C.S.: "(sigh) SIR, we're trying to HELP YOU here by LOWERING YOUR CREDIT CARD DEBT! We're not trying to sell you anything!" Or, they'll tell you that you're added, and then call you again within a month.

Cardholder Services does not respect the United States Do Not Call registry.

Cardholder Services will either berate you or hang up on you when it's clear you're not going to play along.

Cardholder Services breaks United States law by purporting to represent someone that they do not (your credit card company), and by not respecting the DNC registry, and by not maintaining and following their own DNC list.

Cardholder Services cannot be held accountable for their actions, since they cannot be identified. They dial anonymously with fictitious originating Caller ID, and they don't give you a callback number.

After several tries of civilly asking them to place me on their DNC list, this last call went something like this (roughly):

C.S.: "Blah blah blah $5000 or more in credit card debt and at least a 477% interest rate?"

J.P.: "Sure. But what's the real name of your company? How do I know you're legit?"

C.S.: "Oh I understand that blah blah blah, and we really appreciate your concerns. Now how much debt did you say you're in?"

J.P.: "$6200." (The last time I made up a number here, it was less than $5000, and that got me hung up on.)

C.S.: "And what would you say your interest rate is?"

J.P.: "Oh, I dunno, in the 30% range." (Why not?)

C.S.: "Gee wowzers!"

Proceeded to give fake name, DOB, last-four-of-SSN.

Then they get to the point where they ask for the customer service number on the back of your credit card, "for verification purposes." I've read that what they do is look up your financial institution from that phone number, then read to you what are most likely the first four digits of any credit card number issued by that firm, probably to carry along with the illusion that they have any connection whatsoever to your financial institution. In my case, I read them the customer service number on the back of the debit card issued by a small local bank. I guess they didn't have that number in their map, because they asked for the first four digits of "my credit card number."

(Hey, what's my Visa number doing here?)

He put me on hold for a few minutes to "verify my information," then took me off hold, pretended like he couldn't hear anything on his end of the call, and then hung up. Shucks.

I need to build a phone recorder. Since Washington is unfortunately a two-party recording state, I'll have to announce this whenever they call, but the responses to that should be amusing enough in and of themselves.


Lisp libraries

I've released a few Common Lisp libraries this month:

Calispel is a Common Lisp library for thread-safe message-passing channels, in the style of the occam programming language.

Calispel channels let one thread communicate with another, facilitating unidirectional communication of any Lisp object. Channels may be unbuffered, where a sender waits for a receiver (or vice versa) before either operation can continue, or channels may be buffered with flexible policy options.

Because sending and receiving on a channel may block, either operation can time out after a specified amount of time.

A syntax for alternation is provided (like ALT in occam, or Unix select()): given a sequence of operations, any or all of which may block, alternation selects the first operation that doesn't block and executes associated code. Alternation can also time out, executing an "otherwise" clause if no operation becomes available within a set amount of time.
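To make the idea of alternation concrete, here is a rough Python sketch. It is not Calispel's API or implementation: it assumes each channel is a standard `queue.Queue`, and it polls the channels in a loop, which a real implementation would avoid. The name `alt_receive` and its parameters are made up for illustration.

```python
import queue
import time


def alt_receive(channels, timeout, poll_interval=0.01):
    """Return (index, value) from the first channel with a value ready.

    Hypothetical helper: polls each queue in order until one has data,
    or returns None (the "otherwise" case) once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        for index, channel in enumerate(channels):
            try:
                return index, channel.get_nowait()
            except queue.Empty:
                continue
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)


requests, shutdown = queue.Queue(), queue.Queue()
shutdown.put("stop")
print(alt_receive([requests, shutdown], timeout=0.1))   # (1, 'stop')
print(alt_receive([requests, shutdown], timeout=0.05))  # None
```

The second call finds nothing ready on either channel, so it falls through to the timeout branch.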

jpl-queues is a Common Lisp library implementing a few different kinds of queues.

These kinds of queues are provided:

  • Bounded and unbounded FIFO queues.
  • Lossy bounded FIFO queues that drop elements when full.
  • Unbounded random-order queues that use less memory than unbounded FIFO queues.
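As a rough illustration of the last two kinds, here is a Python sketch. It is not the jpl-queues API; the class and method names are hypothetical. It also assumes that a lossy queue drops the newest element when full, and that a random-order queue saves memory by keeping elements in one flat array. Both are guesses about the design, not documented behavior.

```python
import random
from collections import deque


class LossyBoundedQueue:
    """Bounded FIFO that silently drops new elements once it is full.

    Assumption: the newest element is the one dropped; a real library
    might drop the oldest instead.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def enqueue(self, item):
        if len(self.items) < self.capacity:
            self.items.append(item)
        # Otherwise the queue is full and the item is discarded.

    def dequeue(self):
        return self.items.popleft()


class RandomOrderQueue:
    """Unbounded queue that returns its elements in arbitrary order.

    Elements live in one flat list. Removal swaps a random element with
    the last one and pops it, so ordering information is never stored.
    """

    def __init__(self):
        self.items = []

    def enqueue(self, item):
        self.items.append(item)

    def dequeue(self):
        index = random.randrange(len(self.items))
        self.items[index], self.items[-1] = self.items[-1], self.items[index]
        return self.items.pop()
```

Enqueueing five items into `LossyBoundedQueue(2)` keeps only the first two; a `RandomOrderQueue` returns every element exactly once, in no particular order.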

Additionally, a synchronization wrapper is provided to make any queue conforming to the jpl-queues API thread-safe for lightweight multithreading applications. (See Calispel for a more sophisticated CL multithreaded message-passing library with timeouts and alternation among several blockable channels.)

SimpSamp is a Common Lisp library for simple random sampling without replacement. In English, that means it will return a requested number of random elements from a set, without duplicating any of the elements.

For instance, if you have a list of 1,000,000 elements and want 1,000 unique random elements from the list, SimpSamp will do the trick.

SimpSamp implements two algorithms: selection sampling and reservoir sampling, both described in The Art of Computer Programming.
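For readers who haven't met these two algorithms, here is a minimal Python sketch of each, following the textbook descriptions (Knuth's Algorithm S and Algorithm R). This is not SimpSamp's code or interface; the function names and the `rng` parameter are made up for illustration.

```python
import random


def selection_sample(sequence, k, rng=random):
    """Choose k elements from a sequence of known length, keeping their order."""
    n = len(sequence)
    if k > n:
        raise ValueError("cannot sample more elements than the sequence holds")
    chosen = []
    for index, item in enumerate(sequence):
        remaining = n - index     # elements not yet considered, including this one
        needed = k - len(chosen)  # elements still to be selected
        # Select this element with probability needed / remaining.
        if rng.random() * remaining < needed:
            chosen.append(item)
    return chosen


def reservoir_sample(iterable, k, rng=random):
    """Choose k elements from an iterable whose length is not known in advance."""
    reservoir = []
    for seen, item in enumerate(iterable):
        if seen < k:
            reservoir.append(item)        # fill the reservoir first
        else:
            slot = rng.randrange(seen + 1)
            if slot < k:
                reservoir[slot] = item    # replace with probability k / (seen + 1)
    return reservoir


population = list(range(1_000_000))
print(len(selection_sample(population, 1000)))  # 1000
print(len(reservoir_sample(iter(population), 1000)))  # 1000
```

Selection sampling needs the total length up front but returns elements in their original order. Reservoir sampling works on a stream, at the cost of holding k elements in memory throughout.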

  • Current Music
    Super Mario Brothers, world 2-1

Releasing Common Lisp geospatial libraries

cl-wkb implements the OGC Well-Known Binary geographic geometry data model, and provides WKB encoding and decoding functionality.

Three people have asked for it—two in one day—so today I've released it under a BSD-like license.

Of probably more use to Lisp GIS hackers (together with cl-wkb, or overall), I'm throwing in: cl-geo, a set of geographic data structures and operations.

I hope some people find it to be useful.


How to manage a set of servers

This article discusses sane system administration for a group of similarly-configured machines. It's also the basis for a talk that I'd like to give at the Spokane Linux User's Group.

Due credit: this is essentially my digested version of the concepts presented in the paper Bootstrapping an Infrastructure, and the derived material at http://www.infrastructures.org/. Some bits, like unified user accounts and NFS home directories, aren't important to me, so I haven't included them here. And I've been a lot more specific about choices of tools than the infrastructures.org folks have—you'll find a general Debian bias. =)

Due disclosure: I'm only in the initial stages of implementing for myself what I describe here. I'm writing this while the goals and principles are fresh in my mind. Actual mileage may vary, especially around section 5.

Your current situation may look something like this: you manage several machines—could be a few or it could be hundreds. These machines could be servers, routers, or workstations. You have certain preferences and practices for what you like to have in common for all your machines.

  • Security updates should run at 2:10am.
  • All my web servers should have these bits of configuration.
  • All my firewalls should have these particular rules.
  • I really want curl and wget on all my machines, and pv is a neat little utility that I want everywhere, too.

But there are problems:

  • You find that your machines, in aggregate, easily diverge from your ideal vision, gradually decaying into chaos.
  • You find yourself doing the same thing over and over again. Shell loops and SSH can only get you so far, and mistakes lead to even greater divergence among machines.

You want the things that should be in common to machines to actually be consistent. You want to quit repeating yourself.

Here, we'll be discussing what to do about your situation.

1. Figure out how to install a consistent machine image, automatically if you can.

I'm using Debian preseeding because I use Debian, and FAI is too complex and hard to learn. FAI can manage machine configuration, and kind of needs to, to some extent, but that overlaps with what I want Puppet to do (see below). Debian preseeding is "as simple as possible, but no simpler."

2. Learn the hierarchy of data.

Your infrastructure is divided into a number of machines. The data on each machine is divided into system and non-system data.

System data means installed programs, initialization scripts, and other stuff that comes from your OS distribution. System stuff is boilerplate. It's easy to replace this.

Non-system data is everything that makes a machine unique. It is further divided into configuration and local state.

Configuration is easy: parameters that define what a machine is supposed to do, and how it's supposed to do it. This includes the selection of installed software packages, and most of everything in /etc.

Local state is the non-configuration, non-distribution-supplied data that your applications need in order to operate. This would include the web root hierarchy of a web server, the mail spool and mailboxes on a mail server, or the zone files of a DNS server. Note that this includes both machine-generated state (mail spool directories) and human-generated state (web roots). The local state includes most of everything in /var.

3. Understand key goals when it comes to machine data.

System data can be thrown away and replaced easily, because you can reimage a new machine consistently.

Configuration data should largely be consistent across machines.

  • Some things will always vary: hostname, IP address, and so forth.
  • Some things you'll want to be standardized: NTP configuration, smarthost configuration for non-mail servers, OS package repository selection, a set of software packages you want to have available on all machines, and so forth.
  • Some things will be standardized, but only for certain classes of machines: web servers will always have Apache with certain configuration bits, mail servers will always run Postfix with other configuration bits—or whatever you prefer, of course.

A machine's configuration is important—of course you don't want to lose it. But we'll be managing configuration from a central location, so if it gets physically lost on a particular machine, we can recreate it with the configuration management system (discussed below).

Local state is the crux of a machine's data. Your web server just won't work correctly unless you have the right HTML pages in place. You cannot lose this data, so you'll back it up regularly. With the methodology described here, it won't be hard to make this automatic.
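The three kinds of data and their recovery strategies can be summarized in a small Python sketch. The path rules are deliberately coarse: only "most of everything" in /etc and /var fits these categories, so a real classification would need exceptions. The function names are hypothetical.

```python
# How each category of data is recovered when a machine is lost.
RECOVERY_PLAN = {
    "system": "reinstall from the standard machine image",
    "configuration": "reapply from the central configuration management system",
    "local state": "restore from the most recent backup",
}


def classify(path):
    """Very rough guess at a path's category, based on its top-level directory."""
    if path.startswith("/etc/"):
        return "configuration"
    if path.startswith("/var/"):
        return "local state"
    return "system"


def recovery_step(path):
    """Say how the data at `path` would be recovered."""
    return RECOVERY_PLAN[classify(path)]


for example in ("/usr/bin/curl", "/etc/ntp.conf", "/var/mail/alice"):
    print(example, "->", recovery_step(example))
```

Only the last category depends on backups, which is why local state is the part that has to be backed up regularly.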

3. Implement a system for centralized, managed configuration.

You have an ideal vision of what the common configuration of your machines should be. You need the ability to express that configuration, as well as the parts that differ on each machine. The differences are conditional on certain variables: what's the MAC address or the hostname of the machine? What's the role or class (web, mail, etc.) that I've assigned to it? And so forth.

You need to be able to express this information in one place. When you make a configuration change, your running machines will synchronize to reflect that change.
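As a sketch of what "one place" might look like, here is the idea expressed as plain Python data: a common layer, a per-class layer, and per-machine facts. This is not Puppet or Cfengine syntax; the MAC addresses, hostnames, IP addresses, and package names for the classes are invented for the example.

```python
# Settings every machine shares.
COMMON = {"packages": ["curl", "wget", "pv"], "security_updates_at": "02:10"}

# Settings that apply only to a class (role) of machine.
CLASSES = {
    "web": {"packages": ["apache2"]},
    "mail": {"packages": ["postfix"]},
}

# Hypothetical inventory: the facts that always vary, keyed by MAC address.
MACHINES = {
    "00:16:3e:00:00:01": {"hostname": "web1", "ip": "192.0.2.10", "role": "web"},
    "00:16:3e:00:00:02": {"hostname": "mail1", "ip": "192.0.2.20", "role": "mail"},
}


def resolve(mac):
    """Build one machine's full configuration from the shared layers."""
    machine = MACHINES[mac]
    role = CLASSES[machine["role"]]
    config = dict(COMMON)
    # Package lists accumulate; every other key is overridden by the narrower layer.
    config["packages"] = COMMON["packages"] + role["packages"]
    config.update({key: value for key, value in role.items() if key != "packages"})
    config.update(machine)
    return config


print(resolve("00:16:3e:00:00:01")["packages"])  # ['curl', 'wget', 'pv', 'apache2']
```

Changing `COMMON` changes every machine at once, while adding a machine means adding a single entry to `MACHINES`.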

A configuration management system should include:

  • A procedure to express configuration in this fashion, and
  • The tools for your machines to apply the configuration.

Puppet and Cfengine are contenders for this position.

Add a client for Puppet or Cfengine to your standard machine installation image. Your new machines will automatically configure themselves.

  • Installing a machine from your installation image results in a plain, stripped-down configuration.
  • The configuration management system updates the new machine's configuration to whatever it needs to be, as you define in the managed configuration.

You'll also want to log, timestamp, and document your changes to the managed configuration. You'll want to see which of your admins made what change, and why. Which files were touched? What were the lines that were altered? What did the file look like before the change?

Applying a revision control system, such as Git, Darcs, or Subversion, to your managed configuration makes this possible. When you do this, you have a complete history of configuration changes, and can roll back changes that turn out to be mistakes.

4. The golden rule.

Now that you have a configuration management system in place, follow our golden rule for infrastructures:

Never, ever change the configuration data or system data of a machine in a way which is outside the control of your configuration management system.

(...also known as "cowboy sysadmining.") It's very tempting to do, because it's the quick and lazy thing. It's also quite common in many (if not most) IT shops. However, this causes long-term damage.

A violation of this rule could be considered a defect or an error in the machine. When you break this rule, a machine diverges from your ideal vision. One of two things can happen:

  1. The configuration management system detects the error and corrects it, undoing your work.
  2. The error goes undetected, and the machine is permanently different from all the other machines in your infrastructure.

Possibility #2 is the more dangerous of the two. In this case:

  • The machine behaves unpredictably. It's different from all the other machines of its type, so it's hard to reason about its behavior.
  • Personnel trained to Do The Right Thing by using the configuration management system will probably not be aware of the variation. There's no automatic documentation or audit trail, like there is with a revision-controlled configuration management system.
  • If the machine needs to be replaced, the new machine will not carry this change, since the change isn't tracked. So, the replacement of a machine is now a dangerous and risky undertaking.
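The two possibilities above can be sketched in Python by treating a machine's configuration as a dictionary. This is a toy model, not how any particular configuration management tool works; the function names and the sample keys are hypothetical.

```python
def find_drift(desired, actual):
    """Return {key: (desired_value, actual_value)} for managed keys that differ."""
    return {
        key: (value, actual.get(key))
        for key, value in desired.items()
        if actual.get(key) != value
    }


def unmanaged_keys(desired, actual):
    """Return keys present on the machine that the managed configuration ignores."""
    return sorted(set(actual) - set(desired))


def converge(desired, actual):
    """Reset managed keys to their desired values; unmanaged keys are untouched."""
    corrected = dict(actual)
    corrected.update(desired)
    return corrected


desired = {"ntp_server": "ntp.example.org"}
actual = {"ntp_server": "10.0.0.5", "extra_cron_job": "hand-added"}

print(find_drift(desired, actual))      # possibility 1: detected, will be undone
print(unmanaged_keys(desired, actual))  # possibility 2: invisible to the tool
print(converge(desired, actual))
```

The hand-edited `ntp_server` is detected and reset, but `extra_cron_job` survives the run, and nothing in the managed configuration records that it exists.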

I would go so far as to say it's not worth doing configuration management if you won't stick to this golden rule. After all, you wanted predictability in your infrastructure, right? You wanted to keep the important stuff in common, right?

If you find yourself with a compelling reason to compromise on this, think carefully about how to do it safely. Develop a procedure comparable to how you would update the managed configuration, to make sure that your infrastructure doesn't gradually descend into chaos once again.

5. Reap the rewards.

Need to add a new web server? It's as simple as:

  • Imaging a machine,
  • Updating your managed configuration to say "the machine with MAC address <foo> has IP address <bar>, and it's a web server,"
  • And letting the machine configure itself.

Now just drop in your web content.

Need to replace the failed hard disk of a critical server? This, too, should be pretty easy:

  • Image a machine.
  • Restore the backup you made of the local server data onto the new machine.
  • Update your managed configuration to say "this particular critical server no longer has the MAC address <foo>. Now it's <bar>."
  • Watch the machine configure itself.

If you did everything right, you'll see your replacement machine come up and take off running. Just like that.

Welcome to the future.

A. "But that would be putting all my eggs in one basket. And then someone pwns my basket."

It's true. Implementing a configuration management system on a centralized server is a security risk. That's because each of your managed machines will do whatever your configuration management server tells them to do, automatically. If a bad guy cracks your management server, they could change the configuration to include "please wget r00tsh3ll-3.1.337.tar.gz from my web server, untar it to /tmp, and move this executable file to this system binaries directory, and run it."

But this is a risk that can be managed. And if you manage it well, it's probably less risky than what you're doing now.

You have to log into your servers for maintenance somehow. In non-managed situations, that "somehow" is probably "with SSH from a system administrator's workstation or laptop." If a cracker gets access to the admin's computer, they can spread their compromise to every other machine that it has access to.

With a centralized, formal configuration management server, you'll be able to lock that server down and make sure nobody's visiting Flash-laden porn web sites on the critical server.

See also: http://www.infrastructures.org/


Slashdot eternal redux

Every year:

  • "Digital Dark Age" – preservation of information on aging media
  • The Internet is Broken – lack of authentication/trust considerations in major protocols

Every 3 months:

  • Microsoft is doing something – how can we make them look bad and accuse them of monopolization?

Every 4 years:

  • E-voting – it's about more than just killing puppies!

What am I missing?

  • Current Music
    Rob N Tug / Fabric 30 / Lifelike and Chris Menace Discopolis: Defected

Yost serial wiring

Warning: only the most pathologically geeky need continue.

Today I got my serial situation sorted out.

I use RS-232 (serial ports) for machine management on an out-of-band medium. If the network is unavailable and I can stick a serial cable into some machine, I can log into it, poke around, fix whatever needs to be fixed, and watch most bootup messages (after POST, beginning with the bootloader), all without needing a video signal or a keyboard. (Expensive motherboards and expensive add-on cards will also capture BIOS-printed text to a serial port; of course Suns and serial-equipped NewWorld Macs do this as a matter of course. I think the most Enterprisey versions of Windows support a serial console, but I don't have any experience with that.)

That's handy, because video only goes so far. If I have a serial cable between machine X and machine Y already, I can log into and monitor machine Y by SSHing into machine X, no matter where I am physically.

It's a little known secret that serial cables you'll find at RadioShack or Best Buy are outrageously overpriced. Actually, it's a well-known fact. Today I received my order of RJ45 to DB-9 and RJ45 to DB-25 adapters, 34 in all. That plus the abundance of free cable I'm sitting on and admittedly more time than I was expecting adds up to hundreds of dollars worth of serial cable. At whatever arbitrary lengths I need. The whole thing came to under $25 shipped from monoprice.com.

Here's how it works:

  • First I wired the adapters I'll be using. These take an RJ45 plug (like Ethernet) and give you some sort of DB-9 or DB-25 connector (like most serial ports use).

    The wiring is fixed per device, and considered a semi-permanent fixture. Therefore, I screw each of the adapters into each port that they'll live on for the foreseeable future. Even if I'm not using that port now; even if I change which other device a cable on that port runs to, the adapter never changes.

  • Then I cut off lengths of 8-conductor cable. It can be just about anything; I have mostly Cat-5 lying around (as do most people (that have read this far)), but I also happen to have some 4-line telephone cable, which is non-twisted and therefore useless for even 10BaseT. Perfect for this.

    The pinouts I use for the cables are always the same for this system.

  • Voilà! I plug any cable in-between any two devices with these adapters hanging off of them. (It turns out the tabs on RJ45 connectors are exponentially more convenient than the thumbscrews of DB-9 and DB-25.)

Did you notice how I didn't mention anything about null-modem cables? Under this scheme, every cable is a null-modem cable, and every adapter is wired for the kind of device it connects to, and expects this kind of cross-over/null-modem cable. So servers and terminals are wired one way; modems are wired another. That's my favorite part about the scheme. As long as you stick to it, you don't have to worry about whether a particular cable is a null-modem cable. It's quite elegant.

What is the scheme? It's the "Yost Serial Device Wiring Standard," 21 years old now. Anyone that's doing anything with serial needs to be using this. It's loads more convenient and cheaper than ready-made serial cables.

So, instead of scrounging for one of the two serial cables I own, I've now got five cables permanently in-place. One of them is 25 feet long, strung over a doorway, instead of 10 feet long, with a bulky DB-25 null-modem in the middle connecting two segments and strewn about the floor, ready to be tripped over. I can SSH into my most trusted system to access the console of any other in my closet.

And, I've now got a UPS cable. Apparently APC considers their cables business (WTF?) to be Serious Business, and protects their trade secrets (pinouts) vigilantly.


I found this diagram from the makers of my UPS software.

Following this pinout, I was pleased to find that I can make Yost DB-9 M adapters for my APC UPSes and use the same serial cables with them. Assuming standard coloring (which is also the coloring as is used in the Yost standard), cut off the pins for brown, orange, blue, and white. Solder the brown and orange leads together, and (separately) blue and white. Cut off green's pin and solder it onto red's pin (or vice-versa) as you would any other Yost adapter. Then push the black pin into position 1 on the DB-9 connector, the yellow pin into position 2, and the red/green pin into position 9. There's your $30 cable.

Needless to say, the biggest downside of going this route was that it took me about 4 hours to wire just 23 adapters. But that's a one-time cost; making new cables (as I need them) takes just a few minutes.

The next-biggest downside was that the adapters I purchased had the pins pre-crimped on the leads. Which was helpful, considering I don't have the pins or crimper for these adapters, until I realized that for each adapter, I'd have to cut off the pin of one wire, strip that wire, and solder it onto the pin of another wire: signal ground, which is green and red on the RJ-45 side, connects to just one pin on the DB-9/DB-25 side. That's what took most of the time. And mashing those now-oversized pins into the connectors.

But I'm still pleased, and glad to get this years-long todo item crossed off.


Correctness doesn't matter with Python

Cf. Issue 1085283. Python's binascii module wouldn't correctly encode binary data to MIME Quoted-Printable format. Python's quopri module depends on binascii's implementation of Quoted-Printable.

It took two years to fix two of the buglets, but they still refuse to fix the other: correctly encoding octet 0.

Python is supposed to be a high-level language, but their use of C has led to yet another stupid bug and they refuse to fix it. This is frustrating. I'm going to have to copy-and-paste the backup Python implementation of Quoted-Printable out of quopri just to do the right thing.
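For reference, a binary-safe Quoted-Printable encoder is small. The following is a minimal sketch written from the rules in RFC 2045, not the code in quopri or binascii. It is deliberately conservative: every octet outside printable ASCII (including NUL, CR, LF, space, and tab) is written as =XX, which the format permits for binary data. The function name is made up.

```python
def encode_quoted_printable(data, max_line=76):
    """Encode bytes as Quoted-Printable, treating the input as binary."""
    lines = []
    line = ""
    for octet in data:
        # Printable ASCII other than "=" (octet 61) may appear literally.
        if 33 <= octet <= 126 and octet != 61:
            token = chr(octet)
        else:
            token = "=%02X" % octet
        # Leave room for the trailing "=" that marks a soft line break.
        if len(line) + len(token) > max_line - 1:
            lines.append(line + "=")
            line = ""
        line += token
    lines.append(line)
    return "\r\n".join(lines).encode("ascii")


print(encode_quoted_printable(b"\x00"))   # b'=00'
print(encode_quoted_printable(b"a=b c"))  # b'a=3Db=20c'
```

Encoding spaces as =20 makes the output longer than necessary for text, but it sidesteps the rule about trailing whitespace at the end of an encoded line.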


En Route now supports GTFS

En Route now supports Google Transit Feed Specification for transit schedule times and vehicle paths, and my STA→En Route Spokane conversion scripts now output data in GTFS format.
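For anyone who hasn't looked inside a GTFS feed: it is a set of CSV files, such as stop_times.txt, with one row per scheduled stop of a trip. Here is a small Python sketch of reading that file. It is not En Route's code; the function names and the sample rows are invented, and only the column names come from the GTFS specification.

```python
import csv
import io


def gtfs_time_to_seconds(text):
    """Convert a GTFS HH:MM:SS time to seconds after the start of the service day.

    Hours may exceed 23 for trips that run past midnight.
    """
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def departures_by_stop(stop_times_csv):
    """Group sorted departure times (in seconds) by stop_id."""
    departures = {}
    for row in csv.DictReader(io.StringIO(stop_times_csv)):
        seconds = gtfs_time_to_seconds(row["departure_time"])
        departures.setdefault(row["stop_id"], []).append(seconds)
    for times in departures.values():
        times.sort()
    return departures


# Made-up sample in the layout of stop_times.txt.
SAMPLE = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
t1,08:00:00,08:00:00,plaza,1
t1,08:12:00,08:12:30,campus,2
t2,25:05:00,25:05:00,plaza,1
"""

print(departures_by_stop(SAMPLE))
# {'plaza': [28800, 90300], 'campus': [29550]}
```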

This means:

  1. I have an external data model instead of my ad-hoc STA data formats. Maintenance becomes an order of magnitude easier.
  2. I can submit STA transit data to Google to include Spokane Transit with Google Transit's trip planner.
  3. Once I start importing street GIS data from OpenStreetMap, it becomes possible for other transit systems to run on my trip planner.
    • I can import data from agencies that make their Google feeds public, like Portland's TriMet.
    • I can market my trip planner to the few and possibly growing number of agencies that do Google Transit, but don't have their own trip planner. They get a trip planner for only the cost of the software, since they already put together the data (which is the expensive part).
  4. I can use standard GTFS tools to view and manipulate STA transit data. Like schedule_viewer:

  • Current Music
    Sasha & John Digweed / Northern Exposure / 2: 0°/South / Underworld: Dark & Long