Wednesday, February 24, 2010

LDAP redundancy through proxy servers

Problem 1: Failover

The problem

Many applications only allow you to configure a single LDAP server. This can lead to unnecessary service outages if your directory service infrastructure is highly available (e.g., you are running Active Directory) and your application cannot take advantage of this fact.

A solution

We can provide a level of redundancy by passing the LDAP connections through a load balancing proxy. While this makes the proxy a single point of failure, it is (a) a very simple tool and thus less prone to complex failure modes, (b) running on the same host as the web application, and (c) completely under our control.

For this example, I will use Balance, a simple TCP load balancer from Inlab Software GmbH. There are packages available for most major Linux distributions, including Fedora and CentOS.

Balance is configured completely on the command line. To provide round-robin access to a pair of directory servers running LDAP over SSL, you might use the following command line:

balance -b 127.0.0.1 636 10.1.1.1 10.1.1.2

Using balance's terminology, this creates one group of two channels. Balance will round-robin among the channels in this group. Note that here and in subsequent examples we are binding the proxy to the loopback interface so that it is only available to applications running on the same host.

If you would prefer to send all the requests to the first server, and only use the second server if the first is unavailable, you could use a configuration like this:

balance -b 127.0.0.1 636 10.1.1.1 \! 10.1.1.2
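To make the difference between the two command lines concrete, here is a rough Python model of the selection logic as described above: round-robin within a group, and fall through to the next group only when nothing in the current group is usable. This is a sketch of the behavior, not balance's actual implementation; the `make_picker` function and the `is_up` health check are names I made up for illustration.

```python
from itertools import count


def make_picker(groups, is_up):
    """Return a function that picks the next backend address.

    groups: list of lists of addresses, in priority order (hypothetical model
            of balance's "groups of channels").
    is_up:  callable(address) -> bool, a stand-in for a real health check.
    """
    counter = count()

    def pick():
        n = next(counter)
        for group in groups:
            alive = [addr for addr in group if is_up(addr)]
            if alive:
                # Round-robin among the live channels of the first usable group.
                return alive[n % len(alive)]
        return None  # every channel in every group is down

    return pick


# Failover layout: one channel per group, like "10.1.1.1 ! 10.1.1.2".
down = set()
pick = make_picker([["10.1.1.1"], ["10.1.1.2"]], lambda addr: addr not in down)
print(pick())          # first group is healthy
down.add("10.1.1.1")   # simulate an outage of the first server
print(pick())          # falls through to the second group
```

Putting both addresses in a single group (`[["10.1.1.1", "10.1.1.2"]]`) models the round-robin command line instead.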

While you can run balance from a standard init (/etc/rc.d/...) script, I prefer to use a service manager such as runit which takes care of restarting the service if it should exit unexpectedly. You could achieve the same thing in a slightly less flexible fashion by putting your balance command line in /etc/inittab. In either case you need to add the -f option to the command line, which causes balance to stay in the foreground.

Problem 2: Debugging LDAP over SSL

The problem

It is convenient to use a packet tracer such as Wireshark to debug LDAP protocol errors. This is often more informative than the debugging information that will be available to you on the client side, and may be more useful than server side debugging in many cases, even supposing that you have administrative access to the directory servers.

A solution

You can use Stunnel, a general purpose SSL proxy tool, to intercept unencrypted client connections on the local machine and then forward them over an SSL channel to a remote server. This makes the unencrypted LDAP traffic available on the loopback interface while still ensuring that it is encrypted on the wire.

Stunnel can operate both as an SSL server and as an SSL client. In this case, we will be running it in client mode, connecting to a remote SSL server (or to the proxy configured in our previous example). Stunnel is configured by means of a simple INI-style configuration file. To achieve the goals of this example we might put the following configuration in a file (say, stunnel.conf):

[ldap]

accept = 127.0.0.1:389
client = yes
connect = localhost:636

We would run stunnel like this:

stunnel /path/to/stunnel.conf

Again, I would run this under the control of a service supervisor. To keep stunnel in the foreground we would add the following to the global section of the configuration file (i.e., before the [ldap] section marker):

foreground = yes
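Once both proxies are running, it can be handy to confirm that the local listeners are actually accepting connections before pointing an application at them. The following is a minimal sketch using only the Python standard library; the `port_is_open` helper is my own name, and the ports are simply the ones used in the examples above.

```python
import socket


def port_is_open(host, port, timeout=0.5):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Ports from the examples above: stunnel on 389, balance on 636.
for name, port in (("stunnel", 389), ("balance", 636)):
    state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
    print(f"{name} (127.0.0.1:{port}): {state}")
```

This only proves that something is listening; it does not verify that LDAP requests make it through to a directory server.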

With both of these solutions in place, we have achieved the following:

  • High availability.

    Our application will transparently make use of multiple directory servers. If a server fails, our application will continue to operate.

  • Security

    Our traffic is encrypted on the wire, regardless of whether the application has support for LDAP over SSL.

  • Visibility

    We are free to examine unencrypted traffic with a packet sniffer running on the local host.

Friday, February 19, 2010

Apache virtual host statistics

As part of a project I'm working on I wanted to get a rough idea of the activity of the Apache virtual hosts on the system. I wasn't able to find exactly what I wanted, so I refreshed my memory of curses to bring you vhoststats.

This tool reads an Apache log file (with support for arbitrary formats) and generates a dynamic bar chart showing the activity (in number of requests and bytes transferred) of hosts on the system. The output might look something like this (but with colors):

[2010/02/19 20:21:32] Hosts: 7 [Displayed: 7] Requests: 104

host1.companyA.com   [R:1         ]  #
                     [B:3         ]
devel.internal       [R:1         ]  #
                     [B:208       ]
host2.companyA.com   [R:1         ]  #
                     [B:4499      ]
A-truncated-host-nam [R:10        ]  ############
                     [B:65380     ]  #
host1.companyB.com   [R:21        ]  ##########################
                     [B:166715    ]  ####
www.google.com       [R:32        ]  #################################
                     [B:1566614   ]  ####################################

The tool keeps running totals over a five minute window, but you can change the window size on the command line. You can tail your active access log to see live results, or for a more exciting display you can just pipe in an existing log.
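The idea of a running total over a time window can be sketched in a few lines of Python. This is not the vhoststats code, just an illustration of the technique under the assumption that log entries arrive in timestamp order; the `WindowedTotals` class and its method names are hypothetical.

```python
from collections import Counter, deque


class WindowedTotals:
    """Per-host request and byte totals over a sliding time window."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self.events = deque()        # (timestamp, host, nbytes), oldest first
        self.requests = Counter()
        self.bytes_sent = Counter()

    def add(self, timestamp, host, nbytes):
        """Record one request, then drop anything older than the window."""
        self.events.append((timestamp, host, nbytes))
        self.requests[host] += 1
        self.bytes_sent[host] += nbytes
        self._expire(timestamp)

    def _expire(self, now):
        # Assumes timestamps never go backwards, so the oldest event is first.
        while self.events and self.events[0][0] <= now - self.window:
            _, host, nbytes = self.events.popleft()
            self.requests[host] -= 1
            self.bytes_sent[host] -= nbytes
            if self.requests[host] == 0:
                del self.requests[host]
                del self.bytes_sent[host]


totals = WindowedTotals(window_seconds=300)
totals.add(0, "host1.companyA.com", 3)
totals.add(120, "devel.internal", 208)
print(dict(totals.requests))
```

Feeding in an old log all at once works the same way as tailing a live one, since expiry is driven by the timestamps in the log rather than by the wall clock.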

It's not pong, but I've found it useful.

You can download the code from the project page on GitHub.

Wednesday, February 17, 2010

Group filters for Gmail

I've recently spent some time trying to make my Gmail filters more useful. I was frustrated (as are some other people) that it's not possible to refer to contact groups in email filters.

Rather than laboriously duplicate my contact groups as hand-written filter expressions, I have written a short script that will query Google Contacts for all of your contact groups and then emit an XML document suitable for import using Gmail's "import filters" feature.
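To give a feel for the output side of this, here is a small sketch that turns a dictionary of groups into a filter document. It is not the gmail-contact-filters code: it skips the Google Contacts query entirely, the `filters_for_groups` function is a name I made up, and the element and property names reflect my understanding of what Gmail's own filter export looks like, so treat them as assumptions and compare against a file exported from your own account.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
APPS = "http://schemas.google.com/apps/2006"
ET.register_namespace("", ATOM)
ET.register_namespace("apps", APPS)


def filters_for_groups(groups):
    """Build a filter feed from {group name: [member addresses]}.

    Each group becomes one filter that matches mail from any member
    and applies a label named after the group.
    """
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = "Mail Filters"
    for name, addresses in sorted(groups.items()):
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}category", {"term": "filter"})
        ET.SubElement(entry, f"{{{ATOM}}}title").text = "Mail Filter"
        ET.SubElement(entry, f"{{{ATOM}}}content")
        ET.SubElement(entry, f"{{{APPS}}}property",
                      {"name": "from", "value": " OR ".join(addresses)})
        ET.SubElement(entry, f"{{{APPS}}}property",
                      {"name": "label", "value": name})
    return ET.tostring(feed, encoding="unicode")


# Hypothetical group data standing in for a Google Contacts query.
print(filters_for_groups({"Family": ["a@example.com", "b@example.com"]}))
```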

The code is available in the gmail-contact-filters project on GitHub.

If I am sufficiently motivated I may turn this into a web service.

Tuesday, February 16, 2010

Merging directories with OpenLDAP's Meta backend

This document provides an example of using OpenLDAP's meta backend to provide a unified view of two distinct LDAP directory trees. I was frustrated by the lack of simple examples available when I went looking for information on this topic, so this is my attempt to make life easier for the next person looking to do the same thing.

The particular use case that motivated my interest in this topic was the need to configure web applications to (a) authenticate against an existing Active Directory server while (b) also allowing new accounts to be provisioned quickly and without granting any access in the AD environment. A complicating factor is that the group managing the AD server(s) was not the group implementing the web applications.

Assumptions

I'm making several assumptions while writing this document:

  • You have root access on your system and are able to modify files in /etc/openldap and elsewhere on the filesystem.
  • You are somewhat familiar with LDAP.
  • You are somewhat familiar with OpenLDAP.

Set up backend directories

Configure slapd

We'll first create two "backend" LDAP directories. These will represent the directories you're trying to merge. For the purposes of this example we'll use the ldif backend, which stores data in LDIF format on the filesystem. This is great for testing (it's simple and easy to understand), but not so great for performance.

We define one backend like this in /etc/openldap/slapd-be-1.conf:

database        ldif
suffix          "ou=backend1"
directory       "/var/lib/ldap/backend1"
rootdn          "cn=ldif-admin,ou=backend1"
rootpw          "LDIF"

And a second backend like this in /etc/openldap/slapd-be-2.conf:

database        ldif
suffix          "ou=backend2"
directory       "/var/lib/ldap/backend2"
rootdn          "cn=ldif-admin,ou=backend2"
rootpw          "LDIF"

Now, we need to load these configs into the main slapd configuration file. Open slapd.conf, and look for the following comment:

#######################################################################
# ldbm and/or bdb database definitions
#######################################################################

Remove anything below this comment and then add the following lines:

include /etc/openldap/slapd-be-1.conf
include /etc/openldap/slapd-be-2.conf

Start up slapd

Start up your LDAP service:

# slapd -f slapd.conf -h ldap://localhost/

And check to make sure it's running:

# ps -fe | grep slapd
root 15087 1 0 22:48 ? 00:00:00 slapd -f slapd.conf -h ldap://localhost/

Populate backends with sample data

We need to populate the directories with something to query.

Put this in backend1.ldif:

dn: ou=backend1
objectClass: top
objectClass: organizationalUnit
ou: backend1

dn: ou=people,ou=backend1
objectClass: top
objectClass: organizationalUnit
ou: people

dn: cn=user1,ou=people,ou=backend1
objectClass: inetOrgPerson
cn: user1
givenName: user1
sn: Somebodyson
mail: user1@example.com

And this in backend2.ldif:

dn: ou=backend2
objectClass: top
objectClass: organizationalUnit
ou: backend2

dn: ou=people,ou=backend2
objectClass: top
objectClass: organizationalUnit
ou: people

dn: cn=user2,ou=people,ou=backend2
objectClass: inetOrgPerson
cn: user2
givenName: user2
sn: Somebodyson
mail: user2@example.com

And then load the data into the backends:

ldapadd -x -H ldap://localhost -D cn=ldif-admin,ou=backend1 \
  -w LDIF -f backend1.ldif

And:

ldapadd -x -H ldap://localhost -D cn=ldif-admin,ou=backend2 \
  -w LDIF -f backend2.ldif

You can verify that the data loaded correctly by issuing a query to the backends. E.g.:

ldapsearch -x -H ldap://localhost -b ou=backend1 -LLL

This should give you something that looks very much like the contents of backend1.ldif. You can do the same thing for backend2.

Set up meta database

We're now going to configure OpenLDAP's meta backend to merge the two directory trees. Complete documentation for the meta backend can be found in the slapd-meta man page.

Put the following into a file called slapd-frontend.conf (we'll discuss the details in a moment):

database        meta
suffix          "dc=example,dc=com"

uri             "ldap://localhost/ou=backend1,dc=example,dc=com"
suffixmassage   "ou=backend1,dc=example,dc=com" "ou=backend1"

uri             "ldap://localhost/ou=backend2,dc=example,dc=com"
suffixmassage   "ou=backend2,dc=example,dc=com" "ou=backend2"

And then add to slapd.conf:

include /etc/openldap/slapd-frontend.conf

Restart slapd. Let's do a quick search to see exactly what we've accomplished:

$ ldapsearch -x -H 'ldap://localhost/' \
  -b dc=example,dc=com objectclass=inetOrgPerson -LLL
dn: cn=user1,ou=people,ou=backend1,dc=example,dc=com
objectClass: inetOrgPerson
cn: user1
givenName: user1
sn: Somebodyson
mail: user1@example.com

dn: cn=user2,ou=people,ou=backend2,dc=example,dc=com
objectClass: inetOrgPerson
cn: user2
givenName: user2
sn: Somebodyson
mail: user2@example.com

As you can see from the output above, a single query is now returning results from both backends, merged into the dc=example,dc=com hierarchy.

A closer look

Let's take a closer look at the meta backend configuration.

database        meta
suffix          "dc=example,dc=com"

The database statement begins a new database definition. The suffix statement identifies the namespace that will be served by this particular database.

Here is the proxy for backend1 (the entry for backend2 is virtually identical):

uri             "ldap://localhost/ou=backend1,dc=example,dc=com"
suffixmassage   "ou=backend1,dc=example,dc=com" "ou=backend1"

The uri statement defines the host (and port) serving the target directory tree. The full syntax of the uri statement is described in the slapd-meta man page; what we have here is a very simple example. The naming context of the URI must fall within the namespace defined in the suffix statement at the beginning of the database definition.

The suffixmassage statement performs simple rewriting on distinguished names. It directs slapd to replace ou=backend1,dc=example,dc=com with ou=backend1 when communicating with the backend directory (and vice-versa).
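If the rewriting is hard to picture, the following Python function models it with plain string handling. It is only an illustration of the idea, not how slapd does it: real DN comparison involves normalization and escaping rules that this sketch ignores, and `massage_suffix` is a name I invented.

```python
def massage_suffix(dn, from_suffix, to_suffix):
    """Replace from_suffix at the end of dn with to_suffix.

    Returns dn unchanged when it does not fall under from_suffix.
    Matching is case-insensitive but otherwise naive.
    """
    dn_l, from_l = dn.lower(), from_suffix.lower()
    if dn_l == from_l:
        return to_suffix
    if dn_l.endswith("," + from_l):
        return dn[: len(dn) - len(from_suffix)] + to_suffix
    return dn


VIRTUAL = "ou=backend1,dc=example,dc=com"   # what clients of the meta database see
REAL = "ou=backend1"                        # what the backend directory uses

# Request going to the backend:
print(massage_suffix("cn=user1,ou=people,ou=backend1,dc=example,dc=com", VIRTUAL, REAL))
# Result coming back from the backend:
print(massage_suffix("cn=user1,ou=people,ou=backend1", REAL, VIRTUAL))
```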

You can perform simple rewriting of attributes and object classes with the map statement. For example, if backend1 used a sAMAccountName attribute and our application was expecting a uid attribute, we could add this after the suffixmassage statement:

map attribute uid sAMAccountName
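From the application's point of view, the effect on a returned entry is roughly a key rename. A tiny Python model, with a made-up `to_local` helper and made-up entry data, might look like this (it says nothing about how slapd implements the mapping):

```python
def to_local(entry, foreign_to_local):
    """Rename attributes in an entry dict; names compare case-insensitively."""
    lookup = {name.lower(): local for name, local in foreign_to_local.items()}
    return {lookup.get(name.lower(), name): values for name, values in entry.items()}


# Hypothetical entry as the backend might return it.
backend_entry = {"cn": ["user1"], "sAMAccountName": ["user1"]}
print(to_local(backend_entry, {"sAMAccountName": "uid"}))
```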

Conclusion

The sample configuration files, data, and code referenced in this post are available online in a github repository:

http://github.com/larsks/OpenLDAP-Metadirectory-Example

I hope you've found this post useful, or at least informative. If you have any comments or questions regarding this post, please log them as issues on GitHub. This will make them easier for me to track.

Sunday, February 14, 2010

Review: AppBrain Market Sync

AppBrain is a web site that lets you browse applications available from the Android Market. AppBrain Market Sync is a companion application that synchronizes your Android phone with your activities on the web site. This means you can:

  • Select multiple applications to install on the website, then use AppBrain Market Sync on your phone to install them in one operation.
  • Select multiple applications to uninstall on the website, then use AppBrain Market Sync to uninstall them.

I like the model -- I find it easier to browse the market from my laptop than on my phone, so this is generally a more convenient way to find applications. The fact that applications are actually installed from the Market avoids the security/reliability questions involved when installing applications from "alternative" sources.

http://chart.apis.google.com/chart?cht=qr&chs=150x150&chl=market://search?q=pname:com.appspot.swisscodemonkeys.apps

Review: ShapeWriter Input Method

http://lh6.ggpht.com/_JkU-TzQ-Co0/S3iqD3gu5gI/AAAAAAAAAE0/gg69njeoINE/s512/shapewriter.jpg

ShapeWriter is a novel input mechanism for mobile devices that uses the shape you create while sliding your fingers from letter to letter to accurately recognize words. Instead of pecking out letters on a virtual keyboard, you slide your finger (or thumb, more likely) from letter to letter in a smooth, continuous motion. While on the iPhone it is packaged as a note-taking application, on Android devices it is implemented as an input method, which means it is available pretty much anywhere you can enter text.

I've been using ShapeWriter on my Droid for several days now, and I've come to like it quite a bit. Its advantages become increasingly apparent on longer words: while mistakes can add up while "thumb typing" with a regular keyboard, the unique shape formed as you slide along the letters in longer words greatly increases ShapeWriter's accuracy. ShapeWriter doesn't attempt to automatically capitalize words; a "cycle" button on the right of the keyboard cycles between all lower case, first letter capital, or all caps.

Since our fingers are seldom perfectly accurate, ShapeWriter uses a dictionary of words to help disambiguate among multiple possibilities. This is usually useful, although occasionally it makes inexplicably bad choices.

I've found that, overall, ShapeWriter offers increased speed and accuracy over the default virtual keyboard.

http://chart.apis.google.com/chart?cht=qr&chs=150x150&chl=market://search?q=pname:com.shapewriter.android.softkeyboard

Wednesday, February 10, 2010

Ksplice for RedHat

Ksplice, which offers "rebootless" kernel upgrades by performing in-memory patches to your running kernel, now offers their services for RHEL 4 & 5 and the equivalent CentOS variants.

Ksplice was originally developed at MIT, and was later turned into a for-profit venture. Ksplice has raised a small amount of controversy on the linux-kernel mailing list because it is, after all, patching your running kernel. In memory. This makes people a little nervous, but the author, Jeff Arnold, has tried to address many of the concerns.

I think the technology looks interesting. Certainly in the case of known kernel exploits it seems like it offers a happy medium between downtime for maintenance vs. not patching the system at all.

Filtering Blogger feeds

After encountering a number of problems trying to filter Blogger feeds by tag (using services like Feedrinse and Yahoo Pipes), I've finally put together a solution that works:

  • Shadow the feed with Feedburner.
  • Enable the Convert Format Burner, and convert your feed to RSS 2.0.
  • Use Yahoo Pipes to filter the feed (because Feedrinse seems to be broken).

This let me create a feed that excluded all my posts containing the fbpost tag, thus allowing me to avoid yet another postgasm in Facebook when adding a new import URL to notes.
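For anyone who would rather do the filtering step locally than in Yahoo Pipes, the same idea is easy to sketch with the Python standard library. This assumes the feed has already been converted to RSS 2.0 and that the tags show up as category elements on each item; the `drop_items_with_category` function and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET


def drop_items_with_category(rss_xml, unwanted):
    """Return the RSS document with items carrying the unwanted category removed."""
    root = ET.fromstring(rss_xml)
    for channel in root.iter("channel"):
        # Copy the list first, since we remove items while looping.
        for item in list(channel.findall("item")):
            categories = {c.text for c in item.findall("category")}
            if unwanted in categories:
                channel.remove(item)
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-item feed.
SAMPLE = """<rss version="2.0"><channel><title>Example</title>
<item><title>Keep me</title><category>linux</category></item>
<item><title>Skip me</title><category>fbpost</category></item>
</channel></rss>"""

print(drop_items_with_category(SAMPLE, "fbpost"))
```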

While fiddling with this I came across this article that discusses a number of tools (some no longer available) for processing RSS feeds.

Monday, February 8, 2010

Funny usage message

I was poking around in a command shell on my Droid to see what was available. While it's a pretty restricted environment, there are a number of commands available in /system/bin, including dexopt.

Apparently dexopt isn't something I'm supposed to poke at:

$ dexopt
Usage: don't use this

Hah.

Review: RealCalc calculator for Android

http://lh5.ggpht.com/_JkU-TzQ-Co0/S3BUMQb769I/AAAAAAAAAD8/XxzILqPo14M/realcalc_infix.jpg

RealCalc is a scientific calculator for your Android powered phone. It offers a well designed interface that looks pretty much like what you would expect from a scientific calculator. The keys are large enough to not feel cramped.

I spent my college years with an HP-48 calculator, which means I learned to love RPN notation. RealCalc has an RPN mode that you can turn on from the settings screen:

http://lh5.ggpht.com/_JkU-TzQ-Co0/S3BUMSC_u0I/AAAAAAAAAEA/vgcSmrIVVic/realcalc_rpn.jpg

It's a substantial improvement on the native Calculator application. My only complaint is that, once installed, I have two applications named "Calculator" and no easy way to differentiate them.

RealCalc is available for free from the Market.

http://chart.apis.google.com/chart?cht=qr&chs=150x150&chl=market://search?q=pname:uk.co.nickfines.RealCalc

Review: Bedside clock for Android

http://lh3.ggpht.com/_JkU-TzQ-Co0/S3BUMIoec5I/AAAAAAAAAD4/AAEOYQsz65c/bedtime_landscape.jpg

Bedside is a simple nighttime clock for your Android based phone. What differentiates it from other similar applications are its elegant implementation and one or two particularly useful features that really make it stand apart.

My favorite feature is the simple fact that it can override your phone's default lock screen. With this option enabled, I can turn off my phone, and turn it on again later and still see the pleasantly dim clock, rather than the full brightness of the normal "slide the lock to use the phone" screen. This behavior is exactly what I want in something I'm using as a nighttime clock.

Bedside can also make sure your phone is silent when the clock is activated. Again, this seems like a no-brainer, at least as an option.

There are a number of other useful features:

  • Bedside works great in both portrait and landscape mode.
  • You can select different options for font and color.
  • There is a text-to-speech option; when enabled, a long press on the screen will speak the time.

Bedside costs $1.79 from the Market.

http://chart.apis.google.com/chart?cht=qr&chs=150x150&chl=market://search?q=pname:net.geekherd.bedsidepro2

Sunday, February 7, 2010

MBTA realtime XML feed

The MBTA has a trial web service interface that provides access to realtime location information for select MBTA buses, as well as access to route information, arrival prediction, and other features. More information can be found here:

http://www.eot.state.ma.us/developers/realtime/

The service is provided by NextBus, which specializes in real-time location information for public transit organizations. The API (sorry, PDF) is very simple and does not require any sort of advance registration.

At the moment, the service only provides coverage for a small number of routes (39, 111, 114, 116, 117). I hope they expand the coverage of this service in the near future!

Thursday, February 4, 2010

Blocking VNC with iptables

VNC clients use the RFB protocol to provide virtual display capabilities. The RFB protocol, as implemented by most clients, provides very poor authentication options. While passwords are not actually sent "in the clear", it is possible to brute force them based on information available on the wire. The RFB 3.x protocol limits passwords to a maximum of eight characters, so the potential key space is relatively small.
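To put a rough number on "relatively small", here is some back-of-the-envelope arithmetic in Python. It assumes passwords drawn from the 95 printable ASCII characters, and the guess rate is a purely hypothetical figure chosen to make the division easy, not a measurement of any real cracking tool.

```python
def keyspace(alphabet_size, max_length):
    """Number of possible passwords of length 1..max_length."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


PRINTABLE_ASCII = 95                 # assumption: printable ASCII passwords only
GUESSES_PER_SECOND = 1_000_000       # hypothetical attacker speed

total = keyspace(PRINTABLE_ASCII, 8)
days = total / GUESSES_PER_SECOND / 86_400
print(f"passwords of up to 8 printable characters: {total:.3e}")
print(f"days to try them all at {GUESSES_PER_SECOND:,} guesses/second: {days:,.0f}")
```

Real passwords are rarely random, so a dictionary-driven search would typically cover the likely candidates far sooner than the exhaustive figure suggests.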

It's possible to securely connect to a remote VNC server by tunneling your connection using ssh port forwarding (or setting up some sort of SSL proxy). However, while this ameliorates the password problem, it still leaves a VNC server running that, depending on the local system configuration, may accept connections from all over the world. This leaves open the possibility that someone could brute force the password and gain access to the system. The problem is exacerbated if a user is running a passwordless VNC session.

My colleague and I looked into the options for blocking VNC connections using layer 7 packet classification. This means identifying the protocol in use by inspecting packet payloads, rather than relying exclusively on port numbers (this prevents clever or malicious users from circumventing the restrictions by running a service on a non-standard port). Unfortunately, the actual l7 netfilter module is not available in CentOS (or Fedora). But wait, all is not lost!

First, a brief digression into the RFB protocol used by VNC. After completing a standard TCP handshake, the client and server engage in an RFB handshake. The server first sends the string "RFB " followed by the RFB protocol version supported by the server. The client responds with a similar message.

The initial handshake packet from the server:

0000  00 00 0c 07 ac 34 00 21 86 14 e8 aa 08 00 45 00   .....4.!......E.
0010  00 40 e8 b7 40 00 40 06 b6 51 8c f7 34 e0 62 76   .@..@.@..Q..4.bv
0020  77 61 17 0d da ad ae 06 16 3f 22 48 92 cc 80 18   wa.......?"H....
0030  00 5b 9b e1 00 00 01 01 08 0a e8 b1 fe 88 24 f1   .[............$.
0040  e3 56 52 46 42 20 30 30 33 2e 30 30 38 0a         .VRFB 003.008.

And the response from the client:

0000  00 21 86 14 e8 aa 00 1a 30 4d 0c 00 08 00 45 40   .!......0M....E@
0010  00 40 e7 15 40 00 34 06 c3 b3 62 76 77 61 8c f7   .@..@.4...bvwa..
0020  34 e0 da ad 17 0d 22 48 92 cc ae 06 16 4b 80 18   4....."H.....K..
0030  ff ff 20 56 00 00 01 01 08 0a 24 f1 e3 57 e8 b1   .. V......$..W..
0040  fe 88 52 46 42 20 30 30 33 2e 30 30 38 0a         ..RFB 003.008.

Ergo: if we can match the string "RFB " at the beginning of the TCP payload on inbound packets, we have a reliable way of blocking VNC packets regardless of port.
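The last twelve bytes of each dump are the handshake message itself. As a quick sanity check, this Python snippet decodes those bytes and pulls out the version numbers; the `parse_rfb_version` helper is my own name, and it only handles the "RFB xxx.yyy" form seen in the captures above.

```python
import re


def parse_rfb_version(payload):
    """Return (major, minor) from an RFB version message, or None if it isn't one."""
    match = re.fullmatch(rb"RFB (\d{3})\.(\d{3})\n", payload)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Payload bytes copied from the end of the packet dumps above.
handshake = bytes.fromhex("52 46 42 20 30 30 33 2e 30 30 38 0a")
print(handshake)                     # b'RFB 003.008\n'
print(parse_rfb_version(handshake))  # (3, 8)
```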

Looking through the iptables man page, we find:

u32
    U32  tests  whether quantities of up to 4 bytes extracted from
    a packet have specified values. The specification of what to
    extract is  general enough to find data at given offsets from
    tcp headers or payloads.

This looks especially appropriate, since our target match is exactly four bytes. Unfortunately, the syntax of the u32 module is a little baroque:

Example:

       match IP packets with total length >= 256
       The IP header contains a total length field in bytes 2-3.

       --u32 "0 & 0xFFFF = 0x100:0xFFFF"

Fortunately, the internet is our friend:

http://www.stearns.org/doc/iptables-u32.v0.1.7.html

This document provides a number of recipes designed for use with the u32 module, including one that matches content at the beginning of the TCP payload. This gives us, ultimately:

iptables -A INPUT -p tcp \
  -m connbytes --connbytes 0:1024 \
    --connbytes-dir both --connbytes-mode bytes \
  -m state --state ESTABLISHED \
  -m u32 --u32 "0>>22&0x3C@ 12>>26&0x3C@ 0=0x52464220" \
  -j REJECT --reject-with tcp-reset

This means:

  • Match tcp packets only (-p tcp)
  • Match only during the first 1024 bytes of the connection (-m connbytes --connbytes 0:1024 --connbytes-dir both --connbytes-mode bytes)
  • Match only ESTABLISHED connections (-m state --state ESTABLISHED)
  • Match bytes "0x52464220" ("RFB ") at the beginning of the TCP payload (-m u32 --u32 "0>>22&0x3C@ 12>>26&0x3C@ 0=0x52464220")
  • Upon a match, force-close the connection with a RST packet. (-j REJECT --reject-with tcp-reset)
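The u32 expression is easier to follow when you walk it through a real packet. The Python below replays its three steps against the server handshake packet from the dump earlier in this post. It is a sketch of my reading of the expression, not of the kernel module: I'm assuming the dump starts with a 14-byte Ethernet header and that u32 offsets count from the start of the IP header, and the function names are my own.

```python
import struct

# Server handshake frame, copied from the hex dump earlier in this post.
FRAME = bytes.fromhex(
    "00 00 0c 07 ac 34 00 21 86 14 e8 aa 08 00 45 00 "
    "00 40 e8 b7 40 00 40 06 b6 51 8c f7 34 e0 62 76 "
    "77 61 17 0d da ad ae 06 16 3f 22 48 92 cc 80 18 "
    "00 5b 9b e1 00 00 01 01 08 0a e8 b1 fe 88 24 f1 "
    "e3 56 52 46 42 20 30 30 33 2e 30 30 38 0a"
)
IP_PACKET = FRAME[14:]  # assumption: skip a 14-byte Ethernet header


def u32(packet, offset):
    """Read a big-endian 32-bit value at the given byte offset."""
    return struct.unpack_from("!I", packet, offset)[0]


def header_lengths(packet):
    """Return (IP header length, TCP header length) in bytes."""
    ip_len = (u32(packet, 0) >> 22) & 0x3C              # "0>>22&0x3C@"
    tcp_len = (u32(packet, ip_len + 12) >> 26) & 0x3C   # "12>>26&0x3C@"
    return ip_len, tcp_len


def matches_rfb(packet):
    """True if the first four payload bytes are 'RFB '."""
    ip_len, tcp_len = header_lengths(packet)
    return u32(packet, ip_len + tcp_len) == 0x52464220  # "0=0x52464220"


print(header_lengths(IP_PACKET))
print(matches_rfb(IP_PACKET))
```

Each "@" in the original expression moves the base offset forward by the value just computed, which is why the second read happens at `ip_len + 12` and the final comparison at `ip_len + tcp_len`.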

With this rule in place, all unencrypted VNC connections will be forcefully disconnected by the server.

Our original plan had been to try redirecting VNC traffic so that we could display a big "DON'T DO THAT" message, but this isn't possible -- by the time we match the client payload, the connection has already been established and is not amenable to redirection.

Update

We modified this rule to use the iptables string module to make the match more specific to further reduce the chances of false positives. The rule now looks like this:

iptables -A INPUT -p tcp \
  -m connbytes --connbytes 0:1024 \
    --connbytes-dir both --connbytes-mode bytes \
  -m state --state ESTABLISHED \
  -m u32 --u32 "0>>22&0x3C@ 12>>26&0x3C@ 0=0x52464220" \
  -m string --algo kmp --string "RFB 003." --to 130 \
  -j REJECT --reject-with tcp-reset

We thought about using the string module exclusively, but unlike the u32 module, the string module cannot anchor its match to the beginning of the TCP payload (since the IP and TCP headers may both be variable length).

GitBlogger is now on Freshmeat

I took a few minutes today to create a Freshmeat project for GitBlogger. It remains to be seen if anyone other than me thinks this is a good idea :).

Tuesday, February 2, 2010

NFS and the 16-group limit

I learned something new today: it appears that the underlying authorization mechanism used by NFS limits your group membership to 16 groups. From http://bit.ly/cBhU8N:

NFS is built on ONC RPC (Sun RPC). NFS depends on RPC for authentication and identification of users. Most NFS deployments use an RPC authentication flavor called AUTH_SYS (originally called AUTH_UNIX, but renamed to AUTH_SYS).

AUTH_SYS sends 3 important things:

  • A 32 bit numeric user identifier (what you'd see in the UNIX /etc/passwd file)
  • A 32 bit primary numeric group identifier (ditto)
  • A variable length list of up to 16 32-bit numeric supplemental group identifiers (what'd you see in the /etc/group file)

We ran into this today while diagnosing a weird permissions issue. Who knew?
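The symptom is easy to model. In the Python sketch below, the server only ever learns about the primary group plus the first 16 supplementary groups, so membership in anything past that point silently doesn't count. Which 16 groups a given client actually sends is an assumption on my part (I'm taking the first 16 in list order), and the function names and gid values are made up.

```python
AUTH_SYS_MAX_GROUPS = 16


def groups_seen_by_server(primary_gid, supplementary_gids):
    """Groups carried in the credential: primary plus at most 16 supplementary."""
    return {primary_gid, *supplementary_gids[:AUTH_SYS_MAX_GROUPS]}


def has_group_access(file_gid, primary_gid, supplementary_gids):
    """Would a group-permission check on the server side succeed?"""
    return file_gid in groups_seen_by_server(primary_gid, supplementary_gids)


# Hypothetical user in 20 supplementary groups (gids 1000-1019).
supplementary = list(range(1000, 1020))
print(has_group_access(1015, 100, supplementary))  # 16th group: still sent
print(has_group_access(1016, 100, supplementary))  # 17th group: dropped
```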