Thursday, December 13, 2012

Multiple interfaces on the same network using the same NIC for communication


Multiple interfaces on the same subnet

In the Linux implementation of the IP stack, an IP address belongs to the host even though the administrator configures it on a device. This can cause somewhat unexpected behaviour when multiple interfaces are configured on the same network.

The network

 {network A}
            \       +--------------+
             -(eth0)| Linux server |
                    +--------------+
                     (eth2)  (eth3)
                       |       | 
                      {Network B}
When multiple interfaces are configured for the same network you must use policy routing to make the IP stack route the packets out on the designated interface. This is done with the "ip route" and "ip rule" commands.

Prerequisites

The following options must be enabled in the kernel.
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_FWMARK=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_NETLINK_DEV=y
You also need the iproute suite, also known as iproute2. In Debian (and Debian derivatives) this is the iproute package.

Example configuration for two interfaces on the same IP subnet


In Debian (and Debian derivatives) the easiest way to add the additional routes on start-up is to use the up option in /etc/network/interfaces.
auto eth2
iface eth2 inet static
   address 192.168.1.20
   network 192.168.1.0
   netmask 255.255.255.0
   broadcast 192.168.1.255
   up ip route add 192.168.1.0/24 dev eth2 proto kernel scope link src 192.168.1.20 table 20
   up ip route add default via 192.168.1.1 dev eth2 table 20
   up ip rule add from 192.168.1.20 lookup 20
 
auto eth3
iface eth3 inet static
   address 192.168.1.21
   network 192.168.1.0
   netmask 255.255.255.0
   broadcast 192.168.1.255
   up ip route add 192.168.1.0/24 dev eth3 proto kernel scope link src 192.168.1.21 table 30
   up ip route add default via 192.168.1.1 dev eth3 table 30
   up ip rule add from 192.168.1.21 lookup 30
Note: The table id is just a positive integer in the range 0-255 that identifies a unique routing table. When setting up multiple interfaces on the same subnet, this id needs to be unique for each interface; in the example the interface number times 10 is used. Table ids 0 and 253-255 are reserved for internal use and may not be used for this configuration.
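Once the interfaces are up, you can check that the rules and per-interface tables are in place with:
ip rule show
ip route show table 20
ip route show table 30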
For more information about advanced Linux routing, please read the Linux Advanced Routing & Traffic Control HOWTO.
Book tip: "Linux Network Internals"


another example:

Multiple interfaces on the same subnet on the same machine can work fine. 
You need to use advanced routing concepts together with the arp_filter option (see the note after the code below): make a table for each interface and configure default routes and rule lookups. The following lines solved my problem.

Code:
ip route add 10.209.192.0/22 dev eth1 proto kernel scope link src 10.209.193.131 table tlb_1
ip route add default via 10.209.192.1 dev eth1 table tlb_1
ip rule add from 10.209.193.131 lookup tlb_1
ip route flush cache
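Note that a named table like tlb_1 has to be declared in /etc/iproute2/rt_tables first, and the arp_filter option mentioned above is a sysctl. A minimal sketch (the table number 100 is arbitrary; any unused id works):
echo "100 tlb_1" >> /etc/iproute2/rt_tables
sysctl -w net.ipv4.conf.all.arp_filter=1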

Thursday, November 29, 2012

install torque client

# packages live in /software/torque/4.1.2
cd /software/torque/4.1.2/
# check what the packages contain
./torque-package-mom-linux-x86_64.sh --listfiles
./torque-package-clients-linux-x86_64.sh --listfiles
# install the client package
sh ./torque-package-clients-linux-x86_64.sh --install --verbose
# install the trqauthd init script from the source tree
cd /software/src/torque-4.1.2/contrib/init.d
ls -l /etc/init.d | grep trq
cp -a trqauthd /etc/init.d/
service trqauthd start
# verify the client can talk to the server
qstat
pbsnodes -l
# start trqauthd at boot
chkconfig --add trqauthd
chkconfig --list | grep trqauthd
# make sure the torque server hostname resolves
vim /etc/hosts



// Modify /etc/hosts on the torque master, and add the new host to the server with "qmgr -c"
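A rough sketch of that qmgr step on the torque server (the node name and np value are placeholders, not from the original notes):
qmgr -c "create node compute01"
qmgr -c "set node compute01 np = 8"
pbsnodes -a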

Tuesday, November 20, 2012

specified unpartitioned disk sda in partition command

When getting this error message in a kickstart:
specified unpartitioned disk sda in partition command
It's possibly due to dmraid information still present on the disk from an earlier installation. Verify by running (press F2 after the kickstart halts):
$ dmraid -r -D /dev/sda
You should get information about the disk being a member of a RAID set.

The information can be stored in different locations on the hard drive; to my knowledge it's commonly at the end. If you don't want to look into this further, you can wipe the entire disk by running:
$ dd if=/dev/zero of=/dev/sda
It will take some time as the entire disk is being written to. If you want to take a more surgical approach, just erase the last few cylinders. First run fdisk:
$ fdisk -l
Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Then run the following command. Adapt the values for "bs" and "seek" to suit your actual hard drive; you get them from the fdisk output above. In the example below, I'm deleting the last 10 cylinders of the disk (91201 - 10 = 91191):
dd if=/dev/zero of=/dev/sda bs=8225280 seek=91191 count=10
Verify that the dmraid information is actually gone:
dmraid -r -D /dev/sda
The output should now be different from before, saying the disk is not a member. If so, you should be able to use it to kickstart on. 
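If your dmraid build supports it, a less drastic alternative to zeroing with dd is to let dmraid erase its own metadata:
$ dmraid -r -E /dev/sda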


Tuesday, October 30, 2012

Predefined Macros

There are predefined macros that are used by most compilers; a list can be found at the first link below, and GCC's own predefined macros at the second.

http://sourceforge.net/p/predef/wiki/OperatingSystems/

http://gcc.gnu.org/onlinedocs/cpp/Predefined-Macros.html


#ifdef _WIN64
   //define something for Windows (64-bit)
#elif _WIN32
   //define something for Windows (32-bit)
#elif __APPLE__
    #include "TargetConditionals.h"
    #if TARGET_IPHONE_SIMULATOR
        // iOS Simulator
    #elif TARGET_OS_IPHONE
        // iOS device
    #elif TARGET_OS_MAC
        // Other kinds of Mac OS
    #else
        // Unsupported platform
    #endif
#elif __linux
    // linux
#elif __unix // all unices not caught above
    // Unix
#elif __posix
    // POSIX
#endif
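The compiler-specific macros from the GCC link work the same way; for example, a version check (a small sketch, assuming a GCC-compatible compiler):
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))
    // GCC 4.6 or newer
#else
    // older GCC or a different compiler
#endif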

Thursday, October 11, 2012

centos 6.3 enter single user mode and fix lvm partition

1) enter grub.
2) add "single" as a kernel parameter.
3) boot into single user mode.
4) if the file system is read-only, do:
   mount -o remount,rw /
5) recreate the volume group:
   vgcreate your_vg_name /dev/sd*
6) create a logical volume that uses the full volume group:
   lvcreate -l 100%FREE -n new_lv_name your_vg_name
7) format the logical volume:
a, http://blog.tsunanet.net/2011/08/mkfsxfs-raid10-optimal-performance.html
b,http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199

// comment: the second method (the Autodesk one) is fast and stable.
Remember, if your controller has a BBU, you need to disable barriers;
just mount the xfs partition with "-o nobarrier" (see the sketch below). See the CentOS documentation for details.
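A minimal sketch of formatting and mounting the new volume with barriers disabled, using the placeholder names from the steps above (the /data mount point is just an example):
mkfs.xfs /dev/your_vg_name/new_lv_name
mkdir -p /data
mount -o nobarrier /dev/your_vg_name/new_lv_name /data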

reference:
1) http://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/LV_create.html

2:

Root your box, and mount LVM partitions

I was teaching a friend how to root a box by adding:

init=/bin/bash
to the kernel line in grub, and then wanted to show him how to install some apps from the command line. I had never done this with LVM partitions, and was surprised when I got the following error:

File based locking initialization failed
Doh! I forgot to remount root read/write:

mount -o remount,rw /
Finally, I was able to mount all my LVM partitions with:
lvm vgscan
lvm vgchange -ay
lvm lvs
mount -a
haha! I'm getting rusty :P

Thursday, September 6, 2012

centos kickstart configuration file example


Linux Documentation Sucks

Every time I try to lookup how to do something in Linux, I get a deluge of out of date, incomplete, and just plain wrong documentation. This is the PXE/Kickstart guide I wish I would have read before I wasted 3 days. Thanks for nothing, RedHat documentation team.

Outline of the steps

* Obtain installation media
* Create Kickstart config file
* Setup NFS server
* Obtain PXE bootloader
* Create PXE config file
* Setup TFTP server
* Setup DHCP server

Installation Media

I was installing CentOS 5.5/x86_64 during this process, so I downloaded the two DVD images via torrent onto my NFS server. My BitTorrent client created the directory CentOS-5.5-x86_64-bin-DVD with the files:
CentOS-5.5-x86_64-bin-DVD-1of2.iso  md5sum.txt      sha1sum.txt      sha256sum.txt
CentOS-5.5-x86_64-bin-DVD-2of2.iso  md5sum.txt.asc  sha1sum.txt.asc  sha256sum.txt.asc
I moved this directory to /share/images to make it available via NFS.

Next I mounted the first ISO file as a loop image and copied the initrd and kernel to my DHCP server:
$ sudo mount /share/images/CentOS-5.5-x86_64-bin-DVD/CentOS-5.5-x86_64-bin-DVD-1of2.iso /mnt/dvd/ -t iso9660 -o loop
$ scp /mnt/dvd/images/pxeboot/*i* root@dhcp-server:/tftpboot

Kickstart File

I created the directory /share/kickstart for Kickstart config files on my NFS server.

I created the Kickstart file (test64-ks) using a previous CentOS install as a basis, and editing it based on snippets I found scattered around the 'Web.
# Kickstart file automatically generated by anaconda.
# Modified substantially by chort

install
nfs --server 10.25.0.129 --dir /share/images/CentOS-5.5-x86_64-bin-DVD/
#url --url http://mirror.centos.org/centos/5.4/os/x86_64
lang en_US.UTF-8
keyboard us

# don't define more NICs than you have, the install will bomb if you do
network --device eth0 --onboot yes --bootproto static --ip 10.25.42.139 --netmask 255.255.0.0 --gateway 10.25.0.1 --nameserver 10.25.0.5
#network --device eth1 --onboot no --bootproto dhcp
#network --device eth2 --onboot no --bootproto dhcp
#network --device eth3 --onboot no --bootproto dhcp

# grab the hash from an account in /etc/shadow that has the password you want to use (or generate one; see the note after this file)
rootpw --iscrypted $1$fi0JeZ1p$Il0CxFxe0jqpNnkrOqC.0.
firewall --enabled --port=22:tcp
authconfig --enableshadow --enablemd5
selinux --disabled
timezone --utc America/Los_Angeles

bootloader --location=mbr --driveorder=sda
# The following is the partition information you requested
# Note that any partitions you deleted are not expressed
# here so unless you clear all partitions first, this is
# not guaranteed to work
clearpart --all --drives=sda
# 100MB /boot partition
part /boot --fstype ext3 --size=100 --ondisk=sda
# everything else goes to LVM
part pv.4 --size=0 --grow --ondisk=sda
volgroup VolGroup00 --pesize=32768 pv.4
# 2GB swap fs
logvol swap --fstype swap --name=LogVol01 --vgname=VolGroup00 --size=2048
# 5GB / fs
logvol / --fstype ext3 --name=LogVol00 --vgname=VolGroup00 --size=5120
# 10GB + remaining space for /opt fs
logvol /opt --fstype ext3 --name=LogVol02 --vgname=VolGroup00 --size=10240 --grow

%packages
@base
@core
@dialup
@editors
@text-internet
keyutils
trousers
fipscheck
device-mapper-multipath
bind
bind-chroot
bind-devel
caching-nameserver
compat-libstdc++-33
compat-glibc
gdb
ltrace
ntp
OpenIPMI-tools
screen
sendmail-cf
strace
sysstat
-bluez-utils

%post
/usr/bin/yum -y update >> /root/post_install.log 2>&1
/sbin/chkconfig --del bluetooth
/sbin/chkconfig --del cups
/sbin/chkconfig ntpd on
/sbin/chkconfig named on
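If you don't want to copy a hash out of /etc/shadow, one way to generate an MD5-crypt hash that works with --iscrypted is, for example:
$ openssl passwd -1
It prompts for the password and prints the hash to paste into the rootpw line.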

NFS Server

Make sure NFS is enabled:
$ for i in nfs nfslock portmap ; do sudo chkconfig --list $i ; done
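If any of them show up as off, something like this turns them on:
$ for i in nfs nfslock portmap ; do sudo chkconfig $i on ; done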

Edit /etc/exports to enable access to the share for the machines that will PXE boot:
# sample /etc/exports file
#/               master(rw) trusty(rw,no_root_squash)
#/projects       proj*.local.domain(rw)
#/usr            *.local.domain(ro) @trusted(rw)
#/home/joe       pc001(rw,all_squash,anonuid=150,anongid=100)
#/pub            (ro,insecure,all_squash)
#/pub            (ro,insecure,all_squash)

/share  *.bkeefer.se.example.com(ro,no_root_squash)

Restart the nfs service after editing /etc/exports:
$ sudo service nfs restart

Bootloader

Next, on the DHCP server, I grabbed the PXE bootloader from the syslinux package. You should be able to install this through yum:
$ sudo yum install syslinux

Copy the bootloader to the TFTP server directory:
$ sudo cp /usr/lib/syslinux/pxelinux.0 /tftpboot

Create the pxelinux.cfg directory in /tftpboot (sudo mkdir /tftpboot/pxelinux.cfg) and edit the default file inside it:
# You can have multiple kernels, if so name each with its version
# This configuration only has one possible kernel so I didn't rename it
default linux
label linux
  kernel vmlinuz
  append ksdevice=eth0 load_ramdisk=1 initrd=initrd.img network ks=nfs:10.25.0.129:/share/kickstart/test64-ks

TFTP Server

Configure the TFTP server by editing the /etc/xinetd.d/tftp file:
# default: off
# description: The tftp server serves files using the trivial file transfer \
# protocol.  The tftp protocol is often used to boot diskless \
# workstations, download configuration files to network-aware printers, \
# and to start the installation process for some operating systems.
service tftp
{
 socket_type  = dgram
 protocol  = udp
 wait   = yes
 user   = root
 server   = /usr/sbin/in.tftpd
 server_args  = -vvs /tftpboot
 disable   = no
 per_source  = 11
 cps   = 100 2
 flags   = IPv4
}
I changed "disable = yes" -> "disable = no" and "server_args = -s /tftpboot" -> "server_args = -vvs /tftpboot". xinetd probably doesn't need to be restarted, but I did any way:
$ sudo service xinetd restart
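One quick check that xinetd is now listening for TFTP (udp port 69):
$ sudo netstat -lnup | grep :69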

I had only a single machine to boot, so I used a fixed IP based on the Ethernet address. Make sure you edit /var/lib/dhcp.lease* to erase references to the MAC and restart dhcpd. Here's the /etc/dhcpd.conf:
shared-network SE-NET {

 subnet 10.25.42.0 netmask 255.255.255.0 {
  authoritative;
  allow booting;
  option routers   10.25.0.1;
  option subnet-mask  255.255.0.0;
  option domain-name  "bkeefer.se.example.com";
  option domain-name-servers 10.25.0.5;
  option time-offset  -28800;
  option ntp-servers  ntp.example.com;

  host test64 {
   hardware ethernet 00:0c:29:b3:81:99;
   fixed-address 10.25.42.139;
   next-server 10.25.0.5;
   filename "pxelinux.0";
  }
 }
}

I haven't had any luck with restarting dhcpd, so I do stop followed by start:
$ sudo service dhcpd stop && sudo service dhcpd start

Note that there are also forward and reverse DNS entries to match 10.25.42.139 to test64.bkeefer.se.example.com .

Final Step

At this point you should be able to edit the BIOS for the machine you're booting to make sure the network card is in the boot order (as long as there's no OS installed, it should boot off the NIC no matter where it is in the order).

Conclusion

There, was that so hard? You'd think with the hundreds of millions of dollars RedHat takes in every year they could afford to test their documentation, and maybe even write start-to-finish guides instead of disconnected snippets.


Wednesday, August 8, 2012

Perl CPAN install at user directory
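A common way to do this is to bootstrap cpanminus together with local::lib so modules land under your home directory; a minimal sketch (the ~/perl5 path and Some::Module are just examples):
curl -L http://cpanmin.us | perl - -l ~/perl5 App::cpanminus local::lib
eval $(perl -I ~/perl5/lib/perl5 -Mlocal::lib)
cpanm Some::Module
Put the eval line in ~/.bashrc so the environment is set up on every login.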

Thursday, July 26, 2012

REST


Something to be careful about when designing a RESTful API is the conflation of GET and POST, as if they were the same thing. It's easy to make this mistake with Django's function-based views and CherryPy's default dispatcher, although both frameworks now provide a way around this problem (class-based views and MethodDispatcher, respectively).
HTTP-verbs are very important in REST, and unless you're very careful about this, you'll end up falling into a REST anti-pattern.
Some frameworks that get it right are web.py, Flask and Bottle. When combined with the mimerender library (full disclosure: I wrote it), they allow you to write nice RESTful webservices:
import web
import json
from mimerender import mimerender

render_xml = lambda message: '<message>%s</message>'%message
render_json = lambda **args: json.dumps(args)
render_html = lambda message: '<html><body>%s</body></html>'%message
render_txt = lambda message: message

urls = (
    '/(.*)', 'greet'
)
app = web.application(urls, globals())
class greet:
    @mimerender(
        default = 'html',
        html = render_html,
        xml  = render_xml,
        json = render_json,
        txt  = render_txt
    )
    def GET(self, name):
        if not name: 
            name = 'world'
        return {'message': 'Hello, ' + name + '!'}
if __name__ == "__main__":
    app.run()
The service's logic is implemented only once, and the correct representation selection (Accept header) + dispatch to the proper render function (or template) is done in a tidy, transparent way.
$ curl localhost:8080/x
<html><body>Hello, x!</body></html>

$ curl -H "Accept: application/html" localhost:8080/x
<html><body>Hello, x!</body></html>

$ curl -H "Accept: application/xml" localhost:8080/x
<message>Hello, x!</message>

$ curl -H "Accept: application/json" localhost:8080/x
{'message':'Hello, x!'}

$ curl -H "Accept: text/plain" localhost:8080/x
Hello, x!
Update (April 2012): added information about Django's class-based views, CherryPy's MethodDispatcher, and the Flask and Bottle frameworks. None of these existed back when the question was originally asked.
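To keep the verbs from being conflated (the issue raised at the top), the same resource class simply gets its own POST method; a rough sketch against the web.py example above (the web.data() call and the 201 status are my additions, not part of the original answer):
class greet:
    # GET as defined above ...
    def POST(self, name):
        payload = web.data()            # raw request body; parse it however your API requires
        web.ctx.status = '201 Created'  # report creation explicitly instead of a bare 200
        return ''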

REST anti-patterns


When people start trying out REST, they usually start looking around for examples – and not only find a lot of examples that claim to be “RESTful”, or are labeled as a “REST API”, but also dig up a lot of discussions about why a specific service that claims to do REST actually fails to do so.

The usual standard disclaimer applies: REST, the Web, and HTTP are not the same thing; REST could be implemented with many different technologies, and HTTP is just one concrete architecture that happens to follow the REST architectural style. So I should actually be careful to distinguish “REST” from “RESTful HTTP”. I’m not, so let’s just assume the two are the same for the remainder of this article.
Why does this happen? HTTP is nothing new, but it has been applied in a wide variety of ways. Some of them were in line with the ideas the Web’s designers had in mind, but many were not. Applying REST principles to your HTTP applications, whether you build them for human consumption, for use by another program, or both, means that you do the exact opposite: You try to use the Web “correctly”, or if you object to the idea that one is “right” and one is “wrong”: in a RESTful way. For many, this is indeed a very new approach.
As with any new approach, it helps to be aware of some common patterns. In the first two articles of this series, I’ve tried to outline some basic ones – such as the concept of collection resources, the mapping of calculation results to resources in their own right, or the use of syndication to model events. A future article will expand on these and other patterns. For this one, though, I want to focus on anti-patterns – typical examples of attempted RESTful HTTP usage that create problems and show that someone has attempted, but failed, to adopt REST ideas.
Let’s start with a quick list of anti-patterns I’ve managed to come up with:
  1. Tunneling everything through GET
  2. Tunneling everything through POST
  3. Ignoring caching
  4. Ignoring status codes
  5. Misusing cookies
  6. Forgetting hypermedia
  7. Ignoring MIME types
  8. Breaking self-descriptiveness
Let’s go through each of them in detail.

Tunneling everything through GET

To many people, REST simply means using HTTP to expose some application functionality. The fundamental and most important operation (strictly speaking, “verb” or “method” would be a better term) is an HTTP GET. A GET should retrieve a representation of a resource identified by a URI, but many, if not all existing HTTP libraries and server programming APIs make it extremely easy to view the URI not as a resource identifier, but as a convenient means to encode parameters. This leads to URIs like the following:
http://example.com/some-api?method=deleteCustomer&id=1234
The characters that make up a URI do not, in fact, tell you anything about the “RESTfulness” of a given system, but in this particular case, we can guess the GET will not be “safe”: The caller will likely be held responsible for the outcome (the deletion of a customer), although the spec says that GET is the wrong method to use for such cases.
The only thing in favor of this approach is that it’s very easy to program, and trivial to test from a browser – after all, you just need to paste a URI into your address bar, tweak some “parameters”, and off you go. The main problems with this anti-pattern are:
  1. Resources are not identified by URIs; rather, URIs are used to encode operations and their parameters
  2. The HTTP method does not necessarily match the semantics
  3. Such links are usually not intended to be bookmarked
  4. There is a risk that “crawlers” (e.g. from search engines such as Google) cause unintended side effects
Note that APIs that follow this anti-pattern might actually end up being accidentally restful. Here is an example:
http://example.com/some-api?method=findCustomer&id=1234
Is this a URI that identifies an operation and its parameters, or does it identify a resource? You could argue both cases: This might be a perfectly valid, bookmarkable URI; doing a GET on it might be “safe”; it might respond with different formats according to the Accept header, and support sophisticated caching. In many cases, this will be unintentional. Often, APIs start this way, exposing a “read” interface, but when developers start adding “write” functionality, you find out that the illusion breaks (it’s unlikely an update to a customer would occur via a PUT to this URI – the developer would probably create a new one).

Tunneling everything through POST

This anti-pattern is very similar to the first one, only that this time, the POST HTTP method is used. POST carries an entity body, not just a URI. A typical scenario uses a single URI to POST to, and varying messages to express differing intents. This is actually what SOAP 1.1 web services do when HTTP is used as a “transport protocol”: It’s actually the SOAP message, possibly including some WS-Addressing SOAP headers, that determines what happens.
One could argue that tunneling everything through POST shares all of the problems of the GET variant, it’s just a little harder to use and cannot exploit caching (not even accidentally), nor support bookmarking. It actually doesn’t end up violating any REST principles so much – it simply ignores them.

Ignoring caching

Even if you use the verbs as they are intended to be used, you can still easily ruin caching opportunities. The easiest way to do so is by simply including a header such as this one in your HTTP response:
Cache-control: no-cache
Doing so will simply prevent caches from caching anything. Of course this may be what you intend to do, but more often than not it’s just a default setting that’s specified in your web framework. However, supporting efficient caching and re-validation is one of the key benefits of using RESTful HTTP. Sam Ruby suggests that a key question to ask when assessing something’s RESTfulness is “do you support ETags”? (ETags are a mechanism introduced in HTTP 1.1 to allow a client to validate whether a cached representation is still valid, by means of a cryptographic checksum). The easiest way to generate correct headers is to delegate this task to a piece of infrastructure that “knows” how to do this correctly – for example, by generating a file in a directory served by a Web server such as Apache HTTPD.
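As a concrete illustration of ETag-based re-validation (the URI, ETag value and max-age here are made up):
GET /customers/1234 HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
ETag: "a7d3e9f2"
Cache-Control: max-age=600

...representation body...

When the cached copy goes stale, the client re-validates instead of downloading it again:
GET /customers/1234 HTTP/1.1
Host: example.com
If-None-Match: "a7d3e9f2"

HTTP/1.1 304 Not Modified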
Of course there’s a client side to this, too: when you implement a programmatic client for a RESTful service, you should actually exploit the caching capabilities that are available, and not unnecessarily retrieve a representation again. For example, the server might have sent the information that the representation is to be considered “fresh” for 600 seconds after a first retrieval (e.g. because a back-end system is polled only every 30 minutes). There is absolutely no point in repeatedly requesting the same information in a shorter period. Similarly to the server side of things, going with a proxy cache such as Squid on the client side might be a better option than building this logic yourself.
Caching in HTTP is powerful and complex; for a very good guide, turn to Mark Nottingham’s Cache Tutorial.

Ignoring status codes

Unknown to many Web developers, HTTP has a very rich set of application-level status codes for dealing with different scenarios. Most of us are familiar with 200 (“OK”), 404 (“Not found”), and 500 (“Internal server error”). But there are many more, and using them correctly means that clients and servers can communicate on a semantically richer level.
For example, a 201 (“Created”) response code signals that a new resource has been created, the URI of which can be found in a Location header in the response. A 409 (“Conflict”) informs the client that there is a conflict, e.g. when a PUT is used with data based on an older version of a resource. A 412 (“Precondition Failed”) says that the server couldn’t meet the client’s expectations.
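For example, a creating POST and its response could look like this (the URIs are made up):
POST /customers HTTP/1.1
Host: example.com
Content-Type: application/xml

<customer>...</customer>

HTTP/1.1 201 Created
Location: http://example.com/customers/1234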
Another aspect of using status codes correctly affects the client: The status codes in different classes (e.g. all in the 2xx range, all in the 5xx range) are supposed to be treated according to a common overall approach – e.g. a client should treat all 2xx codes as success indicators, even if it hasn’t been coded to handle the specific code that has been returned.
Many applications that claim to be RESTful return only 200 or 500, or even 200 only (with a failure text contained in the response body – again, see SOAP). If you want, you can call this “tunneling errors through status code 200”, but whatever you consider to be the right term: if you don’t exploit the rich application semantics of HTTP’s status codes, you’re missing an opportunity for increased re-use, better interoperability, and looser coupling.

Misusing cookies

Using cookies to propagate a key to some server-side session state is another REST anti-pattern.
Cookies are a sure sign that something is not RESTful. Right? No; not necessarily. One of the key ideas of REST is statelessness – not in the sense that a server can not store any data: it’s fine if there is resource state, or client state. It’s session state that is disallowed due to scalability, reliability and coupling reasons. The most typical use of cookies is to store a key that links to some server-side data structure that is kept in memory. This means that the cookie, which the browser passes along with each request, is used to establish conversational, or session, state.
If a cookie is used to store some information, such as an authentication token, that the server can validate without reliance on session state, cookies are perfectly RESTful – with one caveat: They shouldn’t be used to encode information that can be transferred by other, more standardized means (e.g. in the URI, some standard header or – in rare cases – in the message body). For example, it’s preferable to use HTTP authentication from a RESTful HTTP point of view.

Forgetting hypermedia

The first REST idea that’s hard to accept is the standard set of methods. REST theory doesn’t specify which methods make up the standard set, it just says there should be a limited set that is applicable to all resources. HTTP fixes them at GET, PUT, POST and DELETE (primarily, at least), and casting all of your application semantics into just these four verbs takes some getting used to. But once you’ve done that, people start using a subset of what actually makes up REST – a sort of Web-based CRUD (Create, Read, Update, Delete) architecture. Applications that expose this anti-pattern are not really “unRESTful” (if there even is such a thing), they just fail to exploit another of REST’s core concepts: hypermedia as the engine of application state.
Hypermedia, the concept of linking things together, is what makes the Web a web – a connected set of resources, where applications move from one state to the next by following links. That might sound a little esoteric, but in fact there are some valid reasons for following this principle.
The first indicator of the “Forgetting hypermedia” anti-pattern is the absence of links in representations. There is often a recipe for constructing URIs on the client side, but the client never follows links because the server simply doesn’t send any. A slightly better variant uses a mixture of URI construction and link following, where links typically represent relations in the underlying data model. But ideally, a client should have to know a single URI only; everything else – individual URIs, as well as recipes for constructing them e.g. in case of queries – should be communicated via hypermedia, as links within resource representations. A good example is the Atom Publishing Protocol with its notion of service documents, which offer named elements for each collection within the domain that it describes. Finally, the possible state transitions the application can go through should be communicated dynamically, and the client should be able to follow them with as little before-hand knowledge of them as possible. A good example of this is HTML, which contains enough information for the browser to offer a fully dynamic interface to the user.
I considered adding “human readable URIs” as another anti-pattern. I did not, because I like readable and “hackable” URIs as much as anybody. But when someone starts with REST, they often waste endless hours in discussions about the “correct” URI design, but totally forget the hypermedia aspect. So my advice would be to limit the time you spend on finding the perfect URI design (after all, they’re just strings), and invest some of that energy into finding good places to provide links within your representations.

Ignoring MIME types

HTTP’s notion of content negotiation allows a client to retrieve different representations of resources based on its needs. For example, a resource might have a representation in different formats such as XML, JSON, or YAML, for consumption by consumers implemented in Java, JavaScript, and Ruby respectively. Or there might be a “machine-readable” format such as XML in addition to a PDF or JPEG version for humans. Or it might support both the v1.1 and the v1.2 versions of some custom representation format. In any case, while there may be good reasons for having one representation format only, it’s often an indication of another missed opportunity.
It’s probably obvious that the more unforeseen clients are able to (re-)use a service, the better. For this reason, it’s much better to rely on existing, pre-defined, widely-known formats than to invent proprietary ones – an argument that leads to the last anti-pattern addressed in this article.

Breaking self-descriptiveness

This anti-pattern is so common that it’s visible in almost every REST application, even in those created by those who call themselves “RESTafarians” – myself included: breaking the constraint of self-descriptiveness (which is an ideal that has less to do with AI science fiction than one might think at first glance). Ideally, a message – an HTTP request or HTTP response, including headers and the body – should contain enough information for any generic client, server or intermediary to be able to process it. For example, when your browser retrieves some protected resource’s PDF representation, you can see how all of the existing agreements in terms of standards kick in: some HTTP authentication exchange takes place, there might be some caching and/or revalidation, the content-type header sent by the server (“application/pdf”) triggers the startup of the PDF viewer registered on your system, and finally you can read the PDF on your screen. Any other user in the world could use his or her own infrastructure to perform the same request. If the server developer adds another content type, any of the server’s clients (or service’s consumers) just need to make sure they have the appropriate viewer installed.
Every time you invent your own headers, formats, or protocols you break the self-descriptiveness constraint to a certain degree. If you want to take an extreme position, anything not being standardized by an official standards body breaks this constraint, and can be considered a case of this anti-pattern. In practice, you strive for following standards as much as possible, and accept that some convention might only apply in a smaller domain (e.g. your service and the clients specifically developed against it).

Summary

Ever since the “Gang of Four” published their book, which kick-started the patterns movement, many people misunderstood it and tried to apply as many patterns as possible – a notion that has been ridiculed for equally as long. Patterns should be applied if, and only if, they match the context. Similarly, one could religiously try to avoid all of the anti-patterns in any given domain. In many cases, there are good reasons for violating any rule, or in REST terminology: relax any particular constraint. It’s fine to do so – but it’s useful to be aware of the fact, and then make a more informed decision.
Hopefully, this article helps you to avoid some of the most common pitfalls when starting your first REST projects.
Many thanks to Javier Botana and Burkhard Neppert for feedback on a draft of this article.
Stefan Tilkov is the lead editor of InfoQ’s SOA community and co-founder, principal consultant and lead RESTafarian of Germany/Switzerland-based innoQ.