(Or "robot", "crawler") A program that automatically explores the web by retrieving a document and recursively retrieving some or all of the documents referenced in it. This contrasts with a normal web browser, operated by a human, which does not automatically follow links other than inline images and URL redirections.
The algorithm used to pick which references to follow depends strongly on the program's purpose. Index-building spiders usually retrieve a significant proportion of the references. At the other extreme are spiders that try to validate the references in a set of documents; these usually retrieve none of the links apart from redirections.
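The retrieve-and-follow loop can be sketched as a breadth-first traversal. The following is a minimal illustration, not any particular spider's implementation: the `fetch` argument stands in for an HTTP client (here it can be any function mapping a URL to HTML, so the sketch stays self-contained), and links are extracted with Python's standard HTML parser.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href targets of <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl from start_url.

    `fetch` maps a URL to its HTML source (or None on failure).
    Returns the URLs successfully retrieved, in visit order.
    A `seen` set prevents re-queuing a URL, so cycles of links
    do not cause infinite recursion.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            absolute = urljoin(url, href)  # resolve relative references
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

An index-building spider would enqueue most extracted links as above; a link-validating spider would instead check each reference's status without enqueuing it.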
The standard for robot exclusion lets a site tell spiders which parts of it they should not retrieve, and is designed to avoid some of the problems spiders can cause.
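A well-behaved spider consults the site's "/robots.txt" file before retrieving a URL. A minimal sketch using Python's standard `urllib.robotparser` (the rules and URLs below are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# In practice the spider would download http://<site>/robots.txt;
# here we parse an example exclusion file directly.
rules = """
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# The spider checks each candidate URL before fetching it.
print(rp.can_fetch("ExampleBot", "http://example.com/private/data.html"))
print(rp.can_fetch("ExampleBot", "http://example.com/public.html"))
```

`can_fetch` returns False for the excluded path and True otherwise, so the spider simply skips any URL for which it returns False.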