Over the weekend I was in a conversation where I needed to be specific about how “clusters” and “grids” differ.

The two terms are often used interchangeably in the big data world, so I figured – what the heck – I’ll knock up a post about it.

It’s actually pretty simple stuff:

Cluster – interconnected compute resources on the same physical network, made up of hardware nodes with similar physical characteristics and configuration. Microsoft people can think of Windows Server and SQL Server clusters, where the clustered machines (whether physical or virtual) are required to be of very similar types.

Grid – interconnected compute resources that are typically distributed geographically, and therefore unlikely to be on the same physical network, with widely varying characteristics and configurations. Techie geeks can think of the SETI@home project: many folks around the world, with all kinds of computers, grinding away with a little piece of software that passes its results back to the mothership for further processing.
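
If it helps to see the distinction as code, here is a rough sketch that applies the two definitions above to a list of machines. Everything in it is made up for illustration: the `Node` class, the `network` and `hardware` labels, and the machine names are hypothetical, and the rule is stricter than real life (it treats "similar" as "identical" and calls anything else a grid).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    network: str   # hypothetical label for the physical network the node sits on
    hardware: str  # hypothetical label for the node's hardware/config profile


def classify(nodes):
    """Label a group of nodes 'cluster' or 'grid' using the simplified rule:
    one shared network and one shared hardware profile means cluster,
    anything else means grid."""
    if not nodes:
        raise ValueError("need at least one node to classify")
    networks = {n.network for n in nodes}
    profiles = {n.hardware for n in nodes}
    if len(networks) == 1 and len(profiles) == 1:
        return "cluster"
    return "grid"


# Two near-identical servers on the same LAN (assumed example data).
sql_cluster = [
    Node("sql-01", "dc1-lan", "2-socket-x64"),
    Node("sql-02", "dc1-lan", "2-socket-x64"),
]

# Volunteer machines scattered across different networks (assumed example data).
volunteers = [
    Node("home-laptop", "home-dsl", "laptop"),
    Node("office-pc", "office-lan", "desktop"),
    Node("lab-box", "campus-net", "workstation"),
]

print(classify(sql_cluster))
print(classify(volunteers))
```

The point is only that the two questions to ask are "same network?" and "same kind of kit?"; a real environment would need a fuzzier notion of "similar" than an exact match.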

