3D means three-dimensional, i.e. something that has width, height and depth (length). Our physical environment is three-dimensional and we move around in 3D every day.
Humans are able to perceive the spatial relationship between objects just by looking at them because we have 3D perception, also known as depth perception. As we look around, the retina in each eye forms a two-dimensional image of our surroundings and our brain processes these two images into a 3D visual experience.
However, it's important to note that having vision in both eyes (stereoscopic or binocular vision) is not the only way to see in 3D. People who can only see with one eye (monocular vision) can still perceive the world in 3D, and may even be unaware that they are stereo blind. They are simply missing one of the tools for seeing in 3D, so they rely on the others without thinking about it.
Here are some of the tools humans use for depth perception:
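Stereoscopic vision: Because the two eyes view the scene from slightly different positions, each sees a slightly different image; the difference between the two images is greater for closer objects than for distant ones.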
Accommodation: As you focus on a close or distant object, the lenses in your eyes physically change shape, providing a clue as to how far away the object is.
Parallax: As your head moves from side to side, closer objects appear to move more than distant ones.
Size familiarity: If you know the approximate size of an object, you can tell approximately how far away it is based on how big it looks. Similarly, if you know that two objects are about the same size but one appears larger than the other, you will assume the one that appears larger is closer.
Aerial perspective: Because light is scattered randomly by air, distant objects appear to have less contrast than nearby objects. Distant objects also appear less color-saturated and have a slight color tinge similar to the background (usually blue).
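Two of the cues above, size familiarity and stereoscopic vision, come down to simple geometry. The following Python sketch is only an illustration of that geometry, not a model of how the brain actually computes depth. The function names and the example values (a 1.8 m tall person, eyes 0.063 m apart) are assumptions chosen for the example.

```python
import math

def angular_size(object_size_m, distance_m):
    """Visual angle (in radians) that an object of a given size
    takes up when seen face-on from a given distance."""
    return 2.0 * math.atan(object_size_m / (2.0 * distance_m))

def distance_from_angular_size(known_size_m, angle_rad):
    """Size familiarity: if the real size is known, the visual angle
    the object takes up tells us how far away it is."""
    return known_size_m / (2.0 * math.tan(angle_rad / 2.0))

def vergence_angle(eye_separation_m, distance_m):
    """Stereoscopic vision: angle (in radians) between the two eyes'
    lines of sight to a point straight ahead. The closer the point,
    the larger the angle, so the two images differ more."""
    return 2.0 * math.atan(eye_separation_m / (2.0 * distance_m))

if __name__ == "__main__":
    # A person assumed to be 1.8 m tall, seen from 10 m away.
    angle = angular_size(1.8, 10.0)
    print(distance_from_angular_size(1.8, angle))

    # Assumed eye separation of 0.063 m: near point vs. distant point.
    print(vergence_angle(0.063, 0.5), vergence_angle(0.063, 50.0))
```

The vergence angle shrinks rapidly with distance, which is consistent with stereoscopic vision contributing most for nearby objects.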
To represent the 3D world on a flat (2D) surface such as a display screen, it's desirable to simulate as many of these perception tools as possible. Although there is currently no way to simulate all of them at the same time, video does use a combination of them. For example, aerial perspective and size familiarity are automatically captured by the video camera. In CGI scenes, aerial perspective must be added artificially so that distant objects appear less clear (this is called distance fog).
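As a rough illustration of distance fog, the sketch below blends an object's color toward a fog color as distance increases. It uses an exponential falloff, which is one common formulation in computer graphics; linear falloff is another, and the text above does not say which one any particular renderer uses. The function names, the density value and the colors are assumptions made for the example.

```python
import math

def fog_factor(distance, density=0.02):
    """Fraction of the object's own color that survives at this
    distance: 1.0 means no fog, values near 0.0 mean fully fogged.
    The default density is an arbitrary example value."""
    return math.exp(-density * distance)

def apply_fog(object_rgb, fog_rgb, distance, density=0.02):
    """Blend the object's color toward the fog color with distance.
    Colors are (r, g, b) tuples with components from 0.0 to 1.0."""
    f = fog_factor(distance, density)
    return tuple(f * o + (1.0 - f) * g for o, g in zip(object_rgb, fog_rgb))

if __name__ == "__main__":
    red = (1.0, 0.0, 0.0)
    haze = (0.6, 0.7, 0.9)  # bluish background tinge
    for d in (0.0, 25.0, 100.0, 400.0):
        print(d, apply_fog(red, haze, d))
```

With these values a nearby red object stays red, while a distant one fades toward the bluish haze, losing both contrast and saturation as described under aerial perspective.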
Of course, the addition of stereoscopic images (a separate image for each eye) is a significant improvement, so much so that most people think of stereoscopic films as being 3D and all others as being 2D.