Think about how you yourself recognize faces. First, you grasp the entire picture at once. Second, you recognize not a specific face but rather the differences between faces. Think about why people of a race other than your own all tend to look "similar" to you. Think about why, even when we don't know what we are looking at, we still try to classify it and answer "what is it?". Third, we recognize not images but "letters" that are present in an image. Then we combine these "letters" to form "words", and so on.
How could the letter "A" be recognized? Its description is "two lines that join at one point, or come very close together at one point, and that are crossed by a third line". "Line", "join", "point", "cross", etc. are the "words" of image recognition.
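
To make the description of "A" concrete, here is a rough sketch in Python. It assumes the "lines" have already been extracted from the image as straight segments (for example by some line detector), which is a big assumption in itself. The names `endpoints_meet`, `crosses`, `looks_like_a` and the tolerance `tol` are made up for illustration, and "crossed" is taken strictly: the bar has to pass through both legs, not just touch them.

```python
from itertools import permutations
from math import hypot


def endpoints_meet(s1, s2, tol):
    # 'join': some endpoint of s1 lies within tol of some endpoint of s2,
    # so "go very close in one point" also counts.
    return any(hypot(p[0] - q[0], p[1] - q[1]) <= tol for p in s1 for q in s2)


def _orient(a, b, c):
    # Sign of the cross product (b - a) x (c - a): which side of line ab c is on.
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def crosses(s1, s2):
    # 'cross': the segments properly intersect (each one separates the
    # endpoints of the other). Merely touching does not count here.
    a, b = s1
    c, d = s2
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def looks_like_a(segments, tol=0.5):
    # "Two lines which join in one point ... crossed by the third line."
    if len(segments) != 3:
        return False
    for left, right, bar in permutations(segments):
        if (endpoints_meet(left, right, tol)
                and crosses(bar, left)
                and crosses(bar, right)):
            return True
    return False


# Two legs meeting at the apex (2, 4), plus a bar that overhangs both legs.
letter_a = [((0, 0), (2, 4)), ((2, 4), (4, 0)), ((0.5, 2), (3.5, 2))]
# Two parallel verticals with a bar: the "join" word is missing.
letter_h = [((0, 0), (0, 4)), ((4, 0), (4, 4)), ((-1, 2), (5, 2))]

print(looks_like_a(letter_a))  # True
print(looks_like_a(letter_h))  # False
```

The point is only that the recognizer never looks at pixels here: it works entirely with the "words" ("join", "cross") applied to the "letters" (segments). A real "A" whose bar stops exactly on the legs would need a looser definition of "cross" than this sketch uses.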