Apparently. If so, those perceptions are interpreted by the observer, of course, but in an evolved manner that is implicitly useful for surviving and breeding.
If A looks at a park, qualities such as its being a park, and its having order, beauty, exoticism, convenience, etc., are all judgments made by A. That it has a stand of five palms is also an interpretation, in that A is the one who defines the frame of reference: A chooses to treat as relevant that all the tall vegetation is of one kind, that the kind is palms, that the field of reference is the immediate vicinity, and that within that field the number of palms is five. The same would be true if A were looking at a photo, though the options would likely be more limited.
The shapes are real, and out there. A interprets the sensory input via the eye (and where relevant the other senses) and can express the result in the common language and culture of A's peers.