I think one interesting question worth considering, irrespective of the experimental content, is the idea from the intro: "The gist of this proposal is that a partially analyzed version of the input image (i.e., a blurred image) .... is projected rapidly from early visual areas directly to the prefrontal cortex ... This coarse representation is subsequently used to activate predictions about the most likely interpretations of the input image in recognition-related regions within the temporal cortex. Combining this top-down ‘‘initial guess’’ with the bottom-up systematic analysis facilitates recognition by substantially __limiting__ the number of object representations that need to be considered (Fig. 1)."

So, irrespective of time constraints (remember: things that were impossible a decade ago are now real-time), is there merit in this idea for computer vision? In particular, I think the term "limit" here is key: this isn't some huge graphical model where you look for zebras, policemen, and whether you're on 42nd Street in Manhattan, and then turn the crank on inference. That analyze-all-the-evidence approach, if you can factorize the model properly and get enough data, seems like common sense (although it might not help, or might require the right representation). Instead, I see this as not running the zebra detectors at all, based on the gist of the scene. Will this help? I'm curious where people stand on this.
I think there are two interesting points based on what you brought up. I'm going to try to rephrase (sorry if I interpret this wrong): should I use some quick computations to decide (1) my task (what things to recognize) and (2) the representation I should use (features/attributes to pick out, scales to look at, etc.), based on the coarse information we get? In that case, top-down information would mean using quick, coarse information to answer these two questions. It seems like a great way to go, especially in the shorter term, since there are an infinite number of ways we could manipulate the image, and we have shown in class that those manipulations are very dependent on the task. I think the most benefit comes from knowing how to manipulate the data before throwing it into the detector (specifically, where to look and which features will be useful for distinguishing between the things we need to recognize). I'm not sure this will ever be beneficial for the most popular vision tasks right now, because the benefits are greatest when you have a time constraint and need to do a large variety of tasks (I think this has the most applications in robotics, for instance).
Yes -- but I was thinking that even in the infinite-time, finite-data case it might be useful. For instance, in the fine-grained literature, if I recall correctly, they're trying to solve P(collie | dog), so you could think of a fine-grained detector as working, to some degree, as P(dog) P(collie | dog). Similarly, in the propose-then-classify detection method, the object detector doesn't have to separate cats from all negative windows, but only from a few. My impression was that this sort of top-down filtering might be useful in that it factors things properly: I don't need to learn the relationship between polar bears and policemen, because if I had to learn all N^2 object-object relationships, I'd probably overfit terribly.
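The factorization above can be sketched as a toy gated classifier. Everything here (the "furry" feature, the breed scorers, the threshold) is invented for illustration; the point is only that the fine-grained stage never runs, and never needs training data, outside the coarse gate:

```python
import random

random.seed(0)

# Hypothetical coarse gate: a stand-in for P(dog | image).
def coarse_dog_score(x):
    return 1.0 if x["furry"] else 0.0

FINE_BREEDS = ["collie", "beagle", "pug"]

# Hypothetical fine stage: a stand-in for P(breed | dog, image).
def fine_breed_scores(x):
    return {b: random.random() for b in FINE_BREEDS}

def classify(x):
    # Factorized inference: P(breed) ~= P(dog) * P(breed | dog).
    # If the gate says "not a dog", the breed classifiers never run,
    # so they never have to model policemen, zebras, etc.
    p_dog = coarse_dog_score(x)
    if p_dog == 0.0:
        return "not-a-dog"
    scores = fine_breed_scores(x)
    return max(scores, key=lambda b: p_dog * scores[b])

print(classify({"furry": True}))   # some breed name
print(classify({"furry": False}))  # "not-a-dog"
```

The same shape covers the propose-then-classify case: the proposal stage is the gate, and the classifier only sees the surviving windows.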
Given a time constraint, I guess your learning algorithm might form a different set of rules compared to the infinite-time case. Then again, there's the argument that we should get the inference right in the first place, and then optimize for time.
Yeah, of course, and we're all bound by what's feasible, even those of us whose code takes minutes to run per image. I think the answer to the time question we discussed last class is basically driven by what problem people are working on. But what I was getting at was: is limiting (in a yes/no sense) your interpretations via top-down knowledge useful for learning in the first place, irrespective of any time gains? Note that this is, I think, different from the question from last class (is ignoring parts of the scene useful).
I think it's important to examine not only limited-time vs. infinite-time, but also what happens to the time cost (in an algorithmic sense) as you start to change these variables. Since nobody really knows what the algorithm is, this is kind of hard, but my suspicion is that as the number of objects you are able to recognize increases, very bad things happen in terms of complexity unless you aggressively prune the space you're considering. For simplicity's sake, let's say all of our mid-level features are binary. That means that if we're considering all of them at once, we have 2^N possible categories to distinguish between, the overwhelming majority of which are utter nonsense ("overweight bicycles with orange hair, blue eyes, long whiskers, good rhythm, ..." is a point in this space). If, on the other hand, we can use an initial guess to select a much smaller subset of N, we save the kind of time that can only be considered an optimization if you consider the heat death of the universe to be a mere practicality. It's the difference between 2^(small constant) and 2^(everything). So I guess what I'm saying is that if you don't take this into account, it's possible you'll never be able to solve the problem, regardless of how long you give Moore's law to catch up, because of the algorithmic implications inherent in the problem. But I freely admit this is total speculation on my part.

As for whether or not this effect matters qualitatively, I think there's an argument that it may still be important even if you are given infinite time. It goes back to the sparrow differentiation example: there may be some specific feature that lets you easily distinguish between two species of sparrow (wing-tip color, for example), but you're not likely to use that feature on anything other than a bird. If you try to compute the value of that feature on something without wings, you're only going to confuse yourself.
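The 2^N vs. 2^(small constant) arithmetic can be made concrete with a toy count; the sizes here are made up purely for illustration:

```python
# Rough arithmetic for the pruning argument: N binary mid-level features
# considered jointly give 2**N joint hypotheses; a gist that gates us down
# to k relevant features collapses the space to 2**k.
N, k = 50, 8                       # illustrative sizes, not from the paper
full_space = 2 ** N                # includes all the nonsense combinations
pruned_space = 2 ** k              # only gist-compatible combinations
print(full_space // pruned_space)  # factor saved: 2**42
```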
If there's noise in your system, you might make the argument that the more extraneous features you test, the more likely you are to pick up some bad signal that throws off your classification. Somebody with a better understanding of statistics would have to tell me whether that's a reasonable argument, though.
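It does seem reasonable, at least in a toy simulation I can sketch (my own construction, not from the paper): a nearest-centroid classifier trained on few examples, where two dimensions carry real signal and the rest are pure noise. Adding irrelevant dimensions degrades accuracy because the estimated centroids pick up spurious differences on the noise dimensions:

```python
import random

random.seed(1)

# Two classes separated on 2 informative dimensions; n_noise extra
# dimensions are pure N(0, 1) noise in both classes.
def sample(label, n_noise):
    informative = [random.gauss(1.0 if label else -1.0, 1.0) for _ in range(2)]
    return informative + [random.gauss(0.0, 1.0) for _ in range(n_noise)]

def accuracy(n_noise, n_train=10, n_test=200):
    train = [(sample(l, n_noise), l) for l in (0, 1) for _ in range(n_train)]
    dim = 2 + n_noise
    # Estimate one centroid per class from the small training set.
    cents = []
    for l in (0, 1):
        pts = [x for x, lab in train if lab == l]
        cents.append([sum(p[d] for p in pts) / len(pts) for d in range(dim)])
    correct = 0
    for l in (0, 1):
        for _ in range(n_test):
            x = sample(l, n_noise)
            dists = [sum((x[d] - c[d]) ** 2 for d in range(dim)) for c in cents]
            correct += int(dists.index(min(dists)) == l)
    return correct / (2 * n_test)

acc_clean = accuracy(n_noise=0)
acc_noisy = accuracy(n_noise=400)
print(acc_clean, acc_noisy)  # the noisy version does noticeably worse
```

With finite training data, every extraneous dimension adds estimation noise to the decision; with enough of them, that noise swamps the two informative dimensions, which is exactly the wing-tip-color-on-a-walrus situation.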
Another way to ask what David said -- do we gain anything, apart from time, by limiting our interpretations via top-down knowledge? Also, how much and when is it bad to limit our interpretations?

David: correct me if I understood it incorrectly!
This reminds me of a recent paper: "Peng Zhang, Jiuling Wang, Ali Farhadi, Martial Hebert, Devi Parikh. Alert: Predicting Failures. CVPR 2014". It might be an example of this usage in computer vision. Here the gain is not only time. Instead, general top-down knowledge is introduced in the first place to predict the likely accuracy of any computer vision system on a given input instance, across applications such as semantic segmentation, vanishing point and camera parameter estimation, image memorability prediction, and attribute detection. In this way, the top-down and bottom-up processing mechanisms are complementary: top-down processing provides an initial coarse decision, while bottom-up processing gives a more precise one.
I think that a top-down initial guess leads to significant improvement in recognition accuracy. This is hard to see in human vision because our bottom-up processes are already very good. It is, however, easily seen in computer vision systems. For example:

- By discarding exemplars that are not suitable for a given image, we do better than using all the exemplars.
- By discarding windows that may not contain objects (bottom-up region proposal methods), we also significantly reduce the number of false positives.
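The window-pruning case can be sketched in a few lines. Everything here is made up (the "edges" score standing in for a cheap objectness measure, the expensive detector as an oracle); the point is just the two-stage structure:

```python
# Hedged sketch of pruning: a cheap objectness score filters candidate
# windows before an expensive detector ever runs on them.

def cheap_objectness(window):
    # stand-in for a fast bottom-up proposal score (e.g. edge density)
    return window["edges"]

def expensive_detector(window):
    # stand-in for the slow, accurate classifier
    return window["true_object"]

windows = [
    {"edges": 0.9,  "true_object": True},
    {"edges": 0.8,  "true_object": False},
    {"edges": 0.1,  "true_object": False},   # pruned: never reaches the detector
    {"edges": 0.05, "true_object": False},   # pruned
]

# Keep only promising candidates, then run the costly stage on survivors.
survivors = [w for w in windows if cheap_objectness(w) > 0.5]
detections = [w for w in survivors if expensive_detector(w)]

print(len(survivors), len(detections))  # 2 survivors, 1 detection
```

The false-positive benefit falls out of the same structure: low-scoring windows can no longer produce false detections because the detector never sees them.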
This paper suggests a model for top-down processing in object recognition and details the experiments performed to verify it. The time dimension is added to the experiments, and I find this the most interesting aspect of the paper. Something I have been wondering about is what information should be sent to the higher areas to be processed. Intuitively, we need something that is cheap to compute and communicate, but strongly indicative of what the object being viewed is. In this paper it is suggested that this information is a blurred/low-spatial-frequency version of the image. I wonder if high-spatial-frequency features (and other things) might be sent over as well. My reason for raising this is that if we just take the blurred image, then the top-down pathway would perform very badly on objects which are almost entirely obscured. However, in many cases humans are able to mentally hallucinate to fill in the gaps, even when the overall shape of the object is not visible (but a few distinctive parts of it are). This suggests to me that some top-down processing is happening that relies on high-frequency details.
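For concreteness, a low-spatial-frequency representation can be approximated by block-average downsampling; this toy version (my own sketch, not the paper's method) shows how it keeps coarse layout while throwing away exactly the distinctive-part detail the comment above worries about:

```python
# Crude LSF representation: average non-overlapping block x block patches
# of a 2-D list. High-frequency detail is lost; coarse layout survives.

def lsf(image, block):
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h, block):
        row = []
        for j in range(0, w, block):
            patch = [image[i + di][j + dj]
                     for di in range(block) for dj in range(block)]
            row.append(sum(patch) / len(patch))
        out.append(row)
    return out

# A 4x4 "image" with a bright top-left quadrant: the 2x2 gist keeps where
# the bright region is, but not any pixel-level structure inside it.
img = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
print(lsf(img, 2))  # [[1.0, 0.0], [0.0, 0.0]]
```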
I meant to reply to this comment, actually. I guess high-spatial-frequency info requires more detailed processing and a longer time (presumably early visual areas --> ventral regions, serial processing) than simply relying on low-spatial-frequency info to make predictions. Maybe I am heavily influenced by Jeff Hawkins, who argues that the brain is a prediction machine, so Bar's argument for a shortcut from low-spatial-frequency info to frontal regions makes perfect sense to me. Here in the vision domain, making predictions/guesses most likely means inferring from insufficient image structure. It actually mimics real-world cases where we see blurred images of objects/scenes from a distance and constantly update our hypothesis of what it is (maybe wrong at the beginning) as we approach it.
I think you are right to bring up the importance of finding those cheap and highly indicative early guesses that let you prune your space very well. I think a lot of these claims hinge on these intermediate models and the ability to learn them.

It seems that in the same way that certain attributes only apply to some object categories, certain low-level features only ever exist in the presence of some objects. For example, certain textures strongly indicate cloth. Seeing that texture might fire up an initial guess ("hey, there's cloth here") that is then verified by using a model of cloth to predict what else might strengthen or weaken the postulated guess.
For your example of mentally hallucinating to fill in the gaps even when the overall shape of the object is not visible, I can think of two explanations: 1) There could be other regions of the brain (apart from the OFC) responsible for top-down processing based on other features (HSF). Considering the evidence provided in this paper, the top-down activity observed in the OFC is independent of the HSF features. I am not sure whether there are other regions in the brain that show evidence of top-down processing for object recognition, but this would essentially mean that later in the visual stream, when some form of high-spatial-frequency information is extracted, it is used for top-down processing -- a complex architecture with a multi-layered top-down processing model, where every level of the visual stream independently provides information to prune the search space in some sense. 2) Hallucination could be a bottom-up process. In cases where low-spatial-frequency top-down information is not useful for recognition, the complex high-spatial-frequency information alone (in a bottom-up manner) might help us predict the object.
This may be a bit redundant, but couldn't we just model this as a form of Bayesian inference? The top-down connections simply give a prior distribution that the brain uses to infer the distribution of possible perceptions of the image. Bayesian inference may extend to many different parts of the brain (motor function, decision-making, etc.). There have been whole books written on this: http://mitpress.mit.edu/books/bayesian-brain
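In that reading, the gist supplies the prior and the bottom-up features supply the likelihood. A minimal numerical version (all numbers invented for illustration):

```python
# Posterior = normalized (prior * likelihood), over a discrete set of
# object hypotheses.

def posterior(prior, likelihood):
    unnorm = {k: prior[k] * likelihood[k] for k in prior}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}

# The gist says "kitchen", so kitchen objects get most of the prior mass.
prior = {"toaster": 0.45, "kettle": 0.45, "zebra": 0.10}
# Bottom-up evidence is ambiguous between kettle-like and zebra-like shapes.
likelihood = {"toaster": 0.1, "kettle": 0.5, "zebra": 0.5}

post = posterior(prior, likelihood)
print(max(post, key=post.get))  # the prior breaks the tie: "kettle"
```

This is also one way to phrase the "how bad is it to limit interpretations" question: a hard top-down shortlist is a prior with zeros in it, which no amount of bottom-up evidence can overturn, whereas a soft prior only reweights.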
It seems people have used 'context' to study top-down information. Is there some other approach we could use to understand this?
In the paper "Malisiewicz, Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships," they used exemplars to pass top-down information by building a memex. Is this the right way to approach it? I feel it is too specific, and generalization and consistency are always an issue.
Another way to think of top-down information is looking for a large thing rather than piecing it together from little fragments bottom-up. For instance, in layout estimation, a more top-down way is to search for a whole room or cuboid in one go (with a low-dimensional model). A bottom-up approach would be to try to find the planar pieces and then assemble a cuboid from them.
Also, how exactly do we define 'context'?
It's not addressed as much here, but I wonder what happens when we start to think of this happening in multiple stages. You might imagine a process where the texture indication of cloth causes your brain to hallucinate different types of cloth, find the one that matches best, then use that type of cloth to hallucinate known objects made of that kind of cloth, and so on. In this way, every feature/attribute/object has some implication which suggests a set of other things that may or may not exist. Each of these other things has a model that can be used to verify that suggestion. Once verified, that object has its own implications that suggest objects even further up the chain.
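The staged hallucinate-then-verify loop might be sketched as a frontier expansion; the suggestion rules and the "verify" oracle here are entirely hypothetical:

```python
# Each confirmed item suggests further hypotheses, which are checked
# against the observed evidence before being confirmed themselves.

SUGGESTS = {
    "cloth-texture": ["curtain", "sofa"],
    "sofa": ["living-room"],
    "curtain": ["window"],
}

def verify(hypothesis, evidence):
    # stand-in for a model-based check of the hypothesis against the image
    return hypothesis in evidence

def expand(seed, evidence):
    confirmed, frontier = set(), [seed]
    while frontier:
        item = frontier.pop()
        if item in confirmed:
            continue
        confirmed.add(item)
        for hyp in SUGGESTS.get(item, []):
            if verify(hyp, evidence):
                frontier.append(hyp)
    return confirmed

evidence = {"cloth-texture", "sofa", "living-room"}
print(sorted(expand("cloth-texture", evidence)))
# ['cloth-texture', 'living-room', 'sofa']  ("curtain" fails verification)
```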
I think one question then will be how to structure the stages, and which stage to start at. For different images (different lighting/viewpoint/etc.) of the same object, it might not be possible to use the same "chain". For example, if we see a sofa from far away, it might be easier to tell that it is a sofa than that there is cloth. On the other hand, in a close-up image it might be easier to tell that it is made of cloth than that it is a sofa.
I find the OFC's role in vision fascinating, as it is most probably a survival mechanism that lets humans react fast based on incomplete information. Apart from providing a prior for object recognition, the OFC might be the brain region that directs our attention to things in the environment for closer inspection. I believe this would be a good thing for a robot, which could be primed for a task and quickly recognize areas of interest in an image.
Have you heard of Poggio's Bayesian model of Visual Attention? It focuses on a saliency model from the PFC, but it is similar to what you describe: http://cbcl.mit.edu/cbcl/publications/ps/Chikkerur_Serre_Tan_Poggio_VisRsrchApril2010.pdf
The authors describe top-down information flowing from the OFC to the temporal cortex as object proposals. But our readings from the first few weeks suggest the signals in our temporal cortex form a continuous semantic space. In this context, I feel we should interpret the top-down information from the OFC to the temporal cortex as a scaling matrix along the principal components of the semantic space. This mechanism is possibly less prone to errors -- pruning too much too early will reduce recall. It is also more plausible: it might be much easier to regress the scaling matrix from LSF than to generate object identity guesses.
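A toy version of this soft-scaling reading (my own sketch, with invented semantic axes standing in for principal components): instead of a hard shortlist, the top-down signal reweights coordinates of the semantic space, damping unlikely directions without zeroing any hypothesis out:

```python
# Object embeddings along two hypothetical semantic axes:
# (animacy, indoor-ness). Values are made up for illustration.
objects = {
    "dog":    (0.9, 0.2),
    "sofa":   (0.1, 0.9),
    "kettle": (0.0, 0.8),
}

# LSF gist of an indoor scene: damp the animacy axis, keep indoor-ness.
# This plays the role of a diagonal scaling matrix on the semantic space.
scaling = (0.2, 1.0)

def top_down_score(emb, scale):
    # scale each coordinate, then take the norm of the scaled vector
    return sum((s * e) ** 2 for s, e in zip(scale, emb)) ** 0.5

scores = {name: top_down_score(emb, scaling) for name, emb in objects.items()}
print(max(scores, key=scores.get))  # "sofa": the indoor axis now dominates
```

Note that "dog" still has a nonzero score, so strong bottom-up evidence could in principle recover it; a proposal shortlist that dropped "dog" could not.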
One thing I found mildly amusing was that, for all their claims of top-down processing being beneficial for processing more complex scenes, in the experiments they at best used grayscale images of objects without a background. Wouldn't it have been a more compelling story had they used natural scenes and colour?
One takeaway from today's class: the top-down information is sent all the way to the prefrontal cortex (OFC) because from there it can be transmitted for decision making not just to visual areas but to other areas of the brain, such as auditory, speech, and motor regions.