Until now we have focused on semantic categories and attributes as the way visual information is organized in the brain. This paper focuses on an alternative, specifically with respect to manipulable objects: the idea is that for finer categories in the manipulable world, the arrangement in the brain might be with respect to motor commands (which would make it efficient). One thing I found interesting in this paper was that in the medial fusiform gyrus (the area this study is concerned with) there were high responses for non-manipulable objects but no RS; RS was only observed for manipulable objects. The authors point out that RS indicates finer-level analysis, which is what happens in this area. I did not understand Fig 3C,D,E, specifically what it means that a region is defined by tools vs. animals... did anyone else understand what they meant?
Here's my understanding of 3C,D,E. Rather than pre-defining the regions with a localizer task, where the results might be an artifact of the task they picked, they define the regions after the fact. They can then delineate the medial fusiform gyrus by whatever criteria one pleases. In this case, you get similar results with respect to RS for tools irrespective of the criteria used to define the region (animals vs. non-manipulable, animals vs. tools, animals vs. arbitrarily manipulable). I guess this ensures that it's not just an artifact of their localization task.
I find Figure 5 interesting, as it suggests a difference between manipulated objects and tools. This leads me to wonder whether the BOLD signal changes in these areas reflect a high-level conscious recognition that the object being viewed is a tool. I wonder if there are experiments that deal with objects which are visually ambiguous in terms of being a tool or not. For example, there might be tools that are shaped like other objects, or objects which are not normally identified as tools but which the subject has used extensively as tools. (E.g. someone who used a pencil to rewind cassette tapes might see it as more "toolish" than someone who has no knowledge of this function.)
This is definitely similar to the argument about primary and secondary affordances: a pencil's primary affordance is writing, and its secondary affordance is rewinding cassette tapes. But based on experience, it might be organized differently.
I like the paper, especially because of the supporting neuropsychological study. However, there are some issues that I couldn't understand well.

Previously we talked about lesion patients who cannot identify objects visually but can identify them by touch. Especially for tools, it seems that touching is more important for identification than seeing the tool. Are the responses of people with such a lesion similar to the normal cases? That is, is RS more related to visual identification or to identification by touch (even if we are not familiar with using/manipulating the tool)?

While tools (such as hammers) are shown to be different from other objects (such as books), they are still close to each other compared to animals. I wonder how the results would change if the animals were selected only from the category of pets (cats, dogs) that we can touch, and the tests were performed especially among people who own a pet.

I also wonder what would happen if a tool were changed so that its functionality stayed the same but it looked visually different, or what would happen with a new tool that has never been seen or used before.

Going back to the previous paper on visual attributes, is it possible that there are some functional attributes which allow unseen categories to be recognized? In that case, do we see chairs as flat surfaces that we can sit on? Or are there some categories that we use in our daily lives so often that recognition is more instance-based, while when we see an unseen one we start by thinking about its attributes (visual or functional)?
Maybe related to your question: "Converging with the results of the fMRI study, we find that lesions to the left middle temporal gyrus and the left inferior parietal lobule are associated with impairments for both using and identifying objects." The fact that lesions impair both use and identification suggests that this region represents more than just the visual look of an object.
I think we should also consider studying people with lesions separately from the normal cases. One thought is that people with a lesion will probably develop other areas of the brain to functionally replace the damaged one. That would mean the functionality of processing areas might be transferable when one region is disabled by a lesion.
But then again, as shown in the slides in class, people with a lesion in the scene area are unable to recognize scenes. I understand that your argument comes from the perspective of functional compensation: in the event that an area of the brain is damaged, other areas also help so that the agent can function normally.
I am going to once again ask what the implications are here for the computer vision side of things: does this argument push for thinking more in terms of functional categories? I wonder whether the selectivity to tools is based on feedback from another brain area, or whether visual categories are actually organized that way in the visual system. I'm not sure any of their experiments could answer that (if the brain is lesioned, connections from other areas to the lesioned area would presumably be destroyed too; in the non-lesion studies, I'm not sure there would be enough BOLD activity to identify that much activity from other relevant areas).

Also, I liked that the authors made a point about drawing conclusions from BOLD signals by reproducing the results of other studies and showing how they could generate a different conclusion (e.g. at the bottom of page 517). They note that the overall amplitude of BOLD signals can't be taken as evidence for how discriminative the regions are for categories. This kind of meta-research is probably even more helpful than the results themselves.
Yes, I am left with a similar question. When looking at this activity, is the strength of the response due to the brain actively detecting the object, or is it possible that the activity reflects this part of the brain sending signals related to what it might do with the object now that it has already been detected? It seems there are two conclusions that could be drawn:

1. I see this region of the brain lighting up when I show people a particular category of objects; therefore, this region is responsible for visual processing of these types of objects.

2. I see this region of the brain lighting up when I show people a particular category of objects; therefore, the brain has already detected the object and is now producing a bunch of activity designed to react to it. In other words, this part of the brain takes what has already been seen and says, "Hey hands, think fast! There might be a hammer to grab in the near future."

Is there a way these types of experiments could tell the difference? To me this is an important distinction, because if we (computer vision people) want to reproduce the identification behavior but don't necessarily care about the reaction to the identification, we would like a way to distinguish between the two scenarios. I suppose another possibility is that the two are actually quite similar, and that objects are recognized based on what you might do with them.
Oh no, I wrote the whole thing and it got deleted... anyway, here it is again. Function-based neuroscience studies have already had an impact on CV, but what is exciting is that function-based organization seems like an efficient scheme, since all these objects ultimately have to lead to actions. In CV, no one has done any clustering of data based on functions yet, but that should be interesting; I have been thinking about it myself, so let's see. What is also interesting is the fact that not all manipulable objects are the same: tools are different from books. Does this suggest anything about primary and secondary affordances? Is it the case that when semantics are in congruence with functions, we organize the data based on functions?
For Aaron's question, I would say what is being claimed here is even stronger: the claim is that to understand an object you would even simulate the action in your mind. [This is one of the claims of embodied cognition and mirror neuron theory; GS should touch on it tomorrow.]
I think the findings demonstrated in the paper provide very good arguments for understanding objects/scenes from a functional perspective in computer vision. It seems we humans do consider functional properties when we recognize objects. It also provides justification for the vision paper "From 3D Geometry to Human Workspace" in this section. I don't know if it is fair to divide the space into "animal", "tool", "arbitrarily manipulated objects" and "non-manipulable objects"; to me, "tool" is a very specific and narrow category. Maybe humans would demonstrate the bias over more general categories.
I'm quite curious about the embodied cognition / action-for-understanding argument too, how it applies to computer vision, and what people in the class think. From my understanding of the results and from what I've heard, it seems that the brain might do this (but I'm on board with Aaron in saying that I don't think these specific results show it). The broader implications of Aaron's question for CV are very interesting: specifically, whether (a) it's necessary for intelligent systems or (b) it's merely more efficient. My two cents: irrespective of whether it's practical for finding cars, etc., it might be tremendously useful for a human-centric understanding (e.g., communicating with humans or making the sorts of errors that humans are OK with).
Embodied cognition is definitely a point of debate; there is evidence for and against the theory, which we will discuss. The authors acknowledge that object identification may be unimpaired even when object use is impaired, but what's interesting to see is the positive correlation between object-use performance and object-identification performance.
I read a book loosely related to this discussion thread: "Action in Perception" by Alva Noë. In my understanding, he argues that to identify an object is to know how the sensory input will transform given an egomotion. In other words, the answer to Aaron's question is always number 2. Trying to interpret this argument from a machine learning perspective: the brain has a lot of latent variables. These latent variables are formed for the purpose of regressing from inputs (the five senses) to outputs (action and language; keep consciousness out to avoid controversy), and they turn into action-related or object-identity-related variables as required. I feel we should try the same approach in computer vision: rather than defining our own categories, we should do an end-to-end regression and study what latent variables the system derives. Those will be the machine's categories.
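The "study the machine's latent variables" idea above can be sketched in a few lines. This is a toy illustration under invented assumptions (synthetic data, PCA standing in for a learned representation, a tiny k-means): we never give the model category labels, compress the inputs to a latent space, and then check whether clusters in that space recover the hidden object kinds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two hidden object kinds whose sensory features
# differ. The model never sees the labels; they are only used for scoring.
n = 200
labels = rng.integers(0, 2, n)            # ground-truth kinds (held out)
x = rng.normal(size=(n, 10))
x[:, 0] += labels * 4.0                   # the kinds differ along one axis

# Stand-in for an end-to-end learned representation: a 2-D latent code
# obtained by PCA (via SVD).
xc = x - x.mean(axis=0)
_, _, vt = np.linalg.svd(xc, full_matrices=False)
latent = xc @ vt[:2].T                    # the machine's latent variables

# Cluster the latent codes: the clusters are the "machine's categories".
def kmeans(z, k=2, iters=50):
    # deterministic init: extreme points along the first latent dimension
    centers = np.stack([z[np.argmin(z[:, 0])], z[np.argmax(z[:, 0])]])
    for _ in range(iters):
        assign = np.argmin(((z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = z[assign == j].mean(axis=0)
    return assign

clusters = kmeans(latent)

# Do the discovered clusters recover the hidden kinds (up to label swap)?
agree = (clusters == labels).mean()
purity = max(agree, 1 - agree)
print(f"cluster/category agreement: {purity:.2f}")
```

The point of the sketch is only the workflow: categories fall out of the latent space rather than being defined up front.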
Functionality does not seem all that distinct from semantics, and can be viewed as a form of semantics: every object can be assigned some degree of manipulability. Although there are regions responsible for functionality, another question is how the brain bridges and combines the processing of semantics and functionality into a whole.
This paper is organized well and shows clear experiments. I was surprised by Fig 2: the left and right halves are actually biased in slightly different ways towards tools and arbitrarily manipulable objects. I did not expect such asymmetry for a basic motor response. Then again, I think this is due more to feedback than to feed-forward processing; the halves of the brain differ more at higher levels of information processing.

My final question: this paper seems to suggest that there is a similarity metric over "motor space". I do not see why this should mean anything for computer vision researchers; it does not say that this similarity metric has anything to do with visual processing. Couldn't one come up with an experiment to see if there is indeed a "visual link" at play here? What if I asked people to just think of the object rather than showing it to them? If I got a similar conclusion in terms of the specificity of the RS in the gyri, that wouldn't tell me anything about visual processing. Am I correct?
I guess this discussion was happening above as I was typing my post. Oh well.
Interesting point, Ishan: "the left and right halves actually are biased in slightly different ways towards tools and arbitrary manipulable objects. I did not expect such asymmetry for a basic motor response." I would hazard a guess that this is due to left- or right-handed subjects. Elissa can probably clarify this, and you should definitely raise it in class.
@Ishan: I think your question is slightly different: how does knowing that a distance metric in motor space is used help us build a better machine vision system? The distance metric in motor space is computed over features largely derived from visual stimuli, but via a different inductive bias. So even though we are computing distance(function(visual input)), the function() in this case has a hidden parameter B, the bias. Knowing that this bias relates to motor responses to objects guides us in designing our machine vision models.
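The distance(function(visual input)) point can be made concrete with a toy sketch. Everything here is invented for illustration (the attribute vectors, the attribute names, and the bias weights are not from the paper): the same inputs, passed through the same function() but with a different hidden bias B, yield a different similarity structure.

```python
import numpy as np

# Invented visual attribute vectors: [graspable, dark_color, rounded].
hammer = np.array([1.0, 1.0, 0.0])
cat    = np.array([0.0, 1.0, 0.2])
pencil = np.array([1.0, 0.0, 1.0])

def function(x, B):
    # The same visual input, re-weighted by the inductive bias B.
    return B * x

def distance(a, b):
    return float(np.linalg.norm(a - b))

B_appearance = np.array([0.1, 1.0, 1.0])  # bias toward how things look
B_motor      = np.array([1.0, 0.1, 0.1])  # bias toward how things are grasped

d_app_cat    = distance(function(hammer, B_appearance), function(cat, B_appearance))
d_app_pencil = distance(function(hammer, B_appearance), function(pencil, B_appearance))
d_mot_cat    = distance(function(hammer, B_motor), function(cat, B_motor))
d_mot_pencil = distance(function(hammer, B_motor), function(pencil, B_motor))

# With these invented attributes, the appearance bias puts the hammer nearer
# the cat, while the motor bias puts it nearer the pencil: same inputs,
# different metric, purely because of the hidden bias B.
print(d_app_cat < d_app_pencil, d_mot_pencil < d_mot_cat)   # True True
```

The design choice the sketch highlights: the bias lives inside the feature function, not inside the distance itself.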
Yes, I also think the asymmetry is due to left- and right-handed people.
Well, a lot of the observations I had have already been made, so here are a few additional things I was thinking about.

a. I find it very interesting that when a novel object is shown, a large region of neurons is activated, and under repetition activity apparently narrows down to what seems like a very specialized region. In terms of application to computer vision, this looks a lot like firing a whole bunch of detectors in parallel to determine what subcategory of object the brain is looking at, and then using the specialized (as in our case) mushroom detector.

b. I find the distinction between tools and generally manipulable objects interesting. For instance, I will generally only visualize one way of interacting with a barrel (rolling), yet it is considered an arbitrarily manipulated object, i.e. its use is not determined just by how I can visualize using it. So it's not just the fact that I can think about moving/manipulating an object; its use has to be completely defined by the action I would take.

As a side note, I found the language of the paper initially tough to follow because of the multiple references and its general start-stop nature. The final section that compiled all the data, however, felt more consistent.
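The coarse-to-fine idea in (a) can be sketched as a two-stage dispatch. The detectors below are hypothetical scoring stubs on toy attribute inputs (names and thresholds are invented), not a real vision pipeline: a bank of cheap coarse detectors all run, and only the specialized fine-grained detector for the winning subcategory is consulted.

```python
from typing import Callable, Dict

def classify(x: Dict[str, float],
             coarse: Dict[str, Callable],
             fine: Dict[str, Callable]) -> str:
    # Stage 1: all coarse detectors fire in parallel (conceptually).
    scores = {name: det(x) for name, det in coarse.items()}
    best = max(scores, key=scores.get)
    # Stage 2: only the specialized detector for the winner runs.
    return fine[best](x)

# Invented toy detectors over attribute dictionaries.
coarse = {
    "manipulable": lambda x: x.get("graspable", 0.0),
    "animal":      lambda x: x.get("animate", 0.0),
}
fine = {
    "manipulable": lambda x: "tool" if x.get("has_handle", 0) > 0.5 else "object",
    "animal":      lambda x: "pet" if x.get("furry", 0) > 0.5 else "wild",
}

# A hammer-like input: high graspability routes it to the fine-grained
# manipulable-object detector, which then checks for a handle.
print(classify({"graspable": 0.9, "animate": 0.1, "has_handle": 1.0},
               coarse, fine))   # tool
```

The broad-activation-then-narrowing observation maps onto stage 1 (everything fires) followed by stage 2 (one specialized detector does the fine work).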
Most of my thoughts were brought up in previous discussions, so I'll bring up the things that most concern me. Since they point out that manipulable objects (a book) are different from tools (a hammer), apart from suggesting that there might be primary and secondary affordances (as Prof. Gupta pointed out), it might also suggest evolution or adaptation to different categories. That is, since we manipulate a book differently (and very often) than we manipulate a hammer, we might have created a codebook of manipulations in our brain: we just use a shortcut and look up how to manipulate that particular thing, instead of computing and inferring the whole thing again and again. Just a thought.

To test this, we could show a very specialized tool (e.g., one used by surgeons, architects, or construction workers) to a professional from that field and to a layperson who has never used it or hasn't seen it often, and see if the neural responses are similar on the first try. If so, it might suggest we inherently have different regions for tools and manipulable objects; if not, it might suggest that we are just adapting by doing a lookup to avoid doing the same work again and again. Again, just a thought.
This is an interesting point. In conjunction with Shaurya's first thought, the codebook of manipulations/actions could help us narrow a new object down to one of the previously known visual semantic categories.
A lot of points have already been covered in the discussion, but I would like to add one more: do we see an object as a whole for its functionality, or do we see its smaller components and associate functionality with them? Suppose we see a new object and don't know how to use it at all; by looking at its local features we could still map them to features of already-seen objects and predict its usage. For example, if an object has a sharp end it can be used for digging, or if it has a hollow area it can be used for storing things.
A relation between functionality and visual appearance could exist. Even without looking, simply by holding and manipulating an object, we try to associate its physical properties with its shape and visual appearance.
I found this paper more confusing than the ones we've read so far, mostly because of the heavy use of biological jargon and my lack of knowledge thereof. At first I had dismissed RS as not being anything helpful, but as Prof. Gupta said, these signals actually indicate finer processing of the entities, thus surfacing hidden relations. It was interesting to see how the data can be interpreted to gain more knowledge of the system.

Triggered by what Ishan said above, I think computer vision researchers can possibly see this as an example of 'priming' of objects: attaching certain metadata to them after seeing them in different poses or 'employments' in an image, which can be taken as a parallel to the 'motor space' that the paper talks about for the brain. Thinking about this concept also compels me to bring in image context as one of the features, though I don't know how that would correlate. Just a random thought.
It is a good point that the recognition mechanism for very familiar objects (such as faces) can be quite different from that for unfamiliar or even unseen objects (whose functionality is unknown). As a specific instance, I noticed that many papers have conjectured that familiar faces are generally recognized with feature-based or structure-based representations, while unfamiliar faces are represented as pictorial patterns.