From: Revisiting active perception
Why: The agent's current state determines what its next actions might be, based on the expectations that state generates; these are termed Expectation-Action tuples. This relies on some form of inductive inference (inductive generalization, Bayesian inference, analogical reasoning, prediction, etc.), because inductive reasoning takes specific information (premises) and draws a broader generalization (conclusion) that is considered probable. The only way to know is to test the conclusion. A fixed, pre-specified control loop does not fall within this definition.

What: Each expectation applies to a specific subset of the world that can be sensed (visual field, tactile field, etc.), and any subsequent action is executed within that field. We may call this Scene Selection.

How: A variety of actions must precede the execution of a sensing or perceiving action. The agent must be placed appropriately within the sensory field (Mechanical Alignment). The sensing geometry must be set to enable the best sensing action for the agent's expectations (Sensor Alignment, including components internal to a sensor such as focus, light levels, etc.). Finally, the agent's perception mechanism must be adapted to be most receptive to the interpretation of sensing results, both specific to current agent expectations and to more general world knowledge (Priming).

When: An agent expectation requires Temporal Selection; that is, each expectation has a temporal component that prescribes when it is valid and for what duration.

Where: The sensory elements of each expectation can only be sensed from a particular viewpoint, and determining that viewpoint is modality specific. For example, how an agent determines a viewpoint for a visual scene differs from how it does so for a tactile surface. The specifics of the sensor and the geometry of its interaction with its domain combine to accomplish this. This is termed the Viewpoint Selection process.
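The five questions above suggest a natural data structure for an agent's control loop. The sketch below is a hypothetical illustration, not an implementation from the paper: the names (ExpectationAction, is_valid, active_tuples) and field choices are assumptions made for exposition. It bundles one expectation with its sensory field (What), triggered action (Why), validity window (When), and sensing viewpoint (Where):

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

@dataclass
class ExpectationAction:
    """Hypothetical Expectation-Action tuple: one expectation plus
    the action the agent would execute to test it."""
    expectation: str                 # what the agent predicts it will sense
    sensory_field: str               # What: Scene Selection ("visual", "tactile", ...)
    action: Callable[[], Any]        # Why: action that tests the expectation
    t_start: float                   # When: start of the validity window
    duration: float                  # When: how long the expectation holds
    viewpoint: Tuple[float, ...]     # Where: pose selected for sensing

    def is_valid(self, t: float) -> bool:
        """Temporal Selection: is this expectation valid at time t?"""
        return self.t_start <= t < self.t_start + self.duration

def active_tuples(tuples: List[ExpectationAction], t: float) -> List[ExpectationAction]:
    """At each control step, the agent considers only the tuples
    whose temporal window covers the current time."""
    return [ea for ea in tuples if ea.is_valid(t)]
```

A controller built this way is not a fixed loop: the set returned by active_tuples changes as the agent's state generates new expectations and old ones expire, which is the point of the Why/When distinction above.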