Suman Ghosh
Projects
Event-driven, Bio-inspired Online Depth Estimation for Scene Exploration on the iCub
Depth perception in humanoid robots is critical for motion-based tasks such as manipulation, obstacle avoidance, and collaboration with humans. In computer vision, traditional frame-based cameras have primarily been used to solve the stereo-correspondence and depth estimation problem. These systems tend to be slow and computationally expensive because they perform feature matching on periodically captured static frames. Inspired by the human visual system, we develop a pipeline to compute real-time depth maps using event-based cameras. Event cameras are neuromorphic sensors that trigger an event only when the change in illumination at a specific image location crosses a predefined threshold. This lets us exclude redundant information from static parts of the scene and build systems with low latency, high dynamic range, and low computational overhead. Furthermore, the temporal correspondence of events generated by the left and right cameras provides additional support for solving the stereo matching problem. In this work, we used event-driven cameras mounted on the iCub humanoid robot to design an online stereo-based depth estimation system. Each event is processed asynchronously as it arrives by a matrix-based cooperative stereo-matching network that emulates the functionality of a Spiking Neural Network. Based on the principles of within-disparity continuity and cross-disparity uniqueness, the cooperative network computes a disparity value for each input event via a Winner-Takes-All mechanism. The output of the network, sampled at 0.1 s intervals, produces a dynamic disparity map. Results demonstrate real-time performance of our system on the robot in the presence of noise and ambiguities.
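The cooperative matching idea can be illustrated with a toy, one-dimensional sketch. This is not the network that runs on the iCub: the class name, the parameters (time_window, excite, inhibit) and the scoring formula are all assumptions chosen for illustration. Candidate disparities are scored by temporal coincidence, supported by active neighbours at the same disparity (continuity), suppressed by rival disparities at the same pixel (uniqueness), and the highest score wins.

```python
class CooperativeStereo:
    """Toy 1-D cooperative network over (x, disparity) cells (illustrative only)."""

    def __init__(self, max_disparity, time_window=0.01, excite=0.5, inhibit=0.5):
        self.max_disparity = max_disparity
        self.time_window = time_window   # max time gap (s) for a left/right match
        self.excite = excite             # weight of same-disparity neighbour support
        self.inhibit = inhibit           # weight of rival-disparity suppression
        self.activity = {}               # (x_left, d) -> activity of the cell
        self.last_right = {}             # x_right -> timestamp of latest right event

    def on_right_event(self, x, t):
        self.last_right[x] = t

    def on_left_event(self, x, t):
        """Process one left-camera event; return the winning disparity or None."""
        scores = {}
        for d in range(self.max_disparity + 1):
            t_r = self.last_right.get(x - d)
            if t_r is None or abs(t - t_r) > self.time_window:
                continue
            # Temporal coincidence: closer timestamps give a stronger match.
            match = 1.0 - abs(t - t_r) / self.time_window
            # Within-disparity continuity: neighbours at the same disparity help.
            support = (self.activity.get((x - 1, d), 0.0)
                       + self.activity.get((x + 1, d), 0.0))
            scores[d] = match + self.excite * support
        if not scores:
            return None
        # Cross-disparity uniqueness: every candidate is inhibited by its rivals.
        total = sum(scores.values())
        inhibited = {d: s - self.inhibit * (total - s) for d, s in scores.items()}
        # Winner-Takes-All over the surviving candidates.
        winner = max(inhibited, key=inhibited.get)
        self.activity[(x, winner)] = max(inhibited[winner], 0.0)
        return winner


net = CooperativeStereo(max_disparity=4)
net.on_right_event(10, 0.0)
net.on_right_event(11, 0.001)
first = net.on_left_event(12, 0.0001)   # matches the right event at x=10 best
second = net.on_left_event(13, 0.0011)  # also supported by the neighbour at x=12
```

Because each event only touches the cells for its own pixel and their immediate neighbours, the work per event stays small, which is what makes asynchronous, per-event processing practical.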
Natural Multi-modal Robot Tele-operation using Qualitative Spatial Relations
We devised an approach for teleoperating a mobile robot based on qualitative spatial relations, which are given through speech-based and deictic commands. Given a workspace containing a robot, a user, and some objects, we exploit fuzzy reasoning criteria to describe the pertinence map between the locations in the workspace and the incrementally acquired qualitative commands. We discuss the modularity of the reasoning technique through use cases that address a conjunction of spatial kernels. In particular, we address the problem of finding a suitable target location from a set of qualitative spatial relations based on symbolic reasoning and Monte Carlo simulations. Our architecture is analyzed in a scenario with simple kernels and an almost-perfect perception of the environment. The presented approach is modular and scalable, and it could also be exploited to design applications that involve multi-modal qualitative interactions.
[Publication] [GitHub]
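As a rough illustration of combining spatial kernels and sampling a target location, the sketch below assumes two hypothetical kernels (near and right_of), a fuzzy AND implemented as a minimum, and uniform Monte Carlo sampling of the workspace. The kernel shapes and all parameter values are assumptions, not the ones used in the project.

```python
import math
import random


def near(obj, scale=1.0):
    """Fuzzy kernel: membership decays with distance from obj (assumed Gaussian)."""
    def kernel(p):
        return math.exp(-math.dist(p, obj) ** 2 / (2 * scale ** 2))
    return kernel


def right_of(obj, softness=0.5):
    """Fuzzy kernel: membership grows as p moves toward +x of obj (assumed sigmoid)."""
    def kernel(p):
        return 1.0 / (1.0 + math.exp(-(p[0] - obj[0]) / softness))
    return kernel


def find_target(kernels, bounds, samples=5000, seed=0):
    """Sample locations uniformly and keep the one with the highest joint membership."""
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    best_p, best_mu = None, -1.0
    for _ in range(samples):
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        mu = min(k(p) for k in kernels)   # fuzzy conjunction of all commands
        if mu > best_mu:
            best_p, best_mu = p, mu
    return best_p, best_mu


# "Go near the box, to its right", with the box assumed at (2, 2).
box = (2.0, 2.0)
target, membership = find_target([near(box), right_of(box)], ((0, 5), (0, 5)))
```

New commands can be added simply by appending another kernel to the list, which is one way to read the modularity claimed above.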
Reinforcement Learning-based Modeling and Runtime Repair of Safe Stand-up Routine for a Bipedal Robot
Standing up is a critical recovery task for bipedal locomotion because falls are frequent in legged humanoid robots. A reliable stand-up strategy is required to ensure that the robot avoids unsafe states (collisions, falls, etc.) that could cause critical failure. We use epsilon-greedy Q-Learning around a state space formed using a scripted action sequence to model the stand-up routine as a parametric Discrete Time Markov Chain. We then use Probabilistic Model Checking and repair the learned model greedily to minimize the reachability of unsafe states. We focus on monitoring the robot at runtime to catch discrepancies between the learned and real models caused by sudden environmental changes. We test the framework extensively by introducing synthetic faults in the robot model that were absent during initial modeling. Results illustrate that our framework can successfully repair learned strategies when the faults introduced are not catastrophic.
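The quantity being minimized, the probability of eventually reaching an unsafe state in a Discrete Time Markov Chain, can be computed with a simple fixed-point iteration. The sketch below is only an illustration: the state names and transition probabilities are invented, and a real probabilistic model checker would be used in practice.

```python
def reach_probability(transitions, unsafe, iterations=1000):
    """Probability of eventually reaching an unsafe state from each state.

    transitions: {state: {next_state: probability}}; an empty dict marks an
    absorbing state. Every state must appear as a key.
    """
    prob = {s: (1.0 if s in unsafe else 0.0) for s in transitions}
    for _ in range(iterations):
        for s, succ in transitions.items():
            if s in unsafe or not succ:
                continue
            prob[s] = sum(p * prob[n] for n, p in succ.items())
    return prob


# Hypothetical stand-up model; states and numbers are illustrative only.
learned = {
    "lying":    {"crouch": 0.9, "fall": 0.1},
    "crouch":   {"standing": 0.8, "lying": 0.1, "fall": 0.1},
    "standing": {},
    "fall":     {},
}
# A candidate repair shifts probability mass away from the unsafe transition.
repaired = dict(learned, crouch={"standing": 0.85, "lying": 0.1, "fall": 0.05})

p_before = reach_probability(learned, {"fall"})["lying"]
p_after = reach_probability(repaired, {"fall"})["lying"]
```

A greedy repair loop would try such parameter changes one at a time and keep the change that lowers the reachability value the most.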
VizLens / ApplianceReader: A Wearable, Crowd-sourced, Vision-based System to Make Appliances Accessible for Visually Impaired Users
Visually impaired people have difficulty using everyday appliances in unfamiliar setups. ApplianceReader combines a wearable point-of-view camera with on-demand crowdsourcing and computer vision to make appliance interfaces accessible. The system sends pre-processed photos of the unseen appliance interface to online crowd workers, who work in parallel to quickly label and describe elements of the interface in depth. During appliance operation, the system takes in a live video stream from the wearable camera (e.g., Google Glass). Based on the task to be done, we use object recognition and feature matching to find targets within the live video stream. Computer vision techniques are used to track the user's finger pointing at the controls. The system then uses audio feedback to intuitively guide the user's finger from its current location to the target location, minimizing the number of turns and backtracking. This enables blind users to interactively explore and use appliances without asking the crowd repeatedly. ApplianceReader broadly demonstrates the potential of hybrid approaches that combine the reliability of human knowledge and the speed of automation to effectively realize intelligent, interactive access technology.
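One simple way to keep the number of turns low is to correct one axis completely before the other, so the finger changes direction at most once. The function below is a hypothetical sketch of that idea only; the cue words, the pixel tolerance, and the axis order are assumptions and not the system's actual feedback policy.

```python
def guidance(finger, target, tolerance=10):
    """Return an audio cue for the tracked finger (pixel coordinates, y grows downward)."""
    dx = target[0] - finger[0]
    dy = target[1] - finger[1]
    # Finish the horizontal correction first so the path has at most one turn.
    if abs(dx) > tolerance:
        return "right" if dx > 0 else "left"
    if abs(dy) > tolerance:
        return "down" if dy > 0 else "up"
    return "on target"


cue = guidance(finger=(100, 100), target=(200, 50))
```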
Robot Path-Planning in Competitive Grid Games using Nash Q-Learning
Following up on the original work of Dr. Junling Hu and Prof. Michael P. Wellman on Nash Q-Learning applied to cooperative scenarios in general-sum stochastic games, we devised a Q-Learning variant applied to a game-theoretic predator-prey pursuit scenario in which multiple agents compete for available resources in order to win. We developed an environment with two agents: a "guardian", which protects a territory, and a "spy", whose job is to evade the guardian and collect sensitive information. We then formed joint state-action pairs over the valid directions of movement for each agent and performed controlled experiments to observe how the Q-values converged. Employing the NashQ algorithm on both agents, we observed that in this competitive environment both agents were able to learn efficiently in the Learning stage, and that during the Planning stage they acted in ways that gave each of them a 50-50 chance of success when run simultaneously. This was thus a clear demonstration of the efficiency of the NashQ Learning algorithm in a multi-agent competitive environment.
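The core of a Nash Q-Learning step is to treat the two agents' Q-values at the next state as a stage game, find an equilibrium of that game, and use each agent's equilibrium payoff in place of the usual max. The sketch below is a simplification under stated assumptions: it searches only for pure-strategy equilibria (the general algorithm needs mixed equilibria, and a competitive game may have no pure one), and the payoff matrices are made up.

```python
def pure_nash(q_a, q_b):
    """Pure-strategy Nash equilibria of a two-player stage game.

    q_a[i][j] and q_b[i][j] are the Q-values of agents A and B when A plays
    action i and B plays action j. Returns a list of (i, j) pairs.
    """
    rows, cols = len(q_a), len(q_a[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            a_best = all(q_a[i][j] >= q_a[k][j] for k in range(rows))
            b_best = all(q_b[i][j] >= q_b[i][l] for l in range(cols))
            if a_best and b_best:
                equilibria.append((i, j))
    return equilibria


def nash_q_update(q, reward, nash_value, alpha=0.1, gamma=0.9):
    """One Nash-Q update for a single joint state-action entry of one agent."""
    return (1 - alpha) * q + alpha * (reward + gamma * nash_value)


# Illustrative Q-values at the next state (not from the experiments).
q_a = [[3, 0], [5, 1]]
q_b = [[3, 5], [0, 1]]
i, j = pure_nash(q_a, q_b)[0]
new_q_a = nash_q_update(q=0.0, reward=1.0, nash_value=q_a[i][j])
```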
Multidirectional Scratch Detection and Image Restoration
Detection of multidirectional scratches in old films and digital images poses a serious problem in the domain of digital film restoration. Given the variability of such scratches, it is difficult to devise a robust algorithm that can automatically detect all such defects and thereby lead to successful image reconstruction. Few algorithms in the literature deal with the reconstruction of images corrupted by slanted scratches. For this problem, we propose a detection algorithm based on binary image formation, the Hough transform, image rotation, and length-based thresholding. The scratches thus detected are superimposed on the original image, and subsequent restoration is achieved using a bi-directional interpolation module.
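The Hough transform and length-thresholding steps can be sketched in plain Python as below. This is a minimal illustration, not the project's implementation: the accumulator resolution is an assumption, and the vote count is used as a stand-in for scratch length (min_votes plays the role of the length threshold). The rotation and interpolation stages are not shown.

```python
import math


def hough_lines(binary, n_theta=180, min_votes=5):
    """Detect straight lines in a binary image given as a list of 0/1 rows.

    Returns (votes, theta_degrees, rho) tuples with at least min_votes votes,
    strongest first, using the normal form rho = x*cos(theta) + y*sin(theta).
    """
    thetas = [math.pi * k / n_theta for k in range(n_theta)]
    acc = {}
    for y, row in enumerate(binary):
        for x, value in enumerate(row):
            if not value:
                continue
            # Each foreground pixel votes for every line passing through it.
            for k, th in enumerate(thetas):
                rho = round(x * math.cos(th) + y * math.sin(th))
                acc[(k, rho)] = acc.get((k, rho), 0) + 1
    lines = [(v, k * 180 / n_theta, rho)
             for (k, rho), v in acc.items() if v >= min_votes]
    return sorted(lines, reverse=True)


# A 10x10 test image with one slanted scratch along the main diagonal.
image = [[1 if x == y else 0 for x in range(10)] for y in range(10)]
candidates = hough_lines(image)
```

Nearby angles also collect many votes for a short line, so a practical detector would keep only local maxima of the accumulator before thresholding.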
Adaptive Gamma Correction using Brightness-induced Weighted Histogram for Contrast Enhancement
This work presents an efficient approach based on the adaptive gamma correction method to enhance the visual contrast in digital images. Unlike conventional enhancement methods that neglect the initial distribution of histograms in input images, the proposed algorithm processes the 2D histograms of underexposed and overexposed images differently from the histograms of normally exposed images. The gamma values are adapted based on the cumulative distribution function computed from the 2D histograms of the over- and underexposed images, or on the probability density functions approximated from the transformed histograms of the normally exposed images. Experimental results over a wide variety of test images indicate that the proposed method yields comparable or better enhancement than some of the most representative algorithms for contrast enhancement. The computational complexity of the proposed algorithm also compares very favorably with that of the existing state of the art.
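For orientation, the sketch below shows the generic adaptive gamma correction scheme that this line of work builds on, using an ordinary 1D histogram: the histogram is smoothed into a weighted distribution, and each intensity level gets its own gamma equal to one minus the weighted cumulative distribution. It does not reproduce the 2D-histogram or exposure-dependent processing described above, and the weighting exponent alpha is an assumed parameter.

```python
def adaptive_gamma(pixels, alpha=0.5, l_max=255):
    """Apply per-level gamma correction to a flat list of integer intensities."""
    hist = [0] * (l_max + 1)
    for v in pixels:
        hist[v] += 1
    pdf = [c / len(pixels) for c in hist]
    p_min, p_max = min(pdf), max(pdf)
    if p_max == p_min:
        pdf_w = pdf[:]                      # flat histogram: nothing to reweight
    else:
        # Weighted distribution: compress large peaks, lift small bins.
        pdf_w = [p_max * ((p - p_min) / (p_max - p_min)) ** alpha for p in pdf]
    total = sum(pdf_w)
    lut, cdf = [0], pdf_w[0] / total        # level 0 always maps to 0
    for level in range(1, l_max + 1):
        cdf += pdf_w[level] / total
        gamma = max(1.0 - cdf, 0.0)         # darker levels get a larger gamma
        lut.append(round(l_max * (level / l_max) ** gamma))
    return [lut[v] for v in pixels]


# A synthetic underexposed image (values are illustrative).
dark = [10, 20, 30, 40, 50] * 20
enhanced = adaptive_gamma(dark)
```

Since gamma never exceeds one, every level is mapped to an equal or brighter value, and the mapping stays monotonic, so the intensity ordering of the pixels is preserved.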
Control of Complex Networks
At the Advanced Digital and Embedded System Laboratory, Jadavpur University, I worked on controllability of scale-free Complex Networks, where we studied network architectures following the Barabási–Albert model, the Erdős–Rényi model, and power-law degree distributions. We implemented maximum matching heuristics to identify driver nodes within the dense dynamic network. These driver nodes can be used to emit control signals that actuate link repair and efficient data rerouting, so that both transmission delay and downtime are minimized.
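In the structural controllability framework, the driver nodes of a directed network are the nodes left unmatched by a maximum matching, that is, the nodes with no matched incoming link. The sketch below uses an exact augmenting-path matching rather than the heuristics mentioned above, which is practical only for small graphs; the function name and the example graphs are illustrative.

```python
def driver_nodes(nodes, edges):
    """Return driver nodes of a directed graph via maximum matching.

    nodes: list of node labels; edges: list of directed (source, target) pairs.
    A node is matched when one of its incoming edges is in the matching.
    """
    succ = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
    matched_by = {}   # target node -> source node of its matched incoming edge

    def augment(u, seen):
        # Try to match an outgoing edge of u, re-routing earlier matches if needed.
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_by or augment(matched_by[v], seen):
                matched_by[v] = u
                return True
        return False

    for u in nodes:
        augment(u, set())
    drivers = [v for v in nodes if v not in matched_by]
    # With a perfect matching, a single (arbitrary) driver node still suffices.
    return drivers or [nodes[0]]


chain = driver_nodes([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])   # -> [1]
star = driver_nodes([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)])
```

The chain needs one driver node while the star needs three, which reflects the known result that hubs are poor places to inject control signals.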
Differential Evolution-based Constrained Optimization
As a visiting student at the Indian Statistical Institute, I worked on non-convex optimization using Evolutionary Computation. We first approached the problem of static constrained optimization with a variant of the ensemble Differential Evolution (DE) algorithm. Although inequality constraints were easy to handle, equality constraints posed greater difficulties. We employed Support Vector Clustering to use support vectors as constraint boundary trackers. A Support Vector Machine with exponential kernels was used to classify the population into boundary and interior agents.
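To make the inequality-constrained case concrete, the sketch below uses the classic DE/rand/1/bin scheme with feasibility-based selection (a feasible trial beats an infeasible one, and otherwise the lower violation or lower objective wins). It is not the ensemble variant or the support-vector machinery described above, equality constraints are not handled, and the test problem and all parameter values are assumptions.

```python
import random


def violation(x, inequalities):
    """Total violation of constraints written in the form g(x) <= 0."""
    return sum(max(0.0, g(x)) for g in inequalities)


def better(f_a, v_a, f_b, v_b):
    """Feasibility rules: compare objectives only when both are feasible."""
    if v_a == 0.0 and v_b == 0.0:
        return f_a <= f_b
    return v_a <= v_b


def differential_evolution(f, inequalities, bounds, pop_size=30, gens=200,
                           F=0.6, CR=0.9, seed=1):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [(f(x), violation(x, inequalities)) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)      # guarantees one mutated coordinate
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            f_t, v_t = f(trial), violation(trial, inequalities)
            if better(f_t, v_t, *fit[i]):
                pop[i], fit[i] = trial, (f_t, v_t)
    best = min(range(pop_size), key=lambda i: (fit[i][1], fit[i][0]))
    return pop[best], fit[best][0], fit[best][1]


# Minimize x^2 + y^2 subject to x + y >= 1, written as 1 - x - y <= 0.
x_best, f_best, v_best = differential_evolution(
    lambda p: p[0] ** 2 + p[1] ** 2,
    [lambda p: 1 - p[0] - p[1]],
    [(-2, 2), (-2, 2)],
)
```

The constrained optimum of this toy problem lies on the constraint boundary at (0.5, 0.5), which is exactly the kind of region where tracking the boundary becomes important.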