Selected Projects

  1. [Conformal NILM] Building on our previously accepted [paper] at BuildSys'22, we obtained distribution-free uncertainty estimates using Conformal Prediction. Calibration methods such as Isotonic Regression and Conformal Prediction are applied on top of SOTA homoscedastic/heteroscedastic S2P models and a quantile S2P model. We also applied smoothing to mitigate identical score-function outputs for sparsely used appliances.
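The core of split-conformal calibration can be sketched as follows. This is a minimal illustration with synthetic data, not the project's actual model or score function; the absolute-residual score and the finite-sample quantile correction are standard choices assumed here.

```python
import numpy as np

def conformal_interval(cal_preds, cal_targets, test_preds, alpha=0.1):
    """Split-conformal prediction: turn point predictions into
    distribution-free intervals with ~(1 - alpha) coverage."""
    n = len(cal_preds)
    # Nonconformity score: absolute residual on a held-out calibration set.
    scores = np.abs(cal_targets - cal_preds)
    # Finite-sample-corrected quantile of the calibration scores.
    q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")
    return test_preds - q, test_preds + q

# Toy usage with synthetic data (illustration only).
rng = np.random.default_rng(0)
y_cal = rng.normal(0, 1, 500)
preds_cal = y_cal + rng.normal(0, 0.3, 500)   # imperfect point predictions
lo, hi = conformal_interval(preds_cal, y_cal, np.array([0.0]))
```

The interval width is driven entirely by the calibration residuals, which is what makes the guarantee distribution-free.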

  2. [Active NILM] Labeled data is scarce in the energy domain because collecting it requires installing appliance-specific sensors, which raises both cost and privacy concerns. We applied active learning on the Pecan Street Dataport dataset (25 houses in Austin, Texas) and achieved comparable, if not better, performance using only 65% of the data. Acquisition functions such as entropy, mutual information, rank-based sampling, and round robin were used for uncertainty-driven sample selection.
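An entropy acquisition function, one of the strategies named above, can be sketched like this. The pool of softmax outputs is a toy stand-in, not Pecan Street data:

```python
import numpy as np

def entropy_acquisition(probs, k):
    """Rank unlabeled samples by predictive entropy and return the
    indices of the k most uncertain ones to send for labeling."""
    eps = 1e-12  # avoid log(0)
    ent = -np.sum(probs * np.log(probs + eps), axis=1)
    return np.argsort(ent)[::-1][:k]

# Toy pool of model softmax outputs: one confident, one uncertain sample.
pool = np.array([[0.98, 0.01, 0.01],
                 [0.34, 0.33, 0.33]])
picked = entropy_acquisition(pool, k=1)   # selects the near-uniform row
```

Labeling budget is spent where the model is least certain, which is how a fraction of the data can match full-data performance.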

  3. [Uncertain NILM] Neural networks traditionally give a point prediction, which conveys no information about confidence. Providing an interval instead helps users make informed decisions; in the energy domain, such information can help reduce energy consumption by up to 15%. We were the first to quantify uncertainty in energy disaggregation. Approximate Bayesian methods such as MC Dropout, Deep Ensembles, and Bootstrap were used, and Isotonic Regression recalibrated the resulting uncertainties. This work is published in ACM BuildSys'22, a CORE A ranked CS conference.
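The MC Dropout idea can be sketched without any deep learning framework: keep dropout active at test time and aggregate several stochastic forward passes. The `noisy_forward` function below is a hypothetical stand-in for a trained network with dropout left on, not the paper's model:

```python
import numpy as np

def mc_dropout_predict(forward, x, n_samples=50):
    """Approximate Bayesian inference via MC Dropout: run several
    stochastic forward passes and report predictive mean and std."""
    draws = np.stack([forward(x) for _ in range(n_samples)])
    return draws.mean(axis=0), draws.std(axis=0)

# Hypothetical stand-in for a network with dropout active at test time.
rng = np.random.default_rng(0)
def noisy_forward(x):
    mask = rng.random(x.shape) > 0.2      # dropout with p = 0.2
    return (x * mask / 0.8).sum(axis=-1)  # inverted-dropout scaling

mean, std = mc_dropout_predict(noisy_forward, np.ones((4, 10)))
```

The per-sample standard deviation is the raw uncertainty that recalibration methods such as isotonic regression then adjust.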

  4. [Learning from Synthetic Data] Image datasets such as CIFAR-10 and ImageNet-100 are manually annotated, which introduces labeling errors and privacy issues, and obtaining them is often difficult and expensive. We use Generative Adversarial Networks (GANs) on top of self-supervised learning techniques to generate synthetic data. Combining these images with real ones, we train networks for downstream classification tasks and achieve better accuracy. We then used Stable Diffusion with domain-specific prompt engineering to obtain even more realistic images while maintaining a fine tradeoff between fidelity and diversity.
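The real-plus-synthetic training mix can be sketched as a simple dataset-combination step. The arrays and the 0.5 synthetic fraction below are illustrative assumptions, not the project's actual data or ratio:

```python
import numpy as np

def mix_real_synthetic(real_x, real_y, synth_x, synth_y, synth_frac=0.5, seed=0):
    """Augment a real training set with a fraction of synthetic samples,
    then shuffle so minibatches see both sources."""
    rng = np.random.default_rng(seed)
    n_synth = int(len(real_x) * synth_frac)
    idx = rng.choice(len(synth_x), size=n_synth, replace=False)
    x = np.concatenate([real_x, synth_x[idx]])
    y = np.concatenate([real_y, synth_y[idx]])
    perm = rng.permutation(len(x))
    return x[perm], y[perm]

# Toy arrays standing in for real images and GAN/diffusion samples.
real_x, real_y = np.zeros((100, 32, 32, 3)), np.zeros(100)
synth_x, synth_y = np.ones((200, 32, 32, 3)), np.ones(200)
x, y = mix_real_synthetic(real_x, real_y, synth_x, synth_y)
```

The `synth_frac` knob is where the fidelity/diversity tradeoff shows up in practice: more synthetic data adds diversity but can dilute real-image fidelity.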

  5. [Developing test generation techniques from decision trees learnt for insurance migration systems to reduce training data] Insurance systems introduce new schemes that are either applicable or not applicable to an individual. Eligibility is determined by a number of variables, among which there often exist non-intuitive constraints. A decision tree is learnt and fine-tuned repeatedly by adding test cases generated by parsing the tree; the test cases are generated keeping in mind the constraints each variable may have with the others. This resulted in a 7% jump in accuracy in software testing systems.
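Parsing a learnt tree into test cases can be sketched with scikit-learn's tree internals: each root-to-leaf path yields one set of feature constraints, and each path is a candidate test case. The toy eligibility data and the path representation are assumptions for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tree_to_testcases(tree):
    """Walk a fitted decision tree and emit one constraint path per leaf;
    each path is a list of (feature, comparator, threshold) triples."""
    t = tree.tree_
    cases = []
    def walk(node, path):
        if t.children_left[node] == -1:   # leaf node reached
            cases.append(path)
            return
        f, thr = t.feature[node], t.threshold[node]
        walk(t.children_left[node], path + [(f, "<=", thr)])
        walk(t.children_right[node], path + [(f, ">", thr)])
    walk(0, [])
    return cases

# Toy eligibility data: applicable iff the single feature exceeds 0.5.
X = np.array([[0.2], [0.4], [0.6], [0.8]])
y = np.array([0, 0, 1, 1])
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
paths = tree_to_testcases(clf)   # one constraint path per leaf
```

Generated test cases can then be checked against the migrated system, and any misclassified cases fed back to refine the tree.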

  6. [Compilers] A compiler built from scratch in Python. It implements the standard compilation stages, namely lexical analysis, parsing, semantic analysis, and optimization, along with a user-friendly interface and advanced features such as error handling. We also introduced a single "var" keyword to cover the different ways variables can be declared.
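The lexical-analysis stage can be sketched with a small regex-driven tokenizer. The token set and the sample source line are illustrative, not the project's actual grammar:

```python
import re

# Minimal lexer sketch for a compiler front end: token specs are tried
# in order, and any unrecognized character raises a lexical error.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
]

def tokenize(src):
    tokens, pos = [], 0
    pattern = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))
    while pos < len(src):
        m = pattern.match(src, pos)
        if not m:
            raise SyntaxError(f"lexical error at position {pos}")
        if m.lastgroup != "SKIP":   # drop whitespace tokens
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

toks = tokenize("var x = 41 + 1")
```

The token stream produced here is what the parsing stage consumes; unmatched input surfaces as an error at a precise position, which is the basis for friendly error handling.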