Projects

Work Projects

Autonomous Pick-and-Place with XArm5 Robotic Arm
I completed this project while working as an AI Researcher at UNC Blue Sky Innovations at the University of North Carolina (UNC) at Chapel Hill, NC, USA. The work focused on enabling a robotic arm to autonomously recognize and pick up objects using computer vision and reinforcement learning. I integrated deep learning–based object detection (OpenCV and CNN models) with ROS and MoveIt for real-time motion planning and manipulation tasks.
Implementation Guide 1: Setting up Development Environment for the XArm5 Robotic Arm
Implementation Guide 2: Picking Up an Object Autonomously Using the XArm5 Robotic Arm
Data Scientist — Computer Vision for Smart Agriculture
I completed this project while working as a Data Scientist at Infinite Acres (owned by 80 Acres Farms), Hamilton, Ohio, USA. I built an end-to-end deep learning pipeline in Azure Databricks to identify plant diseases and extract phenotypic traits from crop images. I collaborated with plant scientists using Azure ML labeling tools to curate training data, developed and deployed a multi-label classification model to detect early disease symptoms, and applied OpenCV methods to quantify leaf area and fruit size. I also created Power BI dashboards to visualize plant health and growth metrics, enabling data-driven monitoring of controlled-environment farms.
Stream Metabolism Modeling using Machine Learning
I completed this project while working as a Data Scientist at the Duke River Center, Nicholas School of the Environment, Duke University, USA. The overarching goal of the StreamPULSE project was to model stream metabolism in rivers using dissolved oxygen data collected from in-stream sensors. My primary work focused on addressing missing and noisy data in these real-time sensor streams. I evaluated multiple imputation methods for filling data gaps and found that simple statistical approaches outperformed recurrent neural networks. Using unsupervised anomaly detection with Robust Random Cut Forest (RRCF), I identified sensor malfunctions and natural events that caused outliers. Additionally, I mentored two undergraduate team members, guiding them in Python-based data analysis workflows and supporting their development of related visualization tools for the project.
Poster PDF: Developing Data Cleaning Tools for StreamPULSE users
Executive Summary PDF: Executive Summary Slides
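As a minimal sketch of the kind of simple statistical gap filling described for StreamPULSE above, the Python snippet below linearly interpolates short runs of missing sensor readings. Linear interpolation, the `max_gap` limit, and the function name `fill_short_gaps` are illustrative assumptions; the project write-up does not state which specific methods were compared.

```python
from typing import List, Optional


def fill_short_gaps(values: List[Optional[float]], max_gap: int = 3) -> List[Optional[float]]:
    """Linearly interpolate runs of None that are at most max_gap long.

    Gaps touching either end of the series, or longer than max_gap,
    are left as None (hypothetical policy chosen for this sketch).
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i  # index of the first missing value in this run
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first observed value after the run
        gap_len = end - start
        if start == 0 or end == n or gap_len > max_gap:
            continue  # no anchor on one side, or gap too long to trust
        left, right = filled[start - 1], filled[end]
        step = (right - left) / (gap_len + 1)
        for k in range(gap_len):
            filled[start + k] = left + step * (k + 1)
    return filled


# Made-up dissolved oxygen readings (mg/L) with one interior gap and one trailing gap.
readings = [8.0, None, None, 9.5, None]
print(fill_short_gaps(readings))
```

The trailing gap is left unfilled because there is no later observation to interpolate toward; extrapolating it would be a separate design decision.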

Academic Projects

Shielding-Induced Safe Reinforcement Learning for Drone Navigation
This project was completed as part of the graduate course AI Safety and Assessment (CSE 598) at Arizona State University in Spring 2024. The work explored how safety constraints can be embedded into reinforcement learning (RL) using a technique called shielding—an approach that evaluates and modifies an agent’s actions to prevent unsafe behaviors during training. Using NVIDIA Isaac Gym’s Ingenuity drone environment, I introduced obstacles and modified the observation space to include obstacle-relative positions, enabling the agent to learn safe navigation. Three shielding strategies were implemented and tested with an Advantage Actor-Critic (A2C) agent: Hard (action replacement), Soft (feedback-based), and Hybrid (combining both). Results showed that while shielding slowed early convergence compared to the baseline, it improved stability and ensured collision-free flight trajectories. This work demonstrates how shielding mechanisms can contribute to safer and more interpretable RL for autonomous aerial systems.
Project Report PDF: AI Shielding for Drone Navigation — Report
Presentation PDF: AI Safety Drone Flight Presentation
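The hard (action replacement) shield described above can be illustrated with a small Python sketch. It assumes a simplified kinematic model in which an action is a displacement added to the current position; the Isaac Gym Ingenuity environment uses a different action space, and `hard_shield`, `safe_radius`, and the fallback action are hypothetical names introduced only for this example.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def hard_shield(
    position: Vec3,
    action: Vec3,
    obstacles: Sequence[Vec3],
    safe_radius: float,
    fallback: Vec3 = (0.0, 0.0, 0.0),
) -> Vec3:
    """Return the proposed action if it is safe, otherwise the fallback.

    An action counts as unsafe when the predicted next position
    (position + action, an assumed toy dynamics model) lies within
    safe_radius of any obstacle.
    """
    next_position = tuple(p + a for p, a in zip(position, action))
    for obstacle in obstacles:
        if math.dist(next_position, obstacle) < safe_radius:
            return fallback  # replace the unsafe action
    return action


obstacles = [(1.2, 0.0, 0.0)]
# Moving toward the obstacle is replaced by the fallback (hover in place).
print(hard_shield((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), obstacles, safe_radius=0.5))
# Moving away from the obstacle passes through unchanged.
print(hard_shield((0.0, 0.0, 0.0), (-1.0, 0.0, 0.0), obstacles, safe_radius=0.5))
```

A soft shield would instead leave the action in place and feed the violation back to the agent, for example as a reward penalty; that variant is not shown here.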
Integrating Deep Learning with Object-Based Image Analysis (OBIA) in Remote Sensing
This project was completed as part of the graduate course Machine Learning for Remote Sensing (CSE 598) at Arizona State University in Spring 2024. The work explored how deep learning can be integrated with Object-Based Image Analysis (OBIA) to combine the strengths of both approaches—automatic feature extraction in DL and spatial contextual understanding in OBIA—for remote sensing image segmentation. Using three GeoBench datasets (NZ-Cattle, NEON-Tree, and PV4GER), I implemented a ResNet-50–based segmentation model and compared two pretraining strategies: standard ImageNet weights and contrastive DetCon pretraining (DeepMind). The experiments, run locally on dual RTX 3090 GPUs, evaluated mean Precision and Intersection-over-Union (IoU) over multiple seeds with statistical tests. Results showed that DetCon-pretrained models achieved significantly higher IoU for NZ-Cattle and PV4GER datasets, confirming the advantage of object-level pretraining in improving segmentation performance. The study demonstrates the potential of hybrid DL + OBIA frameworks for geospatial analysis and sets a foundation for future large-scale, multi-class image segmentation research.
Project Report PDF: OBIA + DL Final Report
Presentation PDF: OBIA + DL Presentation Slides
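For reference, the Intersection-over-Union metric used in the evaluation above can be computed for a pair of binary masks as in the Python sketch below. The nested-list mask representation and the choice to return 1.0 when both masks are empty are assumptions made for this example, not details taken from the project report.

```python
from typing import Sequence


def binary_iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection-over-Union of two equally sized binary masks."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Convention assumed here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
print(binary_iou(predicted, ground_truth))  # 2 overlapping pixels / 4 pixels in the union = 0.5
```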
Comparative Analysis of Machine Learning and Neural Network Models
This project was completed as part of the graduate course Machine Learning (EEE 549) at Arizona State University in Fall 2023. The work compared classical machine learning algorithms (Logistic Regression, Support Vector Machines (SVM), and K-Nearest Neighbors (KNN) with PCA) against neural networks, including Feedforward Neural Networks (FNN) and Convolutional Neural Networks (CNN), across three benchmark datasets: UCI Adult Income, Breast Cancer Wisconsin, and Fashion-MNIST. I prepared and split all datasets into training, validation, and test sets, implemented normalization pipelines, and ran all models locally on GPU using NVIDIA RAPIDS for accelerated computation. Hyperparameter tuning was performed via grid search, with ROC-AUC, F1, and Precision-Recall metrics used for evaluation. The results showed that Logistic Regression (with L2 regularization) performed best on the Breast Cancer data (AUROC ≈ 0.99), SVM with an RBF kernel generalized well on the UCI Adult dataset (Accuracy ≈ 0.84), and CNN outperformed all other methods on the Fashion-MNIST dataset (Accuracy ≈ 0.917), highlighting the suitability of deep models for high-dimensional image data.
Project Report PDF: EEE 549 Final Project Report
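To make the evaluation metrics above concrete, here is a small Python sketch that computes precision, recall, and F1 for binary labels from scratch. The function name and the toy labels are made up for illustration; the project itself may have relied on library implementations.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Undefined ratios are reported as 0.0 in this sketch.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
print(precision_recall_f1(y_true, y_pred))
```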
Course Projects — Database Systems (CSE 511, Arizona State University)
These two projects were completed as part of the graduate course Data Processing at Scale (CSE 511) at Arizona State University in Fall 2023. In Project 1 (NoSQL), I implemented Python functions to query and filter business data stored in a UnQLite database based on city and geographic location, using geospatial computations to retrieve nearby results and calculate distances from given coordinates. This provided hands-on experience with NoSQL databases, query optimization, and geospatial data processing. In Project 2 (Hot Spot Analysis), I applied big data tools—Apache Spark, Scala, and Hadoop—to perform spatial and spatio-temporal analysis on the New York City Yellow Taxi dataset (2009–2012). I implemented the ST_Contains function for hot zone analysis and computed spatial hot cells using the Getis–Ord statistic to identify regions of high taxi activity. This project reinforced concepts in distributed computing, spatial indexing, and GPU-accelerated data exploration using NVIDIA RAPIDS.
Project 1 PDF: NoSQL Database
Project 2 PDF: Hot Spot Analysis
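The distance-based filtering in Project 1 can be sketched in Python as follows. The Haversine formula, the record fields (`name`, `latitude`, `longitude`), and the function names are assumptions chosen for this example; the project description only states that distances from given coordinates were computed, and the UnQLite access code is omitted.

```python
import math
from typing import Dict, List

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def businesses_within(records: List[Dict], lat: float, lon: float, max_km: float) -> List[str]:
    """Names of records (hypothetical schema) within max_km of the query point."""
    return [
        rec["name"]
        for rec in records
        if haversine_km(lat, lon, rec["latitude"], rec["longitude"]) <= max_km
    ]


# Made-up records for illustration only.
sample = [
    {"name": "Cafe A", "latitude": 33.42, "longitude": -111.93},
    {"name": "Shop B", "latitude": 34.05, "longitude": -118.24},
]
print(businesses_within(sample, 33.43, -111.94, max_km=10.0))
```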
Deep Learning for Energy Infrastructure Mapping
I completed this project while serving as Data Science Project Manager at the Energy Data Analytics Lab, Bass Connections Program, Duke University, USA. The interdisciplinary project aimed to advance global energy access by mapping electricity infrastructure from satellite imagery, an affordable alternative to on-the-ground data collection in developing regions. I led a team of seven Duke undergraduates, using Kanban for project management, and contributed to model development and evaluation. We applied U-Net for semantic segmentation to detect energy infrastructure in satellite images, achieving around 75% accuracy, and used PCA and t-SNE to interpret model representations. To improve generalizability across regions, I created and integrated synthetic images into the training pipeline, enriching the dataset and reducing overfitting. The overarching goal was to develop a deployable, geography-agnostic model for accurately identifying power infrastructure to inform energy access decisions.
Project Website: Bass Connections – A Wider Lens on Energy
Team Presentation (YouTube): Deep Learning for Energy Access Decisions
Project Page: bass-connections-2019.github.io
Data Scientist — Evaluation of Forest Conservation Programs in Amazonia Rainforest
This project was completed as part of my Master’s Capstone in the MIDS program at Duke University, in collaboration with Conservation International (CI), USA. The overarching goal was to evaluate the effectiveness of the Bolsa Verde (BV) forest conservation program and understand spatial trends in forest cover loss in the Amazonia rainforest of Brazil from 2011 to 2018. I extracted geospatial data in raster format using GeoPandas, converted it into a tabular form for analysis, and grouped the dataset into clusters based on environmental covariates such as slope, elevation, and distance to roads and cities. Propensity score matching was applied to balance treatment (BV implemented) and control (BV not implemented) regions, followed by logistic and Poisson regression models to assess causal impacts. The analysis revealed that BV was most effective in regions with lower elevation and closer proximity to cities and roads, where deforestation pressures are higher. The findings informed CI’s broader goal of achieving zero net deforestation in Amazonia and were presented through a final white paper and capstone presentation.
Project Report PDF: Evaluation of Forest Conservation Programs — White Paper
Presentation PDF: Capstone Final Presentation
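The matching step in the capstone analysis above can be illustrated with a minimal Python sketch of greedy one-to-one nearest-neighbor matching on precomputed propensity scores. The greedy strategy, the `caliper` threshold, and all identifiers and scores are assumptions for illustration; the white paper may use a different matching procedure.

```python
from typing import Dict, List, Tuple


def match_nearest(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 matching without replacement on propensity scores.

    Each treated unit is paired with the closest unused control unit,
    provided the score difference does not exceed the caliper.
    """
    available = dict(control)  # control units not yet matched
    pairs: List[Tuple[str, str]] = []
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Made-up propensity scores for regions with (treated) and without (control) the program.
treated_scores = {"t1": 0.30, "t2": 0.62}
control_scores = {"c1": 0.28, "c2": 0.60, "c3": 0.90}
print(match_nearest(treated_scores, control_scores))
```

Treated units with no control inside the caliper are dropped from the matched sample, which trades sample size for better covariate balance.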