Technical Projects

Surgical Navigation Registration with ICP

3D point cloud registration for CT-based surgical navigation using the Iterative Closest Point algorithm, with sub-millimeter accuracy

Project Overview

This project extends a surgical navigation system by implementing the full Iterative Closest Point (ICP) algorithm for aligning 3D medical scans. While previous implementations used identity transformations, this complete ICP solution iteratively estimates the optimal registration transformation \(F_{reg}\) that aligns pointer tip positions with bone surface meshes from pre-operative CT data.

The system enables precise registration between physical space and medical imaging data, essential for accurate surgical navigation. The ICP algorithm alternates between finding closest point correspondences and computing optimal rigid transformations until convergence, significantly improving registration accuracy compared to single-pass approaches.

  • Registration Accuracy: <1 mm
  • Convergence Time: <2 s
  • Avg. Iterations: 15

Mathematical Formulation

The core of the ICP algorithm minimizes the registration error between point sets:

ICP Error Minimization:

\[E(F_{reg}) = \sum_{k=1}^{N}\left\| F_{reg}\cdot d_{k} - c_{k}\right\|^{2}\]

where:

\(d_{k} = F_{B,k}^{-1}\cdot F_{A,k}\cdot A_{tip}\)

\(c_{k} =\) closest point on mesh to \(F_{reg}\cdot d_{k}\)

\(F_{reg} =\) registration transformation from B coordinates to CT coordinates

The Kabsch algorithm computes the optimal rotation matrix \(R\) and translation vector \(t\) that minimize the root-mean-square deviation between two paired sets of points.
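As a sketch, the Kabsch step can be written directly in NumPy; `kabsch` is an illustrative helper (not the project's actual code), assuming paired point arrays of shape (N, 3):

```python
import numpy as np

def kabsch(A, B):
    """Optimal rotation R and translation t mapping points A onto B
    in the least-squares sense (a sketch of the Kabsch algorithm)."""
    a_mean, b_mean = A.mean(axis=0), B.mean(axis=0)
    H = (A - a_mean).T @ (B - b_mean)           # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    t = b_mean - R @ a_mean
    return R, t
```

The reflection guard matters in practice: without it, near-planar point sets can yield an improper rotation (determinant −1) that mirrors the anatomy.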

Core Algorithm Implementation

The ICP registration process implements iterative refinement through three main steps: (1) transforming pointer points using the current registration estimate, (2) finding closest mesh correspondences using spatial acceleration structures, and (3) computing optimal transformations via the Kabsch algorithm. Convergence is monitored through error metrics and transformation changes, with early stopping when improvements fall below surgical precision thresholds.
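The three steps can be sketched as follows. This is a simplified version in which sampled surface points in a SciPy KD-tree stand in for the mesh's spatial acceleration structure (the real system matches against triangles); `rigid_fit` and `icp` are hypothetical names:

```python
import numpy as np
from scipy.spatial import cKDTree

def rigid_fit(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst (Kabsch step)."""
    s_mean, d_mean = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - s_mean).T @ (dst - d_mean))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, d_mean - R @ s_mean

def icp(points, surface, max_iters=50, tol=1e-6):
    """Iteratively align `points` to sampled `surface` points (a sketch)."""
    tree = cKDTree(surface)
    R, t = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(max_iters):
        moved = points @ R.T + t                  # (1) apply current estimate
        _, idx = tree.query(moved)                # (2) closest correspondences
        R, t = rigid_fit(points, surface[idx])    # (3) re-fit the transform
        err = np.linalg.norm(points @ R.T + t - surface[idx], axis=1).mean()
        if prev_err - err < tol:                  # stop when improvement stalls
            break
        prev_err = err
    return R, t
```

Note that ICP is only guaranteed to find a local minimum, so a reasonable initial estimate (here the identity) is assumed to be close to the true registration.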

Key Features

Medical Precision

Sub-millimeter accuracy suitable for surgical applications with consistent low error rates across all test cases

Iterative Refinement

Full ICP implementation with convergence monitoring and adaptive stopping criteria for optimal results

Modular Architecture

Reusable components from existing systems extended for enhanced functionality and maintainability

Real-time Updates

Continuous registration updates during surgical procedures with minimal computational overhead

Technologies Used

Core Technologies

Python 3.9+ · NumPy · SciPy · Iterative Closest Point · Kabsch Algorithm · 3D Geometry · Mesh Processing · Unit Testing

Decoder-Only Transformer Language Model (TinyGPT)

Building and training a character-level GPT model from scratch with self-attention mechanisms and causal masking

Project Overview

Implemented a decoder-only Transformer architecture (TinyGPT) for character-level language modeling, trained on the Tiny Shakespeare dataset. The model features multi-head self-attention with causal masking, feed-forward networks with GELU activations, and positional embeddings.

The implementation includes the complete training pipeline with AdamW optimization and gradient checkpointing for memory efficiency, achieving strong character-level results for a model of its size.
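A minimal sketch of one training step with AdamW and gradient checkpointing in PyTorch; the `model` here is a tiny stand-in, not TinyGPT itself, and hyperparameters are illustrative:

```python
import torch
from torch.utils.checkpoint import checkpoint

# Stand-in model for the sketch (TinyGPT would be the stacked decoder blocks).
model = torch.nn.Sequential(
    torch.nn.Linear(8, 32), torch.nn.GELU(), torch.nn.Linear(32, 8)
)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)

def step(x, y):
    # Checkpointing recomputes activations during backward instead of
    # storing them, trading compute for memory.
    out = checkpoint(model, x, use_reentrant=False)
    loss = torch.nn.functional.cross_entropy(out, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```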

  • Parameters: 4.2M
  • Training Loss: 1.8
  • Accuracy: 98.7%

Core Architecture

The decoder-only Transformer architecture processes character sequences through stacked self-attention blocks with causal masking. Each attention head computes weighted combinations of input tokens, while feed-forward networks apply nonlinear transformations. Positional embeddings provide sequence order information, and the final softmax layer generates probability distributions over the vocabulary for next-character prediction.
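A single causal self-attention head can be sketched in NumPy for clarity; names and shapes are illustrative rather than the project's actual code:

```python
import numpy as np

def causal_attention(x, Wq, Wk, Wv):
    """One self-attention head with causal masking.
    x: (T, d_model); Wq/Wk/Wv: (d_model, d_head) projection matrices."""
    T = x.shape[0]
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # scaled dot-product
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf                          # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V
```

The causal mask is what makes the model a decoder: position \(t\) may only attend to positions \(\le t\), so the first token's output depends on that token alone.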

Key Features

From-Scratch Implementation

Built Transformer architecture from first principles without relying on high-level libraries

Character-Level Modeling

Trained on raw character sequences with custom tokenizer implementation

Production-Grade Fine-Tuning

Leveraged Hugging Face transformers for efficient GPT-2 fine-tuning

Memory Efficient

Gradient checkpointing and mixed precision training for handling large models

Technologies Used

Core Technologies

PyTorch 2.0+ · Transformers · Multi-Head Attention · Hugging Face · AdamW Optimizer · GELU Activation · CUDA · Mixed Precision

Probabilistic Robot Navigation with Beam Model

SLAM-based navigation with particle filtering and sensor fusion for robust robotic localization

Project Overview

Implemented a complete probabilistic robot navigation system featuring beam range finder models and odometry motion models for particle filtering-based SLAM. The system computes the likelihood \(P(z \mid s, m)\) of laser measurements given the robot state and map, incorporating four error models for robust sensing in dynamic environments.

The beam model handles multiple measurement scenarios including correct readings, unexpected objects, sensor failures, and random noise, providing robust localization even in challenging conditions.

Beam Range Finder Model

The beam model combines four probability distributions to handle different measurement scenarios:

Total Probability:

\[p = w_{\text{hit}} \cdot p_{\text{hit}} + w_{\text{short}} \cdot p_{\text{short}} + w_{\text{max}} \cdot p_{\text{max}} + w_{\text{rand}} \cdot p_{\text{rand}}\]

Component Distributions:

\[p_{\text{hit}} = \eta \cdot N(r; r_s, \sigma_{\text{hit}}^2)\]

\[p_{\text{short}} = \eta \cdot \lambda_{\text{short}} \cdot \exp(-\lambda_{\text{short}} \cdot r)\]

\[p_{\text{max}} = I(r = z_{\text{max}})\]

\[p_{\text{rand}} = \text{Uniform}(0, z_{\text{max}})\]

Core Implementation

The beam range finder model combines four probabilistic components to handle different measurement scenarios: a Gaussian for correct hits, an exponential for unexpectedly short readings caused by dynamic obstacles, a uniform distribution for random noise, and a Dirac delta for maximum-range failures. For each laser beam, the expected range is obtained by ray-casting through the occupancy grid, the component densities are evaluated at the measured range, and their weighted combination gives the beam likelihood; log probabilities are summed across beams for numerical stability in particle filtering.
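A minimal sketch of the beam likelihood, assuming the normalizers \(\eta\) are folded into the mixture weights; all parameter values are illustrative, and the expected ranges are taken as given (in the full system they come from ray-casting):

```python
import numpy as np

def beam_log_likelihood(z, z_expected, z_max=10.0,
                        w=(0.7, 0.1, 0.1, 0.1),
                        sigma_hit=0.2, lam_short=1.0):
    """Log-likelihood of measured ranges z given expected ranges z_expected."""
    z = np.asarray(z, dtype=float)
    r_s = np.asarray(z_expected, dtype=float)
    w_hit, w_short, w_max, w_rand = w
    # Gaussian around the expected range (correct hit)
    p_hit = np.exp(-0.5 * ((z - r_s) / sigma_hit) ** 2) / (sigma_hit * np.sqrt(2 * np.pi))
    # Exponential for unexpectedly short readings (dynamic obstacles)
    p_short = np.where(z <= r_s, lam_short * np.exp(-lam_short * z), 0.0)
    # Point mass at the sensor's maximum range (failures)
    p_max = (z >= z_max).astype(float)
    # Uniform noise over the measurable interval
    p_rand = np.where(z < z_max, 1.0 / z_max, 0.0)
    p = w_hit * p_hit + w_short * p_short + w_max * p_max + w_rand * p_rand
    return np.sum(np.log(p + 1e-300))   # sum log-probs for numerical stability
```

Summing log probabilities (rather than multiplying raw probabilities) keeps particle weights from underflowing when a scan contains hundreds of beams.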

Key Features

Probabilistic Sensor Fusion

Combines multiple error models for robust range finder measurements in dynamic environments

Sample-Based Localization

Uses particle filtering with odometry and sensor updates for accurate pose estimation

Map-Consistent Navigation

Integrates with occupancy grid maps for accurate localization and navigation

Real-time Performance

Optimized for real-time operation with efficient data structures and algorithms

Technologies Used

Core Technologies

C++14 · Beam Range Finder Model · Odometry Motion Model · Particle Filter · SLAM · Sensor Fusion · Occupancy Grids · Probabilistic Robotics

Surgical Instrument Segmentation with U-Net

Real-time semantic segmentation of surgical tools in endoscopic videos with augmented reality integration

Project Overview

Developed a real-time semantic segmentation system for surgical guidance using U-Net architecture to identify and delineate surgical instruments in endoscopic video streams. The system provides pixel-wise classification to enable augmented reality overlays, surgical guidance visualization, and instrument tracking during minimally invasive procedures.

The implementation achieves real-time performance with high accuracy, making it suitable for integration into surgical navigation systems with minimal latency.

  • Real-time Processing: 30+ FPS
  • Mean IoU: 96.5%
  • Inference Latency: <5 ms

Architecture Design

The system implements a U-Net based architecture with encoder-decoder structure:

  • Encoder Path: Feature extraction with down-sampling through convolutional blocks built on a ResNet backbone
  • Bottleneck: High-level feature representation at lowest resolution with attention mechanisms
  • Decoder Path: Feature up-sampling with skip connections from encoder for precise localization
  • Skip Connections: Preserve spatial information through concatenation for accurate boundary delineation
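The down-sampling, up-sampling, and skip-connection plumbing described above can be illustrated in plain NumPy; convolutions are omitted, so this only traces how feature-map shapes flow through the network:

```python
import numpy as np

def max_pool2(x):
    """2x2 max pooling over a (C, H, W) feature map (H, W even)."""
    C, H, W = x.shape
    return x.reshape(C, H // 2, 2, W // 2, 2).max(axis=(2, 4))

def upsample2(x):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

# Toy forward pass: encoder features are pooled toward the bottleneck,
# up-sampled in the decoder, and concatenated with the skip connection.
x = np.random.default_rng(0).normal(size=(16, 32, 32))  # encoder features
down = max_pool2(x)                                     # (16, 16, 16)
up = upsample2(down)                                    # back to (16, 32, 32)
merged = np.concatenate([up, x], axis=0)                # skip: (32, 32, 32)
```

Concatenating along the channel axis is what lets the decoder recover fine spatial detail (instrument boundaries) that pooling discarded.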

Key Features

Real-Time Processing

Optimized for >30 FPS inference on endoscopic video streams using TensorRT optimization

Multi-Class Segmentation

Simultaneous identification of multiple surgical instrument types with precise boundary detection

Augmented Reality Integration

Output compatible with surgical AR overlay systems for enhanced visualization and guidance

Robust Performance

Maintains accuracy across varying lighting conditions and surgical scenarios

Technologies Used

Core Technologies

PyTorch · U-Net Architecture · TensorRT · OpenCV · CUDA · Medical Imaging · Real-time Processing · Augmented Reality