Arun Sharma

News

[12/2025] One paper accepted at the AAAI Workshop on AI to Accelerate Science and Engineering.

[11/2025] Two oral presentations at ACM SIGSPATIAL 2025 in Minneapolis, MN.

[09/2025] Four papers accepted at ACM SIGSPATIAL 2025.

[08/2025] One paper accepted at SSTD 2025.

[07/2025] Successfully defended my Ph.D. dissertation.

[02/2025] Two papers accepted at AAAI Bridge on Knowledge-Guided ML Workshop 2025.

[12/2024] One paper accepted at SIAM Data Mining 2025.

[11/2024] Oral presentation at ACM SIGSPATIAL 2024 in Atlanta, GA.

[10/2024] NSF Travel Award for SIGSPATIAL 2024.

[10/2024] One paper accepted at ACM Transactions on Spatial Algorithms and Systems.

[09/2024] Two papers accepted at ACM SIGSPATIAL 2024.

[08/2024] Invited poster presentation at Knowledge-Guided Machine Learning Workshop 2024.

[06/2024] One paper accepted at COSIT 2024.

[05/2024] Invited lightning talk and poster presentation at the AI-CLIMATE annual meeting.

[12/2023] One paper accepted at SIAM Data Mining 2024.

[08/2023] Completed my internship at Esri under Dr. Erik G. Hoel.

[04/2023] Invited presentation at the MIDAS Future Leader Summit at the University of Michigan.

[03/2023] One paper accepted at GIScience 2023.

[11/2022] Oral presentation at ACM SIGSPATIAL 2022 in Seattle, WA.

[10/2022] NSF Travel Award for SIGSPATIAL 2022.

[09/2022] Oral presentation at COSIT 2022 in Kobe, Japan.

[08/2022] One paper accepted at ACM SIGSPATIAL 2022.

[05/2022] Received Doctoral Dissertation Fellowship 2022-2023.

[04/2022] One paper accepted at COSIT 2022.

[03/2022] One paper accepted at AGILE 2022.

[09/2021] Oral presentation at GIScience 2021 (online).

[05/2021] One paper accepted at ACM Transactions on Intelligent Systems and Technology.

[10/2020] Invited presentation at the University of Maryland, College Park (online).

[06/2020] One paper accepted at GIScience 2021.

Selected Projects

Each project has its own page with the full paper rendered for the web, plus PDF, code, demo, and BibTeX.

GeoSAM-3D: Geodesic Prompt Propagation for Open-Vocabulary 3D Scene Segmentation from Monocular Video

In preparation. Target: 3DV / CVPR 4DV workshop 2027

Promptable 3D scene segmentation from monocular video via heat-method geodesic propagation over Gaussian centroids.

webpage · PDF

Sat-Splat-Distort: Distortion-Aware Gaussian Splatting for Satellite RPC, Pushbroom, Fisheye, and 360 Cameras

In preparation. Target: CVPR EarthVision 2027

Distortion-aware 3D Gaussian Splatting for satellite RPC, pushbroom, fisheye, and 360 cameras.

webpage · PDF

PhysFlow-Earth: Physics-Constrained Rectified Flow for Earth Observation Super-Resolution and Climate Downscaling

In preparation. Target: NeurIPS 2026 Climate Change AI workshop

Physics-constrained rectified flow for Sentinel-2 super-resolution and ERA5/CHIRPS climate downscaling.

webpage · PDF

TrajPrompt: Open-Vocabulary Maritime Behavior Search with Trajectory Contrastive Learning, TGARD, and Satellite Confirmation

In preparation. Target: NeurIPS 2026 Datasets and Benchmarks

Open-vocabulary maritime trajectory search: trajectory-CLIP and TGARD with SAM 2 over AIS tracks and satellite imagery.

webpage · PDF

DarkVesselNet: Multi-Modal Remote Sensing and Trajectory Reasoning for Dark Vessel Detection

In preparation. Target: CVPR EarthVision 2027; xView3-SAR benchmark

Multi-modal dark-vessel detection fusing Sentinel-1 SAR, Sentinel-2 optical, and AIS with TGARD and Pi-DPM anomaly reasoning.

webpage · PDF

Pin Infrastructure Service: A Constraint-First Microservice for Autonomous Ride-Hail Pickup and Drop-Off Selection

Systems preprint

Production gRPC service for map-constrained pickup and drop-off pin selection in autonomous ride-hail.

webpage · PDF

MapFix-Spatial: Interactive Distortion-Aware Coordinate Correction with Deterministic and AI-Assisted Analysis

Preprint

Distortion-aware geospatial coordinate correction with a deterministic engine and optional LLM-assisted analysis.

webpage · PDF

Physics-Informed Reinforcement Learning for Trajectory Generation and Reasoning

In preparation. Target: NeurIPS workshop

Physics-informed reinforcement learning with Group Relative Policy Optimization for trajectory generation and reasoning.

webpage · PDF

GeoTrace-Agent: A Production Multi-Agent Framework for Spatiotemporal Reasoning

In preparation. Target: NeurIPS workshop

Multi-agent framework for spatiotemporal reasoning with Haegerstrand space-time prisms, MCP tools, and A2A messaging.

webpage · PDF

Education & Experience

Education

University of Minnesota, Twin Cities 2018 - 2025

Ph.D. in Computer Science

Advisor: Prof. Shashi Shekhar. Committee: Prof. Vipin Kumar, Prof. Ravi Janardan, and Prof. Ying Song. Dissertation: Distortion-Aware Spatial Data Science. Doctoral Dissertation Fellowship, 2022-2023.

State University of New York at Buffalo 2016 - 2018

M.S. in Computer Science

Graduate training in computer science before joining the University of Minnesota spatial computing and spatial data science research group.

Experience

Esri (Environmental Systems Research Institute) May 2023 - Dec 2023

Research Scientist Intern

Improved detection accuracy for route deviations and dark shipping from 55% to 73% with an end-to-end anomaly-detection pipeline built on Transformer-based models, Evidential Deep Learning, and AWS services (SageMaker, Lambda, ECS, and Step Functions), run on roughly 500M AIS records.
Reduced maritime route-query latency by 40% for real-time vessel tracking with a scalable Graph-based Traffic Representation and Association framework built on PySpark and GeoAnalytics APIs.
Cut model retraining time by 35% and API latency by 30% using model quantization, SageMaker Multi-Model Endpoints, Step Functions, SQS, CloudWatch, and CI/CD.

University of Minnesota, Twin Cities Aug 2018 - Aug 2025

Graduate Research Assistant

Led Pi-DPM, a physics-informed diffusion model for detecting GPS-spoofed and AI-generated deep-fake trajectories across maritime and urban domains.
Co-led Kriging-informed conditional diffusion for regional sea-level downscaling, turning coarse climate projections into fine-grained coastal risk maps.
Built GeoTrace-Agent, a multi-agent framework for auditable spatiotemporal reasoning over AIS feeds, OSM road networks, Copernicus weather, Sentinel imagery, and space-time-prism tools.
Designed Pi-GRPO, a physics-informed RL stack for trajectory generation and trajectory-reasoning policies with PPO, DPO, GRPO, vLLM-backed rollouts, and human-in-the-loop preference curation.

Reading Resources

Papers and books I am reading, with short notes and a relevance rating (out of 5). Click a title for my notes; "source" links to the original.

Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control (2nd ed.)

S. L. Brunton, J. N. Kutz · Textbook, 2nd ed., Cambridge University Press · ★★★★½ 4.5/5

The canonical Brunton-Kutz textbook tying SVD, sparsity, ML, dynamical systems, control, ROMs, and physics-informed ML into one toolkit.

notes · source

How to Scale Your Model: A Systems View of LLMs on TPUs

Jacob Austin, Sholto Douglas, Roy Frostig, Anselm Levskaya, Charlie Chen, Reiner Pope, et al. (Google DeepMind) · Online book (jax-ml.github.io/scaling-book), Google DeepMind · ★★★★½ 4.5/5

A first-principles, roofline-driven playbook for scaling Transformer training and inference across thousands of TPUs (and GPUs).

notes · source

Reinforcement Learning: An Overview

Kevin P. Murphy · arXiv 2024 (v5, Dec 2025) · ★★★★½ 4.5/5

Kevin Murphy's 250-page modern RL monograph: value/policy/model-based, multi-agent, and a deep LLMs-and-RL chapter (RLHF, RLVR, PPO/GRPO/DPO).

notes · source

Beyond Visual Fidelity: Benchmarking Super-Resolution Models for Large-Scale Remote Sensing Imagery via Downstream Task Integration

Zhili Li, Kangyang Chai, Zhihao Wang, Xiaowei Jia, Yanhua Li, Gengchen Mai, Sergii Skakun, Dinesh Manocha, Yiqun Xie · arXiv 2026 (v2); under review at IEEE TPAMI · ★★★★☆ 4/5

A remote-sensing super-resolution benchmark that scores SR models by downstream task utility, not just PSNR/SSIM.

notes · notes (v1) · source

CockroachDB: The Resilient Geo-Distributed SQL Database

R. Taft, I. Sharif, A. Matei, N. VanBenschoten, J. Lewis, et al. (Cockroach Labs) · SIGMOD 2020 (Industry Track) · ★★★★☆ 4/5

How CockroachDB delivers serializable, geo-distributed SQL transactions on commodity clouds without atomic clocks.

notes · source

LLM Query Scheduling with Prefix Reuse and Latency Constraints

G. Dexter, S. Tang, A. Fatahi Baarzi, Q. Song, T. Dharamsi, A. Gupta (LinkedIn; Nubank) · NeurIPS 2025 (Poster); also arXiv 2502.04677 · ★★★★☆ 4/5

Theory + algorithm for scheduling LLM queries under prefix-cache reuse and TTFT limits; k-LPM cuts P99 latency vs FCFS/LPM.

notes · source

Scaling Down, Serving Fast: Compressing and Deploying Efficient LLMs for Recommendation Systems

K. Behdin, A. Fatahibaarzi, Q. Song, Y. Dai, A. Gupta, Z. Wang, et al. (LinkedIn / MIT) · EMNLP 2025 (Industry Track) · ★★★★☆ 4/5

LinkedIn's playbook for shrinking a 100B+ RecSys LLM 20x via distillation, structured pruning, and FP8 quant, then serving it fast.

notes · source

Code as Agent Harness: Toward Executable, Verifiable, and Stateful Agent Systems

X. Ning, K. Tieu, D. Fu, T. Wei et al. (UIUC, Meta, Stanford); senior authors H. Tong, J. He, T. Zhang · arXiv 2026 (v1), ~102pp survey · ★★★½☆ 3.5/5

Survey reframing code as the operational "harness" for LLM agents: reasoning, acting, state, verification, and multi-agent coordination.

notes · source

Enhancing Stability for Large Models Training in Constrained Bandwidth Networks

Yun Dai, Tejas Dharamsi, Byron Hsu, Tao Song, Hamed Firooz (LinkedIn) · ICML 2024 ES-FoMo workshop (short paper), PMLR 235 · ★★★½☆ 3.5/5

Finds and fixes a GPU race condition in ZeRO++ hpZ that silently breaks 40B-70B LLM training on low-bandwidth clusters.

notes · source

Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini

Gemini Embedding Team, Google DeepMind (M. Shanbhogue, Z. Li, S. Zhang, G. Hernández Ábrego, et al.) · arXiv 2026 (v1), tech report · ★★★½☆ 3.5/5

Google's native multimodal embedding model: one Gemini-initialized encoder maps text, image, video, and audio into a shared vector space.

notes · source

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

G. Winata, F. Hudi, P. A. Irawan, D. Anugraha, R. A. Putri et al. (60+ authors; senior authors incl. D. I. Adelani, A. Oh, A. F. Aji, T. Watanabe, C.-W. Ngo) · arXiv 2024 (v5, May 2025); NAACL 2025 · ★★½☆☆ 2.5/5

1M-sample VQA benchmark probing whether vision-language models recognize dishes and origins across 30 languages and 189 countries.

notes · source

Links
