I’m a student focused on building scalable systems for large language models. My interests span ML systems, AI infrastructure, model architecture, and agent architecture. I study where distributed systems, cloud computing, machine learning, and agentic AI meet, with an emphasis on systems that are reliable, efficient, and scalable.

My hands-on experience comes from projects in distributed system design, cloud platforms, and DevOps practices, including building and experimenting with data pipelines, model training workflows, serving setups, and LLM applications. With LLMs, I’ve explored fine-tuning, evaluation, deployment considerations, architectural trade-offs, and the design of agent workflows involving planning, tool use, memory, and orchestration.

I’m particularly interested in the challenges behind modern AI: scaling training and inference, optimizing performance, designing efficient model architectures, building robust AI agents, and managing end-to-end ML pipelines. I’m looking to deepen my understanding and contribute to problems where machine learning, large-scale systems, and agentic AI architecture meet.
Pinned
- Hydromotion (Public, C++): A simple projectile-motion simulation with a custom physics engine, built with the Qt framework and the OpenGL graphics API.
- KubeKV (Public, Go): Simplified etcd-style metadata store in Go, built on Raft with leader election, replicated logs, crash persistence, snapshots, and a linearizable key-value service.
- Distributed-Ledger (Public, Go): Fault-tolerant distributed ledger with ISIS total ordering, deterministic transaction delivery, and TCP-based peer coordination.
- superAttention/video_moderation_alignment (Public, HTML): End-to-end pipeline for video safety alignment: SFT + DPO on Qwen3-VL with structured outputs, benchmark design, and evaluation of over-refusal.
- Distributed-Transaction (Public, Go): Distributed transaction system using strict two-phase locking (2PL), wound-wait deadlock prevention, and two-phase commit (2PC) to provide isolation and atomicity across multiple servers.
- superAttention/llm-prefix-cache-analysis (Public, HTML): Benchmarking LLM prefix-cache eviction policies against Tree-Constrained Belady on ShareGPT traces.



