carlosfundora
Pinned repositories

  1. llama.cpp-1-bit-turbo (Public)

    Forked from ggml-org/llama.cpp

    HIP/ROCm fork optimized for AMD RDNA2 (gfx1030) with PrismML Q1_0_G128 1-bit quant support, RotorQuant, TurboQuant, EAGLE3 and P-EAGLE speculative decoding, and full Wave32 kernel optimizations.

    C++ · ★ 5
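The exact Q1_0_G128 layout is not documented here, but the name suggests 1-bit weights with one scale per 128-weight group. A minimal sketch of that style of quantization, assuming a sign bit per weight and a per-group mean-absolute-value scale (both assumptions, not the PrismML format itself):

```python
import numpy as np

GROUP_SIZE = 128  # the "G128" suffix presumably means 128-weight groups

def quantize_1bit(weights: np.ndarray):
    """Quantize flat fp32 weights to 1 bit each plus one scale per group,
    approximating w ~ scale * sign(w)."""
    assert weights.size % GROUP_SIZE == 0
    groups = weights.reshape(-1, GROUP_SIZE)
    # Per-group scale: mean absolute value minimizes L1 reconstruction error
    scales = np.abs(groups).mean(axis=1)
    # One sign bit per weight: 1 for >= 0, 0 for < 0
    bits = (groups >= 0).astype(np.uint8)
    packed = np.packbits(bits, axis=1)  # 128 bits -> 16 bytes per group
    return packed, scales.astype(np.float16)

def dequantize_1bit(packed: np.ndarray, scales: np.ndarray) -> np.ndarray:
    bits = np.unpackbits(packed, axis=1)[:, :GROUP_SIZE]
    signs = bits.astype(np.float32) * 2.0 - 1.0  # {0,1} -> {-1,+1}
    return (signs * scales[:, None].astype(np.float32)).ravel()
```

At 1 bit per weight plus a 16-bit scale per 128 weights, storage is about 1.125 bits/weight, which is where most of the 1-bit memory savings come from.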

  2. sglang-1-bit-turbo (Public)

    Forked from sgl-project/sglang

    SGLang 1-Bit Turbo — AMD ROCm (gfx1030) inference fork with RotorQuant/TurboQuant KV compression, PHANTOM-X zero-copy draft speculation, EAGLE3 speculative decoding, 12 RDNA2 crash fixes, and Prism…

    Python · ★ 4

  3. vllm-1-bit-turbo (Public)

    Forked from mitkox/vllm-turboquant

    vLLM 0.18.1rc1 fork optimized for HIP/ROCm with support added for PrismML Bonsai Q1_0 and Q1_0_G128 1-bit GPU inference, TurboQuant TQ3_0 KV cache, and AMD gfx1030/RDNA2 architecture.

    Python

  4. gfxGRAPH (Public)

    CUDA Graph → HIP Graph translation layer for AMD gfx1030 (RDNA2). Bridges all 4 CUDA Graph parity gaps on ROCm.

    Python
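Because HIP mirrors most of the CUDA Graph API name-for-name (`cudaGraphLaunch` → `hipGraphLaunch`, and so on), the first pass of a translation layer is largely an identifier rewrite. A toy sketch of that pass over a hypothetical API subset (the real gfxGRAPH layer presumably also handles the runtime parity gaps, which pure renaming cannot):

```python
import re

# HIP provides same-shaped counterparts for these CUDA Graph calls;
# this mapping is an illustrative subset, not gfxGRAPH's actual table.
CUDA_TO_HIP = {
    "cudaStreamBeginCapture": "hipStreamBeginCapture",
    "cudaStreamEndCapture": "hipStreamEndCapture",
    "cudaGraphInstantiate": "hipGraphInstantiate",
    "cudaGraphLaunch": "hipGraphLaunch",
    "cudaGraphExecDestroy": "hipGraphExecDestroy",
    "cudaGraphDestroy": "hipGraphDestroy",
}

_PATTERN = re.compile(r"\b(" + "|".join(CUDA_TO_HIP) + r")\b")

def hipify_graph_calls(source: str) -> str:
    """Rewrite CUDA Graph API identifiers in source text to HIP equivalents."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)
```

For example, `hipify_graph_calls("cudaGraphLaunch(exec, stream);")` yields `"hipGraphLaunch(exec, stream);"`.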

  5. SpecForge (Public)

    Forked from sgl-project/SpecForge

    Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

    Python
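Speculative decoding, the technique SpecForge trains draft models for, follows a draft-and-verify loop: a cheap draft model proposes several tokens, the target model checks them, and the longest agreeing prefix is kept. A greedy-decoding toy sketch of that loop (EAGLE-style methods use learned draft heads and probabilistic acceptance, not the exact-match rule shown here):

```python
from typing import Callable, List

def speculative_generate(
    target_next: Callable[[List[int]], int],  # target model, greedy next token
    draft_next: Callable[[List[int]], int],   # cheap draft model, greedy next token
    prompt: List[int],
    max_new: int,
    k: int = 4,
) -> List[int]:
    """Draft k tokens, verify them in order against the target, keep the
    longest agreeing prefix plus the target's own next token."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new:
        # 1. Draft k candidate tokens autoregressively
        draft, ctx = [], list(tokens)
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # 2. Verify: target recomputes each position; stop at first mismatch
        accepted, ctx = [], list(tokens)
        for t in draft:
            expect = target_next(ctx)
            if expect != t:
                accepted.append(expect)  # target's correction token
                break
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(target_next(ctx))  # bonus token: all drafts accepted
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new]
```

With greedy verification the output is identical to plain greedy decoding with the target model; the draft only changes how many target calls are needed per accepted token.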

  6. ATLAS (Public)

    Forked from itigges22/ATLAS

    Adaptive Test-time Learning and Autonomous Specialization

    Python