
BufReader

dexter edited this page Oct 13, 2025 · 2 revisions

BufReader: Zero-Copy Network Reading with Advanced Memory Management


Key Takeaways

If you're short on time, here are the most important conclusions:

BufReader's Core Advantages (Concurrent Scenarios):

  • ⭐ 98.5% GC Reduction: 134 GCs → 2 GCs (streaming server scenario)
  • 🚀 99.93% Fewer Allocations: 5.57 million → 3,918 allocations
  • 🔄 10-20x Throughput Improvement: zero allocation + memory reuse

Key Data:

Streaming Server Scenario (100 concurrent streams):
bufio.Reader: 79 GB allocated, 134 GCs
BufReader:    0.6 GB allocated, 2 GCs

Ideal Use Cases:

  • ✅ High-concurrency network servers
  • ✅ Streaming media processing
  • ✅ Long-running services (24/7)

Quick Test:

sh scripts/benchmark_bufreader.sh

Introduction

In high-performance network programming, frequent memory allocation and copying are major sources of performance bottlenecks. While Go's standard library bufio.Reader provides buffered reading capabilities, it still involves significant memory allocation and copying operations when processing network data streams. This article provides an in-depth analysis of these issues and introduces BufReader from the Monibuca project, demonstrating how to achieve zero-copy, high-performance network data reading through the GoMem memory allocator.

1. Memory Allocation Issues in Standard Library bufio.Reader

1.1 How bufio.Reader Works

bufio.Reader uses a fixed-size internal buffer to reduce system call frequency:

type Reader struct {
    buf          []byte    // Fixed-size buffer
    rd           io.Reader // Underlying reader
    r, w         int       // Read/write positions
}

func (b *Reader) Read(p []byte) (n int, err error) {
    // (simplified) 1. If buffer is empty, refill it from the underlying reader
    if b.r == b.w {
        b.r, b.w = 0, 0
        n, err = b.rd.Read(b.buf)  // Data copied into the internal buffer
        if err != nil {
            return 0, err
        }
        b.w += n
    }

    // 2. Copy data from the internal buffer to the caller's slice
    n = copy(p, b.buf[b.r:b.w])    // Another data copy
    b.r += n
    return
}

1.2 Memory Allocation Problem Analysis

When using bufio.Reader to read network data, the following issues exist:

Issue 1: Multiple Memory Copies

sequenceDiagram
    participant N as Network Socket
    participant B as bufio.Reader Internal Buffer
    participant U as User Buffer
    participant A as Application Layer
    
    N->>B: System call reads data (1st copy)
    Note over B: Data stored in fixed buffer
    B->>U: copy() to user buffer (2nd copy)
    Note over U: User gets data copy
    U->>A: Pass to application layer (possible 3rd copy)
    Note over A: Application processes data

Each read operation requires at least two memory copies:

  1. From network socket to bufio.Reader's internal buffer
  2. From internal buffer to user-provided slice
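The intermediate buffering is easy to observe with a small counting reader (a sketch; `countingReader` is illustrative, not part of any library): even though the caller asks for only 10 bytes, bufio first pulls a full buffer's worth from the source and then copies 10 of those bytes out.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// countingReader records how many bytes each underlying Read call returns.
type countingReader struct {
	r     io.Reader
	reads []int
}

func (c *countingReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	c.reads = append(c.reads, n)
	return n, err
}

// firstRead asks bufio for `want` bytes and reports how many bytes the
// underlying source actually handed over for that first call.
func firstRead(want int) (userGot, sourceGave int) {
	src := &countingReader{r: bytes.NewReader(make([]byte, 8192))}
	br := bufio.NewReaderSize(src, 4096)
	p := make([]byte, want)
	n, _ := io.ReadFull(br, p)
	return n, src.reads[0]
}

func main() {
	userGot, sourceGave := firstRead(10)
	fmt.Println(userGot, sourceGave) // 10 4096: a 10-byte request filled the whole 4KB buffer
}
```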

Issue 2: Fixed Buffer Limitations

// bufio.Reader uses fixed-size buffer
reader := bufio.NewReaderSize(conn, 4096)  // Fixed 4KB

// Reading large chunks requires multiple operations
data := make([]byte, 16384)  // Need to read 16KB
for total := 0; total < len(data); {
    n, err := reader.Read(data[total:])  // May take several iterations
    if err != nil {
        break  // handle the error in real code
    }
    total += n
}

Issue 3: Frequent Memory Allocation

// Each read requires allocating new slices
func processPackets(reader *bufio.Reader) error {
    for {
        // Allocate new memory for each packet
        header := make([]byte, 4)        // Allocation 1
        if _, err := io.ReadFull(reader, header); err != nil {
            return err
        }

        size := binary.BigEndian.Uint32(header)
        payload := make([]byte, size)    // Allocation 2
        if _, err := io.ReadFull(reader, payload); err != nil {
            return err
        }

        // After processing, the memory becomes garbage for the GC
        processPayload(payload)
        // Next iteration allocates again...
    }
}

1.3 Performance Impact

In high-frequency network data processing scenarios, these issues lead to:

  1. Increased CPU Overhead: Frequent copy() operations consume CPU resources
  2. Higher GC Pressure: Massive temporary memory allocations increase garbage collection burden
  3. Increased Latency: Each memory allocation and copy adds processing latency
  4. Reduced Throughput: Memory operations become bottlenecks, limiting overall throughput
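The per-packet allocation cost is easy to quantify with testing.AllocsPerRun (a standalone sketch; the global sink forces the slice to escape to the heap, mimicking a payload that outlives the read call):

```go
package main

import (
	"fmt"
	"testing"
)

var sink []byte // global sink so the allocation cannot be optimized away

func main() {
	// Simulates the per-packet make([]byte, n) pattern shown above
	allocs := testing.AllocsPerRun(1000, func() {
		sink = make([]byte, 1024)
	})
	// Every packet costs at least one heap allocation, and at tens of
	// thousands of packets per second each one becomes short-lived garbage.
	fmt.Println(allocs >= 1) // true
}
```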

2. BufReader: A Zero-Copy Solution

2.1 Design Philosophy

BufReader is designed based on the following core principles:

  1. Zero-Copy Reading: Read directly from network to final memory location, avoiding intermediate copies
  2. Memory Reuse: Reuse memory blocks through GoMem allocator, avoiding frequent allocations
  3. Chained Buffering: Use multiple memory blocks in a linked list instead of a single fixed buffer
  4. On-Demand Allocation: Dynamically adjust memory usage based on actual read amount

2.2 Core Data Structures

type BufReader struct {
    Allocator *ScalableMemoryAllocator  // Scalable memory allocator
    buf       MemoryReader               // Memory block chain reader
    totalRead int                        // Total bytes read
    BufLen    int                        // Block size per read
    Mouth     chan []byte                // Data input channel
    feedData  func() error               // Data feeding function
}

// MemoryReader manages multiple memory blocks
type MemoryReader struct {
    *Memory                    // Memory manager
    Buffers [][]byte          // Memory block chain
    Size    int               // Total size
    Length  int               // Readable length
}
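The chained-buffer idea can be sketched independently of Monibuca: readable data is a list of block references, a read walks the chain, and handing out sub-slices gives the caller views rather than copies (`miniChain` below is an illustrative toy, not the real MemoryReader):

```go
package main

import "fmt"

// miniChain is a toy version of MemoryReader: a chain of memory
// blocks plus the total readable length.
type miniChain struct {
	buffers [][]byte
	length  int
}

func (c *miniChain) append(b []byte) {
	c.buffers = append(c.buffers, b)
	c.length += len(b)
}

// rangeN yields up to n readable bytes as views into the original
// blocks -- no byte is copied, and one call may yield several slices.
func (c *miniChain) rangeN(n int, yield func([]byte)) {
	for n > 0 && len(c.buffers) > 0 {
		head := c.buffers[0]
		if len(head) > n {
			yield(head[:n]) // view into the block, not a copy
			c.buffers[0] = head[n:]
			c.length -= n
			return
		}
		yield(head) // whole block consumed, move to the next one
		n -= len(head)
		c.length -= len(head)
		c.buffers = c.buffers[1:]
	}
}

func main() {
	var c miniChain
	c.append([]byte("hello "))
	c.append([]byte("world"))

	var got []byte
	c.rangeN(8, func(view []byte) { got = append(got, view...) })
	fmt.Printf("%q %d\n", got, c.length) // "hello wo" 3
}
```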

2.3 Workflow

2.3.1 Zero-Copy Data Reading Flow

sequenceDiagram
    participant N as Network Socket
    participant A as ScalableMemoryAllocator
    participant B as BufReader.buf
    participant U as User Code
    
    U->>B: Read(n)
    B->>B: Check if buffer has data
    alt Buffer empty
        B->>A: Request memory block
        Note over A: Get from pool or allocate new block
        A-->>B: Return memory block reference
        B->>N: Read directly to memory block
        Note over N,B: Zero-copy: data written to final location
    end
    B-->>U: Return slice view of memory block
    Note over U: User uses directly, no copy needed
    U->>U: Process data
    U->>A: Recycle memory block (optional)
    Note over A: Block returns to pool for reuse

2.3.2 Memory Block Management Flow

graph TD
    A[Start Reading] --> B{buf has data?}
    B -->|Yes| C[Return data view directly]
    B -->|No| D[Call feedData]
    D --> E[Allocator.Read requests memory]
    E --> F{Pool has free block?}
    F -->|Yes| G[Reuse existing memory block]
    F -->|No| H[Allocate new memory block]
    G --> I[Read data from network]
    H --> I
    I --> J[Append to buf.Buffers]
    J --> K[Update Size and Length]
    K --> C
    C --> L[User reads data]
    L --> M{Data processed?}
    M -->|Yes| N[ClipFront recycle front blocks]
    N --> O[Allocator.Free return to pool]
    O --> P[End]
    M -->|No| A

2.4 Core Implementation Analysis

2.4.1 Initialization and Memory Allocation

func NewBufReader(reader io.Reader) *BufReader {
    return NewBufReaderWithBufLen(reader, defaultBufSize)
}

func NewBufReaderWithBufLen(reader io.Reader, bufLen int) *BufReader {
    r := &BufReader{
        Allocator: NewScalableMemoryAllocator(bufLen),  // Create allocator
        BufLen:    bufLen,
    }
    // feedData is assigned after r exists so the closure can capture it
    r.feedData = func() error {
        // Key: read via the allocator, filling a pooled block directly
        buf, err := r.Allocator.Read(reader, r.BufLen)
        if err != nil {
            return err
        }
        n := len(buf)
        r.totalRead += n
        // Append the memory block reference only -- no data copy
        r.buf.Buffers = append(r.buf.Buffers, buf)
        r.buf.Size += n
        r.buf.Length += n
        return nil
    }
    r.buf.Memory = &Memory{}
    return r
}

Zero-Copy Key Points:

  • Allocator.Read() reads directly from io.Reader to allocated memory block
  • Returned buf is a reference to the actual data storage memory block
  • append(r.buf.Buffers, buf) only appends reference, no data copy

2.4.2 Read Operations

func (r *BufReader) ReadByte() (b byte, err error) {
    // If buffer is empty, trigger data filling
    for r.buf.Length == 0 {
        if err = r.feedData(); err != nil {
            return
        }
    }
    // Read from memory block chain, no copy needed
    return r.buf.ReadByte()
}

func (r *BufReader) ReadRange(n int, yield func([]byte)) (err error) {
    for r.recycleFront(); n > 0 && err == nil; err = r.feedData() {
        if r.buf.Length > 0 {
            if r.buf.Length >= n {
                // Pass slice views of the memory blocks directly, no copy
                r.buf.RangeN(n, yield)
                return
            }
            n -= r.buf.Length
            r.buf.Range(yield)
        }
    }
    return
}

Zero-Copy Benefits:

  • yield callback receives a slice view of the memory block
  • User code directly operates on original memory blocks without intermediate copying
  • After reading, processed blocks are automatically recycled

2.4.3 Memory Recycling

func (r *BufReader) recycleFront() {
    // Clean up processed memory blocks
    r.buf.ClipFront(r.Allocator.Free)
}

func (r *BufReader) Recycle() {
    r.buf = MemoryReader{}
    if r.Allocator != nil {
        // Return all memory blocks to allocator
        r.Allocator.Recycle()
    }
    if r.Mouth != nil {
        close(r.Mouth)
    }
}

2.5 Comparison with bufio.Reader

graph LR
    subgraph "bufio.Reader (Multiple Copies)"
        A1[Network] -->|System Call| B1[Kernel Buffer]
        B1 -->|Copy 1| C1[bufio Buffer]
        C1 -->|Copy 2| D1[User Slice]
        D1 -->|Copy 3?| E1[Application]
    end
    
    subgraph "BufReader (Zero-Copy)"
        A2[Network] -->|System Call| B2[Kernel Buffer]
        B2 -->|Direct Read| C2[GoMem Block]
        C2 -->|Slice View| D2[User Code]
        D2 -->|Recycle| C2
        C2 -->|Reuse| C2
    end
| Feature | bufio.Reader | BufReader |
|---|---|---|
| Memory copies | 2-3 times | 0 (slice view) |
| Buffer mode | Fixed-size single buffer | Variable-size chained buffer |
| Memory allocation | May allocate each read | Object pool reuse |
| Memory recycling | GC automatic | Actively returned to pool |
| Large data handling | Multiple operations needed | Single append to chain |
| GC pressure | High | Very low |

3. Performance Benchmarks

3.1 Test Scenario Design

3.1.1 Real Network Simulation

To make benchmarks more realistic, we implemented a mockNetworkReader that simulates real network behavior.

Real Network Characteristics:

In real network reading scenarios, the data length returned by each Read() call is uncertain, affected by multiple factors:

  • TCP receive window size
  • Network latency and bandwidth
  • OS buffer state
  • Network congestion
  • Network quality fluctuations

Simulation Implementation:

type mockNetworkReader struct {
    data     []byte
    offset   int
    rng      *rand.Rand
    minChunk int  // Minimum chunk size
    maxChunk int  // Maximum chunk size
}

func (m *mockNetworkReader) Read(p []byte) (n int, err error) {
    if m.offset >= len(m.data) {
        return 0, io.EOF
    }
    // Return a random-length chunk between minChunk and maxChunk,
    // clamped to the caller's buffer
    chunkSize := m.minChunk + m.rng.Intn(m.maxChunk-m.minChunk+1)
    if chunkSize > len(p) {
        chunkSize = len(p)
    }
    n = copy(p[:chunkSize], m.data[m.offset:])
    m.offset += n
    return n, nil
}

Different Network Condition Simulations:

| Network Condition | Data Block Range | Real Scenario |
|---|---|---|
| Good network | 1024-4096 bytes | Stable LAN, premium network |
| Normal network | 256-2048 bytes | Regular internet connection |
| Poor network | 64-512 bytes | High latency, small TCP window |
| Worst network | 1-128 bytes | Mobile network, severe congestion |

This simulation makes benchmark results more realistic and reliable.

3.1.2 Test Scenario List

We focus on the following core scenarios:

  1. Concurrent Network Connection Reading - Demonstrates zero allocation
  2. Concurrent Protocol Parsing - Simulates real applications
  3. GC Pressure Test - Shows long-term running advantages ⭐
  4. Streaming Server Scenario - Real business scenario ⭐

3.2 Benchmark Design

Core Test Scenarios

Benchmarks focus on concurrent network scenarios and GC pressure comparison:

1. Concurrent Network Connection Reading

  • Simulates 100+ concurrent connections continuously reading data
  • Each read processes 1KB data packets
  • bufio: Allocates new buffer each time (make([]byte, 1024))
  • BufReader: Zero-copy processing (ReadRange)

2. Concurrent Protocol Parsing

  • Simulates streaming server parsing protocol packets
  • Reads packet header (4 bytes) + data content
  • Compares memory allocation strategies

3. GC Pressure Test (⭐ Core)

  • Continuous concurrent reading and processing
  • Tracks GC count, total memory allocation, allocation count
  • Demonstrates differences in long-term running

4. Streaming Server Scenario (⭐ Real Application)

  • Simulates 100 concurrent streams
  • Each stream reads and forwards data to subscribers
  • Complete real application scenario comparison

Key Test Logic

Concurrent Reading:

// bufio.Reader - Allocate each time
buf := make([]byte, 1024)  // 1KB allocation
n, _ := reader.Read(buf)
processData(buf[:n])

// BufReader - Zero-copy
reader.ReadRange(1024, func(data []byte) {
    processData(data)  // Direct use, no allocation
})

GC Statistics:

// Record GC statistics
var beforeGC, afterGC runtime.MemStats
runtime.ReadMemStats(&beforeGC)

b.RunParallel(func(pb *testing.PB) {
    // Concurrent testing...
})

runtime.ReadMemStats(&afterGC)
b.ReportMetric(float64(afterGC.NumGC-beforeGC.NumGC), "gc-runs")
b.ReportMetric(float64(afterGC.TotalAlloc-beforeGC.TotalAlloc)/1024/1024, "MB-alloc")

Complete test code: pkg/util/buf_reader_benchmark_test.go

3.3 Running Benchmarks

We provide complete benchmark code (pkg/util/buf_reader_benchmark_test.go) and convenient test scripts.

Method 1: Using Test Script (Recommended)

# Run complete benchmark suite
sh scripts/benchmark_bufreader.sh

This script will run all tests sequentially and output user-friendly results.

Method 2: Manual Testing

cd pkg/util

# Run all benchmarks
go test -bench=BenchmarkConcurrent -benchmem -benchtime=2s -test.run=xxx

# Run specific tests
go test -bench=BenchmarkGCPressure -benchmem -benchtime=5s -test.run=xxx

# Run streaming server scenario
go test -bench=BenchmarkStreamingServer -benchmem -benchtime=3s -test.run=xxx

Method 3: Run Key Tests Only

cd pkg/util

# GC pressure comparison (core advantage)
go test -bench=BenchmarkGCPressure -benchmem -test.run=xxx

# Streaming server scenario (real application)
go test -bench=BenchmarkStreamingServer -benchmem -test.run=xxx

3.4 Actual Performance Test Results

Actual results from running benchmarks on Apple M2 Pro:

Test Environment:

  • CPU: Apple M2 Pro (12 cores)
  • OS: macOS (darwin/arm64)
  • Go: 1.23.0

3.4.1 Core Performance Comparison

| Test Scenario | bufio.Reader | BufReader | Difference |
|---|---|---|---|
| Concurrent Network Read | 103.2 ns/op, 1027 B/op, 1 allocs | 147.6 ns/op, 4 B/op, 0 allocs | Zero alloc ⭐ |
| GC Pressure Test | 1874 ns/op, 5,576,659 mallocs, 3 gc-runs | 112.7 ns/op, 3,918 mallocs, 2 gc-runs | 16.6x faster ⭐⭐⭐ |
| Streaming Server | 374.6 ns/op, 79,508 MB-alloc, 134 gc-runs | 30.29 ns/op, 601 MB-alloc, 2 gc-runs | 12.4x faster ⭐⭐⭐ |

3.4.2 GC Pressure Comparison (Core Finding)

GC Pressure Test results best demonstrate long-term running differences:

bufio.Reader:

Operation Latency:   1874 ns/op
Allocation Count:    5,576,659 times (over 5 million!)
GC Runs:            3 times
Per Operation:      2 allocs/op

BufReader:

Operation Latency:   112.7 ns/op (16.6x faster)
Allocation Count:    3,918 times (99.93% reduction)
GC Runs:            2 times
Per Operation:      0 allocs/op (zero allocation!)

Key Metrics:

  • 🚀 16x Throughput Improvement: 45.7M ops/s vs 2.8M ops/s
  • ⭐ 99.93% Allocation Reduction: From 5.57 million to 3,918 times
  • ✨ Zero Allocation Operations: 0 allocs/op vs 2 allocs/op

3.4.3 Streaming Server Scenario (Real Application)

Simulating 100 concurrent streams, continuously reading and forwarding data:

bufio.Reader:

Operation Latency:   374.6 ns/op
Memory Allocation:   79,508 MB (79 GB!)
GC Runs:            134 times
Per Operation:      4 allocs/op

BufReader:

Operation Latency:   30.29 ns/op (12.4x faster)
Memory Allocation:   601 MB (99.2% reduction)
GC Runs:            2 times (98.5% reduction!)
Per Operation:      0 allocs/op

Stunning Differences:

  • 🎯 GC Runs: 134 → 2 (98.5% reduction)
  • 💾 Memory Allocation: 79 GB → 0.6 GB (132x reduction)
  • ⚡ Throughput: 10.1M → 117M ops/s (11.6x improvement)

3.4.4 Long-Term Running Impact

For streaming server scenarios, 1-hour running estimation:

bufio.Reader:

Estimated Memory Allocation: ~2.8 TB
Estimated GC Runs: ~4,800 times
Cumulative GC Pause: Significant

BufReader:

Estimated Memory Allocation: ~21 GB (133x reduction)
Estimated GC Runs: ~72 times (67x reduction)
Cumulative GC Pause: Minimal

Usage Recommendations:

| Scenario | Recommended | Reason |
|---|---|---|
| Simple file reading | bufio.Reader | Standard library is sufficient |
| High-concurrency network server | BufReader ⭐ | 98% GC reduction |
| Streaming media processing | BufReader ⭐ | Zero allocation, high throughput |
| Long-running services | BufReader ⭐ | More stable system |

3.4.5 Essential Reasons for Performance Improvement

While bufio.Reader is faster in some simple scenarios, BufReader's design goals are not to be faster in all cases, but rather:

  1. Eliminate Memory Allocation - Avoid frequent make([]byte, n) in real applications
  2. Reduce GC Pressure - Reuse memory through object pool, reducing garbage collection burden
  3. Zero-Copy Processing - Provide ReadRange API for direct data manipulation
  4. Chained Buffering - Support complex data processing patterns

In scenarios like Monibuca streaming server, the value of these features far exceeds microsecond-level latency differences.

Real Impact: When handling 1000 concurrent streaming connections:

// bufio.Reader approach
// 1000 connections × 30 packets/s = 30,000 allocations per second
// 1024 bytes per allocation = ~30 MB/s of short-lived temporary memory
// All of it becomes garbage, keeping the GC continuously busy

// BufReader approach  
// 0 allocations (memory reuse)
// 90%+ GC pressure reduction
// Significantly improved system stability

Selection Guidelines:

  • πŸ“ Simple file reading β†’ bufio.Reader
  • πŸ”„ High-concurrency network services β†’ BufReader (98% GC reduction)
  • πŸ’Ύ Long-running services β†’ BufReader (zero allocation)
  • 🎯 Streaming server β†’ BufReader (10-20x throughput)

4. Real-World Use Cases

4.1 RTSP Protocol Parsing

// Use BufReader to parse RTSP requests
func parseRTSPRequest(conn net.Conn) (*RTSPRequest, error) {
    reader := util.NewBufReader(conn)
    defer reader.Recycle()
    
    // Read request line: zero-copy, no memory allocation
    requestLine, err := reader.ReadLine()
    if err != nil {
        return nil, err
    }
    
    // Read headers: directly operate on memory blocks
    headers, err := reader.ReadMIMEHeader()
    if err != nil {
        return nil, err
    }
    
    // Read body (if present)
    var body []byte
    if contentLength := headers.Get("Content-Length"); contentLength != "" {
        length, err := strconv.Atoi(contentLength)
        if err != nil {
            return nil, err
        }
        // ReadRange provides zero-copy access; copy out here because
        // the underlying blocks are recycled after reading
        if err = reader.ReadRange(length, func(chunk []byte) {
            body = append(body, chunk...)
        }); err != nil {
            return nil, err
        }
    }
    
    return &RTSPRequest{
        RequestLine: requestLine,
        Headers:     headers,
        Body:        body,
    }, nil
}

4.2 Streaming Media Packet Parsing

// Use BufReader to parse FLV packets
func parseFLVPackets(conn net.Conn) error {
    reader := util.NewBufReader(conn)
    defer reader.Recycle()
    
    for {
        // Read tag type: 1 byte
        packetType, err := reader.ReadByte()
        if err != nil {
            return err
        }
        
        // Read data size: 3 bytes big-endian
        dataSize, err := reader.ReadBE32(3)
        if err != nil {
            return err
        }
        
        // Read timestamp: 4 bytes (3-byte value + extended byte, simplified)
        timestamp, err := reader.ReadBE32(4)
        if err != nil {
            return err
        }
        
        // Skip StreamID: 3 bytes
        if err := reader.Skip(3); err != nil {
            return err
        }
        
        // Read actual data: zero-copy processing
        err = reader.ReadRange(int(dataSize), func(data []byte) {
            // Process data directly, no copy needed
            processPacket(packetType, timestamp, data)
        })
        if err != nil {
            return err
        }
        
        // Skip previous tag size
        if err := reader.Skip(4); err != nil {
            return err
        }
    }
}

4.3 Performance-Critical Scenarios

BufReader is particularly suitable for:

  1. High-frequency small packet processing: Network protocol parsing, RTP/RTCP packet handling
  2. Large data stream transmission: Continuous reading of video/audio streams
  3. Multi-step protocol reading: Protocols requiring step-by-step reading of different length data
  4. Low-latency requirements: Real-time streaming media transmission, online gaming
  5. High-concurrency scenarios: Servers with massive concurrent connections

5. Best Practices

5.1 Correct Usage Patterns

// ✅ Correct: Specify appropriate block size on creation
func goodExample(conn net.Conn) {
    // Choose block size based on actual packet size
    reader := util.NewBufReaderWithBufLen(conn, 16384)  // 16KB blocks
    defer reader.Recycle()  // Ensure resource recycling
    
    // Use ReadRange for zero-copy
    reader.ReadRange(1024, func(data []byte) {
        // Process directly, don't hold reference to data
        process(data)
    })
}

// ❌ Wrong: Forget to recycle resources
func badExample1(conn net.Conn) {
    reader := util.NewBufReader(conn)
    // Missing defer reader.Recycle()
    // Memory blocks cannot be returned to object pool
}

// ❌ Wrong: Holding data reference
var globalData []byte

func badExample2(conn net.Conn) {
    reader := util.NewBufReader(conn)
    defer reader.Recycle()
    
    reader.ReadRange(1024, func(data []byte) {
        // ❌ Wrong: data will be recycled after Recycle
        globalData = data  // Dangling reference
    })
}

// ✅ Correct: Copy when data needs to be retained
func goodExample2(conn net.Conn) {
    reader := util.NewBufReader(conn)
    defer reader.Recycle()
    
    var saved []byte
    reader.ReadRange(1024, func(data []byte) {
        // Explicitly copy when retention needed
        saved = make([]byte, len(data))
        copy(saved, data)
    })
    // Now safe to use saved
}
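Why the "don't hold references" rule matters: once a block goes back to the pool, the next user overwrites it, and any retained view silently changes. A deterministic toy free-list (illustrative, far simpler than GoMem) makes the hazard visible:

```go
package main

import "fmt"

// freeList is a toy block pool: get pops a free block, put pushes one back.
type freeList struct{ blocks [][]byte }

func (f *freeList) get() []byte {
	if n := len(f.blocks); n > 0 {
		b := f.blocks[n-1]
		f.blocks = f.blocks[:n-1]
		return b
	}
	return make([]byte, 16)
}

func (f *freeList) put(b []byte) { f.blocks = append(f.blocks, b) }

func main() {
	var pool freeList

	b := pool.get()
	copy(b, "hello")
	view := b[:5] // a "zero-copy" view into the pooled block
	pool.put(b)   // block recycled -- the view is now dangling

	b2 := pool.get() // the same block comes back out
	copy(b2, "world")

	fmt.Println(string(view)) // prints "world": the retained view was overwritten
}
```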

5.2 Block Size Selection

// Choose appropriate block size based on scenario
const (
    // Small packet protocols (e.g., RTSP, HTTP headers)
    SmallPacketSize = 4 << 10   // 4KB
    
    // Medium data streams (e.g., audio)
    MediumPacketSize = 16 << 10  // 16KB
    
    // Large data streams (e.g., video)
    LargePacketSize = 64 << 10   // 64KB
)

func createReaderForProtocol(conn net.Conn, protocol string) *util.BufReader {
    var bufSize int
    switch protocol {
    case "rtsp", "http":
        bufSize = SmallPacketSize
    case "audio":
        bufSize = MediumPacketSize
    case "video":
        bufSize = LargePacketSize
    default:
        bufSize = MediumPacketSize  // util's default buffer size is unexported
    }
    return util.NewBufReaderWithBufLen(conn, bufSize)
}

5.3 Error Handling

func robustRead(conn net.Conn) error {
    reader := util.NewBufReader(conn)
    defer func() {
        // Ensure resources are recycled in all cases
        reader.Recycle()
    }()
    
    // Set timeout
    conn.SetReadDeadline(time.Now().Add(5 * time.Second))
    
    // Read data
    data, err := reader.ReadBytes(1024)
    if err != nil {
        if err == io.EOF {
            // Normal end
            return nil
        }
        // Handle other errors
        return fmt.Errorf("read error: %w", err)
    }
    
    // Process data
    processData(data)
    return nil
}

6. Performance Optimization Tips

6.1 Batch Processing

// ✅ Optimized: Batch reading and processing
func optimizedBatchRead(reader *util.BufReader) error {
    // Read a large chunk of data at once
    return reader.ReadRange(65536, func(chunk []byte) {
        // Parse length-prefixed packets inside the callback
        for len(chunk) >= 4 {
            packetSize := int(binary.BigEndian.Uint32(chunk[:4]))
            if len(chunk) < 4+packetSize {
                break // incomplete packet at the end of the chunk
            }
            processPacket(chunk[4 : 4+packetSize])
            chunk = chunk[4+packetSize:]
        }
    })
}

// ❌ Inefficient: Read one by one
func inefficientRead(reader *util.BufReader) error {
    for {
        size, err := reader.ReadBE32(4)
        if err != nil {
            return err
        }
        packet, err := reader.ReadBytes(int(size))
        if err != nil {
            return err
        }
        processPacket(packet.Buffers[0])
    }
}

6.2 Avoid Unnecessary Copying

// ✅ Optimized: Direct processing, no copy
func zeroCopyProcess(reader *util.BufReader) error {
    return reader.ReadRange(4096, func(data []byte) {
        // Operate directly on original memory
        sum := 0
        for _, b := range data {
            sum += int(b)
        }
        reportChecksum(sum)
    })
}

// ❌ Inefficient: Unnecessary copy
func unnecessaryCopy(reader *util.BufReader) error {
    mem, err := reader.ReadBytes(4096)
    if err != nil {
        return err
    }
    // Another copy performed
    data := make([]byte, mem.Size)
    copy(data, mem.Buffers[0])
    
    sum := 0
    for _, b := range data {
        sum += int(b)
    }
    reportChecksum(sum)
    return nil
}

6.3 Proper Resource Management

// ✅ Optimized: Use object pool to manage BufReader
type ConnectionPool struct {
    readers sync.Pool
}

func (p *ConnectionPool) GetReader(conn net.Conn) *util.BufReader {
    if reader := p.readers.Get(); reader != nil {
        r := reader.(*util.BufReader)
        // Re-bind r to the new conn here; pooled reuse is only safe
        // once the reader has been reset against the new connection
        return r
    }
    return util.NewBufReader(conn)
}

func (p *ConnectionPool) PutReader(reader *util.BufReader) {
    reader.Recycle()  // Recycle memory blocks
    p.readers.Put(reader)  // Recycle BufReader object itself
}

// Use connection pool
func handleConnection(pool *ConnectionPool, conn net.Conn) {
    reader := pool.GetReader(conn)
    defer pool.PutReader(reader)
    
    // Handle connection
    processConnection(reader)
}

7. Summary

7.1 Performance Comparison Visualization

Based on actual benchmark results (concurrent scenarios):

📊 GC Runs Comparison (Core Advantage) ⭐⭐⭐
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader   ████████████████████████████████████████  134 runs
BufReader      █  2 runs  ← 98.5% reduction!

📊 Total Memory Allocation Comparison
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader   ████████████████████████████████████████  79 GB
BufReader      █  0.6 GB  ← 99.2% reduction!

📊 Operation Throughput Comparison
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader   █████  10.1M ops/s
BufReader      ████████████████████████████████████████  117M ops/s  ← 11.6x!

Key Metrics (Streaming Server Scenario):

  • 🎯 GC Runs: From 134 to 2 (98.5% reduction)
  • 💾 Memory Allocation: From 79 GB to 0.6 GB (132x reduction)
  • ⚡ Throughput: 11.6x improvement

7.2 Core Advantages

BufReader achieves zero-copy, high-performance network data reading through:

  1. Zero-Copy Architecture

    • Data read directly from network to final memory location
    • Use slice views to avoid data copying
    • Chained buffer supports large data processing
  2. Memory Reuse Mechanism

    • GoMem object pool reuses memory blocks
    • Active memory management reduces GC pressure
    • Configurable block sizes adapt to different scenarios
  3. Significant Performance Improvement (in concurrent scenarios)

    • GC runs reduced by 98.5% (134 β†’ 2)
    • Memory allocation reduced by 99.2% (79 GB β†’ 0.6 GB)
    • Throughput improved by 10-20x
    • Significantly improved system stability

7.3 Ideal Use Cases

BufReader is particularly suitable for:

  • ✅ High-performance network servers
  • ✅ Streaming media data processing
  • ✅ Real-time protocol parsing
  • ✅ Large data stream transmission
  • ✅ Low-latency requirements
  • ✅ High-concurrency environments

Not suitable for:

  • ❌ Simple file reading (standard library is sufficient)
  • ❌ Single small data reads
  • ❌ Performance-insensitive scenarios

7.4 Choosing Between bufio.Reader and BufReader

| Scenario | Recommended |
|---|---|
| Simple file reading | bufio.Reader |
| Low-frequency network reads | bufio.Reader |
| High-performance network server | BufReader |
| Streaming media processing | BufReader |
| Protocol parsers | BufReader |
| Zero-copy requirements | BufReader |
| Memory-sensitive scenarios | BufReader |

7.5 Key Points

Remember when using BufReader:

  1. Always call Recycle(): Ensure memory blocks are returned to object pool
  2. Don't hold data references: Data in ReadRange callback will be recycled
  3. Choose appropriate block size: Adjust based on actual packet size
  4. Leverage ReadRange: Achieve true zero-copy processing
  5. Use with GoMem: Fully leverage memory reuse advantages

Through the combination of BufReader and GoMem, Monibuca achieves high-performance network data processing, providing solid infrastructure support for streaming media servers.
