.NET Performance Tips That Actually Make a Difference

Practical optimizations for .NET applications that have substantial impact on performance and resource usage.

Jean-Pierre Broeders

Freelance DevOps Engineer

February 22, 2026

6 min. read


Performance optimization often gets postponed until the application goes live and production servers start sweating under realistic load. A production API that needs to respond within 100ms but consistently takes 800ms is a problem that won't be solved by adding more RAM.

Span<T> and Memory<T> for String Operations

Most .NET applications waste memory on substring operations. Every time string.Substring() gets called, a new string instance is created. At high throughput, this adds up quickly.

// Traditional: new allocation per substring
string input = "2026-02-22T06:30:00Z";
string date = input.Substring(0, 10);  // allocates new object

// With Span<T>: zero-copy slicing
ReadOnlySpan<char> input = "2026-02-22T06:30:00Z";
ReadOnlySpan<char> date = input.Slice(0, 10);  // no allocation

The difference seems marginal, but in an API processing 1000 requests per second this becomes noticeable quickly. Span works directly on the underlying memory without creating new objects.
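Slicing also composes with the span-based search helpers such as IndexOf, so a timestamp can be split into parts without creating a single intermediate string. A minimal sketch:

```csharp
using System;

ReadOnlySpan<char> timestamp = "2026-02-22T06:30:00Z";

// IndexOf scans the underlying memory directly — no allocation.
int separator = timestamp.IndexOf('T');

ReadOnlySpan<char> datePart = timestamp.Slice(0, separator);      // "2026-02-22"
ReadOnlySpan<char> timePart = timestamp.Slice(separator + 1, 8);  // "06:30:00"

// Materialize to a string only at the boundary where one is required.
Console.WriteLine($"{datePart} / {timePart}");  // prints 2026-02-22 / 06:30:00
```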

For parsing scenarios this performs particularly well:

public static bool TryParseCustomFormat(ReadOnlySpan<char> input, out DateTime result)
{
    result = default;

    // Expect at least "yyyy-MM-dd" with '-' separators (the full example
    // "2026-02-22T06:30:00Z" is 20 characters; the date prefix is 10).
    if (input.Length < 10 || input[4] != '-' || input[7] != '-')
        return false;

    // int.TryParse has span overloads since .NET Core 2.1 — no substrings,
    // and no exceptions on malformed input (a Try method should not throw).
    if (!int.TryParse(input.Slice(0, 4), out int year) ||
        !int.TryParse(input.Slice(5, 2), out int month) ||
        !int.TryParse(input.Slice(8, 2), out int day))
        return false;

    result = new DateTime(year, month, day);
    return true;
}

ValueTask for Hot Paths

Async/await is standard in modern .NET applications, but creates unnecessary allocations when operations can complete synchronously. A typical scenario: cache lookup that usually results in a hit.

// Task<T> always allocates
public async Task<User> GetUserAsync(int id)
{
    if (_cache.TryGetValue(id, out var user))
        return user;  // synchronous return, but Task gets allocated anyway
    
    return await _database.GetUserAsync(id);
}

// ValueTask<T> prevents allocation on cache hit
public async ValueTask<User> GetUserAsync(int id)
{
    if (_cache.TryGetValue(id, out var user))
        return user;  // no heap allocation
    
    return await _database.GetUserAsync(id);
}

An application with a 70% cache hit rate saves thousands of allocations per second with this approach. The garbage collector has less work to do, and latency drops.
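When the cache hit can be returned without awaiting anything, the async keyword can be dropped entirely, skipping the state machine as well as the allocation. A minimal self-contained sketch (the dictionary cache and the LoadFromDatabaseAsync fallback are stand-ins for the fields in the snippet above):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var cache = new Dictionary<int, string> { [1] = "cached" };

// Hypothetical stand-in for the real database call.
Task<string> LoadFromDatabaseAsync(int id) => Task.FromResult("from-db");

// No 'async' keyword: a cache hit wraps the value in a struct — no state
// machine, no heap allocation. A miss wraps the database Task instead.
ValueTask<string> GetUserNameAsync(int id) =>
    cache.TryGetValue(id, out var name)
        ? new ValueTask<string>(name)
        : new ValueTask<string>(LoadFromDatabaseAsync(id));

Console.WriteLine(await GetUserNameAsync(1));  // prints cached
Console.WriteLine(await GetUserNameAsync(2));  // prints from-db
```

One rule comes with ValueTask: await it at most once, and never concurrently. If the result is needed multiple times, convert it with .AsTask() first.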

StringBuilder Capacity

StringBuilder keeps string concatenation efficient, but it is often initialized without a capacity.

// Grows incrementally, multiple allocations
var builder = new StringBuilder();
for (int i = 0; i < 1000; i++)
{
    builder.Append($"Item {i}\n");
}

// Pre-allocate with estimated capacity
var builder = new StringBuilder(capacity: 15000);  // ~15 chars per item
for (int i = 0; i < 1000; i++)
{
    builder.Append($"Item {i}\n");
}

When no capacity is specified, StringBuilder starts at a default capacity of 16 characters and grows every time that runs out, which means repeated internal allocations and array copies. For workloads with a known size, pre-allocating makes a significant difference.
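When the final size is only known at runtime, it can often be computed first and reserved up front. A minimal sketch (the lines list is an assumption standing in for real data):

```csharp
using System;
using System.Linq;
using System.Text;

// Assumed input data: 1000 lines of the form "Item {i}".
var lines = Enumerable.Range(0, 1000).Select(i => $"Item {i}").ToList();

// Reserve the exact total up front: one allocation instead of repeated growth.
var builder = new StringBuilder(lines.Sum(l => l.Length + 1));

foreach (var line in lines)
    builder.Append(line).Append('\n');

Console.WriteLine(builder.Length);  // prints 8890 — exactly the reserved capacity
```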

Array Pooling for Buffers

Temporary buffers often get allocated and immediately discarded. ArrayPool reuses arrays without garbage collector overhead.

// Standard: new allocation per request
byte[] ProcessRequest(Stream input)
{
    byte[] buffer = new byte[4096];
    // ... process data
    return result;
}  // buffer gets GC'd

// With ArrayPool: reuse
byte[] ProcessRequest(Stream input)
{
    byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
    try
    {
        // ... process data
        return result;
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer);
    }
}

This pattern is especially effective in high-throughput services where temporary buffers are constantly needed. The pool automatically manages available arrays and reuses them where possible.
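One caveat worth baking into the pattern: Rent may hand back an array larger than requested, so the code should track the requested length itself rather than rely on buffer.Length. A self-contained sketch (ChecksumStream is a hypothetical stand-in for the processing step):

```csharp
using System;
using System.Buffers;
using System.IO;

static int ChecksumStream(Stream input)
{
    const int requested = 4096;
    byte[] buffer = ArrayPool<byte>.Shared.Rent(requested);
    try
    {
        // Rent guarantees *at least* 4096 bytes; buffer.Length may be larger,
        // so read against the requested size, not the array length.
        int read, sum = 0;
        while ((read = input.Read(buffer, 0, requested)) > 0)
            for (int i = 0; i < read; i++)
                sum += buffer[i];
        return sum;
    }
    finally
    {
        // clearArray: true zeroes the buffer before reuse — cheap insurance
        // when the data is sensitive, since the next renter sees this array.
        ArrayPool<byte>.Shared.Return(buffer, clearArray: true);
    }
}

using var ms = new MemoryStream(new byte[] { 1, 2, 3, 4 });
Console.WriteLine(ChecksumStream(ms));  // prints 10
```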

LINQ Materializes Collections

LINQ is expressive and readable, but materializes intermediate results when not careful.

// Materializes 3x
var result = items
    .Where(x => x.Active)      // IEnumerable
    .ToList()                   // List - first materialization
    .OrderBy(x => x.Priority)   // IOrderedEnumerable
    .ToList()                   // List - second materialization
    .Take(10)                   // IEnumerable
    .ToList();                  // List - third materialization

// Materializes 1x at the end
var result = items
    .Where(x => x.Active)
    .OrderBy(x => x.Priority)
    .Take(10)
    .ToList();

Each .ToList() call creates a new collection and copies all elements. Chain operations first, materialize only when needed.
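The reason chaining is cheap is deferred execution: the intermediate operators only describe the pipeline, and a single ToList() runs it once. A small demonstration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var items = new List<int> { 1, 2, 3 };

// No work happens here — 'query' is just a description of the pipeline.
var query = items.Where(x => x % 2 == 1).Select(x => x * 10);

items.Add(5);  // still visible to the query: it hasn't executed yet

var result = query.ToList();  // the pipeline runs exactly once, here

Console.WriteLine(string.Join(",", result));  // prints 10,30,50
```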

For hot paths over large collections, a plain loop can perform better still:

// LINQ
var active = items.Where(x => x.Active && x.Priority > 5).ToList();

// Plain loop: no enumerator or delegate overhead
var active = new List<User>(capacity: items.Count / 2);  // rough estimate of matches
foreach (var item in items)
{
    if (item.Active && item.Priority > 5)
        active.Add(item);
}

Async Streams for Large Datasets

When processing large datasets, a method returning Task<List<T>> forces the caller to wait until everything is loaded. IAsyncEnumerable<T> streams results as they become available.

// Loads everything first, then returns
public async Task<List<User>> GetUsersAsync()
{
    var users = new List<User>();
    await foreach (var batch in _database.GetBatchesAsync())
    {
        users.AddRange(batch);
    }
    return users;  // caller must wait for complete list
}

// Stream results directly
public async IAsyncEnumerable<User> GetUsersStreamAsync()
{
    await foreach (var batch in _database.GetBatchesAsync())
    {
        foreach (var user in batch)
        {
            yield return user;  // consumer can process immediately
        }
    }
}

The consumer can start processing as soon as the first results arrive, instead of waiting for the complete list.
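On the consuming side, await foreach drives the stream directly. A self-contained sketch, with a hypothetical in-memory source standing in for _database.GetBatchesAsync():

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for the database batches.
static async IAsyncEnumerable<int[]> GetBatchesAsync()
{
    for (int batch = 0; batch < 3; batch++)
    {
        await Task.Delay(10);                            // simulate per-batch I/O latency
        yield return new[] { batch * 2, batch * 2 + 1 };
    }
}

static async IAsyncEnumerable<int> GetItemsStreamAsync()
{
    await foreach (var batch in GetBatchesAsync())
        foreach (var item in batch)
            yield return item;                           // flows to the consumer per item
}

// The consumer starts after the first batch (~10ms), not after all three.
var seen = new List<int>();
await foreach (var item in GetItemsStreamAsync())
    seen.Add(item);

Console.WriteLine(string.Join(",", seen));  // prints 0,1,2,3,4,5
```

In a real service, the iterator would typically also take a CancellationToken marked with [EnumeratorCancellation], so the consumer can stop the stream via WithCancellation.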

Conclusion

These optimizations aren't premature optimization. They're fundamental patterns that can be applied from day one without increasing code complexity. Span<T> for string operations, ValueTask<T> for hot paths, ArrayPool<T> for buffers: each takes minutes to implement, and together they save thousands of allocations per second under load.

Performance isn't a feature that gets added later. It gets built in from the beginning, or it becomes a problem in production.
