.NET Performance Tips That Actually Make a Difference
Practical optimizations for .NET applications with a substantial impact on performance and resource usage.
Jean-Pierre Broeders
Freelance DevOps Engineer
Performance optimization often gets postponed until the application goes live and production servers start sweating under realistic load. A production API that needs to respond within 100ms but consistently takes 800ms is a problem that won't be solved by adding more RAM.
Span&lt;T&gt; and Memory&lt;T&gt; for String Operations
Many .NET applications waste memory on substring operations. Every call to string.Substring() creates a new string instance. At high throughput, this adds up quickly.
// Traditional: new allocation per substring
string input = "2026-02-22T06:30:00Z";
string date = input.Substring(0, 10); // allocates new object
// With Span<T>: zero-copy slicing
ReadOnlySpan<char> input = "2026-02-22T06:30:00Z";
ReadOnlySpan<char> date = input.Slice(0, 10); // no allocation
The difference seems marginal, but in an API processing 1000 requests per second it becomes noticeable quickly. A Span&lt;T&gt; slice is a view over the original memory, so nothing gets copied.
For parsing scenarios this performs particularly well:
public static bool TryParseCustomFormat(ReadOnlySpan<char> input, out DateTime result)
{
    result = default;

    // Need at least the date portion: "yyyy-MM-dd"
    if (input.Length < 10)
        return false;

    // TryParse instead of Parse: a Try* method shouldn't throw on malformed input
    if (!int.TryParse(input.Slice(0, 4), out int year) ||
        !int.TryParse(input.Slice(5, 2), out int month) ||
        !int.TryParse(input.Slice(8, 2), out int day))
        return false;

    // Reject values the DateTime constructor would throw on
    if (year < 1 || month < 1 || month > 12 ||
        day < 1 || day > DateTime.DaysInMonth(year, month))
        return false;

    result = new DateTime(year, month, day);
    return true;
}
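A quick usage sketch, feeding it the timestamp from the earlier example:
ReadOnlySpan<char> timestamp = "2026-02-22T06:30:00Z";

if (TryParseCustomFormat(timestamp, out DateTime parsed))
{
    Console.WriteLine(parsed.ToString("yyyy-MM-dd")); // prints 2026-02-22
}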
ValueTask for Hot Paths
Async/await is standard in modern .NET applications, but it creates unnecessary allocations when operations can complete synchronously. A typical scenario is a cache lookup that usually results in a hit.
// Task<T> always allocates
public async Task<User> GetUserAsync(int id)
{
    if (_cache.TryGetValue(id, out var user))
        return user; // synchronous return, but a Task<User> gets allocated anyway

    return await _database.GetUserAsync(id);
}

// ValueTask<T> prevents allocation on a cache hit
public async ValueTask<User> GetUserAsync(int id)
{
    if (_cache.TryGetValue(id, out var user))
        return user; // no heap allocation

    return await _database.GetUserAsync(id);
}
An application with a 70% cache hit rate saves thousands of allocations per second with this approach. The garbage collector has less work to do, and latency drops.
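One caveat: unlike a Task&lt;T&gt;, a ValueTask&lt;T&gt; may only be awaited once. A minimal sketch of the restriction, using the GetUserAsync method above:
ValueTask<User> pending = GetUserAsync(42);
User user = await pending; // fine the first time
// Awaiting 'pending' a second time is invalid: consume a ValueTask only once.
// When multiple awaits are needed, convert it to a Task up front:
Task<User> task = GetUserAsync(42).AsTask();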
StringBuilder Capacity
StringBuilder keeps string concatenation efficient, but it often gets initialized without a capacity.
// Grows incrementally, multiple allocations
var builder = new StringBuilder();
for (int i = 0; i < 1000; i++)
{
    builder.Append($"Item {i}\n");
}

// Pre-allocate with an estimated capacity
var builder = new StringBuilder(capacity: 15000); // ~15 chars per item
for (int i = 0; i < 1000; i++)
{
    builder.Append($"Item {i}\n");
}
When capacity isn't specified, StringBuilder starts with room for 16 characters and has to grow repeatedly as content is appended, costing extra allocations along the way. For known workloads, pre-allocation makes a significant difference.
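The growth is easy to observe directly; a throwaway sketch (the exact grown capacity is a runtime implementation detail):
using System.Text;

var sb = new StringBuilder();
Console.WriteLine(sb.Capacity); // 16, the default

sb.Append(new string('x', 100));
Console.WriteLine(sb.Capacity); // now larger; the builder had to grow along the way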
Array Pooling for Buffers
Temporary buffers often get allocated and immediately discarded. ArrayPool&lt;T&gt; lets you rent buffers from a shared pool and return them after use.
// Standard: new allocation per request
byte[] ProcessRequest(Stream input)
{
    byte[] buffer = new byte[4096];
    int bytesRead = input.Read(buffer, 0, buffer.Length);
    return buffer[..bytesRead]; // copy of the bytes actually read
} // buffer becomes garbage right away

// With ArrayPool: reuse (ArrayPool<T> lives in System.Buffers)
byte[] ProcessRequest(Stream input)
{
    // Rent may hand back an array larger than 4096; only use the portion you need
    byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
    try
    {
        int bytesRead = input.Read(buffer, 0, 4096);
        return buffer[..bytesRead]; // copy the result out before returning the buffer
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer);
    }
}
This pattern is especially effective in high-throughput services where temporary buffers are constantly needed. The pool automatically manages available arrays and reuses them where possible.
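One detail to be aware of: a rented array still contains whatever the previous renter wrote into it. For sensitive data, ArrayPool&lt;T&gt;.Return accepts a clearArray flag that wipes the array before it goes back into the pool:
// Wipe the rented buffer on return,
// so the next renter cannot observe stale data
ArrayPool<byte>.Shared.Return(buffer, clearArray: true);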
LINQ Materializes Collections
LINQ is expressive and readable, but when you're not careful it materializes intermediate results.
// Materializes 3x
var result = items
    .Where(x => x.Active)      // lazy IEnumerable<T>
    .ToList()                  // List<T>: first materialization
    .OrderBy(x => x.Priority)  // IOrderedEnumerable<T>
    .ToList()                  // List<T>: second materialization
    .Take(10)                  // lazy IEnumerable<T>
    .ToList();                 // List<T>: third materialization

// Materializes 1x at the end
var result = items
    .Where(x => x.Active)
    .OrderBy(x => x.Priority)
    .Take(10)
    .ToList();
Each .ToList() call creates a new collection and copies all elements. Chain operations first, materialize only when needed.
For large datasets, switching to a plain loop can even perform better:
// LINQ
var active = items.Where(x => x.Active && x.Priority > 5).ToList();

// Loop (often faster for large collections)
var active = new List<User>(capacity: items.Count / 2); // rough guess at match count
foreach (var item in items)
{
    if (item.Active && item.Priority > 5)
        active.Add(item);
}
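Whether the loop actually wins depends on the data shape, so it's worth measuring rather than assuming. A minimal sketch using BenchmarkDotNet (a third-party NuGet package; the User record and dataset here are illustrative stand-ins):
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public record User(bool Active, int Priority); // illustrative stand-in

[MemoryDiagnoser] // reports allocations per operation alongside timings
public class FilterBenchmarks
{
    private List<User> _items = null!;

    [GlobalSetup]
    public void Setup() =>
        _items = Enumerable.Range(0, 100_000)
            .Select(i => new User(i % 2 == 0, i % 10))
            .ToList();

    [Benchmark(Baseline = true)]
    public List<User> Linq() =>
        _items.Where(x => x.Active && x.Priority > 5).ToList();

    [Benchmark]
    public List<User> Loop()
    {
        var active = new List<User>(_items.Count / 2);
        foreach (var item in _items)
        {
            if (item.Active && item.Priority > 5)
                active.Add(item);
        }
        return active;
    }
}

// Entry point: BenchmarkRunner.Run<FilterBenchmarks>();
The allocation column matters as much as raw timing here, since fewer allocations is the whole point of most patterns in this article.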
Async Streams for Large Datasets
When processing large datasets, a traditional async method blocks until everything is loaded. IAsyncEnumerable&lt;T&gt; streams results to the consumer as they become available.
// Loads everything first, then returns
public async Task<List<User>> GetUsersAsync()
{
    var users = new List<User>();

    await foreach (var batch in _database.GetBatchesAsync())
    {
        users.AddRange(batch);
    }

    return users; // caller must wait for the complete list
}

// Stream results directly
public async IAsyncEnumerable<User> GetUsersStreamAsync()
{
    await foreach (var batch in _database.GetBatchesAsync())
    {
        foreach (var user in batch)
        {
            yield return user; // consumer can process immediately
        }
    }
}
The consumer can start processing as soon as the first results arrive, instead of waiting for the complete result set.
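The consuming side is a single await foreach (a minimal sketch; the Name property on User is assumed):
await foreach (var user in GetUsersStreamAsync())
{
    // Runs as soon as each user is yielded, while later batches are still loading
    Console.WriteLine(user.Name);
}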
Conclusion
These optimizations aren't premature optimization. They're fundamental patterns that can be applied from day one without increasing code complexity. Span&lt;T&gt;, ValueTask&lt;T&gt;, ArrayPool&lt;T&gt;, and IAsyncEnumerable&lt;T&gt; all ship with the base class library, so no extra dependencies are needed.
Performance isn't a feature that gets added later. It gets built in from the beginning, or it becomes a problem in production.
