AI Code Generation in Practice: What Works and What Doesn't
A grounded look at AI-powered code generation. When does it actually speed up your workflow, and where does it fall short?
Jean-Pierre Broeders
Freelance DevOps Engineer
The hype around AI code generation has cooled off a bit. Good. That means there's room for a realistic conversation about what it actually delivers in a day-to-day development workflow.
The reality of generated code
Generated code isn't magic. It's pattern matching on steroids. Repetitive tasks benefit enormously, while complex business logic still requires a human brain.
A concrete example: say you need a standard CRUD endpoint:
[ApiController]
[Route("api/[controller]")]
public class ProductsController : ControllerBase
{
    private readonly IProductRepository _repository;

    public ProductsController(IProductRepository repository)
    {
        _repository = repository;
    }

    [HttpGet("{id}")]
    public async Task<ActionResult<Product>> GetById(int id)
    {
        var product = await _repository.GetByIdAsync(id);
        if (product == null) return NotFound();
        return Ok(product);
    }

    [HttpPost]
    public async Task<ActionResult<Product>> Create(
        [FromBody] CreateProductRequest request)
    {
        var product = new Product
        {
            Name = request.Name,
            Price = request.Price,
            Category = request.Category
        };
        await _repository.AddAsync(product);
        return CreatedAtAction(
            nameof(GetById),
            new { id = product.Id },
            product);
    }
}
This kind of boilerplate? That's where AI code generation shines. The structure is predictable, the pattern appears thousands of times in training data, and the output is usable right away.
Where it actually works
Generating unit tests. Feed a function as input and let the AI come up with test cases. Often it surfaces edge cases that would otherwise get missed. Not every test is perfect, but the starting point saves real time.
import re

# Input: a simple validation function
def validate_email(email: str) -> bool:
    # Matches the domain one dot-separated label at a time,
    # so consecutive dots are rejected
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))

# Generated tests (after review and tweaks)
def test_valid_email():
    assert validate_email("user@example.com") is True

def test_missing_at_sign():
    assert validate_email("userexample.com") is False

def test_empty_string():
    assert validate_email("") is False

def test_special_chars_in_local():
    assert validate_email("user+tag@example.com") is True

def test_double_dot_domain():
    assert validate_email("user@example..com") is False
Database migrations and schemas. Going from a description to a migration script works surprisingly well. Especially for standard relationships.
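As a minimal sketch of what "description to migration" looks like for a standard one-to-many relationship — the table names, columns, and SQLite target are invented for illustration, not output from any specific tool:

```python
import sqlite3

# The kind of migration AI generates reliably: two tables with a
# straightforward foreign-key relationship (categories -> products).
MIGRATION = """
CREATE TABLE categories (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price REAL NOT NULL CHECK (price >= 0),
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
"""

# Apply the migration against an in-memory database to sanity-check it
conn = sqlite3.connect(":memory:")
conn.executescript(MIGRATION)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```

The review step still applies: constraints, indexes, and cascade behavior are exactly the details a generated script tends to leave out.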
Configuration files. Docker Compose files, CI pipelines, Terraform modules — configuration with a fixed format is ideal territory.
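A typical result for a Compose file — service names, image tags, and ports below are illustrative, not a recommendation:

```yaml
# Illustrative docker-compose.yml; adjust images and ports to your stack
services:
  api:
    build: .
    ports:
      - "8080:8080"
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      - db-data:/var/lib/postgresql/data
volumes:
  db-data:
```

The fixed structure is exactly why this works: there are only so many valid ways to wire a service to a database in Compose.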
Where it breaks down
Problems start the moment domain knowledge is required. AI doesn't know your business rules. It has no idea that customer X gets a special discount on Tuesdays, or that certain products can't be combined in an order.
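To make that concrete, here's a hedged sketch of such a rule — the customer ID, the 10% figure, and the rounding are all invented for illustration. Nothing in the code's surface pattern hints at this logic, which is why it has to be written by hand:

```python
from datetime import date
from decimal import Decimal

# Hypothetical business rule: customer "X" gets 10% off on Tuesdays.
# A model trained on public code has no way to infer this; it lives
# only in the team's heads, tickets, or domain documentation.
def apply_discount(customer_id: str, amount: Decimal, order_date: date) -> Decimal:
    if customer_id == "X" and order_date.weekday() == 1:  # 1 == Tuesday
        return (amount * Decimal("0.90")).quantize(Decimal("0.01"))
    return amount
```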
Another common issue: security. Generated code regularly contains vulnerabilities. SQL queries without parameterization. Hardcoded API keys. Missing input validation.
// This kind of code comes up way too often — DON'T use it
const query = `SELECT * FROM users WHERE name = '${userName}'`;
// What it should have been
const query = 'SELECT * FROM users WHERE name = $1';
const result = await pool.query(query, [userName]);
Any experienced developer writes the second version. But AI still picks the first one too often, especially with quick prompts.
A practical approach
A workflow that works well in practice:
- Generate the skeleton. Let AI create the basics — endpoints, models, configuration.
- Review everything. No exceptions. Every line of generated code gets read as if a junior wrote it.
- Write business logic by hand. The core of the application stays human-crafted.
- Have tests generated. After writing the function. Verify that edge cases make sense.
- Run security scanning. Tools like Snyk or SonarQube over the generated output.
Setting the right expectations
AI code generation doesn't replace developers. It replaces typing. That distinction matters. The time savings come from not having to type boilerplate, not from skipping architectural thinking.
Teams that get the most out of it treat it as smart autocomplete. Not as a colleague shipping features independently. With that mindset, it genuinely saves hours per week — without the technical debt that piles up when generated code gets accepted blindly.
The tooling keeps improving rapidly. But the fundamental limitation remains: a model generating code from patterns doesn't understand your specific context. As long as that's the case, it stays a tool. A good one, though.
