Are LLMs making BDD & Gherkin rise again?
  1. First impressions of LLM-powered coding assistants
  2. Challenges with BDD
  3. Opportunity
  4. Discussion

As AI coding assistants become increasingly common in software development, many teams are discovering their potential to transform how we design and maintain test automation. In this post, I share my early thoughts on how LLM-powered tools might revolutionize Behavior-Driven Development (BDD).

First impressions of LLM-powered coding assistants

When working with AI coding assistants powered by large language models (LLMs), I have noticed they thrive on structure, checklists, and concrete examples.

Before large language models (LLMs) became as popular as they are today, I was already fascinated by Behavior-Driven Development (BDD) and its Gherkin syntax – a human- and business-friendly way to express test cases in natural language. With its clear rules and structured syntax, Gherkin feels like a natural fit for LLM-driven assistance.
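For readers unfamiliar with the syntax, here is what a typical Gherkin scenario looks like (the feature and steps are invented for illustration):

```gherkin
Feature: Shopping cart

  Scenario: Adding an item updates the total
    Given the cart is empty
    When the user adds a "coffee mug" priced at $12
    Then the cart total should be $12
```

Each Given/When/Then line maps to a step definition in code – and managing that mapping is where the real work begins.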

Challenges with BDD

As the number of test cases grows, maintaining them becomes a real challenge. Without a well-defined and structured set of step definitions, things can quickly turn chaotic.

From my experience, the most time-consuming part of setting up a new BDD framework is designing step definitions. It requires careful balance:

  • Clarity: Steps must be understandable at a glance.
  • Reusability: They should be generic enough to cover multiple cases.
  • Simplicity: Too generic is also a problem, as over-abstracted steps become confusing and difficult to maintain. Each definition should keep a single responsibility, so its scope is easy to identify at a glance and it is easy to decide whether to reuse an existing step or create a new one.

Once a “step definition dictionary” is established, another challenge emerges: identifying whether a similar step already exists. Without proper tooling, duplicate or overlapping steps can proliferate, creating inconsistency and maintenance pain.
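As a rough sketch of what such tooling might do, here is a minimal similarity check using Python's standard-library difflib – the step texts, function name, and threshold are all my own illustrative choices, not part of any BDD framework:

```python
from difflib import SequenceMatcher


def find_similar_steps(new_step: str, dictionary: list[str],
                       threshold: float = 0.8) -> list[str]:
    """Return existing steps whose wording closely matches the new one."""
    return [
        existing
        for existing in dictionary
        if SequenceMatcher(None, new_step.lower(), existing.lower()).ratio() >= threshold
    ]


steps = [
    "the user logs in with valid credentials",
    "the user adds an item to the cart",
]

# A near-duplicate of an existing step gets flagged before it is added.
print(find_similar_steps("the user logs in with a valid credential", steps))
```

A real tool would likely compare semantics rather than just wording (near-duplicates can be phrased very differently), which is exactly where an LLM could help.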

These issues highlight how tooling and management are critical to BDD’s long-term success.

Opportunity

When I first used an LLM-based chatbot to generate test cases, I realized how powerful these tools could be. They can:

  • Help with naming step definitions consistently.
  • Translate requirements and expectations into BDD-style test cases, grounded in an existing step definition dictionary.
  • Refine my draft test cases – taking care of naming, finding existing step definitions I can reuse, and so on.
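One simple way to ground an assistant in an existing step dictionary is to include it in the prompt. A minimal sketch – the helper name and prompt wording are my own invention, not any specific tool's API:

```python
def build_prompt(requirement: str, step_dictionary: list[str]) -> str:
    """Assemble an LLM prompt that grounds generation in existing steps."""
    steps = "\n".join(f"- {s}" for s in step_dictionary)
    return (
        "You are helping write Gherkin scenarios.\n"
        "Reuse steps from this dictionary whenever possible; "
        "only propose a new step when none fits:\n"
        f"{steps}\n\n"
        f"Requirement: {requirement}\n"
        "Write the scenario in Given/When/Then form."
    )


prompt = build_prompt(
    "A returning customer can check out with a saved card",
    ["the user logs in with valid credentials",
     "the user adds an item to the cart"],
)
print(prompt)
```

Even this naive approach nudges the model toward reuse instead of inventing overlapping steps; a production setup would likely retrieve only the most relevant steps rather than the whole dictionary.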

With this capability, I believe that LLM-based tools have the potential to elevate test automation to an entirely new level, especially for the BDD community – where structured syntax like Gherkin provides the perfect foundation for AI-assisted generation, validation, and management of test cases.

Discussion

This might also influence how we manage our test cases and how Test Case Management systems evolve in the future. Future systems will likely integrate with AI coding assistants, feeding them text-based materials that support automated test case generation and optimization.

It could also mark a broader trend in which feature files and natural-language requirements become the primary input for automation, much like how GitOps and Infrastructure as Code are redefining infrastructure management.

Ultimately, I believe that LLM-powered testing will play a crucial role in modern software delivery, enhancing collaboration, reducing maintenance friction, and improving the overall productivity of testing teams.


I’m Hung

Welcome to my personal space. This is a small corner where I share ideas and things I find interesting.
