App-Test-Generator

README.md  view on Meta::CPAN

From Perl:

    use App::Test::Generator qw(generate);

    # Generate to STDOUT
    App::Test::Generator->generate("t/conf/add.yml");

    # Generate directly to a file
    App::Test::Generator->generate('t/conf/add.yml', 't/add_fuzz.t');

    # Holy grail mode - read a Perl file, generate tests, and run them
    # This is a long way away yet, but see t/schema_input.t for a proof of concept
    my $extractor = App::Test::Generator::SchemaExtractor->new(
      input_file => 'Foo.pm',
      output_dir => $dir
    );
    my $schemas = $extractor->extract_all();
    foreach my $schema(keys %{$schemas}) {
      my $tempfile = '/var/tmp/foo.t';    # Use File::Temp in real life
      App::Test::Generator->generate(
        schema => $schemas->{$schema},
        output_file => $tempfile,
      );
      system("$^X -I$dir $tempfile");
      unlink $tempfile;
    }

# OVERVIEW

This module takes a formal input/output specification for a routine or
method and automatically generates test cases. In effect, it allows you
to easily add comprehensive black-box tests in addition to the more
common white-box tests that are typically written for CPAN modules and other
subroutines.

The generated tests combine:

- Random fuzzing based on input types
- Deterministic edge cases for min/max constraints
- Static corpus tests defined in Perl or YAML

This approach strengthens your test suite by probing both expected and
unexpected inputs, helping you to catch boundary errors, invalid data
handling, and regressions without manually writing every case.

# DESCRIPTION

This module implements the logic behind [fuzz-harness-generator](https://metacpan.org/pod/fuzz-harness-generator).
It parses configuration files (fuzz and/or corpus YAML), and
produces a ready-to-run `.t` test script to run through `prove`.

It reads configuration files in any format,
and optional YAML corpus files.
All of the examples in this documentation are in `YAML` format;
other formats may not work, as they are not as heavily tested.
It then generates a [Test::Most](https://metacpan.org/pod/Test%3A%3AMost)-based fuzzing harness combining:

- Randomized fuzzing of inputs (with edge cases)
- Optional static corpus tests from Perl `%cases` or YAML file (`yaml_cases` key)
- Functional or OO mode (via `$new`)
- Reproducible runs via `$seed` and configurable iterations via `$iterations`

# MUTATION-GUIDED TEST GENERATION

`App::Test::Generator` includes a pipeline that automatically closes the
feedback loop between mutation testing, schema extraction, and fuzz
testing. The goal is that surviving mutants drive the creation of new
tests that kill them on the next run, without manual intervention.

## The Pipeline

    mutation survivor
        |
        v
    SchemaExtractor extracts the schema for the enclosing sub
        |
        v
    Schema augmented with boundary values from the mutant
        |
        v
    Augmented schema written to t/conf/
        |
        v
    t/fuzz.t picks up the new schema and runs fuzz tests
        |
        v
    Mutation killed on next run

## How to Use It

The pipeline is driven by three flags passed to
`bin/test-generator-index`, which is invoked automatically by
`bin/generate-test-dashboard` on each CI push.

### Step 1: Generate TODO stubs for all survivors

    bin/test-generator-index --generate_mutant_tests=t

Produces `t/mutant_YYYYMMDD_HHMMSS.t` containing:

- TODO stubs for HIGH and MEDIUM difficulty survivors, with
boundary value suggestions, environment variable hints, and the
enclosing subroutine name for navigation context.
- Comment-only hints for LOW difficulty survivors.

Multiple mutations on the same source line are deduplicated into one
stub. One good test kills all variants on that line.
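
As a rough illustration of that deduplication (a sketch, not the tool's own code; the survivor records and their `file`, `line` and `id` fields, as well as the `COND_NEGATE` label, are hypothetical), grouping survivors by file and line leaves one stub per source line:

```perl
use strict;
use warnings;

# Hypothetical survivor records; the real tool's data structure may differ.
my @survivors = (
    { file => 'lib/Foo.pm', line => 42, id => 'NUM_BOUNDARY_1' },
    { file => 'lib/Foo.pm', line => 42, id => 'NUM_BOUNDARY_2' },
    { file => 'lib/Foo.pm', line => 57, id => 'COND_NEGATE_1' },
);

# Group mutants by "file:line" so that each source line yields a single stub.
my %by_line;
push @{ $by_line{"$_->{file}:$_->{line}"} }, $_->{id} for @survivors;

for my $where (sort keys %by_line) {
    printf "# TODO stub for %s covers: %s\n",
        $where, join(', ', @{ $by_line{$where} });
}
```

Three survivors on two distinct lines produce two stubs here.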

### Step 2: Generate runnable schemas for NUM\_BOUNDARY survivors

    bin/test-generator-index \
        --generate_mutant_tests=t \
        --generate_test=mutant

For each NUM\_BOUNDARY survivor, calls
[App::Test::Generator::SchemaExtractor](https://metacpan.org/pod/App%3A%3ATest%3A%3AGenerator%3A%3ASchemaExtractor) to extract the schema for
the enclosing subroutine. If the confidence level is sufficient, the
schema is augmented with the boundary value from the mutant (plus one
value either side) and written to `t/conf/` as a runnable YAML file.
`t/fuzz.t` picks it up automatically on the next test run.
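
The augmentation step can be pictured with a small sketch. The helper `boundary_values`, the function name `is_adult` and the schema fragment are all hypothetical; only the `edge_cases` key and the "boundary plus one value either side" rule come from this document:

```perl
use strict;
use warnings;

# Assumed helper (not part of the distribution): given the numeric boundary
# found in a surviving mutant, return it with one value either side.
sub boundary_values {
    my ($boundary) = @_;
    return ($boundary - 1, $boundary, $boundary + 1);
}

# Hypothetical schema fragment for a sub whose mutant changed "$age >= 18".
my %schema = (
    function => 'is_adult',
    input    => { age => { type => 'integer' } },
);
$schema{edge_cases}{age} = [ boundary_values(18) ];

print join(', ', @{ $schema{edge_cases}{age} }), "\n";    # prints "17, 18, 19"
```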

      preserves_zero:
        input:
          value:
            type: number
            value: 0
        output:
          type: number
          value: 0

### `$module`

The name of the module (optional).

Using the reserved word `builtin` means you're testing a Perl builtin function.

If omitted, the generator will guess from the config filename:
`My-Widget.conf` -> `My::Widget`.
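
One plausible way to derive the module name from the filename is sketched below. The function `guess_module` is an illustrative assumption, not the generator's actual routine:

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Illustrative only: strip the directory and extension, then turn
# hyphens into the "::" package separator.
sub guess_module {
    my ($config_file) = @_;
    my ($name) = fileparse($config_file, qr/\.[^.]*/);
    $name =~ s/-/::/g;
    return $name;
}

print guess_module('t/conf/My-Widget.conf'), "\n";    # prints "My::Widget"
```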

### `$function`

The function/method to test.

This defaults to `run`.

### `%new`

An optional hashref of args to pass to the module's constructor.

    new:
      api_key: ABC123
      verbose: true

To ensure `new()` is called with no arguments, you still need to define `new`, thus:

    module: MyModule
    function: my_function

    new:

### `%cases`

An optional Perl static corpus, when the output is a simple string (expected => \[ args... \]).

Maps the expected output string to the input and `_STATUS`.

    cases:
      ok:
        input: ping
        _STATUS: OK
      error:
        input: ""
        _STATUS: DIES

### `$yaml_cases`

An optional path to a YAML file with the same shape as `%cases`.

### `$seed`

An optional integer.
When provided, the generated `t/fuzz.t` will call `srand($seed)` so fuzz runs are reproducible.
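
The effect of a fixed seed can be seen with core Perl alone. This sketch does not use the generator; the helper `draw` is hypothetical and simply shows that reseeding with the same value repeats the sequence:

```perl
use strict;
use warnings;

# Draw $count pseudo-random integers after seeding the generator.
sub draw {
    my ($seed, $count) = @_;
    srand($seed);
    return map { int(rand(1000)) } 1 .. $count;
}

my @first  = draw(42, 5);
my @second = draw(42, 5);

# The same seed yields the same sequence within one perl build.
print "reproducible\n" if "@first" eq "@second";
```

The actual numbers can differ between perl versions and platforms, so a seed is only a reproducibility aid on the same setup.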

### `$iterations`

An optional integer controlling how many fuzz iterations to perform (default 30).

### `%edge_cases`

An optional hash mapping of extra values to inject.

        # Two named parameters
        edge_cases:
                name: [ '', 'a' x 1024, \"\x{263A}" ]
                age: [ -1, 0, 99999999 ]

        # Takes a string input
        edge_cases: [ 'foo', 'bar' ]

Values can be strings or numbers; strings will be properly quoted.
Note that this only works with routines that take named parameters.

### `%type_edge_cases`

An optional hash mapping types to arrayrefs of extra values to try for any field of that type:

        type_edge_cases:
                string: [ '', ' ', "\t", "\n", "\0", 'long' x 1024, chr(0x1F600) ]
                number: [ 0, 1.0, -1.0, 1e308, -1e308, 1e-308, -1e-308, 'NaN', 'Infinity' ]
                integer: [ 0, 1, -1, 2**31-1, -(2**31), 2**63-1, -(2**63) ]

### `%edge_case_array`

Specify edge case values for routines that accept a single unnamed parameter.
This is specifically designed for simple functions that take one argument without a parameter name.
These edge cases supplement the normal random string generation, ensuring specific problematic values are always tested.
During fuzzing iterations, there's a 40% probability that a test case will use a value from edge\_case\_array instead of randomly generated data.

    ---
    module: Text::Processor
    function: sanitize

    input:
      type: string
      min: 1
      max: 1000

    edge_case_array:
      - "<script>alert('xss')</script>"
      - "'; DROP TABLE users; --"
      - "\0null\0byte"
      - "emoji😊test"
      - ""
      - " "

    seed: 42
    iterations: 30
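
The 40% selection rule can be sketched as follows. This is an assumption about how such a choice might be made, not the generated harness itself; `next_input` and the random-string fallback are hypothetical:

```perl
use strict;
use warnings;

my @edge_case_array = ('', ' ', "'; DROP TABLE users; --");

# Sketch: with probability 0.4 return a listed edge case, otherwise a
# randomly generated lowercase string of 1 to 20 characters.
sub next_input {
    return $edge_case_array[ int(rand(@edge_case_array)) ] if rand() < 0.4;
    my $length = 1 + int(rand(20));
    return join '', map { chr(ord('a') + int(rand(26))) } 1 .. $length;
}

srand(42);    # fixed seed so the run is repeatable
my %is_edge = map { $_ => 1 } @edge_case_array;
my $hits    = grep { $is_edge{ next_input() } } 1 .. 1000;
printf "%d of 1000 inputs came from edge_case_array\n", $hits;
```

Over 1000 draws the count should land near 400, since the random fallback never produces one of the listed values.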

### Semantic Data Generators

For property-based testing with [Test::LectroTest](https://metacpan.org/pod/Test%3A%3ALectroTest),
you can use semantic generators to create realistic test data.

`unix_timestamp` is currently fully supported;
fuzz testing support for the other `semantic` entries is still being developed.

    input:
      email:
        type: string
        semantic: email

      user_id:
        type: string
        semantic: uuid

      phone:
        type: string
        semantic: phone_us

#### Available Semantic Types

- `email` - Valid email addresses (user@domain.tld)
- `url` - HTTP/HTTPS URLs
- `uuid` - UUIDv4 identifiers
- `phone_us` - US phone numbers (XXX-XXX-XXXX)
- `phone_e164` - International E.164 format (+XXXXXXXXXXXX)
- `ipv4` - IPv4 addresses (0.0.0.0 - 255.255.255.255)
- `ipv6` - IPv6 addresses
- `username` - Alphanumeric usernames with \_ and -
- `slug` - URL slugs (lowercase-with-hyphens)
- `hex_color` - Hex color codes (#RRGGBB)
- `iso_date` - ISO 8601 dates (YYYY-MM-DD)
- `iso_datetime` - ISO 8601 datetimes (YYYY-MM-DDTHH:MM:SSZ)
- `semver` - Semantic version strings (major.minor.patch)
- `jwt` - JWT-like tokens (base64url format)
- `json` - Simple JSON objects
- `base64` - Base64-encoded strings
- `md5` - MD5 hashes (32 hex chars)
- `sha256` - SHA-256 hashes (64 hex chars)
- `unix_timestamp` - Unix timestamps (seconds since the epoch)

## EDGE CASE GENERATION

In addition to purely random fuzz cases, the harness generates
deterministic edge cases for parameters that declare `min`, `max` or `len` in their schema definitions.

For each constraint, three edge cases are added:

- Just inside the allowable range

    This case should succeed, since it lies strictly within the bounds.

- Exactly on the boundary

    This case should succeed, since it meets the constraint exactly.

- Just outside the boundary
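
For a numeric `min` or `max`, the three cases can be sketched like this. The helper `edge_cases_for` is hypothetical, a step of 1 is assumed, and treating the outside value as an expected failure is an assumption of this sketch:

```perl
use strict;
use warnings;

# Sketch for a numeric "min"/"max" constraint: return the values just
# inside, exactly on, and just outside the boundary.
sub edge_cases_for {
    my ($kind, $bound) = @_;
    my $step = $kind eq 'min' ? 1 : -1;    # "inside" is above a min, below a max
    return (
        { value => $bound + $step, expect => 'ok' },     # just inside
        { value => $bound,         expect => 'ok' },     # on the boundary
        { value => $bound - $step, expect => 'die' },    # just outside (assumed to fail)
    );
}

for my $case (edge_cases_for(min => 5), edge_cases_for(max => 10)) {
    print "$case->{value} => $case->{expect}\n";
}
```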


### Basic Property-Based Transform Example

Here's a complete example testing the `abs` builtin function:

**t/conf/abs.yml**:

    ---
    module: builtin
    function: abs

    config:
      test_undef: no
      test_empty: no
      test_nuls: no
      properties:
        enable: true
        trials: 1000

    input:
      number:
        type: number
        position: 0

    output:
      type: number
      min: 0

    transforms:
      positive:
        input:
          number:
            type: number
            min: 0
        output:
          type: number
          min: 0

      negative:
        input:
          number:
            type: number
            max: 0
        output:
          type: number
          min: 0

This configuration:

- Enables property-based testing with 1000 trials per property
- Defines two transforms: one for non-negative numbers (`min: 0`), one for non-positive numbers (`max: 0`)
- Automatically generates properties that verify `abs()` always returns non-negative numbers

Generate the test:

    fuzz-harness-generator t/conf/abs.yml > t/abs_property.t

The generated test will include:

- Traditional edge-case tests for boundary conditions
- Random fuzzing with 30 iterations (or as configured)
- Property-based tests that verify the transforms with 1000 trials each

### What Properties Are Tested?

The generator automatically detects and tests these properties based on your transform specifications:

- **Range constraints** - If output has `min` or `max`, verifies results stay within bounds
- **Type preservation** - Ensures numeric inputs produce numeric outputs
- **Definedness** - Verifies the function doesn't return `undef` unexpectedly
- **Specific values** - If output specifies a `value`, checks exact equality

For the `abs` example above, the generated properties verify:

    # For the "positive" transform:
    - Given a positive number, abs() returns >= 0
    - The result is a valid number
    - The result is defined

    # For the "negative" transform:
    - Given a negative number, abs() returns >= 0
    - The result is a valid number
    - The result is defined
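
To make those checks concrete, here is a hand-rolled stand-in written with core Perl only. It is not the code the generator emits (that relies on Test::LectroTest); `check_abs_property` and its input ranges are assumptions made for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Run $trials random inputs drawn from [$lo, $hi) and return the first
# counterexample found, or undef if every trial satisfies the property.
sub check_abs_property {
    my ($lo, $hi, $trials) = @_;
    for (1 .. $trials) {
        my $x      = $lo + rand($hi - $lo);
        my $result = abs($x);
        return $x
            unless defined($result)            # definedness
            && looks_like_number($result)      # type preservation
            && $result >= 0;                   # range constraint (min: 0)
    }
    return;
}

srand(1);
my $positive_failure = check_abs_property(0,    1e6, 1000);
my $negative_failure = check_abs_property(-1e6, 0,   1000);
print "both properties hold\n"
    unless defined($positive_failure) || defined($negative_failure);
```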

### Advanced Example: String Normalization

Here's a more complex example testing a string normalization function:

**t/conf/normalize.yml**:

    ---
    module: Text::Processor
    function: normalize_whitespace

    config:
      properties:
        enable: true
        trials: 500

    input:
      text:
        type: string
        min: 0
        max: 1000
        position: 0

    output:
      type: string
      min: 0
      max: 1000

    transforms:
      empty_preserved:
        input:
          text:
            type: string
            value: ""
        output:
          type: string
          value: ""

      single_space:
        input:
          text:
            type: string
            min: 1
            matches: '^\S+(\s+\S+)*$'
        output:
          type: string
          matches: '^\S+( \S+)*$'

      length_bounded:
        input:
          text:
            type: string
            min: 1
            max: 100
        output:
          type: string
          min: 1
          max: 100

This tests that the normalization function:

- Preserves empty strings (`empty_preserved` transform)
- Collapses multiple spaces into single spaces (`single_space` transform)
- Maintains length constraints (`length_bounded` transform)
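
The three transforms can be tried by hand against a stand-in implementation. `Text::Processor` is only an example name in this document, so the `normalize_whitespace` below is a hypothetical version written for this sketch:

```perl
use strict;
use warnings;

# Hypothetical stand-in for Text::Processor::normalize_whitespace: trim
# both ends, then collapse each run of whitespace to a single space.
sub normalize_whitespace {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    $text =~ s/\s+/ /g;
    return $text;
}

# One hand-written check per transform in the YAML above.
my %checks = (
    empty_preserved => normalize_whitespace('') eq '',
    single_space    => normalize_whitespace("a  b\t\tc") =~ /^\S+( \S+)*$/ ? 1 : 0,
    length_bounded  => do {
        my $len = length(normalize_whitespace('x' x 100));
        $len >= 1 && $len <= 100;
    },
);
printf "%-16s %s\n", $_, $checks{$_} ? 'ok' : 'not ok' for sort keys %checks;
```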

### Interpreting Property Test Results

When property-based tests run, you'll see output like:

    ok 123 - negative property holds (1000 trials)
    ok 124 - positive property holds (1000 trials)

If a property fails, Test::LectroTest will attempt to find the minimal failing
case and display it:

    not ok 123 - positive property holds (47 trials)
    # Property failed
    # Reason: counterexample found

This helps you quickly identify edge cases that your function doesn't handle correctly.

### Configuration Options for Property-Based Testing

In the `config` section:

    config:
      properties:
        enable: true     # Enable property-based testing (default: false)
        trials: 1000     # Number of test cases per property (default: 1000)

You can also disable traditional fuzzing and only use property-based tests:

    config:
      properties:
        enable: true
        trials: 5000

    iterations: 0  # Disable random fuzzing, use only property tests

### When to Use Property-Based Testing

Property-based testing with transforms is particularly useful for:

- Mathematical functions (`abs`, `sqrt`, `min`, `max`, etc.)
- Data transformations (encoding, normalization, sanitization)
- Parsers and formatters
- Functions with clear input-output relationships
- Code that should satisfy mathematical properties (commutativity, associativity, idempotence)

### Requirements

Property-based testing requires [Test::LectroTest](https://metacpan.org/pod/Test%3A%3ALectroTest) to be installed:

    cpanm Test::LectroTest

If not installed, the generated tests will automatically skip the property-based
portion with a message.
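
A common Perl idiom for this kind of optional dependency is sketched below. It is not taken from the generated test; `have_module` is a hypothetical helper:

```perl
use strict;
use warnings;

# Returns 1 if the named module can be loaded, 0 otherwise.
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

if (have_module('Test::LectroTest')) {
    print "running property-based tests\n";
} else {
    print "# SKIP Test::LectroTest not installed\n";
}
```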

### Testing Email Validation

    ---
    module: Email::Valid
    function: rfc822

    config:
      properties:
        enable: true
        trials: 200
      close_stdin: true
      test_undef: no
      test_empty: no
      test_nuls: no

    input:
      email:
        type: string
        semantic: email
        position: 0

    output:
      type: boolean

    transforms:
      valid_emails:
        input:
          email:
            type: string
            semantic: email
        output:
          type: boolean

This generates 200 realistic email addresses for testing, rather than random strings.

### Combining Semantic with Regex

You can combine semantic generators with regex validation:

    input:

- Idempotence: f(f(x)) == f(x)
- Commutativity: f(x, y) == f(y, x)
- Associativity: f(f(x, y), z) == f(x, f(y, z))
- Inverse relationships: decode(encode(x)) == x
- Domain-specific invariants: Custom business logic

Define your own properties with custom Perl code:

    transforms:
      normalize:
        input:
          text:
            type: string
        output:
          type: string
        properties:
          - name: single_spaces
            description: "No multiple consecutive spaces"
            code: $result !~ /  /

          - name: no_leading_space
            description: "No space at start"
            code: $result !~ /^\s/

          - name: reversible
            description: "Can be reversed back"
            code: length($result) == length($text)

The code has access to:

- `$result` - The function's return value
- Input variables - All input parameters (e.g., `$text`, `$number`)
- The function itself - Can call it again for idempotence checks
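
The idea of a property as a predicate over `$result` and the inputs can be sketched with plain code references. How the generator actually evaluates a `code:` entry is not shown here, and the `normalize` function and `%properties` table are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical function under test: collapse runs of spaces.
sub normalize {
    my ($text) = @_;
    $text =~ s/ {2,}/ /g;
    return $text;
}

# Each property is a predicate over the result and the original input,
# mirroring the variables a `code:` entry can see.
my %properties = (
    single_spaces => sub { my ($result, $text) = @_; $result !~ /  / },
    idempotent    => sub { my ($result, $text) = @_; normalize($result) eq $result },
);

my $text   = 'a   b  c';
my $result = normalize($text);
for my $name (sort keys %properties) {
    printf "%s: %s\n", $name, $properties{$name}->($result, $text) ? 'holds' : 'FAILS';
}
```

The `idempotent` entry calls the function a second time, which is the f(f(x)) == f(x) check listed earlier.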

#### Combining Auto-detected and Custom Properties

The generator automatically detects properties from your output spec, and adds
your custom properties:

    transforms:
      sanitize:
        input:
          html:
            type: string
        output:
          type: string
          min: 0              # Auto-detects: defined, min_length >= 0
          max: 10000
        properties:           # Additional custom checks:
          - name: no_scripts
            code: $result !~ /<script/i
          - name: no_iframes
            code: $result !~ /<iframe/i

## GENERATED OUTPUT

The generated test:

- Seeds the RNG (if configured) for reproducible fuzz runs
- Uses edge cases (per-field and per-type) with configurable probability
- Runs `$iterations` fuzz cases plus appended edge-case runs
- Validates inputs with Params::Get / Params::Validate::Strict
- Validates outputs with [Return::Set](https://metacpan.org/pod/Return%3A%3ASet)
- Runs static `is(... )` corpus tests from Perl and/or YAML corpus
- Runs [Test::LectroTest](https://metacpan.org/pod/Test%3A%3ALectroTest) tests

# METHODS

## generate

Takes a schema file and produces a test file (or STDOUT).

    # Modern named API
    App::Test::Generator->generate(
        schema_file => 'schemas/foo.yml',
        output_file => 'test/foo.t',
    );

    # Legacy positional API
    App::Test::Generator->generate($schema_file, $test_file);

### API Specification

#### Input

    {
        schema_file => { type => 'string', optional => 1 },
        input_file  => { type => 'string', optional => 1 },
        output_file => { type => 'string', optional => 1 },
        schema      => { type => 'hashref', optional => 1 },
        quiet       => { type => 'boolean', optional => 1 },
    }

#### Output

    { type => 'string' }

## render\_fallback

Render any Perl value into a compact Perl source-code string using
[Data::Dumper](https://metacpan.org/pod/Data%3A%3ADumper). Used as a catch-all when no more specific renderer
applies.

    my $code = render_fallback({ key => 'value' });
    # returns: "{'key' => 'value'}"

### Arguments

- `$v`

    Any Perl value, including undef, scalars, refs, and blessed objects.

### Returns

A string of Perl source code that reproduces the value when evaluated.
Returns the string `'undef'` when `$v` is undef.
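
A minimal sketch of this behaviour follows, under a different name so it is not mistaken for the module's code. It scopes the Data::Dumper settings with `local`, which is one common approach, and the `Sortkeys` setting is an extra assumption of this sketch:

```perl
use strict;
use warnings;
use Data::Dumper ();

# Sketch only: render a value as compact, single-line Perl source.
sub render_fallback_sketch {
    my ($v) = @_;
    return 'undef' unless defined $v;
    # local() confines the settings to this call and restores them on return.
    local $Data::Dumper::Terse    = 1;    # no "$VAR1 = " prefix
    local $Data::Dumper::Indent   = 0;    # single-line output
    local $Data::Dumper::Sortkeys = 1;    # assumption: stable key order
    return Data::Dumper::Dumper($v);
}

my $code = render_fallback_sketch({ key => 'value' });
print "$code\n";    # prints {'key' => 'value'}
```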

### Side effects

Temporarily sets `$Data::Dumper::Terse` and `$Data::Dumper::Indent`
to produce compact single-line output. Both are restored on return via


- `$href`

    A hashref whose values are arrayrefs. Keys whose values are not
    arrayrefs are silently skipped.

### Returns

A comma-separated string of `'key' => [ val, ... ]` entries, one per
qualifying key, sorted alphabetically. Returns the string `'()'` if
`$href` is undef, empty, or not a hashref; this produces an empty
hash assignment in the generated test rather than a syntax error.

### Side effects

None.

### Notes

Array element values are rendered via `perl_quote` which handles
scalars, arrayrefs, and Regexp objects. Non-arrayref values are
skipped without warning — this is intentional since callers may pass
mixed-value hashes and only want the arrayref entries rendered.
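
The documented behaviour can be approximated as below. The name `render_arrayref_map_sketch` is hypothetical, and element quoting is simplified to single-quoted strings rather than the full `perl_quote` handling:

```perl
use strict;
use warnings;

# Sketch: render a hash of arrayrefs as "'key' => [ ... ]" entries,
# sorted by key, skipping any value that is not an arrayref.
sub render_arrayref_map_sketch {
    my ($href) = @_;
    return '()' unless ref($href) eq 'HASH' && %{$href};
    my @entries;
    for my $key (sort keys %{$href}) {
        my $vals = $href->{$key};
        next unless ref($vals) eq 'ARRAY';    # silently skip non-arrayrefs
        my @quoted = map { (my $s = $_) =~ s/([\\'])/\\$1/g; "'$s'" } @{$vals};
        push @entries, "'$key' => [ " . join(', ', @quoted) . ' ]';
    }
    return join(', ', @entries);
}

print render_arrayref_map_sketch(
    { name => [ 'a', 'b' ], skipped => 'scalar', age => [0] }
), "\n";
# prints 'age' => [ '0' ], 'name' => [ 'a', 'b' ]
```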

### API specification

#### input

    { href => { type => HASHREF, optional => 1 } }

#### output

    { type => SCALAR }

## perl\_quote

Convert any Perl value into a source-code fragment that reproduces that value
when evaluated in a generated test file.

### Arguments

- `$v`

    Any Perl value. May be undef, a scalar, an arrayref, a Regexp, or a blessed
    object. All types are handled — undef becomes `'undef'`, numbers are
    unquoted, strings are single-quoted, arrayrefs recurse, Regexps become
    `qr{...}`, and anything else falls through to `render_fallback`.
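
The dispatch described above can be sketched as follows. This is an approximation under a different name, not the module's implementation: the number test is a simplified regex, and the Regexp branch assumes the pattern has balanced braces:

```perl
use strict;
use warnings;
use Data::Dumper ();

# Sketch of the type dispatch, returning Perl source for a value.
sub perl_quote_sketch {
    my ($v) = @_;
    return 'undef' unless defined $v;
    if (my $type = ref $v) {
        return '[' . join(', ', map { perl_quote_sketch($_) } @{$v}) . ']'
            if $type eq 'ARRAY';
        return "qr{$v}" if $type eq 'Regexp';    # assumes balanced braces
        local $Data::Dumper::Terse  = 1;
        local $Data::Dumper::Indent = 0;
        return Data::Dumper::Dumper($v);         # catch-all for other refs
    }
    # Plain decimal numbers stay unquoted (simplified test for "number").
    return $v if $v =~ /\A-?(?:0|[1-9]\d*)(?:\.\d+)?\z/;
    (my $s = $v) =~ s/([\\'])/\\$1/g;            # escape \ and ' for single quotes
    return "'$s'";
}

my $source = perl_quote_sketch([ 1, "it's", undef, -2.5 ]);
print "$source\n";    # prints [1, 'it\'s', undef, -2.5]
```

Evaluating the returned string with `eval` gives back an equivalent structure, which is the property the generated test files depend on.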

### API specification

#### input

    { v => { type => 'any', optional => 1 } }

#### output

    { type => 'string' }

# NOTES

`seed` and `iterations` really should be within `config`.

# SEE ALSO

- [Test Coverage Report](https://nigelhorne.github.io/App-Test-Generator/coverage/)
- [App::Test::Generator::Template](https://metacpan.org/pod/App%3A%3ATest%3A%3AGenerator%3A%3ATemplate) - Template of the file of tests created by `App::Test::Generator`
- [App::Test::Generator::SchemaExtractor](https://metacpan.org/pod/App%3A%3ATest%3A%3AGenerator%3A%3ASchemaExtractor) - Create schemas from Perl programs
- [Params::Validate::Strict](https://metacpan.org/pod/Params%3A%3AValidate%3A%3AStrict): Schema Definition
- [Params::Get](https://metacpan.org/pod/Params%3A%3AGet): Input validation
- [Return::Set](https://metacpan.org/pod/Return%3A%3ASet): Output validation
- [Test::LectroTest](https://metacpan.org/pod/Test%3A%3ALectroTest)
- [Test::Most](https://metacpan.org/pod/Test%3A%3AMost)
- [YAML::XS](https://metacpan.org/pod/YAML%3A%3AXS)

# AUTHOR

Nigel Horne, `<njh at nigelhorne.com>`

Portions of this module's initial design and documentation were created with the
assistance of AI.

# SUPPORT

This module is provided as-is without any warranty.

You can find documentation for this module with the perldoc command.

    perldoc App::Test::Generator

You can also look for information at:

- MetaCPAN

    [https://metacpan.org/release/App-Test-Generator](https://metacpan.org/release/App-Test-Generator)

- GitHub

    [https://github.com/nigelhorne/App-Test-Generator](https://github.com/nigelhorne/App-Test-Generator)

- CPANTS

    [http://cpants.cpanauthors.org/dist/App-Test-Generator](http://cpants.cpanauthors.org/dist/App-Test-Generator)

- CPAN Testers' Matrix

    [http://matrix.cpantesters.org/?dist=App-Test-Generator](http://matrix.cpantesters.org/?dist=App-Test-Generator)

- CPAN Testers Dependencies

    [http://deps.cpantesters.org/?module=App::Test::Generator](http://deps.cpantesters.org/?module=App::Test::Generator)

# LICENCE AND COPYRIGHT

Copyright 2025-2026 Nigel Horne.

Usage is subject to the terms of GPL2.
If you use it,
please let me know.


