App-Test-Generator


    max: 254
    min: 5
    matches: /^[^@]+@[^@]+\.[^@]+$/
    optional: 0
    type: string
method: validate_email
notes: []
```

### 3. Edit schemas as needed

Low-confidence schemas may need manual correction:

```yaml
---
confidence: low
input:
  thing:
    optional: 0
    type: string  # <- You might need to fix this
method: mysterious_method
notes:
  - 'thing: type unknown - please review'
```
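
When a run produces many schemas, it can help to list the ones flagged for review before editing them. The following is a minimal sketch using only core Perl. The helper name `low_confidence_files` is hypothetical and not part of the distribution, and it assumes each schema file ends in `.yaml` and carries an unindented `confidence:` line as in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the paths of schema files in $dir whose
# top-level "confidence" key is "low". It deliberately avoids a YAML
# parser and only inspects unindented "confidence:" lines.
sub low_confidence_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open directory $dir: $!";
    my @files = sort grep { /\.yaml\z/ } readdir $dh;
    closedir $dh;

    my @needs_review;
    for my $name (@files) {
        my $path = "$dir/$name";
        open my $fh, '<', $path or die "Cannot open $path: $!";
        while (my $line = <$fh>) {
            if ($line =~ /^confidence:\s*low\b/) {
                push @needs_review, $path;
                last;
            }
        }
        close $fh;
    }
    return @needs_review;
}

# Print the files needing review when a directory is given on the command line.
if (@ARGV) {
    print "$_\n" for low_confidence_files($ARGV[0]);
}
```

A line-based match is enough here because only one key is needed; a full YAML parser would be the safer choice for anything that edits the files.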

### 4. Use with Test Generator

```bash
fuzz-harness-generator -r schemas/validate_email.yaml
```

## Usage Examples

### Basic Usage

```bash
# Extract schemas with default settings
extract-schemas lib/MyModule.pm

# Specify output directory
extract-schemas --output-dir my_schemas lib/MyModule.pm

# Verbose mode (shows analysis details)
extract-schemas --verbose lib/MyModule.pm
```

### Run the Demo

Test the extractor with the included demo:

```bash
perl demo_extractor.pl
```

This creates a sample module, extracts schemas, and validates the results.

## How It Works

### 1. POD Analysis

The extractor looks for parameter documentation in POD:

```perl
=head2 validate_email($email)

=head3 INPUT

  $email - string (5-254 chars), required, email address

Returns: 1 if valid

=cut
```

Extracts:
- Type: `string`
- Min: `5`
- Max: `254`
- Optional: `false` (inferred from "required")
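
To make the extraction step concrete, here is a minimal sketch of how a parameter line in this format could be parsed with a regular expression. It is not the extractor's actual implementation: the function name `parse_pod_param` is hypothetical, only the `$name - type (min-max ...)` shape shown above is handled, and treating a parameter as required unless the line says "optional" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical parser for one POD parameter line of the form
#   $name - type (min-max chars), description
# Returns a hashref of schema fields, or nothing if the line does not match.
sub parse_pod_param {
    my ($line) = @_;
    my ($name, $type, $min, $max) =
        $line =~ /^\s*\$(\w+)\s+-\s+(\w+)(?:\s*\((\d+)\s*-\s*(\d+)[^)]*\))?/
        or return;

    my %schema = (name => $name, type => $type);
    $schema{min} = $min if defined $min;
    $schema{max} = $max if defined $max;

    # Assumption: required unless the description mentions "optional".
    $schema{optional} = $line =~ /\boptional\b/i ? 1 : 0;
    return \%schema;
}

my $param = parse_pod_param('  $email - string (5-254 chars), email address');
printf "%s: type=%s min=%s max=%s optional=%s\n",
    @{$param}{qw(name type min max optional)};
# prints "email: type=string min=5 max=254 optional=0"
```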

### 2. Code Pattern Analysis

Detects validation patterns:

```perl
use Carp qw(croak);

sub validate_email {
    my ($self, $email) = @_;

    croak "Email required" unless defined $email;
    croak "Too short" unless length($email) >= 5;     # min: 5
    croak "Too long" unless length($email) <= 254;    # max: 254
    croak "Invalid" unless $email =~ /pattern/;       # matches: /pattern/

    return 1;
}
```
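
As an illustration of this kind of detection, the sketch below scans source text for the two length comparisons used in the example and reports the bounds it finds. It is a regex-based approximation under stated assumptions: the function name `length_bounds` is hypothetical, and the real extractor may analyse code differently and recognise more patterns.

```perl
use strict;
use warnings;

# Hypothetical helper: look for "length($param) >= N" and
# "length($param) <= N" in a sub's source text and return the bounds.
sub length_bounds {
    my ($source, $param) = @_;
    my %bounds;
    $bounds{min} = $1
        if $source =~ /length\s*\(\s*\$\Q$param\E\s*\)\s*>=\s*(\d+)/;
    $bounds{max} = $1
        if $source =~ /length\s*\(\s*\$\Q$param\E\s*\)\s*<=\s*(\d+)/;
    return \%bounds;
}

my $source = <<'END_SOURCE';
croak "Too short" unless length($email) >= 5;
croak "Too long" unless length($email) <= 254;
END_SOURCE

my $bounds = length_bounds($source, 'email');
print "min: $bounds->{min}, max: $bounds->{max}\n";
# prints "min: 5, max: 254"
```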

### 3. Type Inference

Infers types from usage:

```perl
if (ref($param) eq 'ARRAY') { ... }  # → type: arrayref
if (ref($param) eq 'HASH')  { ... }  # → type: hashref
if ($param =~ /regex/)      { ... }  # → type: string
if ($param > 5)             { ... }  # → type: number
```
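
The mapping above can be sketched as a small function that checks source text for each usage pattern in turn. This is only an illustration: `infer_type` is a hypothetical name, the order of the checks is an assumption, and the real extractor may weigh evidence differently.

```perl
use strict;
use warnings;

# Hypothetical helper: guess a parameter's type from how it is used in
# the source text. Returns nothing when no known pattern is found.
sub infer_type {
    my ($source, $param) = @_;
    my $var = qr/\$\Q$param\E\b/;

    return 'arrayref' if $source =~ /ref\s*\(\s*$var\s*\)\s*eq\s*['"]ARRAY['"]/;
    return 'hashref'  if $source =~ /ref\s*\(\s*$var\s*\)\s*eq\s*['"]HASH['"]/;
    return 'string'   if $source =~ /$var\s*=~/;                    # regex match
    return 'number'   if $source =~ /$var\s*(?:[<>]=?|==|!=)\s*\d/; # numeric comparison
    return;
}

print infer_type(q{if (ref($items) eq 'ARRAY') { ... }}, 'items'), "\n";
# prints "arrayref"
```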

### 4. Confidence Scoring

- **High**: Well-documented with POD and validation code
- **Medium**: Some information from code or partial POD
- **Low**: Minimal information, needs manual review
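
One way to read these levels is as a function of which sources yielded information. The sketch below is a deliberate simplification: the function name `confidence_level` and the exact rule (both sources for high, either one for medium) are assumptions, not the extractor's documented scoring.

```perl
use strict;
use warnings;

# Hypothetical scoring rule: takes flags saying whether POD and/or
# validation code provided information about a method.
sub confidence_level {
    my (%found) = @_;
    return 'high'   if $found{pod} && $found{code};
    return 'medium' if $found{pod} || $found{code};
    return 'low';
}

print confidence_level(pod => 1, code => 1), "\n";  # prints "high"
print confidence_level(pod => 0, code => 1), "\n";  # prints "medium"
print confidence_level(),                    "\n";  # prints "low"
```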

## Schema Format

Generated YAML schemas follow this structure:

```yaml
---
method: method_name
confidence: high|medium|low


