formula results from the CPAN

Crypt-HSXKPasswd
view release on metacpan or search on metacpan
    You can visualise this process as follows:

        correct horse batter staple
        correct HORSE battery staple
        25 correct HORSE battery staple 83
        25*correct*HORSE*battery*staple*83
        ++25*correct*HORSE*battery*staple*83++
    
    Each of these steps can be customised in the following ways:

    1   The number of words to be used, and the minimum and maximum lengths
        of the words can be configured.

    2   The case of the words can be modified in a number of ways, including
        randomly choosing the case for each word.

        It is also possible to specify so-called *133t-style* character
        substitutions, e.g. replacing all occurrences of the letter `e' with
        the digit `3', or all occurrences of the letter `s' with the symbol
        `$'.

    3   The number of digits to add as pseudo words to the front and back of
        the password can be configured. A length of zero can be specified
        for both to generate passwords without any randomly chosen digits.

    4   The separator character can be specified directly, or it can be
        randomly chosen from a list of symbols. It is also possible to
        specify that no separator should be used.

    5   The padding character can also be specified directly, or remotely
        chosen from a list of possible symbols. Padding can also be disabled
        completely. If padding is to be used it can be applied in two modes
        - fixed, and adaptive.

        With fixed padding a specified number of copies of the separator
        character are added to the front and back of the password. The fixed
        padding does not have to be symmetric.

        With adaptive padding the required number of copies of the separator
        character are added to the back of the password until it reaches a
        specified length.

  THE MATHS
    Before examining the password strength of passwords generated with this
    module we need to lay out the relatively simple maths underlying it all.

    Maths Primer
    A coin could be used as a very simple password generator. Each character
    in the password would be the result of a single coin toss. If the coin
    lands heads up, we add a `H' to our password, if it lands tails up, we
    add a `T'.

    If you made a one-letter password in this way there would only be two
    possibilities, `H', or `T', or two permutations. If you made a
    two-letter password in this way there would be four possible
    combinations, or permutations, `HH', `HT', `TH', and `TT'. If you made a
    three-character password in this way there would be 16 permutations, a
    five character one would have 32 permutations, and so forth.

    So, for a coin toss, which has two possible values for each character,
    the formula for the number of permutations `P' for a given length of
    password `L' is:

        P = 2^L

    Or, two to the power of the length of the password.

    If we now swapped our coin for a dice, we would go from two possible
    values per letter, to six possible values per letter. For one dice roll
    there would be six permutations, for two there would be 36, for three
    there would be 108 and so on.

    This means that for a dice, the number of permutations can be calculated
    with the formula:

        P = 6^L

    When talking about passwords, the set of possible symbols used for each
    character in the password is referred to as the password's *alphabet*.
    So, for the coin toss the alphabet was just `H' and `T', and for the
    dice it was `1', `2', `3', `4', `5', and `6'. The actual characters used
    in the alphabet make no difference to the strength of the password, all
    that matters is the size of the alphabet, which we'll call `A'.

    As you can probably infer from the two examples above, the formula for
    the number of possible permutations `P' for a password of length `L'
    created from an alphabet of size `A' is:

        P = A^L

    In the real world our passwords are generally made up of a mix of
    letters, digits, and symbols. If we use mixed case that gives us 52
    letters alone, then add in the ten digits from `0' to `9' and we're
    already up to 62 possible characters before we even start on the array
    of symbols and punctuation characters on our keyboards. It's generally
    accepted that if you include symbols and punctuation, there are 95
    characters available for use in randomly generated passwords. Hence, in
    the real-world, the value for `A' is assumed to be 95. When you start
    raising a number as big as 95 to even low powers the number of
    permutations quickly rises.

    A two character password with alphabet of 95 has 9025 permutations,
    increasing the length to three characters brings that up to 857,375, and
    so on. These numbers very quickly become too big to handle. For just an
    8 character password we are talking about 6,634,204,312,890,625
    permutations, which is a number so big most people couldn't say it (what
    do you call something a thousand times bigger than a trillion?).

    Because the numbers get so astronomically big so quickly, computer
    scientists use bits of entropy to measure password strength rather than
    the number of permutations. The formula to turn permutations into bits
    of entropy `E' is very simple:

        E = Log(2)P

    In other words, the entropy is the log to base two of the permutations.
    For our eight character example that equates to about 52 bits.

    There are two approaches to increasing the number of permutations, and
    hence the entropy, you can choose more characters, or, you can make the
    alphabet you are choosing from bigger.

    The Entropy of HSXKPasswd Passwords
    Exactly how much entropy does a password need? That's the subject of
    much debate, and the answer ultimately depends on the value of the
    assets being protected by the password.

    Two common recommendations you hear are 8 characters containing a mix of
    upper and lower case letters, digits, and symbols, or 12 characters with
    the same composition. These evaluation to approximately 52 bits of
    entropy and 78 bits of entropy respectively.

    When evaluating the entropy of passwords generated by this module, it
    has to be done from two points of view for the answer to be meaningful.
    Firstly, a best-case scenario - the attacker has absolutely no knowledge
    of how the password was generated, and hence must mount a brute-force
    attack. Then, secondly from the point of view of an attacker with full
    knowledge of how the password was generated. Not just the knowledge that
    this module was used, but a copy of the dictionary file used, and, a
    copy of the configuration settings used.

    For the purpose of this documentation, the entropy in the first
    scenario, the brute force attack, will be referred to as the blind
    entropy, and the entropy in the second scenario the seen entropy.

    The blind entropy is solely determined by the configuration settings,
    the seen entropy depends on both the settings and the dictionary file
    used.

    Calculating the bind entropy `Eb' is quite straightforward, we just need
    to know the size of the alphabet resulting from the configuration `A',
    and the minimum length of passwords generated with the configuration
    `L', and plug those values into this formula:

        Eb = Log(2)(A^L)

    Calculating `A' simply involves determining whether or not the
    configuration results in a mix of letter cases (26 or 52 characters),
    the inclusion of at least one symbol (if any one is present, assume the
    industry standard of a 33 character search space), and the inclusion of
    at least one digit (10 character). This will result in a value between
    26 and 95.

    Calculating `L' is also straightforward. The one minor complication is
    that some configurations result in a variable length password. In this
    case, assume the shortest possible length the configuration could
    produce.

    The example password from the PHILOSOPHY section
    (`!15.play.MAJOR.fresh.FLAT.23!') was generated using the preset
    `WEB32'. This preset uses four words of between four and five letters
    long, with the case of each word randomly set to all lower or all upper
    as the basis for the password, it then chooses two pairs of random
    digits as extra words to go front and back, before separating each word
    with a copy of a randomly chosen symbol, and padding the front and back
    of the password with a copy of a different randomly chosen symbol. This
    results in passwords that contain a mix of cases, digits, and symbols,
    and are between 27 and 31 characters long. If we add these values into
    the formula we find that the blind entropy for passwords created with
    this preset is:

        Eb = Log(2)(95^27) = 163 bits

    This is spectacularly secure! And, this is the most likely kind of
    attack for a password to face. However, to have confidence in the
    password we must also now calculate the entropy when the attacker knows
    everything about how the password was generated.

    We will calculate the entropy resulting from the same `WEB32' config
    being used to generate a password using the sample library file that
    ships with the module.

    The number of permutations the attacker needs to check is purely the
    product of possibly results for each random choice made during the
    assembly of the password.

    Lets start with the words that will form the core of the password. The
    configuration chooses four words of between four and five letters long
    from the dictionary, and then randomises their case, effectively making
    it a choice from twice as many words (each word in each case).

    The sample dictionary file contains 698 words of the configured length,
    which doubles to 1396. Choosing four words from that very large alphabet
    gives a starting point of `1396^4', or 3,797,883,801,856 permutations.

    Next we need to calculate the permutations for the separator character.
    The configuration specifies just nine permitted characters, and we
    choose just one, so that equates to 9 permutations.

    Similarly, the padding character on the end is chosen from 13 permitted
    symbols giving 13 more permutations.

    Finally, there are four randomly chosen digits, giving `10^4', or 10,000
    permutations.

    The total number of permutations is the product of all these
    permutations:

        Pseen = 3,797,883,801,856 * 9 * 13 * 10,000 = 2.77x10^17
    
    Finally, we convert this to entropy by taking the base 2 log:

        Eseen = Log(2)2.77x10^17 = ~57bits
    
    What this means is that most probably, passwords generated with this
    preset using the sample dictionary file are spectacularly more secure
    than even 12 randomly chosen characters, and, that in the very unlikely
    event that an attackers knows absolutely everything about how the
    password was generated, it is still significantly more secure than 8
    randomly chosen characters.

    Because the exact strength of the passwords produced by this module
    depend on the configuration and dictionary file used, the constructor
    does the above math when creating an HSXKPasswd object, and throws a
    warning if either the blind entropy falls below 78bits, or the seen
    entropy falls below 52 bits.

SUBROUTINES/METHODS
  MODULE CONFIGURATION
( run in 0.698 second using v1.01-cache-2.11-cpan-0bb4e1dffa6 )