Crypt-PQClean-Sign

 view release on metacpan or  search on metacpan

pqclean/crypto_sign/falcon-1024/aarch64/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-1024/aarch64/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-1024/avx2/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-1024/avx2/fft.c  view on Meta::CPAN

    size_t t, n, hn, m;

    /*
     * First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
     * (because GM[1] = w^rev(1) = w^(N/2) = i).
     * In our chosen representation, this is a no-op: everything is
     * already where it should be.
     */

    /*
     * Subsequent iterations are truncated to use only the first
     * half of values.
     */
    n = (size_t)1 << logn;
    hn = n >> 1;
    t = hn;
    for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
        size_t ht, hm, i1, j1;

        ht = t >> 1;
        hm = m >> 1;

pqclean/crypto_sign/falcon-1024/avx2/fft.c  view on Meta::CPAN

     *               f[j] = x + y
     *               f[j + t] = s * (x - y)
     *       t = dt
     *   for i1 = 0; i1 < N; i1 ++:
     *       f[i1] = f[i1] / N
     *
     * iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
     * (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
     *
     * In the main loop (not counting the final division loop), in
     * all iterations except the last, the first and second half of f[]
     * (as an array of complex numbers) are separate. In our chosen
     * representation, we do not keep the second half.
     *
     * The last iteration recombines the recomputed half with the
     * implicit half, and should yield only real numbers since the
     * target polynomial is real; moreover, s = i at that step.
     * Thus, when considering x and y:
     *    y = conj(x) since the final f[j] must be real
     *    Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
     *    filled with 2*Im(x).

pqclean/crypto_sign/falcon-1024/avx2/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-1024/clean/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-1024/clean/fft.c  view on Meta::CPAN

    size_t t, n, hn, m;

    /*
     * First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
     * (because GM[1] = w^rev(1) = w^(N/2) = i).
     * In our chosen representation, this is a no-op: everything is
     * already where it should be.
     */

    /*
     * Subsequent iterations are truncated to use only the first
     * half of values.
     */
    n = (size_t)1 << logn;
    hn = n >> 1;
    t = hn;
    for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
        size_t ht, hm, i1, j1;

        ht = t >> 1;
        hm = m >> 1;

pqclean/crypto_sign/falcon-1024/clean/fft.c  view on Meta::CPAN

     *               f[j] = x + y
     *               f[j + t] = s * (x - y)
     *       t = dt
     *   for i1 = 0; i1 < N; i1 ++:
     *       f[i1] = f[i1] / N
     *
     * iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
     * (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
     *
     * In the main loop (not counting the final division loop), in
     * all iterations except the last, the first and second half of f[]
     * (as an array of complex numbers) are separate. In our chosen
     * representation, we do not keep the second half.
     *
     * The last iteration recombines the recomputed half with the
     * implicit half, and should yield only real numbers since the
     * target polynomial is real; moreover, s = i at that step.
     * Thus, when considering x and y:
     *    y = conj(x) since the final f[j] must be real
     *    Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
     *    filled with 2*Im(x).

pqclean/crypto_sign/falcon-1024/clean/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-512/aarch64/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-512/aarch64/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-512/avx2/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-512/avx2/fft.c  view on Meta::CPAN

    size_t t, n, hn, m;

    /*
     * First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
     * (because GM[1] = w^rev(1) = w^(N/2) = i).
     * In our chosen representation, this is a no-op: everything is
     * already where it should be.
     */

    /*
     * Subsequent iterations are truncated to use only the first
     * half of values.
     */
    n = (size_t)1 << logn;
    hn = n >> 1;
    t = hn;
    for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
        size_t ht, hm, i1, j1;

        ht = t >> 1;
        hm = m >> 1;

pqclean/crypto_sign/falcon-512/avx2/fft.c  view on Meta::CPAN

     *               f[j] = x + y
     *               f[j + t] = s * (x - y)
     *       t = dt
     *   for i1 = 0; i1 < N; i1 ++:
     *       f[i1] = f[i1] / N
     *
     * iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
     * (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
     *
     * In the main loop (not counting the final division loop), in
     * all iterations except the last, the first and second half of f[]
     * (as an array of complex numbers) are separate. In our chosen
     * representation, we do not keep the second half.
     *
     * The last iteration recombines the recomputed half with the
     * implicit half, and should yield only real numbers since the
     * target polynomial is real; moreover, s = i at that step.
     * Thus, when considering x and y:
     *    y = conj(x) since the final f[j] must be real
     *    Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
     *    filled with 2*Im(x).

pqclean/crypto_sign/falcon-512/avx2/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-512/clean/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-512/clean/fft.c  view on Meta::CPAN

    size_t t, n, hn, m;

    /*
     * First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
     * (because GM[1] = w^rev(1) = w^(N/2) = i).
     * In our chosen representation, this is a no-op: everything is
     * already where it should be.
     */

    /*
     * Subsequent iterations are truncated to use only the first
     * half of values.
     */
    n = (size_t)1 << logn;
    hn = n >> 1;
    t = hn;
    for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
        size_t ht, hm, i1, j1;

        ht = t >> 1;
        hm = m >> 1;

pqclean/crypto_sign/falcon-512/clean/fft.c  view on Meta::CPAN

     *               f[j] = x + y
     *               f[j + t] = s * (x - y)
     *       t = dt
     *   for i1 = 0; i1 < N; i1 ++:
     *       f[i1] = f[i1] / N
     *
     * iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
     * (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
     *
     * In the main loop (not counting the final division loop), in
     * all iterations except the last, the first and second half of f[]
     * (as an array of complex numbers) are separate. In our chosen
     * representation, we do not keep the second half.
     *
     * The last iteration recombines the recomputed half with the
     * implicit half, and should yield only real numbers since the
     * target polynomial is real; moreover, s = i at that step.
     * Thus, when considering x and y:
     *    y = conj(x) since the final f[j] must be real
     *    Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
     *    filled with 2*Im(x).

pqclean/crypto_sign/falcon-512/clean/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-padded-1024/aarch64/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-padded-1024/aarch64/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-padded-1024/avx2/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-padded-1024/avx2/fft.c  view on Meta::CPAN

    size_t t, n, hn, m;

    /*
     * First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
     * (because GM[1] = w^rev(1) = w^(N/2) = i).
     * In our chosen representation, this is a no-op: everything is
     * already where it should be.
     */

    /*
     * Subsequent iterations are truncated to use only the first
     * half of values.
     */
    n = (size_t)1 << logn;
    hn = n >> 1;
    t = hn;
    for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
        size_t ht, hm, i1, j1;

        ht = t >> 1;
        hm = m >> 1;

pqclean/crypto_sign/falcon-padded-1024/avx2/fft.c  view on Meta::CPAN

     *               f[j] = x + y
     *               f[j + t] = s * (x - y)
     *       t = dt
     *   for i1 = 0; i1 < N; i1 ++:
     *       f[i1] = f[i1] / N
     *
     * iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
     * (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
     *
     * In the main loop (not counting the final division loop), in
     * all iterations except the last, the first and second half of f[]
     * (as an array of complex numbers) are separate. In our chosen
     * representation, we do not keep the second half.
     *
     * The last iteration recombines the recomputed half with the
     * implicit half, and should yield only real numbers since the
     * target polynomial is real; moreover, s = i at that step.
     * Thus, when considering x and y:
     *    y = conj(x) since the final f[j] must be real
     *    Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
     *    filled with 2*Im(x).

pqclean/crypto_sign/falcon-padded-1024/avx2/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-padded-1024/clean/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-padded-1024/clean/fft.c  view on Meta::CPAN

    size_t t, n, hn, m;

    /*
     * First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
     * (because GM[1] = w^rev(1) = w^(N/2) = i).
     * In our chosen representation, this is a no-op: everything is
     * already where it should be.
     */

    /*
     * Subsequent iterations are truncated to use only the first
     * half of values.
     */
    n = (size_t)1 << logn;
    hn = n >> 1;
    t = hn;
    for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
        size_t ht, hm, i1, j1;

        ht = t >> 1;
        hm = m >> 1;

pqclean/crypto_sign/falcon-padded-1024/clean/fft.c  view on Meta::CPAN

     *               f[j] = x + y
     *               f[j + t] = s * (x - y)
     *       t = dt
     *   for i1 = 0; i1 < N; i1 ++:
     *       f[i1] = f[i1] / N
     *
     * iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
     * (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
     *
     * In the main loop (not counting the final division loop), in
     * all iterations except the last, the first and second half of f[]
     * (as an array of complex numbers) are separate. In our chosen
     * representation, we do not keep the second half.
     *
     * The last iteration recombines the recomputed half with the
     * implicit half, and should yield only real numbers since the
     * target polynomial is real; moreover, s = i at that step.
     * Thus, when considering x and y:
     *    y = conj(x) since the final f[j] must be real
     *    Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
     *    filled with 2*Im(x).

pqclean/crypto_sign/falcon-padded-1024/clean/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-padded-512/aarch64/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-padded-512/aarch64/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-padded-512/avx2/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {

pqclean/crypto_sign/falcon-padded-512/avx2/fft.c  view on Meta::CPAN

    size_t t, n, hn, m;

    /*
     * First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
     * (because GM[1] = w^rev(1) = w^(N/2) = i).
     * In our chosen representation, this is a no-op: everything is
     * already where it should be.
     */

    /*
     * Subsequent iterations are truncated to use only the first
     * half of values.
     */
    n = (size_t)1 << logn;
    hn = n >> 1;
    t = hn;
    for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
        size_t ht, hm, i1, j1;

        ht = t >> 1;
        hm = m >> 1;

pqclean/crypto_sign/falcon-padded-512/avx2/fft.c  view on Meta::CPAN

     *               f[j] = x + y
     *               f[j + t] = s * (x - y)
     *       t = dt
     *   for i1 = 0; i1 < N; i1 ++:
     *       f[i1] = f[i1] / N
     *
     * iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
     * (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
     *
     * In the main loop (not counting the final division loop), in
     * all iterations except the last, the first and second half of f[]
     * (as an array of complex numbers) are separate. In our chosen
     * representation, we do not keep the second half.
     *
     * The last iteration recombines the recomputed half with the
     * implicit half, and should yield only real numbers since the
     * target polynomial is real; moreover, s = i at that step.
     * Thus, when considering x and y:
     *    y = conj(x) since the final f[j] must be real
     *    Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
     *    filled with 2*Im(x).

pqclean/crypto_sign/falcon-padded-512/avx2/keygen.c  view on Meta::CPAN

     * The values a and b start at x and y, respectively. Since x
     * and y are odd, their GCD is odd, and it is easily seen that
     * all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
     * GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
     * or b is reduced by at least one bit at each iteration, so
     * the algorithm necessarily converges on the case a = b, at
     * which point the common value is the GCD.
     *
     * In the algorithm expressed above, when a = b, the fourth case
     * applies, and sets b = 0. Since a contains the GCD of x and y,
     * which are both odd, a must be odd, and subsequent iterations
     * (if any) will simply divide b by 2 repeatedly, which has no
     * consequence. Thus, the algorithm can run for more iterations
     * than necessary; the final GCD will be in a, and the (u,v)
     * coefficients will be (u0,v0).
     *
     *
     * The presentation above is bit-by-bit. It can be sped up by
     * noticing that all decisions are taken based on the low bits
     * and high bits of a and b. We can extract the two top words
     * and low word of each of a and b, and compute reduction
     * parameters pa, pb, qa and qb such that the new values for
     * a and b are:

pqclean/crypto_sign/falcon-padded-512/clean/codec.c  view on Meta::CPAN

        /*
         * We pushed exactly 8 bits.
         */
        acc_len += 8;

        /*
         * Push as many zeros as necessary, then a one. Since the
         * absolute value is at most 2047, w can only range up to
         * 15 at this point, thus we will add at most 16 bits
         * here. With the 8 bits above and possibly up to 7 bits
         * from previous iterations, we may go up to 31 bits, which
         * will fit in the accumulator, which is an uint32_t.
         */
        acc <<= (w + 1);
        acc |= 1;
        acc_len += w + 1;

        /*
         * Produce all full bytes.
         */
        while (acc_len >= 8) {



( run in 1.082 second using v1.01-cache-2.11-cpan-96521ef73a4 )