view release on metacpan or search on metacpan
pqclean/crypto_sign/falcon-1024/aarch64/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-1024/aarch64/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-1024/avx2/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-1024/avx2/fft.c view on Meta::CPAN
size_t t, n, hn, m;
/*
* First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
* (because GM[1] = w^rev(1) = w^(N/2) = i).
* In our chosen representation, this is a no-op: everything is
* already where it should be.
*/
/*
* Subsequent iterations are truncated to use only the first
* half of values.
*/
n = (size_t)1 << logn;
hn = n >> 1;
t = hn;
for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
size_t ht, hm, i1, j1;
ht = t >> 1;
hm = m >> 1;
pqclean/crypto_sign/falcon-1024/avx2/fft.c view on Meta::CPAN
* f[j] = x + y
* f[j + t] = s * (x - y)
* t = dt
* for i1 = 0; i1 < N; i1 ++:
* f[i1] = f[i1] / N
*
* iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
* (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
*
* In the main loop (not counting the final division loop), in
* all iterations except the last, the first and second half of f[]
* (as an array of complex numbers) are separate. In our chosen
* representation, we do not keep the second half.
*
* The last iteration recombines the recomputed half with the
* implicit half, and should yield only real numbers since the
* target polynomial is real; moreover, s = i at that step.
* Thus, when considering x and y:
* y = conj(x) since the final f[j] must be real
* Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
* filled with 2*Im(x).
pqclean/crypto_sign/falcon-1024/avx2/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-1024/clean/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-1024/clean/fft.c view on Meta::CPAN
size_t t, n, hn, m;
/*
* First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
* (because GM[1] = w^rev(1) = w^(N/2) = i).
* In our chosen representation, this is a no-op: everything is
* already where it should be.
*/
/*
* Subsequent iterations are truncated to use only the first
* half of values.
*/
n = (size_t)1 << logn;
hn = n >> 1;
t = hn;
for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
size_t ht, hm, i1, j1;
ht = t >> 1;
hm = m >> 1;
pqclean/crypto_sign/falcon-1024/clean/fft.c view on Meta::CPAN
* f[j] = x + y
* f[j + t] = s * (x - y)
* t = dt
* for i1 = 0; i1 < N; i1 ++:
* f[i1] = f[i1] / N
*
* iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
* (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
*
* In the main loop (not counting the final division loop), in
* all iterations except the last, the first and second half of f[]
* (as an array of complex numbers) are separate. In our chosen
* representation, we do not keep the second half.
*
* The last iteration recombines the recomputed half with the
* implicit half, and should yield only real numbers since the
* target polynomial is real; moreover, s = i at that step.
* Thus, when considering x and y:
* y = conj(x) since the final f[j] must be real
* Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
* filled with 2*Im(x).
pqclean/crypto_sign/falcon-1024/clean/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-512/aarch64/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-512/aarch64/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-512/avx2/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-512/avx2/fft.c view on Meta::CPAN
size_t t, n, hn, m;
/*
* First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
* (because GM[1] = w^rev(1) = w^(N/2) = i).
* In our chosen representation, this is a no-op: everything is
* already where it should be.
*/
/*
* Subsequent iterations are truncated to use only the first
* half of values.
*/
n = (size_t)1 << logn;
hn = n >> 1;
t = hn;
for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
size_t ht, hm, i1, j1;
ht = t >> 1;
hm = m >> 1;
pqclean/crypto_sign/falcon-512/avx2/fft.c view on Meta::CPAN
* f[j] = x + y
* f[j + t] = s * (x - y)
* t = dt
* for i1 = 0; i1 < N; i1 ++:
* f[i1] = f[i1] / N
*
* iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
* (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
*
* In the main loop (not counting the final division loop), in
* all iterations except the last, the first and second half of f[]
* (as an array of complex numbers) are separate. In our chosen
* representation, we do not keep the second half.
*
* The last iteration recombines the recomputed half with the
* implicit half, and should yield only real numbers since the
* target polynomial is real; moreover, s = i at that step.
* Thus, when considering x and y:
* y = conj(x) since the final f[j] must be real
* Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
* filled with 2*Im(x).
pqclean/crypto_sign/falcon-512/avx2/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-512/clean/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-512/clean/fft.c view on Meta::CPAN
size_t t, n, hn, m;
/*
* First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
* (because GM[1] = w^rev(1) = w^(N/2) = i).
* In our chosen representation, this is a no-op: everything is
* already where it should be.
*/
/*
* Subsequent iterations are truncated to use only the first
* half of values.
*/
n = (size_t)1 << logn;
hn = n >> 1;
t = hn;
for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
size_t ht, hm, i1, j1;
ht = t >> 1;
hm = m >> 1;
pqclean/crypto_sign/falcon-512/clean/fft.c view on Meta::CPAN
* f[j] = x + y
* f[j + t] = s * (x - y)
* t = dt
* for i1 = 0; i1 < N; i1 ++:
* f[i1] = f[i1] / N
*
* iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
* (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
*
* In the main loop (not counting the final division loop), in
* all iterations except the last, the first and second half of f[]
* (as an array of complex numbers) are separate. In our chosen
* representation, we do not keep the second half.
*
* The last iteration recombines the recomputed half with the
* implicit half, and should yield only real numbers since the
* target polynomial is real; moreover, s = i at that step.
* Thus, when considering x and y:
* y = conj(x) since the final f[j] must be real
* Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
* filled with 2*Im(x).
pqclean/crypto_sign/falcon-512/clean/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-padded-1024/aarch64/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-padded-1024/aarch64/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-padded-1024/avx2/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-padded-1024/avx2/fft.c view on Meta::CPAN
size_t t, n, hn, m;
/*
* First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
* (because GM[1] = w^rev(1) = w^(N/2) = i).
* In our chosen representation, this is a no-op: everything is
* already where it should be.
*/
/*
* Subsequent iterations are truncated to use only the first
* half of values.
*/
n = (size_t)1 << logn;
hn = n >> 1;
t = hn;
for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
size_t ht, hm, i1, j1;
ht = t >> 1;
hm = m >> 1;
pqclean/crypto_sign/falcon-padded-1024/avx2/fft.c view on Meta::CPAN
* f[j] = x + y
* f[j + t] = s * (x - y)
* t = dt
* for i1 = 0; i1 < N; i1 ++:
* f[i1] = f[i1] / N
*
* iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
* (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
*
* In the main loop (not counting the final division loop), in
* all iterations except the last, the first and second half of f[]
* (as an array of complex numbers) are separate. In our chosen
* representation, we do not keep the second half.
*
* The last iteration recombines the recomputed half with the
* implicit half, and should yield only real numbers since the
* target polynomial is real; moreover, s = i at that step.
* Thus, when considering x and y:
* y = conj(x) since the final f[j] must be real
* Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
* filled with 2*Im(x).
pqclean/crypto_sign/falcon-padded-1024/avx2/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-padded-1024/clean/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-padded-1024/clean/fft.c view on Meta::CPAN
size_t t, n, hn, m;
/*
* First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
* (because GM[1] = w^rev(1) = w^(N/2) = i).
* In our chosen representation, this is a no-op: everything is
* already where it should be.
*/
/*
* Subsequent iterations are truncated to use only the first
* half of values.
*/
n = (size_t)1 << logn;
hn = n >> 1;
t = hn;
for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
size_t ht, hm, i1, j1;
ht = t >> 1;
hm = m >> 1;
pqclean/crypto_sign/falcon-padded-1024/clean/fft.c view on Meta::CPAN
* f[j] = x + y
* f[j + t] = s * (x - y)
* t = dt
* for i1 = 0; i1 < N; i1 ++:
* f[i1] = f[i1] / N
*
* iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
* (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
*
* In the main loop (not counting the final division loop), in
* all iterations except the last, the first and second half of f[]
* (as an array of complex numbers) are separate. In our chosen
* representation, we do not keep the second half.
*
* The last iteration recombines the recomputed half with the
* implicit half, and should yield only real numbers since the
* target polynomial is real; moreover, s = i at that step.
* Thus, when considering x and y:
* y = conj(x) since the final f[j] must be real
* Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
* filled with 2*Im(x).
pqclean/crypto_sign/falcon-padded-1024/clean/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-padded-512/aarch64/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-padded-512/aarch64/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-padded-512/avx2/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {
pqclean/crypto_sign/falcon-padded-512/avx2/fft.c view on Meta::CPAN
size_t t, n, hn, m;
/*
* First iteration: compute f[j] + i * f[j+N/2] for all j < N/2
* (because GM[1] = w^rev(1) = w^(N/2) = i).
* In our chosen representation, this is a no-op: everything is
* already where it should be.
*/
/*
* Subsequent iterations are truncated to use only the first
* half of values.
*/
n = (size_t)1 << logn;
hn = n >> 1;
t = hn;
for (u = 1, m = 2; u < logn; u ++, m <<= 1) {
size_t ht, hm, i1, j1;
ht = t >> 1;
hm = m >> 1;
pqclean/crypto_sign/falcon-padded-512/avx2/fft.c view on Meta::CPAN
* f[j] = x + y
* f[j + t] = s * (x - y)
* t = dt
* for i1 = 0; i1 < N; i1 ++:
* f[i1] = f[i1] / N
*
* iGM[k] contains (1/w)^rev(k) for primitive root w = exp(i*pi/N)
* (actually, iGM[k] = 1/GM[k] = conj(GM[k])).
*
* In the main loop (not counting the final division loop), in
* all iterations except the last, the first and second half of f[]
* (as an array of complex numbers) are separate. In our chosen
* representation, we do not keep the second half.
*
* The last iteration recombines the recomputed half with the
* implicit half, and should yield only real numbers since the
* target polynomial is real; moreover, s = i at that step.
* Thus, when considering x and y:
* y = conj(x) since the final f[j] must be real
* Therefore, f[j] is filled with 2*Re(x), and f[j + t] is
* filled with 2*Im(x).
pqclean/crypto_sign/falcon-padded-512/avx2/keygen.c view on Meta::CPAN
* The values a and b start at x and y, respectively. Since x
* and y are odd, their GCD is odd, and it is easily seen that
* all steps conserve the GCD (GCD(a-b,b) = GCD(a, b);
* GCD(a/2,b) = GCD(a,b) if GCD(a,b) is odd). Moreover, either a
* or b is reduced by at least one bit at each iteration, so
* the algorithm necessarily converges on the case a = b, at
* which point the common value is the GCD.
*
* In the algorithm expressed above, when a = b, the fourth case
* applies, and sets b = 0. Since a contains the GCD of x and y,
* which are both odd, a must be odd, and subsequent iterations
* (if any) will simply divide b by 2 repeatedly, which has no
* consequence. Thus, the algorithm can run for more iterations
* than necessary; the final GCD will be in a, and the (u,v)
* coefficients will be (u0,v0).
*
*
* The presentation above is bit-by-bit. It can be sped up by
* noticing that all decisions are taken based on the low bits
* and high bits of a and b. We can extract the two top words
* and low word of each of a and b, and compute reduction
* parameters pa, pb, qa and qb such that the new values for
* a and b are:
pqclean/crypto_sign/falcon-padded-512/clean/codec.c view on Meta::CPAN
/*
* We pushed exactly 8 bits.
*/
acc_len += 8;
/*
* Push as many zeros as necessary, then a one. Since the
* absolute value is at most 2047, w can only range up to
* 15 at this point, thus we will add at most 16 bits
* here. With the 8 bits above and possibly up to 7 bits
* from previous iterations, we may go up to 31 bits, which
* will fit in the accumulator, which is an uint32_t.
*/
acc <<= (w + 1);
acc |= 1;
acc_len += w + 1;
/*
* Produce all full bytes.
*/
while (acc_len >= 8) {