W4261 Introduction to Cryptography:
Fall 2016 Lecture Summaries
These are brief (possibly non-comprehensive) summaries of what was covered,
written shortly after each lecture. Readings refer by default to
chapters from the required textbook, though sometimes include pointers
to other texts found on the readings
webpage, or to materials written by us. The material in class does not
always correspond exactly to the material in the textbook: you are
responsible for the material taught in class.
In particular, the readings pointed to below often contain significantly
more details and proofs than were covered in class. Conversely, class
occasionally contains material that is not in the textbook (I've tried
to indicate when this is the case).
Please go over the previous class
material (using your notes and textbook as appropriate) before every
lecture.
- Lecture 1 (9/6) Introduction to modern cryptography
and the goals of this class. Secret Sharing: definition of
threshold t-out-of-n secret sharing (correctness and security).
Example construction of 2-out-of-2 secret sharing.
Reading: See
the following lecture notes on
secret sharing.
- Lecture 2 (9/8) Review of
secret sharing, and two definitions of security. 2-out-of-2
additive secret sharing. 2-out-of-n secret sharing from 2-out-of-2
secret sharing (proof by reduction, hybrid proof technique). We did
not quite finish the proof (will do next time).
Reading: See the
following lecture notes on secret
sharing.
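A minimal sketch of 2-out-of-2 additive secret sharing, assuming the secret lives in the integers modulo a fixed `MODULUS` (the modulus and the function names are illustrative choices, not taken from the lecture notes). Each share on its own is uniformly distributed, and the two together sum to the secret.

```python
import secrets

MODULUS = 2**32  # assumed message space Z_m; any m >= 2 would do

def share(secret):
    """Split `secret` into two shares that sum to it modulo MODULUS."""
    s1 = secrets.randbelow(MODULUS)   # uniform, independent of the secret
    s2 = (secret - s1) % MODULUS      # also uniform when viewed on its own
    return s1, s2

def reconstruct(s1, s2):
    """Both shares together determine the secret."""
    return (s1 + s2) % MODULUS

s1, s2 = share(123456)
recovered = reconstruct(s1, s2)
```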
- Lecture 3 by Luke (9/13) Review of
2-out-of-2 additive secret sharing. Finished proof of 2-out-of-n
secret sharing from 2-out-of-2 secret sharing (proof by reduction,
hybrid proof technique). Introduced some number theory needed for
Shamir's secret sharing (t-out-of-n secret sharing scheme). Saw
Shamir's secret sharing scheme, proved correctness (through the
polynomial interpolation theorem, presented without full proof) and
security (through the identical-distributions definition of perfect
security).
Reading: Slides for lecture can be found here. If interested in supplementary
material, Rosulek's textbook Chapter 3 covers Shamir's secret
sharing, while the book by Cramer, Damgård, Nielsen studies
secret sharing in depth (going much beyond what we saw in
class). HW1 out on 9/14.
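A small sketch of Shamir's t-out-of-n scheme over a prime field, with reconstruction by Lagrange interpolation at 0. The field size `P`, the evaluation points 1..n, and the function names are assumptions made for illustration.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime (assumed field size); must exceed n and the secret

def shamir_share(secret, t, n):
    """Return n points (i, f(i)) of a random degree-(t-1) polynomial with f(0) = secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):    # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc

    return [(i, f(i)) for i in range(1, n + 1)]

def shamir_reconstruct(points):
    """Lagrange-interpolate f(0) from t points with distinct x-coordinates."""
    secret = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = (num * (-xm)) % P
                den = (den * (xj - xm)) % P
        secret = (secret + yj * num * pow(den, -1, P)) % P
    return secret

shares = shamir_share(secret=42, t=3, n=5)
recovered = shamir_reconstruct(shares[:3])
```

Any 3 of the 5 shares reconstruct the secret; the modular inverse via `pow(den, -1, P)` needs Python 3.8 or later.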
- Lecture 4 (9/20)
Definition of private key encryption (syntax and correctness).
Kerckhoffs' principle. Overview of some historical ciphers (Atbash,
Caesar, Shift, substitution, Vigenère) and simple attacks on them
(exhaustive search, frequency analysis). Motivation and definition
of perfect secrecy: two equivalent definitions (we didn't prove
equivalence), capturing the intuition that the ciphertext alone
contains no information about the plaintext.
The shift and substitution ciphers are perfectly secret for
one-character messages, but not for messages of more than one
character. Discussed potential implications of removing 0 from the
key space. Defined the one-time pad encryption scheme.
Reading: Ch. 1 (general intro), 2.1,2.2
- Lecture 5 (9/22)
Proved that the one-time pad scheme satisfies perfect secrecy. Discussed
two inherent issues: First, we showed that one-time pad cannot be
used to send multiple messages securely; this problem is inherent to
any (stateless) perfectly secret scheme, but can be fixed by using a
stateful encryption algorithm (often not desirable). Second, we
proved that for any perfectly secret encryption, the key space must
be at least as large as the message space. Discussion and motivation for the
computational (rather than perfect) approach, protecting only
against "feasible" adversaries, with security except for a "tiny"
probability.
Reading: 2.2, 2.3, 3.1
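A short sketch of the one-time pad over byte strings, together with the reason it cannot be reused: XORing two ciphertexts under the same key cancels the key and leaves the XOR of the two plaintexts. The helper names and sample messages are illustrative assumptions.

```python
import secrets

def xor_bytes(a, b):
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

def otp_keygen(length):
    return secrets.token_bytes(length)   # uniform key, as long as the message

def otp_encrypt(key, message):
    return xor_bytes(key, message)

otp_decrypt = otp_encrypt                # XOR is its own inverse

m1, m2 = b"attack at dawn", b"retreat at six"
key = otp_keygen(len(m1))
c1, c2 = otp_encrypt(key, m1), otp_encrypt(key, m2)

# Key reuse: the eavesdropper learns m1 xor m2 without knowing the key.
leak = xor_bytes(c1, c2)
```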
- Lecture 6 (9/27)
Brief review of running time, polynomial time, ppt, and negligible
functions; in particular, a negligible function times a polynomial is
still negligible. Security parameter.
Definition of EAV security (indistinguishability for a single
message against an eavesdropper). Argued informally that if an
EAV-secure scheme existed with a smaller keyspace than message
space, we would have P ≠ NP; thus, proving security of such a scheme
is at least as hard as proving P ≠ NP. This is the case for most
other cryptographic primitives. In fact we don't even know how to
prove their existence if we assume P ≠ NP, but instead we prove
security under stronger assumptions (e.g., factoring is hard or other
mathematical assumptions) as we will see. Private key
encryption (both EAV security and stronger notions of security) can
be shown equivalent to various other primitives
(OWF,PRG,PRF,Signatures,MAC,..) some of which we will study (all of
them can be constructed from some specific computational hardness
assumptions, and all of them imply P ≠ NP).
Began motivating pseudorandom generators and how they would help
with EAV secure encryption, via a pseudorandom one-time pad.
Reading: 3.1, 3.2.1. Note: our definition of EAV security is
the textbook's definition 3.9 - we did not (yet) mention the
equivalent def 3.8
- Lecture 7 (9/29)
Pseudorandom generators (PRG) definition.
Showed that if you can test whether a string is in the image of G or
not, then you can distinguish the output of G from uniform. This
means that (1) if we required the PRG definition to hold for any
distinguisher (not just ppt) then the definition would not be
satisfiable (the attack can always be executed in exponential time),
(2) if G is such that this attack can be executed efficiently,
G is not a PRG, (3) The existence of PRGs implies P ≠ NP.
We then went through several candidate constructions of G
(e.g., by using an underlying given PRG), and discussed/proved whether or
not they can be a PRG.
Reading: 3.3.1, 3.3.2.
- Lecture 8 (10/4)
Proved that if G is a PRG then G(G(·)) is also a PRG (proof
via a single hybrid distribution between the two distributions that we
need to prove indistinguishable). Discussion of how to extend this to
many applications: if you apply G only on the prefix of length n, any
polynomial number of applications results in a PRG; if you apply G
on the whole output, any number of applications is fine as long as the
output remains of polynomial size (the proof is via a hybrid argument, which
we didn't fully present but it's similar to the mini-hybrid argument
that we saw for two applications of G). We concluded that if there
is a PRG with any expansion (even one bit expansion), then there's a
PRG with any polynomial size expansion. We then saw an example candidate
construction of a PRG based on a number theoretic assumption: if p
is a prime, g is an element of Zp*, and some conditions on p and g
hold, then G(a,b)=(g^a,g^b,g^{ab}) (everything
mod p) is believed to be pseudorandom (and is certainly efficient,
deterministic, and length expanding). The assumption that this output
is pseudorandom is called the DDH
assumption, which we will revisit later. We then revisited the use
of PRGs for encryption, and proved that if PRGs exist, then there's an
EAV secure encryption scheme (using a pseudorandom pad).
Reading: 3.3, 7.4.2. The PRG based on DDH is not in the
textbook, although the assumption itself (more formally and for
general groups) is definition 8.63 in the textbook.
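A sketch of the two ideas in this lecture: stretching a PRG by re-applying it to the n-byte prefix of its output, and using the stretched output as a pseudorandom pad. The function `G` below is a heuristic stand-in built from SHA-256 (an assumption made only so the code runs; it is not a proven PRG and not the DDH-based candidate), and all names are illustrative.

```python
import hashlib
import secrets

N = 32  # seed length in bytes (assumed)

def G(seed):
    """Stand-in length-doubling 'PRG': N bytes -> 2N bytes (heuristic, unproven)."""
    assert len(seed) == N
    return hashlib.sha256(b"\x00" + seed).digest() + hashlib.sha256(b"\x01" + seed).digest()

def stretch(seed, out_len):
    """Re-apply G to the N-byte prefix; output the remaining N bytes each time."""
    state, out = seed, b""
    while len(out) < out_len:
        block = G(state)
        state, out = block[:N], out + block[N:]
    return out[:out_len]

def encrypt(key, message):
    """Pseudorandom one-time pad: XOR the message with the stretched key."""
    pad = stretch(key, len(message))
    return bytes(p ^ m for p, m in zip(pad, message))

decrypt = encrypt  # XOR with the same pad undoes the encryption

key = secrets.token_bytes(N)
msg = b"a message longer than the thirty-two byte key, hidden by a pseudorandom pad"
ct = encrypt(key, msg)
```

Like the scheme in the lecture, this is deterministic, so the same key must not be used for two messages.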
- Lecture 9 (10/6)
After recalling the EAV-secure scheme based on PRG that we saw last
time, we showed that it is not EAV secure for multiple messages. In
fact, we proved that there is no (stateless) deterministic encryption
scheme that satisfies multiple message (or even two message) EAV
security (as long as the message space contains at least two different
messages of the same size).
We discussed the stateful use of PRG for encrypting multiple messages
via stream ciphers, and then moved on to stateless encryption (our
default), which must be randomized.
We then discussed stronger notions of security, beyond the
"ciphertext only" attack that EAV security protects against:
known-plaintext attack, chosen plaintext attack (CPA), chosen ciphertext
attack (CCA) (and there are more we did not mention). We defined
formally CPA security (also known as IND-CPA), and claimed (omitting
proof) that CPA security for one message implies CPA security for
multiple messages.
We then showed that if we could have a secret key encoding a random
function from n bits to n bits, we could achieve CPA security by using
the value of the function on a random point as a one-time pad. This
is not possible to do efficiently (it would require exponential size
key), but this motivates pseudorandom functions (PRFs).
Reading: 3.4, 3.5.1
- Lecture 10 (10/11)
Defined PRFs, gave example of a function that is not a PRF,
discussed difference between PRF and PRG. We then proved that PRFs
imply PRGs, and that PRGs imply PRFs with a small (logarithmic) input
size, both of these via easy/direct constructions (we proved the
first, left the second proof as an exercise). We then showed
the GGM tree-based construction of PRF (with polynomial input size)
from PRG, but did not prove it.
We discussed why PRFs cannot be used directly as an encryption
algorithm (not CPA secure, and not clear how to decrypt). We then
defined pseudorandom permutations (PRP) and strong PRP (also called
"block ciphers"), which solve the second issue, as they can be
inverted (but they are still not CPA secure, so block ciphers can't be
directly used as encryption algorithms).
Reading: 3.5.1, 7.5
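A sketch of the GGM tree-based construction: the key sits at the root, a length-doubling PRG produces the two children of each node, and the input bits select a root-to-leaf path. The PRG `G` here is a SHA-256-based stand-in (an assumption for runnability, not a proven PRG), and the input is given as a string of '0'/'1' characters for readability.

```python
import hashlib

N = 32  # node/key length in bytes (assumed)

def G(seed):
    """Stand-in length-doubling PRG, returned as (left half, right half)."""
    left = hashlib.sha256(b"\x00" + seed).digest()
    right = hashlib.sha256(b"\x01" + seed).digest()
    return left, right

def ggm_prf(key, x_bits):
    """Walk the GGM tree from the root: go left on '0' and right on '1'."""
    node = key
    for bit in x_bits:
        assert bit in "01"
        left, right = G(node)
        node = left if bit == "0" else right
    return node

key = bytes(N)               # all-zero demo key; a real key is uniformly random
y1 = ggm_prf(key, "0110")
y2 = ggm_prf(key, "0111")    # sibling leaf: shares all but the last step
```

Each evaluation costs one PRG call per input bit, which is why the input length can be polynomial even though the tree has exponentially many leaves.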
- Lecture 11 (10/13)
Proved that PRF implies CPA-secure encryption (we used an encryption
scheme that outputs (r, Fk(r) xor m) for a random r).
This works for messages whose length is the block size of the PRF.
We started discussing encryption of longer messages, by parsing them
into blocks (with appropriate padding so that the length is a multiple
of the block size). Block-by-block encryption remains CPA secure
(we didn't formally prove this), but the ciphertext is long. Discussed ECB
mode of encryption (not even EAV secure, do not use!).
Reading: 3.5.2, 3.6.2
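A sketch of the scheme Enc_k(m) = (r, F_k(r) xor m) for single-block messages. HMAC-SHA256 is used as a stand-in for the PRF F (an assumption for illustration; the lecture's construction works with any PRF), and the block size of 32 bytes follows from that choice.

```python
import hashlib
import hmac
import secrets

BLOCK = 32  # block size in bytes, matching the SHA-256 output length

def F(key, x):
    """Stand-in PRF: HMAC-SHA256 (assumed to behave like a PRF)."""
    return hmac.new(key, x, hashlib.sha256).digest()

def enc(key, m):
    assert len(m) == BLOCK
    r = secrets.token_bytes(BLOCK)        # fresh randomness for every encryption
    pad = F(key, r)
    return r, bytes(a ^ b for a, b in zip(pad, m))

def dec(key, ct):
    r, c = ct
    pad = F(key, r)                       # recompute the pad from r
    return bytes(a ^ b for a, b in zip(pad, c))

key = secrets.token_bytes(BLOCK)
m = b"one block".ljust(BLOCK, b".")
ct1, ct2 = enc(key, m), enc(key, m)       # same message, different ciphertexts
```

Encrypting the same message twice gives different ciphertexts (except with negligible probability), which is exactly what a deterministic scheme cannot provide.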
- Lecture 12 (10/18)
Modes of encryption: discussed general considerations (security,
length of ciphertext, whether or not you need to be able to compute
the inverse of the block-cipher/PRF, whether you can do encryption
or decryption in parallel for each block at once, and others). Recalled
block-by-block and ECB from last time, and showed several others:
OFB, CBC, Counter mode. These three are CPA secure (we argued
informally why) but we also discussed a stateful slight variant of
CBC that is not CPA secure (used to attack SSL 3.0/TLS 1.0).
We then switched to constructions of strong PRP from PRF:
Defined Feistel networks, and proved that this gives a permutation
for any round function chosen, and any number of rounds. Showed one
round Feistel is not a PRP, regardless of the round function.
Claimed that 2 round Feistel is also never a PRP - finding the
attack was left as an exercise. Luby Rackoff Theorem: If we
instantiate Feistel network with round functions that are PRF with
independent keys for each round, then after 3 rounds we get a PRP
but not a strong PRP, and after 4 rounds we get a strong PRP.
(We did not prove this theorem).
Reading: 3.6.2, 7.6
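A sketch of a Feistel network showing why it is a permutation for any round function and any number of rounds: each round (L, R) -> (R, L xor f(R)) can be undone without inverting f. The round function below (truncated SHA-256 of key and input) and the 16-byte half-block size are arbitrary illustrative choices.

```python
import hashlib

HALF = 16  # bytes per half block (assumed)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(round_key, x):
    """Arbitrary round function; it does not need to be invertible."""
    return hashlib.sha256(round_key + x).digest()[:HALF]

def feistel_encrypt(round_keys, block):
    left, right = block[:HALF], block[HALF:]
    for k in round_keys:
        left, right = right, xor(left, round_fn(k, right))
    return left + right

def feistel_decrypt(round_keys, block):
    left, right = block[:HALF], block[HALF:]
    for k in reversed(round_keys):
        # Undo one round: the old right half is the current left half.
        left, right = xor(right, round_fn(k, left)), left
    return left + right

keys = [b"k1", b"k2", b"k3", b"k4"]
block = bytes(range(2 * HALF))
roundtrip = feistel_decrypt(keys, feistel_encrypt(keys, block))
```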
- Lecture 13 (10/20)
Showed the attack (distinguisher) on 2 round Feistel (thus, it's not
a PRP). Next, we moved on to practical suggested instantiations for
PRP, also known as block ciphers. Some general discussion and
principles for block ciphers in practice (concrete security with
fixed key and block sizes, any attack faster than exhaustive search
constitutes a serious weakness, avalanche effect is necessary but far
from sufficient). Discussed DES, which follows the Feistel network
structure with 16 rounds, with a specific design of the round
function, which we showed and discussed. Note that this doesn't fall
under Luby-Rackoff theorem since the round function is not a PRF,
and the keys for each round are not independent (however, there are
more rounds). Discussed security of DES (remains without any
significant practical attacks, but key size and sometimes block size
are too small for secure use today). Started discussion of
increasing key length for a given block cipher: double encryption
(e.g., 2DES) can be attacked.
Reading: 6.2 preamble, 6.2.2, 6.2.3, 6.2.4
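A sketch of one standard distinguisher for 2-round Feistel, under the round convention (L, R) -> (R, L xor f(R)): querying (L1, R) and (L2, R) gives outputs whose left halves XOR to L1 xor L2, regardless of the round functions. The round function, sizes, and the use of fresh random outputs as a stand-in for a random permutation on distinct queries are assumptions for the demo; the attack presented in class may have been phrased differently.

```python
import hashlib
import secrets

HALF = 16  # bytes per half block (assumed)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(round_key, x):
    return hashlib.sha256(round_key + x).digest()[:HALF]

def two_round_feistel(k1, k2, block):
    left, right = block[:HALF], block[HALF:]
    for k in (k1, k2):
        left, right = right, xor(left, round_fn(k, right))
    return left + right

def looks_like_two_round_feistel(oracle):
    """Two queries with the same right half and different left halves."""
    r = bytes(HALF)
    l1, l2 = bytes(HALF), b"\x01" * HALF
    out1, out2 = oracle(l1 + r), oracle(l2 + r)
    # For 2-round Feistel the left output half equals L xor f1(R).
    return xor(out1[:HALF], out2[:HALF]) == xor(l1, l2)

k1, k2 = secrets.token_bytes(16), secrets.token_bytes(16)
feistel_oracle = lambda block: two_round_feistel(k1, k2, block)
random_oracle = lambda block: secrets.token_bytes(2 * HALF)  # stand-in for a random permutation

detected_feistel = looks_like_two_round_feistel(feistel_oracle)
detected_random = looks_like_two_round_feistel(random_oracle)
```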
- Lecture 14 (10/25)
Showed triple encryption with a block cipher, which increases the
security of the block cipher, effectively doubling the key size.
For DES, the resulting 3DES is a reasonable choice, although a key
size of 112 bits and a block size of 64 bits are starting to be a bit too
small. We then moved on to AES: high level overview of its
history, security (very strong according to all current
indications), and construction. Mentioned (without much detail)
the underlying idea of Substitution Permutation Networks
(different variations of it are used for the round functions in
both AES and DES), and the principles of confusion and diffusion.
Overview of how all the
primitives we've seen so far fit together.
CCA secure private key encryption: motivation (including mention
of padding oracle attacks) and definition
(mentioning CCA1 and CCA2 versions). Showed that the CPA secure
encryption we saw is not CCA secure. Briefly mentioned the
notions of malleability / non-malleability / homomorphic encryption.
Reading: 6.2.4,6.2.5, 3.7
- Lecture 15 (10/27)
Message authentication codes (MACs): discussed motivation and
goals (authenticity and integrity). Encryption does not generally
provide authentication (we showed this for all the
encryption schemes we have seen so far). Definition of security for
MAC (existential unforgeability against adaptive chosen message
attacks),
including variations of fixed-length MAC and strong MAC
(where any valid new pair (m,t) is considered a forgery, even if
only t is new and m was already asked before). Noted that if the
MAC algorithm is deterministic, then there's a unique tag for each
message, and thus standard and strong MAC security definitions are
equivalent. Construction of fixed-length MAC by using a PRF, and
sketch of security proof.
Reading: 4.1, 4.2, 4.3.1
- Midterm Review by Ghada (11/1)
- Midterm (11/3) In-class, open-book open-notes midterm.
- Lecture 16 (11/10) Domain extension for MAC
(using fixed length MAC to build arbitrary length MAC):
authenticating block-by-block, where each block includes a random
identifier, the length of the message, and the index of the block, is
secure (proof sketched at a high level). Showed attacks on block-by-block
authentication if any of these is removed. CBC-MAC (arbitrary
length MAC using PRF): showed construction, noted main differences
with CBC encryption (no IV and no intermediate values), and stated
(without proof) security for fixed length messages (only). Two
ways to make it secure for arbitrary length messages: either
prepend the length of the message as the first block, or apply one
more layer of PRF at the end with a fresh key. (Note: in Lecture
19 we will see hash-based MAC as another alternative).
Reading: 4.3, 4.4
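A sketch of basic CBC-MAC and of why it is only secure for fixed-length messages: given the tag t of a one-block message m, the two-block message m || (m xor t) has the same tag t. Truncated HMAC-SHA256 stands in for the PRF (an assumption for illustration), and the length-prepending variant at the end is one of the two fixes mentioned in the lecture.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes (assumed)

def F(key, x):
    """Stand-in PRF on 16-byte blocks: truncated HMAC-SHA256 (an assumption)."""
    return hmac.new(key, x, hashlib.sha256).digest()[:BLOCK]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(key, message):
    """Basic CBC-MAC: no IV, and only the final chaining value is output."""
    assert message and len(message) % BLOCK == 0
    t = bytes(BLOCK)
    for i in range(0, len(message), BLOCK):
        t = F(key, xor(t, message[i:i + BLOCK]))
    return t

def cbc_mac_with_length(key, message):
    """Fix for arbitrary lengths: prepend the message length as the first block."""
    return cbc_mac(key, len(message).to_bytes(BLOCK, "big") + message)

key = secrets.token_bytes(16)
m = b"one-block msg".ljust(BLOCK, b"!")
t = cbc_mac(key, m)

# Forgery without the key: second block m xor t feeds m back into F.
forged = m + xor(m, t)
forgery_works = cbc_mac(key, forged) == t
```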
- Lecture 17 (11/15)
Authenticated Encryption: discussion and definition -- CCA security
and (tailored) unforgeability. Achieving authenticated encryption
from underlying schemes of CPA secure encryption and secure MAC:
as candidates, we specified algorithms for
Encrypt-and-Authenticate, Encrypt-then-Authenticate,
Authenticate-then-Encrypt and began their security analysis
(proofs were sketched, sometimes in more detail and sometimes
less, but not formally written).
Encrypt-and-Authenticate: satisfies unforgeability, but not even
CPA secure (do not use). Encrypt-then-Authenticate: saw that if
the MAC is not strong, CCA security could be broken (if given
(c,t) it is easy to come up with a valid (c,t') for another t', one can
feed that to the decryption oracle and break CCA).
However, with a strong MAC, this is CCA secure and unforgeable
(authenticated encryption).
Reading: 4.5.1, 4.5.2
- Lecture 18 (11/17)
Completed encrypt-then-authenticate security proof sketch (assuming
strong - e.g., deterministic - MAC).
Note that even if we don't care about authentication, this is a way to
boost encryption security from CPA to CCA.
Showed that not every CCA secure encryption is also authenticated
encryption (demonstrated via the PRF on message padded with
randomness example from HW3, 3b).
Also note that independent keys for the underlying encryption and
MAC algorithms are crucial for the security (example via encryption
with a PRP as above, and authentication via the inverse of the
PRP). This is a general principle when using different
components.
We then discussed padding oracle attacks (a special case of CCA2
attack, using just the "valid/invalid ciphertext" answer from the
decryption oracle, which has been launched in practice many times).
We gave details about one such attack (on CBC mode, where a valid
padding consists of N bytes of value N - the popular standard PKCS#5).
This attack also applies to Authenticate-then-Encrypt which is thus
not CCA secure (although it is CPA secure and unforgeable):
the attack applies directly if invalid ciphertexts result in
different error messages depending on whether the padding was
wrong, or the padding is right but MAC verification failed. Even
if both these generate the same error message, it's still
vulnerable to this attack if we take timing into account
(verification failure error will take longer to output).
As a rule of thumb, if your adversary may be active, all your
algorithms should start with verifying a MAC on the received
message, before doing any other cryptographic operations.
Collision resistant hash functions: motivation and
definition. Discussed asymptotic definition with a keyed hash
function (note that key is known -- no secrets), vs practical
unkeyed hash functions, and why proofs of security that yield a
collision are meaningful even with unkeyed functions.
Mentioned weaker possible definitions: target collision resistance
and one-wayness.
Reading: 4.5.1, 4.5.2, 4.5.4, 3.7.2 (several other variations on
padding oracle attacks (e.g., POODLE) were launched in practice, if
you want to read about them - not part of class material), 5.1.
- Lecture 19 (11/22)
Generic attacks on CRHF: brute force (applies even to target
collision resistance), and birthday attack which takes time and space
about square root of the brute force (improved versions with
efficient space and with time-space tradeoffs exist too, but we did
not describe them). This means that we need a longer output (for
security against 2^n adversary, we need to choose output length at
least 2n). Merkle-Damgard transform for CRHF domain extension
(sketched proof of security). Overview of practical constructions
of CRHF: they are
typically based on designing a fixed-length CRHF (could be seen as
relying on a specially designed underlying block-cipher), and then using
Merkle-Damgard or a different way for domain extension.
MD5 has collisions (including meaningful ones) found - do not use.
SHA1: vulnerabilities known, but no explicit collision found
yet. SHA2 family seems ok. Winner of SHA3 competition by NIST,
Keccak, was recently standardized. Sample of applications of CRHF
and hash functions more generally. For MAC: Hash-and-MAC (domain
extension for MAC), and HMAC overview. Random Oracle Model (ROM)
overview. Equality checking (fingerprinting), which has many
applications (storing passwords, files, etc). Brief mention of
Merkle trees (see textbook for more detail on them).
Reading: 5.1, 5.2, 5.3, 5.4.1, 5.5, 5.6.1, 5.6 (we only
covered some of it, and quickly, but you may be interested to read
all of it), 6.3
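A sketch of the birthday attack on a deliberately weakened hash: SHA-256 truncated to 24 bits, so that a collision is expected after roughly 2^12 evaluations rather than 2^24. The truncation length and the use of counters as inputs are assumptions chosen to keep the demo fast; this simple version stores every digest it has seen, so it uses as much space as time.

```python
import hashlib

def h(data, out_bytes=3):
    """Toy hash: SHA-256 truncated to out_bytes bytes (24 bits by default)."""
    return hashlib.sha256(data).digest()[:out_bytes]

def birthday_collision(out_bytes=3):
    """Hash distinct inputs until two of them share a digest."""
    seen = {}                       # digest -> input that produced it
    counter = 0
    while True:
        msg = counter.to_bytes(8, "big")
        digest = h(msg, out_bytes)
        if digest in seen:
            return seen[digest], msg, counter + 1
        seen[digest] = msg
        counter += 1

x1, x2, evaluations = birthday_collision()
```

The loop always terminates, since by the pigeonhole principle a collision must occur within 2^24 + 1 distinct inputs.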
- Lecture 20 (11/29)
Quick review of number theory (the finite group Zn* with
respect to modular multiplication, with identity 1, is of size φ(n); for any x in the
group, x^φ(n)=1 so exponents can be reduced modulo the group
order; special case of prime p: Zp* is cyclic, and in poly time we
can generate a random n-bit prime p together with a generator g of
Zp*; etc.)
Discussed the following assumptions over cyclic groups: Discrete
Log assumption (DLA), computational Diffie-Hellman
assumption (CDH), and decisional Diffie-Hellman assumption (DDH).
In any group, if DDH holds then CDH holds, and if CDH then DLA
holds. The converse is not true in general, and even in specific
useful groups where DDH is believed to hold, we don't have a proof
that it's equivalent to DLA (thus, DLA is the weakest/best
assumption of the three).
Diffie-Hellman key exchange, and discussion of its security (the
key is indistinguishable from a random group element if the DDH
assumption holds).
DDH assumption does not hold in groups of composite order, so we
discussed how to choose a group of prime order: start with Zp*
where p is a safe prime (p=2q+1 where q is a prime), then use its
subgroup of prime order q (the subgroup generated by g^2, where g is
a generator of Zp*; it is
also the group of quadratic residues QRp). For this group, DDH is
believed to hold.
Reading: For number theory background see the relevant parts
of
Angluin's notes and of
appendix B. 8.1.1--8.1.4, 8.3.1--8.3.3, 10.3
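A toy sketch of Diffie-Hellman key exchange in the prime-order subgroup described above, using the safe prime p = 23 = 2·11 + 1. The parameters are far too small for any security and are chosen only so the arithmetic can be followed by hand; variable names are illustrative.

```python
import secrets

p, q = 23, 11        # toy safe prime p = 2q + 1
g = 5                # a generator of Z_23* (order 22)
h = pow(g, 2, p)     # 25 mod 23 = 2; generates the order-q subgroup QR_p

def dh_keypair():
    x = secrets.randbelow(q)       # secret exponent in Z_q
    return x, pow(h, x, p)         # (secret, public message h^x)

a, A = dh_keypair()   # Alice sends A = h^a
b, B = dh_keypair()   # Bob sends B = h^b

k_alice = pow(B, a, p)   # (h^b)^a
k_bob = pow(A, b, p)     # (h^a)^b
```

Both parties end up with h^(ab), a group element; turning it into a bit string requires a further key derivation step.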
- Lecture 21 (12/1)
Quick number theoretic facts (most without proof) for Zp* (p prime)
with a generator g:
It is easy to check whether an element in the group has a square
root (raise to (p-1)/2 and check whether or not it is 1), and if
so the element is called a quadratic residue, and it is easy to
compute its square root.
The subgroup of quadratic residues QRp is generated by g^2 and its
order (size) is half of the size of Zp*, namely (p-1)/2 (if p
happens to be a safe prime, then this group is of prime order).
We showed why DDH does not hold over the group Zp* with the
generator g: g^xy is distinguishable from a random group element
g^z (where x,y,z chosen at random), because g^xy is more likely to
be a quadratic residue than g^z. However, if p is a safe prime
and we work in the subgroup QRp, every element is a quadratic
residue, so this attack no longer applies, and DDH is believed to be
true (and thus we can use
DH key exchange).
Briefly discussed how to go from a random group element (the
key selected in the DH protocol) to a
random bit string (needed if the key is to be used for symmetric
key crypto): use a key derivation function (can be done eg using a
theoretical tool called strong extractor, or using a hash function
in the random oracle model).
Discussed vulnerability of DH key exchange to active attacks such
as the "man in the middle" attack, and mentioned that for an
authenticated channel we need to handle key distribution (e.g., with a
centralized entity like CAs).
Introduced public key encryption (PKE), and defined security via
indistinguishability. Noted that for PKE, EAV security and CPA
security are equivalent definitions (but not CCA). Showed that key
exchange in two rounds (single message from each party) is
equivalent to PKE (although we did not formally define KE). As a
special case of this observation, using
DH key exchange we can get a PKE scheme -- this is the El Gamal
PKE which we described, and is secure under the DDH assumption.
Reading: same as last lecture, as well as 11.1, 11.2, 11.4.1. We
touched upon topics discussed in chapter 10, 11.3, and 13.4.1, but
left most of it uncovered.
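A toy sketch of El Gamal encryption in the order-q subgroup of quadratic residues mod a safe prime: the public key is g^x, and a ciphertext for a group element m is (g^y, pk^y · m). The parameters p = 23, q = 11, g = 2 are insecure toy values, and the function names are illustrative assumptions.

```python
import secrets

p, q, g = 23, 11, 2   # toy parameters: g generates the order-q subgroup QR_p of Z_p*

def keygen():
    x = secrets.randbelow(q)
    return x, pow(g, x, p)                 # (sk, pk = g^x)

def encrypt(pk, m):
    """m must be a group element, i.e. a quadratic residue mod p."""
    assert pow(m, q, p) == 1
    y = secrets.randbelow(q)               # fresh randomness per encryption
    return pow(g, y, p), (pow(pk, y, p) * m) % p

def decrypt(sk, ct):
    c1, c2 = ct
    shared = pow(c1, sk, p)                # g^(xy), the DH key
    return (c2 * pow(shared, -1, p)) % p   # divide the mask out

sk, pk = keygen()
m = pow(g, 7, p)                           # some element of the subgroup
ct = encrypt(pk, m)
```

The first ciphertext component plays the role of the sender's Diffie-Hellman message, and the shared DH key masks the plaintext.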
- Recitation by Edo (12/2) Focusing on number theory
background and examples.
Reading: Notes from the recitation can be found here. Other reading follows the readings posted for the relevant lectures.
- Lecture 22 (12/6)
Review of El-Gamal PKE.
Factoring assumption for a product of two large primes, and
discussion of its hardness (the best known algorithm
is superpolynomial but subexponential, which means key
lengths must be larger than the desired level of security).
RSA assumption with respect to GenRSA experiment to generate
(N,e,d) (some variations on how e is chosen in the experiment yield
different assumptions, all considered hard). If the factoring
assumption is false then the RSA assumption is false, but the other
direction is not known (thus, RSA is a stronger assumption).
Indeed, factoring is equivalent to finding φ(N) and equivalent
to finding the inverse of e, any of which allows one to take e-th
roots mod N and break the RSA assumption, but it's possible that there
are other ways to break RSA even if factoring is hard.
Textbook (plain) RSA and its insecurity as a PKE (e.g., it is deterministic;
there are other attacks as well).
The RSA function (raising to power e mod N) is a permutation over
ZN*. We discussed informally the concept of a trapdoor permutation
(which this is a special case of, assuming the RSA assumption
holds): a permutation that is easy to compute and hard to invert,
but easy to invert given a trapdoor. Any trapdoor permutation can
be used to get a secure PKE, via hard-core bit (we just mentioned
without details). Specifically for RSA permutation, it is known
that finding the lsb with probability non-negligibly better than
1/2 is equivalent to finding the entire preimage (breaking RSA)
with non-negligible probability. This yields a PKE scheme
for one bit messages, secure under the RSA assumption: the
encryption outputs (r^e, lsb(r) xor m).
More efficient is padded RSA (we can prove security when the
message is logarithmic, and insecurity when the pad is logarithmic;
we do not have a proof or an attack when the message and the pad
are the same length). This (and variations) has been used in
practice.
Reading: 8.2 (for the starred subsections we just mentioned the
takeaway), 11.5.1, 11.5.2, and touched on 13.1. If you are
interested (not required for class), you can check
here for recommended key
lengths.
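A toy sketch of the RSA permutation and of the one-bit scheme Enc(m) = (r^e, lsb(r) xor m). The modulus N = 61·53 = 3233 with e = 17 is an insecure toy key chosen for readability; function names are illustrative assumptions.

```python
import math
import secrets

P_, Q_ = 61, 53
N = P_ * Q_                    # 3233
PHI = (P_ - 1) * (Q_ - 1)      # 3120
e = 17
d = pow(e, -1, PHI)            # 2753, since 17 * 2753 = 15 * 3120 + 1

def rsa(x):
    return pow(x, e, N)        # the RSA permutation

def rsa_inv(y):
    return pow(y, d, N)        # inversion using the trapdoor d

# Textbook RSA is deterministic: equal plaintexts give equal ciphertexts.
c_a, c_b = rsa(65), rsa(65)

def enc_bit(m):
    """Encrypt one bit with the least significant bit of a random r as the mask."""
    while True:
        r = secrets.randbelow(N)
        if r > 0 and math.gcd(r, N) == 1:   # sample r from Z_N*
            break
    return rsa(r), (r & 1) ^ m

def dec_bit(ct):
    y, masked = ct
    r = rsa_inv(y)
    return (r & 1) ^ masked
```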
- Lecture 23 (12/8)
Review of RSA permutations and ways to add randomness to them,
towards getting a secure PKE based on the RSA assumption. In addition
to the ones from last lecture, we suggested a
hashing based scheme, where Enc(m)=(r^e, H(r) xor m) for a random
r. This is CPA secure in the random oracle model (ROM).
We talked about CCA security, showing CCA attacks on textbook RSA
and then on all the other PKE candidates we saw so
far. We mentioned OAEP, based on a
two-round Feistel structure with two different hash functions as
round functions. OAEP is provably CCA secure in the ROM (when the
two hash functions are modeled as independent random functions).
We also mentioned that there are CCA secure PKE schemes in the
standard model based on RSA assumption, as well as based on DDH
(Cramer-Shoup) and other number theoretic assumptions.
We noted that typically hybrid encryption is used, starting with
PKE to obtain a symmetric key, and then using private key
encryption. Briefly mentioned the notion of KEM/DEM for hybrid encryption, and mentioned
that the hashing based scheme mentioned above can be used as part
of a CCA secure KEM/DEM scheme in ROM.
Defined Signature schemes and their security. Contrasted
signatures with MACs (including public verifiability,
transferability and non-repudiation). Showed textbook RSA
signatures and their insecurity. Existing approaches to
constructing secure signatures: (1) heuristic constructions,
loosely based on number theoretic assumptions, (2) constructions
provably secure in ROM based on number theoretic assumptions, (3)
constructions provably secure in the standard model from (a) number
theoretic assumptions, (b) CRHF, (c) primitives like PRF/PRG etc
(captured by "one-way functions").
The latter also means that in this sense digital signatures belong
to the "private key world" (you don't need a trapdoor or a PKE to
obtain signatures). We gave one example of a construction of
signatures, falling under category (2): the RSA full domain hash
(RSA-FDH), provably secure in the ROM under the RSA assumption.
A note on signatures vs PKE: the notion that they are
"dual/opposite" and that "to sign, you decrypt, to verify you encrypt"
is wrong (often does not make
sense even syntactically, and when it does it's not secure;
e.g. this notion can be applied to textbook RSA encryption and
textbook RSA signatures, but that is not
secure for either signatures or encryption).
Wrapped up the class with an overview of some more advanced things
happening in cryptography.
Reading: We gave no proofs in this lecture, and stayed on a high
overview level. The textbook has many more details, which can be
found in 11.3, 11.5, 12.1-12.4
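A toy sketch contrasting textbook RSA signatures with RSA-FDH. One well-known attack on textbook RSA signatures needs no signing queries: pick any value as the "signature" and compute the message it signs by raising it to the power e. Hashing the message first removes the attacker's control over what gets signed. The key (N = 3233, e = 17, d = 2753) is an insecure toy key, and the hash `H` (SHA-256 reduced mod N) is only a stand-in for a full-domain hash, assumed for illustration.

```python
import hashlib

N, e, d = 3233, 17, 2753    # toy RSA key: N = 61 * 53, e * d = 1 mod 3120

def textbook_sign(m):
    return pow(m, d, N)

def textbook_verify(m, sig):
    return pow(sig, e, N) == m % N

# No-message forgery: choose the signature first, derive the message from it.
forged_sig = 1234
forged_msg = pow(forged_sig, e, N)

def H(message):
    """Toy stand-in for a full-domain hash into Z_N (an assumption)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def fdh_sign(message):
    return pow(H(message), d, N)

def fdh_verify(message, sig):
    return pow(sig, e, N) == H(message)

sig = fdh_sign(b"hello")
```

With FDH, the same trick only yields a signature on a hash value the attacker cannot invert to a message (in the random oracle model, with a realistically sized N).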