辅导Cryptography程序、C/C++编程调试、c++设计程序讲解 解析Haskell程序|调试Web开发

- 首页 >> Java编程
PA 4 - Cryptography
In this programming assignment, which consists of two parts, you will build a few famous
encryption/decryption algorithms that can hide the content of messages, from a cipher used by
Julius Caesar to the famous Enigma from WWII. The goal is to practice the programming
constructs you are already familiar with, and add the concepts of arrays, strings and passing
array variables to functions.
Summary
For an introduction to cryptography, check out the Background section. This also contains a
detailed description of the three ciphers you need to implement: the substitution cipher, the
Caesar cipher and a version of the Enigma cipher. The details of the assignment will be
explained in the upcoming General Instructions.
Your deliverable consists of working versions of the functions listed below, which need to go in
the file ciphers.c. Note that we don’t provide you with exact function definitions, but a description
of what they should look like:
To submit the work, please follow the submission instructions for Part 1 and Part 2.
Part 1
index_of_char() arguments: input string, shift, character
returns: index
30 pt
char_at_index() arguments: input string, shift, index
returns: character
30 pt
cipher_substitution() arguments: input string, output string, key, encrypt flag 40 pt
Part 2
cipher_caesar() arguments: input string, output string, key, encrypt flag 40 pt
cipher_enigma() arguments: input string, output string, key, encrypt flag 60 pt
crack_caesar()
(Extra)
arguments: input string, output string, codeword string
returns: 1/0
25 pt
General Instructions
Before starting the assignment, it is important to read the background section on Cryptography
to become familiar with the basics of encryption/decryption and ciphers. The goal of this
assignment is to implement three famous ciphers. You will be writing a separate function for
each of them. The general format of these functions will be similar and has the following
arguments:
● An input message, basically a collection of symbols.
● An output message, also a collection of symbols.
● The key for the cipher. This will differ depending on the type of cipher.
● A flag (i.e., an input that can only have two possible values) that specifies whether the
relationship between the input and output message is an encryption or a decryption. So
for a particular cipher, you will only be writing one function, which will do an encrypt or a
decrypt, as specified by this flag.
For this entire PA, you may assume we will only test valid inputs. This means you do not need
to check this in your functions. The next section will talk more about the format of the input and
output messages.
Note that you are not allowed to use the library (or any other string library). The goal
of this PA is for you to learn how to work with strings directly. Also, throughout this PA, do not
use global variables.
Information about the Strings
For this assignment, we will limit ourselves to only 27 symbols for the messages: the 26 upper
case letters plus the blank space (the character ' '). So a possible message could be "THE DIE
HAS BEEN CAST", but not "You too Brutus?". Note that we do not allow any punctuation (other
than the space).
1
In terms of our C-programs, messages are implemented as strings. These strings are arrays of
characters, with the last one always the ‘\0’ termination character. In fact, we will be using
strings in two ways in this assignment:
1. The input and output messages will be strings. They are composed solely of the
allowed symbols (so the upper case letters plus the blank space). Their length is
variable, but we will assume fewer than 255 characters. As they also need to be able to
hold the final ‘\0’ termination character, you will need a character array of 256 elements.
The presence of the ‘\0’ character at the end is useful to determine the actual length of
the message.
1 This limitation is there to simplify the problem by restricting the possible character set. It also has
historical relevance; the Enigma, for example, only worked with upper case letters.
2. Some of the ciphers also require a code string. This holds all the distinct symbols that
can be encrypted, in some scrambled order. Since we limit ourselves to upper case
letters and the blank space as possible symbols, this means the code string holds 27
symbols plus the termination character. Note that for the Caesar cipher (as will be
explained further in the section on the Caesar Cipher), we do not encrypt the spaces; in
this case the code string only has to hold the 26 letters plus the termination character.
Important note: When you are creating output messages, you will be writing individual
characters to a char array. Make sure you explicitly add the ‘\0’ termination character as well.
Organizing your Code
For this assignment, your main() function will NOT be in the same file as the ciphers you are
implementing. Instead, all the functions you are asked to write, plus any helper functions you
may create, have to go in ciphers.c. Your main() should be in another file: encrypt.c. There
should be no main() function in ciphers.c. The Starter Code reflects this organization already.
We will only test the functions in ciphers.c. You will not be asked to submit your encrypt.c.
This kind of organization is common in software design. It also corresponds to how libraries are
created. We are essentially asking you to create a basic cryptography library. All the functions
are contained in files that are separate from the main program, so that they can be reused by
others. In this case, we will call the functions you wrote in our test and grading programs.
To support this organization, you also need to create a .h file. This is a header file that is
included in the two .c files, so that common definitions can be shared. The .h file contains
constants we need (they are already there as part of the starter code; do not modify them). You
will also have to add the function prototype declarations of your functions to this file. However,
only add the declarations of functions that are called from main(). In this case, all the functions
you are being graded on. For any additional helper functions you may decide to create, do not
put their prototype declarations in the .h file.
As mentioned before, for this entire assignment (parts 1 and 2), you may assume the inputs
provided to the functions are valid. You do not need to test for edge cases.
Starter Code
To get the starter code for this assignment, execute the following command in your home
directory:
$ getStarter PA4
This will create a new subdirectory called “PA4” with all the starter code. In addition to a
Makefile, you will find three files:
ciphers.c
In this file, you will write all the functions we will ask you to implement. The upcoming sections
will provide the detailed specifications for those functions. Note that the file contains the line
#include “ciphers.h”, which includes the ciphers.h header file. Do not delete this line. It is
fine to include additional header files (but not or any other string library).
ciphers.h
In this file, you need to put the function prototype declarations of the functions we are asking
you to write in ciphers.c. Also, there are two constant definitions already provided for you:
ENCRYPT and DECRYPT. We use these for our flag to indicate if the function will be doing an
encrypt or a decrypt operation. Do not modify or delete these constant definitions.
encrypt.c
This is where you will write your main() function. We will not specify what your main needs to
do, as we will not test it. However, you will probably want your main to call the different functions
you wrote in ciphers.c to test them. There is already sample code in there, just to illustrate some
basics of getting user inputs. It also shows how to call some of the cipher functions. You still
2
need to add your own code to actually test your ciphers.
For this assignment, we will not test your main(). We are only interested in the correct behavior
of the functions we will write in ciphers.c.
Compiling Your Code
To compile your code, run
$ gcc encrypt.c ciphers.c -o crypto
This command runs the gcc compiler on the two input .c files and creates an output executable
named crypto (the name that follows the -o flag). It illustrates how you can compile a C-program
in the case where our code is split across multiple files.
If you don’t want to keep on typing the compile command over and over again, note that if you
press the up-arrow on your keyboard when inside a terminal window, you can revisit past
commands you have executed. This is a handy shortcut to keep in mind.
2 The format string in the scanf() function, “ %[^\n]s”, reads a string until a newline is encountered. It lets
you read in strings that contain blank spaces (which is not possible with just “%s”). You do not need to
understand the specifics of how this format string works.
Creating Test Data
You need to thoroughly test that the functions you are writing actually work correctly. This
means that you have to write your own main() in encrypt.c that calls your functions and tests
them for a variety of inputs. The ability to test your own code is an important part of software
design.
While we provide some examples of inputs and the corresponding correct outputs in this
writeup, your testing should be much more extensive. To help you with this, we have provided
you with demo programs that you can use to generate test data.
To compile these demo programs, for example demo1, run:
$ make demo1
This will create the executable demo1. The procedure for the other demo programs is the same.
In addition to generating test data, these demo programs can also help you solidify your
understanding of the algorithms. Verify that the outputs correspond to how you understand the
algorithms should work. Without understanding an algorithm, it is impossible to program it.
These programs do not use any of your functions. They are standalone implementations that we
created to show you correct outputs for encryption (and the reverse, decryption).
demo1 Part 1: demo of char_at_index() and index_of_char()
demo2 Part 1: demo of cipher_substition()
demo3 Part 2: demo of cipher_caesar()
demo4 Part 2: demo of cipher_enigma()
Assignment Details - Part 1
Task 1a - Finding a Character in a Code String
Before building your encryption algorithms, we will ask you to create two helper functions.
These will come in very handy later. Both functions work on code strings. These are strings that
just have all the possible symbols in some order (see Information about the Strings). For
example, in the encrypt.c file in the starter code, the string “alphabet” is a code string.
char_at_index
The first function you need to implement is char_at_index. We will specify exactly what it
needs to do, but we won’t give you the actual function prototype, as we want you to figure this
out yourself. It is a function with three arguments:
char_at_index( arg1 , arg2 , arg3)
1. The first argument, arg1, is the code string (basically a string). It is not modified by this
function. The string does not need to contain unique characters.
2. The second argument, arg2, is an integer that indicates how much the array needs to be
shifted to the left.
3. The third argument, arg3, is an integer, which is an index in the shifted array.
4. The function returns the character that can be found in the shifted array at this index.
The idea is the following: We pass this function a string of symbols (we will use only 5 symbols
here for illustration purposes). The function returns the character at a particular index (position)
in this string, if the string were to be shifted first. This shifting is a “wrap-around” shift to the left,
as shown below (so characters that are shifted out on the left appear on the right).
Original code string
Code string shifted by 2
So, for the example shown above,
char_at_index(“ABCDE”
, 2, 3)
returns the character ‘A’ (indexes in C start at 0).
We don’t know the length of the code string a priori. For example, when using this function when
implementing your ciphers, it might be called with strings of 26 or 27 characters (the upper-case
letters, without or with the space). However, note that you can figure out the length of the string
by knowing that it ends with the ‘\0’ character. You may assume the string is not empty.
Note that the string you are passing should not be modified. The shift is basically something you
should imagine being applied to the input string prior to looking for the character at a certain
index. You don’t want to actually move all the items around. There is a better way to do this, that
does not involve an actual shift. Before you start coding, take the time to think about how you
can implement this function efficiently. It is important to come up with a good plan first.
Also, when you implement this function, make sure it works for both positive values of this shift
(so shifting the array to the left) and negative values (which corresponds to shifting the array to
the right). The shift values can be any number (as long as it is an integer, since it is a parameter
of type int). The index is always within the length of the string.
To get a better understanding of this helper function and the index_of_char below, you can
also run the demo1 program, as explained in Creating Test Data.
index_of_char
The second function, index_of_char, is related to the previous one. It is essentially the
complement of it. Again, it is a function with three arguments (as before, what you see below is
a description of the function, not the actual function prototype):
index_of_char( arg1 , arg2 , arg3)
1. The first argument, arg1, is the code string. The string contains unique characters; i.e.,
no character is repeated. It is not modified by this function.
2. The second argument, arg2, is an integer that indicates how much the array needs to be
shifted to the left.
3. The third argument, arg3, is a character that needs to be found. You may assume the
character is present in the array and is present exactly once.
4. The function returns an integer, which is the index of this character in the shifted array.
So, for the same example from before,
index_of_char(“ABCDE”
, 2,
‘D’)
returns the value 1 (remember indexes start at 0 in C).
Again, when implementing this function, you want to think about whether you actually need to
shift all symbols around. This function also needs to work for both positive and negative integer
values of the shift. As with char_at_index, we don’t know the length of the string a priori (but
it is not an empty string). You may also assume the character that needs to be found is actually
present in the string (and only once).
Task 1b - Substitution Cipher
The first actual cipher you will implement is the substitution cipher. The basic idea is that each
character maps exactly to one other character. For example, all A’s are replaced by T’s, all B’s
become G’s, etc. For a more detailed explanation, check out the Substitution Cipher
background section.
The way we specify how to do the substitution is through a code string. This string will have 27
symbols: the 26 upper case letters and the space. The interpretation of the code string is as
follows: the first symbol indicates what to map an A to, the second what to map a B to, etc. So
the code string “TGL ….” basically says that A’s become T’s, B’s become G’s, etc. when
encrypting (and the reverse when decrypting). Note that the blank space character is just
treated as another symbol in this cipher (as also explained in the Substitution Cipher
background section).
The goal is to implement a function that can encrypt/decrypt with a substitution cipher. The
same function should be able to do both, with a flag specifying whether the input needs to be
encrypted or decrypted. The functionality should be as follows:
cipher_substitution(arg1, arg2, arg3, arg4)
1. The first argument, arg1, is a string containing the data that needs to be encrypted or
decrypted. It is not modified by this function.
2. The second argument, arg2, is a string that will contain the result of the encryption or
decryption.
3. The third argument, arg3, is the code string used for our substitution cipher. So this has
27 entries (plus the ‘\0’ character). It is not modified by this function.
4. For the fourth argument, arg4, which is an integer, we either pass ENCRYPT or
DECRYPT, which are the two constants that we defined in ciphers.h (see Starter Code).
5. The function returns nothing.
For example (and again using just 5 characters in the code rather than the full 27 for simplicity
here), assume we want to encrypt some data (the 4th argument of the function is ENCRYPT). If
the data to be encrypted is “DDAB” and the code is “BDEAC”, the result of the encryption is
“AABD”. Also check out encrypt.c in the starter code. It illustrates how to call this function in
your code.
Hint: Consider calling the two functions you implemented before (Task 1 - Finding a Character
in a Code String) in your implementation of the substitution cipher. This was the reason why we
asked you to create those first. Also, you may want to create the string “ABCD ...Z “ (the
alphabet plus a blank space) in your function to help you as well.
Submitting Part 1
Testing your Code
To grade your work, we will compile your functions in your ciphers.c with our own main(). For
this to work, your function specifications have to be correct. We created test programs
which allow you to verify that this is indeed the case. They will compile your functions in
ciphers.c with a test main we wrote.
3
To compile the test programs, run the following command in your PA4 directory:
$ makeTest PA4a
If this is successful, you will see two new executables in your directory:
test1 This tests char_at_index() and index_of_char().
test2 This tests cipher_substitution().
If you get compilation errors running these tests, your code will not pass our grading scripts. The
errors could be due to mistakes in your functions or incompatible function specifications. If your
code compiles correctly with your own main() in encrypt.c, but does not with the test programs,
it is probably the latter.
These tests only check a very simple example. They are not meant to help you verify if your
code does the right thing. You need to do more extensive testing on your own. The test
programs are mainly there to make sure your code is compatible with our grading scripts. Note
that if you make changes to your code, you need to execute makeTest PA4a again, as
otherwise the executables test1 and test2 will not have been compiled with your latest code.
Submission Instructions
To submit Part 1 of your assignment, run the following command:
$ submit PA4a
This will submit your ciphers.c file. It is the only file we need. As mentioned before, we will be
testing your functions with our own main(). For Part 1, we will only test the functions that are
required there. We will ignore any ones that are for Part 2, so it is fine to have unfinished,
non-working code of Part 2 already in there as long as the program compiles. It is important to
re-emphasize that ciphers.c cannot contain a main() function (see Organizing your Code).
Make sure you have verified that your code compiles and runs correctly with the test programs
(see Testing your Code).
3
If you try these programs before you write any of your code, you will get errors as you haven’t
defined your functions yet.
Assignment Details - Part 2
For Part 2 of the PA, you are asked to create two more encryption algorithms. These will go in
the same ciphers.c file of Part 1. So, at the end of Part 2, your work for both Part 1 and 2 will be
there. The reason is to allow reuse of helper functions between the two parts.
Task 2a - Caesar Cipher
You have to implement a function that can encrypt/decrypt using a caesar cipher. Check out the
Caesar Cipher background section for more details on how it works. Essentially, each letter of
the alphabet is replaced by one that comes a certain number of places after it (this number is
the cipher’s key). Note that blank spaces simply remain blank spaces when using this cipher.
The function you need to implement has the following functionality:
cipher_caesar(arg1, arg2, arg3, arg4)
1. The first argument, arg1, is a string containing the data that needs to be encrypted or
decrypted. It is not modified by this function.
2. The second argument, arg2, is a string that will contain the result of the encryption or
decryption.
3. The third argument, arg3, is the key of the Caesar cipher (an integer). A positive value of
x means that during encryption a letter is replaced by one coming x positions after it in
the alphabet. This argument can be positive or negative, and you may assume it is
between -26 and 26.
4. For the fourth argument, arg4, which is an integer, we either pass ENCRYPT or
DECRYPT, which are the two constants that we defined (see Starter Code).
5. The function returns nothing.
For example, when calling this function with ENCRYPT and a key of 1, the data “HELLO” will be
encrypted as “IFMMP”.
As discussed in the background section, a caesar cipher is a special case of a substitution
cipher. So you could base your solution on the function for that cipher. However, is that the best
approach? You also have the two other functions you implemented in Task 1a - Finding a
Character in a Code String. It is worthwhile thinking a little bit about how we can implement our
Caesar cipher efficiently. This will also prepare you for the next problem: implementing the
Enigma cipher.
Hint: You may again want to build a helper string, but now “ABC ..Z” (without the blank space).
Task 2b - Enigma
For this problem, you need to implement a simplified version of the famous Enigma cipher that
was used by the axis powers in WWII. The cracking of the Enigma cipher also is a seminal
moment in the history of computer science. You can learn a bit more about this in the History
section on Enigma.
For a detailed explanation of the workings of this cipher, first read our write-up in the Enigma
Operation for this PA section. This will provide you with all the details you need. Make sure you
thoroughly understand the Enigma operation before you go on to the coding part. Note that the
Enigma treats spaces just like other characters (so we are again working with a set of 27
symbols, as with the substitution cipher).
The function you need to implement has the following functionality:
cipher_enigma(arg1, arg2, arg3, arg4)
1. The first argument, arg1, is a string containing the data that needs to be encrypted or
decrypted. It is not modified by this function.
2. The second argument, arg2, is a string that will contain the result of the encryption or
decryption.
3. The third argument, arg3, is a non-negative integer number (of at most 4 digits),
specifying the key of the Enigma cipher. The two right-most decimal digits (what are
called the least significant digits) of this number represent the initial shift (to right!) of the
inner rotor (what we called key_part_1 in Enigma Operation for this PA). The next two
digits give the initial shift (to the right) of the middle rotor (what we called key_part_2).
For example, the number 1304 means that the initial shift of the inner rotor is 4 positions
and that of the middle rotor is 13 positions .
4
4. For the fourth argument, arg4, which is an integer, we either pass ENCRYPT or
DECRYPT, which are the two constants that we defined.
5. The function returns nothing.
The default positions for all the rotors (i.e., without any initial shift) and the organization of the
symbols on each of the rotors is as shown in Enigma Operation for this PA. This information is
also already included in the ciphers.c starter code, so you don’t have to copy it from this
document.
When testing your Enigma, make sure it also works for messages longer than 27 symbols (to
capture the effects of the rotors moving after encrypting each symbol).
4
If you want to test it with the number like 0312, write it as 312 rather than with the leading 0. The reason
is that a number with a leading 0 is interpreted as an octal (i.e., base-8) number, rather than a decimal
(i.e., base-10) number.
Task 3 - Extra Challenge: Cracking the Code (Optional)
The Caesar cipher is not really secure, as you can imagine. You could simply examine each
possible shift and pick the one that results in a sensical message. For this extra challenge, it is
your job to write a program that can crack a message encrypted with a Caesar cipher and an
unknown key.
We are going to assume the following scenario. We suspect people are planning a coup against
our friend Julius and we are intercepting their messages. Ironically, they are using Caesar’s own
cipher, but with a key that is unknown to us. However, we are pretty sure they will mention the
word “CAESAR” somewhere in their message (they are planning a coup against him after all).
So we can just go try to figure out what shift to use by realizing that the correct decryption will
result in a message that has the word CAESAR in it somewhere. We will assume there is no
ambiguity -- basically that there is no word that with key1 maps to the same encrypted text as
the word CAESAR maps to when encrypted with key2 (this assumption turns out to be
reasonable).
So the goal now is to build a decryption function for a Caesar cipher when we don’t know the
key. However, we specify a keyword we suspect will appear in the decrypted text. In the
example above, this keyword was CAESAR, but you need to build a function for which we can
specify any keyword (up to 255 characters max, as our messages are never longer than that
anyway).
Note that we need to find an exact match for the keyword. It can not be part of another word.
Instead, we need to find the keyword as a standalone word (we are not interested in messages
about CAESARION). If the keyword was found, the function should decrypt the message and
return a value of 1 (signifying success). If the keyword could not be found, the function should
return a value of 0 (signifying failure).
The function you need to implement does decryption only. It has the following functionality:
crack_caesar(arg1, arg2, arg3)
1. The first argument, arg1, is a string containing the data that needs to be decrypted.
2. The second argument, arg2, is a string that will contain the result of the decryption. If the
keyword was not found, it does not matter what arg2 is.
3. The third argument, arg3, is a string that specifies the keyword to look for.
4. The function returns 1 if the decryption was successful; otherwise it returns 0.
For example, we could ask your program to crack this message:
CVKJ KRBV TRVJRI J KFZCVK GRGVI
With the keyword: CAESAR
Important: Make sure that you spell the function name correctly. Any compilation errors in our
grading script because of misspelled names are not eligible for regrades.
Submitting Part 2
Testing your Code
To compile the test programs for Part 2, run the following command in your PA4 directory:
$ makeTest PA4b
If this was successful, you will see two new executables in your directory:
test3 This tests cipher_caesar().
test4 This tests cipher_enigma().
Again, these simple scripts are not meant to provide extensive testing, but rather only to verify
that your code is compatible with our grading scripts.
Submission Instructions
To submit Part 2 of your assignment, run the following command:
$ submit PA4b
Again, this will submit only ciphers.c. Remember that ciphers.c cannot contain a main() function
(see Organizing your Code). Make sure you have verified that your code compiles and runs
correctly with the test programs (see Testing your Code).
Background
Cryptography
Encryption is the process of encoding a message such that only authorized people can
interpret it. You can think of it as scrambling the content. Encryption algorithms are also called
ciphers. They convert the original message, which is also referred to as plaintext, into an
encrypted message, also called the ciphertext.
The trick, of course, is to not only make the message unreadable, but also to be able to reverse
this process. This is called decrypting a message or decryption. Good encryption hides a
message in such a way that it is difficult to unscramble it without some extra information, called
a key. So even if your secret encrypted message were to fall into the wrong hands, they could
do nothing with it, as they wouldn’t be able to unscramble it without this key. Instead only
someone with the key is able to decrypt it. At least that is the goal. Breaking an encryption, or
cracking the code, is basically the ability to find out the message without knowing the key, or
equivalently, discovering the key somehow based on the encrypted message. Good encryption
algorithms are those that are hard to break.
The history of encryption is fascinating. Over the centuries, people have proposed many
ciphers, which others have consequently tried to crack to show their vulnerabilities. In this
assignment, you will implement a few ciphers that are important because of their historical
significance, but which nowadays are considered pretty vulnerable. Nevertheless, they provide
a good introduction to some of the basic elements of encryption, which are present in most
modern ciphers as well.
Also, the different ways in which people try to crack ciphers are really interesting and creative.
These are called attacks. It is like playing detective and eking out information from an array of
different sources to try to crack a code. For example, people have used the fact that not all
letters occur with the same frequency in typical messages, have looked at repetitions in
encrypted messages, or have measured the electrical power it would take to run a certain
algorithm, and much more. Data security is an active and fascinating research area to this day.
Caesar Cipher
One of the first known military uses of codes was by Julius Caesar in 50 - 60 BC. It is reported
he would scramble his messages by replacing each letter in the alphabet by one that would
follow three positions later. This is illustrated in the figure below. The message “ALEA IACTA
EST” would thus be encrypted as “DOHD LDFWD HVW”. In an era when most people couldn’t
read (or would assume this was simply another language), this was probably reasonably
effective at the time.
In general, a code where we shift all the letters in the alphabet by a fixed value (the key) is
nowadays referred to as a Caesar cipher. Another famous example of this is known as ROT13.
It is a Caesar cipher with a shift of 13. An interesting property is that, because there are 26
letters in the alphabet, applying ROT13 again on the ciphertext turns it back into the original
message, so encryption and decryption are basically the same.
For our assignment, we will only consider upper-case letters, such that there are only 26
symbols to consider. The one extra symbol we add is the blank space (to bring our total to 27).
We will do this for all ciphers in our assignment. Note that typically (or at least historically) for
Caesar ciphers only the letters are shifted, as also indicated in the figure. The blank space in
between
站长地图