A3: CSCSHELL
Due Sunday by 11:59p.m.
Points 200
Available after Feb 14 at 12a.m.
A3 – CSCSHELL
CSC209 Winter 2024
Last updated (Mar 15th): See 5.7
Table of Contents
1. Introduction
2. Starter Files
2.1. Actually getting started
3. CSCSHELL
4. Limitations (shell stuff you won’t have to implement)
5. Implementation and Requirements
5.1. Managing errors
5.2. Shell Variables
5.2.1. Creation
5.2.2. Usage
5.2.3. PATH
5.2.4. Error checking with variable assignment
5.3. Scripts and .cscsh_init
5.4. Execution
5.4.1. The cd command
5.5. File redirection
5.6. Piping
5.7. Notes and Clarifications (check here for any updates)
6. Marking
1. Introduction
In our first few lectures together we learned the basic “syntax” of the command line. We talked a little bit
about terminal programs, and this thing called BASH, or more generally the shell. In this assignment, you
will be implementing a basic shell that can perform many, if not all, of the operations we’ve needed in the
course thus far (to my surprise, the man pages and vi are also supported, at least on teach.cs and my
own machine). It will be able to use the system to start executables (ones with appropriate permissions)
found in any directory listed in a special $PATH variable as well as any with an absolute or relative path
(e.g. if you typed ./hello ). Of course, you’ll also be able to supply command line arguments to these
programs. This shell will be interactive as well as mildly scriptable. When “interactive”, typing into the
terminal program (itself running your shell program) will provide input to the stdin of other running
executables, as well as display the output of stdout and stderr . Really, what you're already used to at
the "command line", but written by you! Ultimately, it will support file redirection, piping, and reading
commands from a script file, so you are not solely limited to interactive usage of this shell.
While we won’t be able to exhaustively support any and all programs that are, say, installed on teach.cs
(this would require continual maintenance and bug-fixing), we will be able to use the basic system
programs we’ve learned to use, like cd (in fact, a special case), ls , mkdir, rm, grep, touch, man,
cat, wc etc. -- as well as all the programs you have developed for the course.
This assignment will feature a lot of C-string parsing, file i/o, as well as standard usage of the system call
(families): fork , exec , wait , pipe and dup .
2. Starter Files
The header file cscshell.h has lots more information to get you started. Please do not remove or edit
anything that is already there, but feel free to add more function declarations, macros, and any libraries
that don’t need linking and are available on teach.cs.
Do not change the struct definitions; the tester will be using these, so you're stuck with them.
In addition to cscshell.h , there is the accompanying cscshell.c which features the main() entry point
for your program. This source file should not need to be changed. If you do, you may only add
functionality, and not remove any existing functionality (also we won’t be testing any of the additional
functionality that you add).
The files parse.c and run.c are the two main parts of this assignment. Every function you implement
should be added to one of these two files, depending on if it is for parsing shell text, or running
processes/other executables.
2.1. Actually getting started
The material you will need for the system-call aspect of this assignment will not be covered in class until
weeks 7 and 8, so get started with parsing the shell syntax as soon as you can (since you have
everything you’ll need for this), and start asking yourself what you would need to be able to do the rest.
The source files include a lot of additional information meant to direct you towards successfully
completing the assignment, as well as some helpers and partially completed functions. Feel free to start
from scratch according to cscshell.h if you're that kind of person.
3. CSCSHELL
This shell will be similar to POSIX sh
(https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html) (much like BASH
(https://www.gnu.org/software/bash/) ; which you find on the teach.cs machines, Ubuntu, or the WSL; or
zsh
(https://wiki.archlinux.org/title/zsh#:~:text=Zsh%20is%20a%20powerful%20shell,improved%20tab%20comple
; which you find on your Apple products). We will however keep it profoundly simplified, focusing on the
following features:
1. Comments: anything trailing a # character on any line is a comment and thus ignored
2. Variable assignment and usage. See 5.2.
3. Script execution, including a mandatory init script that must initially define a ${PATH} variable at
some point
4. A cd command; this is not an executable, but system control access provided by the shell
5. Using system calls to start executables with arbitrary string arguments. Programs are the first part of
any usable line of CSCSHELL (provided it is not cd , variable assignment or a comment), and can be
either a name found in one of the ${PATH} directories, or an unambiguous path to the program
(relative or absolute are both acceptable).
6. File redirection from stdout to new (or re-creating existing) files or appending to existing ones, as
well as redirection from a file to stdin
7. Piping between the stdout of one program and the stdin of the next. All processes must run
simultaneously.
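As a small illustration of feature 1, here is a minimal sketch of comment stripping. The helper name strip_comment is hypothetical (it is not part of the starter code), and the sketch assumes that every # starts a comment, including one that appears inside a variable value, so check that assumption against the tests before relying on it.

```c
#include <string.h>

/* Hypothetical helper (not from cscshell.h): truncates `line` in place at
 * the first '#', dropping the comment and everything after it. */
void strip_comment(char *line) {
    char *hash = strchr(line, '#');
    if (hash != NULL) {
        *hash = '\0';
    }
}
```

Calling this on a line before any other parsing means the rest of the parser never has to think about comments; a line that was only a comment simply becomes an empty string.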
4. Limitations (shell stuff you won’t have to implement)
Even the most basic POSIX sh (https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html)
supports many more features, and is in fact, a fairly comprehensive scripting language. We will not be
implementing any features aside from what is listed in 3. Some obvious limitations:
1. No support for meta-characters/matching, e.g. '*'
2. Loops, conditionals, general programmatic scripting features
3. Command line arguments to the scripts (somewhat pointless without 2.)
4. Redirection to/with file descriptors numbers (through shell syntax)
5. Command history and tab-completion
Furthermore, having only one environment variable ( $PATH ) is fairly limiting, and so is maintaining the
variables as a linked list. If you are really into this work, I’d suggest extending the shell with some of
these missing features if you’re ever looking for exercises to do.
5. Implementation and Requirements
5.1. Managing errors
Errors that are due to system failures, e.g. malloc running out of memory, should cause the shell to
terminate. However, errors pertaining to the usage of CSCSHELL should be handled gracefully without
ending the shell; these should just print an error message. The header cscshell.h lists a variety of
macros with (formatting) strings to print to stderr when something goes wrong. You will find examples in
cscshell.c about how to use them, specifically using ERR_PRINT . If you can’t decide which error to use,
use the general-purpose error:
#define ERR_EXECUTE_LINE "Could not execute line.\n"
Update: You may use any error message with the ERR_PRINT macro to pass the automated tests.
There need not be any confusion as to which message to use. Note that good messages will help you
debug.
5.2. Shell Variables
5.2.1. Creation
Shell variables must be specified as a name, consisting of only alphabetic characters and _
(underscore) characters. Variable values are a string that can contain any ASCII characters.
Note that the newline character for a line is not part of the variable’s value.
A single = must separate the variable name and the value, without any spaces in between the name
and '=' . There can be spaces in the variable’s value, and the value can also be empty (nothing, or
simply whitespace after '=' ). This means that you might also find extra whitespace, including '\t'
( '\n' and '\r' get messy when using fgets to read input, so we will relax this and not worry about them).
This whitespace does not need to be removed from the value. Some extreme, but correct variable
assignments might look like this:
PATH=Some Text, that can definitely include spaces.
OTHER_VAR=Example Has Special(tab) Whitespace.
EMPTY_VAR_ONE=
EMPTY_VAR_TWO=
Note: EMPTY_VAR_TWO actually has some spaces as its value, while EMPTY_VAR_ONE does not. Highlight the
whitespace with your cursor to see this more clearly.
Some examples of incorrect assignment are as follows:
BAD_VAR2=It has a number in the name
Badvar =Shouldn't have a space before the equal sign.
ls > THING=this is just very wrong, hopefully it is clear why!
Note: Quotations will be simply assumed to be part of the string, hopefully making your job way easier.
Therefore, in this example:
OTHR_VAR="hello world"
The name of the variable is OTHR_VAR and its value includes the quotes, storing: "hello world" (13
characters – and a null terminator as a C-string).
Finally, notice that if you type the literal escape sequence '\t' or '\n' , these are interpreted as two
characters. Since we don’t use TAB for autocomplete in this shell, however, pressing the TAB key will give you a
'\t' character, in the form of inserting a big whitespace (and the C char value '\t' ). Refer back to
OTHER_VAR above.
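The creation rules above can be sketched as a pair of small functions. This is only one possible approach: the names is_valid_var_name and split_assignment are hypothetical (they are not declared in cscshell.h), and the sketch assumes any leading whitespace has already been skipped.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if name[0..len) is non-empty and holds only letters and '_'. */
int is_valid_var_name(const char *name, size_t len) {
    if (len == 0) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        if (!isalpha((unsigned char)name[i]) && name[i] != '_') {
            return 0;
        }
    }
    return 1;
}

/* Splits "NAME=value" at the first '='. On success, stores the length of
 * the name in *name_len, points *value at the text after '=', and
 * returns 1. Returns 0 if there is no '=' or the name is invalid. */
int split_assignment(const char *line, size_t *name_len, const char **value) {
    const char *eq = strchr(line, '=');
    if (eq == NULL) {
        return 0;
    }
    size_t len = (size_t)(eq - line);
    if (!is_valid_var_name(line, len)) {
        return 0;
    }
    *name_len = len;
    *value = eq + 1;   /* quotes and whitespace stay part of the value */
    return 1;
}
```

Because the value is simply everything after the first '=' , quotes, spaces and tabs are kept untouched, which matches the OTHR_VAR example. A name containing a digit, a space, or any other non-letter is rejected before the value is even looked at.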
5.2.2. Usage
Clarification (Mar-04): it is acceptable in this assignment for variables to only ever be used in
commands, as in, usage is not considered during creation (you don't need to support $VAR=Use some
$OTHER value, but you can support this if you want).
1. Simple
Variable usage in the shell consists of a prefixed $ before the variable’s name. As in:
SPECIAL_FILE=/etc/password
echo "The file: $SPECIAL_FILE does not really exist."
Note: this is ambiguous if you’d like to combine the variable with symbols that are valid variable
characters. The end of the variable is clear because of the space, but instead, consider:
PROJECT_DIR=a3/solution/
gcc -Wall -std=gnu99 -c $PROJECT_DIRcscshell.c
The end of the variable is ambiguous!
2. Unambiguous
The unambiguous variable usage does not rely on an invalid name character to demarcate the end of
the variable, and instead adds curly braces {} on either side of the name to make this
unambiguous.
Consider how this fixes the issue in 5.2.2.1:
PROJECT_DIR=a3/solution/
gcc -Wall -std=gnu99 -g -o solution ${PROJECT_DIR}cscshell.c
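To make the two usage forms concrete, here is a rough sketch of variable expansion. Everything in it is an assumption made for illustration: struct demo_var is a stand-in for the real variable struct in cscshell.h, expand_vars is a hypothetical name, and treating an unknown variable as an error is just one possible policy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variable node; the real struct is defined in cscshell.h. */
struct demo_var {
    const char *name;
    const char *value;
    struct demo_var *next;
};

static int is_name_char(char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
}

static const char *lookup(const struct demo_var *vars, const char *name,
                          size_t len) {
    for (; vars != NULL; vars = vars->next) {
        if (strlen(vars->name) == len && strncmp(vars->name, name, len) == 0) {
            return vars->value;
        }
    }
    return NULL;
}

/* Expands $NAME and ${NAME} from `in` into `out` (capacity `cap`).
 * Returns 0 on success, or -1 on an unknown variable, a missing '}',
 * or an output buffer that is too small. */
int expand_vars(const struct demo_var *vars, const char *in, char *out,
                size_t cap) {
    size_t o = 0;
    if (cap == 0) return -1;
    while (*in != '\0') {
        if (*in != '$') {
            if (o + 1 >= cap) return -1;
            out[o++] = *in++;
            continue;
        }
        in++;                              /* skip the '$' */
        int braced = (*in == '{');
        if (braced) in++;
        size_t len = 0;
        while (is_name_char(in[len])) len++;   /* longest run of name chars */
        if (braced && in[len] != '}') return -1;
        const char *val = lookup(vars, in, len);
        if (val == NULL) return -1;
        size_t vlen = strlen(val);
        if (o + vlen >= cap) return -1;
        memcpy(out + o, val, vlen);
        o += vlen;
        in += len + (braced ? 1 : 0);      /* also skip the closing '}' */
    }
    out[o] = '\0';
    return 0;
}
```

With PROJECT_DIR defined, the text ${PROJECT_DIR}cscshell.c expands to a3/solution/cscshell.c , while $PROJECT_DIRcscshell.c fails in this sketch because the name it reads is PROJECT_DIRcscshell, which does not exist.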
5.2.3. PATH
This variable is just like the $PATH variable in the shell on teach.cs. Try printing it out with echo $PATH .
Essentially, it is a set of paths to directories, separated by colons. The trailing / is optional for any path
on the $PATH . In the interest of quick and syntactically seamless O(1) access, maintain this variable as
the head of the variable list!
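For a feel of how a colon-separated PATH value might be searched, consider the sketch below. The names join_path and find_in_path are hypothetical and are not the starter code's resolve_executable; the sketch only shows the idea of trying each directory in order, with the trailing / treated as optional.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Joins the first dir_len bytes of dir with name into out, adding a '/'
 * only when dir lacks a trailing one.
 * Returns 0 on success, -1 if the result does not fit in cap bytes. */
int join_path(const char *dir, size_t dir_len, const char *name,
              char *out, size_t cap) {
    int need_slash = (dir_len > 0 && dir[dir_len - 1] != '/');
    int n = snprintf(out, cap, "%.*s%s%s", (int)dir_len, dir,
                     need_slash ? "/" : "", name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Tries each colon-separated directory of path_value in order.
 * Writes the first executable match to out and returns 1, else returns 0. */
int find_in_path(const char *path_value, const char *name,
                 char *out, size_t cap) {
    const char *start = path_value;
    while (*start != '\0') {
        size_t len = strcspn(start, ":");   /* length of this directory */
        if (len > 0 && join_path(start, len, name, out, cap) == 0
                && access(out, X_OK) == 0) {
            return 1;
        }
        start += len;
        if (*start == ':') start++;
    }
    return 0;
}
```

Note that access with X_OK also succeeds for directories, so a more careful version would probably check the file type as well (for example with stat).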
5.2.4. Error checking with variable assignment
A few error conditions should be checked when parsing variable assignments. cscshell.h lists the errors
that will need checking, with convenient macros for printing them to stderr . The following two macros are provided for
error checking pertaining to variable assignment specifically.
#define ERR_VAR_START "Assignment cannot start with '=' character.\n"
#define ERR_VAR_NAME "Variable names must only contain alphabetic characters and\
'_' chars.\n Got: %s\n"
1. (Hint) Don’t consider variable assignment mixed with pipes or redirection
Notice that the command MY_VAR="something" | grep foo is nonsensical: there is no stdin or stdout
relating to variable assignment.
Your solution should handle the above example by creating (or updating) the variable MY_VAR to have
the (string) value "something" | grep foo .
We will not test the similarly nonsensical ls | MY_VAR="something" .
2. (Hint) All the text to be parsed will be ASCII-formatted
Thus, you can use the character codes/numeric ranges to do error checking and parsing if you so
desire.
5.3. Scripts and .cscsh_init
Scripts should operate as if each line in the file were typed into the shell program one at a time. Except…
You won’t have to type it interactively.
Therefore the code provided for running the shell interactively should provide you a great example of
how to use the functions to implement run_script . Notice that we use run_script to start the shell’s
“init” file. Be sure to go over the source code we give you before starting!
Finally, you’ll eventually notice that running your shell program on teach.cs through ssh will probably
result in a broken backspace. Running the command stty sane should be enough to sort this issue, so
consider adding this to your CSCSHELL’s init file, as well as the following path variable assignment to
access the basic utilities:
PATH=/local/bin:/usr/bin:/bin/
stty sane
You can download this as a .cscshell_init file here (https://q.utoronto.ca/courses/337029/files/30784621?
wrap=1) (https://q.utoronto.ca/courses/337029/files/30784621/download?download_frd=1) . Note that the
leading dot '.' in the filename means it is a hidden file. You can see hidden files with the 'a' option in ls, as
in, with '-l': ls -la
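The "one line at a time" idea for scripts can be sketched with fgets. The function name next_script_line is hypothetical and is not the starter code's run_script; it only shows reading a line and dropping the line ending before the text is handed to the parser.

```c
#include <stdio.h>
#include <string.h>

/* Reads the next line of `script` into buf (capacity cap) and removes the
 * trailing "\n" or "\r\n". Returns 1 if a line was read, 0 at end of file.
 * A line longer than cap - 1 bytes is split across calls in this sketch. */
int next_script_line(FILE *script, char *buf, size_t cap) {
    if (fgets(buf, (int)cap, script) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}
```

A script runner would then loop while this returns 1, passing buf to whatever parses and executes a single line, exactly as the interactive loop does with typed input.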
5.4. Execution
The name (for an executable file in one of the $PATH paths) or path (relative or absolute) to an
executable file is the first non-whitespace aspect of any shell “command” (or line) unless it is one of
these three (four) exceptions:
1. Variable assignment
2. The cd command (https://q.utoronto.ca/#org6294c43)
3. Comments
4. (blank lines)
This means, aside from piping, in which case there are multiple commands, the name of the “command”,
i.e. some executable identified via $PATH or directly with an absolute or relative path, must be the first
part of each “command”. Subsequent strings on a line are arguments to this command, unless they
are cases of 5.6 or 5.5. Arguments are separated from each other (and the command) by one-or-more
spaces (not tabs, newlines, carriage returns, or other whitespace).
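Splitting a command into its arguments on one-or-more spaces might look like the sketch below. The name split_args is hypothetical, and the sketch ignores redirection and pipe symbols entirely; it only shows that a tab stays inside an argument while runs of spaces separate arguments.

```c
#include <stddef.h>

/* Splits `line` in place on runs of ' ' and stores up to max_args - 1
 * token pointers in argv, followed by a NULL terminator (max_args >= 1).
 * Returns the number of tokens stored. */
int split_args(char *line, char **argv, int max_args) {
    int argc = 0;
    char *p = line;
    while (*p != '\0' && argc < max_args - 1) {
        while (*p == ' ') p++;               /* skip separating spaces */
        if (*p == '\0') break;
        argv[argc++] = p;                    /* start of a token */
        while (*p != '\0' && *p != ' ') p++;
        if (*p == ' ') *p++ = '\0';          /* terminate the token */
    }
    argv[argc] = NULL;
    return argc;
}
```

The NULL terminator at the end is what makes the resulting array directly usable as the argv passed to the exec family.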
Standard execution of lines shall require one child process per executable that needs to be run. The
parent process for the shell must wait (using the wait family of system calls) for the child process to
complete.
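The one-child-per-executable pattern can be sketched with fork , execv and waitpid . The function run_and_wait is a hypothetical name (the starter code splits this work between run_command and execute_line ), and exec_path is assumed to be an already resolved path to the executable.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at exec_path with arguments argv in a child process and
 * waits for it to finish. Returns the child's exit status, or -1 if the
 * child could not be created, waited on, or did not exit normally. */
int run_and_wait(const char *exec_path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execv(exec_path, argv);   /* only returns if the exec failed */
        perror("execv");
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than returning when the exec fails, so that a failed child never falls back into the shell's own prompt loop.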
5.4.1. The cd command
Unlike ls , your other bread-and-butter shell command cd is not an executable. Notice that cd has no
sensical stdin/out .
We’ve provided you the helper function cd_cscshell , which you should use in execute_line to actually
change directories. Your cd command must accept either no arguments or one additional argument.
With no arguments, the command should move to the current user’s home directory. Type man getpwuid
and notice the pw_dir member of struct passwd , and take a moment to understand what makes
prompt() in cscshell.c work.
If one or more arguments are provided, your cd should use the first argument as a relative or absolute
path and just ignore the rest if provided. The special paths of '.' and '..' should also work.
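As a rough picture of what changing directories involves, here is a sketch built on chdir and getpwuid . It is not the provided cd_cscshell helper; change_dir is a hypothetical name, and a NULL path is used here to stand for "no argument was given".

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Changes the working directory to `path`, or to the current user's home
 * directory when `path` is NULL. Returns 0 on success, -1 on failure. */
int change_dir(const char *path) {
    if (path == NULL) {
        struct passwd *pw = getpwuid(getuid());
        if (pw == NULL) {
            perror("getpwuid");
            return -1;
        }
        path = pw->pw_dir;        /* home directory from the passwd entry */
    }
    if (chdir(path) != 0) {
        perror("chdir");
        return -1;
    }
    return 0;
}
```

This has to run in the shell process itself: a chdir made in a forked child would disappear as soon as that child exits, which is why cd cannot be an ordinary executable.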
5.5. File redirection
File redirection must be specified after all arguments for commands (see 5.4) have been specified. In
other words ls -l > out.txt is correct, but ls > out.txt -l is not (because the -l is after the
redirection). You must support all three redirection possibilities, and mixing up to one output redirection
and one input redirection:
1. Input redirection “<”, e.g. grep keyword < file_to_search.txt
2. Output redirection “>”, e.g. ls -l > output.txt
3. Output redirection as appending “>>”, e.g. ls -lah --author >> directory_history_snapshots.txt
With two re-directions, as in wc -l < contacts.txt > num_contacts.txt the order of input or output
redirection is irrelevant; either one may go first. You cannot mix output redirection ( ">" ) with
appending the redirected output ( ">>" ) on the same command.
We will not be implementing support for input appending ( << ).
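The difference between ">" and ">>" comes down to the flags passed to open , and the actual redirection is a dup2 in the child before the exec. A minimal sketch, with hypothetical helper names open_for_output and redirect_fd and an assumed file mode of 0644:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Opens `path` for output redirection: truncating for ">" (append == 0),
 * appending for ">>" (append != 0). Creates the file if it is missing.
 * Returns the new file descriptor, or -1 on failure. */
int open_for_output(const char *path, int append) {
    int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
    int fd = open(path, flags, 0644);
    if (fd < 0) {
        perror("open");
    }
    return fd;
}

/* Makes `fd` take the place of `target` (e.g. STDOUT_FILENO) and closes the
 * original descriptor. Intended to be called in the child before exec. */
int redirect_fd(int fd, int target) {
    if (dup2(fd, target) < 0) {
        perror("dup2");
        return -1;
    }
    if (fd != target) {
        close(fd);
    }
    return 0;
}
```

Input redirection follows the same shape, with the file opened O_RDONLY and STDIN_FILENO as the target.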
5.6. Piping
Valid combinations of piping mixed with redirection should work, but there are invalid configurations
(though these can be overcome leveraging the program tee , see man tee ).
The following is an invalid combination, that BASH gracefully ignores:
ls -l > out.txt | grep l
You should not implement a tee -like solution (i.e. pipe stdout from ls and write to out.txt ) to handle
the above. You should either print an error, or simply write to out.txt without causing grep to hang
(waiting on a pipe).
So long as your shell program does not exit (i.e. segfault or intentionally exit operation), you may handle
invalid piping and redirection combinations as you see fit!
Otherwise, in general, a child process should be created for each executable listed on a single line. They
must, of course, be separated by a pipe character | (refreshing your memory on pipes in BASH may be
helpful).
The leftmost process should write its stdout to the write-end of a pipe, whose read-end is connected to
the stdin of the next program in the pipe sequence. The stdin for the first program is unaffected by
pipes, as is the final program’s stdout . That is, they should either connect to the default (terminal)
stdin/out or be subject to 5.5.
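The plumbing described above, for the simplest case of exactly two programs, might be sketched as follows. The names spawn and run_pipe are hypothetical, argv[0] is assumed to already hold a resolved path to the executable, and a real solution would generalize this to any number of commands on the line.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that runs argv with the given stdin/stdout descriptors.
 * `unused_fd` is the other pipe end, which the child must close (-1 for
 * none). Returns the child's PID to the parent, or -1 if fork fails. */
static pid_t spawn(char *const argv[], int in_fd, int out_fd, int unused_fd) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        if (unused_fd >= 0) close(unused_fd);
        if (in_fd != STDIN_FILENO) { dup2(in_fd, STDIN_FILENO); close(in_fd); }
        if (out_fd != STDOUT_FILENO) { dup2(out_fd, STDOUT_FILENO); close(out_fd); }
        execv(argv[0], argv);     /* only returns if the exec failed */
        perror("execv");
        _exit(127);
    }
    return pid;
}

/* Runs `left | right` with both children alive at the same time.
 * Returns the exit status of `right`, or -1 on failure. */
int run_pipe(char *const left[], char *const right[]) {
    int fds[2];
    if (pipe(fds) < 0) {
        perror("pipe");
        return -1;
    }
    pid_t a = spawn(left, STDIN_FILENO, fds[1], fds[0]);
    pid_t b = spawn(right, fds[0], STDOUT_FILENO, fds[1]);
    close(fds[0]);   /* the parent closes both ends; otherwise the reader */
    close(fds[1]);   /* never sees end-of-file */
    int status = 0, result = -1;
    if (a > 0) waitpid(a, &status, 0);
    if (b > 0 && waitpid(b, &status, 0) == b && WIFEXITED(status)) {
        result = WEXITSTATUS(status);
    }
    return result;
}
```

Both children are started before either is waited on, which is what lets them run simultaneously. Forgetting to close a write-end anywhere (parent or reader child) is the usual reason the right-hand program hangs waiting on the pipe.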
5.7. Notes and Clarifications (check here for any updates)
5.7.1. Before assignment release
Leading whitespace before commands or any other function is acceptable and must simply be
ignored.
5.7.2 Feb 28th run_command and some suggestions
It should have been clearer in the source code comments, but run_command 's return value should be
the child's PID (HINT: run_command should only start one command at a time, multiple processes are
expected to be managed in execute_line ). The parent process (the original shell process) should be
the one returning this PID. Another HINT: it should return this to execute_line where the waiting
happens on all the running children.
Note: The parent process is always the one that is providing prompts, or reading from a script.
Getting parse_line (and consequently replace_variables_mk_line ) to be bug free is going to take
time, but try not to leave run.c until the last minute. As of week-7, you have everything you need to
start new processes and exec executables, but it won't be until week-8 that you will learn how to
make the pipes that connect processes, and the file-descriptor management that goes along with
this.
5.7.3 Mar 4th The parse_line implementation, Using '=' outside of variable assignment, and other
variables in new variable creation
For simplicity, feel free to assume that if there is a valid (not-commented) '=' character, the
intent of a line is that it be a variable assignment. So you can assume: if = in line; then variable
assignment, without penalty. This of course breaks compatibility with command line args of the form
--option=, but we will not worry about it. It shall, of course, also be acceptable to support such =
signs as well. As in: if = with no space before it; then variable assignment; else it is command
execution. Thanks to Piazza @1702 for discussion.
The command struct returned by parse_line must be complete (as in, all fields contain their final
values), aside from the following: stdin_fd and stdout_fd may be used as you see fit, when you
see fit, and their values will not be tested directly (only the expected functionality of the shell itself will
be tested, as in, if I ls -l > file.txt then I'd expect at some point that stdout_fd be connected to
an open file called file.txt , but we will only test if the file actually gets the right bytes written to it!).
Thanks to Piazza @1708 for bringing this up.
5.2 has been updated, pertaining to: It is acceptable in this assignment for variables to only ever be
used in commands, as in, usage is not considered during creation (you don't need to support $VAR=Use
some $OTHER value , but you can support this if you want).
Other, more general clarifications:
The goal of this assignment is to complete a working shell
That is, after you run cscshell your terminal should work a lot like normal, just now it will be
slightly more limited (as described above), and use a shell implementation that you've written.
How are you expected to complete the assignment? There are no programmatic steps to follow...
Besides using the design provided.
Start from main, trace out what will need to happen to start a shell, which can execute
something like ls
The functions listed in cscshell.h are a bit of a puzzle to piece together through a couple
exercises like this
5.7.4 March 4th Part 2: Command structs
Clarification: The command struct's members will be tested and are as follows:
exec_path: (tests will examine this) The path to the literal executable run, returned by
resolve_executable
args: (tests will examine, but will ignore index 0) These are the argv arguments to your program.
You may re-use the exec_path for argv[0].
next: (tested to retrieve the whole list) the next command if a line contains multiple commands
stdin_fd: (untested) an integer to store fd's for the process' stdin
stdout_fd: (untested) same as above for stdout
redir_in_path: (tested) for input redirection, should match the exact (relative or absolute) path
provided on the original line
redir_out_path: (tested) same as above
redir_append: (tested) non-zero if and only if the command is redirecting and appending to the
destination
The only members considered/needed for a 'cd' command are exec_path and args[1], noting that
resolve_executable already returns a string literal for cd in this case.
5.7.5 Update to 5.3
The shell instruction suggestions originally in 5.3 were in the wrong order
It should be
PATH=/local/bin:/usr/bin:/bin/
stty sane
Download a copy of the file (https://q.utoronto.ca/courses/337029/files/30784478?wrap=1)
(https://q.utoronto.ca/courses/337029/files/30784478/download?download_frd=1)
5.7.6 Confirmation of ERR_PRINT and testing; invalid piping testing
Any string may be used to pass the automated tests, as long as you use the ERR_PRINT macro (or
equivalent).
Feel free to use whatever message makes the most sense to you; we will be checking for
"ERROR:" in stderr, or the lack thereof
Invalid combinations of piping and redirection will not be extensively tested, section 5.6 has been
clarified to this effect
"So long as your shell program does not exit (i.e. segfault or intentionally exit operation), you may
handle invalid piping and redirection combinations as you see fit!"
5.7.7 Parsing redirects and pipes without spacing
This question has come up a few times
If there are no spaces around redirect symbols or pipes, this is still considered valid
So you should handle cases like ls>out.txt and ls|grep file
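One way to cope with missing spaces is a small pre-pass that pads every redirect or pipe symbol with spaces, so that a space-based tokenizer sees them as separate tokens. This is only a sketch of that idea: pad_operators is a hypothetical name, and it assumes the line is a command rather than a variable assignment (whose value may legitimately contain these symbols).

```c
#include <stddef.h>

/* Copies `in` to `out`, putting a space on each side of '<', '>', ">>"
 * and '|'. Returns 0 on success, -1 if `out` (capacity `cap`) is too small. */
int pad_operators(const char *in, char *out, size_t cap) {
    size_t o = 0;
    for (size_t i = 0; in[i] != '\0'; i++) {
        int is_op = (in[i] == '<' || in[i] == '>' || in[i] == '|');
        int is_append = (in[i] == '>' && in[i + 1] == '>');
        size_t need = is_op ? (is_append ? 4 : 3) : 1;
        if (o + need >= cap) return -1;      /* keep room for the '\0' */
        if (!is_op) {
            out[o++] = in[i];
            continue;
        }
        out[o++] = ' ';
        out[o++] = in[i];
        if (is_append) out[o++] = in[++i];   /* keep ">>" together */
        out[o++] = ' ';
    }
    if (o >= cap) return -1;
    out[o] = '\0';
    return 0;
}
```

Extra spaces are harmless because arguments are separated by one-or-more spaces anyway, so ls>out.txt and ls > out.txt end up tokenizing identically.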
5.7.8 Minor clarifications from Piazza
You don't need to support && or ; operators. Or while or for or if for that matter :)
If you can't make a valid command struct from the line, you should throw an error in parse_line
The sanity tests are there for some feedback. They reflect your code's conformance to my solution,
not the more flexible notion of cscshell behaviour. If it is "working" on teach.cs but not passing tests, it
could just be because, say, you managed your file descriptors in different functions than I did. My
guess is that you are calling dup in execute_line rather than run_command, and that's fine, but the
sanity tests won't work.
We will run your shell as a standalone program for the remaining tests, and not rely on these sorts of
design decisions, so the final results will be different.
6. Marking
This assignment will be marked in a very similar fashion to what you’ve seen so far.
75% will test the core functionality outlined in this document
15% will be style graded by the TA as per the rest of the course
5% will be specifically awarded for the correct usage of perror for all system calls
5% will be specifically awarded for effective parallelism using pipes
A3: CSCSHELL
Due Sunday by 11:59p.m.
Points 200
Available after Feb 14 at 12a.m.
A3 – CSCSHELL
CSC209 Winter 2024
Last updated (Mar 15th): See 5.7
Table of Contents
1. Introduction
2. Starter Files
2.1. Actually getting started
3. CSCSHELL
4. Limitations (shell stuff you won’t have to implement)
5. Implementation and Requirements
5.1. Managing errors
5.2. Shell Variables
5.2.1. Creation
5.2.2. Usage
5.2.3. PATH
5.2.4. Error checking with variable assignment
5.3. Scripts and .cscsh_init
5.4. Execution
5.4.1. The cd command
5.5. File redirection
5.6. Piping
5.7. Notes and Clarifications (check here for any updates)
6. Marking
1. Introduction
In our first few lectures together we learned the basic “syntax” of the command line. We talked a little bit
about terminal programs, and this thing called BASH, or more generally the shell. In this assignment, you
will be implementing a basic shell that can perform many, if not all of the operations we’ve needed in the
course thus far (to my surprise, the man pages, and vi are also supported, at least on teach.cs and my
own machine). It will be able to use the system to start executables (ones with appropriate permissions)
found in any directory listed in a special $PATH variable as well as any with an absolute or relative path
CSCSHELL
2/11
(e.g. if you typed ./hello ) Of course, you’ll be able to also supply command line arguments to these
programs. This shell will be interactive as well as mildly scriptable. When “interactive”, typing into the
terminal program (itself running your shell program) will provide input to the stdin of other running
executables, as well as display the output of stdout and stderr . Really, what you're already used to at
the "command line", but written by you! Ultimately, it will support file redirection, piping, and reading
commands from a script file, so you are not solely limited to interactive usage of this shell.
While we won’t be able to exhaustively support any and all programs, that are say, installed on teach.cs
(this would require continual maintenance and bug-fixing), we will be able to use the basic system
programs we’ve learned to use, like cd (in fact, a special case), ls , mkdir, rm, grep, touch, man,
cat, wc etc. -- as well as all the programs you have developed for the course.
This assignment will feature a lot of C-string parsing, file i/o, as well as standard usage of the system call
(families): fork , exec , wait , pipe and dup .
2. Starter Files
The header file cscshell.h has lot’s more information to get you started. Please do not remove or edit
anything that is already there, but feel free to add more function declarations, macros, and any libraries
that don’t need linking and are available on teach.cs.
Do not change the struct definitions, the tester will be using these, so you're stuck with them.
In addition to cscshell.h , there is the accompanying cscshell.c which features the main() entry point
for your program. This source file should not need to be changed. If you do, you may only add
functionality, and not remove any existing functionality (also we won’t be testing any of the additional
functionality that you add).
The files parse.c and run.c are the two main parts of this assignment. Every function you implement
should be added to one of these two files, depending on if it is for parsing shell text, or running
processes/other executables.
2.1. Actually getting started
The material you will need for the system-call aspect of this assignment will not be covered in class until
weeks 7 and 8, so get started with parsing the shell syntax as soon as you can (since you have
everything you’ll need for this), and start asking yourself what would you need to be able to do the rest.
The source files include a lot of additional information meant to direct you towards successfully
completing the assignment, as well as some helpers and partially completed functions. Feel free to start
from scratch according to cscshell.h if you're that kind of person.
3. CSCSHELL
CSCSHELL
3/11
This shell will be similar to POSIX sh
(https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html) (much like BASH
(https://www.gnu.org/software/bash/) ; which you find on the teach.cs machines, Ubunutu, or the WSL; or
zsh
(https://wiki.archlinux.org/title/zsh#:~:text=Zsh%20is%20a%20powerful%20shell,improved%20tab%20comple
; which you find on your Apple products). We will however keep it profoundly simplified, focusing on the
following features:
1. Comments: anything trailing a # character on any line is a comment and thus ignored
2. Variable assignment and usage. See 5.2.
3. Script execution, including a mandatory init script that must initially define a ${PATH} variable at
some point
4. A cd command, as this is not an executable, it is system control access provided by a shell
5. Using system calls to start executables with arbitrary string arguments. Programs are the first part of
any usable line of CSCSHELL (provided it is not cd , variable assignment or a comment), and can be
either a name found in one of the ${PATH} directories, or an unambiguous path to the program
(relative or absolute are both acceptable).
6. File redirection from stdout to new (or re-creating existing) files or appending to existing ones, as
well as redirection from a file to stdin
7. Piping between the stdout of one program and the stdin of the next. All processes must run
simultaneously.
4. Limitations (shell stuff you won’t have to implement)
Even the most basic POSIX sh (https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html)
supports many more features, and is in fact, a fairly comprehensive scripting language. We will not be
implementing any features aside from what is listed in 3. Some obvious limitations:
1. No support for meta-characters/matching, e.g. '*'
2. Loops, conditionals, general programmatic scripting features
3. Command line arguments to the scripts (somewhat pointless without 2.)
4. Redirection to/with file descriptors numbers (through shell syntax)
5. Command history and tab-completion
Furthermore, having only one environment variable ( $PATH ) is fairly limiting, and so is maintaining the
variables as a linked list. If you are really into this work, I’d suggest extending the shell with some of
these missing features if you’re ever looking for exercises to do.
5. Implementation and Requirements
5.1. Managing errors
CSCSHELL
4/11
Errors that are due to system failures e.g. like malloc running out of memory should cause the shell to
terminate. However, errors pertaining to the usage of CSCSHELL should be handled gracefully without
ending the shell, these should just print an error message. The header cscshell.h lists a variety of
macros with (formatting) strings to print to stderr when something goes wrong. You will find examples in
cscshell.c about how to use them, specifically using ERR_PRINT . If you can’t decide which error to use,
use the general-purpose error:
#define ERR_EXECUTE_LINE "Could not execute line.\n"
Update: You may use any error message with the ERR_PRINT macro to pass the automated tests.
There need not be any confusion as to which message to use. Note that good messages will help you
debug.
5.2. Shell Variables
5.2.1. Creation
Shell variables must be specified as a name, consisting of only alphabetic characters and _
(underscore) characters. Variable values are a string that can contain any ASCII characters.
Note that the newline character for a line is not part of the variables value.
A single = must separate the variable name and the value, without any spaces in between the name
and '=' . There can be spaces in the variables value, and the value can also be empty (nothing, or
simply whitespace after '=' ). This means that you might also find extra whitespace, including '\t'
( '\n' and '\r' gets messy when using fgets to read input, so we will relax this and not worry about it).
This whitespace does not need to be removed from the value. Some extreme, but correct variable
assignments might look like this:
PATH=Some Text, that can definitely include spaces.
OTHER_VAR=Example Has Special(tab) Whitespace.
EMPTY_VAR_ONE=
EMPTY_VAR_TWO=
Note: EMPTY_VAR_TWO actually has some spaces as a value, EMPTY_VAR_ONE does not, highlight the
whitespace with your cursor to see more clearly.
Some examples of incorrect assignment are as follows:
BAD_VAR2=It has a number in the name
Badvar =Shouldn't have a space before the equal sign.
ls > THING=this is just very wrong, hopefully it is clear why!
Note: Quotations will be simply assumed to be part of the string, hopefully making your job way easier.
Therefore, in this example:
OTHR_VAR="hello world"
The name of the variable is OTHR_VAR and its value includes the quotes, storing: "hello world" (13
characters, plus a null terminator as a C-string).
Finally, notice that if you type the literal escape sequence '\t' or '\n' , these are interpreted as two
characters. However, since we don’t use TAB for autocomplete in this shell, pressing the TAB key will give you a
'\t' character, which shows up as a big whitespace (and has the C char value '\t' ). Refer back to
OTHER_VAR above.
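The naming rules above can be checked with a single pass over the line. The following is only a minimal sketch: split_assignment is a hypothetical helper, not part of the starter code, and it assumes the caller has already stripped the trailing newline left by fgets.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in cscshell.h): decides whether `line` is a
 * well-formed assignment. On success it returns 1 and reports where the
 * name starts, how long it is, and where the value starts. Otherwise it
 * returns 0 and leaves the outputs untouched. */
int split_assignment(const char *line, const char **name, size_t *name_len,
                     const char **value)
{
    /* Leading spaces and tabs before the name are skipped. */
    line += strspn(line, " \t");

    /* The name may only contain alphabetic characters and '_'. */
    size_t n = 0;
    while (isalpha((unsigned char)line[n]) || line[n] == '_') {
        n++;
    }

    /* The '=' must follow the name directly, and the name cannot be empty. */
    if (n == 0 || line[n] != '=') {
        return 0;
    }

    *name = line;
    *name_len = n;
    *value = line + n + 1; /* everything after '=', whitespace and quotes included */
    return 1;
}
```

Because the value is everything after the first '=' , quotes and inner whitespace are kept exactly as typed, which matches the OTHR_VAR example.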
5.2.2. Usage
Clarification (Mar-04): it is acceptable in this assignment for variables to only ever be used in
commands, as in, usage is not considered during creation (you don't need to support $VAR=Use some
$OTHER value, but you can support this if you want).
1. Simple
Variable usage in the shell consists of a prefixed $ before the variable’s name. As in:
SPECIAL_FILE=/etc/password
echo "The file: $SPECIAL_FILE does not really exist."
Note: this form is ambiguous if the variable is immediately followed by characters that are valid in a
variable name. Above, the end of the variable is clear because of the space, but consider instead:
PROJECT_DIR=a3/solution/
gcc -Wall -std=gnu99 -c $PROJECT_DIRcscshell.c
The end of the variable is ambiguous!
2. Unambiguous
The unambiguous variable usage does not rely on an invalid name character to demarcate the end of
the variable, and instead adds curly braces {} on either side of the name to make this
unambiguous.
Consider how this fixes the issue in 5.2.2.1:
PROJECT_DIR=a3/solution/
gcc -Wall -std=gnu99 -g -o solution ${PROJECT_DIR}cscshell.c
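To make the difference between the two forms concrete, here is a small sketch of how the extent of one variable reference could be measured. var_ref_span is a hypothetical helper (the starter code's replace_variables_mk_line is not shown here, so this is not its implementation).

```c
#include <ctype.h>
#include <stddef.h>

/* Valid name characters: alphabetic characters and '_'. */
static int is_name_char(char c)
{
    return isalpha((unsigned char)c) || c == '_';
}

/* Hypothetical helper: `s` must point at a '$'. On success, stores the start
 * and length of the variable name and returns how many characters of `s` the
 * whole reference occupies (the '$' and any braces included). Returns 0 if
 * there is no valid reference at `s`. */
size_t var_ref_span(const char *s, const char **name, size_t *name_len)
{
    if (s[0] != '$') {
        return 0;
    }

    int braced = (s[1] == '{');
    const char *p = s + 1 + braced;

    /* Simple form: the name runs until the first invalid name character. */
    size_t n = 0;
    while (is_name_char(p[n])) {
        n++;
    }

    if (n == 0) {
        return 0; /* "$" followed by something that is not a name */
    }
    if (braced && p[n] != '}') {
        return 0; /* "${NAME" without the closing brace */
    }

    *name = p;
    *name_len = n;
    return 1 + n + (braced ? 2 : 0);
}
```

On $PROJECT_DIRcscshell.c the simple form swallows PROJECT_DIRcscshell as the name (19 characters), while ${PROJECT_DIR}cscshell.c stops at the closing brace and yields PROJECT_DIR (11 characters).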
5.2.3. PATH
This variable is just like the $PATH variable in the shell on teach.cs. Try printing it out with echo $PATH .
Essentially, it is a set of paths to directories, separated by colons. The trailing / is optional for any path
on the $PATH . In the interest of quick and syntactically seamless O(1) access, maintain this variable as
the head of the variable list!
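As an illustration of the colon-separated format and the optional trailing / , the sketch below builds the candidate path for the n-th directory on the $PATH . path_candidate is a hypothetical helper; it is not the provided resolve_executable, whose internals are not shown here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<dir>/<name>" into `out` for the idx-th
 * colon-separated entry of `path_value`. Returns 1 on success, or 0 if
 * there is no such entry, the entry is empty, or `out` is too small. */
int path_candidate(const char *path_value, size_t idx, const char *name,
                   char *out, size_t cap)
{
    const char *start = path_value;

    for (;;) {
        size_t len = strcspn(start, ":"); /* length of the current entry */

        if (idx == 0) {
            if (len == 0) {
                return 0;
            }
            /* Only add a '/' if the entry does not already end with one. */
            const char *sep = (start[len - 1] == '/') ? "" : "/";
            int n = snprintf(out, cap, "%.*s%s%s", (int)len, start, sep, name);
            return n >= 0 && (size_t)n < cap;
        }

        if (start[len] == '\0') {
            return 0; /* ran out of entries */
        }
        start += len + 1; /* step over the ':' */
        idx--;
    }
}
```

A resolver built on this idea would try each candidate in order and stop at the first one that is executable, for example by checking it with access(candidate, X_OK).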
5.2.4. Error checking with variable assignment
A few error conditions should be checked when parsing variable assignments. cscshell.h lists the errors
that will need checking, with convenient macros for printing them to stderr . The following two macros are provided for
error checking pertaining to variable assignment specifically.
#define ERR_VAR_START "Assignment cannot start with '=' character.\n"
#define ERR_VAR_NAME "Variable names must only contain alphabetic characters and\
'_' chars.\n Got: %s\n"
1. (Hint) Don’t consider variable assignment mixed with pipes or redirection
Notice that the command MY_VAR="something" | grep foo is nonsensical: there is no stdin or stdout
relating to variable assignment.
Your solution should handle the above example by creating (or updating) the variable MY_VAR to have
the (string) value "something" | grep foo .
We will not test the similarly nonsensical ls | MY_VAR="something" .
2. (Hint) All the text to be parsed will be ASCII-formatted
Thus, you can use the character codes/numeric ranges to do error checking and parsing if you so
desire.
5.3. Scripts and .cscsh_init
Scripts should operate as if each line in the file were typed into the shell program one at a time. Except…
You won’t have to type it interactively.
Therefore the code provided for running the shell interactively should provide you a great example of
how to use the functions to implement run_script . Notice that we use run_script to start the shell’s
“init” file. Be sure to go over the source code we give you before starting!
Finally, you’ll eventually notice that running your shell program on teach.cs through ssh will probably
result in a broken backspace. Running the command stty sane should be enough to sort this issue, so
consider adding this to your CSCSHELL’s init file, as well as the following path variable assignment to
access the basic utilities:
PATH=/local/bin:/usr/bin:/bin/
stty sane
You can download this as a .cscshell_init file here (https://q.utoronto.ca/courses/337029/files/30784621?
wrap=1) (https://q.utoronto.ca/courses/337029/files/30784621/download?download_frd=1) . Note that the
leading dot '.' in the filename means it is a hidden file. You can see hidden files with the -a option of ls,
for example combined with -l: ls -la
5.4. Execution
The name (for an executable file in one of the $PATH paths) or path (relative or absolute) to an
executable file is the first non-whitespace aspect of any shell “command” (or line) unless it is one of
these three (four) exceptions:
1. Variable assignment
2. The cd command (see 5.4.1)
3. Comments
4. (blank lines)
This means, aside from piping, in which case there are multiple commands, the name of the “command”,
i.e. some executable identified via $PATH or directly with an absolute or relative path, must be the first
part of each “command”. Subsequent strings on a line are arguments to this command, unless they
are cases of 5.6 or 5.5. Arguments are separated from each other (and the command) by one or more
spaces (not tabs, newlines, carriage returns, or other whitespace).
Standard execution of lines shall require one child process per executable that needs to be run. The
parent process for the shell must wait (using the wait family of system calls) for the child process to
complete.
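The one-child-per-executable rule can be sketched as follows. The names spawn and wait_for are made up for this example and are not the starter code's run_command or execute_line; the sketch also assumes the executable path has already been resolved.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts `exec_path` with `args` in a child process.
 * Returns the child's PID to the parent, or -1 if fork fails. */
pid_t spawn(const char *exec_path, char *const args[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the executable. */
        execv(exec_path, args);
        perror("execv"); /* only reached if execv failed */
        _exit(127);
    }
    return pid; /* Parent: hand the PID back so the caller can wait on it. */
}

/* Hypothetical helper: waits for one child and returns its exit status,
 * or -1 if waiting failed or the child did not exit normally. */
int wait_for(pid_t pid)
{
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Keeping the start and the wait in separate functions means several children can be started before any of them is waited on, which matters once pipes are involved (5.6).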
5.4.1. The cd command
Unlike ls , your other bread-and-butter shell command cd is not an executable. Notice that cd has no
meaningful stdin or stdout .
We’ve provided you the helper function cd_cscshell , which you should use in execute_line to actually
change directories. Your cd command must accept either no arguments or one additional argument.
With no arguments, the command should move to the current user’s home directory. Type man getpwuid
and notice the pw_dir member of struct passwd , and take a moment to understand what makes
prompt() in cscshell.c work.
If one or more arguments are provided, your cd should use the first argument as a relative or absolute
path and just ignore the rest if provided. The special paths of '.' and '..' should also work.
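The argument rules for cd boil down to a small decision, sketched below. cd_target is a hypothetical function used only to illustrate the home-directory lookup; the signature of the provided cd_cscshell is not shown in this section, so treat this as a picture of the idea rather than a replacement for that helper.

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: `args` is the NULL-terminated argument list of a cd
 * command, with args[0] being "cd". Returns the directory to change to, or
 * NULL if the home directory could not be looked up. */
const char *cd_target(char *const args[])
{
    /* One or more arguments: use the first and ignore the rest. */
    if (args[1] != NULL) {
        return args[1];
    }

    /* No arguments: fall back to the current user's home directory. */
    struct passwd *pw = getpwuid(getuid());
    return pw != NULL ? pw->pw_dir : NULL;
}
```

The special paths '.' and '..' need no extra handling here, because chdir already understands them as ordinary relative paths.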
5.5. File redirection
File redirection must be specified after all arguments for commands (see 5.4) have been specified. In
other words ls -l > out.txt is correct, but ls > out.txt -l is not (because the -l is after the
redirection). You must support all three redirection possibilities, and mixing up to one output redirection
and one input redirection:
1. Input redirection “<”, e.g. grep keyword < file_to_search.txt
2. Output redirection “>”, e.g. ls -l > output.txt
3. Output redirection as appending “>>”, e.g. ls -lah --author >> directory_history_snapshots.txt
With two re-directions, as in wc -l < contacts.txt > num_contacts.txt the order of input or output
redirection is irrelevant, either one may go first. You cannot mix both output redirection ( ">" ) as well as
appending the redirected output ( ">>" ).
We will not be implementing support for input appending ( << ).
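The three kinds of redirection differ only in the flags passed to open. Here is a minimal sketch, assuming a hypothetical helper named open_redirect and a permission mode of 0644 for newly created files (the handout does not specify a mode).

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: opens `path` for redirection and returns the file
 * descriptor, or -1 on failure.
 *   is_input != 0            -> "<"  (read only)
 *   is_input == 0, append 0  -> ">"  (create if needed, truncate)
 *   is_input == 0, append !0 -> ">>" (create if needed, append) */
int open_redirect(const char *path, int is_input, int append)
{
    int fd;

    if (is_input) {
        fd = open(path, O_RDONLY);
    } else {
        int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
        fd = open(path, flags, 0644); /* 0644 is an assumption */
    }

    if (fd < 0) {
        perror("open");
    }
    return fd;
}
```

In the child process, the returned descriptor would then be moved onto stdin or stdout with dup2(fd, STDIN_FILENO) or dup2(fd, STDOUT_FILENO) before the executable is started, and the original descriptor closed.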
5.6. Piping
Valid combinations of piping mixed with redirection should work, but there are invalid configurations
(though these can be overcome leveraging the program tee , see man tee ).
The following is an invalid combination, that BASH gracefully ignores:
ls -l > out.txt | grep l
You should not implement a tee -like solution (i.e. pipe stdout from ls and write to out.txt ) to handle
the above. You should either print an error, or simply write to out.txt without causing grep to hang
(waiting on a pipe).
So long as your shell program does not exit (i.e. segfault or intentionally exit operation), you may handle
invalid piping and redirection combinations as you see fit!
Otherwise, in general, a child process should be created for each executable listed on a single line. They
must, of course, be separated by a pipe character | (refreshing your memory on pipes in BASH may be
helpful).
The leftmost process should write its stdout to the write-end of a pipe, whose read-end is connected to
the stdin of the next program in the pipe sequence. The stdin for the first program is unaffected by
pipes, as is the final program’s stdout . That is, they should either connect to the default (terminal)
stdin/out or be subject to 5.5.
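For the simplest case of exactly two commands, the wiring described above might look like the sketch below. run_pair and child_exec are hypothetical names, and execvp is used as a stand-in so the example is self-contained; in the assignment, executables are located through the shell's own $PATH variable instead.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: connect one end of the pipe to stdin or stdout, close both
 * original pipe descriptors, then start the program. Never returns. */
static void child_exec(int from_fd, int to_fd, int fds[2], char *const argv[])
{
    if (dup2(from_fd, to_fd) < 0) {
        perror("dup2");
        _exit(126);
    }
    close(fds[0]);
    close(fds[1]);
    execvp(argv[0], argv); /* stand-in for the shell's own lookup */
    perror("execvp");
    _exit(127);
}

/* Hypothetical helper: runs `left | right` and returns the exit status of
 * `right`, or -1 on failure. */
int run_pair(char *const left[], char *const right[])
{
    int fds[2];
    if (pipe(fds) < 0) {
        perror("pipe");
        return -1;
    }

    pid_t l = fork();
    if (l == 0) {
        child_exec(fds[1], STDOUT_FILENO, fds, left); /* stdout -> write end */
    }
    pid_t r = (l > 0) ? fork() : -1;
    if (r == 0) {
        child_exec(fds[0], STDIN_FILENO, fds, right); /* stdin <- read end */
    }

    /* The parent must close both ends, or the reader never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    if (l < 0 || r < 0) {
        perror("fork");
        if (l > 0) {
            waitpid(l, NULL, 0);
        }
        return -1;
    }

    int status = 0;
    waitpid(l, NULL, 0);
    if (waitpid(r, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both children are started before either is waited on, so they run in parallel. Forgetting to close the write end in the parent is the usual reason the right-hand program hangs waiting on the pipe.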
5.7. Notes and Clarifications (check here for any updates)
5.7.1. Before assignment release
Leading whitespace before commands or any other function are acceptable and must be simply
ignored.
5.7.2 Feb 28th run_command and some suggestions
It should have been clearer in the source code comments, but run_command 's return value should be
the child's PID (HINT: run_command should only start one command at a time, multiple processes are
expected to be managed in execute_line ). The parent process (the original shell process) should be
the one returning this PID. Another HINT: it should return this to execute_line where the waiting
happens on all the running children.
Note: The parent process is always the one that is providing prompts, or reading from a script.
Getting parse_line (and consequently replace_variables_mk_line ) to be bug free is going to take
time, but try not to leave run.c until the last minute. As of week-7, you have everything you need to
start new processes and exec executables, but it won't be until week-8 that you will learn how to
make the pipes that connect processes, and the file-descriptor management that goes along with
this.
5.7.3 Mar 4th The parse_line implementation, Using '=' outside of variable assignment, and other
variables in new variable creation
For simplicity, feel free to assume that if a line contains a valid (not-commented) '=' character, the
intent of the line is a variable assignment. So you can assume: if = in line, then variable
assignment, without penalty. This of course breaks compatibility with command line args of the form
--option= unless you choose to handle = signs more carefully. As in: if = has no space before it,
then variable assignment; else it is command
execution. Thanks to Piazza @1702 for discussion.
The command struct returned by parse_line must be complete (as in, all fields contain their final
values), aside from the following: stdin_fd and stdout_fd may be used as you see fit, when you
see fit, and their values will not be tested directly (only the expected functionality of the shell itself will
be tested, as in, if I ls -l > file.txt then I'd expect at some point that stdout_fd be connected to
an open file called file.txt , but we will only test if the file actually gets the right bytes written to it!).
Thanks to Piazza @1708 for bringing this up.
5.2 has been updated, pertaining to: It is acceptable in this assignment for variables to only ever be
used in commands, as in, usage is not considered during creation (you don't need to support $VAR=Use
some $OTHER value , but you can support this if you want).
Other, more general clarifications:
The goal of this assignment is to complete a working shell
That is, after you run cscshell your terminal should work a lot like normal, just now it will be
slightly more limited (as described above), and use a shell implementation that you've written.
How are you expected to complete the assignment? There are no programmatic steps to follow...
Besides using the design provided.
Start from main, trace out what will need to happen to start a shell, which can execute
something like ls
The functions listed in cscshell.h are a bit of a puzzle to piece together through a couple
exercises like this
5.7.4 March 4th Part 2: Command structs
Clarification: The command struct's members will be tested and are as follows:
exec_path: (tests will examine this) The path to the literal executable run, returned by
resolve_executable
args: (tests will examine, but will ignore index 0) These are the argv arguments to your program.
You may re-use the exec_path for argv[0].
next: (tested to retrieve the whole list) the next command if a line contains multiple commands
stdin_fd: (untested) an integer to store fd's for the process' stdin
stdout_fd: (untested) same as above for stdout
redir_in_path: (tested) for input redirection, should match the exact (relative or absolute) path
provided on the original line
redir_out_path: (tested) same as above
redir_append: (tested) non-zero if and only if the command is redirecting and appending to the
destination
The only members considered/needed for a 'cd' command are exec_path and args[1], noting that
resolve_executable already returns a string literal for cd in this case.
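As a reading aid, the members listed above could be laid out as in the sketch below. The authoritative definition is the one in cscshell.h; the type name CommandSketch, the field types, and the count_commands helper are all assumptions made for this illustration.

```c
#include <stddef.h>

/* Illustrative layout only: field types are guesses, and the real struct in
 * cscshell.h takes precedence wherever the two differ. */
typedef struct command_sketch {
    char *exec_path;             /* path returned by resolve_executable (tested) */
    char **args;                 /* argv for the program; index 0 is ignored by tests */
    struct command_sketch *next; /* next command on the same line, or NULL */
    int stdin_fd;                /* untested: fd to use for the process' stdin */
    int stdout_fd;               /* untested: fd to use for the process' stdout */
    char *redir_in_path;         /* path after "<", exactly as typed (tested) */
    char *redir_out_path;        /* path after ">" or ">>", exactly as typed (tested) */
    int redir_append;            /* non-zero if and only if ">>" was used (tested) */
} CommandSketch;

/* Hypothetical helper: walks the `next` links to count the commands that
 * one parsed line produced. */
size_t count_commands(const CommandSketch *head)
{
    size_t n = 0;
    for (const CommandSketch *c = head; c != NULL; c = c->next) {
        n++;
    }
    return n;
}
```

A line such as ls -l | grep l would produce a list of length two, with the ls command at the head and the grep command reachable through next.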
5.7.5 Update to 5.3
The shell instruction suggestions originally in 5.3 were in the wrong order
It should be
PATH=/local/bin:/usr/bin:/bin/
stty sane
Download a copy of the file (https://q.utoronto.ca/courses/337029/files/30784478?wrap=1)
(https://q.utoronto.ca/courses/337029/files/30784478/download?download_frd=1)
5.7.6 Confirmation of ERR_PRINT and testing; invalid piping testing
Any string may be used to pass the automated tests, as long as you use the ERR_PRINT macro (or
equivalent).
Feel free to use whatever message makes the most sense to you; we will be checking for
"ERROR:" in stderr, or the lack thereof
Invalid combinations of piping and redirection will not be extensively tested, section 5.6 has been
clarified to this effect
"So long as your shell program does not exit (i.e. segfault or intentionally exit operation), you may
handle invalid piping and redirection combinations as you see fit!"
5.7.7 Parsing redirects and pipes without spacing
This question has come up a few times
If there are no spaces around redirect symbols or pipes, this is still considered valid
So you should handle cases like ls>out.txt and ls|grep file
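One way to handle the missing spaces is to treat the redirect and pipe symbols as tokens in their own right. The sketch below assumes a hypothetical next_token helper and a line whose trailing newline has already been removed; it is not the starter code's parse_line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next token at or after *pos.
 * Skips spaces, sets *pos to the start of the token, and returns the
 * token's length (0 when the end of the line is reached). The caller
 * advances with *pos += length before calling again. */
size_t next_token(const char *line, size_t *pos)
{
    size_t i = *pos;

    while (line[i] == ' ') {
        i++;
    }
    *pos = i;

    if (line[i] == '\0') {
        return 0;
    }
    /* ">>" must be checked before ">" so it is not split in two. */
    if (line[i] == '>' && line[i + 1] == '>') {
        return 2;
    }
    if (line[i] == '>' || line[i] == '<' || line[i] == '|') {
        return 1;
    }
    /* An ordinary word ends at a space or at the next operator. */
    return strcspn(line + i, " <>|");
}
```

With this approach ls>out.txt yields the tokens ls , > and out.txt , exactly as ls > out.txt would.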
5.7.8 Minor clarifications from Piazza
You don't need to support && or ; operators. Or while or for or if for that matter :)
If you can't make a valid command struct from the line, you should throw an error in parse_line
The sanity tests are there for some feedback. They reflect your code's conformance to my solution,
not the more flexible notion of cscshell behaviour. If your shell is "working" on teach.cs but not passing tests, it
could just be because, say, you managed your file descriptors in different functions than I did. My
guess is that you are calling dup in execute_line rather than run_command, and that's fine, but the
sanity tests won't work.
We will run your shell as a standalone program for the remaining tests, and not rely on these sorts of
design decisions, so the final results will be different.
6. Marking
This assignment will be marked in a very similar fashion to what you’ve seen so far.
75% will test the core functionality outlined in this document
15% will be style graded by the TA as per the rest of the course
5% will be specifically awarded for the correct usage of perror for all system calls
5% will be specifically awarded for effective parallelism using pipes