Using an ssh tunnel
When you run your proxy server on turing, the port you open is not reachable from the outside world. However, you can make a "tunnel" to that port from your local machine (laptop, home computer, etc.). Then set up your local web browser to use the local end of that tunnel as its proxy.
How to configure your web browser to use a proxy server
In all of these, you'll need to set the HTTP server to "turing.slu.edu" and set the port to the port your proxy server is using (see the assignment for your range of ports).
Firefox
On turing and the linux lab machines, you can set up Firefox to use proxies.
Click the settings gear, choose Advanced, then click "Set up how Firefox connects to the internet", then choose Proxies.
Mac OS X
If you're using a Mac, you can set up an HTTP proxy in the Network pane
of System Preferences, by choosing Advanced and then Proxies.
Windows
I haven't tested this, but you should be able to set up a proxy via the Tools
menu of Internet Explorer. Choose Internet Options, then the Connections tab,
and then the Settings button. Let me know if this works!
Running proxy in the LinuxLab
If you sit down at a linux lab machine, you are not using turing. So if you run
a server, such as proxy or ilisten, that server is running at linuxlab##.mcs.slu.edu,
and you'll need to use that address to connect to it, for example:
telnet linuxlab4.mcs.slu.edu 9030
Since the linuxlab machines are on their own local network,
connecting to a server on one of those machines
will probably only work from turing or another machine in the lab.
Other notes:
Stream buffering
When you use fdopen() to turn a connected file descriptor into a FILE *, the
FILE stream is buffered, which means that it holds data in a buffer until
it has enough to be worth sending across the network. Read the man page
for setlinebuf() for details.
With proxy, you will need to send information to a web server, and you need
to ensure it gets through the buffer and is actually sent across the network.
You can do this manually with fflush(), which ensures that data is flushed
out of the buffer and to the network. Or, you can use setlinebuf() to change
the buffering behavior of the stream.
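As a minimal sketch of the two options above (the helper names send_line and make_line_buffered are made up for illustration, not part of the assignment): the first flushes by hand after every write, the second switches the stream to line buffering once, using setvbuf() with _IOLBF, which is the standard C equivalent of setlinebuf().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write one line to a buffered stream, then flush
 * so the bytes leave the stdio buffer right away. */
int send_line(FILE *out, const char *line)
{
    assert(out != NULL && line != NULL);
    if (fputs(line, out) == EOF)
        return -1;
    return fflush(out) == EOF ? -1 : 0;
}

/* Hypothetical helper: make the stream line buffered. Call it once,
 * right after fdopen() and before any other I/O on the stream. */
int make_line_buffered(FILE *out)
{
    assert(out != NULL);
    return setvbuf(out, NULL, _IOLBF, 0) == 0 ? 0 : -1;
}
```

With line buffering, data is sent whenever a newline is written, so a request whose lines all end in "\r\n" should go out without explicit fflush() calls.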
Connection: Close
In HTTP/1.0, a TCP connection was created for each request/response, which
wasted resources creating lots of TCP connections.
With HTTP/1.1, clients can send more than one request and receive more than
one response over a single TCP connection. Such persistent connections are the
default in HTTP/1.1; HTTP/1.0 clients ask for them with the
Connection: Keep-Alive header.
This is tricky to handle properly. If you don't handle it properly, you'll
see pages load very slowly (30+ seconds) as the browser waits for connections to
time out. For an easy workaround, change all "Connection:" headers to
Connection: Close when forwarding the client request.
Don't use fgets/fputs to copy a server response
You're going to connect to a web server, send it a request, and then copy
its response back to the client. That web server can send back binary data,
which may include zero bytes. fgets cannot handle embedded zero bytes,
since those terminate a C string.
If you use fgets/fputs, you'll find your proxy works for simple
text-only websites and fails for more complicated sites that compress their
response or include images.
Instead, the easiest thing to do is to simply use read() and write() calls
on the socket (as opposed to the FILE * created by fdopen()).
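A minimal sketch of such a copy loop, assuming "from" is the socket connected to the web server and "to" is the socket connected to the client (the name copy_fd is made up):

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy raw bytes from one descriptor to another
 * until end of file. Returns the number of bytes copied, or -1 on error.
 * Nothing here treats the data as a string, so zero bytes are safe. */
long copy_fd(int from, int to)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    assert(from >= 0 && to >= 0);
    while ((n = read(from, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        /* write() may accept fewer bytes than requested, so loop
         * until the whole chunk has been written. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(to, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

The loop ends when read() returns 0, which happens once the server closes its side of the connection. That is one reason the Connection: Close workaround is convenient.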
A more sophisticated approach would be to use fgets to read the
response header, parse it to find the Content-Length: field,
and only then switch to read/write to copy the message body.
This approach is not without its hazards - see the BUGS section of the
fgets man page if you plan to try it.
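For the header-parsing step, something along these lines might work. It is only a sketch: parse_content_length is a hypothetical name, and it assumes each header line has already been read into a string with fgets.

```c
#include <assert.h>
#include <stdlib.h>
#include <strings.h>   /* strncasecmp() */

/* Hypothetical helper: if line is a Content-Length header, return its
 * value; otherwise (or if the value is not a number) return -1. */
long parse_content_length(const char *line)
{
    char *end;
    long n;

    assert(line != NULL);
    /* "Content-Length:" is 15 characters; header names ignore case. */
    if (strncasecmp(line, "Content-Length:", 15) != 0)
        return -1;
    /* strtol() skips the optional spaces after the colon. */
    n = strtol(line + 15, &end, 10);
    if (end == line + 15 || n < 0)
        return -1;                 /* no digits, or a negative value */
    return n;
}
```

Keep the hazard in mind: by the time fgets has returned the last header line, the FILE stream may already have pulled part of the body into its own buffer, so a read() on the raw socket can miss those bytes.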
How do I return "404 Not Found"?
You need to generate a correctly formatted HTTP response.
At the very least, that's
HTTP/1.1 404 Not Found
[blank line]
(where each line is terminated with "\r\n").
However, I found that this didn't actually display in the browsers.
So, I sent a message body as well, which requires some extra work:
HTTP/1.1 404 Not Found
Content-Length: [calculated value]
Content-Type: text/html
[blank line]
and then printed some html code for the error response web page.
Actually, I printed the html code into a string (sprintf) so I could use strlen() to get the Content-Length, then printed the string.
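Here is a rough sketch of that idea, not the exact code from my proxy. The name build_404 and the text of the error page are made up, and it uses snprintf rather than sprintf so it cannot overrun the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a complete 404 response into buf.
 * Returns the length of the response, or -1 if buf is too small. */
int build_404(char *buf, size_t cap)
{
    /* The page text is invented for illustration. */
    static const char body[] =
        "<html><body><h1>404 Not Found</h1></body></html>\r\n";
    int n;

    assert(buf != NULL);
    n = snprintf(buf, cap,
                 "HTTP/1.1 404 Not Found\r\n"
                 "Content-Length: %zu\r\n"   /* length of the body only */
                 "Content-Type: text/html\r\n"
                 "\r\n"                      /* blank line ends the header */
                 "%s",
                 strlen(body), body);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

The caller would then send the first n bytes of buf to the client. Note that Content-Length counts only the bytes after the blank line, not the header itself.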
You can see my messages by running my proxy server, connecting with telnet,
and giving it a request with a nonexistent Host: field, such as:
GET / HTTP/1.1
Host: not.really.a.host.at.all
[blank line]
You can see other web servers' error messages the same way: use telnet to connect to port 80 on your favorite web server and send the above request.