Hoxy is a completely free, open source HTTP hacking API for Node.js. It operates in the same ballpark as Charles or Fiddler. A few key features:
Here are a few ideas for things you could do.
These docs are for Hoxy 3.x. See the incantations section for code samples.
Hoxy functions as a normal proxy standing between the client and server. You intercept traffic during either the request or response phase, or both. The phase lifecycle section contains a more complete description, but here's a simplified illustration.
    time ==>
    -----------------------
    server:       3
    -------------/-\-------
    hoxy:       2   4
    -----------/-----\-----
    client:   1       5
Hoxy works on Node 0.12.x or higher, and io.js 1.0 or higher. It may work in older Nodes to varying degrees. Here's a quick sample program that tampers with request headers.
    let hoxy = require('hoxy');
    let proxy = hoxy.createServer().listen(8080);
    proxy.intercept('request', req => {
      req.headers['x-unicorns'] = 'unicorns';
      // server will now see the "x-unicorns" header
    });
After running the above, configure your client to proxy through yourmachine:8080.
As long as you know JavaScript and HTTP, you can inspect and manipulate your own web traffic in arbitrary ways.
Hoxy is available on both npm and github. Tests pass on recent node and io.js.
$ npm install hoxy
var hoxy = require('hoxy');
The object returned from require('hoxy') is just a namespace.
Represents a proxy server.
An instance of this is returned from hoxy.createServer().
Factory for a new proxy with given options. All options are optional.
var hoxy = require('hoxy'); var proxy = hoxy.createServer({ upstreamProxy: 'localhost:9090', reverse: 'http://example.com', certAuthority: { key, cert }, tls: { key, cert }, slow: { rate, latency, up, down } });
name | description |
---|---|
upstreamProxy | Optional. If present, this proxy will in turn use another proxy. This allows Hoxy to play well with other proxies. This value should take the form host:port. |
reverse | Optional. If present, this proxy will run as a reverse proxy for the given server. This allows you to point your client directly at the proxy, instead of configuring it in the client's proxy settings. This value should take the form scheme://host:port. |
certAuthority { key, cert } | Optional. If present, this should contain a key/cert combo representing a certificate authority that your client trusts. See these instructions for how to generate these files. You'll then need to configure your client to use this proxy for https in addition to http. Once you've got all of that set up, Hoxy will generate fake key/cert combos for every hostname you visit, caching them in memory for subsequent visits, thus allowing the proxy to handle https requests as cleartext. |
tls { key, cert } | Optional. Should only be used in combination with reverse. If present, causes Hoxy to run as an https server. Passed as opts to https.createServer(opts, function) (see the https module docs). |
slow { rate, latency, up, down } | Optional. up and down are independent. However, rate and up (if they both exist) form a pipeline, the slower of the two governing the total speed. The same is true of rate and down. These properties can also be get/set by using proxy.slow(). |
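The pipeline rule for slow can be made concrete with a small sketch. It assumes rate, up and down are all caps in bytes per second and that a missing value means "no cap". The helper name effectiveRates is hypothetical and not part of Hoxy.

```javascript
// Hypothetical helper, not part of Hoxy: models "the slower of the two governs".
function effectiveRates(slow) {
  // A missing cap is treated as unlimited (an assumption of this sketch).
  const cap = (a, b) => Math.min(
    a === undefined ? Infinity : a,
    b === undefined ? Infinity : b
  );
  return {
    up: cap(slow.rate, slow.up),     // upload direction
    down: cap(slow.rate, slow.down)  // download direction
  };
}

console.log(effectiveRates({ rate: 100000, up: 20000 }));
// { up: 20000, down: 100000 }
```

Here the upload side is governed by up because it is slower than rate, while the download side falls back to rate alone.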
Starts the proxy listening on port.
Returns itself.
var hoxy = require('hoxy'); var proxy = hoxy.createServer().listen(8080);
A callback may be provided, to run when the proxy has started listening.
var hoxy = require('hoxy'); var port = 8080; var proxy = hoxy.createServer().listen(port, function() { console.log('The proxy is listening on port ' + port + '.'); });
This method simply passes its arguments to Node's server.listen() method.
Get/set proxy-level slow options.
If options is provided, it's a setter.
proxy.slow({ rate, latency, up, down });
If options is not provided, it's a getter.
var slowOpts = proxy.slow();
This is the entry point for intercepting and operating on requests and responses. This first example intercepts all requests.
proxy.intercept('request', req => console.log(req.url));
This is more verbose, but identical to the above.
proxy.intercept({ phase: 'request' }, function(req, resp, cycle) { console.log(req.url); });
The callback passed to intercept() (AKA the interceptor) receives three arguments (req, resp, cycle), which are instances of classes described below.
cycle is also passed as context.
Thus:
    proxy.intercept('request', function(req, resp, cycle) {
      console.log(this === cycle); // true
    });
Hoxy runs everything in series, including interceptors. That happens naturally when your interceptor logic is synchronous. But if it's asynchronous, you must signal that by returning a promise. As long as you do that, Hoxy waits until the promise resolves before moving on, and things flow serially.
    proxy.intercept('request', function(req, resp, cycle) {
      // return a promise from the interceptor
      return getThingA().then(function(thingA) {
        return getThingB(thingA);
      }).then(function(thingB) {
        // do something with "thingB"
      });
    });
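To see why returning a promise matters, here is a rough sketch of how a series runner could behave. This is not Hoxy's actual implementation; runInSeries and the two toy interceptors are made up for illustration.

```javascript
// Hypothetical sketch: call interceptors one at a time, waiting on any returned promise.
function runInSeries(interceptors, req, resp, cycle) {
  return interceptors.reduce(function (chain, interceptor) {
    return chain.then(function () {
      // Returning a promise here makes the chain wait for it;
      // a synchronous interceptor simply returns undefined.
      return interceptor.call(cycle, req, resp, cycle);
    });
  }, Promise.resolve());
}

const order = [];
const done = runInSeries([
  // async interceptor: signals completion via the returned promise
  () => new Promise(resolve => setTimeout(() => { order.push('slow'); resolve(); }, 20)),
  // sync interceptor: runs only after the first one has resolved
  () => { order.push('fast'); }
], {}, {}, {});

done.then(() => console.log(order)); // [ 'slow', 'fast' ]
```

If the first interceptor forgot to return its promise, a runner like this would move on immediately and 'fast' would be recorded first.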
Future versions of JavaScript will allow async functions. Some transpilers allow you to use this syntax today. These can be used as interceptors, too. Written this way, your logic and your error handling are much cleaner.
    proxy.intercept('request', async function(req, resp, cycle) {
      // this is an async function
      let thingA = await getThingA();
      let thingB = await getThingB(thingA);
      // do something with "thingB"
    });
In current versions of io.js, generators can be used as a stand-in for async functions, but with the benefit of being natively-supported, spec-valid ES6 syntax. You must only yield promises from your generator. See co for more details.
    proxy.intercept('request', function*(req, resp, cycle) {
      // this is a generator
      let thingA = yield getThingA();
      let thingB = yield getThingB(thingA);
      // do something with "thingB"
    });
Note: Strictly speaking, Hoxy doesn't care whether your interceptor is an async function or a generator. It only cares what kind of thing the return value is:
Options affect how, or if, interceptors get called.
Here are a few examples.
The following will only intercept GET requests to example.com.
proxy.intercept({ phase: 'request', method: 'GET', hostname: 'example.com' }, function(req, resp, cycle) { console.log('request made to: '+req.url); });
The following only intercepts text/html responses.
When it does, it exposes the response body as a readable/writable $ variable.
proxy.intercept({ phase: 'response', mimeType: 'text/html', as: '$' }, function(req, resp, cycle) { resp.$('title') .text('all your titles are belong to us'); });
The following only intercepts responses with a declared charset.
When it does, it exposes the response body as the string variable.
proxy.intercept({ phase: 'response', contentType: /charset/i, as: 'string' }, function(req, resp, cycle) { console.log(resp.string); });
The following only intercepts responses from an /api/users JSON endpoint.
When it does, it exposes the response body as the json variable.
It uses a URL pattern to match the endpoint URL instead of a regex.
proxy.intercept({ phase: 'response', fullUrl: 'http://example.com/api/users/:id', mimeType: 'application/json', as: 'json' }, function(req, resp, cycle) { console.log(resp.json.email_address); });
name | type | required | description |
---|---|---|---|
phase |
string | yes |
Which phase to intercept.
See phase lifecycle for more info.
Accepted values:
|
as |
string | no |
Expose the request or response body (depending on the phase) as data of a certain type.
If there's an error parsing the body into this form, the intercept action is skipped and a warning is logged.
Hoxy normally streams request and response bodies through.
If as is present, hoxy buffers the request or response body into memory.
Accepted values:
'json' with mimeType:'application/json' .
|
Filtering options (these are logically ANDed together) | |||
name | type | required | description |
protocol |
string, regex, function | no | Match the request protocol. |
method |
string, regex, function | no | Match the all-uppercase HTTP request method. |
hostname |
string, regex, function | no | Match the host, not including :port. |
port |
number, string, regex, function | no | Match the port number. |
url |
string, regex, function | no |
Match the request URL.
Patterns like /foo/* are allowed.
See route-pattern.
|
fullUrl |
string, regex, function | no |
Match the full request URL including protocol and hostname.
Patterns like /foo/* are allowed.
See route-pattern.
|
contentType |
string, regex, function | no |
Match the full content-type header of the request or response (depending on the phase).
|
requestContentType |
string, regex, function | no |
Same as contentType but only matches request.
|
responseContentType |
string, regex, function | no |
Same as contentType but only matches response.
|
mimeType |
string, regex, function | no |
Match just the mime type portion of the content-type header of the request or response (depending on the phase). I.e., if the entire header is "text/html; charset=utf-8" , just match the "text/html" part.
|
requestMimeType |
string, regex, function | no |
Same as mimeType but only matches request.
|
responseMimeType |
string, regex, function | no |
Same as mimeType but only matches response.
|
Note: filtering options can be different types.
==
) match.
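As a rough mental model of the filtering options, the sketch below shows how string, regex and function filters could be evaluated and ANDed together, and how the mime type portion can be split off a content-type header. It is not Hoxy's code: matchesOne, matchesAll and mimeTypeOf are hypothetical names, and using loose equality for plain values is an assumption.

```javascript
// Hypothetical sketch of filter matching; not Hoxy's actual implementation.
function matchesOne(filter, value) {
  if (typeof filter === 'function') return Boolean(filter(value));
  if (filter instanceof RegExp) return filter.test(value);
  return filter == value; // loose equality, so port 8080 matches '8080'
}

// All supplied filters must match (logical AND).
function matchesAll(filters, actual) {
  return Object.keys(filters).every(function (name) {
    return matchesOne(filters[name], actual[name]);
  });
}

// Extract just the mime type from a full content-type header.
function mimeTypeOf(contentType) {
  return String(contentType || '').split(';')[0].trim().toLowerCase();
}

const actual = {
  method: 'GET',
  hostname: 'example.com',
  port: 8080,
  mimeType: mimeTypeOf('text/html; charset=utf-8')
};

console.log(matchesAll({ method: /^get$/i, port: '8080', mimeType: 'text/html' }, actual)); // true
console.log(matchesAll({ hostname: h => h.endsWith('.org') }, actual)); // false
```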
Deals with various logging events.
This first example listens for error, warn, and debug logging events, and prints them to stderr.
proxy.log('error warn debug');
Or, print logging events to various writable streams.
proxy.log('error warn debug', process.stderr); proxy.log('info', process.stdout);
Or, explicitly handle logging events.
proxy.log('error warn', function(event) { console.error(event.level + ': ' + event.message); if (event.error) console.error(event.error.stack); });
name | description |
---|---|
error | When something bad happens that we wish wouldn't have happened. |
warn | When something iffy happened that we can probably tolerate. |
info | When something noteworthy happened that normal people care about. |
debug | When something boring happened that developers care about. |
Stops the proxy from receiving requests. Finalizes and/or cleans up any resources the proxy uses internally.
    proxy.close(function(err) {
      // optional callback
      if (err) { throw err; }
      console.log('The proxy is no longer accepting new connections.');
    });
Represents a request.
An instance of this is passed as the first argument to every interceptor.
Altering its values changes what the server sees.
If you change the hostname, it changes which server sees the request.
Protocol of the request.
Destination server hostname, sans port.
Destination server port.
All-caps HTTP method used. Lowercase values are converted to uppercase.
HTTP request header name/value JS object.
Header names are all-lowercase, e.g. accept-encoding.
Root-relative request URL, including query string, like /foo/bar?baz=qux
An object representing querystring params in the URL.
For example, if the URL is /foo/bar?baz=qux, then this object will look like { baz: 'qux' }.
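The mapping from URL to query object can be reproduced with Node's own URL parser. This is only an illustration of the shape described above: queryOf is a hypothetical helper, the placeholder base URL exists only so a root-relative URL can be parsed, and Hoxy's handling of repeated keys may differ from the "last one wins" behavior here.

```javascript
const { URL } = require('url');

// Hypothetical helper: derive a plain query object from a root-relative URL.
function queryOf(rootRelativeUrl) {
  // The base is a throwaway placeholder needed by the URL constructor.
  const parsed = new URL(rootRelativeUrl, 'http://placeholder.invalid');
  const query = {};
  parsed.searchParams.forEach((value, key) => { query[key] = value; });
  return query;
}

console.log(queryOf('/foo/bar?baz=qux')); // { baz: 'qux' }
```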
Request body parsed as JSON.
This is only present if you intercept the request as:'json'.
Changes made to this object will be seen by the server.
    proxy.intercept({
      phase: 'request',
      method: 'PUT',
      fullUrl: 'http://example.com/users/:id',
      as: 'json'
    }, function(req, resp, cycle) {
      req.json.prefs.subscriptions = 'all';
    });
Request body parsed as form-url-encoded params.
This will be a key/value POJO.
This object will only be present if you intercept the request as:'params'.
Changes made to this object will be seen by the server.
proxy.intercept({ phase: 'request', method: 'POST', mimeType: 'application/x-www-form-urlencoded', as: 'params' }, function(req, resp, cycle) { console.log(req.params.email); });
Note, parameters from the URL querystring are not included in this object.
Request body string.
This is only present if you intercept the request as:'string'.
Overwriting this will overwrite the request body sent to the server.
    proxy.intercept({
      phase: 'request',
      fullUrl: 'http://example.com/users/:id',
      as: 'string'
    }, function(req, resp, cycle) {
      console.log(req.string);
    });
Request body binary buffer.
This is only present if you intercept the request as:'buffer'.
Changes made to this object will be seen by the server.
    proxy.intercept({
      phase: 'request',
      method: 'POST',
      fullUrl: 'http://example.com/images',
      as: 'buffer'
    }, function(req, resp, cycle) {
      // req.buffer contains uploaded image
    });
Simulates slowness during request phase. With this method you can set a minimum latency and/or maximum transfer rate. Since these are minimum/maximum, if your native connection is already slower than these values, this method will have no effect.
    // Simulate upload speed of 10,000 bytes per second
    proxy.intercept('request', function(req, resp, cycle) {
      req.slow({rate:10000}); // bytes per second
    });
    // Simulate a 500-1000ms delay on every request.
    // Note: randint() is not provided by Hoxy; supply your own random-integer helper.
    proxy.intercept('request', function(req, resp, cycle) {
      req.slow({latency:randint(500, 1000)});
    });
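The latency example calls randint(), which is neither defined in the sample nor, as far as these docs show, provided by Hoxy. One possible definition, assuming an inclusive range of integers, is sketched here.

```javascript
// Hypothetical helper (not part of Hoxy): random integer in [min, max], inclusive.
function randint(min, max) {
  return min + Math.floor(Math.random() * (max - min + 1));
}

console.log(randint(500, 1000)); // some integer between 500 and 1000
```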
If url is provided, sets the request's absolute protocol, hostname, port and url.
Otherwise it returns the absolute URL of this request.
This is mainly a convenience method.
    // Log every request through the proxy.
    proxy.intercept('request', function(req, resp, cycle) {
      console.log(req.fullUrl());
    });
Whatever request body gets sent to the server, tee() pipes an identical copy to your writable stream.
Your stream is held in memory, and only gets written to if and when the request is sent to the server.
In other words, your stream sees whatever the server sees.
If the server sees nothing, your stream sees nothing.
You can tee() as many times as you want.
proxy.intercept({ phase: 'request', method: /post|put/i, mimeType: 'application/json' }, function(req, resp, cycle) { req.tee(fs.createWriteStream('./uploaded.json')); });
Represents a response.
An instance of this is passed as the second argument to every interceptor.
This object is unpopulated during the request and request-sent phases.
Altering its values changes the response to the client.
HTTP status code being sent to the client.
HTTP response header name/value JS object.
Header names are all-lowercase, such as 'content-type'.
Response body parsed as DOM.
This object is only present if you intercept the response as:'$'.
This is a cheerio object, which provides a jQuery-like API.
Changes made to it will be seen by the client.
    proxy.intercept({
      phase: 'response',
      fullUrl: 'http://example.com/page.html',
      as: '$'
    }, function(req, resp, cycle) {
      // change the title of the page
      resp.$('title').text('Fake Title!');
    });
Response body parsed as JSON.
This is only present if you intercept the response as:'json'.
Changes to this object will be seen by the client.
    proxy.intercept({
      phase: 'response',
      method: 'GET',
      fullUrl: 'http://example.com/users/123',
      as: 'json'
    }, function(req, resp, cycle) {
      // arbitrarily manipulate the response json
      resp.json.foo = 'bar';
    });
Response body string.
This is only present if you intercept the response as:'string'.
Overwriting this will overwrite the response body sent to the client.
    proxy.intercept({
      phase: 'response',
      fullUrl: 'http://example.com/page.html',
      as: 'string'
    }, function(req, resp, cycle) {
      // print page to log
      console.log(resp.string);
    });
Response body binary buffer.
This is only present if you intercept the response as:'buffer'.
Changes made to this object will be seen by the client.
    proxy.intercept({
      phase: 'response',
      fullUrl: 'http://example.com/image.jpg',
      as: 'buffer'
    }, function(req, resp, cycle) {
      // resp.buffer contains a jpg
    });
Simulates a slow response. With this method you can set a minimum latency and/or maximum transfer rate. Since these are minimum/maximum, if your native connection is already slower than these values, this method has no effect.
    // Simulate download speed of 100,000 bytes per second
    proxy.intercept('response', function(req, resp, cycle) {
      resp.slow({rate:100000}); // bytes per second
    });
    // Simulate a 500-1000ms delay on every response.
    // Note: randint() is not provided by Hoxy; supply your own random-integer helper.
    proxy.intercept('response', function(req, resp, cycle) {
      resp.slow({latency:randint(500, 1000)});
    });
Whatever response body gets sent to the client, tee() pipes an identical copy to your writable stream.
Your stream is held in memory, and only gets written to when the response is sent to the client.
In other words, your stream sees whatever the client sees.
You can tee() as many times as you want.
proxy.intercept({ phase: 'response', mimeType: 'image/gif' }, function(req, resp, cycle) { resp.tee(fs.createWriteStream('./image.gif')); });
Represents a whole request/response cycle.
A Cycle instance is this in all interceptor calls, and the same instance is shared across an entire request/response cycle.
It's also passed as the third argument, in order to support arrow functions.
It provides a small number of methods that aren't specific to either the request or the response.
Provisions responses from the local filesystem.
Generally, the reason you'd do this is to be able to edit those files locally and test them as if they were live on the remote server.
This action populates the response object; see response population for more info.
The completion of this action is asynchronous, so serve() returns a promise.
Example.
proxy.intercept({ phase: 'request', fullUrl: 'http://example.com/main.js' }, function(req, resp, cycle) { return cycle.serve('/Users/gr123/test/main.js'); });
Or the more verbose-but-identical...
proxy.intercept({ phase: 'request', fullUrl: 'http://example.com/main.js' }, function(req, resp, cycle) { return cycle.serve({ path: '/Users/gr123/test/main.js' }); });
name | type | required | description |
---|---|---|---|
path | string | no | Which file to serve. Defaults to the request URL. Normally this would be used in mutual exclusion with docroot. Strictly speaking, path is always rooted to docroot, which defaults to "/". |
docroot | string | no | Which local directory to serve out of. Defaults to filesystem root "/". |
strategy | string | no | Mainly relevant when using the docroot option. Describes the relationship between the local docroot and the remote one. Strictly speaking, this controls what happens when the local docroot is missing a requested file. Accepted values: |
The returned promise resolves after the response has been populated. There are at least three use cases worth mentioning.
Use case #1: Serve a specific file. First:
$ curl http://example.com/js/main.js > main.js
...then:
proxy.intercept({ phase: 'request', fullUrl: 'http://example.com/js/main.js' }, function(req, resp, cycle) { return cycle.serve(__dirname + '/main.js'); });
Use case #2: Serve out of a local docroot. First:
    $ mkdir js
    $ curl http://example.com/js/main.js > js/main.js
    $ curl http://example.com/js/some-lib.js > js/some-lib.js
    $ curl http://example.com/js/other-lib.js > js/other-lib.js
    ...
...then:
proxy.intercept({ phase: 'request', hostname: 'example.com', url: /^\/js\/.*/ }, function(req, resp, cycle) { return cycle.serve({ docroot: __dirname, strategy: 'overlay' }); });
Use case #3: Serve out of a local docroot. Similar to above, but automatically downloads the files for you, instead of having to curl them as in the above example. You control which files get downloaded locally by the filtering options you provide.
proxy.intercept({ phase: 'request', hostname: 'example.com', url: /^\/js\/.*/ }, function(req, resp, cycle) { return cycle.serve({ docroot: __dirname, strategy: 'mirror' }); });
Stores and retrieves data on a cycle instance. This is useful since the same instance is shared across all interceptors for a given request/response cycle, allowing you to share related data across disparate scopes. With two params this method behaves as a setter, with one param as a getter.
    ['request', 'request-sent', 'response', 'response-sent'].forEach(function(phase) {
      proxy.intercept(phase, function(req, resp, cycle) {
        cycle.data(phase, Date.now());
      });
    });
    proxy.intercept('response-sent', function(req, resp, cycle) {
      var reqReceived = cycle.data('request');
      var reqSent = cycle.data('request-sent');
      var respReceived = cycle.data('response');
      var respSent = cycle.data('response-sent');
      // now print some profiling data
    });
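The "print some profiling data" step could look something like the sketch below, which turns the four stored timestamps into durations. summarizeTimings is a hypothetical helper, and the labels are an interpretation: each figure also includes whatever time interceptors themselves took in between.

```javascript
// Hypothetical helper: turn the four phase timestamps (ms) into durations.
function summarizeTimings(t) {
  return {
    upload: t['request-sent'] - t['request'],        // request -> request-sent
    serverWait: t['response'] - t['request-sent'],   // request-sent -> response
    download: t['response-sent'] - t['response'],    // response -> response-sent
    total: t['response-sent'] - t['request']
  };
}

console.log(summarizeTimings({
  'request': 1000,
  'request-sent': 1040,
  'response': 1290,
  'response-sent': 1350
}));
// { upload: 40, serverWait: 250, download: 60, total: 350 }
```

Inside the response-sent interceptor above, the argument would be built from the cycle.data() getters rather than from literal numbers.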
Hoxy is not Magic Cargo From The Sky Gods™. It has a fairly simple core architecture, and works in a predictable way. In order to assist the training of acolytes, however, the Magic Sky Gods have inspired the prophets to write down these incantations in The Book Of Spells.
This incantation makes all titles say "Unicorns!" by manipulating the response as DOM.
proxy.intercept({ phase: 'response', mimeType: 'text/html', as: '$' }, function(req, resp) { resp.$('title').text('Unicorns!'); });
This spell simulates a slow connection. This is done at the proxy level since two simultaneous requests would share the same connection. Thus, for example, a limit of 100000 bytes per second shared by two requests would result in a speed of 50000 bytes per second for each.
hoxy.createServer({ slow: { rate: 100000 } }).listen(8080);
This spell simulates a slow website. It's as if just one website were slow, even if the rest of them run at normal speed.
proxy.intercept({ phase: 'response', hostname: 'www.google.com' }, function(req, resp) { resp.slow({ rate: 10000 }); });
One can invoke the deep magic of generator functions. Which are actually just functions that can pause while they're running. This is useful because async stuff can happen while things are paused. But you need to yield promises, which means using an adapter to turn callbacks into promises.
var adapt = require('ugly-adapter'); var fs = require('fs'); proxy.intercept({ phase: 'response', hostname: 'mysite.com', mimeType: 'text/html', as: '$' }, function*(req, resp) { var headerHtml = yield adapt(fs.readFile, 'path/to/header.html'); resp.$('#header').html(headerHtml); });
If one speaks the language of Babel, one can invoke the even deeper magic of async functions.
var adapt = require('ugly-adapter'); var fs = require('fs'); proxy.intercept({ phase: 'response', hostname: 'mysite.com', mimeType: 'text/html', as: '$' }, async function(req, resp) { var headerHtml = await adapt(fs.readFile, 'path/to/header.html'); resp.$('#header').html(headerHtml); });
There's an incantation for intercepting HTTPS traffic, too. But you have to burn some extra incense. First, create your very own self-signed Certificate Authority. (You should only ever need to do this once.)
    # Create the key
    openssl genrsa -out ~/.ssh/my-private-root-ca.key.pem 2048
    # Create the cert
    openssl req -x509 -new -nodes -key ~/.ssh/my-private-root-ca.key.pem -days 1024 -out ~/.ssh/my-private-root-ca.crt.pem -subj "/C=US/ST=Utah/L=Provo/O=ACME Signing Authority Inc/CN=example.com"
Next, add the above root cert to your list of trusted cert authorities. How to do this varies. For example, Firefox maintains its own trusted list, while Chrome uses the OS's list. For specific details, consult the Ninth Scroll of P'Ki (or search the web). Once that's done, launch Hoxy, passing in your trusted cert and its private key.
hoxy.createServer({ certAuthority: { key: fs.readFileSync('/Users/you/.ssh/my-private-root-ca.key.pem'), cert: fs.readFileSync('/Users/you/.ssh/my-private-root-ca.crt.pem') } }).listen(8080);
Finally, configure your client to proxy both HTTP and HTTPS through localhost:8080.
Hoxy will use the fake cert authority to spoof certificates from any HTTPS sites it encounters, decrypting and intercepting them as cleartext using the existing intercept API.
See also: basic explanation of HTTPS proxying.
Suppose you want to intercept your own web traffic to reddit.com.
One option is to create a proxy, configure your client to use it, then visit reddit.com.
If client configuration isn't feasible, another option is to use a reverse proxy.
In this case you'd visit a URL like http://localhost:8080/ which would basically mirror the content of reddit.com.
This is fairly easy to set up:
hoxy.createServer({ reverse: 'http://www.reddit.com' }).listen(8080);
Now you can visit http://localhost:8080/ and reddit will be there.
Strictly speaking, by visiting the proxy directly in your browser rather than configuring it as a proxy, you're depriving the proxy of the scheme ("http:") and host ("www.reddit.com") information in the URL.
The reverse option provides those missing pieces of information, allowing you to visit the proxy directly.
Suppose you wanted to reverse proxy to an HTTPS site. In that case you'll likely want the URL in your browser to be HTTPS as well. For that, just provide a key and cert that your client trusts. The instructions above already show how to create your own self-signed root CA. Do that, then use it to create your server's key and cert:
    # Create the key
    openssl genrsa -out ./my-server.key.pem 2048
    # Create the certificate signing request
    openssl req -new -key ./my-server.key.pem -out ./my-server.csr.pem -subj "/C=US/ST=Utah/L=Provo/O=ACME Tech Inc/CN=localhost"
    # Create the cert
    openssl x509 -req -in ./my-server.csr.pem -CA ~/.ssh/my-private-root-ca.crt.pem -CAkey ~/.ssh/my-private-root-ca.key.pem -CAcreateserial -out ./my-server.crt.pem -days 500
Then, launch your server with a tls option, like so:
    hoxy.createServer({
      reverse: 'https://www.google.com',
      tls: {
        key: fs.readFileSync('path/to/my-server.key.pem'),
        cert: fs.readFileSync('path/to/my-server.crt.pem')
      }
    }).listen(8080);
The tls option is passed to the underlying Node HTTPS server.
See also: basic explanation of HTTPS proxying.
# | phase | description |
---|---|---|
1 | request | The proxy has received the request headers, but the request body (if there is one) hasn't started streaming in yet. Or, if you've intercepted a request as:'json' (for example) then the request body will have been fully buffered into memory and be available as request.json (for example). See intercepts for more info. |
2 | request-sent | The proxy has finished sending the entire request, including the request body if present, to the server. Everything is read-only during this phase. The main reason for its existence is to be able to measure the time it takes to upload the request by comparing it with the previous phase. |
3 | response | The proxy has received the response headers, but the response body hasn't started streaming in yet. Or, if you've intercepted a response as:'$' (for example) then the response body will have been fully buffered into memory and be available as response.$ (for example). See intercepts for more info. |
4 | response-sent | The proxy has finished sending the entire response, including the response body, to the client. Everything is read-only during this phase. The main reason for its existence is to be able to measure the time it takes to download the response by comparing it with the previous phase. |
Intercept handlers are called in the order of their phase.
Intercept handlers of the same phase are called in the order they're declared.
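The two ordering rules can be pictured as a sort by phase with declaration order as the tie-breaker. The sketch below only models the resulting call order for a single cycle. callOrder and the handler objects are made up for illustration, and Hoxy does not necessarily sort anything internally.

```javascript
const PHASES = ['request', 'request-sent', 'response', 'response-sent'];

// Hypothetical sketch: order handlers by phase, then by declaration order.
function callOrder(declared) {
  return declared
    .map((handler, index) => ({ handler, index }))
    .sort((a, b) =>
      PHASES.indexOf(a.handler.phase) - PHASES.indexOf(b.handler.phase) ||
      a.index - b.index)
    .map(entry => entry.handler.name);
}

console.log(callOrder([
  { phase: 'response', name: 'A' },
  { phase: 'request', name: 'B' },
  { phase: 'response', name: 'C' },
  { phase: 'request', name: 'D' }
]));
// [ 'B', 'D', 'A', 'C' ]
```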
request and response objects are either read-only or readable/writable, depending on phase.
Attempts to write a read-only thing will fail silently, generating an error log event.
Readability / writability matrix

Phase | request object | response object |
---|---|---|
request | writable | writable |
request-sent | read-only | read-only |
response | read-only | writable |
response-sent | read-only | read-only |
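If you want to guard your own interceptor code against writes that would be dropped, the matrix can be encoded as a small lookup. This is a convenience sketch, not something Hoxy exposes. WRITABLE and isWritable are hypothetical names.

```javascript
// The readability/writability matrix as data: true means writable.
const WRITABLE = {
  'request':       { request: true,  response: true  },
  'request-sent':  { request: false, response: false },
  'response':      { request: false, response: true  },
  'response-sent': { request: false, response: false }
};

// Hypothetical helper: can `object` ('request' or 'response') be written during `phase`?
function isWritable(phase, object) {
  if (!WRITABLE[phase]) throw new Error('unknown phase: ' + phase);
  return WRITABLE[phase][object] === true;
}

console.log(isWritable('response', 'request'));  // false
console.log(isWritable('response', 'response')); // true
```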
Hoxy normally populates responses by sending the request to the destination server during the normal request/response lifecycle.
Alternatively, if you modify any aspect of the response before this happens (i.e. during the request phase) hoxy treats the entire response as populated, and skips the call to the destination server.
If you modify any aspect of the response after this happens (i.e. during the response phase) it overwrites just that aspect of the response from the server.
    proxy.intercept('request', function(req, resp, cycle) {
      resp.string = 'Hello';
      // The response is now populated so the
      // server call is skipped. Status code will
      // default to 200.
    });

    proxy.intercept('request', function(req, resp, cycle) {
      resp.statusCode = 200;
      // The response is now populated so the
      // server call is skipped. Response body
      // defaults to empty.
    });

    proxy.intercept('response', function(req, resp, cycle) {
      resp.string = 'Hello';
      // The response was already populated,
      // we just overwrite its body. Status
      // code remains the same.
    });

    proxy.intercept('response', function(req, resp, cycle) {
      resp.statusCode = 666;
      // The response was already populated,
      // we just overwrote the status. I
      // wonder how the browser will react
      // to this unexpected turn of events.
    });
Changes to requests and responses are cumulative over the whole request/response cycle.
Among other things, this affects loading content as a certain type, and filtering.
Change accumulation example.
    proxy.intercept('request', function(req, resp, cycle) {
      req.headers['cache-control'] = undefined;
    });
    proxy.intercept('request', function(req, resp, cycle) {
      console.log(req.headers['cache-control']); // undefined
    });
Change accumulation affects filtering.
    proxy.intercept('request', function(req, resp, cycle) {
      req.hostname = 'example.com';
    });
    proxy.intercept({
      phase: 'request',
      hostname: 'other.com'
    }, function(req, resp, cycle) {
      // never called!
    });
Change accumulation affecting as parameters.
    proxy.intercept({
      phase: 'response',
      mimeType: 'application/json',
      as: 'json'
    }, function(req, resp, cycle) { ... });
    proxy.intercept({
      phase: 'response',
      mimeType: 'application/json'
    }, function(req, resp, cycle) {
      console.log(typeof resp.json); // 'object'
    });
    proxy.intercept({
      phase: 'response',
      mimeType: 'application/json',
      as: 'string'
    }, function(req, resp, cycle) { ... });
    proxy.intercept({
      phase: 'response',
      mimeType: 'application/json'
    }, function(req, resp, cycle) {
      console.log(typeof resp.json); // 'undefined'
      console.log(typeof resp.string); // 'string'
    });
How does HTTPS proxying work in Hoxy? First, let's review how HTTPS proxying works in general. Suppose you want to insert a proxy between yourself and the website https://example.com. There are two ways to do this: direct and reverse proxying.
To set up a direct HTTPS proxy, you'd launch a proxy on port 8080 and set your browser to do HTTPS proxying through localhost:8080, then visit the site directly in your browser.
What happens here differs radically from HTTP proxying.
During HTTP proxying, the client sends this:
GET http://www.example.com/foo.html HTTP/1.1
In English this means go to example.com and get /foo.html for me. But during HTTPS proxying, the client sends this:
CONNECT example.com:443 HTTP/1.1
Which in English means connect me to example.com, I want to have a private conversation. Once the proxy establishes the pipe, the client TLS-handshakes the server on that connection. Then, it sends normal HTTP traffic over it like this:
GET /foo.html HTTP/1.1
...but it's encrypted, so the proxy can't see it. Anything might be happening on that connection, from the proxy's POV.
This is formally known as HTTP CONNECT tunneling. Since the proxy isn't privy to the conversation, there's no need for it to be an HTTPS server itself, even though "https://" appears in the browser's URL bar. Its job is just to shovel TCP packets back and forth, which happen to contain undecipherable TLS traffic.
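For comparison, plain CONNECT tunneling (the generic pattern just described, not Hoxy's own behavior) can be sketched with Node's built-in http and net modules. The function name createTunnelProxy is made up for illustration, and the host:port parsing assumes no IPv6 literals.

```javascript
const http = require('http');
const net = require('net');

// Hypothetical sketch of a bare CONNECT tunnel; it never sees decrypted traffic.
function createTunnelProxy() {
  const proxy = http.createServer((req, res) => {
    // Ordinary (non-CONNECT) requests are out of scope for this sketch.
    res.writeHead(501, { 'Content-Type': 'text/plain' });
    res.end('This sketch only handles CONNECT.\n');
  });

  // Fired when a client sends e.g. "CONNECT example.com:443 HTTP/1.1".
  proxy.on('connect', (req, clientSocket, head) => {
    const [host, port] = req.url.split(':');
    const serverSocket = net.connect(Number(port) || 443, host, () => {
      // Tell the client the pipe is open, then shovel bytes both ways.
      clientSocket.write('HTTP/1.1 200 Connection Established\r\n\r\n');
      serverSocket.write(head);
      serverSocket.pipe(clientSocket);
      clientSocket.pipe(serverSocket);
    });
    serverSocket.on('error', () => clientSocket.destroy());
    clientSocket.on('error', () => serverSocket.destroy());
  });

  return proxy;
}

// To try it: createTunnelProxy().listen(8080);
```

Everything after the 200 line is opaque TLS as far as this proxy is concerned, which is exactly why intercepting HTTPS needs something more than a tunnel.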
A reverse HTTPS proxy is a different animal altogether. To set it up, you'd launch a proxy on port 8080. Since this is a reverse HTTPS proxy, you'll need a few more startup options:
CN=localhost
."https://example.com"
The key and the cert in particular are necessary because a reverse proxy, from the client's POV, isn't a proxy at all, but an HTTPS webserver that speaks TLS. And because of the constraints of PKI, the client needs to trust the cert provided above. So one of two things needs to happen:
Either way you'll end up with a signed, trusted cert and your reverse proxy should work.
Hoxy follows these patterns, but with one major deviation having to do with direct proxying and HTTPS CONNECT tunneling. Instead of connecting the client to the remote server, Hoxy connects the client to its own private HTTPS server. This is a separate server instance from the proxy itself and solely exists to spoof TLS.
Normally, the client would immediately realize something fishy is going on. The spoofing server, being TLS, has provided a cert to the client that hasn't been signed by a client-trusted CA. Thus, to use Hoxy as an HTTPS direct proxy, you must create your own self-signed root CA, make your client trust it, then pass it to Hoxy as a startup option.
With this in mind, let's rewind to just before the client realizes something fishy is going on. Hoxy, having in its possession both the cert AND the private key of a client-trusted CA, will use it to generate spoofed keys and certs, signed by that CA, for each HTTPS domain the client visits. From the client's POV now, nothing fishy is going on. As far as it knows, its CONNECT requests are being honored by the proxy, and it's talking directly to the remote website in a private channel.
But in reality, the client is tunneling to an imposter server, its traffic is being decrypted and intercepted, then being re-encrypted and sent onward. Neither the client nor the server is the wiser.
You may have noticed there are two key/cert options that can be passed to Hoxy, with different names: tls and certAuthority.
It should now make sense why.
The former is for reverse proxying, and is just Node's config option for running an HTTPS webserver.
The latter is for direct proxying.
It's a meta-key/cert combo, used to spoof actual key/cert combos on the fly.
Incidentally, these two options are mutually exclusive.
It wouldn't make sense to use both at the same time.
To see some openssl commands and HTTPS proxy specimens, see the examples section.