Hoxy: Introduction

Hoxy is a completely free, open source HTTP hacking API for Node.js. It operates in the same ballpark as Charles or Fiddler. A few key features:

  • Intercept requests and/or responses.
  • Observe and alter all aspects of requests/responses.
  • Simulate network slowness and latency.
  • Supports both direct and reverse proxying.
  • Built-in helpers for replacing remote files with local ones.
  • Request/response body types: JSON, string, jQuery, buffer, etc.
  • Supports both HTTP and HTTPS.

Here are a few ideas for things you could do.

  • Replace your Facebook friends with pictures of unicorns.
  • Add bizarre HTTP request headers to confuse server admins.
  • Remove ads from web pages for personal browsing.
  • Fuzz test your site's load sequence by adding random amounts of latency.
  • Test changes in production, without pushing to production.

These docs are for Hoxy 3.x. See the incantations section for code samples.

How it works

Hoxy functions as a normal proxy standing between the client and server. You intercept traffic during either the request or response phase, or both. The phase lifecycle section contains a more complete description, but here's a simplified illustration.

  time ==>
  -----------------------
  server:       3
  -------------/-\-------
  hoxy:       2   4
  -----------/-----\-----
  client:   1       5

  1. Client sends request.
  2. Hoxy intercepts the request.
  3. Server receives request and sends response.
  4. Hoxy intercepts the response.
  5. Client receives response.

Getting Started

Hoxy works on Node 0.12.x or higher, and io.js 1.0 or higher. It may work in older Nodes to varying degrees. Here's a quick sample program that tampers with request headers.

  let hoxy = require('hoxy');
  let proxy = hoxy.createServer().listen(8080);
  proxy.intercept('request', req => {
    req.headers['x-unicorns'] = 'unicorns';
    // server will now see the "x-unicorns" header
  });

After running the above, configure your client to proxy through yourmachine:8080. As long as you know JavaScript and HTTP, you can inspect and manipulate your own web traffic in arbitrary ways.

Download & Installation

Hoxy is available on both npm and GitHub. Tests pass on recent Node and io.js.

  $ npm install hoxy
  var hoxy = require('hoxy');

API documentation

Main module

  var hoxy = require('hoxy');

The object returned from require('hoxy') is just a namespace.

Class Proxy

Represents a proxy server. An instance of this is returned from hoxy.createServer().

hoxy.createServer(options)

Factory for a new proxy with given options. All options are optional.

      var hoxy = require('hoxy');
      var proxy = hoxy.createServer({
        upstreamProxy: 'localhost:9090',
        reverse: 'http://example.com',
        certAuthority: { key, cert },
        tls: { key, cert },
        slow: { rate, latency, up, down }
      });
    
Options Detail
name description
upstreamProxy Optional. If present, this proxy will in turn use another proxy. This allows Hoxy to play well with other proxies. This value should take the form host:port.
reverse Optional. If present, this proxy will run as a reverse proxy for the given server. This allows you to point your client directly at the proxy, instead of configuring it in the client's proxy settings. This value should take the form scheme://host:port.
certAuthority { key, cert } Optional. If present, this should contain a key/cert combo representing a certificate authority that your client trusts. See these instructions for how to generate these files. You'll then need to configure your client to use this proxy for https in addition to http. Once you've got all of that set up, Hoxy will generate fake keys/cert combos for every hostname you visit, caching them in memory for subsequent visits, thus allowing the proxy to handle https requests as cleartext.
tls { key, cert } Optional. Should only be used in combination with reverse. If present, causes Hoxy to run as an https server. Passed as opts to https.createServer(opts, function) (see the https module docs).
slow { rate, latency, up, down } Optional.
  • latency imposes a delay (in milliseconds) between all requests and responses.
  • rate imposes a single rate-limiting bottleneck (in bytes per second) on all throughput.
  • up imposes a single rate-limiting bottleneck (in bytes per second) on all uploads.
  • down imposes a single rate-limiting bottleneck (in bytes per second) on all downloads.
Note: up and down are independent. However, rate and up (if both are set) form a pipeline, with the slower of the two governing the total speed. The same is true of rate and down. These properties can also be read or set using proxy.slow().
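
The interaction between rate, up, and down can be sketched as below. This only illustrates the rule stated in the note; it is not Hoxy's internal code, and effectiveRates is a hypothetical helper name.

```javascript
// Hypothetical helper: computes the effective upload/download limits
// (bytes per second) implied by a slow-options object.
// Infinity means "no limit in that direction".
function effectiveRates(slow) {
  const rate = slow.rate === undefined ? Infinity : slow.rate;
  const up = slow.up === undefined ? Infinity : slow.up;
  const down = slow.down === undefined ? Infinity : slow.down;
  return {
    // rate and up form a pipeline, so the slower one governs uploads
    up: Math.min(rate, up),
    // likewise, the slower of rate and down governs downloads
    down: Math.min(rate, down)
  };
}

console.log(effectiveRates({ rate: 100000, up: 20000 }));
// uploads are limited to 20000, downloads to 100000
```

Latency is left out of the sketch because it is a delay rather than a throughput limit.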

proxy.listen(port, [hostname], [backlog], [callback])

Starts proxy listening on port. Returns itself.

      var hoxy = require('hoxy');
      var proxy = hoxy.createServer().listen(8080);
    

A callback may be provided, to run when the proxy has started listening.

      var hoxy = require('hoxy');
      var port = 8080;
      var proxy = hoxy.createServer().listen(port, function() {
        console.log('The proxy is listening on port ' + port + '.');
      });
    

This method simply passes its arguments to Node's server.listen() method.

proxy.slow([options])

Get/set proxy-level slow options. If options is provided, it's a setter.

      proxy.slow({ rate, latency, up, down });
    

If options is not provided, it's a getter.

      var slowOpts = proxy.slow();
    

proxy.intercept(options, handler)

This is the entry point for intercepting and operating on requests and responses. This first example intercepts all requests.

      proxy.intercept('request', req => console.log(req.url));
    

This is more verbose, but identical to the above.

      proxy.intercept({
        phase: 'request'
      }, function(req, resp, cycle) {
        console.log(req.url);
      });
    

Interceptors

The callback passed to intercept() (AKA the interceptor) receives three arguments (req, resp, cycle) which are instances of classes described below. cycle is also passed as context. Thus:

      proxy.intercept('request', function(req, resp, cycle) {
        console.log(this === cycle) // true
      });
    

Sync versus async interceptors

Hoxy internally runs everything sequentially in series, including interceptors. That happens naturally when your interceptor logic is synchronous. But if it's asynchronous, you must signal that by returning a promise. As long as you do that, Hoxy will wait until the promise resolves and things will flow serially.

      proxy.intercept('request', function(req, resp, cycle) {
        // return a promise from the interceptor
        return getThingA().then(function(thingA) {
          return getThingB(thingA);
        }).then(function(thingB) {
          // do something with "thingB"
        });
      });
    

Future versions of JavaScript will allow async functions. Some transpilers allow you to use this syntax today. These can be used as interceptors, too. Written this way, your logic and your error handling are much cleaner.

      proxy.intercept('request', async function(req, resp, cycle) {
        // this is an async function
        let thingA = await getThingA();
        let thingB = await getThingB(thingA);
        // do something with "thingB"
      });
    

In current versions of io.js, generators can be used as a stand-in for async functions, but with the benefit of being natively-supported, spec-valid ES6 syntax. You must only yield promises from your generator. See co for more details.

      proxy.intercept('request', function*(req, resp, cycle) {
        // this is a generator
        let thingA = yield getThingA();
        let thingB = yield getThingB(thingA);
        // do something with "thingB"
      });
    

Note: Strictly speaking, Hoxy doesn't care whether your interceptor is an async function or a generator. It only cares what kind of thing the return value is:

  • Promise: wait for it to resolve, then proceed.
  • Iterator: turn it into a promise, wait for it to resolve, then proceed.
  • None of the above: proceed immediately.
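
The three cases above can be sketched roughly as follows. This is an illustration under assumptions, not Hoxy's source: toPromise and drive are hypothetical names, and the iterator handling is a simplified version of what a library like co does.

```javascript
// Hypothetical helper: normalizes whatever an interceptor returned
// into a promise that the caller can wait on.
function toPromise(returned) {
  if (returned && typeof returned.then === 'function') {
    return returned; // promise: wait for it to resolve
  }
  if (returned && typeof returned.next === 'function') {
    return drive(returned); // iterator: turn it into a promise
  }
  return Promise.resolve(); // none of the above: proceed immediately
}

// Steps an iterator that yields promises, feeding each resolved
// value back in until the iterator is done.
function drive(iterator) {
  return new Promise((resolve, reject) => {
    function step(method, input) {
      let result;
      try {
        result = iterator[method](input);
      } catch (err) {
        return reject(err);
      }
      if (result.done) {
        return resolve(result.value);
      }
      Promise.resolve(result.value).then(
        value => step('next', value),
        err => step('throw', err)
      );
    }
    step('next');
  });
}

// Usage: a generator standing in for an interceptor.
function* interceptor() {
  const a = yield Promise.resolve(1);
  const b = yield Promise.resolve(a + 1);
  return b;
}
toPromise(interceptor()).then(value => console.log(value)); // logs 2
```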

Options

Options affect how, or if, interceptors get called. Here are a few examples. The following will only intercept GET requests to example.com.

      proxy.intercept({
        phase: 'request',
        method: 'GET',
        hostname: 'example.com'
      }, function(req, resp, cycle) {
        console.log('request made to: '+req.url);
      });
    

The following only intercepts text/html responses. When it does, it exposes the response body as a readable/writable $ variable.

      proxy.intercept({
        phase: 'response',
        mimeType: 'text/html',
        as: '$'
      }, function(req, resp, cycle) {
        resp.$('title')
        .text('all your titles are belong to us');
      });
    

The following only intercepts responses with a declared charset. When it does, it exposes the response body as the string variable.

      proxy.intercept({
        phase: 'response',
        contentType: /charset/i,
        as: 'string'
      }, function(req, resp, cycle) {
        console.log(resp.string);
      });
    

The following only intercepts responses from an /api/users JSON endpoint. When it does, it exposes the response body as the json variable. It uses a URL pattern to match the endpoint URL instead of a regex.

      proxy.intercept({
        phase: 'response',
        fullUrl: 'http://example.com/api/users/:id',
        mimeType: 'application/json',
        as: 'json'
      }, function(req, resp, cycle) {
        console.log(resp.json.email_address);
      });
    
Option Descriptions
name type required description
phase string yes Which phase to intercept. See phase lifecycle for more info. Accepted values:
  • 'request' - Proxy has received request.
  • 'request-sent' - Proxy has sent request.
  • 'response' - Proxy has received response.
  • 'response-sent' - Proxy has sent response.
as string no Expose the request or response body (depending on the phase) as data of a certain type. If there's an error parsing the body into this form, the intercept action is skipped and a warning is logged. Hoxy normally streams request and response bodies through. If as is present, hoxy buffers the request or response body into memory. Accepted values:
  • '$' - A DOM object similar to jQuery. See cheerio.
  • 'json' - A JS object containing JSON.
  • 'string' - A plain string.
  • 'buffer' - A buffer containing the entity body in its raw binary form.
  • 'params' - A JS object containing name/value pairs parsed from the application/x-www-form-urlencoded entity body.
Use filtering in tandem with these. E.g. use 'json' with mimeType:'application/json'.
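
To make the body types concrete, here is a rough sketch of how a buffered body could map onto each as value. It is not Hoxy's implementation: parseBody is a hypothetical helper, UTF-8 is assumed for the text encodings, and '$' is left out because it requires cheerio.

```javascript
// Hypothetical helper: converts a raw body Buffer into the shape
// named by `as`. Assumes the body is UTF-8 encoded.
function parseBody(buffer, as) {
  switch (as) {
    case 'buffer':
      return buffer;
    case 'string':
      return buffer.toString('utf8');
    case 'json':
      // Throws on malformed JSON. Per the description above, Hoxy
      // skips the interceptor and logs a warning in that situation.
      return JSON.parse(buffer.toString('utf8'));
    case 'params': {
      // application/x-www-form-urlencoded name/value pairs
      const params = {};
      new URLSearchParams(buffer.toString('utf8')).forEach((value, name) => {
        params[name] = value;
      });
      return params;
    }
    default:
      throw new Error('unsupported body type: ' + as);
  }
}

const body = Buffer.from('email=a%40example.com&plan=free');
console.log(parseBody(body, 'params'));
```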
Filtering options (these are logically ANDed together)
name type required description
protocol string, regex, function no Match the request protocol.
method string, regex, function no Match the all-uppercase HTTP request method.
hostname string, regex, function no Match the host, not including :port.
port number, string, regex, function no Match the port number.
url string, regex, function no Match the request URL. Patterns like /foo/* are allowed. See route-pattern.
fullUrl string, regex, function no Match the full request URL including protocol and hostname. Patterns like /foo/* are allowed. See route-pattern.
contentType string, regex, function no Match the full content-type header of the request or response (depending on the phase).
requestContentType string, regex, function no Same as contentType but only matches request.
responseContentType string, regex, function no Same as contentType but only matches response.
mimeType string, regex, function no Match just the mime type portion of the content-type header of the request or response (depending on the phase). I.e., if the entire header is "text/html; charset=utf-8", just match the "text/html" part.
requestMimeType string, regex, function no Same as mimeType but only matches request.
responseMimeType string, regex, function no Same as mimeType but only matches response.

Note: filtering options can be different types.

  • string: a loose (==) match.
  • regex: regex match of the string-coerced value.
  • function: truthiness of return value.
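
The matching rules above, together with the fact that filters are ANDed, can be sketched like this. The helper names matches and matchesAll are hypothetical, and the sketch ignores the URL pattern syntax (route-pattern) that url and fullUrl also accept.

```javascript
// Hypothetical helper: tests one filter against one value.
function matches(filter, value) {
  if (typeof filter === 'function') {
    return Boolean(filter(value)); // function: truthiness of return value
  }
  if (filter instanceof RegExp) {
    return filter.test(String(value)); // regex: match the string-coerced value
  }
  return filter == value; // string (or number): loose match
}

// Filters are ANDed together: every supplied filter must match.
function matchesAll(filters, values) {
  return Object.keys(filters).every(name => matches(filters[name], values[name]));
}

const request = { method: 'GET', hostname: 'example.com', port: 8080 };
console.log(matchesAll({ method: /get|head/i, port: '8080' }, request)); // true
console.log(matchesAll({ method: 'POST', hostname: 'example.com' }, request)); // false
```

Note that the loose match is what lets the string '8080' match the numeric port 8080.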

proxy.log(levels, [handler])

Deals with various logging events. This first example listens for error, warn, and debug logging events, and prints them to stderr.

      proxy.log('error warn debug');
    

Or, print logging events to various writable streams.

      proxy.log('error warn debug', process.stderr);
      proxy.log('info', process.stdout);
    

Or, explicitly handle logging events.

      proxy.log('error warn', function(event) {
        console.error(event.level + ': ' + event.message);
        if (event.error) console.error(event.error.stack);
      });
    
Description of logging events
name description
error When something bad happens that we wish wouldn't have happened.
warn When something iffy happened that we can probably tolerate.
info When something noteworthy happened that normal people care about.
debug When something boring happened that developers care about.

proxy.close([callback])

Stops proxy receiving requests. Finalizes and/or cleans up any resources the proxy uses internally.

      proxy.close(function(err) { // optional callback
        if (err) {
          throw err;
        }
        console.log('The proxy is no longer accepting new connections.');
      });
    

Class Request

Represents a request. An instance of this is passed as the first argument to every interceptor. Altering its values changes what the server sees. If you change the hostname, it changes which server sees the request.

request.protocol

Protocol of the request.

request.hostname

Destination server hostname, sans port.

request.port

Destination server port.

request.method

All-caps HTTP method used. Lowercase values are converted to uppercase.

request.headers

HTTP request header name/value JS object. These are all-lowercase, e.g. accept-encoding.

request.url

Root-relative request URL, including query string, like /foo/bar?baz=qux

request.query

An object representing querystring params in the URL. For example if the URL is /foo/bar?baz=qux, then this object will look like { baz: 'qux' }.

request.json

Request body parsed as JSON. This is only present if you intercept the request as:'json'. Changes made to this object will be seen by the server.

      proxy.intercept({
        phase: 'request',
        method: 'PUT',
        fullUrl: 'http://example.com/users/:id',
        as: 'json'
      }, function(req, resp, cycle) {
        req.json.prefs.subscriptions = 'all';
      });
    

request.params

Request body parsed as form-url-encoded params. This will be a key/value POJO. This object is only present if you intercept the request as:'params'. Changes made to this object will be seen by the server.

      proxy.intercept({
        phase: 'request',
        method: 'POST',
        mimeType: 'application/x-www-form-urlencoded',
        as: 'params'
      }, function(req, resp, cycle) {
        console.log(req.params.email);
      });
    

Note, parameters from the URL querystring are not included in this object.

request.string

Request body string. This is only present if you intercept the request as:'string'. Overwriting this will overwrite the request body sent to the server.

      proxy.intercept({
        phase: 'request',
        fullUrl: 'http://example.com/users/:id',
        as: 'string'
      }, function(req, resp, cycle) {
        console.log(req.string);
      });
    

request.buffer

Request body binary buffer. This is only present if you intercept the request as:'buffer'. Changes made to this object will be seen by the server.

      proxy.intercept({
        phase: 'request',
        method: 'POST',
        fullUrl: 'http://example.com/images',
        as: 'buffer'
      }, function(req, resp, cycle) {
        // req.buffer contains uploaded image
      });
    

request.slow(options)

Simulates slowness during request phase. With this method you can set a minimum latency and/or maximum transfer rate. Since these are minimum/maximum, if your native connection is already slower than these values, this method will have no effect.

      // Simulate upload speed of 10,000 bytes per second
      proxy.intercept('request', function(req, resp, cycle) {
        req.slow({rate:10000}); // bytes per second
      });
    
      // Simulate a 500-1000ms delay on every request.
      proxy.intercept('request', function(req, resp, cycle) {
        req.slow({latency:randint(500, 1000)});
      });
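
The latency example above calls randint, which is neither defined in these docs nor a JavaScript built-in. A minimal sketch of such a helper, assuming it is meant to return an integer between min and max inclusive:

```javascript
// Hypothetical helper assumed by the latency example: returns a
// random integer between min and max, inclusive.
function randint(min, max) {
  return min + Math.floor(Math.random() * (max - min + 1));
}

console.log(randint(500, 1000)); // some integer from 500 to 1000
```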
    

request.fullUrl([url])

If url is provided, sets the request's absolute protocol, hostname, port and url. Otherwise it returns the absolute URL of this request. This is mainly a convenience method.

      // Log every request through the proxy.
      proxy.intercept('request', function(req, resp, cycle) {
        console.log(req.fullUrl());
      });
    

request.tee(writable)

Whatever request body gets sent to the server, tee() pipes an identical copy to your writable stream. Your stream is held in memory, and only gets written to if and when the request is sent to the server. In other words, your stream sees whatever the server sees. If the server sees nothing, your stream sees nothing. You can tee() as many times as you want.

      proxy.intercept({
        phase: 'request',
        method: /post|put/i,
        mimeType: 'application/json'
      }, function(req, resp, cycle) {
        req.tee(fs.createWriteStream('./uploaded.json'));
      });
    

Class Response

Represents a response. An instance of this is passed as the second argument to every interceptor. This object is unpopulated during the request and request-sent phases. Altering its values changes the response to the client.

response.statusCode

HTTP status code being sent to the client.

response.headers

HTTP response header name/value JS object. Header names are all-lowercase, such as 'content-type'.

response.$

Response body parsed as DOM. This object is only present if you intercept the response as:'$'. This is a cheerio object, which provides a jQuery-like API. Changes made to it will be seen by the client.

      proxy.intercept({
        phase: 'response',
        fullUrl: 'http://example.com/page.html',
        as: '$'
      }, function(req, resp, cycle) {
        // change the title of the page
        resp.$('title').text('Fake Title!');
      });
    

response.json

Response body parsed as JSON. This is only present if you intercept the response as:'json'. Changes to this object will be seen by the client.

      proxy.intercept({
        phase: 'response',
        method: 'GET',
        fullUrl: 'http://example.com/users/123',
        as: 'json'
      }, function(req, resp, cycle) {
        // arbitrarily manipulate the response json
        resp.json.foo = 'bar';
      });
    

response.string

Response body string. This is only present if you intercept the response as:'string'. Overwriting this will overwrite the response body sent to the client.

      proxy.intercept({
        phase: 'response',
        fullUrl: 'http://example.com/page.html',
        as: 'string'
      }, function(req, resp, cycle) {
        // print page to log
        console.log(resp.string);
      });
    

response.buffer

Response body binary buffer. This is only present if you intercept the response as:'buffer'. Changes made to this object will be seen by the client.

      proxy.intercept({
        phase: 'response',
        fullUrl: 'http://example.com/image.jpg',
        as: 'buffer'
      }, function(req, resp, cycle) {
        // resp.buffer contains a jpg
      });
    

response.slow(options)

Simulates a slow response. With this method you can set a minimum latency and/or maximum transfer rate. Since these are minimum/maximum, if your native connection is already slower than these values, this method has no effect.

      // Simulate download speed of 100,000 bytes per second
      proxy.intercept('response', function(req, resp, cycle) {
        resp.slow({rate:100000}); // bytes per second
      });
    
      // Simulate a 500-1000ms delay on every response.
      proxy.intercept('response', function(req, resp, cycle) {
        resp.slow({latency:randint(500, 1000)});
      });
    

response.tee(writable)

Whatever response body gets sent to the client, tee() pipes an identical copy to your writable stream. Your stream is held in memory, and only gets written to when the response is sent to the client. In other words, your stream sees whatever the client sees. You can tee() as many times as you want.

      proxy.intercept({
        phase: 'response',
        mimeType: 'image/gif'
      }, function(req, resp, cycle) {
        resp.tee(fs.createWriteStream('./image.gif'));
      });
    

Class Cycle

Represents a whole request/response cycle. A Cycle instance is the value of this in every interceptor call, and the same instance is shared across an entire request/response cycle. It's also passed as the third argument, in order to support arrow functions (which don't bind their own this). It provides a small number of methods not associated specifically with either the request or response.

cycle.serve(options)

Provisions responses from the local filesystem. Generally, the reason you'd do this is to be able to edit those files locally and test them as if they were live on the remote server. This action populates the response object; see response population for more info. The completion of this action is asynchronous, so serve() returns a promise. For example:

      proxy.intercept({
        phase: 'request',
        fullUrl: 'http://example.com/main.js'
      }, function(req, resp, cycle) {
        return cycle.serve('/Users/gr123/test/main.js');
      });
    

Or the more verbose-but-identical...

      proxy.intercept({
        phase: 'request',
        fullUrl: 'http://example.com/main.js'
      }, function(req, resp, cycle) {
        return cycle.serve({
          path: '/Users/gr123/test/main.js'
        });
      });
    
Options
name type required description
path string no Which file to serve. Defaults to the request URL. Normally this would be used in mutual exclusion with docroot. Strictly speaking, path is always rooted to docroot, which defaults to "/".
docroot string no Which local directory to serve out of. Defaults to filesystem root "/".
strategy string no Mainly relevant when using the docroot option. Describes the relationship between the local docroot and the remote one. Strictly speaking, this controls what happens when the local docroot is missing a requested file. Accepted values:
  • replace - (default) Replaces the remote docroot with the local one. In other words, if a requested file doesn't exist locally, it populates the response with a 404, even if it would have been found remotely.
  • overlay - Overlays the local docroot on top of the remote one. In other words, if a requested file doesn't exist locally, the request will transparently fall through to the remote server.
  • mirror - Automatically mirror the remote docroot locally. In other words, if a requested file doesn't exist locally, it's copied to the local docroot from the remote one, and will be found locally on subsequent requests.
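
The difference between the three strategies comes down to what happens when the file is missing locally. A small sketch of that decision, based only on the descriptions above (planServe and the outcome labels are hypothetical, and no file or network I/O is performed):

```javascript
// Hypothetical helper: names the outcome implied by each strategy.
function planServe(strategy, existsLocally) {
  if (existsLocally) {
    return 'serve-local'; // all strategies serve the local file if it exists
  }
  switch (strategy || 'replace') { // 'replace' is the default
    case 'replace':
      return 'respond-404';
    case 'overlay':
      return 'fall-through-to-remote';
    case 'mirror':
      return 'copy-from-remote-to-local';
    default:
      throw new Error('unknown strategy: ' + strategy);
  }
}

console.log(planServe('overlay', false)); // fall-through-to-remote
console.log(planServe(undefined, false)); // respond-404
```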

The returned promise resolves after the response has been populated. There are at least three use cases worth mentioning.

Use case #1: Serve a specific file. First:

      $ curl http://example.com/js/main.js > main.js
    

...then:

      proxy.intercept({
        phase: 'request',
        fullUrl: 'http://example.com/js/main.js'
      }, function(req, resp, cycle) {
        return cycle.serve(__dirname + '/main.js');
      });
    

Use case #2: Serve out of a local docroot. First:

      $ mkdir js
      $ curl http://example.com/js/main.js > js/main.js
      $ curl http://example.com/js/some-lib.js > js/some-lib.js
      $ curl http://example.com/js/other-lib.js > js/other-lib.js
      ...
    

...then:

      proxy.intercept({
        phase: 'request',
        hostname: 'example.com',
        url: /^\/js\/.*/
      }, function(req, resp, cycle) {
        return cycle.serve({
          docroot: __dirname,
          strategy: 'overlay'
        });
      });
    

Use case #3: Serve out of a local docroot. Similar to above, but automatically downloads the files for you, instead of having to curl them as in the above example. You control which files get downloaded locally by the filtering options you provide.

      proxy.intercept({
        phase: 'request',
        hostname: 'example.com',
        url: /^\/js\/.*/
      }, function(req, resp, cycle) {
        return cycle.serve({
          docroot: __dirname,
          strategy: 'mirror'
        });
      });
    

cycle.data(name, [value])

Stores and retrieves data on a cycle instance. This is useful since the same instance is shared across all interceptors for a given request/response cycle, allowing you to share related data across disparate scopes. With two params this method behaves as a setter, with one param as a getter.

      ['request','request-sent','response','response-sent']
      .forEach(function(phase) {
        proxy.intercept(phase, function(req, resp, cycle) {
          cycle.data(phase, Date.now());
        });
      });
      proxy.intercept('response-sent', function(req, resp, cycle) {
        var reqReceived = cycle.data('request');
        var reqSent = cycle.data('request-sent');
        var respReceived = cycle.data('response');
        var respSent = cycle.data('response-sent');
        // now print some profiling data
      });
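
One way the profiling step might look, given the four timestamps collected above. summarizeTimings is a hypothetical helper, and the interpretation of each interval follows the phase descriptions in the appendix.

```javascript
// Hypothetical helper: turns the four phase timestamps (milliseconds,
// as returned by Date.now()) into durations.
function summarizeTimings(t) {
  return {
    // request received by proxy -> request fully sent to server
    upload: t.reqSent - t.reqReceived,
    // request fully sent -> response headers received
    serverWait: t.respReceived - t.reqSent,
    // response headers received -> response fully sent to client
    download: t.respSent - t.respReceived,
    total: t.respSent - t.reqReceived
  };
}

console.log(summarizeTimings({
  reqReceived: 1000, reqSent: 1040, respReceived: 1240, respSent: 1300
}));
// { upload: 40, serverWait: 200, download: 60, total: 300 }
```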
    

Incantations

Hoxy is not Magic Cargo From The Sky Gods™. It has a fairly simple core architecture, and works in a predictable way. In order to assist the training of acolytes, however, the Magic Sky Gods have inspired the prophets to write down these incantations in The Book Of Spells.

Unicorns!

This incantation makes all titles say "Unicorns!" by manipulating the response as DOM.

  proxy.intercept({
    phase: 'response',
    mimeType: 'text/html',
    as: '$'
  }, function(req, resp) {
    resp.$('title').text('Unicorns!');
  });

Slow Connection

This spell simulates a slow connection. This is done at the proxy level since two simultaneous requests would share the same connection. Thus, for example, a limit of 100000 bytes per second shared by two simultaneous requests would result in a speed of roughly 50000 bytes per second for each.

  hoxy.createServer({
    slow: { rate: 100000 }
  }).listen(8080);

Slow Website

This spell simulates a slow website. It's as if just one website were slow, even if the rest of them run at normal speed.

  proxy.intercept({
    phase: 'response',
    hostname: 'www.google.com'
  }, function(req, resp) {
    resp.slow({ rate: 10000 });
  });

Generator Interceptors

One can invoke the deep magic of generator functions, which are actually just functions that can pause while they're running. This is useful because async stuff can happen while things are paused. But you need to yield promises, which means using an adapter to turn callbacks into promises.

  var adapt = require('ugly-adapter');
  var fs = require('fs');
  proxy.intercept({
    phase: 'response',
    hostname: 'mysite.com',
    mimeType: 'text/html',
    as: '$'
  }, function*(req, resp) {
    var headerHtml = yield adapt(fs.readFile, 'path/to/header.html');
    resp.$('#header').html(headerHtml);
  });

Async Interceptors

If one speaks the language of Babel, one can invoke the even deeper magic of async functions.

  var adapt = require('ugly-adapter');
  var fs = require('fs');
  proxy.intercept({
    phase: 'response',
    hostname: 'mysite.com',
    mimeType: 'text/html',
    as: '$'
  }, async function(req, resp) {
    var headerHtml = await adapt(fs.readFile, 'path/to/header.html');
    resp.$('#header').html(headerHtml);
  });

Intercept HTTPS

There's an incantation for intercepting HTTPS traffic, too. But you have to burn some extra incense. First, create your very own self-signed Certificate Authority. (You should only ever need to do this once.)

  # Create the key
  openssl genrsa -out ~/.ssh/my-private-root-ca.key.pem 2048
  # Create the cert
  openssl req -x509 -new -nodes -key ~/.ssh/my-private-root-ca.key.pem -days 1024 -out ~/.ssh/my-private-root-ca.crt.pem -subj "/C=US/ST=Utah/L=Provo/O=ACME Signing Authority Inc/CN=example.com"

Next, add the above root cert to your list of trusted cert authorities. How to do this varies. For example, Firefox maintains its own trusted list, while Chrome uses the OS's list. For specific details, consult the Ninth Scroll of P'Ki (or search the web). Once that's done, launch Hoxy, passing in your trusted cert and its private key.

  hoxy.createServer({
    certAuthority: {
      key: fs.readFileSync('/Users/you/.ssh/my-private-root-ca.key.pem'),
      cert: fs.readFileSync('/Users/you/.ssh/my-private-root-ca.crt.pem')
    }
  }).listen(8080);

Finally, configure your client to proxy both HTTP and HTTPS through localhost:8080. Hoxy will use the fake cert authority to spoof certificates from any HTTPS sites it encounters, decrypting and intercepting them as cleartext using the existing intercept API.

See also: basic explanation of HTTPS proxying.

Create a reverse proxy

Suppose you want to intercept your own web traffic to reddit.com. One option is to create a proxy, configure your client to use it, then visit reddit.com. If client configuration isn't feasible, another option is to use a reverse proxy. In this case you'd visit a url like http://localhost:8080/ which would basically mirror the content of reddit.com. This is fairly easy to set up:

  hoxy.createServer({
    reverse: 'http://www.reddit.com'
  }).listen(8080);

Now you can visit http://localhost:8080/ and reddit will be there. Strictly speaking, by visiting the proxy directly in your browser rather than configuring it as a proxy, you're depriving the proxy of the scheme ("http:") and host ("www.reddit.com") information in the URL. The reverse option provides those missing pieces of information, allowing you to visit the proxy directly.

Create an HTTPS reverse proxy

Suppose you wanted to reverse proxy to an HTTPS site. In that case you'll likely want the URL in your browser to be HTTPS as well. For that, just provide a key and cert that your client trusts. The instructions above already show how to create your own self-signed root CA. Do that, then use it to create your server's key and cert:

  # Create the key
  openssl genrsa -out ./my-server.key.pem 2048
  # Create the certificate signing request
  openssl req -new -key ./my-server.key.pem -out ./my-server.csr.pem -subj "/C=US/ST=Utah/L=Provo/O=ACME Tech Inc/CN=localhost"
  # Create the cert
  openssl x509 -req -in ./my-server.csr.pem -CA ~/.ssh/my-private-root-ca.crt.pem -CAkey ~/.ssh/my-private-root-ca.key.pem -CAcreateserial -out ./my-server.crt.pem -days 500

Then, launch your server with a tls option, like so:

  hoxy.createServer({
    reverse: 'https://www.google.com',
    tls: {
      key: fs.readFileSync('path/to/my-server.key.pem'),
      cert: fs.readFileSync('path/to/my-server.crt.pem')
    }
  }).listen(8080);

The tls option is passed to the underlying Node HTTPS server.

See also: basic explanation of HTTPS proxying.

Appendix

Phase lifecycle

Full description of phases

  1. request: The proxy has received the request headers, but the request body (if there is one) hasn't started streaming in yet. Or, if you've intercepted a request as:'json' (for example) then the request body will have been fully buffered into memory and be available as request.json (for example). See intercepts for more info.
  2. request-sent: The proxy has finished sending the entire request, including the request body if present, to the server. Everything is read-only during this phase. The main reason for its existence is to be able to measure the time it takes to upload the request by comparing it with the previous phase.
  3. response: The proxy has received the response headers, but the response body hasn't started streaming in yet. Or, if you've intercepted a response as:'$' (for example) then the response body will have been fully buffered into memory and be available as response.$ (for example). See intercepts for more info.
  4. response-sent: The proxy has finished sending the entire response, including the response body, to the client. Everything is read-only during this phase. The main reason for its existence is to be able to measure the time it takes to download the response by comparing it with the previous phase.
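
The timing use case mentioned for the two "sent" phases might look roughly like the sketch below. It only relies on the intercept(phase, handler) call shape used throughout these docs. The attachTimers name, the report callback, and the WeakMap keyed by the cycle object are assumptions made for illustration, including the assumption that the same cycle object is passed to every phase of one request/response cycle.

```javascript
// Sketch: time the upload and download of each cycle.
// `proxy` is any object with intercept(phase, handler).
// `report(label, ms)` is a hypothetical callback supplied by the caller.
function attachTimers(proxy, report) {
  // Per-cycle timestamps; assumes one cycle object per request/response pair.
  const started = new WeakMap();

  proxy.intercept('request', (req, resp, cycle) => {
    started.set(cycle, { request: Date.now() });
  });
  proxy.intercept('request-sent', (req, resp, cycle) => {
    const t = started.get(cycle);
    if (t) report('upload', Date.now() - t.request);
  });
  proxy.intercept('response', (req, resp, cycle) => {
    const t = started.get(cycle);
    if (t) t.response = Date.now();
  });
  proxy.intercept('response-sent', (req, resp, cycle) => {
    const t = started.get(cycle);
    if (t && t.response !== undefined) {
      report('download', Date.now() - t.response);
    }
  });
}
```

A call such as attachTimers(proxy, (label, ms) => console.log(label, ms)) would log one upload and one download duration per cycle.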

Intercept handlers are called in the order of their phase. Intercept handlers of the same phase are called in the order they're declared. Request and response objects are either read-only or readable/writable, depending on phase. Attempts to write to a read-only object fail silently: nothing is thrown, but an error log event is generated.
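
The ordering rule can be modeled as a sort by phase with declaration order as the tie-breaker. This is a sketch of the rule only, not Hoxy's internal code; the callOrder function and the { phase, name } handler records are hypothetical.

```javascript
// The four phases, in lifecycle order.
const PHASES = ['request', 'request-sent', 'response', 'response-sent'];

// Given handlers in declaration order, return their names in call order:
// by phase first, then by position in the declaration list.
function callOrder(declared) {
  return declared
    .map((handler, index) => ({ handler, index }))
    .sort((a, b) =>
      PHASES.indexOf(a.handler.phase) - PHASES.indexOf(b.handler.phase) ||
      a.index - b.index)
    .map(entry => entry.handler.name);
}

console.log(callOrder([
  { phase: 'response', name: 'a' },
  { phase: 'request', name: 'b' },
  { phase: 'response', name: 'c' },
  { phase: 'request', name: 'd' },
])); // [ 'b', 'd', 'a', 'c' ]
```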

Readability / writability matrix

  Phase          | request object | response object
  request        | writable       | writable
  request-sent   | read-only      | read-only
  response       | read-only      | writable
  response-sent  | read-only      | read-only
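
For quick reference in your own tooling, the matrix can be written as a lookup table. The WRITABLE constant and isWritable function below are illustrative names, not something Hoxy exports.

```javascript
// Writability of each object per phase, transcribed from the matrix.
const WRITABLE = {
  'request':       { request: true,  response: true  },
  'request-sent':  { request: false, response: false },
  'response':      { request: false, response: true  },
  'response-sent': { request: false, response: false },
};

// Returns true if `object` ('request' or 'response') may be modified
// during `phase`. Throws on an unknown phase name.
function isWritable(phase, object) {
  const row = WRITABLE[phase];
  if (!row) throw new Error('unknown phase: ' + phase);
  return row[object] === true;
}

console.log(isWritable('response', 'request'));  // false
console.log(isWritable('response', 'response')); // true
```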

Response population

Hoxy normally populates responses by sending the request to the destination server during the normal request/response lifecycle. Alternatively, if you modify any aspect of the response before this happens (i.e. during the request phase) hoxy treats the entire response as populated, and skips the call to the destination server. If you modify any aspect of the response after this happens (i.e. during the response phase) it overwrites just that aspect of the response from the server.

  proxy.intercept('request', function(req, resp, cycle) {
    resp.string = 'Hello';
    // The response is now populated so the
    // server call is skipped. Status code will
    // default to 200.
  });
  proxy.intercept('request', function(req, resp, cycle) {
    resp.statusCode = 200;
    // The response is now populated so the
    // server call is skipped. Response body
    // defaults to empty.
  });
  proxy.intercept('response', function(req, resp, cycle) {
    resp.string = 'Hello';
    // The response was already populated,
    // we just overwrite its body. Status
    // code remains the same.
  });
  proxy.intercept('response', function(req, resp, cycle) {
    resp.statusCode = 666;
    // The response was already populated,
    // we just overwrote the status. I
    // wonder how the browser will react
    // to this unexpected turn of events.
  });

Change accumulation

Changes to requests and responses are cumulative over the whole request/response cycle. Among other things, this affects loading content as a certain type, and filtering.

Change accumulation example.

  proxy.intercept('request', function(req, resp, cycle) {
    req.headers['cache-control'] = undefined;
  });

  proxy.intercept('request', function(req, resp, cycle) {
    console.log(req.headers['cache-control']); // undefined
  });

Change accumulation affects filtering.

  proxy.intercept('request', function(req, resp, cycle) {
    req.hostname = 'example.com';
  });

  proxy.intercept({
    phase: 'request',
    hostname: 'other.com'
  }, function(req, resp, cycle) {
    // never called!
  });

Change accumulation affects "as" parameters.

  proxy.intercept({
    phase: 'response',
    mimeType: 'application/json',
    as: 'json'
  }, function(req, resp, cycle) { ... });

  proxy.intercept({
    phase: 'response',
    mimeType: 'application/json'
  }, function(req, resp, cycle) {
    console.log(typeof resp.json); // 'object'
  });

  proxy.intercept({
    phase: 'response',
    mimeType: 'application/json',
    as: 'string'
  }, function(req, resp, cycle) { ... });

  proxy.intercept({
    phase: 'response',
    mimeType: 'application/json'
  }, function(req, resp, cycle) {
    console.log(typeof resp.json); // 'undefined'
    console.log(typeof resp.string); // 'string'
  });

HTTPS Proxying

How does HTTPS proxying work in Hoxy? First, let's review how HTTPS proxying works in general. Suppose you want to insert a proxy between yourself and the website https://example.com. There are two ways to do this: direct and reverse proxying.

Direct HTTPS proxying

To set up a direct HTTPS proxy, you'd launch a proxy on port 8080 and set your browser to do HTTPS proxying through localhost:8080, then visit the site in your browser. What happens here differs radically from HTTP proxying. During HTTP proxying, the client sends this:

  GET http://www.example.com/foo.html HTTP/1.1

In English this means go to example.com and get /foo.html for me. But during HTTPS proxying, the client sends this:

  CONNECT example.com:443 HTTP/1.1

Which in English means connect me to example.com, I want to have a private conversation. Once the proxy establishes the pipe, the client TLS-handshakes the server on that connection. Then, it sends normal HTTP traffic over it like this:

  GET /foo.html HTTP/1.1

...but it's encrypted, so the proxy can't see it. Anything might be happening on that connection, from the proxy's POV.

This is formally known as HTTP CONNECT tunneling. Since the proxy isn't privy to the conversation, there's no need for it to be an HTTPS server itself, even though "https://" appears in the browser's URL bar. Its job is just to shovel TCP packets back and forth, which happen to contain undecipherable TLS traffic.
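
To make the tunneling concrete, here is a minimal sketch of a plain CONNECT tunnel built on Node's http and net modules. This shows the generic technique described above, not how Hoxy handles CONNECT (Hoxy deliberately deviates from it, as explained later). The parseConnectTarget and createTunnelProxy names are made up for this example, and bracketed IPv6 literals are not handled.

```javascript
const http = require('http');
const net = require('net');

// Split the "host:port" authority from a CONNECT request; default to 443.
function parseConnectTarget(authority) {
  const i = authority.lastIndexOf(':');
  if (i === -1) return { host: authority, port: 443 };
  return { host: authority.slice(0, i), port: Number(authority.slice(i + 1)) };
}

function createTunnelProxy() {
  const proxy = http.createServer((req, res) => {
    // Ordinary HTTP proxying is out of scope for this sketch.
    res.writeHead(501);
    res.end();
  });

  // Node emits 'connect' when a client sends a CONNECT request.
  proxy.on('connect', (req, clientSocket, head) => {
    const { host, port } = parseConnectTarget(req.url);
    const serverSocket = net.connect(port, host, () => {
      // Tell the client the pipe is open, then relay bytes both ways.
      clientSocket.write('HTTP/1.1 200 Connection Established\r\n\r\n');
      serverSocket.write(head);
      // From here on the proxy only sees opaque (TLS-encrypted) bytes.
      serverSocket.pipe(clientSocket);
      clientSocket.pipe(serverSocket);
    });
    serverSocket.on('error', () => clientSocket.destroy());
    clientSocket.on('error', () => serverSocket.destroy());
  });

  return proxy;
}
```

Calling createTunnelProxy().listen(8080) and pointing a client's HTTPS proxy setting at localhost:8080 should be enough to try it out.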

Reverse HTTPS proxying

A reverse HTTPS proxy is a different animal altogether. To set it up, you'd launch a proxy on port 8080. Since this is a reverse HTTPS proxy, you'll need a few more startup options:

  1. A private key.
  2. A certificate matching the above key, with CN=localhost.
  3. The reverse proxy target, consisting of "https://example.com".

The key and the cert in particular are necessary because a reverse proxy, from the client's POV, isn't a proxy at all, but an HTTPS webserver that speaks TLS. And because of the constraints of PKI, the client needs to trust the cert provided above. So one of two things needs to happen:

  1. The cert needs to be signed by a known CA, which requires giving somebody money.
  2. You must self-sign your own CA, make your client trust it, and sign the above cert with it.

Either way you'll end up with a signed, trusted cert and your reverse proxy should work.
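
A generic reverse HTTPS proxy along these lines can be sketched with Node's https module alone. This is not Hoxy's implementation; upstreamOptions and createReverseProxy are hypothetical names, and the sketch assumes the target is an https:// URL and that key and cert are PEM strings or buffers you have already loaded.

```javascript
const https = require('https');

// Build the outgoing request options for the reverse target.
// The Host header is rewritten so the upstream server sees its own name.
function upstreamOptions(target, req) {
  const url = new URL(req.url, target);
  return {
    protocol: url.protocol,
    hostname: url.hostname,
    port: url.port || 443,
    path: url.pathname + url.search,
    method: req.method,
    headers: Object.assign({}, req.headers, { host: url.host }),
  };
}

// To the client this is just an HTTPS webserver. Behind the scenes,
// each request is replayed against `target` and the response piped back.
function createReverseProxy({ key, cert, target }) {
  return https.createServer({ key, cert }, (req, res) => {
    const upstream = https.request(upstreamOptions(target, req), upstreamRes => {
      res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
      upstreamRes.pipe(res);
    });
    upstream.on('error', () => {
      res.statusCode = 502;
      res.end();
    });
    req.pipe(upstream);
  });
}
```

With the three startup options listed above, createReverseProxy({ key, cert, target: 'https://example.com' }).listen(8080) would serve the target at https://localhost:8080/.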

So how does Hoxy figure into this?

Hoxy follows these patterns, but with one major deviation having to do with direct proxying and HTTPS CONNECT tunneling. Instead of connecting the client to the remote server, Hoxy connects the client to its own private HTTPS server. This is a separate server instance from the proxy itself and solely exists to spoof TLS.

Normally, the client would immediately realize something fishy is going on. The spoofing server, being TLS, has provided a cert to the client that hasn't been signed by a client-trusted CA. Thus, to use Hoxy as an HTTPS direct proxy, you must create your own self-signed root CA, make your client trust it, then pass it to Hoxy as a startup option.

With this in mind, let's rewind to just before the client realizes something fishy is going on. Hoxy, having in its possession both the cert AND the private key of a client-trusted CA, will use it to generate spoofed keys and certs, signed by that CA, for each HTTPS domain the client visits. From the client's POV now, nothing fishy is going on. As far as it knows, its CONNECT requests are being honored by the proxy, and it's talking directly to the remote website in a private channel.

But in reality, the client is tunneling to an imposter server. Its traffic is decrypted and intercepted, then re-encrypted and sent onward. Neither the client nor the server is the wiser.

"tls" versus "certAuthority"

You may have noticed there are two key/cert options that can be passed to Hoxy, with different names: tls and certAuthority. It should now make sense why. The former is for reverse proxying, and is just Node's config option for running an HTTPS webserver. The latter is for direct proxying. It's a meta-key/cert combo, used to spoof actual key/cert combos on the fly. Incidentally, these two options are mutually exclusive. It wouldn't make sense to use both at the same time.

To see some openssl commands and HTTPS proxy specimens, see the examples section.
