Installation and Deployment

Installation Guides

IB Gateway

QuantRocket connects to IB's servers through IB Gateway, IB's lightweight alternative to Trader Workstation. You can run one or more IB Gateway services through QuantRocket, where each gateway instance is associated with a different username and password.

Benefits of multiple gateways

IB imposes rate limits on market data requests. The more IB Gateway services you run, the better your effective rate limits, since QuantRocket can spread the requests among several gateways. Even if you're an individual trader with one account, consider structuring the account as a Friends and Family account, which can yield 3 account logins: the master/advisor login, the master/advisor second user login, and a client trading login. If you then subscribe to market data under all 3 logins, you can triple your concurrency.

Start/stop IB Gateway

QuantRocket's IB Gateway service utilizes IBController, a popular tool for automating the startup and shutdown of IB Gateway or Trader Workstation. IBController is best suited for running a single, manually-configured instance of IB Gateway or Trader Workstation on a desktop (i.e. non-headless) computer. By running IBController inside a Docker service, QuantRocket adds extra functionality that allows for cloud as well as local deployments: automated configuration; headless installation with VNC access for troubleshooting; and the ability to run and control multiple IB Gateway instances via a REST API.

Launchpad is the name of the QuantRocket service used for launching and stopping IB Gateway. You can check the current status of your IB Gateway services (QuantRocket services use this endpoint to determine which gateways to connect to when requesting market data):

$ quantrocket launchpad status
ibg1: running
ibg2: running
ibg3: stopped
>>> from quantrocket.launchpad import list_gateway_statuses
>>> list_gateway_statuses()
{
    u'ibg1': u'running',
    u'ibg2': u'running',
    u'ibg3': u'stopped'
}
$ curl -X GET 'http://houston:1969/launchpad/gateways'
{
    "ibg1": "running",
    "ibg2": "running",
    "ibg3": "stopped"
}

Although IB Gateway is advertised as not requiring a daily restart the way Trader Workstation does, it's not unusual for IB Gateway to display unexpected behavior (such as not returning market data when requested) that is resolved simply by restarting it. Therefore you might find it beneficial to restart your gateways from time to time, which you can do via countdown, QuantRocket's cron service:

# Restart IB Gateways nightly at 1AM
0 1 * * * quantrocket launchpad stop --wait && quantrocket launchpad start

Or, perhaps you use one of your IB logins during the day to monitor the market using Trader Workstation, but in the evenings you'd like to use this login to add concurrency to your historical data downloads. You could start and stop the IB Gateway service in conjunction with the download:

# Download data in the evenings using all logins, but then disconnect from ibg2
30 17 * * 1-5 quantrocket launchpad start --wait --gateways ibg2 && quantrocket history fetch "nasdaq_eod" && quantrocket launchpad stop --gateways ibg2
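
If you prefer to drive the same workflow from Python (for example, from a script invoked by the countdown service), a rough sketch using the Python client is shown below. The start_gateways and stop_gateways function names and arguments are assumptions based on the CLI commands above; consult the API Reference for the exact names and signatures.

>>> from quantrocket.launchpad import start_gateways, stop_gateways  # assumed names
>>> from quantrocket.history import fetch_history
>>> # start the extra gateway and wait for it to come up (mirrors --wait above)
>>> start_gateways(gateways=["ibg2"], wait=True)
>>> # queue the historical data request
>>> fetch_history("nasdaq_eod")
>>> # fetching is asynchronous, so in practice you may want to wait for the history
>>> # queue to empty (e.g. by polling get_history_queue) before disconnecting
>>> stop_gateways(gateways=["ibg2"])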

Market data permissions

Generally, loading your market data permissions into QuantRocket is only necessary when you are running multiple IB Gateway services with different market data permissions for each.

To retrieve market data from IB, you must sign up for the appropriate market data subscriptions in Account Management. Since QuantRocket can't identify your subscriptions via API, you must tell QuantRocket about your subscriptions by loading a YAML configuration file. If you don't load a configuration file, QuantRocket will assume you have market data permissions for any data you request through QuantRocket. If you only run one IB Gateway service, this is probably sufficient and you can skip the configuration file. However, if you run multiple IB Gateway services with separate market data permissions for each, you will probably want to load a configuration file so QuantRocket can route your requests to the appropriate IB Gateway service. You should also update your configuration file whenever you modify your market data permissions in IB Account Management.

You can load your permissions into a running deployment as follows:

$ quantrocket launchpad config /path/to/my_new_config.yml
status: the config will be loaded asynchronously
>>> from quantrocket.launchpad import load_launchpad_config
>>> load_launchpad_config("/path/to/my_new_config.yml")
{u'status': u'the config will be loaded asynchronously'}
$ curl -X PUT 'http://houston:1969/launchpad/config' --upload-file /path/to/my_new_config.yml
{"status": "the config will be loaded asynchronously"}
The format of the config file is shown below:
# each top-level key is the name of an IB Gateway service
ibg1:
    # list the exchanges, by security type, this gateway has permission for
    marketdata:
        STK:
            - NYSE
            - ISLAND
            - TSEJ
        FUT:
            - GLOBEX
            - OSE
        CASH:
            - IDEALPRO
    # list the research services this gateway has permission for
    # (options: reuters, wsh)
    research:
        - reuters
        - wsh
    # list the number of simultaneous market data lines this gateway is allowed
    # (default 100)
    max_tickers: 260
# if you have multiple IB Gateway services, include a section for each
ibg2:
    marketdata:
        STK:
            - NYSE
    max_tickers: 100
You can also view the current config:
$ quantrocket launchpad config
ibg1:
  marketdata:
    CASH:
    - IDEALPRO
    FUT:
    - GLOBEX
    - OSE
    STK:
    - NYSE
    - ISLAND
    - TSEJ
  max_tickers: 260
  research:
  - reuters
  - wsh
ibg2:
  marketdata:
    STK:
    - NYSE
  max_tickers: 100
>>> from quantrocket.launchpad import get_launchpad_config
>>> get_launchpad_config()
{
    'ibg1': {
        'marketdata': {
            'CASH': [
                'IDEALPRO'
            ],
            'FUT': [
                'GLOBEX',
                'OSE'
            ],
            'STK': [
                'NYSE',
                'ISLAND',
                'TSEJ'
            ]
        },
        'max_tickers': 260,
        'research': [
            'reuters',
            'wsh'
        ]
    },
    'ibg2': {
        'marketdata': {
            'STK': [
                'NYSE'
            ]
        },
        'max_tickers': 100
    }
}
$ curl -X GET 'http://houston:1969/launchpad/config'
{
    "ibg1": {
        "marketdata": {
            "CASH": [
                "IDEALPRO"
            ],
            "FUT": [
                "GLOBEX",
                "OSE"
            ],
            "STK": [
                "NYSE",
                "ISLAND",
                "TSEJ"
            ]
        },
        "max_tickers": 260,
        "research": [
            "reuters",
            "wsh"
        ]
    },
    "ibg2": {
        "marketdata": {
            "STK": [
                "NYSE"
            ]
        },
        "max_tickers": 100
    }
}

The market data configuration file, if you upload one, is stored in QuantRocket as quantrocket.launchpad.permissions.yml. This is the filename you should use if you wish to store the configuration file in a Git repository and have QuantRocket automatically load it at the time of deployment using the codeload service.

IB Gateway GUI

Normally you won't need to access the IB Gateway GUI. However, you might need access to troubleshoot a login issue, or if you've enabled two-factor authentication for IB Gateway.

To allow access to the IB Gateway GUI, QuantRocket uses NoVNC, which relies on the WebSockets protocol to support VNC connections in the browser. First, start IB Gateway if it's not already running:

$ quantrocket launchpad start -g ibg1 --wait
ibg1:
  status: running

To open an IB Gateway GUI connection in your browser, click the Commands menu in JupyterLab, search for "QuantRocket", and click "IB Gateway GUI". The IB Gateway GUI will open in a new window (make sure your browser doesn't block the pop-up).

To quit the VNC session but leave IB Gateway running, simply close your browser tab.

For improved security for cloud deployments, QuantRocket doesn't directly expose any VNC ports to the outside. By proxying VNC connections through houston using NoVNC, such connections are protected by Basic Auth and SSL, just like every other request sent through houston.

IB Gateway log files

You can enable and view IB log files using docker exec (currently these commands are not exposed via the REST API or CLI).

Assuming the service name is ibg1 and container name is quantrocket_ibg1_1, list available API settings:

$ docker exec quantrocket_ibg1_1 gateway-ctl list-api-settings

Enable API message log file without rebuilding container:

$ docker exec quantrocket_ibg1_1 gateway-ctl edit-api-settings --createApiMsgLogFile=true

Save your API message log file to the host:

$ docker exec quantrocket_ibg1_1 gateway-ctl message-logfile --client-id=1 --weekday=Fri > api.1.Fri.log

If the message log file you choose doesn't exist, QuantRocket will list the available choices. Or you can call:

$ docker exec quantrocket_ibg1_1 gateway-ctl message-logfile --list

Set IB log level to Detail without rebuilding container:

$ docker exec quantrocket_ibg1_1 gateway-ctl edit-api-settings --logLevel=5

Save the log file to the host:

$ docker exec quantrocket_ibg1_1 gateway-ctl logfile > ibgateway.log

You can also grab a different date:

$ docker exec quantrocket_ibg1_1 gateway-ctl logfile --date=20161116 > ibgateway.log

Universe Selection

Trading starts with the selection of your trading universe. Many trading platforms assume you already have a list of symbols you want to trade and expect you to hand-enter them into the platform. With IB supporting dozens of global exchanges and thousands upon thousands of individual listings, QuantRocket doesn't assume you already know the ticker symbols of every instrument you might want to trade. QuantRocket makes it easy to retrieve all available listings and flexibly group them into universes that make sense for your trading strategies.

Fetch listings

First, decide which exchange(s) you want to work with. You can view exchange listings on the IB website or use QuantRocket to summarize the IB website by security type:
$ quantrocket master exchanges --regions asia --sec-types STK
STK:
  Australia:
  - ASX
  - CHIXAU
  Hong Kong:
  - SEHK
  - SEHKNTL
  - SEHKSZSE
  India:
  - NSE
  Japan:
  - CHIXJ
  - JPNNEXT
  - TSEJ
  Singapore:
  - SGX
>>> from quantrocket.master import list_exchanges
>>> list_exchanges(regions=["asia"], sec_types=["STK"])
{'STK': {'Australia': ['ASX', 'CHIXAU'],
         'Hong Kong': ['SEHK', 'SEHKNTL', 'SEHKSZSE'],
         'India': ['NSE'],
         'Japan': ['CHIXJ', 'JPNNEXT', 'TSEJ'],
         'Singapore': ['SGX']}}
$ curl 'http://houston:1969/master/exchanges?regions=asia&sec_types=STK'
{"STK": {"Australia": ["ASX", "CHIXAU"], "Hong Kong": ["SEHK", "SEHKNTL", "SEHKSZSE"], "India": ["NSE"], "Japan": ["CHIXJ", "JPNNEXT", "TSEJ"], "Singapore": ["SGX"]}}
Let's download contract details for all stock listings on the Hong Kong Stock Exchange:
$ quantrocket master listings --exchange SEHK --sec-types STK
status: the listing details will be fetched asynchronously
>>> from quantrocket.master import fetch_listings
>>> fetch_listings(exchange="SEHK", sec_types=["STK"])
{'status': 'the listing details will be fetched asynchronously'}
$ curl -X POST 'http://houston:1969/master/listings?exchange=SEHK&sec_types=STK'
{"status": "the listing details will be fetched asynchronously"}
QuantRocket uses the IB website to collect all symbols for the requested exchange, then downloads contract details from the IB API. The download runs asynchronously; check Papertrail or use the CLI to monitor the progress:
$ quantrocket flightlog stream --hist 5
12:07:40 quantrocket.master: INFO Fetching SEHK STK listings from IB website
12:08:29 quantrocket.master: INFO Requesting details for 2220 SEHK listings found on IB website
12:10:06 quantrocket.master: INFO Saved 2215 SEHK listings to securities master database
The number of listings fetched from the IB website might be larger than the number of listings actually saved to the database. This is because the IB website lists all symbols that trade on a given exchange, even if the exchange is not the primary listing exchange. For example, the primary listing exchange for Alcoa (AA) is NYSE, but the IB website also lists Alcoa under the BATS exchange because Alcoa also trades on BATS (and many other US exchanges). QuantRocket downloads and saves Alcoa's contract details when you fetch NYSE listings, not when you fetch BATS listings. For futures, the number of contracts saved to the database will typically be larger than the number of listings found on the IB website because the website only lists underlyings but QuantRocket saves all available expiries for each underlying.

Define universes

Once you've fetched listings that interest you, you can group them into meaningful universes. Universes provide a convenient way to refer to and manipulate large groups of securities when fetching historical data, running a trading strategy, etc. You can create universes based on exchanges, security types, sectors, liquidity, or any criteria you like.

There are different ways to create a universe. You can download a CSV of securities, manually pare it down to the desired securities, and create the universe from the edited list:

$ quantrocket master get --exchanges SEHK --outfile hongkong_securities.csv
$ # edit the CSV, then:
$ quantrocket master universe "hongkong" --infile hongkong_securities_edited.csv
code: hongkong
inserted: 2216
provided: 2216
total_after_insert: 2216
>>> from quantrocket.master import download_master_file, create_universe
>>> download_master_file("hongkong_securities.csv", exchanges=["SEHK"])
>>> # edit the CSV, then:
>>> create_universe("hongkong", infilepath_or_buffer="hongkong_securities_edited.csv")
{'code': 'hongkong',
 'inserted': 2216,
 'provided': 2216,
 'total_after_insert': 2216}
$ curl -X GET 'http://houston:1969/master/securities.csv?exchanges=SEHK' > hongkong_securities.csv
$ # edit the CSV, then:
$ curl -X PUT 'http://houston:1969/master/universes/hongkong' --upload-file hongkong_securities_edited.csv
{"code": "hongkong", "provided": 2216, "inserted": 2216, "total_after_insert": 2216}

Using the CLI, you can create a universe in one line by piping the downloaded CSV to the universe command:

$ quantrocket master get --exchanges SEHK --sectors "Financial" | quantrocket master universe "hongkong-fin" --infile -
code: hongkong-fin
inserted: 416
provided: 416
total_after_insert: 416
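
The same filtering can be approximated with the Python client by downloading the securities to an in-memory file and passing it to create_universe. The sectors parameter below is an assumption mirroring the --sectors CLI option.

>>> import io
>>> from quantrocket.master import download_master_file, create_universe
>>> f = io.StringIO()
>>> # filter to Hong Kong financial stocks ('sectors' assumed to mirror --sectors)
>>> download_master_file(f, exchanges=["SEHK"], sectors=["Financial"])
>>> create_universe("hongkong-fin", infilepath_or_buffer=f)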

You can also create a universe from existing universes:

$ quantrocket master universe "asx" --from-universes "asx-sml" "asx-mid" "asx-lrg"
code: asx
inserted: 1604
provided: 1604
total_after_insert: 1604
>>> from quantrocket.master import create_universe
>>> create_universe("asx", from_universes=["asx-sml", "asx-mid", "asx-lrg"])
{'code': 'asx',
 'inserted': 1604,
 'provided': 1604,
 'total_after_insert': 1604}
$ curl -X PUT 'http://houston:1969/master/universes/asx?from_universes=asx-sml&from_universes=asx-mid&from_universes=asx-lrg'
{"code": "asx", "provided": 1604, "inserted": 1604, "total_after_insert": 1604}

Use market data to define universes

If you want to split up the universes by liquidity or some other kind of market data or fundamental data, the best approach is to create a universe comprising the entire pool of relevant securities, fetch the needed data for this universe, then create the sub-universes.

For example, suppose we've fetched all NYSE stock listings and want to create 3 universes - smallcaps, midcaps, and largecaps - based on the 90-day average dollar volume. First, we should define a universe of all NYSE stocks, then create a history database and fetch historical data for all NYSE stocks (see the Historical Data section for more detail).

$ quantrocket master get -e 'NYSE' -t 'STK' | quantrocket master universe 'nyse-stk' -f -
code: nyse-stk
inserted: 3109
provided: 3109
total_after_insert: 3109
$ quantrocket history create-db 'nyse-eod' --bar-size '1 day' --universes 'nyse-stk'
status: successfully created quantrocket.history.nyse-eod.sqlite
$ quantrocket history fetch 'nyse-eod'
status: the historical data will be fetched asynchronously
Once the historical data has been fetched (monitor flightlog for status), you can use pandas and the Python client to determine average dollar volume and create your universes. First, query the history database and load into pandas:
>>> from quantrocket.history import download_history_file
>>> from quantrocket.master import create_universe
>>> import pandas as pd
>>> import io
>>> f = io.StringIO()
>>> download_history_file("nyse-eod", f, fields=["Close", "Volume"])
>>> prices = pd.read_csv(f, parse_dates=["Date"])
>>> prices.head()
       ConId       Date   Close   Volume
0    5026974 2017-01-25  173.43   749100
1  173253787 2017-01-25   29.32  1769600
2  189797411 2017-01-25   24.70    47600
3  121250909 2017-01-25   71.16   146700
4   13881236 2017-01-25   14.04    36200

Let's get a DataFrame of closes and another of volumes, with the dates as the index and the conids as the columns:

>>> prices = prices.pivot(index="ConId", columns="Date").T
>>> closes = prices.loc["Close"]
>>> volumes = prices.loc["Volume"]
>>> closes.head()
ConId            4200       4205       4211       4227    ...
Date                                                      ...  
2017-01-25       5.56      45.92      62.97      70.27    ...
2017-01-26       5.46      46.17      63.21      70.35    ...
2017-01-27       5.48      47.70      64.06      69.99    ...
2017-01-30       5.55      47.40      63.09      67.14    ...
2017-01-31       5.49      48.65      63.78      67.83    ...

Next we calculate daily dollar volume and take a 90-day average:

>>> dollar_volumes = closes * volumes
>>> avg_dollar_volumes = dollar_volumes.rolling(window=90).mean()
>>> # we'll make our universes based on the latest day's averages
>>> avg_dollar_volumes = avg_dollar_volumes.iloc[-1]
>>> avg_dollar_volumes.describe()
count    2.255000e+03
mean     3.609773e+07
std      9.085866e+07
min      3.270559e+04
25%      1.058080e+06
50%      6.229675e+06
75%      3.399090e+07
max      1.719344e+09
Name: 2017-08-15 00:00:00, dtype: float64

Let's make universes of $1-5M, $5-25M, and $25M+:

>>> sml = avg_dollar_volumes[(avg_dollar_volumes >= 1000000) & (avg_dollar_volumes < 5000000)]
>>> mid = avg_dollar_volumes[(avg_dollar_volumes >= 5000000) & (avg_dollar_volumes < 25000000)]
>>> lrg = avg_dollar_volumes[avg_dollar_volumes >= 25000000]

The indexes of these Series contain the conids which are needed to make the universes, so we write them to in-memory CSVs and pass the CSVs to the master service:

>>> f = io.StringIO()
>>> sml.to_csv(f, header=True)
>>> create_universe("nyse-sml", infilepath_or_buffer=f)
{'code': 'nyse-sml',
 'inserted': 509,
 'provided': 509,
 'total_after_insert': 509}
>>> f = io.StringIO()
>>> mid.to_csv(f, header=True)
>>> create_universe("nyse-mid", infilepath_or_buffer=f)
{'code': 'nyse-mid',
 'inserted': 530,
 'provided': 530,
 'total_after_insert': 530}
>>> f = io.StringIO()
>>> lrg.to_csv(f, header=True)
>>> create_universe("nyse-lrg", infilepath_or_buffer=f)
{'code': 'nyse-lrg',
 'inserted': 665,
 'provided': 665,
 'total_after_insert': 665}

On a side note, now that you've created different universes for different market caps, a typical workflow might involve creating a history database for each universe. As described more fully in the Historical Data documentation, you can seed your databases for each market cap segment from the historical data you've already fetched, saving you the trouble of re-fetching the data from scratch.

$ quantrocket history create-db 'nyse-sml-eod' --bar-size '1 day' --universes 'nyse-sml'
status: successfully created quantrocket.history.nyse-sml-eod.sqlite
$ quantrocket history get 'nyse-eod' --universes 'nyse-sml' | quantrocket history load 'nyse-sml-eod'
db: nyse-sml-eod
loaded: 572081

See the Historical Data documentation for more details on copying data from one history database to another.

Futures rollover rules

You can define rollover rules for the futures contracts you trade, and QuantRocket will automatically calculate the rollover date for each expiry and store it in the securities master database. Your rollover rules are used to determine the front month contract when stitching together continuous futures contracts and when automating position rollover.

The format of the rollover rules configuration file is shown below:

# quantrocket.master.rollover.yml

# each top level key is an exchange code
GLOBEX:
  # each second-level key is an underlying symbol
  ES:
    # the rollrule key defines how to derive the rollover date
    # from the expiry/LastTradeDate; the arguments will be passed
    # to bdateutil.relativedelta. For valid args, see:
    # https://dateutil.readthedocs.io/en/stable/relativedelta.html
    # https://github.com/ryanss/python-bdateutil#documentation
    rollrule:
      # roll 8 calendar days before expiry
      days: -8
    # if the same rollover rules apply to numerous futures contracts,
    # you can save typing and enter them all at once under the same_for key
    same_for:
      - NQ
      - RS
      - YM
  MXP:
    # If you want QuantRocket to ignore certain contract months,
    # you can specify the months you want (using numbers not letters)
    # Only the March, June, Sept, and Dec MXP contracts are liquid
    only_months:
      - 3
      - 6
      - 9
      - 12
    rollrule:
      # roll 7 calendar days before expiry
      days: -7
    same_for:
      - GBP
      - JPY
      - AUD
  HE:
    rollrule:
      # roll on 27th day of month prior to expiry month
      months: -1
      day: 27
NYMEX:
  RB:
    rollrule:
      # roll 2 business days before expiry
      bdays: -2
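
To make the rollrule arguments concrete, here is a rough illustration of how a rule like days: -8 translates into a rollover date, using plain dateutil (QuantRocket itself uses bdateutil, which adds business-day arguments such as bdays):

>>> from datetime import date
>>> from dateutil.relativedelta import relativedelta
>>> # LastTradeDate of an ES contract (see the rollover date query further below)
>>> last_trade_date = date(2016, 12, 16)
>>> # the GLOBEX ES rollrule of days: -8 rolls 8 calendar days before expiry
>>> last_trade_date + relativedelta(days=-8)
datetime.date(2016, 12, 8)

The result matches the RolloverDate column in the queries shown below.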

You can load your rollover rules into a running deployment as follows:

$ quantrocket master rollrules /path/to/quantrocket.master.rollover.yml
status: the config will be loaded asynchronously
>>> from quantrocket.master import load_rollrules_config
>>> load_rollrules_config("/path/to/quantrocket.master.rollover.yml")
{u'status': u'the config will be loaded asynchronously'}
$ curl -X PUT 'http://houston:1969/master/config/rollover' --upload-file /path/to/quantrocket.master.rollover.yml
{"status": "the config will be loaded asynchronously"}

The rollover rules configuration file, if you upload one, is stored in QuantRocket as quantrocket.master.rollover.yml. This is the filename you should use if you wish to store the configuration file in a Git repository and have QuantRocket automatically load it at the time of deployment using the codeload service.

You can query your rollover dates:

$ quantrocket master get --exchanges GLOBEX --symbols ES --sec-types FUT --fields Symbol LastTradeDate RolloverDate
ConId,Symbol,LastTradeDate,RolloverDate
206848474,ES,2016-12-16,2016-12-08
215465490,ES,2017-03-17,2017-03-09
225652200,ES,2017-06-16,2017-06-08
236950077,ES,2017-09-15,2017-09-07
247950613,ES,2017-12-15,2017-12-07
258973438,ES,2018-03-16,2018-03-08
269745169,ES,2018-06-15,2018-06-07
279396694,ES,2018-09-21,2018-09-13
>>> from quantrocket.master import download_master_file
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["GLOBEX"], symbols=["ES"], sec_types=["FUT"], fields=["Symbol", "LastTradeDate", "RolloverDate"])
>>> df = pd.read_csv(f)
>>> df.tail()
       ConId Symbol LastTradeDate RolloverDate
8   236950077     ES    2017-09-15   2017-09-07
9   247950613     ES    2017-12-15   2017-12-07
10  258973438     ES    2018-03-16   2018-03-08
11  269745169     ES    2018-06-15   2018-06-07
12  279396694     ES    2018-09-21   2018-09-13
$ curl 'http://houston:1969/master/securities.csv?exchanges=GLOBEX&symbols=ES&sec_types=FUT&fields=Symbol&fields=LastTradeDate&fields=RolloverDate'
ConId,Symbol,LastTradeDate,RolloverDate
206848474,ES,2016-12-16,2016-12-08
215465490,ES,2017-03-17,2017-03-09
225652200,ES,2017-06-16,2017-06-08
236950077,ES,2017-09-15,2017-09-07
247950613,ES,2017-12-15,2017-12-07
258973438,ES,2018-03-16,2018-03-08
269745169,ES,2018-06-15,2018-06-07
279396694,ES,2018-09-21,2018-09-13
Or query only the front month contract:
$ quantrocket master get --exchanges GLOBEX --symbols ES --sec-types FUT --frontmonth --pretty
           ConId = 236950077
          Symbol = ES
         SecType = FUT
             Etf = 0
     PrimaryExchange = GLOBEX
        Currency = USD
     LocalSymbol = ESU7
    TradingClass = ES
      MarketName = ES
        LongName = E-mini S&P 500
        Timezone = America/Chicago
          Sector =
        Industry =
        Category =
         MinTick = 0.25
  PriceMagnifier = 1
MdSizeMultiplier = 1
   LastTradeDate = 2017-09-15
    RolloverDate = 2017-09-07
   ContractMonth = 201709
      Multiplier = 50
         LotSize =
        Delisted = 0
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["GLOBEX"], symbols=["ES"], sec_types=["FUT"], frontmonth=True, output="txt")
>>> print(f.getvalue())
           ConId = 236950077
          Symbol = ES
         SecType = FUT
             Etf = 0
     PrimaryExchange = GLOBEX
        Currency = USD
     LocalSymbol = ESU7
    TradingClass = ES
      MarketName = ES
        LongName = E-mini S&P 500
        Timezone = America/Chicago
          Sector =
        Industry =
        Category =
         MinTick = 0.25
  PriceMagnifier = 1
MdSizeMultiplier = 1
   LastTradeDate = 2017-09-15
    RolloverDate = 2017-09-07
   ContractMonth = 201709
      Multiplier = 50
         LotSize =
        Delisted = 0
$ curl 'http://houston:1969/master/securities.txt?exchanges=GLOBEX&symbols=ES&sec_types=FUT&frontmonth=true'
           ConId = 236950077
          Symbol = ES
         SecType = FUT
             Etf = 0
     PrimaryExchange = GLOBEX
        Currency = USD
     LocalSymbol = ESU7
    TradingClass = ES
      MarketName = ES
        LongName = E-mini S&P 500
        Timezone = America/Chicago
          Sector =
        Industry =
        Category =
         MinTick = 0.25
  PriceMagnifier = 1
MdSizeMultiplier = 1
   LastTradeDate = 2017-09-15
    RolloverDate = 2017-09-07
   ContractMonth = 201709
      Multiplier = 50
         LotSize =
        Delisted = 0

Option chains

To fetch option chains, first fetch listings for the underlying securities:

$ quantrocket master listings --exchange 'NASDAQ' --sec-types 'STK' --symbols 'GOOG' 'FB' 'AAPL'
status: the listing details will be fetched asynchronously
>>> from quantrocket.master import fetch_listings
>>> fetch_listings(exchange="NASDAQ", sec_types=["STK"], symbols=["GOOG", "FB", "AAPL"])
{'status': 'the listing details will be fetched asynchronously'}
$ curl -X POST 'http://houston:1969/master/listings?exchange=NASDAQ&sec_types=STK&symbols=GOOG&symbols=FB&symbols=AAPL'
{"status": "the listing details will be fetched asynchronously"}
Then request option chains for the underlying stocks:
$ quantrocket master get -e 'NASDAQ' -t 'STK' -s 'GOOG' 'FB' 'AAPL' | quantrocket master options --infile -
status: the option chains will be fetched asynchronously
>>> from quantrocket.master import download_master_file, fetch_option_chains
>>> import io
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["NASDAQ"], sec_types=["STK"], symbols=["GOOG", "FB", "AAPL"])
>>> fetch_option_chains(infilepath_or_buffer=f)
{'status': 'the option chains will be fetched asynchronously'}
$ curl -X GET 'http://houston:1969/master/securities.csv?exchanges=NASDAQ&sec_types=STK&symbols=GOOG&symbols=FB&symbols=AAPL' > nasdaq_mega.csv
$ curl -X POST 'http://houston:1969/master/options' --upload-file nasdaq_mega.csv
{"status": "the option chains will be fetched asynchronously"}
Once the options request has finished, you can query the options like any other security:
$ quantrocket master get -s 'GOOG' 'FB' 'AAPL' -t 'OPT' --outfile 'options.csv'
>>> from quantrocket.master import download_master_file
>>> download_master_file("options.csv", symbols=["GOOG", "FB", "AAPL"], sec_types=["OPT"])
$ curl -X GET 'http://houston:1969/master/securities.csv?symbols=GOOG&symbols=FB&symbols=AAPL&sec_types=OPT' > options.csv
Option chains often consist of hundreds, sometimes thousands of options per underlying security. Be aware that requesting option chains for large universes of underlying securities, such as all stocks on the NYSE, can take numerous hours to complete, add hundreds of thousands of rows to the securities master database, increase the database file size by several hundred megabytes, and potentially add latency to database queries.

Maintain listings

Listings change over time and QuantRocket helps you keep your securities master database up-to-date. You can monitor for changes to your existing listings (such as a company moving its listing from one exchange to another), you can delist securities to exclude them from your backtests and trading (without deleting them), and you can look for new listings.

Listings diffs

Security listings can change - for example, a stock might be delisted from Nasdaq and start trading OTC - and we probably want to be alerted when this happens. We can flag securities where the details as stored in our database differ from the latest details available from IB.
$ quantrocket master diff --universes "nasdaq"
status: the diff, if any, will be logged to flightlog asynchronously
>>> from quantrocket.master import diff_securities
>>> diff_securities(universes=["nasdaq"])
{'status': 'the diff, if any, will be logged to flightlog asynchronously'}
$ curl -X GET 'http://houston:1969/master/diff?universes=nasdaq'
{"status": "the diff, if any, will be logged to flightlog asynchronously"}

If any listings have changed, they'll be logged to flightlog at the WARNING level with a description of what fields have changed. You may wish to schedule this command on your countdown service and monitor Papertrail for the results.

Delist stocks

Perhaps a stock has moved to the pink sheets and we're not interested in it anymore. We can delist it, which will retain the data but allow it to be excluded from our backtests and trading.

$ quantrocket master delist --conid 194245757
msg: delisted conid 194245757
>>> from quantrocket.master import delist_security
>>> delist_security(conid=194245757)
{'msg': 'delisted conid 194245757'}
$ curl -X DELETE 'http://houston:1969/master/securities?conids=194245757'
{"msg": "delisted conid 194245757"}

If you want to automate the delisting, you can run quantrocket master diff with the --delist-missing option, which delists securities that are no longer available from IB, and with the --delist-exchanges option, which delists securities associated with the exchanges you specify (note that IB uses the "VALUE" exchange as a placeholder for some delisted symbols):

$ quantrocket master diff --universes "nasdaq" --delist-missing --delist-exchanges VALUE PINK

When you delist a security, QuantRocket doesn't delete it but simply marks it as delisted so you can exclude it from your queries. If you wish, you can still include it in your queries by using the --delisted option:

$ # By default, exclude delisted securities that would otherwise match the query
$ quantrocket master get --universes "nasdaq" --outfile nasdaq_active.csv
$ # Or include delisted securities
$ quantrocket master get --universes "nasdaq" --delisted --outfile nasdaq_all.csv
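
The Python client exposes the same filter; the delisted parameter name below is an assumption based on the --delisted CLI option.

>>> from quantrocket.master import download_master_file
>>> # by default, delisted securities are excluded from the results
>>> download_master_file("nasdaq_active.csv", universes=["nasdaq"])
>>> # pass delisted=True (assumed to mirror --delisted) to include them
>>> download_master_file("nasdaq_all.csv", universes=["nasdaq"], delisted=True)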

Ticker symbol changes

Sometimes when a ticker symbol changes, IB will preserve the conid (contract ID); in this case, to incorporate the changes into our database, we can simply fetch the listing details for the symbol we care about, which will overwrite the old (stale) listing details:

$ # Look up the symbol's conid and fetch the listings for just that conid
$ quantrocket master get --exchanges TSE --symbols OLD --pretty --fields ConId
ConId = 123456
$ quantrocket master listings -i 123456
status: the listing details will be fetched asynchronously

However, sometimes IB will issue a new conid. In this case, if you want to continue trading the symbol, you should delist the old symbol, fetch the new listing, and append the new symbol to the universe(s) you care about:

$ quantrocket master delist --exchange TSE --symbol OLD
msg: delisted conid 123456
$ quantrocket master listings --exchange TSE --symbols NEW --sec-types STK
$ # check flightlog and wait for listing download to complete, then:
$ quantrocket master get -e TSE -s NEW -t STK | quantrocket master universe "canada" --append --infile -

The above examples expect you to take action in response to individual ticker changes, but what if your universes consist of thousands of stocks and you don't want to deal with them individually? Use quantrocket master diff --delist-missing to automate the delisting of symbols that go missing, as described in the previous section, and use quantrocket master listings to periodically fetch any listings that might belong in your universe(s), as described in the next section. If any symbols go missing due to ticker changes that cause IB to issue a new conid, you'll pick up the new listings the next time you run quantrocket master listings.

Add new listings

What if you want to look for new listings that IB has added since your initial universe creation and add them to your universe? First, fetch all listings again from IB:
$ quantrocket master listings --exchange SEHK --sec-types STK
status: the listing details will be fetched asynchronously
>>> from quantrocket.master import fetch_listings
>>> fetch_listings(exchange="SEHK", sec_types=["STK"])
{'status': 'the listing details will be fetched asynchronously'}
$ curl -X POST 'http://houston:1969/master/listings?exchange=SEHK&sec_types=STK'
{"status": "the listing details will be fetched asynchronously"}
You can see what's new by excluding what you already have:
$ quantrocket master get --exchanges SEHK --exclude-universes "hongkong" --outfile new_hongkong_securities.csv
>>> from quantrocket.master import download_master_file
>>> download_master_file("new_hongkong_securities.csv", exchanges=["SEHK"], exclude_universes=["hongkong"])
$ curl -X GET 'http://houston:1969/master/securities.csv?exchanges=SEHK&exclude_universes=hongkong' > new_hongkong_securities.csv
If you like what you see, you can then append the new listings to your universe:
$ quantrocket master universe "hongkong" --infile new_hongkong_securities.csv
code: hongkong
inserted: 10
provided: 10
total_after_insert: 2226
>>> from quantrocket.master import create_universe
>>> create_universe("hongkong", infilepath_or_buffer="new_hongkong_securities.csv", append=True)
{'code': 'hongkong',
 'inserted': 10,
 'provided': 10,
 'total_after_insert': 2226}
$ curl -X PATCH 'http://houston:1969/master/universes/hongkong' --upload-file new_hongkong_securities.csv
{"code": "hongkong", "provided": 10, "inserted": 10, "total_after_insert": 2226}
For futures, IB provides several years of expiries. From time to time, you should fetch the listings again for your futures exchange(s) to collect the new expiries, then add them to the relevant universes.
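
For example, a periodic refresh of GLOBEX ES expiries might look like the sketch below, which re-uses the exclude-universes pattern shown above (the universe name 'es-fut' is a placeholder):

>>> import io
>>> from quantrocket.master import fetch_listings, download_master_file, create_universe
>>> # re-fetch the listings to pick up newly added expiries
>>> fetch_listings(exchange="GLOBEX", sec_types=["FUT"], symbols=["ES"])
>>> # monitor flightlog and wait for the fetch to complete, then append anything
>>> # not already in the universe
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["GLOBEX"], sec_types=["FUT"], symbols=["ES"], exclude_universes=["es-fut"])
>>> create_universe("es-fut", infilepath_or_buffer=f, append=True)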

Historical Data

QuantRocket makes it easy to retrieve and work with IB's abundant, global historical market data. (Appropriate IB market data subscriptions required.) Simply define your historical data requirements, and QuantRocket will retrieve data from IB according to your requirements and store it in a database for fast, flexible querying. You can create as many databases as you need for your backtesting and trading.

Create historical databases

Create a database by defining, at minimum, the bar size you want and the universe of securities to include. Suppose we've used the master service to define a universe of banking stocks on the Tokyo Stock Exchange, and now we want to collect end-of-day historical data for those stocks. First, create the database:

$ quantrocket history create-db 'japan-bank-eod' --universes 'japan-bank' --bar-size '1 day'
status: successfully created quantrocket.history.japan-bank-eod.sqlite
>>> from quantrocket.history import create_db
>>> create_db("japan-bank-eod", universes=["japan-bank"], bar_size="1 day")
{'status': 'successfully created quantrocket.history.japan-bank-eod.sqlite'}
$ curl -X PUT 'http://houston:1969/history/databases/japan-bank-eod?universes=japan-bank&bar_size=1 day'
{"status": "successfully created quantrocket.history.japan-bank-eod.sqlite"}
Then, fill up the database with data from IB:
$ quantrocket history fetch 'japan-bank-eod'
status: the historical data will be fetched asynchronously
>>> from quantrocket.history import fetch_history
>>> fetch_history("japan-bank-eod")
{'status': 'the historical data will be fetched asynchronously'}
$ curl -X POST 'http://houston:1969/history/queue?codes=japan-bank-eod'
{"status": "the historical data will be fetched asynchronously"}
QuantRocket will first query the IB API to determine how far back historical data is available for each security, then query the IB API again to fetch the data for that date range. Depending on the bar size and the number of securities in the universe, fetching data can take from several minutes to several hours. If you're running multiple IB Gateway services, QuantRocket will spread the requests among the services to speed up the process. Based on how quickly the IB API is responding to requests, QuantRocket will periodically estimate how long it will take to fetch the data. You can monitor flightlog via the command line or Papertrail to track progress:
$ quantrocket flightlog stream
2017-08-22 13:24:09 quantrocket.history: INFO [japan-bank-eod] Determining how much history is available from IB for japan-bank-eod
2017-08-22 13:25:45 quantrocket.history: INFO [japan-bank-eod] Fetching history from IB for japan-bank-eod
2017-08-22 13:26:11 quantrocket.history: INFO [japan-bank-eod] Expected remaining runtime to fetch japan-bank-eod history based on IB response times so far: 0:23:11
2017-08-22 13:55:00 quantrocket.history: INFO [japan-bank-eod] Saved 468771 total records for 85 total securities to quantrocket.history.japan-bank-eod.sqlite
In addition to bar size and universe(s), you can optionally define the type of data you want (for example, trades, bid/ask, midpoint, etc.), a fixed start date instead of "as far back as possible", whether to include trades from outside regular trading hours, whether to use consolidated prices or primary exchange prices, and more. For a complete list of options, view the API Reference. As you become interested in new exchanges or want to test new ideas, you can keep adding new databases with as many different configurations as you like.
Once you've created a database, you can't edit the configuration; you can only add new databases. If you made a mistake or no longer need an old database, you can use the CLI to drop the database and its associated config.
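
As a sketch of a more fully specified database, the call below combines several of the options mentioned above. The bar_type, outside_rth, and primary_exchange parameter names (and the BID_ASK value) are assumptions based on the corresponding CLI options; consult the API Reference for the exact spellings.

>>> from quantrocket.history import create_db
>>> # hypothetical bid/ask database with a fixed start date, restricted to regular
>>> # trading hours and primary exchange prices (parameter names are assumptions)
>>> create_db("usa-liquid-15min-bidask", universes=["usa-liquid"], bar_size="15 mins", bar_type="BID_ASK", start_date="2015-01-01", outside_rth=False, primary_exchange=True)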

Initial data collection

Depending on the bar size, number of securities, and date range of your historical database, initial data collection from the IB API can take some time. For this reason, it's a good idea to estimate the runtime in advance so you know what to expect and can optimize your data collection strategy. After the initial data collection, keeping your database up to date is much faster and much easier.

QuantRocket fills your historical database by making a series of requests to the IB API. Each request fetches a portion of the historical data for a given security, from older data to newer data. For example, to get 1-day bars for a date range of 2014-06-01 to 2017-06-01, QuantRocket would make 3 sequential requests for 1 year of data each:

  • Request 1: 2014-06-01 to 2015-06-01
  • Request 2: 2015-06-01 to 2016-06-01
  • Request 3: 2016-06-01 to 2017-06-01

The amount of data fetched per request depends on the bar size: the smaller the bar size, the shorter the date span of each request, and consequently the greater number of requests that are required to fetch the total date range.

IB API response times vary, but 2 seconds per request is typical. The table below shows each available bar size, how much data can be collected per request, and the approximate runtime and resulting database size for 1 year's worth of 1 security's data.

Bar size | Average response time per request | Data collected per request | Runtime per ticker-year | Database size per ticker-year
1 month  | 2 sec                             | 1 year                     | 2 sec                   | 4 KB
1 week   | 2 sec                             | 1 year                     | 2 sec                   | 15 KB
1 day    | 2 sec                             | 1 year                     | 2 sec                   | 75 KB
8 hours  | 2 sec                             | 1 month                    | 24 sec                  | 75 KB
4 hours  | 2 sec                             | 1 month                    | 24 sec                  | 150 KB
3 hours  | 2 sec                             | 1 month                    | 24 sec                  | 225 KB
2 hours  | 2 sec                             | 1 month                    | 24 sec                  | 300 KB
1 hour   | 2 sec                             | 1 month                    | 24 sec                  | 530 KB
30 min   | 2 sec                             | 1 month                    | 24 sec                  | 980 KB
20 min   | 2 sec                             | 1 week                     | 104 sec                 | 1.5 MB
15 min   | 2 sec                             | 1 week                     | 104 sec                 | 2 MB
10 min   | 2 sec                             | 1 week                     | 104 sec                 | 3 MB
5 min    | 2 sec                             | 1 week                     | 104 sec                 | 6 MB
3 min    | 2 sec                             | 1 week                     | 104 sec                 | 9.8 MB
2 min    | 2 sec                             | 2 days                     | 365 sec                 | 14.7 MB
1 min    | 2 sec                             | 1 day                      | 730 sec                 | 29 MB
Bar sizes smaller than 1 minute are available but are limited to 6 months of history and are not shown in the above table.

Using the above table, you can estimate the total runtime and total database size for your historical database using the following formulas.

Runtime

The formula for total runtime is:

runtime per ticker-year X number of tickers X number of years to collect / number of IB Gateways

This gives you the total runtime in seconds; divide by 3600 to get total hours.

Note that the number of IB Gateways appears in the denominator of the above formula: if you run multiple IB Gateways, each with appropriate IB market data subscriptions, QuantRocket splits the requests among the gateways, which results in a proportionate reduction in runtime.

Database size

The formula for total database size is:

database size per ticker-year X number of tickers X number of years to collect
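
For example, applying both formulas to 500 tickers, 5 years of 1 day bars, and a single IB Gateway reproduces the first row of the table below:

>>> # from the bar size table above: 1 day bars take ~2 sec and ~75 KB per ticker-year
>>> runtime_per_ticker_year, size_per_ticker_year = 2, 75
>>> tickers, years, gateways = 500, 5, 1
>>> runtime_per_ticker_year * tickers * years / gateways / 3600.0   # total runtime in hours
1.3888888888888888
>>> size_per_ticker_year * tickers * years / 1000.0   # total database size in MB
187.5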

The following table shows estimated runtimes and database sizes for a variety of historical database configurations:

Bar size | Number of tickers | Number of years | Number of IB Gateways | Total runtime    | Total database size
1 day    | 500 tickers       | 5 years         | 1                     | 1.4 hours        | 187 MB
1 day    | 3000 tickers      | 10 years        | 1                     | 17 hours         | 2.25 GB
30 min   | 1000 tickers      | 2 years         | 1                     | 13 hours         | 2 GB
30 min   | 1500 tickers      | 5 years         | 1                     | 2 days, 2 hours  | 7.3 GB
5 min    | 1000 tickers      | 2 years         | 1                     | 2 days, 10 hours | 12 GB
1 min    | 200 tickers       | 2 years         | 1                     | 3 days, 9 hours  | 11.6 GB
1 min    | 2000 tickers      | 10 years        | 1                     | 169 days         | 580 GB

The last line is included to demonstrate that some quantities of data, while perhaps possible to obtain, aren't very practical.

Data collection best practices

All of the best practices listed below can be summarized in one principle: try to keep your databases as small as possible. Small databases are faster to fill initially, take up less disk space, and, most importantly, are faster and easier to work with in research, backtesting, and trading.

Daily bars before intraday bars

Suppose you want to collect intraday bars for the top 1000 liquid securities trading on NYSE and NASDAQ. Collecting intraday bars for all NYSE and NASDAQ securities then filtering out illiquid ones will be time-consuming and inefficient. Instead, use this approach:

  • collect just under a year's worth of daily bars for all NYSE and NASDAQ securities (this requires only 1 request to the IB API per security and will run much faster than collecting multiple years of intraday bars)
  • in a notebook, query the daily bars and use them to calculate dollar volume, then create a universe of liquid securities only (see usage guide section on using price data to define universes)
  • fetch intraday bars for the universe of liquid securities only

You can periodically repeat this process to update the universe constituents.
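
Once the liquid universe exists, the intraday database only needs to cover that universe, as sketched below (the universe and database names are placeholders):

>>> from quantrocket.history import create_db, fetch_history
>>> # collect intraday bars only for the liquid universe built from the daily bars
>>> create_db("nyse-liquid-15min", universes=["nyse-liquid"], bar_size="15 mins")
>>> fetch_history("nyse-liquid-15min")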

Filter by availability of fundamentals

Suppose you have a strategy that requires intraday bars and fundamental data and utilizes a universe of small-cap stocks. For many small-cap stocks, fundamental data won't be available, so it doesn't make sense to spend time collecting intraday historical data for stocks that won't have fundamental data. Instead, collect the fundamental data first and filter your universe to stocks with fundamentals, then fetch the historical intraday data. For example:

  • create a universe of all Japanese small-cap stocks called 'japan-sml'
  • fetch fundamentals for the universe 'japan-sml'
  • in a notebook, query the fundamentals for 'japan-sml' and use the query results to create a new universe called 'japan-sml-with-fundamentals'
  • fetch intraday price history for 'japan-sml-with-fundamentals'

Earliest history before later history

Suppose you want to collect numerous years of intraday bars. But you'd like to test your ideas on a smaller date range first in order to decide if collecting the full history is worthwhile. This can be done as follows. First, define your desired start date when you create the database:

$ quantrocket history create-db 'usa-liquid-15min' -u 'usa-liquid' -z '15 mins' -s '2011-01-01'

The above database is designed to fetch data back to 2011-01-01 and up to the present. However, you can temporarily specify an end date when fetching the data:

$ quantrocket history fetch 'usa-liquid-15min' -e '2012-01-01'

In this example, only a year of data will be fetched (that is, from the start date of 2011-01-01 specified when the database was created to the end date of 2012-01-01 specified in the above command). That way you can start your research sooner. Later, you can repeat this command with a later end date or remove the end date entirely to bring the database current.
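
The same pattern in the Python client might look like the following sketch; the start_date and end_date parameter names are assumed to mirror the -s and -e options shown above.

>>> from quantrocket.history import create_db, fetch_history
>>> # database designed to reach back to 2011...
>>> create_db("usa-liquid-15min", universes=["usa-liquid"], bar_size="15 mins", start_date="2011-01-01")
>>> # ...but only fetch through the end of 2011 for now; re-run later without
>>> # end_date to bring the database current
>>> fetch_history("usa-liquid-15min", end_date="2012-01-01")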

In contrast, it's a bad idea to use a temporary start date to shorten the date range and speed up the data collection, with the intention of going back later to get the earlier data. Since data is filled from back to front (that is, from older dates to newer), once you've collected a later portion of data for a given security, you can't append an earlier portion of data without starting over.

Small universes before large universes

Another option to get you researching and backtesting sooner is to collect a subset of your target universe before collecting the entire universe. For example, instead of collecting intraday bars for 1000 securities, collect bars for 100 securities and start testing with those while collecting the remaining data.

Numerous small databases are better than fewer large databases

Prefer creating a larger number of smaller databases to creating a smaller number of larger databases. For example, instead of creating a giant historical database of NYSE and NASDAQ securities, find ways to split up the securities into smaller universes—for example, based on exchange, sector, dollar volume, etc.—and create separate historical databases for each universe. Performance will be much better.

Prioritize historical data requests

You can queue as many historical data requests as you wish, and they will be processed in sequential order, one at a time:

$ quantrocket history fetch 'aus-lrg-eod' 'singapore-15min' 'germany-1hr-bid-ask'
status: the historical data will be fetched asynchronously
>>> from quantrocket.history import fetch_history
>>> fetch_history(["aus-lrg-eod", "singapore-15min", "germany-1hr-bid-ask"])
{'status': 'the historical data will be fetched asynchronously'}
$ curl -X POST 'http://houston:1969/history/queue?codes=aus-lrg-eod&codes=singapore-15min&codes=germany-1hr-bid-ask'
{"status": "the historical data will be fetched asynchronously"}
You can view the current queue:
$ quantrocket history queue
priority: []
standard:
- aus-lrg-eod
- singapore-15min
- germany-1hr-bid-ask
>>> from quantrocket.history import get_history_queue
>>> get_history_queue()
{'priority': [],
 'standard': ['aus-lrg-eod', 'singapore-15min', 'germany-1hr-bid-ask']}
$ curl -X GET 'http://houston:1969/history/queue'
{"priority": [], "standard": ["aus-lrg-eod", "singapore-15min", "germany-1hr-bid-ask"]}
Maybe you're regretting that the Germany request is at the end of the queue because you'd like to get that data first and start analyzing it. You can cancel the requests in front of it then add them to the end of the queue:
$ quantrocket history cancel 'aus-lrg-eod' 'singapore-15min'
priority: []
standard:
- germany-1hr-bid-ask
$ quantrocket history fetch 'aus-lrg-eod' 'singapore-15min'
status: the historical data will be fetched asynchronously
$ quantrocket history queue
priority: []
standard:
- germany-1hr-bid-ask
- aus-lrg-eod
- singapore-15min
>>> from quantrocket.history import get_history_queue, cancel_history_requests, fetch_history
>>> cancel_history_requests(codes=["aus-lrg-eod", "singapore-15min"])
{'priority': [],
 'standard': ['germany-1hr-bid-ask']}
>>> fetch_history(["aus-lrg-eod", "singapore-15min"])
{'status': 'the historical data will be fetched asynchronously'}
>>> get_history_queue()
{'priority': [],
 'standard': ['germany-1hr-bid-ask', 'aus-lrg-eod', 'singapore-15min']}
$ curl -X DELETE 'http://houston:1969/history/queue?codes=aus-lrg-eod&codes=singapore-15min'
{"priority": [], "standard": ["germany-1hr-bid-ask"]}
$ curl -X POST 'http://houston:1969/history/queue?codes=aus-lrg-eod&codes=singapore-15min'
{"status": "the historical data will be fetched asynchronously"}
$ curl -X GET 'http://houston:1969/history/queue'
{"priority": [], "standard": ["germany-1hr-bid-ask", "aus-lrg-eod", "singapore-15min"]}

There's another way to control queue priority: QuantRocket provides a standard queue and a priority queue. The standard queue will only be processed when the priority queue is empty. This can be useful when you're trying to collect a large amount of historical data for backtesting but you don't want it to interfere with daily updates to the databases you use for trading. First, schedule your daily updates on your countdown (cron) service, using the --priority flag to route them to the priority queue:

# fetch some US data each weekday at 5:30 pm
30 17 * * 1-5 quantrocket history fetch --priority nyse-lrg-eod nyse-mid-eod nyse-sml-eod

Then, queue your long-running requests on the standard queue:

$ quantrocket history fetch nyse-1min # many symbols + small granularity = slow

At 5:30pm, when several requests are queued on the priority queue, the long-running request on the standard queue will pause until the priority queue is empty again, and then resume.
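
If you schedule these updates from Python rather than from cron, a minimal sketch is below; the priority parameter name is an assumption based on the --priority CLI flag.

>>> from quantrocket.history import fetch_history
>>> # route the daily updates to the priority queue ('priority' assumed to mirror --priority)
>>> fetch_history(["nyse-lrg-eod", "nyse-mid-eod", "nyse-sml-eod"], priority=True)
>>> # long-running backfills stay on the standard queue
>>> fetch_history("nyse-1min")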

Split adjustments

IB adjusts its historical data for splits, so your data will be split-adjusted when you initially retrieve it into your history database. However, if a split occurs after the initial retrieval, the data that was already stored needs to be adjusted for the split. QuantRocket handles this circumstance by comparing a recent price in the database to the equivalently-timestamped price from IB. If the prices differ, this indicates that a split has occurred or that the vendor has otherwise adjusted its data since QuantRocket stored it. Either way, QuantRocket deletes the data for that particular security and re-fetches the entire history from IB, in order to make sure the database stays synced with IB.

Dividend adjustments

By default, IB historical data is not dividend-adjusted. However, dividend-adjusted data is available from IB using the ADJUSTED_LAST bar type. This bar type has an important limitation: it is only available with a 1 day bar size.

With ADJUSTED_LAST, QuantRocket handles dividend adjustments in the same way it handles split adjustments: whenever IB applies a dividend adjustment, QuantRocket will detect the discrepancy between the IB data and the data as stored in the history database, and will delete the stored data and re-sync with IB.
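
To request dividend-adjusted data, specify the bar type when creating the database. In the sketch below, the database and universe names are placeholders and the bar_type parameter name is an assumption; check the API Reference for the exact option name.

>>> from quantrocket.history import create_db
>>> # dividend- and split-adjusted daily bars; ADJUSTED_LAST requires a 1 day bar size
>>> create_db("usa-stk-1d-adj", universes=["nyse-stk"], bar_size="1 day", bar_type="ADJUSTED_LAST")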

Query historical data

You can download a CSV of historical data from your history database, optionally filtering by date, security, or field:

$ quantrocket history get 'japan-bank-eod' --start-date '2017-01-01' --fields 'Open' 'Close' 'Volume' --outfile 'japan-bank-2017.csv'
>>> from quantrocket.history import download_history_file
>>> download_history_file("japan-bank-eod", filepath_or_buffer="japan-bank-2017.csv", start_date="2017-01-01", fields=["Open","Close", "Volume"])
$ curl -X GET 'http://houston:1969/history/japan-bank-eod.csv?start_date=2017-01-01&fields=Open&fields=Close&fields=Volume' > japan-bank-2017.csv

Using the Python client, you can load the data into Pandas:

>>> from quantrocket.history import download_history_file
>>> import pandas as pd
>>> import io
>>> f = io.StringIO()
>>> download_history_file("japan-bank-eod", f, start_date="2017-01-01", fields=["Open","High","Low","Close", "Volume"])
>>> prices = pd.read_csv(f, parse_dates=["Date"])
>>> prices.head()
       ConId       Date     Open    High     Low   Close   Volume
0   85433033 2017-01-04   1370.0  1391.0  1370.0  1391.0     8400
1  114841991 2017-01-04    198.0   201.0   198.0   198.0   121600
2  165779122 2017-01-04   4115.0  4240.0  4115.0  4205.0   109700
3  205624376 2017-01-04    801.0   824.0   797.0   823.0   744200
4  206408214 2017-01-04   1413.0  1439.0  1413.0  1428.0  2760800

>>> prices = prices.pivot(index="ConId", columns="Date").T
>>> closes = prices.loc["Close"]
>>> closes.head()
ConId        13857206   13905935   13905959   14018330   14018335   14018354   \
Date                                                                       
2017-01-04      623.7      744.7     1025.0      544.0     2763.0      746.0   
2017-01-05      637.4      748.3     1027.0      536.0     2723.0      747.0   
2017-01-06      627.3      738.1     1012.0      533.0     2689.0      736.0   
2017-01-10      616.0      727.2     1002.0      528.0     2682.0      725.0   
2017-01-11      626.7      737.7     1009.0      530.0     2736.0      734.0

Timezone of historical data

Historical data with a bar size of 1 day or higher is stored and returned in YYYY-MM-DD format. This is straightforward and not much needs to be said about it.

Intraday historical data is stored in the database in ISO-8601 format, which consists of the date followed by the time in the local timezone of the exchange, followed by a UTC offset. For example, a 9:30 AM bar for a stock trading on the NYSE might have a timestamp of 2017-07-25T09:30:00-04:00, where -04:00 indicates that New York is 4 hours behind Greenwich Mean Time/UTC. However, you can optionally have QuantRocket return the data without the UTC offset, which in this example would be 2017-07-25T09:30:00. Let's consider which option is preferable when loading the data into pandas.

Loading historical data with UTC offsets will cause pandas to parse the dates into UTC timestamps, which is essential for accuracy if your data includes multiple timezones. For example, suppose we've devised an arbitrage strategy to exploit price differences between AAPL as traded on Nasdaq and as traded on the Mexican Stock Exchange. We may have gathered some 5-minute bars for each security like so:

$ # get listings for AAPL on both exchanges...
$ quantrocket master listings --exchange NASDAQ --symbols AAPL --sec-types STK
status: the listing details will be fetched asynchronously
$ quantrocket master listings --exchange MEXI --symbols AAPL --sec-types STK
status: the listing details will be fetched asynchronously
$ # monitor flightlog for listing details to be fetched, then make a universe:
$ quantrocket master get -e MEXI NASDAQ -s AAPL -t STK | quantrocket master universe 'aapl-arb' -f -
code: aapl-arb
inserted: 2
provided: 2
total_after_insert: 2
$ # get 5 minute bars for both stocks
$ quantrocket history create-db 'aapl-arb-5min' -u aapl-arb -z '5 mins' -s 2017-07-26
status: successfully created quantrocket.history.aapl-arb-5min.sqlite
$ quantrocket history fetch 'aapl-arb-5min'
status: the historical data will be fetched asynchronously

The NASDAQ trading session runs from 9:30-16:00, while the MEXI trading session runs from 8:30-15:00, but since Mexico City is 1 hour behind New York, the sessions actually coincide. When we query the data with UTC offsets (the default behavior), pandas will parse the dates into UTC timestamps, properly aligning the data:

>>> from quantrocket.history import download_history_file
>>> import io
>>> f = io.StringIO()
>>> download_history_file("aapl-arb-5min", f)
>>> prices = pd.read_csv(f, parse_dates=["Date"])
>>> prices = prices.pivot(index="ConId", columns="Date").T
>>> closes = prices.loc["Close"]
>>> # 13:30 UTC = 9:30 New York time = 8:30 Mexico City time
>>> closes.head()
ConId                265598    38708077
Date                                   
2017-07-26 13:30:00    153.62    2715.0
2017-07-26 13:35:00    153.46    2730.0
2017-07-26 13:40:00    153.21    2725.0
2017-07-26 13:45:00    153.28    2725.0
2017-07-26 13:50:00    153.18    2725.0

Suppose, instead, that our historical data is limited to a single timezone. The UTC timestamps that pandas will generate are accurate, but we might find it inconvenient to have to remember that a timestamp of 2017-07-25 13:30:00 for a Nasdaq stock really represents the 9:30 AM bar in New York time. If we find this inconvenient, we can use the tz_naive option to tell QuantRocket to return the data without the UTC offset, which will cause pandas to leave it in the timezone of the exchange:

>>> f = io.StringIO()
>>> # limit the query to the NASDAQ AAPL, and don't include the UTC offset
>>> download_history_file("aapl-arb-5min", f, conids=[265598], tz_naive=True)
>>> prices = pd.read_csv(f, parse_dates=["Date"])
>>> prices = prices.pivot(index="ConId", columns="Date").T
>>> closes = prices.loc["Close"]
>>> closes.head()
ConId                265598
Date                       
2017-07-26 09:30:00  153.62
2017-07-26 09:35:00  153.46
2017-07-26 09:40:00  153.21
2017-07-26 09:45:00  153.28
2017-07-26 09:50:00  153.18

The rule of thumb with tz_naive: it's okay to be naive if you're only working with one timezone, but don't be naive if you're working with multiple timezones.
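
If you're working with multiple timezones but still want to view a particular security in its exchange's local time, you can let pandas do the conversion after parsing. The following is a minimal sketch using standard pandas (not a QuantRocket feature), assuming the closes DataFrame from the multi-timezone example above, whose index was parsed as timezone-naive UTC timestamps:

>>> # localize the UTC-parsed index, then convert it to New York time for display
>>> closes.index = closes.index.tz_localize("UTC").tz_convert("America/New_York")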

Continuous Futures

You can use QuantRocket to query futures as continuous contracts. QuantRocket fetches and stores data for each individual futures expiry, but can optionally stitch the data into a continuous contract at query time.

Suppose we've created a universe of all expiries of KOSPI 200 futures, trading on the Korea Stock Exchange:

$ quantrocket master listings --exchange 'KSE' --sec-types 'FUT' --symbols 'K200'
status: the listing details will be fetched asynchronously
$ # wait for listings to be fetched, then:
$ quantrocket master get -e 'KSE' -t 'FUT' -s 'K200' | quantrocket master universe 'k200' -f '-'
code: k200
inserted: 15
provided: 15
total_after_insert: 15
>>> from quantrocket.master import fetch_listings, create_universe, download_master_file
>>> import io
>>> fetch_listings(exchange="KSE", sec_types=["FUT"], symbols=["K200"])
{'status': 'the listing details will be fetched asynchronously'}
>>> # wait for listings to be fetched, then:
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["KSE"], sec_types=["FUT"], symbols=["K200"])
>>> create_universe("k200", infilepath_or_buffer=f)
{'code': 'k200', 'inserted': 15, 'provided': 15, 'total_after_insert': 15}
$ curl -X POST 'http://houston:1969/master/listings?exchange=KSE&sec_types=FUT&symbols=K200'
{"status": "the listing details will be fetched asynchronously"}
$ # wait for listings to be fetched, then:
$  curl -X GET 'http://houston:1969/master/securities.csv?exchanges=KSE&sec_types=FUT&symbols=K200' > k200.csv
$ curl -X PUT 'http://houston:1969/master/universes/k200' --upload-file k200.csv
{"code": "k200", "provided": 15, "inserted": 15, "total_after_insert": 15}
We can create a history database and fetch historical data for each expiry:
$ quantrocket history create-db 'k200-1h' --universes 'k200' --bar-size '1 hour'
status: successfully created quantrocket.history.k200-1h.sqlite
$ quantrocket history fetch 'k200-1h'
status: the historical data will be fetched asynchronously
>>> from quantrocket.history import create_db, fetch_history
>>> create_db("k200-1h", universes=["k200"], bar_size="1 hour")
{'status': 'successfully created quantrocket.history.k200-1h.sqlite'}
>>> fetch_history("k200-1h")
{'status': 'the historical data will be fetched asynchronously'}
$ curl -X PUT 'http://houston:1969/history/databases/k200-1h?universes=k200&bar_size=1 hour'
{"status": "successfully created quantrocket.history.k200-1h.sqlite"}
$ curl -X POST 'http://houston:1969/history/queue?codes=k200-1h'
{"status": "the historical data will be fetched asynchronously"}
The historical prices for each futures expiry are stored separately and by default are returned separately at query time, but we can optionally tell QuantRocket to stitch the contracts together at query time. The fastest way of stitching the contracts together is using simple concatenation:
$ quantrocket history get 'k200-1h' --fields 'Open' 'Close' 'Volume' --outfile 'k200_1h.csv' --cont-fut 'concat'
>>> from quantrocket.history import download_history_file
>>> download_history_file("k200-1h", filepath_or_buffer="k200_1h.csv", fields=["Open","Close", "Volume"], cont_fut="concat")
$ curl -X GET 'http://houston:1969/history/k200-1h.csv?fields=Open&fields=Close&fields=Volume&cont_fut=concat' > k200_1h.csv

The contracts will be stitched together according to the rollover dates as configured in the master service, and the continuous contract will be returned under the conid of the current front-month contract.
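
The stitched data can be loaded into pandas the same way as the earlier examples. A minimal sketch, assuming the continuous futures query shown above:

>>> from quantrocket.history import download_history_file
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> # same continuous futures query as above, but into an in-memory buffer
>>> download_history_file("k200-1h", f, fields=["Open","Close","Volume"], cont_fut="concat")
>>> prices = pd.read_csv(f, parse_dates=["Date"])
>>> prices = prices.pivot(index="ConId", columns="Date").T
>>> closes = prices.loc["Close"]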

A history database need not contain only futures in order to use the continuous futures query option. The option will be ignored for any non-futures, which will be returned as stored. Any futures in the database will be grouped together by symbol, exchange, currency, and multiplier in order to create the continuous contracts. The continuous contracts will be returned alongside the non-futures.

Build a history database from other data sources

Instead of filling a history database with data fetched from IB, you can also create an empty history database and load data into it from other sources. These sources might include market data snapshots from the realtime service, the results of a continuous futures query, or another history database.

For example, suppose we have a database of 1-minute bars of Canadian stocks, and we'd like to copy only the energy stocks to a new database without having to fetch the data all over again. To create the new database, use the no-config option to tell QuantRocket that you won't be fetching IB data for this database.

$ quantrocket history create-db 'canada-enr-1min' --no-config
status: successfully created quantrocket.history.canada-enr-1min.sqlite
>>> from quantrocket.history import create_db
>>> create_db("canada-enr-1min", no_config=True)
{'status': 'successfully created quantrocket.history.canada-enr-1min.sqlite'}
$ curl -X PUT 'http://houston:1969/history/databases/canada-enr-1min?no_config=true'
{"status": "successfully created quantrocket.history.canada-enr-1min.sqlite"}
Assuming we've already defined a universe of Canadian energy stocks called 'canada-enr', we can download a CSV of just those stocks from our existing database of 1-minute bars, then upload the CSV to our new database:
$ quantrocket history get 'canada-1min' --universes 'canada-enr' | quantrocket history load 'canada-enr-1min'
db: canada-enr-1min
loaded: 34029
>>> from quantrocket.history import download_history_file, load_history_from_file
>>> import io
>>> f = io.StringIO()
>>> download_history_file("canada-1min", f, universes=["canada-enr"])
>>> load_history_from_file("canada-enr-1min", f)
{'db': 'canada-enr-1min', 'loaded': 34029}
$ curl -X GET 'http://houston:1969/history/canada-1min.csv?universes=canada-enr' > canada-enr-1min.csv
$ curl -X PATCH 'http://houston:1969/history/canada-enr-1min.csv' --upload-file 'canada-enr-1min.csv'
{"db": "canada-enr-1min", "loaded": 34029}

Survivorship bias

IB historical data contains survivorship bias, that is, IB only provides historical data for active securities, not for delisted securities. Companies that went bankrupt or were acquired, for example, won't be reflected in backtests. Some strategies are more sensitive to survivorship bias than others.

You have a few options for gauging your strategy's sensitivity to survivorship bias.

Build your own survivorship-bias-free database

When IB delists a security, the historical data you've already downloaded remains available to your backtests. Thus, over time you will build up a survivorship-bias-free database. You can run backtests which include and exclude the delisted securities and compare the result in order to gauge your strategy's sensitivity to survivorship bias. From a universe containing active and delisted tickers, you can create a second universe containing only the active tickers:

$ quantrocket master universe 'hong-kong-active' --from-universes 'hong-kong' --exclude-delisted
code: hong-kong-active
inserted: 2201
provided: 2201
total_after_insert: 2201
>>> from quantrocket.master import create_universe
>>> create_universe("hong-kong-active", from_universes=["hong-kong"], exclude_delisted=True)
{'code': 'hong-kong-active',
 'inserted': 2201,
 'provided': 2201,
 'total_after_insert': 2201}
$ curl -X PUT 'http://houston:1969/master/universes/hong-kong-active?from_universes=hong-kong&exclude_delisted=true'
{"code": "hong-kong-active", "provided": 2201, "inserted": 2201, "total_after_insert": 2201}

Now run backtests comparing the performance of "hong-kong" and "hong-kong-active".

Fundamental Data

QuantRocket can fetch Reuters fundamental data from IB and store it in a database for analysis, backtesting, and trading (IB subscription to Reuters Worldwide Fundamentals is required but is usually free). There are 2 types of fundamental data available:

  • Financial statements: provides cash flow, balance sheet, and income metrics. Time-indexed to the relevant fiscal period as well as the filing date for point-in-time backtesting.
  • Estimates and actuals: provides analyst estimates and actuals for a variety of indicators. Actuals include the announcement date, for point-in-time backtesting.

Financial statements

The Reuters financial statements dataset available through IB provides over 125 income, balance sheet, and cash flow metrics.

Balance sheet:
  Accounts Payable
  Accounts Receivable - Trade, Net
  Accrued Expenses
  Accumulated Depreciation, Total
  Additional Paid-In Capital
  Capital Lease Obligations
  Cash
  Cash & Due from Banks
  Cash & Equivalents
  Cash and Short Term Investments
  Common Stock, Total
  Current Port. of LT Debt/Capital Leases
  Deferred Income Tax
  ESOP Debt Guarantee
  Goodwill, Net
  Intangibles, Net
  Long Term Debt
  Long Term Investments
  Minority Interest
  Net Loans
  Note Receivable - Long Term
  Notes Payable/Short Term Debt
  Other Assets, Total
  Other Bearing Liabilities, Total
  Other Current Assets, Total
  Other Current liabilities, Total
  Other Earning Assets, Total
  Other Equity, Total
  Other Liabilities, Total
  Other Long Term Assets, Total
  Payable/Accrued
  Preferred Stock - Non Redeemable, Net
  Prepaid Expenses
  Property/Plant/Equipment, Total - Gross
  Property/Plant/Equipment, Total - Net
  Redeemable Preferred Stock, Total
  Retained Earnings (Accumulated Deficit)
  Short Term Investments
  Tangible Book Value per Share, Common Eq
  Total Assets
  Total Common Shares Outstanding
  Total Current Assets
  Total Current Liabilities
  Total Debt
  Total Deposits
  Total Equity
  Total Inventory
  Total Liabilities
  Total Liabilities & Shareholders' Equity
  Total Long Term Debt
  Total Preferred Shares Outstanding
  Total Receivables, Net
  Total Short Term Borrowings
  Treasury Stock - Common
  Unrealized Gain (Loss)

Income:
  Cost of Revenue, Total
  DPS - Common Stock Primary Issue
  Depreciation/Amortization
  Diluted EPS Excluding ExtraOrd Items
  Diluted Net Income
  Diluted Normalized EPS
  Diluted Weighted Average Shares
  Dilution Adjustment
  Equity In Affiliates
  Gain (Loss) on Sale of Assets
  Gross Profit
  Income Available to Com Excl ExtraOrd
  Income Available to Com Incl ExtraOrd
  Interest Exp.(Inc.),Net-Operating, Total
  Interest Inc.(Exp.),Net-Non-Op., Total
  Interest Income, Bank
  Loan Loss Provision
  Minority Interest
  Net Income
  Net Income After Taxes
  Net Income Before Extra. Items
  Net Income Before Taxes
  Net Interest Inc. After Loan Loss Prov.
  Net Interest Income
  Non-Interest Expense, Bank
  Non-Interest Income, Bank
  Operating Income
  Other Operating Expenses, Total
  Other Revenue, Total
  Other, Net
  Provision for Income Taxes
  Research & Development
  Revenue
  Selling/General/Admin. Expenses, Total
  Total Adjustments to Net Income
  Total Extraordinary Items
  Total Interest Expense
  Total Operating Expense
  Total Revenue
  U.S. GAAP Adjustment
  Unusual Expense (Income)

Cash flow:
  Amortization
  Capital Expenditures
  Cash Interest Paid
  Cash Payments
  Cash Receipts
  Cash Taxes Paid
  Cash from Financing Activities
  Cash from Investing Activities
  Cash from Operating Activities
  Changes in Working Capital
  Deferred Taxes
  Depreciation/Depletion
  Financing Cash Flow Items
  Foreign Exchange Effects
  Issuance (Retirement) of Debt, Net
  Issuance (Retirement) of Stock, Net
  Net Change in Cash
  Net Income/Starting Line
  Non-Cash Items
  Other Investing Cash Flow Items, Total
  Total Cash Dividends Paid

Annual vs interim reports

Both annual and interim fiscal period reports are available. More data is available for annual than interim periods.

Is the dataset point-in-time?

The dataset is point-in-time and suitable for backtesting, with one caveat. Financial statements are time-indexed to their release date (for U.S. companies, this is the Form 10-K filing date), so you know when the data became available. Restated data are distinguished from as-reported data, making it possible to exclude restated data from backtests. However, when restated data is available for a particular fiscal period and financial statement (income, balance sheet, or cash flow), the corresponding as-reported financial statement is not provided by Reuters/IB. In these cases, you can avoid using restated financials that weren't available at the time, but you won't have access to the as-reported financials that were available at the time. This limitation only applies to financial statements that pre-date your use of QuantRocket. As you continue using QuantRocket to keep your Reuters fundamentals database up-to-date, QuantRocket will preserve the as-reported and restated financials as they appear in real time, increasing the accuracy of your backtests.
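
To illustrate the point-in-time principle, here is a minimal sketch of limiting a DataFrame of financials to the statements that had already been filed as of a hypothetical backtest date. It assumes a financials DataFrame returned by the financials query described below, with the SourceDate column parsed as a datetime:

>>> import pandas as pd
>>> # hypothetical backtest date
>>> as_of_date = pd.Timestamp("2016-01-01")
>>> # keep only statements that had been filed as of the backtest date
>>> available_financials = financials[financials["SourceDate"] <= as_of_date]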

Where can I find more documentation?

IB provides virtually no documentation about the Reuters financials dataset. To learn more about the available metrics and how they are calculated, search Google for "Reuters Fundamentals Glossary" to try to locate a copy of the Reuters PDF glossary, which contains hundreds of pages of information about the dataset.

Fetch financial statements

To use Reuters financial statements in QuantRocket, first fetch the data from IB into your QuantRocket database. Then you can run queries against the database in your research and backtests.

To fetch financial statements, specify one or more conids or universes to fetch data for:

$ quantrocket fundamental fetch-financials --universes 'japan-banks' 'singapore-banks'
status: the fundamental data will be fetched asynchronously
>>> from quantrocket.fundamental import fetch_reuters_financials
>>> fetch_reuters_financials(universes=["japan-banks","singapore-banks"])
{'status': 'the fundamental data will be fetched asynchronously'}
$ curl -X POST 'http://houston:1969/fundamental/reuters/financials?universes=japan-banks&universes=singapore-banks'
{"status": "the fundamental data will be fetched asynchronously"}

Multiple requests will be queued and processed sequentially. You can monitor flightlog via the command line or Papertrail to track progress:

$ quantrocket flightlog stream
2017-11-22 09:30:10 quantrocket.fundamental: INFO Fetching Reuters financials from IB for universes japan-banks, singapore-banks
2017-11-22 09:30:45 quantrocket.fundamental: INFO Expected remaining runtime to fetch Reuters financials for universes japan-banks, singapore-banks: 0:00:33
2017-11-22 09:32:09 quantrocket.fundamental: INFO Saved 12979 total records for 100 total securities to quantrocket.fundamental.reuters.financials.sqlite for universes japan-banks, singapore-banks

Query financial statements

To query Reuters financials, first look up the code(s) for the metrics you care about, optionally limiting to a particular statement type:

$ quantrocket fundamental codes --report-types 'financials' --statement-types 'CAS'
financials:
  FCDP: Total Cash Dividends Paid
  FPRD: Issuance (Retirement) of Debt, Net
  FPSS: Issuance (Retirement) of Stock, Net
  FTLF: Cash from Financing Activities
  ITLI: Cash from Investing Activities
  OBDT: Deferred Taxes
  OCPD: Cash Payments
  OCRC: Cash Receipts
...
>>> from quantrocket.fundamental import list_reuters_codes
>>> list_reuters_codes(report_types=["financials"], statement_types=["CAS"])
{'financials': {'FCDP': 'Total Cash Dividends Paid',
  'FPRD': 'Issuance (Retirement) of Debt, Net',
  'FPSS': 'Issuance (Retirement) of Stock, Net',
  'FTLF': 'Cash from Financing Activities',
  'ITLI': 'Cash from Investing Activities',
  'OBDT': 'Deferred Taxes',
  'OCPD': 'Cash Payments',
  'OCRC': 'Cash Receipts',
...
$ curl -X GET 'http://houston:1969/fundamental/reuters/codes?report_types=financials&statement_types=CAS'
{"financials": {"FCDP": "Total Cash Dividends Paid", "FPRD": "Issuance (Retirement) of Debt, Net", "FPSS": "Issuance (Retirement) of Stock, Net", "FTLF": "Cash from Financing Activities", "ITLI": "Cash from Investing Activities", "OBDT": "Deferred Taxes", "OCPD": "Cash Payments", "OCRC": "Cash Receipts",
...
QuantRocket reads the codes from the financial statements database; therefore, you must fetch data into the database before you can list the available codes.

Let's query Net Income Before Taxes (code EIBT) for a universe of securities:

$ quantrocket fundamental financials 'EIBT' -u 'us-banks' -s '2014-01-01' -e '2017-01-01' -o financials.csv
$ head financials.csv
CoaCode,ConId,Amount,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,StatementType,StatementPeriodLength,StatementPeriodUnit,UpdateTypeCode,UpdateTypeDescription,StatementDate,AuditorNameCode,AuditorName,AuditorOpinionCode,AuditorOpinion,Source,SourceDate
EIBT,9029,13.53,2014,2014-12-31,Annual,,INC,12,M,UPD,"Updated Normal",2014-12-31,EY,"Ernst & Young LLP",UNQ,Unqualified,10-K,2015-03-13
EIBT,9029,28.117,2015,2015-12-31,Annual,,INC,12,M,UPD,"Updated Normal",2015-12-31,EY,"Ernst & Young LLP",UNQ,Unqualified,10-K,2016-02-29
EIBT,12190,-4.188,2015,2015-05-31,Annual,,INC,12,M,UPD,"Updated Normal",2015-05-31,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2015-08-26
EIBT,12190,1.873,2016,2016-05-31,Annual,,INC,12,M,UPD,"Updated Normal",2016-05-31,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2016-08-05
EIBT,270422,-3.77,2015,2015-09-30,Annual,,INC,12,M,UPD,"Updated Normal",2015-09-30,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2015-12-18
>>> from quantrocket.fundamental import download_reuters_financials
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_reuters_financials(["EIBT"],f,universes=["us-banks"],
                            start_date="2014-01-01", end_date="2017-01-01")
>>> financials = pd.read_csv(f, parse_dates=["SourceDate", "FiscalPeriodEndDate"])
>>> financials.head()
  CoaCode   ConId  Amount  FiscalYear FiscalPeriodEndDate FiscalPeriodType  \
0    EIBT    9029  13.530        2014          2014-12-31           Annual
1    EIBT    9029  28.117        2015          2015-12-31           Annual
2    EIBT   12190  -4.188        2015          2015-05-31           Annual
3    EIBT   12190   1.873        2016          2016-05-31           Annual
4    EIBT  270422  -3.770        2015          2015-09-30           Annual

   FiscalPeriodNumber StatementType  StatementPeriodLength  \
0                 NaN           INC                     12
1                 NaN           INC                     12
2                 NaN           INC                     12
3                 NaN           INC                     12
4                 NaN           INC                     12

  StatementPeriodUnit UpdateTypeCode UpdateTypeDescription StatementDate  \
0                   M            UPD        Updated Normal    2014-12-31
1                   M            UPD        Updated Normal    2015-12-31
2                   M            UPD        Updated Normal    2015-05-31
3                   M            UPD        Updated Normal    2016-05-31
4                   M            UPD        Updated Normal    2015-09-30

  AuditorNameCode        AuditorName AuditorOpinionCode AuditorOpinion Source  \
0              EY  Ernst & Young LLP                UNQ    Unqualified   10-K
1              EY  Ernst & Young LLP                UNQ    Unqualified   10-K
2            CROW  Crowe Horwath LLP                UNQ    Unqualified   10-K
3            CROW  Crowe Horwath LLP                UNQ    Unqualified   10-K
4            CROW  Crowe Horwath LLP                UNQ    Unqualified   10-K

  SourceDate
0 2015-03-13
1 2016-02-29
2 2015-08-26
3 2016-08-05
4 2015-12-18
$ curl -X GET 'http://houston:1969/fundamental/reuters/financials.csv?codes=EIBT&universes=us-banks&start_date=2014-01-01&end_date=2017-01-01' --output financials.csv
$ head financials.csv
CoaCode,ConId,Amount,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,StatementType,StatementPeriodLength,StatementPeriodUnit,UpdateTypeCode,UpdateTypeDescription,StatementDate,AuditorNameCode,AuditorName,AuditorOpinionCode,AuditorOpinion,Source,SourceDate
EIBT,9029,13.53,2014,2014-12-31,Annual,,INC,12,M,UPD,"Updated Normal",2014-12-31,EY,"Ernst & Young LLP",UNQ,Unqualified,10-K,2015-03-13
EIBT,9029,28.117,2015,2015-12-31,Annual,,INC,12,M,UPD,"Updated Normal",2015-12-31,EY,"Ernst & Young LLP",UNQ,Unqualified,10-K,2016-02-29
EIBT,12190,-4.188,2015,2015-05-31,Annual,,INC,12,M,UPD,"Updated Normal",2015-05-31,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2015-08-26
EIBT,12190,1.873,2016,2016-05-31,Annual,,INC,12,M,UPD,"Updated Normal",2016-05-31,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2016-08-05
EIBT,270422,-3.77,2015,2015-09-30,Annual,,INC,12,M,UPD,"Updated Normal",2015-09-30,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2015-12-18
By default, annual rather than interim statements are returned, and restatements are excluded. If you prefer, you can choose interim instead of annual statements, and/or you can choose to include restatements:
$ quantrocket fundamental financials 'EIBT' -u 'us-banks' -s '2014-01-01' -e '2017-01-01' --interim --restatements -o interim_financials.csv
$ head interim_financials.csv
CoaCode,ConId,Amount,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,StatementType,StatementPeriodLength,StatementPeriodUnit,UpdateTypeCode,UpdateTypeDescription,StatementDate,AuditorNameCode,AuditorName,AuditorOpinionCode,AuditorOpinion,Source,SourceDate
EIBT,9029,8.359,2016,2016-09-30,Interim,3,INC,3,M,UPD,"Updated Normal",2016-09-30,,,,,10-Q,2016-11-04
EIBT,9029,3.459,2016,2016-12-31,Interim,4,INC,3,M,UCA,"Updated Calculated",2016-12-31,DHS,"Deloitte & Touche LLP",UNQ,Unqualified,10-K,2017-03-03
EIBT,12190,0.744,2017,2016-08-31,Interim,1,INC,3,M,RES,"Restated Normal",2016-08-31,,,,,10-Q,2016-10-13
EIBT,12190,-0.595,2017,2016-11-30,Interim,2,INC,3,M,UPD,"Updated Normal",2016-11-30,,,,,10-Q,2017-01-12
EIBT,270422,1.599,2016,2016-07-01,Interim,3,INC,3,M,UPD,"Updated Normal",2016-07-01,,,,,10-Q,2016-08-10
>>> from quantrocket.fundamental import download_reuters_financials
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_reuters_financials(["EIBT"],f,universes=["us-banks"],
                            interim=True,
                            restatements=True,
                            start_date="2014-01-01", end_date="2017-01-01")
>>> interim_financials = pd.read_csv(f, parse_dates=["SourceDate", "FiscalPeriodEndDate"])
>>> interim_financials.head()
  CoaCode   ConId  Amount  FiscalYear FiscalPeriodEndDate FiscalPeriodType  \
0    EIBT    9029   8.359        2016          2016-09-30          Interim
1    EIBT    9029   3.459        2016          2016-12-31          Interim
2    EIBT   12190   0.744        2017          2016-08-31          Interim
3    EIBT   12190  -0.595        2017          2016-11-30          Interim
4    EIBT  270422   1.599        2016          2016-07-01          Interim

   FiscalPeriodNumber StatementType  StatementPeriodLength  \
0                   3           INC                      3
1                   4           INC                      3
2                   1           INC                      3
3                   2           INC                      3
4                   3           INC                      3

  StatementPeriodUnit UpdateTypeCode UpdateTypeDescription StatementDate  \
0                   M            UPD        Updated Normal    2016-09-30
1                   M            UCA    Updated Calculated    2016-12-31
2                   M            RES       Restated Normal    2016-08-31
3                   M            UPD        Updated Normal    2016-11-30
4                   M            UPD        Updated Normal    2016-07-01

  AuditorNameCode            AuditorName AuditorOpinionCode AuditorOpinion  \
0             NaN                    NaN                NaN            NaN
1             DHS  Deloitte & Touche LLP                UNQ    Unqualified
2             NaN                    NaN                NaN            NaN
3             NaN                    NaN                NaN            NaN
4             NaN                    NaN                NaN            NaN

  Source  SourceDate
0   10-Q  2016-11-04
1   10-K  2017-03-03
2   10-Q  2016-10-13
3   10-Q  2017-01-12
4   10-Q  2016-08-10
$ curl -X GET 'http://houston:1969/fundamental/reuters/financials.csv?codes=EIBT&universes=us-banks&interim=True&restatements=True&start_date=2014-01-01&end_date=2017-01-01' --output interim_financials.csv
$ head interim_financials.csv
CoaCode,ConId,Amount,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,StatementType,StatementPeriodLength,StatementPeriodUnit,UpdateTypeCode,UpdateTypeDescription,StatementDate,AuditorNameCode,AuditorName,AuditorOpinionCode,AuditorOpinion,Source,SourceDate
EIBT,9029,8.359,2016,2016-09-30,Interim,3,INC,3,M,UPD,"Updated Normal",2016-09-30,,,,,10-Q,2016-11-04
EIBT,9029,3.459,2016,2016-12-31,Interim,4,INC,3,M,UCA,"Updated Calculated",2016-12-31,DHS,"Deloitte & Touche LLP",UNQ,Unqualified,10-K,2017-03-03
EIBT,12190,0.744,2017,2016-08-31,Interim,1,INC,3,M,RES,"Restated Normal",2016-08-31,,,,,10-Q,2016-10-13
EIBT,12190,-0.595,2017,2016-11-30,Interim,2,INC,3,M,UPD,"Updated Normal",2016-11-30,,,,,10-Q,2017-01-12
EIBT,270422,1.599,2016,2016-07-01,Interim,3,INC,3,M,UPD,"Updated Normal",2016-07-01,,,,,10-Q,2016-08-10
When restatements are excluded (the default behavior), the start_date and end_date filters are based on the filing date. When restatements are included, the start_date and end_date filters are based on the fiscal period end date.

Use financial statements with Moonshot or Zipline

QuantRocket makes it easy to use Reuters financials with Moonshot or Zipline. See the Moonshot or Zipline section of the usage guide.

Analyst estimates and actuals

The Reuters estimates dataset provides analyst estimates and actuals for over 20 metrics.

     
  Book Value Per Share
  Capital Expenditure
  Cash Flow Per Share
  Dividend Per Share
  Earnings Before Interest and Tax
  Earnings Before Interest, Taxes, Depreciation and Amortization
  Earnings Per Share
  Earnings Per Share Before Goodwill
  Earnings Per Share Reported
  Funds From Operations Per Share
  Net Asset Value Per Share
  Net Debt
  Net Profit
  Net Profit Before Goodwill
  Net Profit Reported
  Operating Profit
  Pre-Tax Profit
  Pre-Tax Profit Before Goodwill
  Pre-Tax Profit Reported
  Return On Assets
  Return On Equity
  Revenue

Data is available for annual, semi-annual, and quarterly reporting periods. Data includes the announcement date for the actuals. Estimate dates are not provided; estimates are the latest estimates prior to the announcement of the actuals.

Fetch estimates and actuals

To use Reuters estimates and actuals in QuantRocket, first fetch the data from IB into your QuantRocket database. Then you can run queries against the database in your research and backtests.

To fetch analyst estimates and actuals, specify one or more conids or universes to fetch data for:

$ quantrocket fundamental fetch-estimates --universes 'japan-banks' 'singapore-banks'
status: the fundamental data will be fetched asynchronously
>>> from quantrocket.fundamental import fetch_reuters_estimates
>>> fetch_reuters_estimates(universes=["japan-banks","singapore-banks"])
{'status': 'the fundamental data will be fetched asynchronously'}
$ curl -X POST 'http://houston:1969/fundamental/reuters/estimates?universes=japan-banks&universes=singapore-banks'
{"status": "the fundamental data will be fetched asynchronously"}

Multiple requests will be queued and processed sequentially. You can monitor flightlog via the command line or Papertrail to track progress:

$ quantrocket flightlog stream
2017-11-23 14:13:22 quantrocket.fundamental: INFO Fetching Reuters estimates from IB for universes japan-banks, singapore-banks
2017-11-23 14:15:35 quantrocket.fundamental: INFO Expected remaining runtime to fetch Reuters estimates for universes japan-banks, singapore-banks: 0:04:25
2017-11-23 14:24:01 quantrocket.fundamental: INFO Saved 3298 total records for 60 total securities to quantrocket.fundamental.reuters.estimates.sqlite for universes japan-banks, singapore-banks

Query estimates and actuals

To query Reuters estimates and actuals, first look up the code(s) for the metrics you care about:

$ quantrocket fundamental codes --report-types 'estimates'
estimates:
  BVPS: Book Value Per Share
  CAPEX: Capital Expenditure
  CPS: Cash Flow Per Share
  DPS: Dividend Per Share
  EBIT: Earnings Before Interest and Tax
...
>>> from quantrocket.fundamental import list_reuters_codes
>>> list_reuters_codes(report_types=["estimates"])
{'estimates': {'BVPS': 'Book Value Per Share',
  'CAPEX': 'Capital Expenditure',
  'CPS': 'Cash Flow Per Share',
  'DPS': 'Dividend Per Share',
  'EBIT': 'Earnings Before Interest and Tax',
...
$ curl -X GET 'http://houston:1969/fundamental/reuters/codes?report_types=estimates'
{"estimates": {"BVPS": "Book Value Per Share", "CAPEX": "Capital Expenditure", "CPS": "Cash Flow Per Share", "DPS": "Dividend Per Share", "EBIT": "Earnings Before Interest and Tax",
...

Let's query EPS estimates and actuals:

$ quantrocket fundamental estimates 'EPS' -u 'us-banks' -s '2014-01-01' -e '2017-01-01' -o eps_estimates.csv
$ head eps_estimates.csv
ConId,Indicator,Unit,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,High,Low,Mean,Median,StdDev,NumOfEst,AnnounceDate,UpdatedDate,Actual
9029,EPS,U,2014,2014-03-31,Q,1,0.31,0.2,0.255,0.255,0.055,2,2014-05-01T11:45:00,2014-05-01T12:06:31,0.12
9029,EPS,U,2014,2014-06-30,Q,2,0.77,0.73,0.7467,0.74,0.017,3,2014-07-31T11:45:00,2014-07-31T13:47:24,1.02
9029,EPS,U,2014,2014-09-30,Q,3,0.71,0.63,0.6667,0.66,0.033,3,2014-11-04T12:45:00,2014-11-04T13:27:49,0.62
9029,EPS,U,2014,2014-12-31,A,,2.25,2.23,2.2433,2.25,0.0094,3,2015-02-27T12:45:00,2015-02-27T13:20:27,2.29
9029,EPS,U,2014,2014-12-31,Q,4,0.49,0.47,0.4833,0.49,0.0094,3,2015-02-27T12:45:00,2015-02-27T13:20:26,0.53
>>> from quantrocket.fundamental import download_reuters_estimates
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_reuters_estimates(["EPS"],f,universes=["us-banks"],
                            start_date="2014-01-01", end_date="2017-01-01")
>>> eps_estimates = pd.read_csv(f, parse_dates=["FiscalPeriodEndDate", "AnnounceDate"])
>>> eps_estimates.head()
   ConId Indicator Unit  FiscalYear FiscalPeriodEndDate FiscalPeriodType  \
0   9029       EPS    U        2014          2014-03-31                Q
1   9029       EPS    U        2014          2014-06-30                Q
2   9029       EPS    U        2014          2014-09-30                Q
3   9029       EPS    U        2014          2014-12-31                A
4   9029       EPS    U        2014          2014-12-31                Q

   FiscalPeriodNumber  High   Low    Mean  Median  StdDev  NumOfEst  \
0                 1.0  0.31  0.20  0.2550   0.255  0.0550       2.0
1                 2.0  0.77  0.73  0.7467   0.740  0.0170       3.0
2                 3.0  0.71  0.63  0.6667   0.660  0.0330       3.0
3                 NaN  2.25  2.23  2.2433   2.250  0.0094       3.0
4                 4.0  0.49  0.47  0.4833   0.490  0.0094       3.0

         AnnounceDate          UpdatedDate  Actual
0 2014-05-01 11:45:00  2014-05-01T12:06:31    0.12
1 2014-07-31 11:45:00  2014-07-31T13:47:24    1.02
2 2014-11-04 12:45:00  2014-11-04T13:27:49    0.62
3 2015-02-27 12:45:00  2015-02-27T13:20:27    2.29
4 2015-02-27 12:45:00  2015-02-27T13:20:26    0.53
$ curl -X GET 'http://houston:1969/fundamental/reuters/estimates.csv?codes=EPS&universes=us-banks&start_date=2014-01-01&end_date=2017-01-01' --output eps_estimates.csv
$ head eps_estimates.csv
ConId,Indicator,Unit,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,High,Low,Mean,Median,StdDev,NumOfEst,AnnounceDate,UpdatedDate,Actual
9029,EPS,U,2014,2014-03-31,Q,1,0.31,0.2,0.255,0.255,0.055,2,2014-05-01T11:45:00,2014-05-01T12:06:31,0.12
9029,EPS,U,2014,2014-06-30,Q,2,0.77,0.73,0.7467,0.74,0.017,3,2014-07-31T11:45:00,2014-07-31T13:47:24,1.02
9029,EPS,U,2014,2014-09-30,Q,3,0.71,0.63,0.6667,0.66,0.033,3,2014-11-04T12:45:00,2014-11-04T13:27:49,0.62
9029,EPS,U,2014,2014-12-31,A,,2.25,2.23,2.2433,2.25,0.0094,3,2015-02-27T12:45:00,2015-02-27T13:20:27,2.29
9029,EPS,U,2014,2014-12-31,Q,4,0.49,0.47,0.4833,0.49,0.0094,3,2015-02-27T12:45:00,2015-02-27T13:20:26,0.53

Note that the estimates dataset is not currently integrated as tightly into Moonshot and Zipline as the financial statements dataset.
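
Even so, the data is easy to work with in your own research code. For example, here is a minimal sketch (assuming the eps_estimates DataFrame from the query above) of computing each quarter's earnings surprise as the difference between the actual and the consensus mean estimate:

>>> # limit to quarterly periods
>>> quarterly = eps_estimates[eps_estimates["FiscalPeriodType"] == "Q"].copy()
>>> # earnings surprise = actual minus the consensus mean estimate
>>> quarterly["Surprise"] = quarterly["Actual"] - quarterly["Mean"]
>>> surprises = quarterly.set_index(["ConId", "FiscalPeriodEndDate"])[["AnnounceDate", "Surprise"]]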

Real-time Data

Getting snapshots of market data

You can get the latest quotes for a group of securities:

$ quantrocket realtime quotes --groups mexico --snapshot
from quantrocket.realtime import get_quotes
quotes = get_quotes(groups=["mexico"], snapshot=True)
$ curl -X GET 'http://houston:1969/realtime/quotes?groups=mexico&snapshot=true'
When you request a quote without first adding securities to the realtime stream (as described below), QuantRocket will add the securities to the realtime stream for you, retrieve the quotes, then cancel the stream for those securities. If you're going to be requesting quotes multiple times in succession, it's more efficient to add and cancel the streaming yourself.
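
For example, here is a minimal sketch (using the functions described in this section) that starts the stream once, takes several snapshots, then cancels the stream:

from quantrocket.realtime import stream_securities, get_quotes, cancel_stream
import time

# start the stream once, rather than implicitly starting and cancelling it on each request
stream_securities(groups=["mexico"])

# hypothetical polling loop: take a snapshot every 60 seconds
for _ in range(3):
    quotes = get_quotes(groups=["mexico"], snapshot=True)
    time.sleep(60)

# cancel the stream when finished
cancel_stream(groups=["mexico"])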

Streaming market data

If you want more than just a quote snapshot, you can fire up a realtime data stream for a set of securities and flexibly query the streamed market data. First, add securities or groups to the data stream. You can add as many securities as you like (up to your IB market data limits):

$ quantrocket realtime add --groups mexico
$ quantrocket realtime add --conids 123456
from quantrocket.realtime import stream_securities
stream_securities(groups=["mexico"])
stream_securities(conids=[123456])
$ curl -X POST 'http://houston:1969/realtime/stream?groups=mexico'
$ curl -X POST 'http://houston:1969/realtime/stream?conids=123456'
QuantRocket will store the streamed market data in a Redis data store. Next, you can query the streamed data for, say, the last 20 minutes, limiting to just the bid and ask in this example:
$ quantrocket realtime quotes --groups mexico --window 20m --fields bid ask
from quantrocket.realtime import get_quotes
quotes = get_quotes(groups=["mexico"], window="20m", fields=["bid", "ask"])
$ curl -X GET 'http://houston:1969/realtime/quotes?groups=mexico&window=20m&fields=bid&fields=ask'
When you're done with a particular set of securities, you cancel the stream:
$ quantrocket realtime cancel --groups mexico
from quantrocket.realtime import cancel_stream
cancel_stream(groups=["mexico"])
$ curl -X DELETE 'http://houston:1969/realtime/stream?groups=mexico'
Alternatively, you could have indicated the cancellation time when you first started the stream:
$ quantrocket realtime add --groups mexico --cancel-in 4h30m
from quantrocket.realtime import stream_securities
stream_securities(groups=["mexico"], cancel_in="4h30m")
$ curl -X POST 'http://houston:1969/realtime/stream?groups=mexico&cancel_in=4h30m'
If you have multiple IB Gateway services, possibly with different market data permissions and different numbers of simultaneous market data lines, QuantRocket will account for these permissions when selecting which IB Gateway service to use for realtime streaming of a given security. Make sure you register your market data permissions with the launchpad service so QuantRocket can make good choices.

Research and Backtesting

Backtesters come in many shapes and sizes, each with strengths and limitations. Since backtests are where you model the real-world constraints and characteristics of your strategies, it's vital to choose a backtester that aligns well with the needs of your strategy. QuantRocket supports multiple backtesters: Moonshot, a vectorized backtester developed by and for QuantRocket; Zipline, the popular open-source backtester that powers Quantopian; or you can plug in your own backtester using QuantRocket's satellite service.

Event-driven vs vectorized backtesters

What's the difference between event-driven backtesters like Zipline and vectorized backtesters like Moonshot? Event-driven backtests process one event at a time, where an event is usually one historical bar (or in the case of live trading, one real-time quote). Vectorized backtests process all events at once, by performing simultaneous calculations on an entire vector or matrix of data. (In pandas, a Series is a vector and a DataFrame is a matrix).

Imagine a simplistic strategy of buying a security whenever the price falls below $10 and selling whenever it rises above $10. We have a time series of prices and want to know which days to buy and which days to sell. In an event-driven backtester we loop through one date at a time and check the price at each iteration:

>>> data = {
>>>     "2017-02-01": 10.07,
>>>     "2017-02-02": 9.87,
>>>     "2017-02-03": 9.91,
>>>     "2017-02-04": 10.01
>>> }
>>> for date, price in data.items():
>>>     if price < 10:
>>>         buy_signal = True
>>>     else:
>>>         buy_signal = False
>>>     print(date, buy_signal)
2017-02-01 False
2017-02-02 True
2017-02-03 True
2017-02-04 False

In a vectorized backtest, we check all the prices at once to calculate our buy signals:

>>> import pandas as pd
>>> data = {
>>>     "2017-02-01": 10.07,
>>>     "2017-02-02": 9.87,
>>>     "2017-02-03": 9.91,
>>>     "2017-02-04": 10.01
>>> }
>>> prices = pd.Series(data)
>>> buy_signals = prices < 10
>>> buy_signals.head()
2017-02-01    False
2017-02-02     True
2017-02-03     True
2017-02-04    False
dtype: bool

Both backtests produce the same result using a different approach. The strengths and limitations of these differing approaches are summarized below:

Vectorized backtests are faster than event-driven backtests

Speed is one of the principal benefits of vectorized backtests, thanks to running calculations on an entire time series at once. Event-driven backtests can be prohibitively slow when working with large universes of securities and large amounts of data. Because of their speed, vectorized backtesters support rapid experimentation and testing of new ideas.

Watch out for look-ahead bias with vectorized backtesters

Look-ahead bias refers to making decisions in your backtest based on information that wouldn't have been available at the time of the trade. Because event-driven backtesters only give you one bar at a time, they generally protect you from look-ahead bias. Because a vectorized backtester gives you the entire time-series, it's easier to introduce look-ahead bias by mistake, for example generating signals based on today's close but then calculating the return from today's open instead of tomorrow's.
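
To make that example concrete, here is a simplified sketch of the biased and unbiased versions of the calculation, using hypothetical closing prices:

>>> import pandas as pd
>>> closes = pd.Series({
>>>     "2017-02-01": 10.07,
>>>     "2017-02-02": 9.87,
>>>     "2017-02-03": 9.91,
>>>     "2017-02-04": 10.01
>>> })
>>> signals = (closes < 10).astype(int)
>>> # biased: multiplies today's return by a signal computed from today's close
>>> biased_returns = closes.pct_change() * signals
>>> # unbiased: shift the signals so the return is only earned in the period
>>> # after the signal, mirroring when we could actually trade on it
>>> unbiased_returns = closes.pct_change() * signals.shift()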

If you achieve a phenomenal backtest result on the first try with a vectorized backtester, check for look-ahead bias.

Vectorized backtesters can't trade some event-driven strategies

With event-driven backtesters, switching from backtesting to live trading typically involves changing out a historical data feed for a real-time market data feed, and replacing a simulated broker with a real broker connection.

With a vectorized backtester, live trading can be achieved by running an up-to-the-moment backtest (possibly appending a real-time quote to the historical data) and using the final row of signals (that is, today's signals) to generate orders.
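
A minimal sketch of that last step, using a hypothetical DataFrame of signals:

>>> import pandas as pd
>>> # hypothetical signals DataFrame produced by an up-to-the-moment backtest
>>> signals = pd.DataFrame({
>>>     12345: [0, 1, 1],
>>>     67890: [-1, -1, 0]
>>> }, index=pd.to_datetime(["2017-09-19", "2017-09-20", "2017-09-21"]))
>>> # the final row is today's signals, which can be translated into orders
>>> todays_signals = signals.iloc[-1]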

The vectorized design is well-suited for cross-sectional and factor-model strategies with regular rebalancing intervals, or for any strategy that periodically "wakes up," checks current and historical market conditions, and makes trading decisions accordingly. The intervals might be quarters, months, days, or even minutes: for example, wake up every 15 minutes and trade stocks with the largest price change over the last 15 minutes. However, as the required reaction time shrinks toward continuous monitoring (for example, watching a futures contract for extreme price movements and placing orders within a few seconds of a triggering event), an event-driven backtester becomes a better choice.

Moonshot

Quickstart

Let's design a dual moving average strategy which buys tech stocks when their short moving average is above their long moving average. Assume we've already created a history database of daily bars for several tech stocks, like so:

$ # get the tech stock listings...
$ quantrocket master listings --exchange 'NASDAQ' --symbols 'GOOGL' 'NFLX' 'AAPL' 'AMZN'
status: the listing details will be fetched asynchronously
$ # monitor flightlog for listing details to be fetched, then make a universe:
$ quantrocket master get -e 'NASDAQ' -s 'GOOGL' 'NFLX' 'AAPL' 'AMZN' | quantrocket master universe 'tech-giants' -f -
code: tech-giants
inserted: 4
provided: 4
total_after_insert: 4
$ # get 1 day bars for the stocks
$ quantrocket history create-db 'tech-giants-1d' -u 'tech-giants' --bar-size '1 day'
status: successfully created quantrocket.history.tech-giants-1d.sqlite
$ quantrocket history fetch 'tech-giants-1d'
status: the historical data will be fetched asynchronously

Now let's write the minimal strategy code to run a backtest:

from moonshot import Moonshot

class DualMovingAverageStrategy(Moonshot):

    CODE = "dma-tech"
    DB = "tech-giants-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

    def get_signals(self, prices):
        closes = prices.loc["Close"]

        # Compute long and short moving averages
        lmavgs = closes.rolling(self.LMAVG_WINDOW).mean()
        smavgs = closes.rolling(self.SMAVG_WINDOW).mean()

        # Go long when short moving average is above long moving average
        signals = smavgs.shift() > lmavgs.shift()

        return signals.astype(int)

A strategy is a subclass of the Moonshot class. You implement your trading logic in the class methods and store your strategy parameters as class attributes. Class attributes include built-in Moonshot parameters which you can specify or override, as well as your own custom parameters. In the above example, CODE and DB are built-in parameters while LMAVG_WINDOW and SMAVG_WINDOW are free parameters which we've chosen to store as class attributes, which will allow us to run parameter scans or create similar strategies with different parameters.

Place your code in a directory called 'moonshot' inside your codeload volume. For example, if your Docker Compose file looks like this...

codeload:
    image: 'quantrocket/codeload:latest'
    volumes:
        - /home/neilarmstrong/code:/codeload

...place your strategy code in /home/neilarmstrong/code/moonshot/. QuantRocket recursively scans .py files in this directory and loads your strategies.

Now we can run a backtest and view a PDF tear sheet of performance results:

$ quantrocket moonshot backtest 'dma-tech' -s '2005-01-01' -e '2017-01-01' -o tearsheet.pdf --details
>>> from quantrocket.moonshot import backtest
>>> backtest(["dma-tech"], start_date="2005-01-01", end_date="2017-01-01",
             filepath_or_buffer="tearsheet.pdf")
$ curl -X POST 'http://houston:1969/moonshot/backtests?strategies=dma-tech&start_date=2005-01-01&end_date=2017-01-01' > tearsheet.pdf
Now we can open the PDF to review the performance:

moonshot tearsheet

Since our demo strategy includes a limited number of securities, we ran it with the --details flag to see the individual security performance in addition to the aggregate performance. Don't use --details on a strategy with hundreds of securities or you'll probably run out of memory.

How a Moonshot backtest works

Moonshot is all about DataFrames. In a Moonshot backtest, we start with a DataFrame of historical prices and derive a variety of equivalently-indexed DataFrames, including DataFrames of signals, trade allocations, positions, and returns. These DataFrames consist of a time-series index (vertical axis) with one or more securities as columns (horizontal axis). A simple example of a DataFrame of signals is shown below for a strategy with a 2-security universe (securities are identified by conid):

ConId       12345  67890
Date
2017-09-19      0     -1
2017-09-20      1     -1
2017-09-21      1      0

A Moonshot strategy consists of strategy parameters (stored as class attributes) and strategy logic (implemented in class methods). The strategy logic required to run a backtest is spread across four main methods, mirroring the stages of a trade:

  • get_signals (what direction to trade?): from a DataFrame of prices, return a DataFrame of integer signals, where 1=long, -1=short, and 0=cash
  • allocate_weights (how much capital to allocate to the trades?): from a DataFrame of integer signals (-1, 0, 1), return a DataFrame indicating how much capital to allocate to the signals, expressed as a percentage of the total capital allocated to the strategy (for example, -0.25, 0, 0.1 to indicate 25% short, cash, 10% long)
  • simulate_positions (when to enter the positions?): from a DataFrame of allocations, return a DataFrame of positions (here we model the delay between when the signal occurs and when the position is entered, and possibly model non-fills)
  • simulate_gross_returns (what's our return?): from a DataFrame of positions and a DataFrame of prices, return a DataFrame of percentage returns before commissions and slippage (our return is the security's percent change over the period, multiplied by the position)

Since Moonshot is a vectorized backtester, each of these methods is called only once per backtest.

Our demo strategy above relies on the default implementations of several of these methods, but since it's better to be explicit than implicit, you should always implement these methods even if you copy the default behavior. Let's explicitly implement the default behavior in our demo strategy:

from moonshot import Moonshot

class DualMovingAverageStrategy(Moonshot):

    CODE = "dma-tech"
    DB = "tech-giants-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

    def get_signals(self, prices):
        closes = prices.loc["Close"]

        # Compute long and short moving averages
        lmavgs = closes.rolling(self.LMAVG_WINDOW).mean()
        smavgs = closes.rolling(self.SMAVG_WINDOW).mean()

        # Go long when short moving average is above long moving average
        signals = smavgs.shift() > lmavgs.shift()

        return signals.astype(int)

    def allocate_weights(self, signals, prices):
        # spread our capital equally among our trades on any given day
        weights = self.allocate_equal_weights(signals) # provided by moonshot.mixins.WeightAllocationMixin
        return weights

    def simulate_positions(self, weights, prices):
        # we'll enter in the period after the signal
        positions = weights.shift()
        return positions

    def simulate_gross_returns(self, positions, prices):
        # Our return is the security's close-to-close return, multiplied by
        # the size of our position. We must shift the positions DataFrame because
        # we don't have a return until the period after we open the position
        closes = prices.loc["Close"]
        gross_returns = closes.pct_change() * positions.shift()
        return gross_returns

Several weight allocation algorithms are provided out of the box via moonshot.mixins.WeightAllocationMixin.
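
If none of the built-in algorithms fit your needs, you can also write your own allocation logic by overriding allocate_weights. As a hypothetical sketch (not a built-in algorithm), the following variant of the demo strategy allocates a fixed 10% of capital to each signal instead of splitting capital equally among that day's signals:

class DualMovingAverageFixedWeightStrategy(DualMovingAverageStrategy):

    CODE = "dma-tech-fixed-weight" # hypothetical strategy code

    def allocate_weights(self, signals, prices):
        # hypothetical alternative to allocate_equal_weights: allocate a fixed
        # 10% of the strategy's capital to each signal, regardless of how many
        # signals fire on a given day
        weights = signals * 0.10
        return weights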

Strategy inheritance

Often, you may want to re-use a strategy's logic while changing some of the parameters. For example, perhaps you'd like to run an existing strategy on a different market. To do so, simply subclass your existing strategy and modify the parameters as needed. Let's try our dual moving average strategy on a group of ETFs. First, get the historical data for the ETFs:

$ # get a handful of ETF listings...
$ quantrocket master listings --exchange 'ARCA' --symbols 'SPY' 'XLF' 'EEM' 'VNQ' 'XOP' 'GDX'
status: the listing details will be fetched asynchronously
$ # monitor flightlog for listing details to be fetched, then make a universe:
$ quantrocket master get -e 'ARCA' -s 'SPY' 'XLF' 'EEM' 'VNQ' 'XOP' 'GDX' | quantrocket master universe 'etf-sampler' -f -
code: etf-sampler
inserted: 6
provided: 6
total_after_insert: 6
$ # get 1 day bars for the ETFs
$ quantrocket history create-db 'etf-sampler-1d' -u 'etf-sampler' --bar-size '1 day'
status: successfully created quantrocket.history.etf-sampler-1d.sqlite
$ quantrocket history fetch 'etf-sampler-1d'
status: the historical data will be fetched asynchronously

Since we're inheriting from an existing strategy, implementing our strategy is easy:

# derive a strategy from DualMovingAverageStrategy (defined earlier in the file)
class DualMovingAverageStrategyETF(DualMovingAverageStrategy):

    CODE = "dma-etf"
    DB = "etf-sampler-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

Now we can run our backtest:

$ quantrocket moonshot backtest 'dma-etf' -s '2005-01-01' -e '2017-01-01' -o tearsheet_etf.pdf --details

Specify a benchmark

Optionally, we can identify a security within our strategy universe as a benchmark, and we'll get a chart of our strategy's performance against the benchmark. Our ETF strategy universe includes SPY, so let's make that our benchmark. First, look up the conid (contract ID) if needed, since that's how we specify the benchmark:

$ quantrocket master get -e ARCA -s SPY -f ConId -p
ConId = 756733

Now set this conid as the benchmark:

class DualMovingAverageStrategyETF(DualMovingAverageStrategy):

    CODE = "dma-etf"
    DB = "etf-sampler-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100
    BENCHMARK = 756733 # Must exist within the strategy DB

Run the backtest again, and we'll see an additional chart in our tear sheet:

moonshot tearsheet vs benchmark

Multi-strategy backtests

We can easily backtest multiple strategies at once to simulate running complex portfolios of strategies. Simply specify all of the strategies:

$ quantrocket moonshot backtest 'dma-tech' 'dma-etf' -s '2005-01-01' -e '2017-01-01' -o dma_multistrat.pdf

Our tear sheet will show the aggregate portfolio performance as well as the individual strategy performance:

moonshot multi-strategy tearsheet

Parameter scans

You can run 1-dimensional or 2-dimensional parameter scans to see how your strategy performs for a variety of parameter values. Let's try varying the short moving average window on our dual moving average strategy:

$ quantrocket moonshot paramscan 'dma-tech' -p 'SMAVG_WINDOW' -v 5 20 100 -o dma_1d.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> scan_parameters(["dma-tech"], start_date="2005-01-01", end_date="2017-01-01",
                    param1="SMAVG_WINDOW", vals1=[5,20,100],
                    filepath_or_buffer="dma_tech_1d.pdf")
$ curl -X POST 'http://houston:1969/moonshot/paramscans?strategies=dma-tech&start_date=2005-01-01&end_date=2017-01-01&param1=SMAVG_WINDOW&vals1=5&vals1=20&vals1=100' > dma_tech_1d.pdf

The resulting tear sheet will show how the strategy performs for each parameter value:

moonshot paramscan 1-D tearsheet

Let's try a 2-dimensional parameter scan, varying both our short and long moving averages:

$ quantrocket moonshot paramscan 'dma-tech' --param1 'SMAVG_WINDOW' --vals1 5 20 100 --param2 'LMAVG_WINDOW' --vals2 150 200 300 -o dma_2d.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> scan_parameters(["dma-tech"], start_date="2005-01-01", end_date="2017-01-01",
                    param1="SMAVG_WINDOW", vals1=[5,20,100],
                    param2="LMAVG_WINDOW", vals2=[150,200,300],
                    filepath_or_buffer="dma_tech_2d.pdf")
$ curl -X POST 'http://houston:1969/moonshot/paramscans?strategies=dma-tech&start_date=2005-01-01&end_date=2017-01-01&param1=SMAVG_WINDOW&vals1=5&vals1=20&vals1=100&param2=LMAVG_WINDOW&vals2=150&vals2=200&vals2=300' > dma_tech_2d.pdf
This time our tear sheet uses a heat map to visualize the 2-D results:

moonshot paramscan 2-D tearsheet

We can even run a 1-D or 2-D parameter scan on multiple strategies at once:
$ quantrocket moonshot paramscan 'dma-tech' 'dma-etf' -p 'SMAVG_WINDOW' -v 5 20 100 -o dma_multistrat_1d.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> scan_parameters(["dma-tech","dma-etf"], start_date="2005-01-01", end_date="2017-01-01",
                    param1="SMAVG_WINDOW", vals1=[5,20,100],
                    filepath_or_buffer="dma_multistrat_1d.pdf")
$ curl -X POST 'http://houston:1969/moonshot/paramscans?strategies=dma-tech&strategies=dma-etf&start_date=2005-01-01&end_date=2017-01-01&param1=SMAVG_WINDOW&vals1=5&vals1=20&vals1=100' > dma_multistrat_1d.pdf

The tear sheet shows the scan results for the individual strategies and the aggregate portfolio:

moonshot paramscan multi-strategy 1-D tearsheet

Organize your Moonshot code

Your Moonshot code should be placed in a directory called 'moonshot' inside your codeload volume. For example, if your Docker Compose file looks like this...

codeload:
    image: 'quantrocket/codeload:latest'
    volumes:
        - /home/neilarmstrong/code:/codeload

...place your strategy code in /home/neilarmstrong/code/moonshot/. QuantRocket recursively scans .py files in this directory and loads your strategies (a strategy is defined as a subclass of moonshot.Moonshot). You can place as many strategies as you like within a single .py file, or you can place them in separate files. If you like, you can organize your .py files into subdirectories as you see fit.

If you want to re-use code across multiple files, you can do so using standard Python import syntax. Any .py files you place in the 'moonshot' directory inside your codeload volume can be imported from codeload_moonshot. For example, consider a simple directory structure containing two files for your strategies and one file with helper functions used by multiple strategies:

/home/neilarmstrong/code/moonshot/helpers.py
/home/neilarmstrong/code/moonshot/meanreversion_strategies.py
/home/neilarmstrong/code/moonshot/momentum_strategies.py

Suppose you've implemented a function in helpers.py called rebalance_positions. You can import and use the function in another file like so:

from codeload_moonshot.helpers import rebalance_positions

Importing also works if you're using subdirectories:

/home/neilarmstrong/code/moonshot/helpers/rebalance.py
/home/neilarmstrong/code/moonshot/meanreversion/buythedip.py
/home/neilarmstrong/code/moonshot/momentum/hml.py

Just use standard Python dot syntax to reach your modules wherever they are in the directory tree:

from codeload_moonshot.helpers.rebalance import rebalance_positions
To make your code importable as a standard Python package, the 'moonshot' directory and each subdirectory must contain a __init__.py file. QuantRocket will create these files automatically if they don't exist.

Modeling commissions

Moonshot supports realistic modeling of IB commissions. To model commissions, subclass the appropriate commission class, set the commission costs as per IB's website, then add the commission class to your strategy:

from moonshot import Moonshot
from moonshot.commission import PercentageCommission

class JapanStockFixedCommission(PercentageCommission):
    # look up commission costs on IB's website
    IB_COMMISSION_RATE = 0.0008 # 0.08% of trade value
    MIN_COMMISSION = 80.00 # JPY

class MyJapanStrategy(Moonshot):
    COMMISSION_CLASS = JapanStockFixedCommission
Because commission costs change from time to time, and because some cost components depend on account specifics such as your monthly trade volume or the degree to which you add or remove liquidity, Moonshot provides the commission logic but expects you to fill in the specific cost constants.

Percentage commissions

Use moonshot.commission.PercentageCommission where IB's commission is calculated as a percentage of the trade value. If you're using the tiered commission structure, you can also set an exchange fee (as a percentage of trade value). A variety of examples are shown below:

from moonshot.commission import PercentageCommission

class MexicoStockCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0010
    MIN_COMMISSION = 60.00 # MXN

class SingaporeStockTieredCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0008
    EXCHANGE_FEE_RATE = 0.00034775 + 0.00008025 # transaction fee + access fee
    MIN_COMMISSION = 2.50 # SGD

class UKStockTieredCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0008
    EXCHANGE_FEE_RATE = 0.000045 + 0.0025 # 0.45 bps + 0.5% stamp tax on purchases > 1000 GBP
    MIN_COMMISSION = 1.00 # GBP

class HongKongStockTieredCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0008
    EXCHANGE_FEE_RATE = (
          0.00005 # exchange fee
        + 0.00002 # clearing fee (2 HKD min)
        + 0.001 # Stamp duty
        + 0.000027 # SFC Transaction Levy
    )
    MIN_COMMISSION = 18.00 # HKD    

class JapanStockTieredCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0005 # 0.05% of trade value
    EXCHANGE_FEE_RATE = 0.00002 + 0.000004 # 0.002% Tokyo Stock Exchange fee + 0.0004% clearing fee
    MIN_COMMISSION = 80.00 # JPY

Per Share commissions

Use moonshot.commission.PerShareCommission to model commissions which are assessed per share (US and Canada stock commissions). Here is an example of a fixed commission for US stocks:

from moonshot.commission import PerShareCommission

class USStockFixedCommission(PerShareCommission):
    IB_COMMISSION_PER_SHARE = 0.005
    MIN_COMMISSION = 1.00

IB Cost-Plus commissions can be complex; in addition to the IB commission they may include exchange fees which are assessed per share (and which may differ depending on whether you add or remove liquidity), fees which are based on the trade value, and fees which are assessed as a percentage of the IB commission itself. These can also be modeled:

class CostPlusUSStockCommission(PerShareCommission):
    IB_COMMISSION_PER_SHARE = 0.0035
    EXCHANGE_FEE_PER_SHARE = (0.0002 # clearing fee per share
                             + (0.000119/2)) # FINRA activity fee (per share sold so divide by 2)
    MAKER_FEE_PER_SHARE = -0.002 # exchange rebate (varies)
    TAKER_FEE_PER_SHARE = 0.00118 # exchange fee (varies)
    MAKER_RATIO = 0.25 # assume 25% of our trades add liquidity, 75% take liquidity
    COMMISSION_PERCENTAGE_FEE_RATE = (0.000175 # NYSE pass-through (% of IB commission)
                                     + 0.00056) # FINRA pass-through (% of IB commission)
    PERCENTAGE_FEE_RATE = 0.0000231 # Transaction fees as a percentage of trade value
    MIN_COMMISSION = 0.35 # USD

class CanadaStockCommission(PerShareCommission):
    IB_COMMISSION_PER_SHARE = 0.008
    EXCHANGE_FEE_PER_SHARE = (
        0.00017 # clearing fee per share
        + 0.00011 # transaction fee per share
        )
    MAKER_FEE_PER_SHARE = -0.0019 # varies
    TAKER_FEE_PER_SHARE = 0.003 # varies
    MAKER_RATIO = 0 # assume we always take liquidity
    MIN_COMMISSION = 1.00 # CAD

Futures commissions

moonshot.commission.FuturesCommission lets you define a commission, exchange fee, and carrying fee per contract:

from moonshot.commission import FuturesCommission

class GlobexEquityEMiniFixedCommission(FuturesCommission):
    IB_COMMISSION_PER_CONTRACT = 0.85
    EXCHANGE_FEE_PER_CONTRACT = 1.18
    CARRYING_FEE_PER_CONTRACT = 0 # Depends on equity in excess of margin requirement

Forex commissions

Spot forex commissions are percentage-based, so moonshot.commission.SpotForexCommission can be used directly without subclassing:

from moonshot import Moonshot
from moonshot.commission import SpotForexCommission

class MyForexStrategy(Moonshot):
    COMMISSION_CLASS = SpotForexCommission

Note that at present, SpotForexCommission does not model minimum commissions (this has to do with the fact that the minimum commission for forex is always expressed in USD, rather than the currency of the traded security). This limitation means that if your trades are small, SpotForexCommission may underestimate the commission.

Modeling minimum commissions

During backtests, Moonshot calculates and assesses commissions in percentage terms (relative to the capital allocated to the strategy) rather than in dollar terms. However, since minimum commissions are expressed in dollar terms, this means that in order for Moonshot to accurately model minimum commissions, you have to specify your NLV (Net Liquidation Value, i.e. account balance) in the backtest. You should specify your NLV in each currency you wish to model.

For example, if your account balance is $100K USD, and your strategy trades instruments denominated in JPY and AUD, you could specify this on the strategy:

class MyAsiaStrategy(Moonshot):
    CODE = "my-asia-strategy"
    NLV = {
        "JPY": 100000 * 110, # 110 JPY per USD
        "AUD": 100000 * 1.25 # 1.25 AUD per USD
    }

Or pass the NLV on the command line at the time you run the backtest:

$ quantrocket moonshot backtest 'my-asia-strategy' --nlv 'JPY:11000000' 'AUD:125000'

If you don't specify NLV on the strategy or via the --nlv option, the backtest will still run; it just won't take minimum commissions into account.

Multiple commission structures on the same strategy

You might run a strategy that trades multiple securities with different commission structures. Instead of specifying a single commission class, you can specify a Python dictionary associating each commission class with the respective security type, exchange, and currency it applies to:

class USStockFixedCommission(PerShareCommission):
    IB_COMMISSION_PER_SHARE = 0.005
    MIN_COMMISSION = 1.00

class GlobexEquityEMiniFixedCommission(FuturesCommission):
    IB_COMMISSION_PER_CONTRACT = 0.85
    EXCHANGE_FEE_PER_CONTRACT = 1.18

class MultiSecTypeStrategy(Moonshot):
    # this strategy trades NYSE and NASDAQ stocks and GLOBEX futures
    COMMISSION_CLASS = {
        # dict keys should be tuples of (security type, exchange, currency)
        ("STK", "NYSE", "USD"): USStockFixedCommission,
        ("STK", "NASDAQ", "USD"): USStockFixedCommission,
        ("FUT", "GLOBEX", "USD"): GlobexEquityEMiniFixedCommission
    }

Fundamental data in Moonshot

To use financial statements from the Reuters Worldwide Fundamentals dataset in Moonshot, first fetch the data into your QuantRocket database as described in the fundamental data section of the usage guide.

You can use the Python client to query Reuters financials in your backtest, but Moonshot provides a utility method, get_reuters_financials, to make it even easier. Simply specify which metrics you want and provide one of your pricing DataFrames, such as your closing prices. Moonshot will query the financials using the securities and date range in your pricing DataFrame, and will return the financials in a DataFrame of the same shape. This allows you to easily combine pricing and fundamental data in your calculations.

For example, you can use fundamental data to calculate book value per share, then compare that to your closing prices to calculate price-to-book ratios:

def get_signals(self, prices):
    closes = prices.loc["Close"]

    # calculate book value per share, defined as:
    #
    #    (Total Assets - Total Liabilities) / Number of shares outstanding
    #
    # The codes for these metrics are 'ATOT' (Total Assets), 'LTLL' (Total
    # Liabilities), and 'QTCO' (Total Common Shares Outstanding).
    financials = self.get_reuters_financials(["ATOT", "LTLL", "QTCO"], closes)

    tot_assets = financials.loc["ATOT"].loc["Amount"]
    tot_liabilities = financials.loc["LTLL"].loc["Amount"]
    shares_out = financials.loc["QTCO"].loc["Amount"]
    book_values_per_share = (tot_assets - tot_liabilities)/shares_out

    # Calculate price-to-book ratio
    pb_ratios = closes/book_values_per_share
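
    # (Hypothetical continuation, not part of the original snippet.) One simple
    # way to turn the ratios into signals is to go long the cheapest decile of
    # stocks by price-to-book each day:
    have_low_pb_ratios = pb_ratios.rank(axis=1, pct=True) <= 0.10
    signals = have_low_pb_ratios.astype(int)
    return signals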

A demo strategy utilizing financial statements is available in the codeload-demo repository.

Zipline

Zipline and pyfolio are open-source libraries for running backtests and analyzing algorithm performance. Both libraries are developed by Quantopian. QuantRocket makes it easy to run Zipline backtests using historical data from QuantRocket's history service and view a pyfolio tear sheet of the results.

Ingest historical data

To run a Zipline backtest, first ingest a data bundle. QuantRocket lets you easily ingest 1-day or 1-minute history databases from the history service. Let's ingest historical data for AAPL so we can run the Zipline demo strategy.

First, assume we've already fetched 1-day bars for AAPL, like so:

$ # get the listing...
$ quantrocket master listings --exchange NASDAQ --symbols AAPL
status: the listing details will be fetched asynchronously
$ # monitor flightlog for listing details to be fetched, then make a universe:
$ quantrocket master get -e NASDAQ -s AAPL | quantrocket master universe 'just-aapl' -f -
code: just-aapl
inserted: 1
provided: 1
total_after_insert: 1
$ # get 1 day bars for AAPL
$ quantrocket history create-db 'aapl-1d' --universes 'just-aapl' --bar-size '1 day'
status: successfully created quantrocket.history.aapl-1d.sqlite
$ quantrocket history fetch 'aapl-1d'
status: the historical data will be fetched asynchronously

After the historical data request finishes, we can ingest our historical data into Zipline:

$ quantrocket zipline ingest --history-db 'aapl-1d'
msg: successfully ingested aapl-1d bundle
status: success

By default, the data bundle will use Zipline's NYSE calendar, but you can associate your data bundle with a different Zipline calendar:

$ # to see available Zipline calendars, you can pass an invalid value:
$ quantrocket zipline ingest --history-db 'london-stk-1d' --calendar ?
msg: 'unknown calendar ''?'', choices are: BMF, CFE, CME, ICE, LSE, NYSE, TSX, us_futures'
status: error
$ quantrocket zipline ingest --history-db 'london-stk-1d' --calendar 'LSE'
msg: successfully ingested london-stk-1d bundle
status: success

We can also list any bundles we've ingested:

$ quantrocket zipline bundles
aapl-1d:
- '2017-10-05 14:16:12.246592'
- '2017-10-05 14:07:51.482331'
- '2017-10-05 14:05:48.890156'
- '2017-10-05 13:51:09.721299'
london-stk-1d:
- '2017-10-05 14:20:11.241632'
quandl: []
quantopian-quandl: []

And clean up old bundles if needed:

$ quantrocket zipline clean -b 'aapl-1d' --keep-last 1
/root/.zipline/data/aapl-1d/2017-10-05T14;05;48.890156
/root/.zipline/data/aapl-1d/2017-10-05T13;51;09.721299
/root/.zipline/data/aapl-1d/2017-10-05T14;07;51.482331

Run a Zipline algo via CLI

Now let's write our strategy code. Here is a Zipline demo file of a dual moving average crossover strategy using AAPL:

# dual_moving_average.py

from zipline.api import order_target_percent, record, symbol, set_benchmark

def initialize(context):
    context.sym = symbol('AAPL')
    set_benchmark(symbol('AAPL'))
    context.i = 0

def handle_data(context, data):
    # Skip first 300 days to get full windows
    context.i += 1
    if context.i < 300:
        return

    # Compute averages
    # data.history() returns a pandas Series of trailing prices,
    # which we average to compute the moving averages.
    short_mavg = data.history(context.sym, 'price', 100, '1d').mean()
    long_mavg = data.history(context.sym, 'price', 300, '1d').mean()

    # Trading logic
    if short_mavg > long_mavg:
        # order_target_percent orders as many shares as needed to
        # achieve the desired percent allocation.
        order_target_percent(context.sym, 0.2)
    elif short_mavg < long_mavg:
        order_target_percent(context.sym, 0)

    # Save values for later inspection
    record(AAPL=data.current(context.sym, "price"),
           short_mavg=short_mavg,
           long_mavg=long_mavg)

Place this file in a directory called 'zipline' inside your codeload volume. For example, if your Docker Compose file looks like this...

codeload:
    image: 'quantrocket/codeload:latest'
    volumes:
        - /home/neilarmstrong/code:/codeload

...place your strategy code in /home/neilarmstrong/code/zipline/.

Now we can run our backtest and save the results file, then use the results file to get a pyfolio PDF tear sheet:

$ quantrocket zipline run --bundle 'aapl-1d' -f 'dual_moving_average.py' -s '2000-01-01' -e '2017-01-01' -o aapl_results.csv
$ quantrocket zipline tearsheet aapl_results.csv -o aapl_results.pdf

Open the PDF and have a look:

zipline pyfolio tearsheet

Run a Zipline algo in a notebook

In addition to running Zipline backtests from the command line, you can also run them from a Jupyter notebook. The command line provides a quick, easy way to generate a PDF tear sheet in the fewest lines of code, while the Jupyter notebook environment lets you explore your backtest results more interactively.
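
For example, if you've saved a results CSV via the CLI as shown above, you can load it in a notebook and build a pyfolio tear sheet interactively. Here is a minimal sketch, assuming the results file includes a 'returns' column indexed by date (as in Zipline's standard performance output):

import pandas as pd
import pyfolio as pf

# load the Zipline results CSV saved by the CLI example above
results = pd.read_csv("aapl_results.csv", index_col=0, parse_dates=True)

# pyfolio expects a timezone-aware daily returns series
returns = results["returns"]
returns.index = pd.to_datetime(returns.index, utc=True)

# display the tear sheet inline in the notebook
pf.create_returns_tear_sheet(returns)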

To learn how to run your Zipline algo in a notebook, check out the example notebooks in the codeload-demo repository.

To learn more about Zipline and pyfolio, check out each project's documentation:

Fundamental data in Zipline

QuantRocket makes it easy to use financial statements from the Reuters Worldwide Fundamentals dataset in Zipline's Pipeline API. First fetch the data into your QuantRocket database as described in the fundamental data section of the usage guide.

To use the fundamental data in Pipeline, import the ReutersFinancials Pipeline dataset (for annual financial reports) or the ReutersInterimFinancials dataset (for interim/quarterly financial reports) from the zipline_extensions package provided by QuantRocket. You can reference any of the available financial statement indicator codes and use them to build a custom Pipeline factor. (See the fundamental data section of the usage guide for help looking up the codes.)

Below, we create a custom Pipeline factor that calculates price-to-book ratio.

from zipline.pipeline import Pipeline, CustomFactor
from zipline.pipeline.data import USEquityPricing
# zipline_extensions is provided by QuantRocket
from zipline_extensions.pipeline.data import ReutersFinancials # or ReutersInterimFinancials

# Create a price-to-book custom pipeline factor
class PriceBookRatio(CustomFactor):
    """
    Custom factor that calculates price-to-book ratio.

    First, calculate book value per share, defined as:

        (Total Assets - Total Liabilities) / Number of shares outstanding

    The codes we'll use for these metrics are 'ATOT' (Total Assets),
    'LTLL' (Total Liabilities), and 'QTCO' (Total Common Shares Outstanding).

    Price-to-book ratio is then calculated as:

        closing price / book value per share
    """
    inputs = [
        USEquityPricing.close, # despite the name, this works fine for non-US equities too
        ReutersFinancials.ATOT, # total assets
        ReutersFinancials.LTLL, # total liabilities
        ReutersFinancials.QTCO # common shares outstanding
    ]
    window_length = 1

    def compute(self, today, assets, out, closes, tot_assets, tot_liabilities, shares_out):
        book_values_per_share = (tot_assets - tot_liabilities)/shares_out
        pb_ratios = closes/book_values_per_share
        out[:] = pb_ratios

Now we can use our custom factor in our Pipeline:

pipe = Pipeline()
pb_ratios = PriceBookRatio()
pipe.add(pb_ratios, 'pb_ratio')
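
For context, here is a hypothetical sketch (not taken from the demo strategy) of how such a pipeline is typically attached to a Zipline algorithm and its output retrieved each day:

from zipline.api import attach_pipeline, pipeline_output
from zipline.pipeline import Pipeline

def initialize(context):
    pipe = Pipeline()
    pipe.add(PriceBookRatio(), 'pb_ratio')
    # register the pipeline under a name of our choosing
    attach_pipeline(pipe, 'pb_pipeline')

def before_trading_start(context, data):
    # DataFrame of pb_ratio values for today, indexed by asset
    context.pipeline_data = pipeline_output('pb_pipeline')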

A demo strategy utilizing financial statements is available in the codeload-demo repository.

See Zipline's documentation for more on using the Pipeline API.

Integrate other backtesters

QuantRocket makes it easy to integrate other Python backtesters. You can tell QuantRocket what packages to install, you can use QuantRocket's Python client to pull historical and fundamental data into your strategy code, and you can use the CLI to run your backtests. You get the benefit of QuantRocket's infrastructure and data services together with the freedom and flexibility to choose the backtester best suited to your particular strategy.

As an example, we'll show how to connect the open-source Python backtesting framework backtrader to QuantRocket.

Define your satellite service

First, we add a satellite service to our Docker Compose file or Docker Cloud stack file and tell QuantRocket what packages to install on it. The service name should consist of alphanumerics and hyphens, and should begin with 'satellite'. We'll name our backtrader service 'satellite':

# docker-compose.yml
services:
    ...
    satellite:
        image: 'quantrocket/satellite:latest'
        volumes_from:
            - codeload
        environment:
            PIP_INSTALL: 'backtrader>=1.9'

The satellite service is QuantRocket's extensible service for bringing outside packages and tools into the QuantRocket solar system. The quantrocket/satellite Docker image ships with Anaconda, Python 3, and the QuantRocket client. We can instruct the service to install additional Python packages by specifying an environment variable called PIP_INSTALL which should contain a space-separated string of Python packages. If needed, we can also install Debian packages by specifying an APT_INSTALL environment variable, but we don't need this for our example.

Now that we've defined our service, we can launch our service using Docker Compose (or Docker Cloud for cloud deployments):

$ docker-compose -f path/to/docker-compose.yml -p quantrocket up -d satellite

Write your strategy code

Let's write a basic moving average strategy for backtrader using AAPL stock. First, assume we've already fetched 1-day bars for AAPL, like so:

$ # get the listing...
$ quantrocket master listings --exchange NASDAQ --symbols AAPL
status: the listing details will be fetched asynchronously
$ # monitor flightlog for listing details to be fetched, then make a universe:
$ quantrocket master get -e NASDAQ -s AAPL | quantrocket master universe 'just-aapl' -f -
code: just-aapl
inserted: 1
provided: 1
total_after_insert: 1
$ # get 1 day bars for AAPL
$ quantrocket history create-db 'aapl-1d' --universes 'just-aapl' --bar-size '1 day'
status: successfully created quantrocket.history.aapl-1d.sqlite
$ quantrocket history fetch 'aapl-1d'
status: the historical data will be fetched asynchronously

Now that we have historical data for AAPL, we can use it in backtrader by downloading a CSV and creating our backtrader data feed from it. The relevant snippet is shown below:

import backtrader.feeds as btfeeds
from quantrocket.history import download_history_file

# Create data feed using QuantRocket data and add to backtrader
# (Put files in /tmp to have QuantRocket automatically clean them out after
# a few hours)
download_history_file(
    'aapl-1d',
    filepath_or_buffer='/tmp/aapl-1d.csv',
    fields=['ConId','Date','Open','Close','High','Low','Volume'])

data = btfeeds.GenericCSVData(
    dataname='/tmp/aapl-1d.csv',
    dtformat=('%Y-%m-%d'),
    datetime=1,
    open=2,
    close=3,
    high=4,
    low=5,
    volume=6
)
cerebro.adddata(data)

A backtest commonly ends by plotting a performance chart, but since our code will be running in a headless Docker container, we should save the plot to a file (which we'll tell QuantRocket to return to us when we run the backtest):

# Save the plot to PDF so the satellite service can return it (make sure
# to use the Agg backend)
cerebro.plot(use='Agg', savefig=True, figfilename='/tmp/backtrader-plot.pdf')

A complete, working strategy is shown below:

# dual_moving_average.py

import backtrader as bt
import backtrader.feeds as btfeeds
from quantrocket.history import download_history_file

class DualMovingAverageStrategy(bt.SignalStrategy):

    params = (
        ('smavg_window', 100),
        ('lmavg_window', 300),
    )

    def __init__(self):

        # Compute long and short moving averages
        smavg = bt.ind.SMA(period=self.p.smavg_window)
        lmavg = bt.ind.SMA(period=self.p.lmavg_window)

        # Go long when short moving average is above long moving average
        self.signal_add(bt.SIGNAL_LONG, bt.ind.CrossOver(smavg, lmavg))

if __name__ == '__main__':

    cerebro = bt.Cerebro()

    # Create data feed using QuantRocket data and add to backtrader
    # (Put files in /tmp to have QuantRocket automatically clean them out after
    # a few hours)
    download_history_file(
        'aapl-1d',
        filepath_or_buffer='/tmp/aapl-1d.csv',
        fields=['ConId','Date','Open','Close','High','Low','Volume'])

    data = btfeeds.GenericCSVData(
        dataname='/tmp/aapl-1d.csv',
        dtformat=('%Y-%m-%d'),
        datetime=1,
        open=2,
        close=3,
        high=4,
        low=5,
        volume=6
    )
    cerebro.adddata(data)

    cerebro.addstrategy(DualMovingAverageStrategy)
    cerebro.run()

    # Save the plot to PDF so the satellite service can return it (make sure
    # to use the Agg backend)
    cerebro.plot(use='Agg', savefig=True, figfilename='/tmp/backtrader-plot.pdf')

Place this file in your codeload volume, which we mounted inside the satellite service above. Inside the satellite service, the codeload volume will be mounted at /codeload. Reminder: for local deployments, you probably mapped the codeload service to a directory on your host machine containing your code and config files; you'll place the algo file in this directory. For cloud deployments, you probably told codeload to pull your code and config from a Git repo; you'll place the algo file in your Git repo.

Run your backtests

We can now run our backtest from the QuantRocket client. The API for the satellite service lets us execute an arbitrary command and optionally return a file. In our case, we'll execute our algo script and tell QuantRocket to return the PDF performance chart that our script will create.

$ # We've placed our dual_moving_average.py script in a 'backtrader' folder in our
$ # codeload volume, and the codeload volume is mounted inside the Docker
$ # container at /codeload, so the path to our script inside the container is
$ # /codeload/backtrader/dual_moving_average.py
$ quantrocket satellite exec 'python /codeload/backtrader/dual_moving_average.py' --return-file '/tmp/backtrader-plot.pdf' --outfile 'backtrader-plot.pdf'
$ # now we can have a look at backtrader-plot.pdf
>>> from quantrocket.satellite import execute_command
>>> # We've placed our dual_moving_average.py script in a 'backtrader' folder in our
>>> # codeload volume, and the codeload volume is mounted inside the Docker
>>> # container at /codeload, so the path to our script inside the container is
>>> # /codeload/backtrader/dual_moving_average.py
>>> execute_command("python /codeload/backtrader/dual_moving_average.py",
                    return_file="/tmp/backtrader-plot.pdf",
                    filepath_or_buffer="backtrader-plot.pdf")
>>> # now we can have a look at backtrader-plot.pdf
$ # We've placed our dual_moving_average.py script in a 'backtrader' folder in our
$ # codeload volume, and the codeload volume is mounted inside the Docker
$ # container at /codeload, so the path to our script inside the container is
$ # /codeload/backtrader/dual_moving_average.py
$ curl -X POST 'http://houston:1969/satellite/commands?cmd=python%20%2Fcodeload%2Fbacktrader%2Fdual_moving_average.py&return_file=%2Ftmp%2Fbacktrader-plot.pdf' > backtrader-plot.pdf
$ # now we can have a look at backtrader-plot.pdf

Running out of memory

There's no way around it: quantitative trading requires lots of memory. Exactly how much memory depends on a variety of factors including the number of securities in your research and backtests, the depth and granularity of your data, the number of data fields, and your analytical techniques. While QuantRocket's historical and fundamental data services are designed to efficiently serve data without loading all of it into memory, in your research and backtests you will typically load large amounts of data into memory. Sooner or later, you may run out of memory.

QuantRocket makes no attempt to prevent you from loading more data than your system can handle; it tries to do whatever you ask. Luckily, because QuantRocket runs inside Docker, running out of memory won't crash your whole deployment or the host OS; in most cases Docker will simply kill the process that tried to use too much memory. By the nature of out-of-memory errors, QuantRocket doesn't get a chance to provide a helpful error message, so it's worth knowing what to look for.

If you run out of memory in a Jupyter notebook, Docker will kill the kernel process and you'll probably see a message like this:

Jupyter Notebooks killed process

If you run out of memory in a backtest, you'll get a 502 error referring you to flightlog, which will instruct you to add more memory or reduce your backtest:

$ quantrocket moonshot backtest 'big-boy' --start-date '2000-01-01'
msg: 'HTTPError(''502 Server Error: Bad Gateway for url: http://houston:1969/moonshot/backtests?strategies=big-boy&start_date=2000-01-01'',
  ''please check the logs for more details'')'
status: error
$ quantrocket flightlog stream --hist 1
2017-10-02 19:29:32 quantrocket.moonshot: ERROR the system killed the worker handling the request, likely an Out Of Memory error; please add more memory or try a smaller request

You can use Docker to check how much memory is available and how much is being used by different containers:

$ docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
NAME                                CPU %               MEM USAGE / LIMIT
quantrocket_moonshot_1              0.01%               58.25MiB / 7.952GiB
quantrocket_jupyter_1               0.01%               64.86MiB / 7.952GiB
quantrocket_zipline_1               0.01%               65.26MiB / 7.952GiB
quantrocket_master_1                0.02%               31.85MiB / 7.952GiB
...

Trading

Scheduling

You can use QuantRocket's cron service, named "countdown," to schedule automated tasks such as fetching historical data or running your trading strategies.

You can pick the timezone in which you want to schedule your tasks, and you can create as many countdown services as you like. If you plan to trade in multiple timezones, consider creating a separate countdown service for each timezone where you will trade.

When scheduling cron jobs, it's easiest to schedule the jobs in the timezone of the exchange they relate to. For example, if you want to download stock loan data for Australian stocks every day at 9:45 AM before the market opens at 10:00 AM local time, it's better to schedule this in Sydney time than in, say, New York time. Scheduling in New York time would require you to adjust the crontab times several times per year whenever there is a daylight saving time change in New York or Sydney. By scheduling the cron job in Sydney time, you never have to worry about this. If you also have other cron jobs that need to be anchored to another timezone, run a separate countdown service for those jobs.

Add a countdown service

To add a countdown service to an existing deployment, use the configuration wizard to define the name and timezone of your countdown service:

Countdown configuration wizard

Copy the block of YAML for the countdown service from the configuration wizard and paste it at the bottom of your Docker Compose or Stack file. As an example, if you define a countdown service running in New York time, the block of YAML might look like this:

countdown-newyork:
  image: 'quantrocket/countdown:1.1.0'
  environment:
    SERVICE_NAME: countdown-newyork
    TZ: America/New_York
  volumes_from:
    - codeload

You can then deploy the new service. For local deployments:

$ cd /path/to/docker-compose.yml
$ docker-compose -p quantrocket up -d

Create your crontab

You can create and edit your crontab within the Jupyter environment. The countdown service uses a naming convention to recognize and load the correct crontab. In the above example of a countdown service named countdown-newyork, the service will look for and load a crontab named quantrocket.countdown-newyork.crontab. The expected filename is displayed in the configuration wizard when you first define the service. This file should be created at the top level of your codeload volume, that is, at the top level of your Jupyter file browser.

create crontab

After you create the file, you can add cron jobs as on a standard crontab. An example crontab is shown below:

# Crontab syntax cheat sheet
# .------------ minute (0 - 59)
# |   .---------- hour (0 - 23)
# |   |   .-------- day of month (1 - 31)
# |   |   |   .------ month (1 - 12) OR jan,feb,mar,apr ...
# |   |   |   |   .---- day of week (0 - 6) (Sunday=0 or 7)  OR sun,mon,tue,wed,thu,fri,sat
# |   |   |   |   |
# *   *   *   *   *   command to be executed

# Fetch historical data Monday-Friday evenings at 5:30 PM
30 17 * * 1-5 quantrocket history fetch 'nasdaq-1d'
# Fetch fundamental data on Sunday afternoons
0 14 * * 7 quantrocket fundamental fetch-financials -u 'nasdaq'

Each time you edit the crontab, the corresponding countdown service will detect the change and reload the file.

Validate your crontab

Whenever you save your crontab, it's a good idea to have flightlog open (quantrocket flightlog stream) so you can check that it was successfully loaded by the countdown service:

2018-02-21 09:31:57 quantrocket.countdown-newyork: INFO Successfully loaded quantrocket.countdown-newyork.crontab

If there are syntax errors in the file, it will be rejected:

2018-02-21 09:32:38 quantrocket.countdown-newyork: ERROR quantrocket.countdown-newyork.crontab is invalid, please correct the errors:
2018-02-21 09:32:38 quantrocket.countdown-newyork: ERROR     new crontab file is missing newline before EOF, can't install.
2018-02-21 09:32:38 quantrocket.countdown-newyork: ERROR

You can also use the client to print out the crontab installed in your container so you can verify that it is as expected:

$ quantrocket countdown crontab countdown-newyork
>>> from quantrocket.countdown import get_crontab
>>> get_crontab("countdown-newyork")
$ curl -X GET 'http://houston:1969/countdown-newyork/crontab'

Even if your crontab is free of syntax errors and loads successfully, the scheduled commands themselves might still produce errors when they run, and you will want to know about those. You can monitor flightlog for this purpose, as any errors returned by the unattended commands will be logged to flightlog. Setting up flightlog's Papertrail integration works well here, as it allows you to monitor from anywhere and set up alerts.

Account Monitoring

QuantRocket keeps track of your IB account balances and of exchange rates between your IB base currency and other currencies you might trade.

IB account balances

Whenever you're connected to IB, QuantRocket pings IB every few minutes and saves your latest account balance details to your database. One reading per day (if available) is retained permanently to provide a historical record of your account balances over time.

You can query your latest account balance through QuantRocket without having to open Trader Workstation. IB provides many account-related fields, so you might want to limit which fields are returned. This will check your Net Liquidation Value (IB's term for your account balance):

$ quantrocket account balance --latest --fields 'NetLiquidation' --pretty
Account     Currency    NetLiquidation  LastUpdated
----------  ----------  --------------  -------------------
DU123456    USD         500000.0        2017-12-28 16:48:03
>>> from quantrocket.account import download_account_balances
>>> import io
>>> f = io.StringIO()
>>> download_account_balances(f, latest=True, fields=["NetLiquidation"], output="txt")
>>> print(f.getvalue())
Account     Currency    NetLiquidation  LastUpdated
----------  ----------  --------------  -------------------
DU123456    USD         500000.0        2017-12-28 16:48:03
$ curl 'http://houston:1969/account/balances.txt?latest=true&fields=NetLiquidation'
Account     Currency    NetLiquidation  LastUpdated
----------  ----------  --------------  -------------------
DU123456    USD         500000.0        2017-12-28 16:48:03

Or you can download a CSV of your available account balance history:

$ quantrocket account balance --outfile balances.csv
>>> from quantrocket.account import download_account_balances
>>> download_account_balances("balances.csv")
>>> import pandas as pd
>>> balances = pd.read_csv("balances.csv")
$ curl 'http://houston:1969/account/balances.csv' > balances.csv
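
Once you have the CSV, you can analyze the balance history with pandas. Here is a minimal sketch (for example, in a Jupyter notebook), assuming the column names shown in the output above (LastUpdated, NetLiquidation):

import pandas as pd

balances = pd.read_csv("balances.csv", parse_dates=["LastUpdated"])

# plot account equity over time (if multiple accounts or currencies are
# present, you would first filter to the one you care about)
nlv = balances.set_index("LastUpdated")["NetLiquidation"]
nlv.plot(title="Net liquidation value")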

Using the CLI, you can filter the output to show only accounts where the margin cushion is below 5%, and log the results (if any) to flightlog:

$ quantrocket account balance --latest --below 'Cushion:0.05' --fields 'NetLiquidation' 'Cushion' --pretty | quantrocket flightlog log --name 'quantrocket.account' --level 'CRITICAL'

If you've set up Twilio alerts for CRITICAL messages, you can add this command to the crontab on one of your countdown services, and you'll get a text message whenever you're at risk of auto-liquidation by IB. If no accounts are below the cushion, nothing will be logged.

Exchange rates

To support currency conversions between your base currency and other currencies you might trade, QuantRocket fetches daily exchange rates and stores them in your database. Exchange rates come from the European Central Bank, which updates them each business day at 4 PM CET.

You probably won't need to query the exchange rates directly very often, but you can if needed. You can check the latest exchange rates:

$ quantrocket account rates --latest --pretty
BaseCurrency  QuoteCurrency  Rate        Date
------------  -------------  ----------  ----------
USD           AUD            1.2774      2018-01-09
USD           CAD            1.2425      2018-01-09
USD           CHF            0.98282     2018-01-09
...
>>> from quantrocket.account import download_exchange_rates
>>> import io
>>> f = io.StringIO()
>>> download_exchange_rates(f, latest=True, output="txt")
>>> print(f.getvalue())
BaseCurrency  QuoteCurrency  Rate        Date
------------  -------------  ----------  ----------
USD           AUD            1.2774      2018-01-09
USD           CAD            1.2425      2018-01-09
USD           CHF            0.98282     2018-01-09
...
$ curl 'http://houston:1969/account/rates.txt?latest=true'
BaseCurrency  QuoteCurrency  Rate        Date
------------  -------------  ----------  ----------
USD           AUD            1.2774      2018-01-09
USD           CAD            1.2425      2018-01-09
USD           CHF            0.98282     2018-01-09
...

Or download a CSV of all exchange rates stored in your database:

$ quantrocket account rates --outfile rates.csv
>>> from quantrocket.account import download_exchange_rates
>>> download_exchange_rates("rates.csv")
>>> import pandas as pd
>>> rates = pd.read_csv("rates.csv")
$ curl 'http://houston:1969/account/rates.csv' > rates.csv
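
If you do need the rates, you can work with the CSV in pandas. Here is a minimal sketch of converting a JPY amount into your base currency using the latest stored rate; it assumes the column names shown above (BaseCurrency, QuoteCurrency, Rate, Date) and that JPY is among the stored quote currencies:

import pandas as pd

rates = pd.read_csv("rates.csv", parse_dates=["Date"])

# keep only the most recent rate for each quote currency
latest = rates.sort_values("Date").groupby("QuoteCurrency").last()

jpy_per_base = latest.loc["JPY", "Rate"]  # units of JPY per 1 unit of base currency
jpy_amount = 11000000
amount_in_base_currency = jpy_amount / jpy_per_base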

Logging

Stream logs in real-time

The best way to monitor your logs is through Papertrail (see the configuration wizard for details). But you can also stream your logs, tail -f style, from flightlog:

$ quantrocket flightlog stream
2017-01-18 10:19:31 quantrocket.flightlog: INFO Detected a change in flightlog configs directory, reloading configs...
2017-01-18 10:19:31 quantrocket.flightlog: INFO Successfully loaded config
2017-01-18 14:25:57 quantrocket.master: INFO Requesting contract details for error 200 symbols

Flightlog provides application-level monitoring of the sort you will typically want to keep an eye on. For more verbose, low-level system logging which may be useful for troubleshooting, you can stream logs from the logspout service:

$ quantrocket flightlog stream --detail
quantrocket_houston_1|172.21.0.1 - - [18/Jan/2017:10:14:48 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2017-01-18 10:19:31 quantrocket.flightlog: INFO Detected a change in flightlog configs directory, reloading configs...
2017-01-18 10:19:31 quantrocket.flightlog: INFO Successfully loaded config
2017-01-18 14:25:57 quantrocket.master: INFO Requesting contract details for error 200 symbols
test_houston_1|2017/01/18 20:59:01 [error] 5#5: *17137 open() "/usr/local/openresty/nginx/html/invalidpath" failed (2: No such file or directory), client: 172.20.0.8, server: localhost, request: "GET /invalidpath HTTP/1.1", host: "houston"

Download log files

In addition to streaming your logs, you can also download log files, which contain up to 7 days of log history. You can download the application logs:

$ quantrocket flightlog get /path/to/localdir/app.log

Or you can download the more verbose system logs:

$ quantrocket flightlog get --detail /path/to/localdir/system.log

Papertrail integration

Papertrail is a log management service that lets you monitor logs from a web interface, flexibly search the logs, and send alerts to other services (email, Slack, PagerDuty, webhooks, etc.) based on log message criteria. You can configure flightlog to send your logs to your Papertrail account.

To get started, sign up for a Papertrail account (free plan available).

In Papertrail, locate your Papertrail host and port number (Settings > Log Destinations).

Use the QuantRocket configuration wizard to enter your Papertrail configuration:

Papertrail configuration wizard

Copy the block of YAML for the flightlog service from the configuration wizard and paste it into your Docker Compose or Stack file, replacing the existing flightlog YAML block. An example YAML block is shown below:

flightlog:
  image: 'quantrocket/flightlog:1.1.0'
  volumes:
    - /var/log/flightlog
  environment:
    PAPERTRAIL_HOST: logsX.papertrailapp.com
    PAPERTRAIL_PORT: 'XXXXX'
    PAPERTRAIL_LOGLEVEL: DEBUG

Redeploy the flightlog service. For local deployments:

$ cd /path/to/docker-compose.yml
$ docker-compose -p quantrocket up -d flightlog

You can log a message from the CLI to test your Flightlog configuration:

$ quantrocket flightlog log "this is a test" --name myapp --level INFO

Your message should show up in Papertrail:

Papertrail log message

You can set up alerts in Papertrail based on specific log criteria. For example, you could configure Papertrail to email you whenever new ERROR-level log messages arrive.

Send log messages

You can use the Python client to log to Flightlog from your own code:

import logging
from quantrocket.flightlog import FlightlogHandler

logger = logging.getLogger('myapp')
logger.setLevel(logging.DEBUG)
handler = FlightlogHandler()
logger.addHandler(handler)

logger.info('my app just opened a position')
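
The handler plugs into the standard logging module, so the usual logging patterns apply. For example, here is a hedged sketch of logging an exception with its traceback, building on the logger configured above (risky_operation is a placeholder for your own code):

def risky_operation():
    # placeholder for your own code
    raise ValueError("something went wrong")

try:
    risky_operation()
except Exception:
    # logs at ERROR level and includes the traceback
    logger.exception("my app hit an error")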

You can also log directly from the CLI (this is a good way to test your Flightlog configuration):

$ quantrocket flightlog log "this is a test" --name myapp --level INFO

If you're streaming your logs, you should see your message show up:

Output
2018-02-21 10:59:01 myapp: INFO this is a test

Log command output

The CLI can accept a log message over stdin, which is useful for piping in the output of another command. In the example below, we check our balance with the --below option to only show account balance info if the cushion has dropped too low. If the cushion is safe, the first command produces no output and nothing is logged. If the cushion is too low, the output is logged to flightlog at a CRITICAL level:

$ quantrocket account balance --latest --below 'Cushion:0.02' --fields 'NetLiquidation' 'Cushion' --pretty | quantrocket flightlog log --name 'quantrocket.account' --level 'CRITICAL'

If you've set up Twilio alerts for CRITICAL messages, you can add this command to the crontab on one of your countdown services, and you'll get a text message whenever there's trouble.

Performance Tracking

Database Management

Backing up your databases

QuantRocket can backup your databases to Amazon S3 (Amazon account required). Provide your AWS credentials to the db service as environment variables (see the configuration wizard for guidance). You can then backup your databases, either for a specific service or for all services, using the "all" keyword as in this example:

$ quantrocket db s3push all
status: the databases will be pushed to S3 asynchronously

If the same database already exists in S3, it will be overwritten by the new version of the database. If you wish to keep multiple versions, you can enable versioning on your S3 bucket.

You can use your crontab to automate the backup process. It's also good to optimize your databases periodically, preferably when nothing else is using them. For example:

0 0 * * * quantrocket db s3push all && quantrocket db optimize all

Restoring databases from backup

You can restore backups from S3 to your QuantRocket deployment:

$ quantrocket db s3pull all
status: the databases will be pulled from S3 asynchronously

In fact, if you provide AWS credentials, this command will automatically be run when the db service launches, allowing you to redeploy QuantRocket with databases backed up from an earlier deployment.

Working directly with databases

QuantRocket uses SQLite as its database backend. SQLite is fast, simple, and reliable. SQLite databases are ordinary disk files, making them easy to copy, move, and work with. If you want to run SQL queries directly against your databases, you can use the sqlite3 command line tool, either within the Docker container or on a separate downloaded copy of the database.

The safest way to run SQL queries against your databases is to first copy the database. You can list the available databases, then download the one you care about:

$ # list databases
$ quantrocket db list
quantrocket.blotter.orders.sqlite
quantrocket.history.nyse-lrg.sqlite
quantrocket.history.nyse-mid.sqlite
quantrocket.history.nyse-sml.sqlite
quantrocket.master.main.sqlite
$ # download a copy of one
$ quantrocket db get quantrocket.history.nyse-lrg.sqlite /tmp/quantrocket.history.nyse-lrg.sqlite

Now you can safely use sqlite3 to explore and analyze your copy of the database.

$ sqlite3 /tmp/quantrocket.history.nyse-lrg.sqlite
sqlite>
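
If you prefer Python to the sqlite3 command line shell, the standard library's sqlite3 module works on the same downloaded copy. Here is a minimal sketch that lists the tables in the database (table layouts vary by service, so inspect before querying):

import sqlite3

conn = sqlite3.connect("/tmp/quantrocket.history.nyse-lrg.sqlite")

# list the tables in this database
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
print(tables)

conn.close()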

Databases use the following naming convention: quantrocket.{service}.{code}.sqlite. The {code} portion of the database name is unique for the service but not necessarily unique across all services, and is used as shorthand for specifying the database in certain parts of QuantRocket (for example, fetching of historical data is triggered by specifying the database code).

Alternatively, to run queries inside the Docker container, use docker exec to open a shell inside the container (or click "Terminal" on the container page if running in Docker Cloud). You can then list the databases, which are located in /var/lib/quantrocket, and open a sqlite3 shell into the database you're interested in:

$ docker exec -ti quantrocket_db_1 bash
root@71d2acb4d10c:/$ ls /var/lib/quantrocket
quantrocket.blotter.orders.sqlite
quantrocket.history.nyse-lrg.sqlite
quantrocket.history.nyse-mid.sqlite
quantrocket.history.nyse-sml.sqlite
quantrocket.master.main.sqlite
root@71d2acb4d10c:/$ sqlite3 /var/lib/quantrocket/quantrocket.history.nyse-lrg.sqlite
sqlite>

SQLite supports unlimited concurrent reads but only one writer at a time. Be careful if you choose to work directly with your QuantRocket databases, as you could break QuantRocket by doing so. You should limit yourself to SELECT queries. A safer approach is to copy the database as described above.

Exporting all databases

Databases are stored inside a Docker volume, a special Docker-managed area of the filesystem. If you want to export all of your databases, the easiest way is to use Docker's cp command to copy the entire database directory to your host machine:

$ docker cp quantrocket_db_1:/var/lib/quantrocket exported_quantrocket_dbs

Advanced Topics

Custom Docker services

If you run your own custom Docker services inside the same Docker network as QuantRocket, and those services provide an HTTP API, you can access them through houston. Assuming a custom Docker service named secretsauce listening on port 80 inside the Docker network and providing an API endpoint /secretstrategy/signals, you can access your service at:

$ curl -X GET 'http://houston:1969/proxy/http/secretsauce/80/secretstrategy/signals'
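
The same proxied endpoint can also be called from Python with an ordinary HTTP client. Here is a minimal sketch using requests (the secretsauce service and its endpoint are the hypothetical example above, and the JSON response format is an assumption about what your service returns):

import requests

response = requests.get(
    "http://houston:1969/proxy/http/secretsauce/80/secretstrategy/signals")
response.raise_for_status()
signals = response.json()  # assuming your service returns JSON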

Houston can also proxy services speaking the uWSGI protocol:

$ curl -X GET 'http://houston:1969/proxy/uwsgi/secretsauce/80/secretstrategy/signals'

The benefit of using houston as a proxy, particularly if running QuantRocket in the cloud, is that you don't need to expose your custom service to a public port; your service is only accessible from within your trusted Docker network, and all requests from outside the network must go through houston, which you can secure with SSL and Basic Auth. The following table depicts an example configuration:

Service        Port exposed to other services in the Docker network    Port mapped on the host OS    Directly reachable from outside?
houston        443 and 80                                              443 (80 not mapped)           yes
secretsauce    80                                                      not mapped                    no

So you would connect to houston securely on port 443 and houston would connect to secretsauce on port 80, but you would not connect directly to the secretsauce service. Your service would use EXPOSE 80 in its Dockerfile but you would not use the -p/--publish option when starting the container with docker run (or the ports key in Docker Compose).

HTTP request concurrency

The number of workers available to handle HTTP requests in a QuantRocket service is set via environment variable and can be overridden. If you have a very active deployment, you might find it beneficial to increase the number of workers (at the cost of greater resource consumption). First, check the current number of workers:

$ docker exec quantrocket_master_1 env | grep UWSGI_WORKERS
UWSGI_WORKERS=3

Override the variable by setting the desired value in your Compose file or Stack file:

# docker-compose.yml
master:
    image: 'quantrocket/master:latest'
    environment:
        UWSGI_WORKERS: 5

Then redeploy the service:

$ docker-compose -f docker-compose.yml -p quantrocket up -d master

CLI output format

By default, the command line interface (CLI) will display command results in YAML format:

$ quantrocket launchpad status
ibg1: stopped
ibg2: running
ibg3: stopped

If you prefer the output format to be JSON, set an environment variable called QUANTROCKET_CLI_OUTPUT_FORMAT:

$ export QUANTROCKET_CLI_OUTPUT_FORMAT=json
$ quantrocket launchpad status
{"ibg1": "stopped", "ibg2": "running", "ibg3": "stopped"}