Deep Q-learning test results for algorithmic trading

In the previous post, we trained our deep Q-network on historical Litecoin data. In this post, we are going to learn how to test the network on a separate test data set, and we will see whether we can actually make money using AI. Before starting, I highly recommend reading the previous post, because you will need a trained network in order to run the test. The link to the previous post is:

Now, let’s have a look at the close price history of the training data. It is Litecoin data recorded between 10 July 2019 and 10 Aug 2019, sampled at 1-hour intervals.

Litecoin Historical Data

I assume that you have trained your network and have the weights file named ‘ltcusdt-1hour-weights.h5f’. Now it is time to test the network. For testing, it is extremely important to use a different data set, because we need to make sure that the network’s performance on the training data is not due to overfitting. Therefore, we are going to use Bitcoin data, also recorded between 10 July 2019 and 10 Aug 2019. You may also use data for some other coin, or Litecoin data from a different time interval. For this, you may have a look at the following posts:

I assume that you have already obtained the file ‘btcusdt-1hour.csv’ for testing. Let’s get started. In your working directory, create a file named ‘’. Using your favorite editor, copy and paste the following code block:

from binanceFeatures import FeatureExtractor
from binanceCreateModel import create_model
from keras.optimizers import Adam
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
import gym
from gym.envs.registration import register, make

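ENV_NAME = 'binanceEnvironment-v0'

# Drop any stale registration left over from a previous run in the same session.
if ENV_NAME in gym.envs.registry.env_specs:
    del gym.envs.registry.env_specs[ENV_NAME]

# NOTE: make() only succeeds if ENV_NAME has been registered with register(...);
# that call does not appear in this listing.
env = make(ENV_NAME)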

input_file = 'btcusdt-1hour.csv'
output_file = 'btcusdt-1hour-out.csv'
w_file_name = 'ltcusdt-1hour-weights.h5f'

feature_extractor = FeatureExtractor(input_file, output_file)

feature_list = feature_extractor.get_feature_names()

trade_cost = 0.5
env.init_file(output_file, feature_list, trade_cost, False)

model = create_model(env)
memory = SequentialMemory(limit=5000, window_length=1)

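dqn = DQNAgent(model=model, nb_actions=env.action_size, memory=memory,
               nb_steps_warmup=50, target_model_update=1e-2, policy=None)
dqn.compile(Adam(lr=1e-3), metrics=['mse'])

# Load the weights saved by the training run; without this step the agent
# would be tested with randomly initialized weights.
dqn.load_weights(w_file_name)

# Run a single test episode over the whole test data set.
dqn.test(env, nb_episodes=1, action_repetition=1, callbacks=None,
         visualize=True, nb_max_episode_steps=None,
         nb_max_start_steps=0, start_step_policy=None, verbose=1)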

# Top panel: close price with buy/sell markers. Bottom panel: running profit.
fig = plt.figure()
gs = gridspec.GridSpec(2, 1, figure=fig)
ax1 = fig.add_subplot(gs[0, 0])
ax1.plot(env.df['close'], '-b', linewidth=0.5)
ax1.plot(env.df['close'][env.df['action'] > 0], 'og', label='Buy', markersize=1)
ax1.plot(env.df['close'][env.df['action'] < 0], 'or', label='Sell', markersize=1)
ax1.legend()

ax2 = fig.add_subplot(gs[1, 0])
ax2.plot(env.df['profit'][:-1], '-r', linewidth=0.5)
ax2.set_ylabel('Profit (%)')

# Display the figure when the script is run outside an interactive session.
plt.show()
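
The `figure` keyword of `GridSpec` is not accepted by every Matplotlib release (one reader reports a TypeError for it in the comments). A minimal sketch of the same two-panel layout built with `plt.subplots` instead is given here. The helper name `plot_trades` and the sample values are made up for illustration; in the script, the three inputs would come from env.df['close'], env.df['action'] and env.df['profit'].

```python
import matplotlib.pyplot as plt


def plot_trades(close, actions, profit):
    """Hypothetical helper: price with trade markers on top, profit below."""
    # plt.subplots creates the figure and both axes without using GridSpec.
    fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
    steps = range(len(close))

    ax1.plot(steps, close, '-b', linewidth=0.5)
    buys = [i for i in steps if actions[i] > 0]    # positive action = buy
    sells = [i for i in steps if actions[i] < 0]   # negative action = sell
    ax1.plot(buys, [close[i] for i in buys], 'og', label='Buy', markersize=3)
    ax1.plot(sells, [close[i] for i in sells], 'or', label='Sell', markersize=3)
    ax1.legend()

    ax2.plot(steps, profit, '-r', linewidth=0.5)
    ax2.set_ylabel('Profit (%)')
    return fig


# Made-up values for illustration only, not real market data.
close = [100.0, 101.5, 99.8, 102.3, 103.1]
actions = [1, 0, -1, 1, -1]
profit = [0.0, 0.0, -1.7, -1.7, -0.9]
fig = plot_trades(close, actions, profit)
```

Call plt.show() afterwards to display the figure.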


Now, run the script ‘’ and you will see a result like the following:

Testing for 1 episodes ...
episode: 1, time: 0, gain: 0.00, trades: 0
episode: 1, time: 100, gain: 1.25, trades: 21
episode: 1, time: 200, gain: 21.75, trades: 51
episode: 1, time: 300, gain: 33.86, trades: 68
episode: 1, time: 400, gain: 42.28, trades: 89
episode: 1, time: 500, gain: 41.13, trades: 108
episode: 1, time: 600, gain: 54.61, trades: 123
episode: 1, time: 700, gain: 64.38, trades: 147

In the test output, ‘gain’ indicates the percent profit and ‘trades’ indicates the number of trades. You will also see plots like the following:

Historical Bitcoin price data, buy/sell actions, and the profit

In the first figure, you can see the price history together with the buy/sell points. In the second figure, you can see how the profit grows over time. The roughly linear increase shows that the algorithm keeps making a profit despite the drops and fluctuations in the test data, which is what you would expect from a good predictor. Our DQN agent performs quite well: at the end of the one-month period, it makes about 64 percent profit using 147 trades (buy/sell). The cryptocurrency exchange Binance applies a 0.1% trading fee. In our example, if you had invested $1000, each trade would cost roughly $1 in fees (0.1% of $1000), so after one month you would have about 1643 – 147 = $1496, which is quite a satisfactory result. In the next post, we are going to discuss how we can construct more stable networks and how we can increase our profit. That’s all for today. Enjoy your trading!
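
The fee estimate from the paragraph above can be reproduced in a few lines. This is a rough sketch of that back-of-the-envelope calculation, not an exact fee model: it assumes every trade pays the fee on the initial $1000 stake, and that the reported gain does not already include fees. The function name `net_balance_flat_fee` is made up for illustration.

```python
def net_balance_flat_fee(initial, gain_pct, n_trades, fee_rate):
    """Approximate final balance, charging fee_rate on the initial stake per trade."""
    gross = initial * (1 + gain_pct / 100)   # balance before fees
    fees = n_trades * fee_rate * initial     # flat fee estimate over all trades
    return gross - fees


initial = 1000.0
balance = net_balance_flat_fee(initial, gain_pct=64.38, n_trades=147, fee_rate=0.001)
print(round(balance, 2))  # about 1496.8
```

In practice the fee is charged on the amount actually traded, which changes as the balance changes, so the real figure would differ somewhat from this estimate.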


5 thoughts on “Deep Q-learning test results for algorithmic trading”

  1. I am getting an error
    Traceback (most recent call last):
    File “C:/Users/PycharmProjects/crypto trading/”, line 50, in
    gs = gridspec.GridSpec(2, 1, figure=fig)
    TypeError: __init__() got an unexpected keyword argument ‘figure’
    with :
    fig = plt.figure()
    gs = gridspec.GridSpec(2, 1, figure=fig)

    Any idea how to make the plot work?
    Also, where do you split the data for train and test?

    1. Please make sure that matplotlib is installed on your computer. You don’t need to split the data because you can retrieve them separately.

  2. Hi Matoksoz,
    I tried running your train file but I get the following error (TF/Keras, PyCharm):

    ModuleNotFoundError: No module named ‘binanceFeatures’.

    I do have your in the working directory.
    What am I missing here?

  3. TypeError: len is not well defined for symbolic Tensors. (activation_3/Softmax:0) Please call `x.shape` rather than `len(x)` for shape information.
