using-redisql-with-python
Last updated
Was this helpful?
Last updated
Was this helpful?
This tutorial will help you to get start to use RediSQL with Python3.
In this tutorial we will scrape the content of Hacker News using their .
We will use async python with to manage the event loop, to retrieve data from a public API and to communicate with Redis.
To follow this tutorial you will need a modern (> v4.0) instance of Redis running RediSQL. You can obtain RediSQL from or from the .
To load RediSQL is sufficient to pass it as argument to the redis-server: ./redis-server --loadmodule /path/to/redisql.so
The whole code show in this example is reachable while we also created a that stress much more the infrastructure to show that the bottle neck is not RediSQL but python and the network.
Most Redis library implements methods to call the standard Redis command like SET
or GET
or RPOP
and aioredis is not an exception. This is generally a problem for Redis modules like RediSQL that instead defined their own commands. Fortunately most libraries usually expose also a lower level method that is used to implement most of the other Redis command. For what concern aioredis
the lower level method that we can use is .execute
that is implemented for both and for a .
Indeed is possible to implement all the other high level command using the low level .execute
method.
While in this article we will talk about aioredis
, another, not asynchronous library for using Redis with python is .
In the redis
library, the low level method is .execute_command
and not .execute
as for aioredis
, other than this difference everything will apply just the same.
The very first thing to do is to connect to Redis, in our case we use a connection pool that has the same interface of a simple connection but is backed by a pool of different connections.
Creating the pool can be done like so:
and now the variable redis
refer to a aioredis
pool.
When we will need a new connection, the pool will either give us an idle connection or open a new connection to Redis and give us the new one.
Now that we have a pool of connections before to get the data into RediSQL we need to set up RediSQL. The first step is to create a database in RediSQL, this can be done easily with a call like
this call will create a new RediSQL database and it will call it HN
. If the key HN
already exists the call will return an error.
The next step is to create the structure to hold our data. In our case we will stick to something simple, a single table where we store the identifier of each item (comment or story) from HN, the author of such item, when the item was created and finally we will store the whole item as json structure in a text field.
Finally, since we storing data from the open internet inside our database, is wise to create an SQL statement to execute when doing an insert. The advantage of the statement is that is safe from SQL injections and is usually faster than re-compile the same query each time.
To create a statement we can proceed as following:
The last command create a new statement in the HN
database and associate it with the string insert_item
so that we can refer to it later.
Is usually wise to wrap those command into a try: except:
block. Hence the final function will look like this:
Now that we have set up our environment we can go on and start to listen for new items posted on HN.
We repeat the loop forever with a sleep to avoid hammering the API endpoint.
The last interesting bit is about the store_item
function that is the one that download the item from the API and store it into RediSQL.
Downloading the item from the API is a simple HTTP GET request, then we simply check if it returns a successful status code and that it actually returns valid json.
Finally to store the element into RediSQL we execute the statement that we have create before during the set up phase. Indeed we are executing the command REDISQL.EXEC_STATEMENT HN insert_item $item_id $item_author $item_time $item
. This command will find inside the database HN
the statement insert_item
that we have previously defined as INSERT INTO hn VALUES(?1, ?2, ?3, json(?4));
. Now the item id will be substituted to ?1
, the item author will substitute ?2
, the creation time of the item will take the place of ?3
and the whole json string of the item will substitute ?4
, finally the statement is executed agains RediSQL and its result returned.
If everything went right, we have just added our first row to the database using async python.
In this tutorial we took a rather simple problem and we use it to show how to use RediSQL with async python.
We started by setting up the database, we show how to create a database and tables inside it. Then we also show how to create statements to avoid SQL injections attack and improve the efficiency of repeated queries.
Then we obtain the data from the Hacker News API and we show how to insert the data into RediSQL using the statement that we have just created.
Hopefully this tutorial will be helpful and sufficient to get started, but if you have any question feel free to get in touch or to open an issue.
Also note the use of the json(?4)
function, this is a function provided by the of SQLite and exposed by RediSQL that allow fast and efficient manipulation of json object. Using the JSON1 module is possible to have a lot of flexibility even inside a rigid SQL schema.
The API provides a simple endpoint that returns the id of the latest item posted on HN. When the loop start we get maxitem and we store it into Redis. Then, when the maxitem get updated we download each of the items between the old maxitem
and the new maxitem
.
If you wish to see a similar tutorial for a different language,