Getting started with SQLite – Pros, cons, and tips
What is SQLite?
SQLite is one of my favorite go-to SQL languages when I have limited time and a lot of data to go through. It is a relational database management system similar to MySQL, PostgreSQL, Oracle, or Microsoft SQL Server. That said, it differs from all other SQL languages since it uses a dynamic type system, meaning a value stored in a column determines its data type, and not the column’s data type. You can also interact with an SQLite database using Java, Python, PHP, and Node.js.
I’d like to share some useful tips and tricks for using SQLite. While the language has big advantages, it also has some limitations. Let’s review the pros and cons!
Pros of SQLite
The biggest advantage of SQLite is ease of use - you can set it up on any machine (even a cellphone!), and it doesn’t require much configuration. Setup is fast and easy, and using the language is simple. You don’t need to worry about a data center or a powerful network, and it runs very fast.
SQLite version 184.108.40.206 2015-05-20 18:17:19 Enter “.help” for usage hints. Connected to a transient in-memory database. Use “.open FILENAME” to reopen on a persistent database. sqlite> .separator ‘,’ sqlite> .import SQLite_Testdata.csv test sqlite> select country, count(user_id) n from test group by country limit 5; AU,2 FR,2 ID,2 IN,3 Sqlite>
In the code above I use the SQLite3 command-line shell application, and I’m constantly amazed at how easy it is to import a file with any type of large dataset and query against it. After you open a file you already can query it without declaring specific data type for a column.
Another pro of SQL is that a column in SQLite can store different data types. Therefore, if your dataset hasn’t been cleaned, you still can open and query it. Be mindful with sorting the values though, because different data types might affect the order of your results.
Cons of SQLite
As a data analyst, the biggest disadvantage I run into with SQLite is its primitive syntax and formatting limitations. In contrast to PostgreSQL or MySQL, SQLite doesn’t support as many functions (that said, most window functions were added in SQLite 3.25). It’s easy to get stuck when using SQLite when you have to do more complex data manipulations (like multiple concurrent writing operations). Also, any file IMPORT reads data only as TEXT, and it’s not as straightforward to reformat it with basic limited syntax:
SQLite version 220.127.116.11 2015-05-20 18:17:19 Enter “.help” for usage hints. Connect to a transient in-memory database. Use “.open FILENAME” to reopen on a persistent database. sqlite> .separator ‘,’ sqlite> .import SQLite_Testdata.csv test sqlite> .schema test CREATE TABLE test( “Account_id” TEXT, “User_id” TEXT, “Created_at” TEXT, “Country” TEXT, “Amount” TEXT ); sqlite>
For every file import, SQLite reads data columns as TEXT. This means that you have to cast or convert data to allow aggregate or manipulate it. Here is one of the tricks on how to convert from $ to INT:
-- convert `$5.00` to cents `500` query sqlite> select replace(replace(Amount, '$', ''), '.', '') from mrr limit 15; SQLite version 18.104.22.168 2015-05-20 18:17:19 Enter “.help” for usage hints. Connected to a transient in-memory database. Use “.open FILENAME” to reopen on a persistent database. sqlite> “.separator ‘,’ sqlite> .import SQLite_Testdata.csv test sqlite> select amount from test limit 5; $5.00 $6.00 $7.00 $5.00 $6.00 sqlite> select replace(replace(Amount, ‘$’, ‘’), ‘.’, ‘’) from test limit 5; 500 600 700 500 600 sqlite>
There are other ways to work with datatypes in SQLite, including converting a string into an integer and vice versa.
Another con is that SQLite doesn’t support Date and Time classes. That being said, there are built-in Date and Time functions that can be helpful. In addition, you can create a customized VIEW, and that can make using SQLite easier.
I highly recommend exploring SQLite and use it for EDA (Exploratory Data Analysis) and other data analysis that involves querying against large datasets. It’s a great tool for handling large amounts of data and shows promise compared to other similar tools.
Olga Berezovsky is a Senior Data Analyst. She has extensive experience in the Big Data industry—specifically in data acquisition, transformation, and analysis—and deep expertise in building quantitative and qualitative user profile analysis that reveals user insights and behavior.
Follow Olga on LinkedIn