SQLite3 is a powerful, efficient, and self-contained C-language library that serves as a SQL database engine. Notably, it's the most widely used database engine globally, thanks to its high reliability and comprehensive feature set. In Python, SQLite3 is accessible through the sqlite3 module, a built-in component that doesn't require external dependencies.
SQLite3 distinguishes itself through its simplicity and seamless integration. Unlike typical client-server database systems such as MySQL or PostgreSQL, SQLite3 operates serverlessly, eliminating the need for a separate server process. Python applications can seamlessly interact with SQLite databases via the sqlite3 module included in the standard Python library.
SQLite3 offers numerous advantages for developers, including:
Zero Configuration: Setup and usage are incredibly straightforward, requiring no installation or administration.
Serverless Architecture: It operates in-process with the application, simplifying overall system architecture.
Cross-Platform Compatibility: Available across all platforms supported by Python.
Atomic Commit and Rollback: Guarantees data integrity.
Proven Reliability: Extensively used across various platforms, from web browsers to mobile devices.
Setting up sqlite3 is generally straightforward since it's part of the Python standard library. To ensure everything is ready, you should have an appropriate version of Python installed, preferably Python 3.x. Here’s how you can check if SQLite3 is available in your Python environment:
import sqlite3
print(sqlite3.version)
print(sqlite3.sqlite_version)
This script displays the versions of both the sqlite3 module and the SQLite database engine it utilizes. If you encounter any issues or if the module isn't available, you might need to update your Python installation.
To interact with an SQLite database in Python, the first step is to establish a connection. This is done using the sqlite3 module’s connect function, which returns a connection object. Here's how to create a connection object:
import sqlite3
conn = sqlite3.connect('example_database.sqlite3')
conn.close()
This method will create a new SQLite database file if it doesn't already exist. If it does, it will connect to the existing database.
To create a new SQLite database, simply connect to a database file that doesn't exist yet:
conn = sqlite3.connect('new_example_database.sqlite3')
conn.close()
When working with SQLite database connections, it's essential to follow best practices:
Use with statements to ensure proper management of resources.
Handle exceptions using try-except blocks.
Always close connections to free up resources.
Adhere to thread safety guidelines by not using a connection object in more than one thread simultaneously.
Regularly back up your database by copying the file.
SQLite3 uses SQL as its query language. Fundamental SQL operations include Create, Read, Update, and Delete (CRUD). In Python, these operations can be performed using the cursor object obtained from the connection object.
Each value stored in an SQLite database (or manipulated by the database engine) has one of the following storage classes:
NULL. The value is a NULL value.
INTEGER. The value is a signed integer, stored in 0, 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
REAL. The value is a floating point value, stored as an 8-byte IEEE floating point number.
TEXT. The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
BLOB. The value is a blob of data, stored exactly as it was input.
For example, here's how you can create a table and insert data:
conn = sqlite3.connect('example_database.sqlite3')
cursor = conn.cursor()
# Creating a table
cursor.execute('''
CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);
''')
# Inserting data
cursor.execute("INSERT INTO users (name, email) VALUES (?, ?)", ("John Doe", "[email protected]"))
conn.commit()
conn.close()
To retrieve data from an SQLite database, you can use the cursor’s fetch methods like fetchone()
, fetchall()
, or fetchmany(size)
.
pythonCopy codeconn = sqlite3.connect('example_database.sqlite3')
cursor = conn.cursor()
cursor.execute("SELECT * FROM users")
users = cursor.fetchall()
for user in users:
print(user)
conn.close()
# Import necessary libraries
import yfinance as yf
import pandas as pd
import sqlite3
# Fetch historical data for the NSE Nifty 50 index using yfinance
nifty = yf.Ticker('^NSEI')
df = nifty.history(interval='1d', start="2021-08-01", end="2022-06-23")
# Drop irrelevant columns ('Dividends' and 'Stock Splits') from the DataFrame
df.drop(columns=['Dividends', 'Stock Splits'], inplace=True)
# Convert the data to a pandas DataFrame for further manipulation
df = pd.DataFrame(df)
print(df)
# Connect to SQLite database named 'new.db'. Creates the file if it doesn't exist
conn = sqlite3.connect("new.db")
# The following commented line is used to store DataFrame into SQLite database.
# It's commented out to prevent accidental execution.
# df.to_sql('nifty', conn, if_exists='replace', index=True)
# Retrieve data from the 'nifty' table in the database and store it in a pandas DataFrame
data = pd.read_sql_query("SELECT * FROM nifty", conn)
print(data)
# Close the database connection
conn.close()
This code is specifically optimized for scenarios where we need to store real-time data received from a WebSocket into a database. It's designed to handle the continuous stream of data efficiently, ensuring seamless capturing and storage of live data updates in the database.
pythonCopy codeimport sqlite3
# Establish a connection to the SQLite database named 'time.db'. Creates the file if it doesn't exist.
conn = sqlite3.connect("time.db")
# Create a cursor object to execute SQLite commands.
c = conn.cursor()
# Execute a SQL statement to create a table named 'infy' with specified columns.
# The 'ts' column is set as the primary key, and other columns store real numbers (floating-point).
c.execute("""CREATE TABLE infy(
ts datetime primary key,
Open real,
High real,
Low real,
Close real,
volume real
)""" )
# Insert a record into the 'infy' table. This includes a timestamp and stock market data for that date.
c.execute("""INSERT INTO infy VALUES ('2021-08-02', 15874, 15892, 15834, 15885, 244800)""")
# Retrieve all data from the 'infy' table and print the fetched data.
c.execute("""SELECT * FROM infy""")
print(c.fetchall())
# Commit the current transaction. This saves the changes made to the database.
conn.commit()
# Close the connection to the database to free up resources.
conn.close()
I have developed two specialized functions to streamline database interactions for data stored in pandas DataFrames. The first function, store_df_to_db
, is designed to efficiently transfer data from a dictionary of DataFrames into a specified SQLite database. It ensures that each key-value pair in the dictionary, where the key represents the table name and the value is the corresponding DataFrame, is accurately reflected in the database.
The second function, retrieve_df_from_db
, is crafted to retrieve this data back into a Python environment. It fetches the stored tables from the database and converts them into a dictionary of DataFrames, maintaining the structure and data integrity as in the original DataFrames.
These functions not only facilitate a seamless transition of data between DataFrames and the database but also ensure that the data retrieval process is smooth and error-free, making them invaluable tools for database management in data analysis workflows.
import sqlite3
import pandas as pd
def store_into_database(msg,database_name):
"""
This function stores data into a SQLite database.
Args:
msg (dict): A dictionary where each key is a table name and
each value is a pandas DataFrame to be stored in the database.
The function iterates over the dictionary, printing the name of each DataFrame
and its contents, then storing each DataFrame into a SQLite table with the
same name. If the table already exists, it is replaced.
"""
# Connect to SQLite database (or create it if it doesn't exist)
conn = sqlite3.connect(database_name)
# Iterate over the dictionary items
for name, df in msg.items():
print(name) # Print the name of the DataFrame
print(df) # Print the contents of the DataFrame
# Store the DataFrame into a SQL table with the same name
df.to_sql(name, conn, if_exists='replace', index=True)
# Commit (save) the changes to the database
conn.commit()
def get_from_database(database_name):
"""
This function retrieves all tables from a SQLite database and
returns them as a dictionary of pandas DataFrames.
Returns:
dict: A dictionary where each key is a table name and each value is
a pandas DataFrame containing the data from the corresponding SQL table.
"""
# Connect to SQLite database
con = sqlite3.connect(database_name)
cursorObj = con.cursor()
# Retrieve the names of all tables in the database
cursorObj.execute('SELECT name from sqlite_master where type= "table"')
data = cursorObj.fetchall() # Fetch all the results
option_price_df = {}
# Iterate over each table name
for i in data:
k = i[0] # Table name
# Query the table and store the result into a DataFrame
option_price_df[k] = pd.read_sql_query(f'SELECT * FROM {k}', con)
# Return the dictionary of DataFrames
return option_price_df