Jupyter Notebooks are an indispensable tool for sharing code between users in Python data science. For those unfamiliar with them, notebooks are documents that contain runnable code snippets mixed with documentation. They can invoke Python libraries for numerical processing, machine learning, and visualization. The code output includes not just text but also graphs from powerful libraries like matplotlib and seaborn. Notebooks are so ubiquitous that it’s hard to think of manipulating data in Python without them.

ClickHouse support for Jupyter Notebooks is excellent. I have spent the last several weeks playing around with Jupyter Notebooks using two community drivers: clickhouse-driver and clickhouse-sqlalchemy. The results are now published on GitHub at. The remainder of this blog contains tips to help you integrate ClickHouse data into your notebooks.

You can run Jupyter Notebooks directly from the command line, but like most people I run them using Anaconda. We’ll assume you know how to run Jupyter from Anaconda Navigator. (If not, read the Anaconda docs and come back.) To use the ClickHouse drivers you’ll want to run conda commands similar to the following to bring them into your environment. This example uses the ‘base’ environment.

# List environments and pick ‘base’ environment.
conda install -c conda-forge clickhouse-driver
conda install -c conda-forge clickhouse-sqlalchemy

Now when you start Jupyter with the ‘base’ environment you’ll have ClickHouse drivers available for import.

Tip: you can run these commands to load modules while Jupyter is already running. I do this regularly to top up missing libraries.

There are other Python drivers available, such as the sqlalchemy-clickhouse driver developed by Marek Vavrusa and others. However, the drivers shown above are available on conda-forge, which makes them easy to use with Anaconda.

The easiest way to work on data from ClickHouse is via the SQLAlchemy %sql magic function. There is a sample notebook that shows how to do this easily. For now let’s step through the recipe, since this is likely to be the most common way many users access data from ClickHouse.

First, let’s load SQLAlchemy and enable the %sql function.

Next, let’s connect to ClickHouse and fetch data from the famous Iris data set into a pandas data frame.

%sql SELECT * FROM iris

The last command shows the end of the frame so we can confirm it has data.

Finally, let’s create a nice scatter graph with some of the data.
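For readers who would rather use plain Python than the %sql magic, fetching a ClickHouse table into a pandas data frame with clickhouse-driver might look like the sketch below. This is not the recipe from the sample notebook. The function names `rows_to_columns` and `fetch_iris_frame`, the `localhost` host, and the `iris` table name are all assumptions made for illustration, so adjust them to your own server.

```python
from typing import Any, Dict, List, Sequence, Tuple


def rows_to_columns(
    rows: Sequence[Tuple[Any, ...]],
    columns_with_types: Sequence[Tuple[str, str]],
) -> Dict[str, List[Any]]:
    """Pivot row tuples into a {column_name: [values]} mapping.

    columns_with_types is the list of (name, type) pairs that
    clickhouse-driver returns when with_column_types=True is set.
    """
    names = [name for name, _type in columns_with_types]
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def fetch_iris_frame(host: str = "localhost", table: str = "iris"):
    """Fetch a whole table into a pandas DataFrame (hypothetical helper).

    The host and table defaults are assumptions. The table name is
    interpolated directly into the query, so pass trusted values only.
    """
    # Imported lazily so rows_to_columns works without the drivers installed.
    import pandas as pd
    from clickhouse_driver import Client

    client = Client(host=host)
    rows, columns_with_types = client.execute(
        f"SELECT * FROM {table}", with_column_types=True
    )
    return pd.DataFrame(rows_to_columns(rows, columns_with_types))
```

In a notebook cell, `df = fetch_iris_frame()` followed by `df.tail()` would give the same kind of sanity check as the recipe: the last few rows confirm the frame has data. From there pandas plotting such as `df.plot.scatter(...)` can draw a scatter graph, with the x and y column names depending on how your iris table was created.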