Lists like this have probably been written several times, but in case you haven’t come across these, here are the top 14 notebook tips you should start incorporating into your workflow.
1) Pip install (or any shell command) through the notebook
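You usually don’t need to switch to a shell to install packages. A quick way to install a missing package, for example when you’re copy-pasting code, is to run the shell command from a notebook cell. Try `%pip install` or `!pip install`. The `%pip` magic installs into the environment of the running kernel, while `!pip` runs whichever pip the shell finds first, which may belong to a different environment.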
%pip install pandas
!pip install numpy
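If you want the same behaviour from plain Python (in a script, say, where magics are not available), one option is to call pip through the interpreter that is currently running. The sketch below is only an illustration: `ensure_package` is a hypothetical helper name, not part of Jupyter or pip, and it assumes the package is importable under a top-level module name.

```python
import importlib.util
import subprocess
import sys


def ensure_package(module_name, pip_name=None):
    """Install a package with pip only if its module cannot be found.

    `ensure_package` is a hypothetical helper for illustration.
    Returns True if an install was attempted, False if the module was already there.
    """
    # find_spec returns None when a top-level module is not installed
    if importlib.util.find_spec(module_name) is not None:
        return False
    # sys.executable is the interpreter running this code, so pip installs
    # into the same environment (similar in spirit to %pip)
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
    return True


# "json" ships with Python, so nothing is installed here
print(ensure_package("json"))
```

The optional `pip_name` argument covers packages whose install name differs from their import name.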
2) Share notebook for non-Python users
The easiest way to share your notebook is simply using the notebook file (.ipynb), but for those who don’t use Jupyter, you have a few options:
- Convert the notebook to an HTML file using the `File > Download as > HTML` menu option.
- Upload it to GitHub, where anyone can view the rendered notebook in the browser.
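The HTML conversion can also be scripted, which is handy when you have many notebooks to share. The sketch below assumes the `jupyter nbconvert` command-line tool is installed and on your PATH; the helper names and the file name `analysis.ipynb` are made up for illustration.

```python
import subprocess


def nbconvert_html_command(notebook_path):
    """Build the command line that converts one notebook to HTML."""
    return ["jupyter", "nbconvert", "--to", "html", str(notebook_path)]


def export_to_html(notebook_path):
    """Run the conversion; raises CalledProcessError if nbconvert fails."""
    subprocess.run(nbconvert_html_command(notebook_path), check=True)


# Show the command without running it ("analysis.ipynb" is a hypothetical file)
print(" ".join(nbconvert_html_command("analysis.ipynb")))

# To actually convert a notebook that exists on disk:
# export_to_html("analysis.ipynb")
```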
3) Suppress warnings in Notebook (not recommended)
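Libraries often emit warnings (deprecation notices, for example) that clutter cell output and make a notebook less readable. You can temporarily suppress them, but keep in mind that this hides every warning, including the ones that point to real problems.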
import warnings
warnings.filterwarnings("ignore")
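A narrower alternative is to silence only one category of warning, and only around the code that triggers it. The sketch below uses the standard library's `warnings.catch_warnings` context manager; `noisy_sum` is a made-up function standing in for whatever library call is producing the noise.

```python
import warnings


def noisy_sum(values):
    """Hypothetical function that emits a deprecation warning on every call."""
    warnings.warn("noisy_sum is deprecated", DeprecationWarning)
    return sum(values)


# Filters changed inside the block are restored when the block exits
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=DeprecationWarning)
    total = noisy_sum([1, 2, 3])

print(total)
```

Outside the `with` block the previous filters apply again, so other warnings stay visible.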
4) Increase resolution of your Matplotlib plots
Add this at the beginning of the notebook to increase the figure DPI to 300. Too many data scientists publish low-resolution plots that are hard to read. With a higher DPI you get sharp plots in your notebooks instead of the blurry default output. Note that figures also render larger on screen at this setting.
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300
5) Measure the time taken to run your cell
This is super helpful when you’re trying to profile your code and see which parts take the most time. Add the magic statement `%%time` as the first line of your cell. Remember that a single `%` applies to one line, while `%%` applies to the whole cell.
%%time
# Assumes an earlier cell ran `from sklearn import tree` and defined X_train, y_train, X_test
clf = tree.DecisionTreeRegressor().fit(X_train, y_train)
res = clf.predict(X_test)
print(res)
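PS: `%%timeit` uses the Python timeit module to run the cell many times. In current IPython versions it picks the number of loops automatically, repeats the measurement 7 times by default, and reports the mean and standard deviation per loop (older versions reported the best of 3 runs). Use the `-n` and `-r` options to set the loop and repeat counts yourself.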
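To see what this kind of repeated timing looks like without the magic, here is a minimal sketch using the standard library's `timeit.repeat` directly. The function `build_squares` and the loop counts are arbitrary choices for illustration, not what `%%timeit` would pick.

```python
import timeit


def build_squares(n=1000):
    """Small workload to time (chosen arbitrarily for this example)."""
    return [i * i for i in range(n)]


loops = 1000   # calls per measurement
repeats = 5    # number of measurements

# Each entry in `runs` is the total time in seconds for `loops` calls
runs = timeit.repeat(build_squares, number=loops, repeat=repeats)

per_call = [t / loops for t in runs]
mean_per_call = sum(per_call) / len(per_call)
best_per_call = min(per_call)

print(f"mean {mean_per_call * 1e6:.1f} us, best {best_per_call * 1e6:.1f} us per call")
```

The numbers depend on your machine, so treat them as relative measurements rather than absolute ones.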