# SageMathCloud Blog

## RethinkDB versus PostgreSQL: my personal experience

William Stein (wstein@sagemath.com) •

## Introduction

Motivated by the shutdown of the RethinkDB company and by RethinkDB’s licensing situation (a blocker for certain parts of my business), I worked very hard for two months to completely rewrite the realtime and database components of SageMathCloud (SMC) to use PostgreSQL instead of RethinkDB. The initial push came from this discussion on Hacker News. I have battled with and used RethinkDB heavily since May 2015, and I’ve used PostgreSQL heavily as well, with production data, rewriting all the same queries in both systems, so I’m in a good position to compare them for my use case (the site SMC).

This is my story. It’s a personal comparison, with NO BENCHMARKS or hard data you could reproduce. It’s what I would tell you if we were talking by the water cooler.

Summary:

• I’m very happy with the rewrite.
• Everything is an order of magnitude more efficient using PostgreSQL than it was with RethinkDB.
• It is much easier to do exploratory queries of our data using PostgreSQL than it was with RethinkDB. PostgreSQL is much more expressive than ReQL and has a massive number of built-in functions, so we are making much better use of our data. With RethinkDB, we often just ended up grepping through the latest database dump.
• PostgreSQL is “statically typed”, whereas RethinkDB had no type or schema enforcement at all; explicit clear typing improved the quality and robustness of our application.
• We are saving $800 per month (!), due to reduced CPU and disk space requirements.
• I had no clue that RethinkDB would be Apache licensed in February 2017.

## SMC for Collaborative LaTeX Editing

Hal Snyder • • latex

SageMathCloud (SMC) is the most powerful online $\LaTeX$ collaboration software available today. SMC offers the full complement of features that professionals and students expect from online services, on a par with and exceeding other leading products, such as Overleaf and ShareLaTeX. In addition, SMC offers a complete environment for teaching, research, and exploratory computing with the same rich feature set.

Here’s an overview of key SMC features to back up the claims of the previous paragraph.

## Inline LaTeX Errors

haraldschilly • • latex

Small update to our LaTeX editor: it now shows LaTeX errors inline, which should make it easier to fix these problems. The parser reading the LaTeX log was also improved to treat consecutive errors separately. Error location mapping also works for .Rnw knitr source files.

## Wishbone

haraldschilly • • r

Do you want to run Wishbone in your SMC project? Wishbone is an algorithm to align single cells from differentiation systems with bifurcating branches. Wishbone has been designed to work with multidimensional single cell data from diverse technologies such as mass cytometry and single cell RNA-seq.

First, open a terminal file and run the following lines to install it locally inside your project:

pip3 install --user Cython
pip3 install --user git+https://github.com/dominiek/python-bhtsne.git
pip3 install --user git+https://github.com/jacoblevine/phenograph.git
pip3 install --user git+https://github.com/ManuSetty/wishbone.git

Then, open a new Jupyter notebook, switch to the “Python 3” kernel, and run examples from their documentation.

(In order to access GitHub or PyPI, your project needs “internet access”.)

## R updated

haraldschilly • • r

Dear R users!

We’ve switched our default R to be the “official” one from https://www.r-project.org/. As a result, the default R version has been updated from 3.2.4 to the most recent one, 3.3.2. This is quite a significant update from the one in SageMath, hence this notice.

If for some reason you still need to work with the older version of R in SageMath, do this:

1. On the command line, use R-sage instead of R and Rscript-sage instead of Rscript.
2. In Jupyter notebooks there are two R kernels: the newer “R (R-Project)” and “R (SageMath)” for the version shipped with SageMath.

Regarding libraries, we installed all the ones we know about and many more. In total, there are 557 R libraries (1.3 GB) available. Is something that you need still missing? Email help@sagemath.com.

## using SMC with python & data science MOOC

halsnyder • • python, mooc, and datascience

Here are a couple of tips based on my experience using SMC to complete the Coursera course, Introduction to Data Science in Python. This course is the first installment of a new 5-part Applied Data Science with Python Specialization from the University of Michigan.

Examples and study assignments for the course are offered as Python3 Jupyter notebooks, i.e. .ipynb files. Students may use Coursera-hosted Jupyter notebooks or any other platform that allows them to run the code. Homework is submitted by uploading a .ipynb file for each programming assignment.

The following might be helpful for students taking the course and using SMC:

### 1. Set Jupyter kernel.

After uploading a course .ipynb file, change the kernel from Python3 to Anaconda (Python3). This will prevent errors such as AttributeError: ‘Index’ object has no attribute ‘str’ due to different versions of pandas.

### 2. Convert Jupyter notebooks to Sage worksheets.

If you would rather code in a Sage worksheet than a Jupyter notebook, use the SMC script smc-ipynb2sagews to convert the files. Open a terminal file, for example mooc.term, and enter the following commands:

$ cp Assignment\ 2.ipynb assgn2.ipynb
$ smc-ipynb2sagews assgn2.ipynb
/usr/local/bin/smc-ipynb2sagews: Creating SageMathCloud worksheet 'assgn2.sagews'
$ open assgn2.sagews
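
Regarding tip 1, a notebook file records the kernel it was last saved with, so you can check it before opening or converting the file. The following is a minimal sketch, assuming the standard Jupyter notebook JSON layout (a `metadata.kernelspec` entry); the helper name `notebook_kernel` is hypothetical, and the kernel names that SMC itself writes into notebooks are not specified here.

```python
import json
from pathlib import Path


def notebook_kernel(path):
    """Return (name, display_name) of the kernel recorded in a .ipynb file.

    Either value is None when the notebook carries no kernelspec metadata.
    """
    # A .ipynb file is plain JSON; the kernel lives under metadata.kernelspec.
    nb = json.loads(Path(path).read_text(encoding="utf-8"))
    spec = nb.get("metadata", {}).get("kernelspec", {})
    return spec.get("name"), spec.get("display_name")
```

For example, `notebook_kernel("assgn2.ipynb")` returns the recorded kernel name and its display name, which tells you whether the kernel still needs to be switched.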


## NLTK text corpus

haraldschilly • • python

The full 2.4 GB NLTK text corpus is now available. For example, you can run this in our SageMath or Anaconda Python environment:

from nltk.corpus import brown
w = brown.words()   # all word tokens of the Brown corpus
len(list(w))        # number of tokens


which gives

1161192
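
As a further illustration of what can be done with a token sequence like the one returned by `brown.words()`, here is a minimal sketch that counts the most frequent words. The helper name `top_words` is hypothetical, and the toy word list stands in for the corpus so that the snippet runs without NLTK installed.

```python
from collections import Counter


def top_words(words, n=3):
    """Return the n most common alphabetic tokens (lowercased) with counts."""
    # Skip punctuation and numbers, and fold case so "The" and "the" merge.
    counts = Counter(w.lower() for w in words if w.isalpha())
    return counts.most_common(n)


if __name__ == "__main__":
    # Toy stand-in for brown.words(); with NLTK available you could pass
    # brown.words() instead.
    sample = ["The", "cat", "saw", "the", "dog", ".", "The", "dog", "ran", "."]
    print(top_words(sample, n=2))  # [('the', 3), ('dog', 2)]
```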


## Install Jupyter's nbextensions configurator

haraldschilly • • jupyter

You can install it in your own project. To do so, your project needs internet access enabled, or you need to upload the code into your project some other way. Then install it from a terminal (create a new file terminal.term):

pip install --user --no-deps jupyter_nbextensions_configurator
jupyter nbextensions_configurator enable --user


Then restart the Jupyter server in SMC:

smc-jupyter restart


Then, in order to see the configurator, you have to open an ipynb file. Click on the “About” button in the top right, then click on the link there to open the version of Jupyter without the synchronization. There, either go to the main page or to the page dedicated to the nbextensions. The URL looks like this:

https://cloud.sagemath.com/<your_project_id>/port/jupyter/nbextensions
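
If you switch between several projects, the address can be assembled from the project id. This is a small sketch based only on the URL pattern shown above; the function name `nbextensions_url` and the example id are hypothetical placeholders.

```python
from urllib.parse import quote

BASE_URL = "https://cloud.sagemath.com"


def nbextensions_url(project_id):
    """Build the nbextensions configurator URL for a given project id."""
    project_id = project_id.strip()
    if not project_id:
        raise ValueError("project_id must not be empty")
    # Percent-encode the id so stray characters cannot break the path.
    return f"{BASE_URL}/{quote(project_id, safe='')}/port/jupyter/nbextensions"


if __name__ == "__main__":
    # "my-project-id" is a placeholder, not a real project id.
    print(nbextensions_url("my-project-id"))
```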


## Toggle SageWS Cells

williamstein •

After getting too tired of people saying things like

I’m getting crazy with cells becoming hidden in a way I’m not in full control.

I just rewrote how cell input/output hiding works. Now there is a little toggle triangle in the leftmost column next to the input cell divider, and another one next to the output. That’s how you toggle visibility of input and output.

Also, %md and %html no longer “magically” hide the input, and double clicking on the output doesn’t do anything anymore. It’s simple, straightforward, and you are in control.

Additionally, a second level of line numbering helps you orient yourself within a cell and across the whole document.

## TimeTravel Diffs

williamstein •

I just released a new SageMathCloud feature – TimeTravel Diffs

In a supported document, open up “TimeTravel”, click on “changes”, and drag the sliders to see exactly what changed in a file during any interval of time. This works for all editor-based documents, e.g., Python code, Sage worksheets, etc. (Not available for Jupyter notebooks yet.)
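
To give a feel for what a diff between two points in time looks like, here is a minimal sketch using Python’s standard `difflib` module. It is not how SMC implements TimeTravel; the function name `diff_versions` and the two snapshot strings are purely illustrative.

```python
import difflib


def diff_versions(old_text, new_text, old_label="before", new_label="after"):
    """Return a unified diff between two snapshots of a document."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    # unified_diff yields lines lazily; join them into one printable string.
    return "".join(diff)


if __name__ == "__main__":
    # Two hypothetical snapshots of the same file at different times.
    earlier = "a = 1\nb = 2\nprint(a + b)\n"
    later = "a = 1\nb = 3\nprint(a + b)\n"
    print(diff_versions(earlier, later))
```

Lines prefixed with `-` were present only in the earlier snapshot and lines prefixed with `+` only in the later one; identical snapshots produce an empty string.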