CoCalc Blog

Using SMC with Python & data science MOOC

halsnyder • • python, mooc, and datascience

Here are a couple of tips based on my experience using SMC to complete the Coursera course Introduction to Data Science in Python. This course is the first installment of a new 5-part Applied Data Science with Python Specialization from the University of Michigan.

Examples and study assignments for the course are offered as Python 3 Jupyter notebooks, i.e. .ipynb files. Students may use Coursera-hosted Jupyter notebooks or any other platform that allows them to run the code. Homework is submitted by uploading a .ipynb file for each programming assignment.

The following might be helpful for students taking the course and using SMC:

1. Set Jupyter kernel.

After uploading a course .ipynb file, change the kernel from Python3 to Anaconda (Python3). This will prevent errors such as AttributeError: 'Index' object has no attribute 'str', which arise because the two kernels provide different versions of pandas.
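To confirm which pandas the selected kernel actually loads, a quick check like the following can be run in a notebook cell. This is a minimal sketch: describe_pandas is a hypothetical helper name, and it only reports the installed version and whether Index exposes the .str accessor that the error message refers to.

```python
import importlib


def describe_pandas():
    """Report the pandas version seen by this kernel (hypothetical helper)."""
    try:
        pd = importlib.import_module("pandas")
    except ImportError:
        return "pandas is not installed in this kernel"
    # The AttributeError in the post is raised when Index lacks `.str`.
    has_str = hasattr(pd.Index, "str")
    return f"pandas {pd.__version__}; Index.str available: {has_str}"


report = describe_pandas()
print(report)
```

If the report says Index.str is not available, switching the kernel as described above is the fix suggested in this post.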

2. Convert Jupyter notebooks to Sage worksheets.

If you would rather code in a Sage worksheet than a Jupyter notebook, use the SMC script smc-ipynb2sagews to convert the files. Open a terminal file, for example mooc.term, and enter the following commands:

$ cp Assignment\ 2.ipynb assgn2.ipynb
$ smc-ipynb2sagews assgn2.ipynb
/usr/local/bin/smc-ipynb2sagews: Creating SageMathCloud worksheet 'assgn2.sagews'
$ open assgn2.sagews
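If a course has many notebooks, the conversion commands can be generated rather than typed one by one. The sketch below only builds the command lines and does not run them. conversion_commands is a hypothetical helper, the notebook names are made up, and it is an untested assumption that smc-ipynb2sagews accepts a quoted file name containing spaces (the commands above avoid the question by first copying to a name without spaces).

```python
import shlex
import tempfile
from pathlib import Path


def conversion_commands(directory):
    """Return one smc-ipynb2sagews command line per .ipynb file (hypothetical helper)."""
    notebooks = sorted(Path(directory).glob("*.ipynb"))
    # shlex.quote protects names that contain spaces, e.g. "Assignment 2.ipynb".
    return ["smc-ipynb2sagews " + shlex.quote(nb.name) for nb in notebooks]


# Demo on a throwaway directory with two made-up notebook names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Assignment 2.ipynb", "Week1.ipynb"):
        (Path(tmp) / name).write_text("{}")
    commands = conversion_commands(tmp)

for command in commands:
    print(command)
```

The printed lines can be pasted into the terminal after checking them.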

NLTK text corpus

haraldschilly • • python

The full 2.4 GB NLTK text corpus is now available. For example, you can run this in our SageMath or Anaconda Python environment:

from nltk.corpus import brown
w = brown.words()
len(list(w))

which gives

1161192

Install Jupyter's nbextensions configurator

haraldschilly • • jupyter

You can install it in your own project. To do so, the project needs internet access enabled; otherwise you have to upload the code into the project yourself. Then install it in a terminal (create a new file, e.g. terminal.term):

 pip install --user --no-deps jupyter_nbextensions_configurator
 jupyter nbextensions_configurator enable --user

and restart the Jupyter server in SMC:

smc-jupyter restart

Then, to see the configurator, open an ipynb file. Click the “About” button in the top right, then click the link there to open the version of Jupyter without synchronization. From there, go either to the main page or to the page dedicated to nbextensions. The URL looks like this:

https://cloud.sagemath.com/<your_project_id>/port/jupyter/nbextensions
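The URL pattern above can be filled in programmatically. This is a small sketch: nbextensions_url is a hypothetical helper, and the all-zero project id is a placeholder, not a real project.

```python
def nbextensions_url(project_id, base="https://cloud.sagemath.com"):
    """Build the nbextensions page URL for a project (hypothetical helper)."""
    return f"{base}/{project_id}/port/jupyter/nbextensions"


# Placeholder id for illustration only; substitute your own project id.
example_id = "00000000-0000-0000-0000-000000000000"
url = nbextensions_url(example_id)
print(url)
```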

Toggle SageWS Cells

williamstein •

After getting too tired of people saying things like

I’m getting crazy with cells becoming hidden in a way I’m not in full control.

I just rewrote how cell input/output hiding works. Now there is a little toggle triangle in the leftmost column next to the input cell divider, and another next to the output. That’s how you toggle visibility of input and output.

Also, %md and %html no longer “magically” hide the input, and double clicking on the output doesn’t do anything anymore. It’s simple, straightforward, and you are in control.

Additionally, a second level of line numbering helps you orient yourself inside a cell and across the whole document.

TimeTravel Diffs

williamstein •

I just released a new SageMathCloud feature – TimeTravel Diffs

In a supported document, open “TimeTravel”, click “changes”, and drag the sliders to see exactly what changed in the file during any interval of time. This works for all editor-based documents, e.g. Python code and Sage worksheets. (It is not yet available for Jupyter notebooks.)
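Conceptually, a diff over an interval compares the file as it was at the start of the interval with the file at the end. The following sketch illustrates that idea with Python's standard difflib module. It is not how TimeTravel is implemented; diff_between and the two snapshots are made up for illustration.

```python
import difflib


def diff_between(old_text, new_text):
    """Return unified-diff lines between two snapshots of a file (illustrative helper)."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="earlier",
            tofile="later",
            lineterm="",
        )
    )


# Two made-up snapshots standing in for the two slider positions.
earlier = "x = 1\nprint(x)\n"
later = "x = 2\nprint(x)\nprint(x + 1)\n"

changes = diff_between(earlier, later)
print("\n".join(changes))
```

Lines starting with "-" were removed during the interval and lines starting with "+" were added.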

Nightly Changelog #3

johnjeng •

Our Nightly Changelog keeps you updated on small feature changes, bugfixes, and quality of life improvements. For upcoming changes, see our weekly progress report column.

General Usage

Quality of Life

Week 35, 2016

• smc

Last week marked the end of summer, and we noticed a significant increase in overall traffic and activity. This adds more pressure to the ongoing Kubernetes rewrite of the SMC back-end. We also started to collect a few courses taught with SMC. If you are also teaching with SMC, please let us know!

Multi-user sync-aware full document undo/redo

williamstein • • dev

Today, motivated by a challenge from a c9.io developer at a recent meetup in Seattle, I finally implemented multi-user sync-aware full document undo/redo, at least for code editors, Sage worksheets, and Jupyter notebooks. Previously, if you edited a file, worksheet, or Jupyter notebook at the same time as somebody else and hit control+z (or clicked undo) right after they typed something, you would undo their last change. That’s because undo/redo used the underlying CodeMirror editor’s undo/redo functionality. I wrote a new implementation of undo/redo built on top of the realtime multi-user sync functionality. Instead of undoing the last change (or changes, if you undo or redo multiple times) to the document, it undoes only the changes that you made during this session.
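The difference between “undo the last change” and “undo my last change” can be shown with a toy model. The sketch below is not SMC’s implementation: SharedDoc, Line, and the line-append editing model are all assumptions made for illustration. Each line remembers its author, and undo hides only the most recent visible line typed by the requesting user.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    author: str
    text: str
    visible: bool = True


@dataclass
class SharedDoc:
    """Toy shared document where every edit appends one line."""
    lines: list = field(default_factory=list)
    undone: dict = field(default_factory=dict)  # author -> stack of hidden lines

    def type_line(self, author, text):
        self.lines.append(Line(author, text))
        self.undone.pop(author, None)  # a new edit clears that author's redo stack

    def undo(self, author):
        # Hide the author's most recent visible line; other users' lines are untouched.
        for line in reversed(self.lines):
            if line.author == author and line.visible:
                line.visible = False
                self.undone.setdefault(author, []).append(line)
                return True
        return False

    def redo(self, author):
        stack = self.undone.get(author)
        if not stack:
            return False
        stack.pop().visible = True  # restore in place, keeping the original position
        return True

    def text(self):
        return "\n".join(line.text for line in self.lines if line.visible)


doc = SharedDoc()
doc.type_line("alice", "a = 1")
doc.type_line("bob", "b = 2")
doc.undo("alice")  # removes Alice's line even though Bob typed last
after_undo = doc.text()
doc.redo("alice")
after_redo = doc.text()
print(after_undo)
print(after_redo)
```

With an editor-level undo, Alice’s control+z would have removed Bob’s line instead, which is the behavior this post describes replacing.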

For Jupyter notebooks in SageMathCloud this has an interesting side effect. Vanilla Jupyter itself doesn’t have any global undo; instead it has a local undo in each cell, which you can only use via the keyboard. With this change, Jupyter notebooks in SMC now have a global undo: make some changes in any cell(s), move cells around, delete cells, etc., then click undo/redo or use the keyboard to undo/redo, and the undo should undo everything you actually did across all cells.

Tip #1: Restart Project

haraldschilly • • tipps

Today’s little usage tip is about resource usage and project restarts. Each time you open a Sage worksheet or Jupyter notebook, its state needs to be stored in memory. This can become quite costly if you open many of them one after another! They also continue to run in the background when you close the tab.

For example, this happens when you’re grading a lot of homework from your students, or when you’re working with many files at once.

The solution is to either explicitly stop each running instance with the stop button (available for both types of documents) after you’re finished with it, or restart the entire project.

Restarting the project is like rebooting your computer. Everything is cleaned up and you end up with a blank state. Go to the project settings, and then click “restart project” in “project control”.

If there were still Jupyter Notebooks open, they might give you little error messages about being cut off abruptly. Well, don’t worry, just close and re-open them.

Pro-tip: In the project settings, on the left-hand side, you can see the current memory usage and the quota. Once usage grows above the quota, things might no longer work as well as they should.
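To get a rough idea of how much memory a single worksheet or notebook process is holding, its peak memory use can be read from inside Python. This is a sketch under stated assumptions: it uses the standard resource module (Unix only), it measures only the current process and not the whole project, and the 1000 MB quota is a made-up number. peak_memory_mb and is_over_quota are hypothetical helpers.

```python
import resource
import sys


def peak_memory_mb():
    """Peak resident memory of the current process in MB (hypothetical helper)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def is_over_quota(used_mb, quota_mb):
    """True when usage exceeds the quota."""
    return used_mb > quota_mb


used = peak_memory_mb()
print(f"peak memory of this process: {used:.1f} MB")
# 1000 MB is a made-up quota for illustration; check the project settings for the real one.
print("over a hypothetical 1000 MB quota:", is_over_quota(used, 1000))
```

The project settings page remains the authoritative view, since it shows total usage across all running processes.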

Nightly Changelog #2

johnjeng • • smc

Our Nightly Changelog keeps you updated on small feature changes, bugfixes, and quality of life improvements. For upcoming changes, see our weekly progress report column.

General Usage

Quality of Life

Bugfixes