{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Cloud" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## cuPyNumeric\n", "\n", "CuPy runs on a single GPU only; cuPyNumeric runs across a grid of GPUs\n", "\n", "### Setting Up Worker Machines (untested)\n", "\n", "#### 1. System Prep on Each Machine \n", "\n", "Make sure every node has:\n", "* Linux OS (Ubuntu or CentOS recommended; use WSL on Windows)\n", "* NVIDIA GPU and CUDA driver (check with `nvidia-smi`)\n", "* Python 3.11–3.13\n", "* OpenMPI and UCX installed for inter-node communication\n", "* No need for CUDA Toolkit or `nvcc` unless you're compiling from source\n", "\n", "#### 2. Install cuPyNumeric\n", "\n", "On each machine, create a venv and install `nvidia-cupynumeric`, which installs Legate automatically; there is no need to install Legate separately\n", "```bash\n", "python -m venv legate-env\n", "source legate-env/bin/activate\n", "pip install nvidia-cupynumeric\n", "```\n", "\n", "#### 3. Set Up Networking\n", "To enable distributed execution:\n", "* Configure passwordless SSH between all machines:\n", "```bash\n", "ssh-keygen -t rsa -b 4096 -N \"\"\n", "ssh-copy-id user@worker-node\n", "```\n", "* Ensure machines are on the same subnet or have direct IP access\n", "* Open MPI-related ports in firewalls (e.g., TCP 10000–20000)\n", "* Optional: Add IPs and hostnames to `/etc/hosts` for easier reference\n", "* More details in the next subsection\n", "\n", "\n", "#### 4. Launch Distributed Jobs\n", "From your head node, use the `legate` launcher:\n", "```bash\n", "legate --gpus <gpus-per-node> \\\n", "       --nodes <num-nodes> \\\n", "       --ranks-per-node <ranks-per-node> \\\n", "       --launcher mpirun \\\n", "       ./your_script.py\n", "```\n", "* Example for 2 nodes with 4 GPUs each:\n", "```bash\n", "legate --gpus 4 --nodes 2 --ranks-per-node 1 --launcher mpirun ./main.py\n", "```\n", "\n", "#### 5. 
Test Your Setup\n", "Try this simple script to verify GPU execution:\n", "```python\n", "from legate.timing import time\n", "import cupynumeric as np\n", "\n", "size = 100_000_000\n", "start = time()\n", "a = np.random.rand(size)\n", "b = np.random.rand(size)\n", "result = np.dot(a, b)\n", "end = time()\n", "\n", "print(\"Dot product:\", result)\n", "print(f\"Elapsed time: {(end - start)/1000:.2f} ms\")\n", "```\n", "* Run it with `legate` and scale up the number of GPUs to see performance gains\n", "\n", "### Step-by-Step Guide To Set Up Networking\n", "\n", "#### 1. Assign Hostnames or IPs\n", "\n", "Pick one machine to be your head node (you’ll run python scripts with cupynumeric from here). Get the IP addresses or hostnames of all machines:\n", "```bash\n", "hostname\n", "ip a # or ifconfig\n", "```\n", "* Create a simple list like:\n", "```\n", "192.168.1.10 # Head node\n", "192.168.1.11 # Worker 1\n", "192.168.1.12 # Worker 2\n", "```\n", "\n", "#### 2. Create an SSH Key on the Head Node\n", "\n", "Enable passwordless SSH, which MPI requires for launching processes on remote nodes.\n", "```bash\n", "ssh-keygen -t rsa -b 4096 -N \"\" -f ~/.ssh/id_rsa\n", "```\n", "* `~/.ssh/id_rsa` is the default file location \n", "* `-N \"\"` sets an empty passphrase\n", "\n", "\n", "#### 3. Share the Key with Worker Nodes\n", "\n", "For each worker node:\n", "```bash\n", "ssh-copy-id user@192.168.1.11\n", "ssh-copy-id user@192.168.1.12\n", "```\n", "* Once done, test it:\n", "```bash\n", "ssh user@192.168.1.11 hostname\n", "```\n", "* You should be able to log in without a password\n", "\n", "\n", "#### 4. Add Hostnames/IPs to `/etc/hosts` (Optional)\n", "\n", "This helps MPI recognize the machines easily.\n", "```bash\n", "sudo nano /etc/hosts\n", "```\n", "Example entry:\n", "```\n", "192.168.1.11 node1\n", "192.168.1.12 node2\n", "```\n", "* Repeat this on each machine. Use static IPs if possible\n", "\n", "\n", "#### 5. 
Open Firewall Ports\n", "\n", "MPI and UCX may require open TCP ports (especially if you use `mpirun`):\n", "* Suggested range: `TCP 10000–20000`\n", "* Also allow SSH (`TCP 22`)\n", "* For `Ubuntu UFW`, run:\n", "```bash\n", "sudo ufw allow 22\n", "sudo ufw allow 10000:20000/tcp\n", "```\n", "* Or disable firewall (only in isolated or test setups):\n", "```bash\n", "sudo ufw disable # or sudo systemctl stop firewalld\n", "```\n", "\n", "\n", "#### 6. Install OpenMPI + UCX\n", "\n", "Ensure both are installed and compatible across machines.\n", "```bash\n", "sudo apt install openmpi-bin libopenmpi-dev\n", "pip install ucx-py\n", "```\n", "* Optionally verify:\n", "```bash\n", "mpirun --version\n", "```\n", "\n", "#### 7. Quick Connectivity Test\n", "\n", "Try running a simple MPI command:\n", "```bash\n", "mpirun -np 2 -host node1,node2 hostname\n", "```\n", "or \n", "```bash\n", "mpirun -np 2 -host 192.168.1.10,192.168.1.11 hostname\n", "```\n", "* Should return the hostname of each node\n", "\n", "#### 8. Try Running a cuPyNumeric Job\n", "\n", "From head node:\n", "```bash\n", "legate --gpus 4 \\\n", " --nodes 2 \\\n", " --ranks-per-node 1 \\\n", " --launcher mpirun \\\n", " --hostfile ./nodes.txt \\\n", " your_script.py\n", "```\n", "Where nodes.txt contains:\n", "```\n", "192.168.1.10 slots=4\n", "192.168.1.11 slots=4\n", "```\n", "\n", "\n", "### `setup.sh` for Worker Machine Configuration\n", "\n", "```bash\n", "#!/bin/bash\n", "\n", "# === Setup Variables ===\n", "SSH_USER=\"your_username\" # Change to your actual SSH user\n", "HEAD_NODE_IP=\"192.168.1.100\" # IP of your head node\n", "HOSTNAME=$(hostname)\n", "\n", "echo \"=== Starting setup on $HOSTNAME ===\"\n", "\n", "# === 1. Generate SSH Key if not exists ===\n", "if [ ! -f ~/.ssh/id_rsa ]; then\n", " echo \"--- Generating SSH key...\"\n", " ssh-keygen -t rsa -b 4096 -N \"\" -f ~/.ssh/id_rsa\n", "else\n", " echo \"--- SSH key already exists. Skipping generation.\"\n", "fi\n", "\n", "# === 2. 
Copy Public Key to Head Node ===\n", "echo \"--- Copying SSH key to head node...\"\n", "ssh-copy-id \"$SSH_USER@$HEAD_NODE_IP\"\n", "\n", "# === 3. Update /etc/hosts ===\n", "echo \"--- Updating /etc/hosts with head node IP...\"\n", "# Append only if the entry is missing, so re-running the script does not add duplicates\n", "grep -qw headnode /etc/hosts || echo \"$HEAD_NODE_IP headnode\" | sudo tee -a /etc/hosts\n", "\n", "# === 4. Open Firewall Ports (Ubuntu) ===\n", "echo \"--- Configuring firewall...\"\n", "sudo ufw allow ssh\n", "sudo ufw allow 10000:20000/tcp\n", "sudo ufw --force enable # --force skips the interactive confirmation prompt\n", "\n", "# === 5. Install OpenMPI and UCX ===\n", "echo \"--- Installing OpenMPI and UCX...\"\n", "sudo apt update\n", "sudo apt install -y openmpi-bin libopenmpi-dev\n", "pip install --quiet ucx-py\n", "\n", "# === 6. Test MPI (local smoke test only) ===\n", "echo \"--- Testing MPI connectivity with hostname command...\"\n", "mpirun -np 1 hostname\n", "\n", "echo \"=== Setup completed on $HOSTNAME ===\"\n", "```\n", "\n", "\n", "### Notes\n", "\n", "* Replace `your_username` and `192.168.1.100` with actual values\n", "* Run this script on each worker machine. 
You can also use SSH to distribute it from your head node if needed\n", "* Add more host entries as needed to `/etc/hosts` if you plan to use hostnames like `node1`, `node2`, etc.\n", "* For CentOS or other distros, package manager commands may need adjustment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Blocking a Website \n", "\n", "* Windows\n", "    * Run Notepad as administrator\n", "    * Open `C:\\Windows\\System32\\drivers\\etc\\hosts`\n", "    * Add \"127.0.0.1 www.google.com\" at the end to block google.com\n", "* iPad\n", "    * Settings > Screen Time > turn on \"Content & Privacy Restrictions\" > App Store, Media, Web & Games > Web Content > Only Approved Websites" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [IDL Files](https://stackoverflow.com/questions/670630/what-is-idl)\n", "\n", "* An Interface Definition Language (IDL) defines the interface that a client and a server use to communicate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [cmd Freeze Issue](https://stackoverflow.com/questions/24571981/python-program-stops-in-command-line)\n", "\n", "* If anything is selected in the cmd window, cmd freezes and does not go on to run the next Python script\n", "    * Turn off QuickEdit Mode in the cmd window's Properties and the scripts keep running without stalling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Environment Variables \n", "\n", "* Environment variables exist at two levels, user and system; admin rights are needed only to change system-level ones\n", "* Windows: Start > Edit Environment Variables for Your Account" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [GCC Options](https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Option-Summary.html#Option-Summary)\n", "\n", "* -o: output file name\n", "* -fPIC: generate position-independent code (PIC) suitable for use in a shared library; not available on all hardware\n", "* -shared: produce a shared object (.so) that can be linked with other objects to form an executable. [Use together with -fPIC](https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Link-Options.html#Link-Options)\n", "* -c: compile without linking\n", "* -I: add an include directory, e.g. `gcc -fPIC -c lib_swig.c lib_swig_wrap.c -I/srv/conda/envs/notebook/include/python3.7m` 
(needed because `Python.h` lives in that path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linux\n", "\n", "* [find](https://math2001.github.io/article/bashs-find-command/): cd to the top-level directory, then `find -name iostream` \n", "* [grep](https://stackoverflow.com/questions/16956810/how-do-i-find-all-files-containing-specific-text-on-linux): `grep -rnw 'path/to/somewhere' -e 'pattern'`\n", "* [cp](https://www.cyberciti.biz/faq/copy-folder-linux-command-line/): `cp -avr /usr/include/range/ /srv/conda/envs/notebook/include/`\n", "* help: `ls --help`\n", "* Second-level options: `g++ -c my_f_wrap.cxx -I ../../../srv/conda/envs/notebook/include`\n", "    * `g++ --help` does not show `-I`; use `g++ --help=c` instead\n", "    * `-I` behaves like a \"second-level option\": it is listed only in the per-class help, where the `c` in `--help=c` selects the C-language option class (it is not the `-c` flag)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SQL\n", "\n", "* [MySQL Crash Course](https://www.youtube.com/watch?v=9ylj9NR0Lcg) and the [mysql cheat sheet](https://gist.github.com/bradtraversy/c831baaad44343cc945e76c2e30927b3#file-mysql_cheat_sheet-md)\n", "* [SQLAlchemy](https://www.youtube.com/watch?v=OT5qJBINiJY): [on Windows the database URL must be written like this:](https://stackoverflow.com/questions/19260067/sqlalchemy-engine-absolute-path-url-in-windows) `sqlite:///C:\\\\Users\\\\Username\\\\AppData\\\\Roaming\\\\Appname\\\\mydatabase.db`" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2021-05-13 23:07:11,496 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1\n", "2021-05-13 23:07:11,497 INFO sqlalchemy.engine.base.Engine ()\n", "2021-05-13 23:07:11,499 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1\n", "2021-05-13 23:07:11,500 INFO sqlalchemy.engine.base.Engine ()\n", "2021-05-13 23:07:11,502 INFO sqlalchemy.engine.base.Engine PRAGMA main.table_info(\"person\")\n", "2021-05-13 23:07:11,502 INFO sqlalchemy.engine.base.Engine ()\n", "2021-05-13 23:07:11,503 INFO sqlalchemy.engine.base.Engine PRAGMA 
temp.table_info(\"person\")\n", "2021-05-13 23:07:11,504 INFO sqlalchemy.engine.base.Engine ()\n", "2021-05-13 23:07:11,506 INFO sqlalchemy.engine.base.Engine \n", "CREATE TABLE person (\n", "\tid INTEGER NOT NULL, \n", "\tusername VARCHAR, \n", "\tPRIMARY KEY (id), \n", "\tUNIQUE (username)\n", ")\n", "\n", "\n", "2021-05-13 23:07:11,507 INFO sqlalchemy.engine.base.Engine ()\n", "2021-05-13 23:07:11,515 INFO sqlalchemy.engine.base.Engine COMMIT\n", "2021-05-13 23:07:11,518 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)\n", "2021-05-13 23:07:11,520 INFO sqlalchemy.engine.base.Engine INSERT INTO person (id, username) VALUES (?, ?)\n", "2021-05-13 23:07:11,520 INFO sqlalchemy.engine.base.Engine (0, 'mary')\n", "2021-05-13 23:07:11,523 INFO sqlalchemy.engine.base.Engine COMMIT\n", "2021-05-13 23:07:11,530 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)\n", "2021-05-13 23:07:11,531 INFO sqlalchemy.engine.base.Engine SELECT person.id AS person_id, person.username AS person_username \n", "FROM person\n", "2021-05-13 23:07:11,532 INFO sqlalchemy.engine.base.Engine ()\n", "mary 0\n", "2021-05-13 23:07:11,534 INFO sqlalchemy.engine.base.Engine ROLLBACK\n" ] } ], "source": [ "from sqlalchemy import create_engine, Column, Integer, String, ForeignKey\n", "from sqlalchemy.ext.declarative import declarative_base\n", "from sqlalchemy.orm import sessionmaker, relationship\n", "\n", "Base = declarative_base()\n", "\n", "class User(Base):\n", " __tablename__ = 'person'\n", " user_id = Column('id', Integer, primary_key=True)\n", " user_name = Column('username', String, unique=True)\n", "\n", "engine = create_engine('sqlite:///users.db', echo=True)\n", "Base.metadata.create_all(bind=engine)\n", "Session = sessionmaker(bind=engine)\n", "\n", "session = Session()\n", "\n", "user = User()\n", "user.user_id = 0\n", "user.user_name = 'mary'\n", "session.add(user)\n", "session.commit()\n", "\n", "users = session.query(User).all()\n", "for user in users:\n", " print(user.user_name, 
user.user_id)\n", "\n", "session.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [inotify-tools](https://medium.com/100-days-of-linux/an-introduction-to-file-system-monitoring-tools-afd99164ce66)\n", "\n", "* Can set up a watcher so that every Ctrl-S on any ipynb file automatically triggers cnp\n", "* [Example: compile LaTeX automatically every time a tex file is saved](https://www.gitpod.io/docs/languages/latex)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gitpod\n", "\n", "* Prefix any Git repository URL with `gitpod.io/#`\n", "* [Gitpod python environment](https://www.gitpod.io/docs/languages/python/#pandas)\n", "* Python Debug: \n", "    1. [install python extension](https://stackoverflow.com/questions/61948801/how-to-set-up-python-debugger-for-vs-code)\n", "    1. [Edit the config](https://www.gitpod.io/docs/languages/python/#debugging) in `.theia/launch.json`\n", "    1. F9 sets a breakpoint, F5 starts debugging \n", "* VS Code\n", "    * Switch to it under Dashboard > Settings > Preference\n", "    * [As of May 2020 many features are unsupported, e.g. it does not remember gitpod.yml](https://www.gitpod.io/blog/root-docker-and-vscode/)\n", "* [M\\$ C/C++ extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode.cpptools)\n", "    * Does not support binaries other than official Microsoft builds. See [these two](https://github.com/microsoft/vscode-cpptools/issues/6518) [issues](https://github.com/microsoft/vscode-cpptools/issues/6388)\n", "    * The official C/C++ extension is not published to Open VSX, [so it does not show up in the extension tab](https://github.com/microsoft/vscode-cpptools/issues/6388)\n", "* C/C++ [Native Debug setup](https://www.gitpod.io/docs/languages/cpp/)\n", "    * Compile with `g++ -g main.cpp`\n", "    * Put this block in `launch.json`:\n", "        ```\n", "        {\n", "            \"type\": \"gdb\",\n", "            \"request\": \"launch\",\n", "            \"name\": \"Debug Hello World (GDB)\",\n", "            \"target\": \"./cpp/a.out\",\n", "            \"cwd\": \"${workspaceRoot}\",\n", "            \"valuesFormatting\": \"parseText\"\n", "        }\n", "        ```\n", "    * `target` is the compiled executable \n", "    * Restart the IDE after changing the config for it to take effect\n", "    * With the same config placed under `.vscode/`, breakpoints still cannot be set in cpp files. Currently only Theia can debug C++\n", "* Sign in GitHub in VS 
Code\n", "    * After signing in, this repo's PRs and Issues are visible directly in VS Code\n", "    * Dashboard > Settings > Integrations > GitHub > Edit Permissions > check user:read\n", " \n", "### QuantLib Setup\n", "\n", "* [QuantLib Installation on Linux](https://www.quantlib.org/install/linux.shtml)\n", "* Build QuantLib and build example for debugging\n", "```sh\n", "sudo apt-get update\n", "sudo apt-get install -y libboost-all-dev automake autoconf libtool\n", "./autogen.sh\n", "./configure --with-boost-include=/usr/include/boost \n", "make\n", "sudo make install\n", "sudo ldconfig\n", "cd Examples/EquityOption/\n", "g++ -g EquityOption.cpp -o EquityOption -L../../ql/.libs/libQuantLib.so -lQuantLib\n", "./EquityOption \n", "```\n", "* Save the above as `build_and_debug_quantlib.sh` and make it executable: \n", "```sh\n", "chmod +x ./build_and_debug_quantlib.sh\n", "```\n", "* In `.gitpod.yml` (see examples [here](https://github.com/gitpod-io/gitpod/blob/main/.gitpod.yml))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tasks:\n", "  - name: Build and Debug QuantLib\n", "    init: ./build_and_debug_quantlib.sh" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* To use prebuilds, first turn the repo into a [Project](https://www.gitpod.io/docs/configure/projects/prebuilds#projects-and-prebuilds)\n", "    * gitpod.io dashboard > Project > New Project > choose QuantLib\n", "    * The first build takes more than an hour\n", "* Install C/C++ Extension Pack v0.10.0 by franneck94\n", "* `.vscode/launch.json`\n", "```\n", "{\n", "    \"version\": \"0.2.0\",\n", "    \"configurations\": [\n", "        {\n", "            \"name\": \"C++ Debug\",\n", "            \"type\": \"lldb\",\n", "            \"request\": \"launch\",\n", "            \"program\": \"${workspaceFolder}/Examples/EquityOption/EquityOption\",\n", "            \"initCommands\":[\"settings set target.disable-aslr false\"], \n", "            \"args\": [],\n", "            \"cwd\": \"${workspaceFolder}/Examples/EquityOption\"\n", "        }\n", "    ]\n", "}\n", "```\n", "* What's going on in `build_and_debug_quantlib.sh`? 
\n", "    * The `-y` after `sudo apt-get install` [means yes to all](https://stackoverflow.com/questions/55870932/what-does-sudo-apt-install-y-do)\n", "    * After Boost is installed, its lib files are [in](https://askubuntu.com/questions/263461/where-is-my-boost-lib-file) `/usr/lib/x86_64-linux-gnu` (or locate them with `find -name '*boost*'`)\n", "    * [The linker already knows this folder](https://askubuntu.com/questions/263461/where-is-my-boost-lib-file), so `./configure` only needs the include directory, not a library directory\n", "    * `g++ -g EquityOption.cpp -o EquityOption -L../../ql/.libs/libQuantLib.so -lQuantLib`\n", "        * `-g`: generate debug symbols during compilation\n", "            * Debug symbols provide additional information that allows the debugger to map the compiled executable back to the original source code.\n", "        * `EquityOption.cpp`: the source file to compile\n", "        * `-o EquityOption`: output executable file name\n", "        * `-L../../ql/.libs/libQuantLib.so`: meant as the path of the QuantLib library. Note that `-L` expects a directory to search (e.g. `-L../../ql/.libs`), not a file; the link presumably still succeeds because `sudo make install` already put the library on the default search path\n", "        * `-lQuantLib`: name of the library to link" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Docker\n", "\n", "* Clone\n", "```bash\n", "docker run --name repo alpine/git clone https://github.com/docker/getting-started.git\n", "docker cp repo:/git/getting-started/ .\n", "```\n", "* Build\n", "```bash\n", "cd getting-started\n", "docker build -t docker101tutorial .\n", "```\n", "* Run\n", "```bash\n", "docker run -d -p 80:80 --name docker-tutorial docker101tutorial\n", "```\n", "* Share (need to log into Docker Hub)\n", "```bash\n", "docker tag docker101tutorial {userName}/docker101tutorial\n", "docker push {userName}/docker101tutorial\n", "```\n", "* Locally build the docker image of a binder and push to Docker Hub\n", "    1. repo2docker a github repo (took 4 hours to build sandbox-stable, resulting image of size 8G)\n", "    2. ```docker tag (ugly_long_tag_name):latest beginnersc/sandbox-stable:latest```\n", "    3. 
```docker push beginnersc/sandbox-stable:latest```\n", "* Print the Dockerfile only, without building the image\n", "    * ```jupyter-repo2docker --no-build --debug https://github.com/beginnerSC/sandbox-stable```\n", "    * [The Dockerfile generated by repo2docker cannot be used to build on your own](https://repo2docker.readthedocs.io/en/latest/faq.html#can-i-use-repo2docker-to-bootstrap-my-own-dockerfile)\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Heroku \n", "\n", "* [Kaffeine](https://kaffeine.herokuapp.com/): ping your Heroku app every 30 minutes\n", "    * [GitHub repo](https://github.com/romainbutteaud/Kaffeine)\n", "    * There are other ways. Search \"how to keep your free heroku app alive and prevent it from going to sleep betterprogramming\"\n", "* Delete an app: Heroku dashboard > myapp > settings > delete\n", "\n", "\n", "### [Deploying Docker Image](https://www.codingforentrepreneurs.com/blog/jupyter-production-server-on-docker-heroku)\n", "\n", "\n", "* [install heroku cli](https://devcenter.heroku.com/articles/heroku-cli): ```curl https://cli-assets.heroku.com/install.sh | sh```\n", "* [You need to log in twice](https://stackoverflow.com/questions/56814489/heroku-deploy-with-docker-can-not-login): log in to the CLI first, then log in to the container registry\n", "    * ```heroku login -i```\n", "    * ```heroku container:login```\n", "* Only Jupyter-x-Docker-on-Heroku deployed successfully; every other Dockerfile I tried failed, e.g. the [base image Dockerfile](https://github.com/jupyter/docker-stacks/blob/master/base-notebook/Dockerfile) from [Jupyter Docker Stacks](https://jupyter-docker-stacks.readthedocs.io/en/latest/)\n", "    * ```heroku create myapp```\n", "    * Change the password and public IP address in Jupyter-x-Docker-on-Heroku/conf/jupyter.py\n", "    * ```git clone https://github.com/beginnerSC/Jupyter-x-Docker-on-Heroku```\n", "    * cd to the directory containing the Dockerfile\n", "    * ```heroku container:push web -a myapp```\n", "    * ```heroku container:release web -a myapp```\n", "    * ```heroku open```\n", "* If it does not open, check the error messages with ```heroku logs -a myapp```\n", "* After a successful deploy, to use it from the intranet you have to change the URL to https manually\n", "* 
Jupyter-x-Docker-on-Heroku/Dockerfile: swapping its base image for one from Jupyter Docker Stacks failed and took a long time to build (not sure whether Heroku has an image size limit)\n", "* If the final ```CMD [\"./scripts/postBuild.sh\"]``` hits a permission error, change it to ```CMD [\"jupyter\", \"lab\", \"--config\", \"./jupyter_notebook_config.py\"]```\n", "* A permission error on a line other than the last can be fixed with [chmod u+x program_name](https://stackoverflow.com/questions/18960689/ubuntu-says-bash-program-permission-denied/18960752)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [Free DB Hosting](https://gist.github.com/bmaupin/0ce79806467804fdbbf8761970511b8c) and [Free Backend Hosting](https://gist.github.com/bmaupin/d2d243218863320b01b0c1e1ca0cf5f3) Options" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Jupyter C++ Kernel on WSL2\n", "\n", "### Install WSL and the Ubuntu Linux Distribution\n", "\n", "* The steps below install the distribution on the C drive. We will move it to the D drive later\n", "* In PowerShell, install WSL:\n", "```sh\n", "wsl --install\n", "```\n", "* Reboot the computer\n", "* See which Linux distributions are installed on this computer (none yet):\n", "```sh\n", "wsl --list --verbose\n", "```\n", "* See which distributions you can download:\n", "```sh\n", "wsl --list --online\n", "```\n", "* Install the `Ubuntu` distribution (or any available distribution name listed above)\n", "```sh\n", "wsl --install -d Ubuntu\n", "```\n", "* Open Ubuntu from the Start menu to set up the user ID and password\n", "* Close and reopen PowerShell. The installed distribution is now listed\n", "```sh\n", "wsl --list --verbose\n", "```\n", " \n", "### Move to D Drive\n", "\n", "* Below, `Ubuntu` is the distribution name\n", "* Create the `D:\\WSL\\Ubuntu` folder\n", "* Export the installed distribution\n", "```sh\n", "wsl --export Ubuntu D:/WSL/Ubuntu/backup.tar\n", "```\n", "* Unregister the distribution. 
It will be removed from the C drive\n", "```sh\n", "wsl --unregister Ubuntu\n", "```\n", "* Import the distribution in the D drive\n", "```sh\n", "wsl --import Ubuntu D:/WSL/Ubuntu D:/WSL/Ubuntu/backup.tar --version 2\n", "```\n", " \n", "### Python and Environment\n", "\n", "* Open Ubuntu in the start menu and run \n", "```sh\n", "sudo apt update\n", "sudo apt install python3-pip\n", "```\n", "* In windows create the folder `D:/pyscripts/sandbox`, and in Ubuntu run\n", "```sh\n", "cd /mnt/d/pyscripts/sandbox\n", "sudo apt install python3-poetry\n", "```\n", "* This will install poetry 1.8.2 (as of Feb 2025) which recognizes the shell command. Then install jupyter in a new venv: \n", "```sh\n", "poetry init\n", "poetry shell\n", "poetry add jupyterlab\n", "```\n", " \n", "### Jupyter C++ Kernel ([xeus-cling](https://github.com/jupyter-xeus/xeus-cling))\n", "\n", "* xeus-cling is distributed through the conda-forge channel. You cannot pip install it. You can only mamba install it. But to do that you need to install conda and mamba first\n", "```sh\n", "wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh\n", "bash Miniconda3-latest-Linux-x86_64.sh\n", "```\n", "* Follow the steps. I say yes to every question\n", "* When done, close Ubuntu and open again. Now run\n", "```sh\n", "conda --version\n", "conda install -c conda-forge mamba\n", "mamba install -c conda-forge xeus-cling\n", "```\n", "* Now because you installed Jupyter and xeus-cling independently, you have to tell Jupyter to pick up the new kernels: \n", " * Create xcpp20 and xcpp23 kernels\n", " * In windows file explorer go to `\\\\wsl$`\n", " * Go to `\\Ubuntu\\home\\beginnersc\\miniconda3\\share\\jupyter\\kernels`\n", " * Copy the `xcpp17` folder, paste twice, and rename to `xcpp20` and `xcpp23`\n", " * In each newly created folder:\n", " * Delete all *.identifier files. 
Those were probably created by Windows when the folders were copied in Windows\n", "            * Modify kernel.json to c++20 (and c++23) everywhere\n", "        * This will not really give you fully supported C++20 and C++23 because the clang version is too old\n", "    * In Ubuntu, run\n", "```sh\n", "cd ~\n", "cd /mnt/d/pyscripts/sandbox\n", "poetry shell\n", "jupyter kernelspec install /home/beginnersc/miniconda3/share/jupyter/kernels/xcpp14 --sys-prefix\n", "jupyter kernelspec install /home/beginnersc/miniconda3/share/jupyter/kernels/xcpp17 --sys-prefix\n", "jupyter kernelspec install /home/beginnersc/miniconda3/share/jupyter/kernels/xcpp20 --sys-prefix\n", "jupyter kernelspec install /home/beginnersc/miniconda3/share/jupyter/kernels/xcpp23 --sys-prefix\n", "jupyter kernelspec uninstall xcpp14\n", "jupyter lab\n", "```\n", "* Copy the URL `http://localhost:8888/lab?token=...`, including the token, and paste it into the browser to start JupyterLab with the C++ kernels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## JupyterLab\n", "\n", "* Check my JupyterLab version: `jupyter --version`\n", "* [Use](https://stackoverflow.com/questions/45818538/where-can-i-put-a-startup-script-in-jupyter) `get_ipython().profile_dir.startup_dir` to check where startup scripts should go\n", "* All the Kernels lets you switch between all installed kernels within one notebook, like the cell magic ```%%javascript```, but you lose the precious syntax coloring, so I installed it but do not use it:\n", "    ```\n", "    >xcpp17\n", "    #include <iostream>\n", "    std::cout << \"test\" << std::endl;\n", "    ```\n", "    * xeus-cling currently has no cell magic for switching directly, but [there is a request for it](https://github.com/jupyter-xeus/xeus-cling/issues/204) \n", "* Paste image from clipboard: works in the notebook but not recommended\n", "    * it won't compile to pdf\n", "    * if multiple images are pasted into the same nb file, sphinx displays only the last one\n", "        * Giving each image a different alt text fixes this (pasted images all default to image.png). Mentioned in [this issue](https://github.com/spatialaudio/nbsphinx/issues/162)\n", "    * even with a single pasted image, RTD will compile to pdf but the image size still won't be correct\n", "* Equation 
Numbering Not Working\n", "    * [jupyter_contrib_nbextensions](https://github.com/ipython-contrib/jupyter_contrib_nbextensions) is a collection of all the unofficial (classic) Jupyter notebook extensions, including Equation Numbering\n", "    * In JupyterLab, however, \\ref breaks and only prints (???) \n", "    * Still not fixed as of 2020/9. See this [issue](https://github.com/jupyterlab/jupyterlab/issues/4039) \n", "* nbconvert to pdf \n", "    * [No section numbers](https://stackoverflow.com/questions/35077571/how-to-remove-heading-numbers-in-jupyter-during-pdf-conversion): with ```### my heading {-}``` the exported pdf shows only \"my heading\", without a number such as 1.1.1 in front\n", "    * Centering cell content: wrap it in a ```\\begin{center} ... \\end{center}``` block\n", "* [Keyboard Shortcuts](https://blog.ja-ke.tech/assets/jupyterlab-shortcuts/Shortcuts.png)\n", "    * Toggle left area: Ctrl + B \n", "    * Close current tab: Alt + W\n", "    * Splitting a cell: Ctrl + Shift + -\n", "    * Merge selected cells: Shift + M\n", "    * Switch tabs: Ctrl + Shift + ]\n", "    * Restart Kernel: 00\n", "    * Interrupt Kernel: II\n", "    * Change to code cell: Y\n", "    * Clear output of code cell: MY (convert to markdown, then back)\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Poetry\n", "\n", "* Configure Poetry to keep venvs on the D drive, then check\n", "    * `poetry config virtualenvs.path \"D:\\\\Users\\\\zhang\\\\AppData\\\\Local\\\\pypoetry\\\\Cache\\\\virtualenvs\"`\n", "    * `poetry config --list`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GitHub\n", "\n", "* [Search with qualifiers](https://youtu.be/Uj6WWAqg0NY?t=194)\n", "    * in:name spring boot stars:>3000\n", "    * in:readme \n", "    * in:description 微服務 language:java pushed:>2019-09-03\n", "* [Diff between versions](https://youtu.be/HkphN8Js8AU?t=302): append /compare to the URL\n", "* Issue: discussion area and feature/bug tracking system; issues can be open or closed\n", "    * An important indicator of whether a project is active\n", "* Pull requests (pr): contribute to a project\n", "* [Projects](https://youtu.be/HkphN8Js8AU?t=607): project management tool, like a big kanban board; you can create projects and add to-do lists, doing lists, etc.\n", "* Insights: project statistics\n", "* Settings: various services\n", "    * github.io site\n", "    * webhooks: event triggers, e.g. on each 
pull request, send an email\n", "    * Delete the repo, publish it, or make it private\n", "* [Handy tips and keyboard shortcuts](https://www.youtube.com/watch?v=VzdU5GwZ47o)\n", "* Push to GitHub from the terminal (```git remote add``` and the ```-u``` in ```git push``` are needed only for the first push):\n", "    ```\n", "    git add *\n", "    git commit -m \"test: push from code ocean\"\n", "    git remote add github https://github.com/beginnerSC/misc.git\n", "    git remote -v\n", "    git push -u github master\n", "    ```\n", "* Default branch name:\n", "    * When a new repo is created, the default branch name is main, so pulling master does not work. To change it, go to Settings > Repositories > Repository default branch\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### GitHub Pages\n", "\n", "* [Multiple GitHub Pages under one account](https://stackoverflow.com/questions/15563685/can-i-create-more-than-one-repository-for-github-pages)\n", "    * Every repo can become a sub-site at https://username.github.io/reponame ; just go to that repo's Settings > GitHub Pages > Source and pick the branch to publish. An index.html or README.md is required\n", "* Theme\n", "    * Without an index.html, GitHub Pages automatically takes README.md and applies the theme (plain white if no theme is chosen)\n", "    * Change the theme here: Repo Settings > GitHub Pages > Theme Chooser; afterwards a ```_config.yml``` appears in the repo automatically\n", "    * The default title is the repo name and the subtitle is blank; both can be set in ```_config.yml```\n", "        * ```title: This is the title```\n", "        * ```description: This is the subtitle```\n", "    * In my tests the View on GitHub button is hidden by default; enable it in ```_config.yml```\n", "        ```\n", "        github:\n", "          is_project_page: true\n", "        ```\n", "        * The indent above must be two spaces; this is [required by the Cayman theme](https://github.com/pages-themes/cayman) (other themes are probably the same)\n", "    * In my tests an empty ```google_analytics:``` must not be present, or ```_config.yml``` breaks\n", "    * If the first line of the README is a heading (h1 or h2 alike), it is automatically used as the page title (the text shown on the browser tab); only if the README starts with some plain description instead of a heading does the page title fall back to the site title, i.e. the one in ```_config.yml```\n", "* Page [Redirect](https://stackoverflow.com/questions/5411538/redirect-from-an-html-page)\n", "    * In a blank html file put ``````\n", "    * Very old browsers may fail to redirect, so also add a link ```

Redirect

```\n", "    * See https://github.com/yc14e/nb2pdf/blob/master/index.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### [Git LFS](https://git-lfs.github.com/)\n", "\n", "* GitHub warns about files over 50 MB and rejects files over 100 MB, so large files cannot simply be committed and pushed\n", "* [Download and install git-lfs](https://askubuntu.com/questions/799341/how-to-install-git-lfs-on-ubuntu-16-04) (on binder, simply adding git-lfs to apt.txt seems to work)\n", "* Specify the file types to track with lfs, e.g. pdf and csv, then add .gitattributes. After that, git commit and push work as usual\n", "    ```\n", "    git lfs install\n", "    git lfs track \"*.pdf\"\n", "    git lfs track \"*.csv\"\n", "    git add .gitattributes\n", "    ```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dashboarding Frameworks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Voilà\n", "* Builds dynamic web pages backed by a Jupyter server, and can also be deployed to mybinder using ?urlpath=voila; see the binder link on the [Voila GitHub page](https://github.com/voila-dashboards/voila)\n", "    * A link pointing directly at an app looks like this: https://mybinder.org/v2/gh/yc14e/nb2pdf/master?urlpath=voila/render/nb2pdf.ipynb\n", "* Renders the output cells. [xwidgets](https://github.com/jupyter-xeus/xwidgets) should render as well\n", "    * In my tests HoloViz Panel does not render well, because Voila has many problems with JavaScript\n", "* When developing in JupyterLab there is a Render with Voila button at the top of the notebook\n", "* Layout\n", "    * The [gridstack template](https://github.com/voila-dashboards/voila-gridstack) can control the dashboard layout (not currently installed on sandbox-stable)\n", "    * HBox and VBox in ipywidgets accept markdown, html and even [css](https://stackoverflow.com/questions/49863789/setting-background-color-of-a-box-in-ipywidgets)\n", "    * Worth looking into [jupyter-flex](https://github.com/danielfrg/jupyter-flex)\n", "* Debug from the terminal without opening a browser: ```voila --no-browser --debug my_notebook.ipynb```\n", "* In my tests it cannot render notebooks whose file names contain spaces" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### RISE Slideshow\n", "* With a backend server, a RISE slideshow can also \"deploy\" webapps; this requires\n", "    * [enable autolaunch](https://rise.readthedocs.io/en/stable/customize.html#automatically-launch-rise)\n", "    * hide 
code (see the discussion below)\n", "* In testing, the nb2pdf output form gets stretched and is not shown at its original aspect ratio\n", "* Static slides can be deployed in many ways\n", " * nbviewer: [start the url with nbviewer.jupyter.org/format/slides/](https://nbviewer.jupyter.org/format/slides/gist/basnijholt/2e9aa58de39a07943dd3)\n", " * nbviewer: [change tree in the url to blob](https://nbviewer.jupyter.org/github/LangLEvoI/langchangeinnet/blob/master/ruse.slides.html)\n", " * This trick actually renders any html file, e.g. a login page: ```https://nbviewer.jupyter.org/github/beginnerSC/beginnersc.github.io/blob/master/index.html```\n", " * Rendering bokeh plots currently has [many problems](https://github.com/damianavila/RISE/issues/350). With a backend server they display fine\n", "* [Export html](https://github.com/damianavila/RISE/issues/336)\n", " * ```jupyter nbconvert myslides.ipynb --to slides --reveal-prefix ../reveal.js```, or just click through the menu with the mouse\n", " * In testing, only one of [Hide_code_slides](https://github.com/kirbs-/hide_code) and reveal.js Slides can be chosen for export. hide_code cannot be combined with RISE, so it was removed\n", "* [Good Example and Tips](http://droste.hk/jupyter-notebook-slides/)\n", "* No way found to center a figure drawn by matplotlib. [The answer found on Google](https://stackoverflow.com/questions/41485301/how-can-i-center-the-position-of-a-matplotlib-figure-after-nbconvert) does not work\n", "* chalkboard\n", " * Add to the Notebook Metadata:\n", " ```\n", " \"rise\": {\n", " \"enable_chalkboard\": true\n", " }\n", " ```\n", " * There must be no comma after ```true```. If the Metadata json has a syntax error, JupyterLab shows a red frame. After editing, click the check mark at the top left of the frame and save\n", " * The notebook has to be shut down and reopened before the chalkboard appears\n", " * In testing, the chalkboard disappears after export\n", "* In testing, only the exported result lets you press esc to see all slides\n", "* Hide Code\n", " * To hide code during a live presentation, add the two cells below (see [this post](https://www.markroepke.me/posts/2019/06/05/tips-for-slideshows-in-jupyter.html) and [this issue](https://github.com/damianavila/RISE/issues/32)):\n", " ```\n", " %%html\n", " \n", " ---------------------------------------------------\n", " def hide_code_in_slideshow(): \n", " from IPython import display\n", " import binascii\n", " import os\n", " uid = binascii.hexlify(os.urandom(8)).decode() \n", " html = \"\"\"
\n", " \"\"\" % (uid, uid)\n", " display.display_html(html, raw=True)\n", " ```\n", " * Once this is used, however, the slides can no longer be exported and can only be presented live\n", " * Use the nbconvert flags directly:\n", " * ```!jupyter nbconvert od_export.ipynb --to slides --TemplateExporter.exclude_input=True --no-prompt``` also renders bokeh. This does not involve RISE; nbconvert alone is enough to render slides this way\n", " * There is a ```--no-input``` flag, [but it breaks the alignment](https://github.com/jupyter/nbconvert/issues/915)\n", " * Adding reveal.js fails: ```jupyter nbconvert export.ipynb --to slides --no-input --reveal-prefix reveal.js```\n", " * It is also possible to [give nbconvert a template](http://damianavila.github.io/blog/posts/hide-the-input-cells-from-your-ipython-slides.html) (this post is old, so it may no longer work)\n", " * Or [write your own button](https://stackoverflow.com/questions/27934885/how-to-hide-code-from-cells-in-ipython-notebook-visualized-with-nbviewer/50790330#50790330)\n", " * In testing, it works in classic Jupyter Notebook and nbviewer but not in JupyterLab\n", " * To use it under sphinx, change ```div.input``` to ```div.nbinput```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bokeh and HoloViz Panel\n", "* HoloViz Panel is currently (Oct 2020) the only framework that can run through all widget input combinations, save the results, and produce a static web page, which can then be deployed to github.io (or RTD?)\n", "* HoloViz Panel currently has no [drag and drop](https://github.com/holoviz/panel/issues/917), only a [button-based FileInput](https://panel.holoviz.org/reference/widgets/FileInput.html)\n", "* Bokeh can also build static pages with widgets, but you have to save the results manually first and write the callbacks in JavaScript. Plotly Dash callbacks can be written in Python, but the result is a dynamic web page\n", "* Slideshows require [RISE](https://github.com/damianavila/RISE). Someone asks about this in the [FAQ](https://panel.holoviz.org/FAQ.html)\n", " * So far RISE only works in Classic Jupyter Notebook, see [this](https://github.com/damianavila/RISE/issues/270) issue" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotly Dash\n", "* In testing, dynamic pages built with [Jupyter Dash](https://github.com/plotly/jupyter-dash) cannot be deployed to mybinder through Voila, see this [issue](https://github.com/plotly/jupyter-dash/issues/23). So Jupyter Dash is only useful for development inside JupyterLab\n", "* 
```app.run_server()``` deploys temporarily to mybinder, which is handy for testing during development\n", " * It uses the node that mybinder allocated to this JupyterLab. Replace everything from lab onward in the current url with ```proxy/8050```. For example\n", " * ```https://hub.gke2.mybinder.org/user/beginnersc-sandbox-dash-73nkpf4u/lab/workspaces/auto-F?clone=auto-M``` becomes\n", " * ```https://hub.gke2.mybinder.org/user/beginnersc-sandbox-dash-73nkpf4u/proxy/8050```\n", " * If there is a way other than Voila to trigger notebook execution automatically, running ```app.run_server()``` would also produce a temporarily usable Plotly Dash page\n", "* For now, Heroku still looks like the only deployment option\n", "* Only Dash Enterprise has auth, but someone built [Flask-Login on Dash App](https://github.com/RafaelMiquelino/dash-flask-login)\n", "* What exactly is the difference between [Jupyter Dash](https://github.com/plotly/jupyter-dash) and [jupyter-plotly-dash](https://github.com/GibbsConsulting/jupyter-plotly-dash)?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heroku for Deployment\n", "* Can deploy dynamic or static web pages such as Sphinx documents. It is framework agnostic, unlike Voila, which only renders the output cells of a Jupyter notebook\n", "* Password protection can be added with Flask\n", "* Does not scale well: under heavy traffic it costs much more than AWS, but AWS is far more complicated to set up than Heroku" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [Hosting Private Static Site Free](https://github.com/hamelsmu/oauth-tutorial)\n", "\n", "* Using Oauth2 Proxy\n", "* To look into when time allows" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Jupyter Book\n", "\n", "### Installation\n", "\n", "* `poetry add jupyter-book`\n", "* `pip install jupyter-book` to add `fastjsonschema`\n", "* If you will build pdf:\n", " * `poetry add playwright`\n", " * `playwright install --with-deps chromium`\n", " * This could fail behind a corporate firewall, in which case\n", " * Download `\\\\corp.tdsecurities.com\\ny-dfs\\US_Rates\\Vol\\Scott\\ms-playwright.zip`\n", " * Unzip and put it under `C:\\Users\\zhang\\AppData\\Local\\`\n", "\n", "### Usage\n", "\n", "* In project root `jb create docs` (it doesn't have to be docs, it can be any book name you want)\n", " * if docs folder is there already it will error out\n", "* Delete md files (there are 2)\n", "* Delete md file items 
from `_toc.yml`\n", "* In `_config.yml` set `execute_notebooks: 'off'`\n", "* In project root `jb build docs` to build html\n", "* In project root `jb build docs --builder pdfhtml` to build pdf\n", "\n", "### Equations\n", "\n", "* Comes with a built-in copy button!\n", "* Equation numbering can render\n", "* Does not recognize `align` blocks\n", " * Tried the [amsmath extension](https://jupyterbook.org/en/stable/content/math.html#latex-style-math) but it did not help\n", "* Put standalone equations inside `$$...$$`, with a line break before and after\n", "* Inside `$$...$$` you can align with `&=`. Jupyter Book understands it, but VS Code and sphinx do not\n", "\n", "### TOC\n", "\n", "* A root is required, and the root must contain a file\n", "* `sections` cannot come directly under `chapters`; at least one file must be listed first, see [here](http://github.com/jupyter-book/jupyter-book/blob/main/docs/_toc.yml)\n", "* Each file is converted into a separate html page\n", "* [Example: Number the chapters](https://github.com/quantgirluk/Understanding-Quantitative-Finance/blob/main/UQF/_toc.yml)\n", "```yml\n", "format: jb-book\n", "root: intro\n", "options: # The options key will be applied to all chapters, but not sub-sections\n", "  numbered: True\n", "parts:\n", "- caption: Methodologies\n", "  numbered: True\n", "  chapters:\n", "  - file: methodologies/summary\n", "  - file: methodologies/BSDE_BS\n", "  - file: methodologies/BSDE_SABR\n", "  - file: methodologies/DGM_BS\n", "  - file: methodologies/DGM_SABR\n", "- caption: Appendices\n", "  chapters:\n", "  - file: appendices/pytorch\n", "```\n", "\n", "### PDF\n", "\n", "* Building the pdf first concatenates all html into a single page! So if both html and pdf are needed, build the pdf first\n", "* Equations in the pdf produced by `playwright` are sometimes broken, so it is not very useful; texlive remains the only reliable route" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## nbsphinx and readthedocs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import a Project From GitHub to RTD\n", "\n", "* If the RTD account was not registered through GitHub, add GitHub to Connected Services first\n", " * RTD Dashboard > Settings > Connected Services\n", "* RTD Dashboard > Import a Project\n", "* Select a repo\n", "* Input a project name (it must not clash with any project currently live on RTD)\n", "* The GitHub-side 
webhook is created automatically; it can also be checked under admin > integrations in this RTD project" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### nbsphinx\n", "\n", "* See sphinx tutorial [new](https://www.youtube.com/watch?v=RvJ54ADcVno), [old](https://www.youtube.com/watch?v=oJsUvBQyHBs&feature=youtu.be) & [nbsphinx doc](https://nbsphinx.readthedocs.io/en/0.7.1/index.html)\n", "* Create a folder named docs\n", "* Inside docs, run ```sphinx-quickstart``` to start the wizard\n", " * Values in [] are the defaults\n", " * Enter anything for release, e.g. 0.1\n", "* When it finishes, docs/source/conf.py appears; add ```'nbsphinx'``` to ```extensions```\n", "* Run ```make html``` inside docs; the generated index.html is in docs/build/html\n", "* The toctree in docs/source/index.rst is essential and must not be deleted, otherwise sphinx cannot make\n", " * The title in index.rst can be edited by hand\n", "* Add ipynb files to docs/source and list their file names (without extension) in the toctree of index.rst\n", " * Every ipynb file must have a title\n", " * The toctree must be aligned like this:\n", " ```\n", " .. toctree::\n", "    :maxdepth: 2\n", "    :caption: Contents:\n", " <---- there must be a blank line here\n", "    MYIPYNBFILENAME\n", " ```\n", " * Adding ```:hidden:``` makes it appear only in the left sidebar of the page\n", " * See the [sphinx getting started guide](https://www.sphinx-doc.org/en/master/usage/quickstart.html) & [toctree directive](https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-toctree)\n", " * The toctree maxdepth can also be changed\n", " * Multiple toctrees with different captions are possible, as in the [JupyterLab doc](https://jupyterlab.readthedocs.io/en/stable/)\n", "* Switch to the third-party readthedocs theme\n", " * ```pip install sphinx_rtd_theme``` beforehand\n", " * In conf.py, import it and change ```html_theme```:\n", " ```\n", " import sphinx_rtd_theme\n", " html_theme = 'sphinx_rtd_theme'\n", " ```\n", "* After all content has been added, go back to docs and run ```make html``` again\n", "* ```make clean``` deletes everything in build\n", "* To render bokeh plots, see this [issue](https://github.com/spatialaudio/nbsphinx/issues/61)\n", "* Pages built by Sphinx currently (Oct 2020) do not display correctly in JupyterLab; [local css causes problems](https://discourse.jupyter.org/t/loading-static-css-in-jupyterlab/1088/7)\n", "* [Two column 
toc](https://stackoverflow.com/questions/56749718/how-to-make-2-columns-with-sphinx) (not tried)\n", "* [md syntax](https://www.markdownguide.org/basic-syntax/) and [reStructuredText (rst) Quick Reference](https://docutils.sourceforge.io/docs/user/rst/quickref.html)\n", "* Currently (Apr 2021), compiling inline images requires pinning docutils==0.16 in requirements.txt. See this [issue](https://github.com/spatialaudio/nbsphinx/issues/549)\n", "* Currently (Aug 2021), the combination below raises `AssertionError: assert 'Verbatim' in lines[0] /nbsphinx.py\", line 2151, in depart_codearea_latex`\n", " * In [this issue](https://github.com/spatialaudio/nbsphinx/issues/584) someone proposes a working version combination\n", " ```\n", " docutils==0.16\n", " sphinx>=1.4\n", " sphinx-copybutton\n", " ipykernel\n", " nbsphinx\n", " ```\n", "* [By default nbsphinx executes input cells that have no stored output](https://nbsphinx.readthedocs.io/en/0.3.5/usage.html#Running-Sphinx). If cells that use a C++ kernel should not be executed, add `nbsphinx_execute = 'never'` to conf.py to [disable execution](https://nbsphinx.readthedocs.io/en/0.5.1/never-execute.html)\n", " * To execute notebooks that use the xcpp17 kernel, install xeus-cling into RTD with readthedocs.yml and environment.yml at the repo root. See the [docs](https://nbsphinx.readthedocs.io/en/0.8.1/usage.html#Using-conda)\n", " * In testing, installing with conda messes up the layout; it is unclear what is missing. The `_readthedocs.yml` and `environment.yml` currently at this repo's root are the files from that failed attempt\n", "* Section 4.6 of [code packaging](https://pythonpackaging.info/) says that adding `nbsphinx_prompt_width = 0` makes the cell prompt (e.g. `In [1]:`) disappear. In testing it does not disappear; it is just pushed outward\n", "* Add a copy button to code blocks [(sphinx-copybutton)](https://stackoverflow.com/questions/39187220/how-to-add-a-copy-button-in-the-code-blocks-for-rst-read-the-docs)\n", " * `pip install sphinx-copybutton`\n", " * Add `'sphinx_copybutton'` to extensions in conf.py\n", " * sphinx-copybutton comes from the [executable books project](https://github.com/executablebooks), which has other sphinx extensions as well, e.g. [sphinx-togglebutton](https://github.com/executablebooks/sphinx-togglebutton)\n", " * Could not work out how to put sphinx directives in a notebook, so the toggle button cannot be used with nbsphinx\n", "* hide input cell but not output\n", " * [Use the first cell 
to generate a button that toggles hide/show for all input cells on the page](https://stackoverflow.com/questions/27934885/how-to-hide-code-from-cells-in-ipython-notebook-visualized-with-nbviewer)\n", " * The default is hide. Swapping `.show()` and `.hide()` makes show the default\n", " * With hide as the default, removing the button and leaving only the empty form hides every input on the page with no way to show it again\n", " * [nbsphinx hidden](https://nbsphinx.readthedocs.io/en/0.8.6/hidden-cells.html) reportedly hides both input and output cells, but in testing it had no effect\n", "* internal link hack\n", " * Put the html file name of another page at the same level directly inside (), e.g. `[blah blah](other.html)`\n", " * A section can also be targeted, e.g. `[blah blah](other.html#See-Also)`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Build in RTD\n", "\n", "* [Install nbsphinx in the RTD environment configuration](https://nbsphinx.readthedocs.io/en/0.3.3/usage.html#Automatic-Creation-of-HTML-and-PDF-output-on-readthedocs.org): create requirements.txt in this repo with the following content:\n", " ```\n", " sphinx>=1.4\n", " ipykernel\n", " nbsphinx\n", " ```\n", "* Add ```master_doc = 'index'``` to conf.py, otherwise a [contents.rst not found Error](https://stackoverflow.com/questions/56336234/build-fail-sphinx-error-contents-rst-not-found) appears\n", " * RTD uses ```contents.rst``` rather than ```index.rst``` as the default entry point\n", "* Add ```nbsphinx_allow_errors = True``` to conf.py, otherwise the build fails when ipywidgets is used\n", "* Every push to GitHub makes the corresponding RTD project rebuild automatically (see Builds in the project). A build can also be triggered by hand with Build version in Overview\n", "* RTD only needs the contents of source, so running ```make clean``` locally before pushing to GitHub is fine\n", " * After `make clean`, even `make.bat` and `Makefile` can be deleted (tested in the DocsTest repo), although local builds then stop working\n", "* [Paste this snippet into conf.py so the RTD PDF build handles Chinese](https://www.kawabangga.com/posts/2331) (Traditional Chinese needs the [bsmi](https://wlzhong.wordpress.com/2016/10/31/latex-%E5%A6%82%E4%BD%95%E5%9C%A8%E6%96%87%E7%AB%A0%E4%B8%AD%E8%BC%B8%E5%85%A5%E4%B8%AD%E6%96%87/) font)\n", " ```\n", " latex_elements = {\n", " # The paper size ('letterpaper' or 'a4paper').\n", " #'papersize': 'letterpaper',\n", " #\n", " # The font size ('10pt', '11pt' or '12pt').\n", " #'pointsize': '10pt',\n", " #\n", " # Additional stuff for the LaTeX preamble.\n", " 
#'preamble': '',\n", " 'preamble': r'''\n", " \\hypersetup{unicode=true}\n", " \\usepackage{CJKutf8}\n", " \\DeclareUnicodeCharacter{00A0}{\\nobreakspace}\n", " \\DeclareUnicodeCharacter{2203}{\\ensuremath{\\exists}}\n", " \\DeclareUnicodeCharacter{2200}{\\ensuremath{\\forall}}\n", " \\DeclareUnicodeCharacter{2286}{\\ensuremath{\\subseteq}}\n", " \\DeclareUnicodeCharacter{2713}{x}\n", " \\DeclareUnicodeCharacter{27FA}{\\ensuremath{\\Longleftrightarrow}}\n", " \\DeclareUnicodeCharacter{221A}{\\ensuremath{\\sqrt{}}}\n", " \\DeclareUnicodeCharacter{221B}{\\ensuremath{\\sqrt[3]{}}}\n", " \\DeclareUnicodeCharacter{2295}{\\ensuremath{\\oplus}}\n", " \\DeclareUnicodeCharacter{2297}{\\ensuremath{\\otimes}}\n", " \\begin{CJK*}{UTF8}{bsmi}\n", " \\AtEndDocument{\\end{CJK*}}\n", " ''',\n", " }\n", " ```\n", "* Only RTD enterprise offers private documentation\n", "* Other branches such as dev can be built: activate dev under versions in the dashboard. If the repo has tags, stable is built automatically" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### [RTD autodoc](https://www.youtube.com/watch?v=LQ6pFgQXQ0Q)\n", "\n", "* Manual autodoc: [add something like the following to an rst file](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html)\n", " ```\n", " .. automodule:: pyod.models.abod\n", "    :members:\n", "    :undoc-members:\n", "    :show-inheritance:\n", "    :inherited-members:\n", " ```\n", " ```\n", " .. autofunction:: func\n", " ```\n", " ```\n", " .. automethod:: a_method\n", " ```\n", " ```\n", " .. 
autoclass:: a_class\n", "    :members:\n", " ```\n", " * List after `:members:` the methods that should appear in the doc. If left blank, all public methods (names not starting with an underscore) are shown automatically\n", "* Add these two lines to conf.py so that autodoc can find the module" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os, sys\n", "\n", "sys.path.insert(0, os.path.abspath('../..'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* See the [exampy repo](https://github.com/beginnerSC/exampy/blob/master/docs/source/reference.rst) and [doc](https://exampy.readthedocs.io/en/latest/reference.html)\n", "* Alternatively, [add](https://github.com/readthedocs/readthedocs.org/issues/1139) the snippet below to conf.py, and [sphinx-apidoc](https://www.sphinx-doc.org/en/master/man/sphinx-apidoc.html) will scan all scripts automatically and generate rst files (e.g. pycircle.rst)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import sys\n", "from sphinx.ext.apidoc import main\n", "\n", "sys.path.insert(0, os.path.abspath('../..'))\n", "\n", "def run_apidoc(_):\n", "    sys.path.append(os.path.join(os.path.dirname(__file__), '../..'))\n", "    cur_dir = os.path.abspath(os.path.dirname(__file__))\n", "    module = os.path.join(cur_dir, '../..', 'pycircle')\n", "    main(['-e', '-o', cur_dir, module, '--force']) # cur_dir is the output path *.rst will be in\n", "\n", "def setup(app):\n", "    app.connect('builder-inited', run_apidoc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Add to ```extensions``` in conf.py:\n", " * `sphinx.ext.autodoc` \n", " * `sphinx.ext.napoleon` lets autodoc understand [Google and NumPy style docstrings](https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html)\n", " * `sphinx.ext.mathjax` renders math\n", " * `sphinx.ext.viewcode` adds `[source]` links to the API doc generated by autodoc\n", "* ```../../pycircle``` is the path to the module\n", "* Add to an rst file\n", " ```\n", " .. 
toctree::\n", "    :maxdepth: 1\n", "    :caption: PyCircle API Reference\n", "\n", "    pycircle.rst\n", " ```\n", "* [Every script is imported, and therefore executed, while the package is being documented](https://www.youtube.com/watch?v=qrcj7sVuvUA), so keep nothing in a script other than classes, functions, and the ```if __name__ == \"__main__\":``` block\n", "* [sphinx doc](https://www.sphinx-doc.org/en/master/usage/quickstart.html#autodoc) \n", "* [NumPy Docstring Guide](https://numpydoc.readthedocs.io/en/latest/format.html)\n", " * Write math like this\n", " ```\n", " The FFT is a fast implementation of the discrete Fourier transform:\n", " .. math::\n", "    X(e^{j\\omega } ) = x(n)e^{ - j\\omega n}\n", " ```\n", " * Inline also works\n", " ```\n", " The value of :math:`\\omega` is larger than 5.\n", " ```\n", " * In testing, there must be a blank line above Parameters, otherwise the build produces garbage\n", " ```\n", " \"\"\"Perform minimax linkage on a condensed distance matrix.\n", " <--- blank line\n", " Parameters\n", " ----------\n", " dists : ndarray\n", " The upper triangular of the distance matrix. The result of\n", " ``pdist`` is returned in this form.\n", " <--- blank line\n", " Returns\n", " -------\n", " Z : ndarray\n", " A linkage matrix containing the hierarchical clustering. 
See\n", " the ``scipy.cluster.hierarchy.linkage`` function documentation for more information\n", " on its structure.\n", " \"\"\" \n", " ```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### [sphinxcontrib-disqus](https://robpol86.github.io/sphinxcontrib-disqus/usage.html) (not tried)\n", "\n", "* By default RTD [does not support comments](https://docs.readthedocs.io/en/stable/faq.html#i-want-comments-in-my-docs)\n", "* [This RTD site](https://linuxtools-rst.readthedocs.io/zh_CN/latest/) has comments, but it is unclear whether it uses Disqus\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Binder\n", "\n", "* Can be opened with JupyterLab, which also has internet access, so you can push to github from the Terminal\n", "* No persistent local storage is provided, so always commit + push after making changes or they are lost\n", "* One minor drawback: the kernel disconnects after ten minutes of inactivity, but with offlinenotebook this does not matter\n", "* Do not abuse it. See the [Usage Guidelines](https://mybinder.readthedocs.io/en/latest/about/user-guidelines.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Environment Configuration (from the [Binder doc](https://mybinder.readthedocs.io/en/latest/config_files.html))\n", "\n", "* If only python packages are needed, list them in a requirements.txt like this:\n", " ```\n", " numpy\n", " scipy\n", " pandas\n", " ```\n", "* The latest version is installed by default; a version can also be pinned, e.g. ```seaborn==0.11.0```\n", "* Installing the C++ kernel\n", " 1. Copy environment.yml from the [xeus-cling repo](https://github.com/jupyter-xeus/xeus-cling) into this repo\n", " 1. Install boost with conda as well, not apt, [otherwise xeus-cling cannot find it](https://stackoverflow.com/questions/61205040/how-do-i-use-boost-with-the-xeus-cling-jupyter-kernel). See the sandbox-quant configuration\n", " 1. Tried installing fftw with conda, but it does not work: the header can be included, but actual use gives a linking error\n", " 1. Install Eigen with conda too. Before using it, run `#pragma cling add_include_path(\"/srv/conda/envs/notebook/include/eigen3/\")` first, see [here](https://github.com/jupyter-xeus/xeus-cling/issues/107)\n", " 1. Add the required python packages under dependencies\n", " 1. Once environment.yml is present, requirements.txt is ignored automatically (more generally, once a Dockerfile is present, both requirements.txt and environment.yml are ignored)\n", " 1. 
See [here](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) for more details\n", "* Installing R (see [here](https://github.com/binder-examples/r))\n", " 1. Create a ```runtime.txt``` containing the single line ```r-3.6-2019-09-24```. This is a snapshot of an R version on MRAN\n", " 1. Create an ```install.R``` listing the R packages to install (the Box-Cox transformation is in MASS)\n", " ```\n", " install.packages(\"rmarkdown\")\n", " install.packages(\"leaflet\")\n", " install.packages(\"MASS\")\n", " ```\n", " 1. [This repo](https://github.com/jupyterlab/jupyterlab-demo/blob/master/binder/environment.yml) installs it a different way\n", "* JupyterLab extensions, e.g. [toc](https://github.com/jupyterlab/jupyterlab-toc), are installed by creating a postBuild file containing\n", " ```\n", " jupyter labextension install @jupyterlab/toc\n", " ```\n", "* The [Python debugger](https://github.com/jupyterlab/debugger) needs nodejs and the xeus-python kernel installed with conda plus the labextension @jupyterlab/debugger, so both environment.yml and postBuild have to change\n", "* postBuild\n", " * postBuild can run any bash commands\n", " * An [example](https://github.com/deshaw/jupyterlab-execute-time/blob/master/binder/postBuild) of using postBuild to install the execution time labextension and (with bash) enable the feature in the nbextension folder (this is a D. E. Shaw repo)\n", " * To wrap git add, commit, and push into a single command in .bashrc, put this in postBuild:\n", " ```\n", " echo '\n# git add commit and push in one command \n' >> .bashrc\n", " echo 'function cnp() { ' >> .bashrc\n", " echo ' git add * ' >> .bashrc\n", " echo ' git config --global user.name \"beginnerSC\" ' >> .bashrc\n", " echo ' git config --global user.email \"25188222+beginnerSC@users.noreply.github.com\" ' >> .bashrc\n", " echo ' git commit -a -m \"update\" ' >> .bashrc\n", " echo ' git push ' >> .bashrc\n", " echo '} ' >> .bashrc\n", " ```\n", " * push still asks for the GitHub password, though. To avoid typing it every time, always pull with a token (log in with \"private\" at [beginnersc.github.io](https://beginnersc.github.io))\n", " * Installing [gc](https://miscbeginnersc.readthedocs.io/en/latest/cs/python_advanced.html#Making-Command-Line-Commands-Using-Python)\n", " 1. 
`gc` requires `pycrypto` which requires gcc: `apt-get install gcc` and then `pip install pycrypto`\n", " 1. Put gc in the binder folder\n", " 1. `chmod +x $HOME/binder/gc`\n", " 1. `echo 'export PATH=$HOME/binder:$PATH' >> .bashrc`\n", "* It is better not to rewrite environment.yml and postBuild freely from install docs. The most reliable approach is to check whether the kernel you want has a binder link and copy the configuration from that repo\n", "* latex cannot be installed with conda, and installing it after JupyterLab is open runs into permission problems (it would not persist anyway), so use apt.txt, see [binder-examples/latex](https://github.com/binder-examples/latex):\n", " ```\n", " cron\n", " pandoc\n", " dvipng\n", " ghostscript\n", " texlive-fonts-recommended\n", " texlive-generic-recommended\n", " texlive-latex-base\n", " texlive-latex-extra\n", " texlive-latex-recommended\n", " texlive-publishers\n", " texlive-science\n", " texlive-xetex\n", " texlive-lang-chinese\n", " ```\n", "* Compiling Chinese with nbconvert to PDF:\n", " * apt.txt must include texlive-lang-chinese\n", " * nbconvert uses a template when compiling the tex file, and that template has to be modified\n", " * ```find ../../.. -name \"base.tex.j2\" | sort```\n", " * finds ```../../../srv/conda/envs/notebook/lib/python3.7/site-packages/nbconvert/templates/latex/base.tex.j2```\n", " * Go to that folder and edit it with vim, replacing ```\\documentclass[11pt]{article}``` with\n", " ```\n", " \\documentclass{article}\n", " \\usepackage{xeCJK}\n", " ```\n", " * See [this issue](https://github.com/c1mone/Tensorflow-101/issues/4)\n", " * In testing, ctex also renders the date in Chinese while xeCJK does not, so hard-coding xeCJK in the template handles both Chinese and English documents\n", " * Some Traditional Chinese characters have no glyphs, e.g. 「複」 and 「裡」 come out garbled, a big drawback!\n", " * Because the template is not under home, it has to be patched automatically through postBuild so the change is in place whenever JupyterLab launches (see the postBuild in sandbox-stable)\n", "* Binder startup time\n", " * The first launch after each commit is slow because binder builds a Docker image from these config files and pushes it to a Docker registry\n", " * After adding the c++ kernel and a few python packages, a rebuild took half an hour!\n", " * Changing only postBuild and nothing else seems to be faster\n", " * Using two repos, one for content and one for the configured environment, is much faster (details below)\n", "* No manual setup is needed, but for reference the kernels are in ```../../srv/conda/envs/notebook/share/jupyter/kernels```\n", "* The xcpp-related executables are in ```../../srv/conda/envs/notebook/bin```\n", "* Configuring a Dockerfile\n", " * Create a file named Dockerfile (no 
extension) and put it in sandbox\n", " * Once a Dockerfile is present, both requirements.txt and environment.yml are ignored automatically\n", " * Export notebook as PDF, see:\n", " * [The minimal Dockerfile in Binder examples](https://github.com/binder-examples/minimal-dockerfile)\n", " * [nbconvert installation](https://nbconvert.readthedocs.io/en/latest/install.html) mentions which texlive packages are needed\n", " * [JupyterLab installation](https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html)\n", " * [The latex-online Dockerfile](https://hub.docker.com/r/aslushnikov/latex-online/dockerfile)\n", " * [Latex.Online](https://latexonline.cc/) is an online latex compiler that comes with an API\n", " * More latex packages can be found in this Dockerfile if needed\n", " * Copy the minimal configuration from Binder examples, then install only the needed texlive packages from the latex-online Dockerfile\n", " * cron was installed successfully the same way (just add a cron line above pandoc), but without an editor ```crontab -e``` still does not work\n", " * [Jupyter Docker Stacks](https://jupyter-docker-stacks.readthedocs.io/en/latest/) has many ready-made Dockerfiles to learn from when writing your own (GitHub link at the top left of the page)\n", " * [repo2docker can be run locally](https://repo2docker.readthedocs.io/en/latest/usage.html) to generate a Dockerfile, but repo2docker needs Docker installed first, so it cannot run directly on binder (no way found yet to install docker on binder, since that needs root access and [add-apt-repository](https://docs.docker.com/engine/install/ubuntu/), and [postBuild has no root access](https://github.com/jupyterhub/repo2docker/issues/192))\n", " ```\n", " pip install jupyter-repo2docker\n", " jupyter-repo2docker https://github.com/beginnerSC/sandbox-stable\n", " ```\n", " * Learning to write a Dockerfile by hand is still necessary\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Splitting Content and Environment into Two Repos\n", "\n", "* See [this post](https://discourse.jupyter.org/t/how-to-reduce-mybinder-org-repository-startup-time/4956) and the author's [env repo](https://github.com/choldgraf/binder-sandbox)\n", "* Once configured, the environment repo rarely changes, so Binder does not rebuild the image on the first launch after every commit\n", "* The content is loaded into the already started Binder by nbgitpuller\n", "* So nbgitpuller has to be installed via environment.yml, and postBuild has to 
enable it (nbgitpuller is a Jupyter server extension)\n", "* The content repo must not contain config files such as environment.yml, otherwise the env repo breaks the first time it is opened, and committing again to rebuild does not help\n", "* It opens in Classic Jupyter Notebook by default. Opening in JupyterLab needs a very complicated link that the [nbgitpuller docs](https://jupyterhub.github.io/nbgitpuller/link.html) do not explain clearly, but [this person](https://edu.oggm.org/en/latest/user_content.html) figured it out\n", " * For images built after Apr 2021 this no longer works, see [sandbox-test](https://github.com/beginnerSC/sandbox-test) and [sandbox-test1](https://github.com/beginnerSC/sandbox-test1)\n", "* To commit content from the Terminal, cd into the content repo folder first" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### nbgitpuller Can [Pull Private Projects](https://github.com/jupyterhub/nbgitpuller/issues/53)\n", "\n", "* There are two ways. One uses a token, with the drawback that the token appears in the link; the other uses a git proxy, see the discussion linked above\n", "* To build a link with a token, first learn [how git pulls a private project directly with a token](https://github.blog/2012-09-21-easier-builds-and-deployments-using-git-over-https-and-oauth/)\n", "* After logging in to GitHub, generate a token: Settings -> Developer settings -> Generate new token -> store it somewhere safe\n", " * The token is not scoped to a project, so anyone who knows it has read and write access to all private projects\n", " * In testing, a JupyterLab opened with a token does not ask for the username and password again on git push\n", "* The final JupyterLab fast startup link looks like this: https://mybinder.org/v2/gh/beginnerSC/sandbox-stable/master?urlpath=git-pull?repo=https://___TOKEN___@github.com/beginnerSC/___PRIVATE_PROJECT___%26amp%3Bbranch=master%26amp%3Burlpath=lab/tree/___PRIVATE_PROJECT___?autodecode" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### [Binder](https://mybinder.org/) and [Colab Badges](https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb#scrollTo=8QAWNjizy_3O)\n", "\n", "* [App that generates the badge markdown automatically](https://mybinder.org/)\n", "* To open in JupyterLab, append `?urlpath=/lab/tree/docs/source/quick_start.ipynb` yourself\n", "* Any notebook file in a public GitHub repo can be opened in Google Colab\n", "* For `docs/source/quick_start.ipynb` in pyminimax, the links are 
https://mybinder.org/v2/gh/beginnerSC/pyminimax/master?urlpath=/lab/tree/docs/source/quick_start.ipynb and https://colab.research.google.com/github/beginnerSC/pyminimax/blob/master/docs/source/quick_start.ipynb , and the badge markdown is as follows\n", "```\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/beginnerSC/pyminimax/master?urlpath=/lab/tree/docs/source/quick_start.ipynb)\n", "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/beginnerSC/pyminimax/blob/master/docs/source/quick_start.ipynb)\n", "```\n", "* The downside of colab is that each notebook needs its own link, whereas JupyterLab can open a whole folder\n", "* [There is a way](https://stackoverflow.com/questions/55253498/how-do-i-install-a-library-permanently-in-colab) to make colab remember packages without a fresh pip install each time, but it is cumbersome" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cloud JupyterLab Solutions Other Than Binder\n", "\n", "* After Notebooks.ai shut down, only [a few options](https://www.dataschool.io/cloud-services-for-jupyter-notebook/?fbclid=IwAR316JuoHek2bAFjYEgvFH2XIJDtUMoaBNKCml7nusaZkOB0oTTKSnymeu0) remain\n", "* Binder is the best of the free options; you just have to learn how to configure the environment\n", "* Binder is not really designed to be used this way, so it is worth working out how to install JupyterLab on Google Colab\n", " * The serveo service mentioned in this [post](https://medium.com/@swaroopkml96/jupyterlab-and-google-drive-integration-with-google-colab-42a8d64a9b63) no longer works, but [ngrok](https://voila.readthedocs.io/en/stable/deploy.html#sharing-voila-applications-with-ngrok) seems to be a usable replacement\n", "* private project solution: \n", " * CoCalc\n", " * Can be opened with JupyterLab (open settings; it is at the bottom right)\n", " * Connecting to github, publishing, and pip install all need internet access, which is a paid feature\n", " * Using git locally in the cocalc Terminal does work, though\n", " * The only problem: if cocalc ever shuts down, files can only be downloaded one by one (or would downloading the .git folder be enough?)\n", "* public project solutions: \n", " * Gitpod\n", " * Prefix the url of a github repo with ```https://gitpod.io/#``` to open the project\n", " * Right-click an ipynb file and choose Open in Notebook Editor. After editing, stage + commit on the left, then push to github on the right. This editor is not as good as JupyterLab, though\n", " * Free accounts are limited to 50 hours of use per month\n", " * 
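Paid members get 100 hours of editing time per month and can open private github projects\n", " * gitpod checks whether the project is public when opening it, but not when committing after edits\n", " * Kyso (read-only)\n", " * Works like a blog where readers can comment. Kyso renders notebooks (nicely) and can connect to public/private github projects with automatic sync, but notebooks cannot be edited directly in Kyso\n", " * In testing, it stays in sync with GitHub\n", " * This makes sense because of the webhook (GitHub Repo -> Settings -> Webhooks); a commit triggers the Kyso sync automatically\n", " * Choose all repos when authorizing in order to connect private projects\n", " * Kyso keeps forgetting which file is the main file (and then there is no Files tab to click and no notebook is visible). Adding kyso.yaml does not help. It was finally solved by pasting the Kyso link of the README directly as the link in the README\n", " * Once inside, open the Files tab to see all notebooks\n", " * Code Hidden/Shown\n", " * Every notebook opens with Code Hidden by default, so no input cell is visible. Code Shown can be selected at the top right\n", " * Share a link like this so that the recipient sees the code when opening it: https://kyso.io/beginnerSC/misc/file/Piano.ipynb#code=shown\n", " * Kyso cannot display raw cells, but raw cells are best avoided anyway. nbviewer can display them, but the layout gets messed up\n", " * nbviewer is still needed as a complement because:\n", " * On phones and tablets neither the Files tab nor the Code Hidden/Shown menu is displayed, so other notebooks cannot be reached from the README link\n", " * Images pasted directly into an input cell are not displayed (not on GitHub either), e.g. the images in [Backprop.ipynb](https://nbviewer.jupyter.org/github/beginnerSC/misc/blob/master/Backprop.ipynb), which are visible only in nbviewer\n", " * For an image to be visible in Kyso it has to be saved as an image file, as in [Theory.ipynb](https://nbviewer.jupyter.org/github/beginnerSC/misc/blob/master/Theory.ipynb)\n", " * nbviewer (read-only)\n", " * Drawback: it is not real time; there is reportedly a 10-minute delay (is it really only 10 minutes?). Manually appending ```?flush_cache=true``` or ```?flush_cache=True``` to the url sometimes helps (browser dependent)\n", " * The urls in the readme are\n", " * nbviewer: https://nbviewer.jupyter.org/github/beginnerSC/misc/tree/master/\n", " * Kyso: https://kyso.io/beginnerSC/misc/file/README.md\n", " * Gitpod: https://gitpod.io/#https://github.com/beginnerSC/misc\n", " * JupyterLab (Binder, launch from this repo): https://mybinder.org/v2/gh/beginnerSC/misc/master?urlpath=lab\n", " * Dropping the trailing ```?urlpath=lab``` from the JupyterLab url gives a plain binder link that opens in Jupyter Notebook\n", " * Explained [here](https://github.com/binder-examples/jupyterlab)\n", " * The environment.yml of this repo has been deleted. It still opens, but not even numpy is available\n", " * Classic 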
Jupyter Notebook (fast startup): https://mybinder.org/v2/gh/beginnerSC/sandbox/master?urlpath=git-pull?repo=https://github.com/beginnerSC/misc\n", " * JupyterLab (fast startup): https://mybinder.org/v2/gh/beginnerSC/sandbox/master?urlpath=git-pull?repo=https://github.com/beginnerSC/misc%26amp%3Bbranch=master%26amp%3Burlpath=lab/tree/misc?Fautodecode\n", " * The %26amp%3B is the encoding of &, but the address bar does not recognize & there, so %26amp%3B has to be written out\n", "* public/private project solution\n", " * Code Ocean \n", " * Can be opened with JupyterLab, which also has internet access, so you can push to github from the Terminal\n", " * Required packages such as numpy, scipy, and pandas can be installed in the env\n", " * Can only import from a public github repo (to import a private repo, make it public for a moment)\n", " * The only drawback is the limit of ten hours of compute time per month\n", " * Binder\n", " \n", " " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" }, "toc-autonumbering": false, "toc-showmarkdowntxt": false }, "nbformat": 4, "nbformat_minor": 4 }