Day 1, 2025-04-23
What Django needs from you, Sarah Boyce, Video
- Django needs you
- Attention is finite, there isn't enough to go around
- 300+ open PRs in the Django project
- 98% of contributors don't return
- Quickstart guide to PR review:
- Specialize in one area of Django, build on this knowledge instead of context switching
- Code is like an asset, like a house. Is it solid?
- Check the documentation
- Be aware of your attention footprint
- Question: how to encourage code reviews, incentivize them?
End-to-end testing, Jacob Rief, Innsbruck University, Video
- Used to use Selenium, found Playwright in 2021
- pytest uses "fixtures" ☛ nothing to do with Django fixtures
- HTMX testing: use end-to-end tests
- Playwright was started by developers who previously built Puppeteer (described in the talk as a fork of Puppeteer)
- Combine unit and end-to-end testing ☛ Why do Django docs recommend unittest and many people recommend pytest?
- Playwright can take screenshots
- See playwright codegen for recording tests
Turn back time, Tim Bell, Kraken, Video
- "32 bits is enough, I say"
- Use bigint instead of integer
- Kraken: energy retail software
- Convert the payment field from Int to BigInt
- 3-phase migration: new column, add constraints, swap column in atomic transaction
- Converting money fields -- "having a lock is important" cf. 2023 talk
- Time-consuming: backfilling shadow columns
- Add NOT NULL constraint without locking, then validate constraint separately
- Swap out columns with ALTER TABLE
- SeparateDatabaseAndState to alter Django's view of the models
- Took 5 months and 104 PRs! *sobbing octopus*
- Can also be used for primary keys
- Backfilling is slow, triggers vacuuming, done via cron job
- See also: Datadog monitoring
- Possibility: using shadow tables instead of a shadow column
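A signed 32-bit integer column tops out at 2**31 - 1 = 2,147,483,647, which is why the shadow-column route above exists. A minimal sketch of the mechanics follows, assuming SQLite (3.25+ for RENAME COLUMN) as a stand-in for the production database. The table and column names (payment, amount, amount_big) are hypothetical, and the constraint phase is left out because SQLite cannot add constraints through ALTER TABLE. It is an illustration of batching and swapping, not the migration described in the talk.

```python
import sqlite3

# isolation_level=None: every statement autocommits unless we BEGIN explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE payment (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO payment (amount) VALUES (?)",
    [(100,), (250,), (999,), (42,), (7,)],
)


def backfill_in_batches(conn, batch_size=2):
    """Copy amount into the shadow column a few rows at a time."""
    batches = 0
    while True:
        cur = conn.execute(
            "UPDATE payment SET amount_big = amount "
            "WHERE id IN (SELECT id FROM payment "
            "WHERE amount_big IS NULL LIMIT ?)",
            (batch_size,),
        )
        if cur.rowcount == 0:  # nothing left to copy
            return batches
        batches += 1


# Phase 1: add a nullable shadow column, then backfill it in small batches
# so that no single statement holds a lock for long.
conn.execute("ALTER TABLE payment ADD COLUMN amount_big INTEGER")
batches = backfill_in_batches(conn)

# Phase 2 (add and validate constraints) is omitted in this SQLite sketch.

# Phase 3: swap the columns inside one explicit transaction.
conn.execute("BEGIN")
conn.execute("ALTER TABLE payment RENAME COLUMN amount TO amount_old")
conn.execute("ALTER TABLE payment RENAME COLUMN amount_big TO amount")
conn.execute("COMMIT")

rows = conn.execute("SELECT id, amount, amount_old FROM payment ORDER BY id").fetchall()
columns = [r[1] for r in conn.execute("PRAGMA table_info(payment)")]
print(batches, columns)
```

In a live system, rows written between the backfill and the swap would also need to reach the shadow column (for example via dual writes); that part is not shown here.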
Data-oriented Django Drei, Adam Johnson, Video
- Use APM for measurements
- 60% of time spent in database access, mostly reads, not writes
- O(n) search is default
- "Indexes make the world go around", ☛ "indices"?
- B+ tree search runs in O(log n)
- B+ tree has side-links between leaf nodes
- Partial indexing is possible
- Covering ("inclusion") indexes store selected extra non-key columns alongside the indexed ones
- Alternative index data structures: GiST, GIN, hash, Bloom, HNSW ☛ cf. pgvector
- Default indexes on primary key, foreign key, unique constraint
- You can replace a default index
- Indexes add overhead to write, take storage
- Indexes are an art, not a science
- By design: ask questions, target filtered columns
- By debugging: target slow queries
- pgMustard -- a tool for looking at queryset.explain JSON output
- PlanetScale B-trees post, ☛ cf. the time-sortable uuidv7
- use-the-index-luke.com
- PostgreSQL docs Chapter 11
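The difference between a full scan and an index lookup can be seen by asking the database for its query plan. A small sketch, assuming SQLite and its EXPLAIN QUERY PLAN statement as a stand-in for PostgreSQL and queryset.explain; the orders table, its columns and the index names are made up for illustration.

```python
import sqlite3


def plan(conn, sql):
    """Return the planner's description of each step of the query."""
    # The human-readable detail is the last column of every plan row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(i % 100, "open" if i % 10 == 0 else "closed") for i in range(1000)],
)

query = "SELECT * FROM orders WHERE customer_id = 42"

# Without an index the planner has to read every row: O(n).
before = plan(conn, query)

# With an index on the filtered column it can search the tree: O(log n).
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(conn, query)

# A partial index only covers rows matching its WHERE clause, so it is
# smaller. Whether it is chosen for a given query is up to the planner.
conn.execute(
    "CREATE INDEX idx_orders_open ON orders (customer_id) WHERE status = 'open'"
)
partial = plan(conn, "SELECT * FROM orders WHERE customer_id = 42 AND status = 'open'")

print(before)
print(after)
print(partial)
```

The exact wording of the plan text differs between SQLite versions, so it is best read by eye rather than parsed.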
Workshop, Daniele Procida, Canonical
- Aim: engineering quality at scale -- define, measure and elevate
- Scale: 1,000+ people
- Define desired quality, don't necessarily turn it into quantity
- Example use case for release notes
- Break down a quality into objective criteria, "quality objectives"
- Find an aim, then find "started", "first results", "mature results" criteria
- Create a quality dashboard, cf. Canonical dashboard
- Make an agreement template with commitments
- Summary: Principle, Tool, Method
- Peer pressure and objectification: use them positively
How to solve a Python mystery, Aivars Kalvāns, Video
- TietoEvry, Ebury
- Brendan Gregg - Linux performance observability tools
- Restoring deleted log files: unlink will not remove files open in a process
- cat /proc/.../fd/... to view file contents ☛ see also lsof
- strace traces system calls to the kernel
- strace does not require root access, nor does it need to be installed with sudo
- System call cheatsheet
- connect/read/recvfrom: no timeout, which is dangerous
- Can report status of all threads, e.g. deadlock
- nmon, iostat -x
- Hardware can be at fault
- Testing network connections -> see tcptraceroute
- TCP_NODELAY - cf. Nagle's algorithm
- tcp_keepalive_time has default of 2 hours in Linux ☛ cf. RFC1122
- libkeepalive
- Service mesh: maybe not needed, maybe overkill
- ~40ms magic number
- "BPF trace"
- Linux auditd - audit package with user space auditing tools
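The notes on missing timeouts, TCP_NODELAY and keepalive translate to a few socket options in Python. A minimal sketch using only the standard library: a local listener that never answers stands in for a hung service, and the 0.5 s timeout and 60 s keepalive idle time are arbitrary values chosen for illustration.

```python
import socket

# A local listener that completes the TCP handshake but never sends data,
# standing in for a peer that has hung.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

# The timeout applies to connect() and to later blocking calls such as recv().
client = socket.create_connection((host, port), timeout=0.5)

# Send small writes immediately instead of waiting for Nagle's algorithm.
client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Enable keepalive probes; shorten the idle time where the platform allows it
# (TCP_KEEPIDLE is not available everywhere, hence the guard).
client.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
if hasattr(socket, "TCP_KEEPIDLE"):
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)

timed_out = False
try:
    client.recv(1024)  # would block forever without a timeout
except socket.timeout:
    timed_out = True
finally:
    client.close()
    server.close()

print("recv timed out:", timed_out)
```

Without the timeout argument the recv call would simply hang, which is the kind of stuck process that strace makes visible.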
Bulletproof Data Pipelines, Ricardo Morato Rocha, Vinta Software, Video
- A thought experiment to understand parallelization: a video model with multiple frames called Djangovids
- Celery handles unavailability and retries tasks
- Django model "FrameEnhancementExecution" which stores a task as an execution model, including its argument and status
- Take advantage of idempotency to ensure reliability and repeatability
- Clarify chain of responsibility
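The idea of storing each task as an execution record and relying on idempotency can be sketched without Django or Celery. In the sketch below, an in-memory dict stands in for the database table and a plain loop stands in for Celery's retries. Only the name FrameEnhancementExecution comes from the talk; its fields, the Status values and the helper functions are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class FrameEnhancementExecution:
    """Hypothetical execution record: one row per frame to enhance."""
    frame_id: int
    status: Status = Status.PENDING
    attempts: int = 0
    result: str = ""


executions = {}          # frame_id -> execution record (stand-in for a table)
calls = {"count": 0}     # how often the expensive work actually ran


def flaky_enhance(frame_id):
    """Pretend enhancement service that is unavailable on the first call."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("service unavailable")
    return f"enhanced-{frame_id}"


def enhance_frame(frame_id):
    """Idempotent task: a frame that already succeeded is never redone."""
    execution = executions.setdefault(frame_id, FrameEnhancementExecution(frame_id))
    if execution.status is Status.SUCCEEDED:
        return execution.result
    execution.attempts += 1
    try:
        execution.result = flaky_enhance(frame_id)
    except ConnectionError:
        execution.status = Status.FAILED
        raise
    execution.status = Status.SUCCEEDED
    return execution.result


def run_with_retries(frame_id, max_retries=3):
    """Stand-in for a task queue retrying a failed task."""
    for _ in range(max_retries):
        try:
            return enhance_frame(frame_id)
        except ConnectionError:
            continue
    raise RuntimeError(f"gave up on frame {frame_id}")


first = run_with_retries(7)   # fails once, then succeeds
second = run_with_retries(7)  # already done: returns the stored result
print(first, second, executions[7].attempts, calls["count"])
```

Because the record carries the status, running the task again after a crash or a duplicate delivery is harmless, which is what makes blind retries safe.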
Logs, shells, caches and other strange words, Slawa Gladkov, Zyklum, Video
- The word "logbook" comes from navigational data, ships
- ENIAC had handwritten logs
- Expression "throwing a log"
- Cache, French "cacher" = to hide
- Squirrels "cache" nuts
- English meaning of cache now also used in French
- Use of the word "daemon" due to Fernando Corbató, 1990 Turing Award winner
- "Ping": ftp.arl.mil/~mike/ping.html
- "Shell" ca. 1965 and copied by Unix
- "RC" comes from "RUNCOM", a command processor
- Edison came up with "bug", by intuition
Lightning Talks, Video
- Aivars Kalvāns: Celery and Redis not ideal, cf. django-taskq
- Caddy, CaddySnake for embedding Python apps in Caddy
- PyCon Italia in Bologna, 2025