Data Ownership

Who owns the data about you? Google Maps has tracked nearly every location you’ve been to in the last decade, but you cannot query that history in any way. Have you ever wanted to know: “How many times have I been to the gym this week?” or “How many days did I spend in California this year?” You…

How to avoid overwriting files in Python

Strategy 1: increment a suffix, like “file.pdf(1)”. There are two versions. The simple version is safe if only one process writes to the location, but it contains a race condition when several processes do. The safer version guards against that race condition by actually “opening” the file. Strategy 2: attach a timestamp.
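
As a rough sketch of the two strategies described above (the function names `next_free_path`, `open_unique`, and `timestamped` are made up for illustration and are not taken from the original post), the suffix approach can be written with a plain existence check or, more safely, with the exclusive-creation mode `"x"` of `open`, which raises `FileExistsError` instead of truncating an existing file:

```python
import os
from datetime import datetime


def next_free_path(path):
    """Simple version: return the first unused name, e.g. 'file.pdf(1)'.

    Not safe with several writers: another process can create the file
    between this check and the caller's later open().
    """
    candidate, n = path, 0
    while os.path.exists(candidate):
        n += 1
        candidate = f"{path}({n})"
    return candidate


def open_unique(path):
    """Safer version: create the file as part of claiming the name."""
    candidate, n = path, 0
    while True:
        try:
            # Mode "x" fails with FileExistsError if the file already exists,
            # so checking and creating happen in a single step.
            return open(candidate, "x")
        except FileExistsError:
            n += 1
            candidate = f"{path}({n})"


def timestamped(path):
    """Strategy 2: put a timestamp (to the microsecond) before the extension."""
    root, ext = os.path.splitext(path)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    return f"{root}_{stamp}{ext}"
```

A timestamp alone can still collide if two writers pick the same instant, so one option is to pass the timestamped name through `open_unique` as well.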

How to use the Python rank-bm25 library

Note: this library is called rank-bm25 on PyPI and NOT bm25. The official docs (GitHub) show the basic usage. However, you often want to clean the corpus (lowercase, remove punctuation) before indexing it. Once you do that, you must keep the original corpus around so that results can index back into the original, unedited strings. This works for most cases! Note,…
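
A minimal sketch of the clean-then-index idea, assuming the package is installed with `pip install rank-bm25`. The helpers `clean` and `search` and the sample `corpus` are illustrative names, not part of the library; `BM25Okapi` and `get_top_n` are the library's own. The index is built from the cleaned tokens, while `get_top_n` is handed the original list so it returns the unedited strings:

```python
import string

# Sample documents (hypothetical); these are the original, unedited strings.
corpus = [
    "Hello there, good man!",
    "It is quite WINDY in London.",
    "How is the weather today?",
]


def clean(text):
    """Lowercase, strip ASCII punctuation, and split on whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def search(documents, query, n=1):
    """Return the n best-matching documents in their original form."""
    # Third-party import kept local so clean() is usable without the package.
    from rank_bm25 import BM25Okapi

    # Index the cleaned tokens, one token list per document.
    bm25 = BM25Okapi([clean(doc) for doc in documents])
    # The query gets the same cleaning; the original documents are passed
    # in so that results map back to the unedited strings.
    return bm25.get_top_n(clean(query), documents, n=n)
```

With the sample corpus, `search(corpus, "windy london?")` should return the original London sentence with its capitalization and punctuation intact.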