🐮 Is a milk dud just an udder failure?
Late to the Party 🎉 is about insights into real-world AI without the hype.
Hello internet,
What a week of announcements. Announcements and failures! Often a theme emerges when I collect my favourite links, and this issue sure is filled with learnings and failures. Let’s dive right in!
The Latest Fashion
- Google lost $100 billion in stock value over factual mistakes in a half-baked ChatGPT competitor announcement
- In a terrifying display, developers built a police sketch app on top of DALL-E 2, reinforcing unethical biases
- “A Categorical Archive of ChatGPT Failures” documents problems even from the best model we currently have
Got this from a friend? Subscribe here!
My Current Obsession
I am a huge fan of RSS. Just like this newsletter, there is no algorithm. Just a way to communicate updates. Since the official Python organiser RSS feed for conferences is deprecated, I added RSS to pythondeadlin.es. Very proud of that, and I also added direct links to each CfP, trying to make this website as useful as I can.
Thing I Like
You know when a realization just changes your entire view on an activity or fact? Well, I train at a normal commercial gym, which barely has a deadlift platform. Loading up plates on the sides is a hassle and I was sad I didn’t have the fancy helpers I had in my Edinburgh gym. The moment I realised I can just buy a Deadlift Wedge and put it in my gym bag… My mind was absolutely blown. Silly, I know, but it’s true.
Hot off the Press
I don’t know why it feels like time is slipping through my fingers. But I hope one day I can write more again. I have so many ideas floating in my head, but they’re so ephemeral.
In Case You Missed It
With the changes to the pricing of the Twitter API, I reposted my TIL, complete with code examples, about switching from the Twitter API to retrieving Mastodon posts. It’s currently the most visited post on my website.
Machine Learning Insights
Last week I asked you to name a few examples of kernels in an SVM or Gaussian Process, and here’s the gist of it:
Support Vector Machines (SVM) and Gaussian Process (GP) algorithms use kernels to compare data points as if they had been mapped into a higher-dimensional feature space, without ever computing that mapping explicitly (the kernel trick). Here are some frequently used kernels:
The most commonly used kernel is the RBF kernel. This is a radial function: its value depends only on the distance between two points. It’s the default in scikit-learn’s SVC() class.
Then there’s the linear kernel, a straightforward dot product of the inputs, and its extension, the polynomial kernel, which raises that product to a given degree.
Note: The problem and the data determine which kernel to use. SVM and GP algorithms will perform and generalize differently depending on the kernel used.
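If you want to see what these kernels actually compute, here is a minimal sketch in plain Python. The function names are my own for illustration, and the parameter names (gamma, degree, coef0) simply mirror the ones scikit-learn uses; the default values here are assumptions picked for readability, not scikit-learn’s defaults.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the dot product of the two input vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=0.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - y||^2), depends only on the distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


x = [1.0, 2.0]
y = [3.0, 4.0]

print(linear_kernel(x, y))                            # 11.0
print(polynomial_kernel(x, y, degree=2, coef0=1.0))   # 144.0
print(rbf_kernel(x, x))                               # 1.0, identical points
print(rbf_kernel(x, y))                               # close to 0, distant points
```

In scikit-learn itself you would not write these by hand: SVC accepts kernel="linear", kernel="poly" or kernel="rbf" directly.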
Data Stories
There’s a certain power behind a persuasive data visualization.
Unfortunately, we can reinforce existing biases with simple visualizations. Depending on the framing of comparison, we can accidentally attribute a deficit to race. Framing leads to blaming.
This article goes into great detail on how comparing social inequalities can make things worse. But it also shows very interesting ways to avoid reinforcing stereotypes and to make better visualizations that tell a fair story. A beautiful example is showing variability within data, contrary to some commonly used advice to generally use line and bar charts.
Source: Nightingale DVS collage (individual credit: Brookings, NCES, Wikipedia, CDC, The Atlantic, Vox, CNN Money, Wikipedia, McKinsey, Economic Policy Institute, Economic Policy Institute, US Census, US Sentencing Commission, CDC, Federal Reserve, CNN).
Question of the Week
- How can you normalize data with outliers?
Post your answers on Twitter and tag me. I'd love to see what you come up with. Then I can include them in the next issue!
Tidbits from the Web
- ChatGPT can create crochet patterns. Trying them is hilarious.
- You can calculate “how deadly” an activity is, and a marathon has more micromorts than scuba diving.
- Hydrogen burns clear, so NASA developed a high-tech detector, namely sticking out a broom.
Jesper Dramsch is the creator of PythonDeadlin.es, ML.recipes, data-science-gui.de and the Latent Space Community.
I laid out my ethics, including my stance on sponsorships, in case you're interested!