🦙 Pride is a great time to be an Ally-paca
Late to the Party 🎉 is about insights into real-world AI without the hype.
Hello internet,
it’s the weekend! Let’s celebrate with some machine learning!
The Latest Fashion
- A catalog of transformer models and how they work
- 13 Freelance developer portfolios to inspire you
- In defense of prompt engineering by Simon Willison
Got this from a friend? Subscribe here!
My Current Obsession
I got a bit obsessed with certifications this week and created a webpage that lists all the different certifications I’ve earned. I will be taking the Mental Health First Aider certification this month to be even better prepared for community building. That’s quite exciting.
Recently, I have been listening to the podcast “Tech Won’t Save Us”, which has been a refreshing break from the yelling match about AI.
Finally, I got off my butt and went back to climbing! I have kind of fallen off the wagon for any sort of movement, which was pretty devastating. So feeling my arms tired while I type this is just lovely.
Yesterday I went to an improv show as well. So overall a really eventful week!
This weekend I will finally edit my new Skillshare class on ChatGPT for creatives and content creators. Want a sneak peek?
Thing I Like
I bought a big hammock / swinging chair for my balcony, and it’s been lovely to chill outside during the hot times.
Hot off the Press
I published another short! Do you actually need a GPU for ML?
I wrote a thing for pythondeadlin.es about where to get your Python conference information, because there are two other really valuable resources!
In Case You Missed It
My Notion tasklist hack for ADHDers has become quite popular again!
Machine Learning Insights
Last week I asked, “In the context of prompt engineering, what are some potential ethical concerns that may arise when using language models, and how would you address them?” Here’s the gist of it:
Prompt engineering plays a crucial role in using language models effectively. However, we need to be aware of and address potential ethical concerns associated with their usage. Here are a few key concerns and possible mitigation strategies:
Bias and Fairness
Language models learn from the data they are trained on, which can inadvertently contain biases. This can lead to biased or unfair outputs, perpetuating societal biases or stereotypes. To address this, it’s important to carefully curate training data, ensure diverse representation, and regularly evaluate and mitigate biases in the models’ responses. In our prompts, we can specifically prime the model to include more diverse viewpoints, as sketched below.
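If you want to try that priming yourself, here’s a minimal Python sketch of what it could look like. The helper name and the exact wording are my own illustration, not a fixed recipe; adapt them to whichever model API you actually use:

```python
# A minimal sketch of priming a prompt for more diverse viewpoints.
# build_primed_prompt() and its wording are illustrative assumptions,
# not a standard API; pass the result to your LLM of choice.

def build_primed_prompt(question: str) -> str:
    """Wrap a question with instructions that counteract one-sided answers."""
    return (
        "Answer the following question from at least three distinct "
        "perspectives, including viewpoints from groups that are often "
        "underrepresented. Flag any assumptions you make.\n\n"
        f"Question: {question}"
    )

prompt = build_primed_prompt("Who benefits most from remote work?")
print(prompt)  # send this string to your model instead of the raw question
```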
Misinformation and Manipulation
Language models can generate responses that appear factual but may be inaccurate or misleading. There is a risk of malicious actors exploiting this capability to spread misinformation or manipulate users. To tackle this, it is crucial to thoroughly fact-check the output.
Privacy and Data Security
Language models require vast amounts of data for training, which may include personal or sensitive information. It's essential to handle data responsibly, implement strong data privacy measures, and obtain appropriate consent when necessary. Models should also be designed to avoid unintentionally disclosing sensitive information in their responses. As users, we need to be aware that the data we provide may be used by the app, so disclosing personally identifiable information or confidential data is ill-advised.
Unintended Consequences
Language models can generate outputs that may have unintended consequences or implications. This includes generating harmful content, reinforcing harmful beliefs, or amplifying negative emotions. As the users of these models, we have to keep in mind what we’re using large language models for and remember to make the world a better place with what we do.
It is important to encourage users to provide feedback on the generated outputs, report any issues, and continuously update and refine the language models to improve performance and mitigate unintended consequences.
Data Stories
This visualization of the embedding space of a vision transformer is quite fun to watch.
We can already see some good separation due to the pre-training.
Then the embeddings move around during the fine-tuning process.
We are watching a 2D PCA of the actual embeddings, and I wonder if there is some extra separability in a third dimension.
Source: Medium. Check out the full description!
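If you’re curious about that third dimension too, here’s a rough Python sketch of the same idea with scikit-learn. The random vectors and their shape are just stand-ins for real vision-transformer embeddings:

```python
# A rough sketch of the 2D PCA view described above, with random vectors
# standing in for real vision-transformer embeddings (assumed shape:
# 500 samples x 768 dimensions).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
embeddings = rng.normal(size=(500, 768))  # placeholder for ViT embeddings

pca = PCA(n_components=3)
projected = pca.fit_transform(embeddings)  # columns 0 and 1 give the 2D view

# The third component's explained variance hints at how much separability
# the flat 2D picture might be hiding.
print(pca.explained_variance_ratio_)
```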
Question of the Week
- How would you communicate a machine learning solution to subject matter experts?
Post your answers on Mastodon and tag me. I’d love to see what you come up with. Then I can include them in the next issue!
Tidbits from the Web
- Physics changed this week: we can now measure the gravitational wave background (GWB). Let Hank Green explain it!
- This paper outlines how the GWB could enable evaluating hypotheses about new physics.
- I’ve been playing Connections daily, a little word-association puzzle from the NYT.
Jesper Dramsch is the creator of PythonDeadlin.es, ML.recipes, data-science-gui.de and the Latent Space Community.
I laid out my ethics, including my stance on sponsorships, in case you're interested!