If I shenan once, I will shenanigan
In this issue, we'll have a weather ML Benchmark, learning Python, and graph data augmentation. A lot of personal updates, including a Nature paper and a talk at PyCon Germany. Lots of new content and why sentiment analysis across language barriers is so difficult.
Late to the Party is about insights into real-world AI without the hype.
Hello internet,
Finally, another newsletter issue! Unfortunately, I've been sick for most of the year, so talk about a rough start.
So let's dive right in!
The Latest Fashion
- The WeatherBench benchmark was completely updated and upgraded!
- You can learn Python with VS Code in the browser!
- Grafog makes data augmentation for graphs in PyTorch Geometric easy
Worried these links might be sponsored? Fret no more. They're all organic, as per my ethics.
My Current Obsession
It's been a while, so I have to bust out some sub-headings!
Nature Geoscience Publication!
I'm absolutely thrilled to share that our Comment in Nature Geoscience has been published! This has been such a journey, and I couldn't be prouder of what we accomplished. Getting through the review process was challenging, but seeing our work finally published makes it all worthwhile. It was exciting to work with such a diverse group of co-authors, from the European Commission and Fraunhofer to Pinterest and LinkedIn (and more!). You can find my research on my homepage as well.
Finnish Adventures and Aurora Magic
I recently took a vacation to Finland, and wow - the aurora borealis was truly magical! I've seen pictures and videos before, but nothing compares to witnessing those dancing lights across the night sky in person. If you ever get the chance to see the northern lights, drop everything and go. It's an experience that defies description.
I also stepped outside my comfort zone and went ice karting, which was incredibly fun. It basically felt like real-life Mario Kart, where I just drifted around every corner. Such a blast!
I was so inspired, I even made a few TikToks!
From Fairly Fit to Couch Potato
On a less exciting note, I've been battling the flu for three weeks now, and it's been brutal. It's amazing how quickly your body can change - one day I'm spontaneously running 10km in -10°C weather, and the next I'm barely making it up the stairs.
Being sick really makes you appreciate good health! Can't wait to get back on my feet and move again.
PyCon Germany Acceptance!
Some exciting news to end on: my talk at PyCon Germany has been accepted! I'll be presenting "Going Global: Taking code from research to operational open ecosystem for AI weather forecasting." The talk will cover our journey with Anemoi, which grew from experimental code by a small team to a robust ecosystem supporting 40+ developers across multiple international weather agencies.
I'll share our experiences scaling both the team and codebase, including how we transformed from research notebooks into a structured ecosystem of packages, managed over 300 configuration options with Hydra, and integrated modern ML tooling with traditional meteorological systems. I'll also dive into practical challenges like model sharding for global weather predictions, implementing flexible grids for regional services, and managing CI/CD across multiple packages.
If you're attending PyCon Germany, I'd love to see you there! If you're interested in how AI is transforming weather forecasting, I promise to share some valuable insights from the frontier of operational ML systems.
Thing I Like
In Finland, I went for a lot of runs and hikes, and having ice cleats / crampons really helped. I didn't have them on one day and fell hard twice. That was rough.
Hot off the Press
I wrote an article about explainable AI in geoscience and its impact on natural disaster management and climate change.
Or maybe you'd enjoy this balanced approach to generative AI?
I made two videos on TikTok. Due to the TikTok ban in the US, people felt the desire for #EuroTok and #WorldTok. I shared two videos from Finland, which was very fun. The first video was of the Finnish snow and the second was of one of my runs!
In Case You Missed It
I'm trying to get my life back on track, so my popular blog post about an ADHD-friendly Notion task list might even be helpful to me.
On Socials
People on Mastodon enjoyed these deep learning puzzles. Bluesky liked Aquarel, which makes matplotlib theming easier. LinkedIn had some fun with "Friends don't let friends" for don'ts in data viz.
Chris Albon asked for ML recommendations on Bluesky, so I reshared my fairly popular starter pack.
I told the story of a Canadian influencer claiming "Prussian heritage", which was quite popular. However, my LinkedIn posts started getting a ton of AI-generated comments, which is fairly annoying.
Python Deadlines
The PyCon Austria CFP is closing soon!
I found these conferences: PyCon South Korea, DjangoCon Japan, Swiss Python Summit, and PyCon Sweden, plus new CFPs for PyCon Colombia and Python Brasil. And I got a lovely PR for PyData London 2025!
Machine Learning Insights
Last week I asked, "What are the key considerations in using NLP models for sentiment analysis in diverse languages, and how do you address cultural nuances?", and here's the gist of it:
When we use AI to understand how people feel in their writing (sentiment analysis), things get tricky when working with different languages. Let's break down what makes this challenging and how we can make it work better.
Language Structure Matters
Languages are built differently:
- In Turkish or Finnish, words get built by adding pieces together
- In Chinese, word order matters more than word endings
- In Russian, words change form based on how they're used
These differences mean we can't just translate everything to English and expect our AI to understand the feelings correctly. Fine nuances can get lost or flip entirely, for example when someone is being sarcastic.
Writing Systems Create Challenges
Different writing systems add more complications:
- Thai doesn't put spaces between words
- Vietnamese uses special marks above letters that change meaning
- Many languages don't use the Latin alphabet
This alone can mean that the classical "tokenizers" we use so easily in English don't work out of the box.
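To make the writing-system points concrete, here is a minimal sketch using only the Python standard library. The helper names (`whitespace_tokenize`, `strip_diacritics`) and the example sentences are my own illustrations, not part of any particular NLP library. It shows a naive whitespace tokenizer treating a Thai sentence as one token, and a common "cleanup" step erasing the Vietnamese tone marks that distinguish words.

```python
import unicodedata


def whitespace_tokenize(text: str) -> list[str]:
    """Naive tokenizer that splits on whitespace, as is often done for English."""
    return text.split()


def strip_diacritics(text: str) -> str:
    """Remove combining marks: a popular normalization that is lossy for Vietnamese."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


english = "I love this movie"
thai = "ฉันรักหนังเรื่องนี้"  # roughly "I love this movie", written without spaces

english_tokens = whitespace_tokenize(english)
thai_tokens = whitespace_tokenize(thai)

# Vietnamese "ma" (ghost), "má" (mother) and "mà" (but) differ only by tone marks.
vietnamese_words = ["ma", "má", "mà"]
collapsed = {strip_diacritics(word) for word in vietnamese_words}

print(len(english_tokens))  # four separate words
print(len(thai_tokens))     # the whole sentence comes back as a single token
print(collapsed)            # all three words collapse into the same string
```

In practice you would reach for a tokenizer trained on the target language (or a multilingual subword tokenizer) rather than patching the whitespace approach.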
Cultural Meanings Run Deep
This is the biggest challenge! The same words or phrases can mean very different things in different cultures:
- Some cultures express happiness in understated ways (in German the highest compliment is "Can't complain")
- Some use metaphors that make no sense when translated
- The strength of feelings (how positive or negative) varies wildly between cultures
For example, what sounds polite in Japanese might seem cold in English, or what's enthusiastically positive in American English might seem over-the-top in other cultures.
How We're Solving These Problems
Smarter AI Models
- Newer AI models (like XLM-RoBERTa) can work with many languages at once
- Some models can even work with languages they weren't explicitly trained on
Adding Cultural Awareness
- Having diverse teams who understand different cultures create the training data
- Creating specific guidelines for each language and culture
- Directly teaching models about cultural differences
Practical Steps
- Creating more training examples by using "translation pairs" of difficult examples
- Building special datasets for each language to test how well the models work
- Always having native speakers check the results
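The per-language test sets mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not a real benchmark: the tiny `test_set` is hand-made, and `keyword_model` is a deliberately crude, hypothetical stand-in for a real sentiment model. The point is that reporting accuracy per language exposes failures that a single overall number would hide.

```python
from collections import defaultdict


def accuracy_by_language(examples, predict):
    """Return {language: accuracy} for labeled (language, text, label) examples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, text, label in examples:
        total[language] += 1
        if predict(text) == label:
            correct[language] += 1
    return {language: correct[language] / total[language] for language in total}


# Hypothetical, hand-made test set; real ones should be written and checked by native speakers.
test_set = [
    ("en", "This is great", "positive"),
    ("en", "This is terrible", "negative"),
    ("de", "Da kann man nicht meckern", "positive"),  # understated praise: "can't complain"
    ("de", "Das ist schlecht", "negative"),
]


def keyword_model(text):
    """Toy stand-in for a real model: any negative cue word means 'negative'."""
    negative_cues = {"terrible", "schlecht", "nicht"}
    words = set(text.lower().split())
    return "negative" if words & negative_cues else "positive"


scores = accuracy_by_language(test_set, keyword_model)
print(scores)
```

The toy model gets every English example right but trips over the understated German compliment because it contains "nicht". Averaged over all four examples that looks like a decent score, while the per-language breakdown shows where the problem sits.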
The Path Forward
Building truly effective sentiment analysis across languages requires bringing together language experts, cultural knowledge, and technical skills. We've made good progress with new AI models, but we still need to get better at understanding the cultural side of language.
The best systems combine advanced AI with carefully created resources for each language and culture. And we need to keep testing against diverse examples to make sure we're getting better at this challenging but important task.
Got this from a friend? Subscribe here!
Question of the Week
- How can GANs (Generative Adversarial Networks) be used to improve the realism of climate models?
Post your answers on Mastodon and tag me. I'd love to see what you come up with. Then I can include them in the next issue!
Tidbits from the Web
- My brain currently.
- Why metalheads are so cute.
- There's important breaking news!
Jesper Dramsch is the creator of PythonDeadlin.es, ML.recipes, data-science-gui.de and the Latent Space Community.
I laid out my ethics, including my stance on sponsorships, in case you're interested!