Late To The Party 🎉

March 1, 2025

😤 If I shenan once, I will shenanigan


Late to the Party 🎉 is about insights into real-world AI without the hype.


Hello internet,

Finally, another newsletter issue! Unfortunately, I’ve been sick for most of the year, so talk about a rough start.

In this issue, we’ll have a weather ML benchmark, learning Python, and graph data augmentation. There are a lot of personal updates, including a Nature paper and a talk at PyCon Germany. There is also plenty of new content, plus a look at why sentiment analysis across language barriers is so difficult.

So let’s dive right in!

The Latest Fashion

  • The WeatherBench benchmark was completely updated and upgraded!
  • You can learn Python with VS Code in the browser!
  • Grafog makes data augmentation for graphs in PyTorch Geometric easy.

Worried these links might be sponsored? Fret no more. They’re all organic, as per my ethics.

My Current Obsession

It’s been a while, so I have to bust out some sub-headings!

Nature Geoscience Publication!

I'm absolutely thrilled to share that our Comment in Nature Geoscience has been published! This has been such a journey, and I couldn't be prouder of what we accomplished. Getting through the review process was challenging, but seeing our work finally published makes it all worthwhile. It was exciting to work with such a diverse group of co-authors, from the European Commission and Fraunhofer to Pinterest and LinkedIn (and more!). You can find my research on my homepage as well.

Finnish Adventures and Aurora Magic

I recently took a vacation to Finland, and wow - the aurora borealis was truly magical! I've seen pictures and videos before, but nothing compares to witnessing those dancing lights across the night sky in person. If you ever get the chance to see the northern lights, drop everything and go. It's an experience that defies description.

I also stepped outside of my comfort zone and went ice karting, which was incredibly fun. It basically felt like real-life Mario Kart, where I just drifted around every corner. Such a blast!

I was so inspired, I even made a few TikToks!

From Fairly Fit to Couch Potato

On a less exciting note, I've been battling the flu for three weeks now, and it's been brutal. It's amazing how quickly your body can change - one day I'm spontaneously running 10km in -10°C weather, and the next I'm barely making it up the stairs.

Being sick really makes you appreciate good health! Can’t wait to get back on my feet and move again.

PyCon Germany Acceptance!

Some exciting news to end on: my talk at PyCon Germany has been accepted! I'll be presenting "Going Global: Taking code from research to operational open ecosystem for AI weather forecasting." The talk will cover our journey with Anemoi, which grew from experimental code by a small team to a robust ecosystem supporting 40+ developers across multiple international weather agencies.

I'll share our experiences scaling both the team and codebase, including how we transformed from research notebooks into a structured ecosystem of packages, managed over 300 configuration options with Hydra, and integrated modern ML tooling with traditional meteorological systems. I'll also dive into practical challenges like model sharding for global weather predictions, implementing flexible grids for regional services, and managing CI/CD across multiple packages.

If you're attending PyCon Germany, I'd love to see you there! If you're interested in how AI is transforming weather forecasting, I promise to share some valuable insights from the frontier of operational ML systems.

Thing I Like

In Finland, I went for a lot of runs and hikes, and having ice cleats / crampons made a real difference. I didn’t have them on one day and fell hard twice. That was rough.

Hot off the Press

I wrote an article about explainable AI in geoscience and its impact on natural disaster management and climate change.

Or maybe you’d enjoy this balanced approach to generative AI?

I made two videos on TikTok. Due to the TikTok ban in the US, people felt the desire for #EuroTok and #WorldTok, so I shared two videos from Finland, which was very fun. The first video was of the Finnish snow and the second was of one of my runs!

In Case You Missed It

I’m trying to get my life back on track, so my popular blog post about an ADHD-friendly Notion task list might even be helpful to me.

On Socials

People on Mastodon enjoyed these deep learning puzzles. Bluesky liked Aquarel, which makes matplotlib theming easier. LinkedIn had some fun with “Friends don’t let friends” for don'ts in data viz.

Chris Albon asked for ML recommendations on Bluesky, so I reshared my fairly popular starter pack.

I told the story of a Canadian influencer claiming “Prussian heritage”, which was quite popular. However, my LinkedIn posts have started getting a ton of AI-generated comments, which is fairly annoying.

Python Deadlines

The PyCon Austria CFP is closing soon!

I found these conferences: PyCon South Korea, DjangoCon Japan, Swiss Python Summit, and PyCon Sweden, plus new CFPs for PyCon Colombia and Python Brasil. And I got a lovely PR for PyData London 2025!

Machine Learning Insights

Last week I asked, “What are the key considerations in using NLP models for sentiment analysis in diverse languages, and how do you address cultural nuances?”, and here’s the gist of it:

When we use AI to understand how people feel in their writing (sentiment analysis), things get tricky when working with different languages. Let's break down what makes this challenging and how we can make it work better.

Language Structure Matters

Languages are built differently:

  • In Turkish or Finnish, words get built by adding pieces together
  • In Chinese, word order matters more than word endings
  • In Russian, words change form based on how they're used

These differences mean we can't just translate everything to English and expect our AI to understand the feelings correctly. Fine nuances can get lost or even flip entirely, for example when someone is being sarcastic.
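
To make the "adding pieces together" point concrete, here is a minimal Python sketch. It assumes a toy sentiment lexicon and two made-up helper functions (none of this comes from a real library). The Finnish word "hyvä" (good) shows up in inflected forms such as "hyvän" or "hyvää", which an English-style exact dictionary lookup misses entirely.

```python
# Toy sentiment lexicon keyed on dictionary forms (hypothetical, for illustration only).
LEXICON = {"hyvä": 1, "huono": -1}  # Finnish: "good", "bad"


def exact_match_score(tokens):
    """Sum lexicon scores using exact string matches, as is common for English."""
    return sum(LEXICON.get(token, 0) for token in tokens)


def prefix_match_score(tokens):
    """Crude fallback: count a token if it starts with a lexicon entry.

    This is not real morphological analysis, just enough to show the gap.
    """
    total = 0
    for token in tokens:
        for stem, score in LEXICON.items():
            if token.startswith(stem):
                total += score
                break
    return total


# Inflected forms of "hyvä" (genitive and partitive case).
tokens = ["hyvän", "hyvää"]
print(exact_match_score(tokens))   # 0: both forms are missed
print(prefix_match_score(tokens))  # 2: both forms are found
```

A real system would rely on a proper morphological analyzer or subword tokenization rather than prefix matching; the sketch only shows why the naive lookup breaks.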

Writing Systems Create Challenges

Different writing systems add more complications:

  • Thai doesn't put spaces between words
  • Vietnamese uses special marks above letters that change meaning
  • Many languages don't use the Latin alphabet

This alone means that the classic “tokenizers” we use so easily for English don’t work out of the box.
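
A small sketch of two of these pitfalls, using only the Python standard library. The example sentences and the `strip_marks` helper are my own illustrations, not part of any particular NLP pipeline.

```python
import unicodedata

# Whitespace splitting works for English but not for Thai,
# which does not put spaces between words.
english = "I love this movie"
thai = "ฉันรักหนังเรื่องนี้"  # roughly "I love this movie"
print(len(english.split()))  # 4
print(len(thai.split()))     # 1: the whole sentence comes back as one "word"


def strip_marks(text):
    """Remove combining marks, a common (and here harmful) cleanup step."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Vietnamese tone marks change meaning, so stripping them merges distinct words.
print(strip_marks("má"), strip_marks("mà"), strip_marks("mả"))  # ma ma ma
```

So a preprocessing step that is harmless for English, like splitting on spaces or dropping accents, can quietly destroy the signal in other languages.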

Cultural Meanings Run Deep

This is the biggest challenge! The same words or phrases can mean very different things in different cultures:

  • Some cultures express happiness in understated ways (in German the highest compliment is “Can’t complain”)
  • Some use metaphors that make no sense when translated
  • The strength of feelings (how positive or negative) varies wildly between cultures

For example, what sounds polite in Japanese might seem cold in English, or what's enthusiastically positive in American English might seem over-the-top in other cultures.

How We're Solving These Problems

Smarter AI Models

  • Newer AI models (like XLM-RoBERTa) can work with many languages at once
  • Some models can even work with languages they weren't explicitly trained on

Adding Cultural Awareness

  • Having diverse teams who understand different cultures create the training data
  • Creating specific guidelines for each language and culture
  • Directly teaching models about cultural differences

Practical Steps

  • Creating more training examples by using “translation pairs” of difficult examples
  • Building special datasets for each language to test how well the models work
  • Always having native speakers check the results
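
The "special datasets for each language" idea can be sketched as a per-language accuracy report. Everything below is an assumption for illustration: the function name, the tiny hand-made examples, and the keyword-based stand-in model are not real data or a real classifier.

```python
from collections import defaultdict


def accuracy_by_language(examples, predict):
    """Return {language: accuracy} for labelled examples.

    `examples` is a list of (language, text, gold_label) tuples and
    `predict` is any callable mapping text to a label.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, text, gold in examples:
        total[language] += 1
        if predict(text) == gold:
            correct[language] += 1
    return {language: correct[language] / total[language] for language in total}


# Tiny hand-made test set (illustrative only, not real data).
examples = [
    ("en", "This is great", "positive"),
    ("en", "This is terrible", "negative"),
    ("de", "Da kann man nicht meckern", "positive"),  # "can't complain", high praise
    ("de", "Das ist schlecht", "negative"),
]


def naive_predict(text):
    """Stand-in model that only knows one obvious English keyword."""
    if "great" in text.lower():
        return "positive"
    return "negative"


print(accuracy_by_language(examples, naive_predict))
# {'en': 1.0, 'de': 0.5}
```

Reporting one number per language, instead of a single overall score, makes it visible when a model that looks fine in English is quietly failing on understated praise elsewhere.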

The Path Forward

Building truly effective sentiment analysis across languages requires bringing together language experts, cultural knowledge, and technical skills. We've made good progress with new AI models, but we still need to get better at understanding the cultural side of language.

The best systems combine advanced AI with carefully created resources for each language and culture. And we need to keep testing against diverse examples to make sure we're getting better at this challenging but important task.

Got this from a friend? Subscribe here!

Question of the Week

  • How can GANs (Generative Adversarial Networks) be used to improve the realism of climate models?

Post your answers on Mastodon and tag me. I'd love to see what you come up with. Then I can include them in the next issue!

Tidbits from the Web

  • My brain currently.
  • Why metalheads are so cute.
  • There’s important breaking news!

Jesper Dramsch is the creator of PythonDeadlin.es, ML.recipes, data-science-gui.de and the Latent Space Community.

I laid out my ethics, including my stance on sponsorships, in case you're interested!

Read more:

  • 🐍 Python 3.14 releases this year, making it π-thon

    In this issue, we talk about AI in wildlife conservation. The anti-woke tech bro. Hacking LLMs with invisible text and running AI weather models on open data! I also made some cute mascots for Pythondeadlines, which I’m proud of!

  • ✈ Home is Germany, but even abroad there are Gerfew

    In this edition, I cover a new hybrid weather AI model, a generalized graph RAG system, the limitations of generative AI, and my journey with real-time weather forecasting using AI, along with updates on my Latent Space event and some cute TikToks!
